Significant at 90% level (0.05 < *p* < 0.1); + significant at 95% level (0.01 < *p* < 0.05); \* significant at 99% level (*p* < 0.01).

The larger influence of meteorology changes on AOD than surface PM2.5 is attributable to several reasons. First, emissions have larger influence on surface PM2.5 than aloft. Second, PM2.5 concentration decreases with increasing altitude. Mathematically, the same amount of meteorology changes show relatively larger effects on smaller PM2.5 concentrations. Third, observations show that change rates of meteorological elements are amplified with elevation [43,44]. Warming [44] and change of wind speed [43] are more rapid at higher elevations.

#### *3.2. Meteorology Changes in Spring Are Beneficial to Aerosol Reduction*

We used multiple linear regression to identify major meteorological elements that possibly have large influences on inter-annual variations of AOD and surface PM2.5. We found that meteorological elements that might influence inter-annual variations of AOD and surface PM2.5 in spring are similar and mainly related to RH and wind speed. In addition, increasing wind speed in spring is, possibly, the main reason for the beneficial effects of meteorology changes in AOD and surface PM2.5 reduction. We found that the inter-annual variations of AOD in NCP is strongly related to the variations of T at 850 hPa, surface RH, and O at 850 hPa, explaining 53% of AOD variations between 2006 and 2017 (Table 2). For surface PM2.5, the latter two elements explain 53% of the interannual variations (Table 3). In YRD, wind speeds at different altitudes are important factors controlling the inter-annual variations of both AOD and surface PM2.5. Surface PM2.5 has a stronger correlation with wind speed. Westerly wind at the surface in YRD increases at a rate of 0.07 m s−<sup>1</sup> yr−<sup>1</sup> (5.4% yr−1, *p* < 0.05), which possibly contributed to the reduction in AOD and surface PM2.5 in this region. In PRD, the contributions from meteorology (−0.009 yr−1, 1.7% of 12-year mean AOD) and emission reduction (−0.010 yr−1) are comparable. Multiple linear regression suggests that AOD in PRD is strongly correlated to surface meridional wind velocity and zonal wind velocity difference between the surface and 850 hPa. The former increased at a rate of 0.04 m s−<sup>1</sup> yr−<sup>1</sup> (1.1% of 12-year mean, *p* = 0.13) between 2006 and 2017, possibly causing decreasing aerosol concentration. The latter increased at a rate of 0.05 m s−<sup>1</sup> yr−1, indicating that dynamic instability was enhanced over the 12 years, favorable for pollution mitigation.

**Table 2.** Meteorological parameters that explain the AOD variations in NCP, YRD, and PRD in different seasons.



**Table 3.** Meteorological elements that explain inter-annual variations of surface PM2.5 in NCP, YRD, and PRD between 2006 and 2017 in different seasons.

We estimated the correlation of AOD (surface PM2.5) among different regions to investigate spatial variations of aerosols. We found that AOD in NCP and YRD in spring are highly correlated (*r* = 0.77, Table 4), but surface PM2.5 in the two regions are not (*r* = 0.09). This pattern of correlations in the two regions is related to the activity of the West Pacific Sub-Tropical High system (WPSTH, Figure 3). AOD in NCP and YRD show similar correlation with the area (*r* = 0.29 and 0.31) and strength (*r* = 0.24 and 0.24) of the WPSTH. Surface PM2.5 in YRD is also related to the two indices (*r* = 0.33 and 0.25), but surface PM2.5 in NCP is not (*r* = −0.02 and 0.05). Very few studies have investigated the effects of meteorology on the distribution of aerosols in China in spring. A recent study [45] showed that the activity of the WPSTH and Northeast Asia anticyclone system are important to the distribution of PM2.5 in NCP. They showed that the climatology of the winds at 850 hPa in spring over NCP and YRD is northwesterly. In an anomalous southeasterly wind year, wind speed is reduced and RH increases in Eastern China, resulting in high aerosol concentrations in the region.


**Table 4.** Correlation coefficients of seasonal mean AOD and surface PM2.5 between 2006 and 2017 among NCP, YRD, and PRD.

#### *3.3. Meteorology Changes in Summer Are Unfavourable to Aerosol Reduction*

GEOS-Chem shows that meteorology changes in summer between 2006 and 2017 offset aerosol pollution control efforts in NCP, YRD, and PRD (Table 1). We used two indices to describe the East Asian summer monsoon (EASM) in this study (Figure 4). Index 1 is the mean meridional wind speed at 850 hPa in Eastern Asia (20–40◦N, 110–125◦E), reflecting activity

of the monsoon system in the whole region [46]. Index 2 is the zonal wind speed difference at 200 hPa between Northern (40–50◦N, 110–150◦E) and Southern (25–35◦N, 110–150◦E) China, reflecting the different variations between the two regions [47]. We found that AOD and surface PM2.5 in the three regions are moderately to strongly correlated with both EASM Index 1 and Index 2 (Table 5). Both indices show that EASM weakened between 2006 and 2017 (Index 1: −0.023 m s−<sup>1</sup> yr−1; Index 2: −0.004 m s−<sup>1</sup> yr<sup>−</sup>1, Figure 4). As a result, AOD in the three target regions increased at rates from 0.006–0.011 yr<sup>−</sup>1, and surface PM2.5 increased at rates from 0.09–0.67 μg m−<sup>3</sup> yr−<sup>1</sup> (Table 1). If the enhancement of surface PM2.5 was totally attributed to wind speed changes, the sensitivity is from 0.09–0.67 μg m−<sup>3</sup> %<sup>−</sup>1, in general agreement with a recent estimate (0–0.5 μg m−<sup>3</sup> %<sup>−</sup>1, [48]).

**Figure 3.** Correlation coefficients of AOD (**a**) and surface PM2.5 (**b**) with area, strength, ridge position, and western extension index of Western Pacific sub-tropical high system in spring.

**Figure 4.** Monthly mean EASM Index 1, Index 2 and EAWM between 2006 and 2017.


**Table 5.** Correlation coefficients of East Asia monsoon system with AOD and surface PM2.5 in NCP, YRD, and PRD in summer and winter between 2006 and 2017.

In each region, AOD and surface PM2.5 are strongly correlated (*r* = 0.93–0.98) because of the strong vertical mixing in summer. AOD in NCP and YRD are strongly correlated in summer (*r* = 0.78). However, their correlations with AOD in PRD are relatively low (*r* = 0.30 and 0.27). Similar relationships of surface PM2.5 among the three regions are also shown. In different regions, the correlations of aerosols with EASM indices are different. In NCP, both AOD and surface PM2.5 are strongly positively related to EASM Index 1 (*r* = 0.57–0.61) and negatively related to Index 2 (*r* = −0.67). EASM Index 1 and Index 2 together explain from 65–69% of the inter-annual variations of AOD and surface PM2.5, indicating that both the activity of EASM in the whole Eastern China and the difference between the Northern and Southern China are critical to the inter-annual variations of aerosols in NCP. In contrast, AOD and surface PM2.5 in YRD are strongly correlated with EASM Index 2 (*r* = −0.76−0.82) but weakly correlated with Index 1 (*r* = 0.27–0.35), indicating that the zonal wind velocity difference at 200 hPa between Northern and Southern China is possibly more important to the inter-annual variations of aerosols in YRD.

We found that EASM Index 2 is strongly correlated with the position of the ridge of the WPSTH system (*r* = 0.89), indicating that the zonal wind difference between Northern and Southern China is possibly affected by the activity of the WPSTH system. A study [49] showed that, in addition to wind velocity, the ridge position of the WPSTH also strongly affects the distribution of precipitation in China in summer. The climatological pattern of precipitation is more present north of the Yangtze River region and less in Southern China. When the ridge shifts southward, the distribution of precipitation is the opposite. When the ridge shifts northward, two rain-belts show in Southern and Northern China. The humid maritime air mass brought to inland China by EASM shows dual effects on aerosols. The abundant water vapour enhances the hygroscopic growth of sulphate-nitrate-ammonia and, thus, increases aerosol concentrations, while large precipitation removes aerosols from the atmosphere and reduces aerosol concentrations. In NCP, AOD is more affected by RH, while surface PM2.5 is more affected by precipitation [50]. Both EASM Index 1 and Index 2 are weakly correlated with AOD and surface PM2.5 (*r* = −0.29–0.45) in PRD, indicating that EASM has limited influence on aerosol distribution in this region. Surface PM2.5 shows similar relationships. Regression analysis shows that major meteorological elements in PRD only explains ~40% of the inter-annual variations of aerosols in this region. Meteorology influences on aerosols in PRD need further investigation.

#### *3.4. Meteorology Changes in Fall Show Different Effects on Trends of Aerosols in Different Key Regions*

GEOS-Chem suggests adverse effects of meteorology changes on aerosol reduction in NCP, but beneficial effects in YRD and PRD between 2006 and 2017. The significant enhancements of aerosols in NCP (+0.011 yr−<sup>1</sup> and +0.26 μg m−<sup>3</sup> yr<sup>−</sup>1) offset the reduction caused by anthropogenic emission control by 85% for AOD and by 11% for surface PM2.5, respectively. We found that PV in NCP decreased at a rate of −0.02 PVU yr−<sup>1</sup> (*<sup>p</sup>* < 0.05), and RH at 850 hPa increased at a rate of 0.002 yr<sup>−</sup>1. Both changes favored the enhancement of aerosol concentration in this region. In contrast, meteorology changes contributed from 8–25% and 35–40% to the total reduction in aerosols in YRD and PRD (Table 1). In YRD, PV at 850 hPa became stronger over the years at a rate of +0.01 PVU yr−<sup>1</sup> (*p* = 0.004), likely

reducing aerosols in this region. In PRD, meridional wind speed at 500 hPa increased (+0.07 m s−<sup>1</sup> yr<sup>−</sup>1), related to the decline in aerosols in this region.

GEOS-Chem shows that AOD in NCP, YRD, and PRD have strong correlations in fall (*r* = 0.55–0.69). The surface PM2.5 in the three regions show even larger correlation coefficients (*r* = 0.76–0.79). In addition, AOD and surface PM2.5 in each region are also highly correlated (*r* = 0.81–0.94), similar to summer. Major meteorological elements that affect AOD and surface PM2.5 in each region are similar. A recent study [51] showed that in fall, Eastern China is dominated by large-scale stable circulation patterns without frequent disturbances of small-scale weather systems. Vertically, downward motion dominates. Multiple linear regressions suggest that sea level pressure is the major controlling factor that affects the inter-annual variations of AOD and surface PM2.5, particularly for the latter, in each region (Tables 2 and 3). We found that sea level pressure among the three regions is also moderately to highly correlated (*r* = 0.50–0.75), which partly explains the correlation of aerosols among the three regions.

#### *3.5. Meteorology Changes in Winter Show Opposite Effects on Trends of AOD and Surface PM2.5*

GEOS-Chem suggests that from 8–30% of the downward trends in AOD in NCP (−0.024 yr<sup>−</sup>1), YRD (−0.021 yr<sup>−</sup>1), and PRD (−0.013 yr<sup>−</sup>1) are attributable to meteorology changes. In contrast, meteorology changes show adverse effects on surface PM2.5 reduction over the 12 years, offsetting from 7–9% of the reductions caused by anthropogenic emission control. The opposite effects of meteorology changes on AOD and surface PM2.5 variations are mainly due to the isolation of PM2.5 at the surface and aloft by the stable boundary layer in winter. The boundary layer is more stable and vertical mixing is weaker in winter than in other seasons. Thus, the correlations of surface PM2.5 and AOD in winter in the three regions are much weaker than those during other seasons, with correlation coefficients of 0.44 in NCP, −0.01 in YRD, and 0.21 in PRD. This pattern is possibly partly explained with the dual effects of EAWM on haze–fog variations in Eastern China [52]. Cold wave activity in winter advects aerosol away and cleans up the region. However, the activity of the Siberian High system may reduce the near surface wind speed and enhance the stratification stability, thus, favoring pollution accumulation.

Major meteorological elements controlling the inter-annual variations of AOD and surface PM2.5 are completely different. In NCP, a local northerly wind speed at 850 hPa explains 74% of the inter-annual variations of AOD between 2006 and 2017 (Table 2). This wind speed increases at a rate of +0.15 m s−<sup>1</sup> yr−<sup>1</sup> (*p* < 0.1), partly explaining the decrease in AOD in the region (−0.007 yr<sup>−</sup>1, *<sup>p</sup>* < 0.05). The increasing wind speed in NCP is related to the enhanced Siberian High due to the rapid warming of the Barents-Kara Sea region [53]. Different from AOD, inter-annual variations of surface PM2.5 in NCP are mainly affected by surface RH, boundary layer height, and the temperature difference between 850 hPa and 500 hPa. This temperature difference increases over the 12 years, although it is statistically insignificant (+0.1 K yr−1, *p* = 0.33), favouring aerosol accumulation. These changes are in-line with surface PM2.5 enhancement in NCP during these years (+0.23 μg m<sup>−</sup>3, *p* = 0.70).

AOD in YRD is strongly correlated with AOD in NCP (*r* = 0.77), but surface PM2.5 in the two regions are only weakly correlated (*r* = 0.43). We used meridional wind speed at 850 hPa in Eastern Asia (25–50◦N, 115–145◦E) as an indicator of the strength of EAWM ([46], Figure 4). The EAWM index is moderately correlated with AOD in NCP (*r* = 0.52) and YRD (*r* = 0.52), but relatively weakly related to surface PM2.5 in NCP (*r* = 0.42) and YRD (*r* = 0.31). This correlation pattern suggests that surface PM2.5 in YRD is less affected by the 850 hPa wind speed in Eastern Asia.

Multiple linear regressions suggest that the inter-annual variations of AOD in YRD are mainly affected by RH at 850 hPa, potential vorticity at 500 hPa, and surface meridional wind velocity. We found that RH at 850 hPa decreased (−0.003 yr<sup>−</sup>1, *<sup>p</sup>* = 0.29), but surface wind increased (0.046 m s−<sup>1</sup> yr−1, *p* = 0.29), favouring aerosol accumulation and, thus, possibly enhancing AOD reduction. Meteorological elements determining the variations of surface PM2.5 include surface zonal wind velocity, dynamic instability, tropopause height, and meridional wind speed at 500 hPa in NCP. The former three elements explain 36% of the inter-annual variations of surface PM2.5. Including the last element explains 9% more variations, suggesting that transport from NCP to YRD is likely important to surface PM2.5 in YRD in winter. We found that meridional wind speed at 500 hPa in NCP increases over the years (+0.15 m s−<sup>1</sup> yr−1, *p* = 0.03), indicating that transport from NCP to YRD is possibly increasing. As a result, surface PM2.5 in YRD is increasing.

Similar as in NCP and YRD, meteorological elements that influence the inter-annual variations of AOD and surface PM2.5 are also completely different in PRD. Increasing zonal wind speed at 500 hPa (+0.17 m s−<sup>1</sup> yr<sup>−</sup>1, 0.8% of 12-year mean), and decreasing potential vorticity at 500 hPa (−0.004 PVU yr<sup>−</sup>1) and PBLH (−2.9 m yr<sup>−</sup>1), are related to the decrease in AOD in this region. Meteorological elements affecting surface PM2.5 include surface meridional velocity, temperature at 850 hPa, and potential vorticity.

#### **4. Discussion**

We showed that the weakening of EASM and the enhancing of AOD and surface PM2.5 between 2006 and 2017 are statistically insignificant, but the trends are still worth notice because they are in-line with the inter-decadal trend as reported by previous studies [54]. Zhu et al. [54] showed that the decadal-scale-weakening of EASM (index: +0.31 between 1948 and 1979 versus −0.32 between 1980 and 2010) within the last thirty years led to increases in aerosol concentration in Northern China by 20%. In addition, the monsoon system also affects the spatial distribution of aerosols. During an active monsoon year, AOD had a positive anomaly in NCP and a negative anomaly in PRD. During a weak monsoon year, the anomalies were the opposite [55,56].

We found that the weakening of EAWM enhances surface PM2.5 in the three key regions, in general agreement with [57], which showed that with fixed emissions, meteorological conditions led to an increase in haze in Beijing during winter between 2002 and 2016. In contrast, Wang et al. [58] found that EAWM was significantly anti-correlated with surface PM2.5 in Beijing between 2005 and 2016, with a correlation coefficient of ~0.75. The difference between the two studies can be attributed to several reasons. First, we analyzed a much larger region, NCP, in this study than in [58], which focused on a city Beijing. Second, we separated the contribution from anthropogenic emissions and meteorology using a chemical transport model, but Wang et al. [58] used surface in situ observations, which do not distinguish the contributions of meteorology changes from anthropogenic emissions control. Thus, their strong correlation is possibly due to the concurrent downward trends of EAWM index and surface PM2.5. Stronger EAWM circulation brings more cold and dry air to NCP and YRD [50] and cleans up the regions. A weaker monsoon barely reaches YRD and even farther north, and favors the accumulation of pollutants [6]. The long-term weakening trend and the inter-annual variations of EAWM were further related to the subtropical Western Pacific Sea surface temperature anomaly [59,60], Arctic sea ice, Eurasian snow [61,62], and El Niño–Southern Oscillation [63]. It is predicted that EAWM would keep the weakening trend in the future (2050–2099), with increased frequency and persistence of conducive weather conditions [8]. This suggests that future meteorology conditions are possibly unfavorable to pollution dissipation.

Very few studies have investigated the influence of meteorology on aerosols in China in spring and fall. PM2.5 in Eastern China is related to the inter-annual variations of Asia Polar Vortex intensity [64], North Atlantic Oscillation, and the North Atlantic Sea surface temperature [45] in spring. The influence of synoptic systems on AOD distribution in fall China was investigated [51] and it was found that heavy pollution events with high AOD (>0.6) in Eastern China is associated with a uniform surface pressure field or a steady westerly in the middle troposphere, while clean episodes (AOD < 0.4) occur when strong northwest cold air advection prevails or air masses are transported from sea to land. Further related, by [65], are the haze days in fall to the abnormally warming sea surface temperature over the North Atlantic subtropical and the Western North Pacific.

Most of the previous studies [15,20,51] use cloud cover, wind speed, PBLH, and RH to correct PM2.5 retrieval. Specifically, ref. [43] found large spatio-temporal diurnal variations of correlation of AOD and PM2.5 in China using measurement data and found that the distribution was strongly affected by cloud fraction, PBLH, and RH. Gong et al. also found that vertical correction by PBLH was important to PM2.5 retrieval in Northwestern China [22]. In other regions, vertical correction via CALIOP ratio is recommended [22]. We also suggest that these elements are important to AOD and surface PM2.5, but we recommend the inclusion sea level pressure and surface pressure in fall.

Our findings have implications for future surface PM2.5 retrieval from satellite-observed AOD. We investigated the controlling meteorological elements of AOD and surface PM2.5 variations over 12 years, including the long-term trends and inter-annual variations. We found that the controlling meteorological elements vary with regions and seasons. Thus, surface PM2.5 retrieval from satellite AOD should probably consider using different meteorological elements in different seasons. In addition, GEOS-Chem simulation with fixed anthropogenic emissions at the 2006 level showed that meteorology changes throughout the 12 years reduces AOD but enhances surface PM2.5 in China during winter, and a multiple linear regression model suggests that the controlling meteorological elements of AOD and surface PM2.5 are completely different. Thus, previous estimates, which used meteorological elements to correct surface PM2.5 retrieval in winter, should be used with caution, as the long-term trend of surface PM2.5 is possibly overestimated. We suggest the use other correction schemes to correct surface PM2.5 retrieval in the future, such as the CALIOP ratio and correlation coefficient of AOD and surface PM2.5.

#### **5. Conclusions**

We studied the effects of meteorology changes on trends in AOD and surface PM2.5 in the key regions NCP, YRD, and PRD in China between 2006 and 2017 using a 3D chemical transport model, GEOS-Chem, by fixing emissions at the 2006 level. We further identified major meteorological elements controlling the inter-annual variations of AOD and surface PM2.5 using multiple linear regressions.

We found that meteorology changes made larger contributions to trends in AOD than surface PM2.5 during spring, summer, and fall between 2006 and 2017. Meteorological changes contributed from 22–50% of AOD reduction in spring, larger than their contributions to surface PM2.5 (1–40%). The decrease in aerosols is possibly related to an increase in westerly wind speed (0.07 ms−<sup>1</sup> yr−1, 5.4% yr−1, *p* < 0.05) in YRD and an increase in meridional wind velocity (0.04 ms−<sup>1</sup> yr−1, 1.1% of 12-year mean) and dynamic instability (0.05 ms−<sup>1</sup> yr−1) in PRD. In summer, meteorological changes offset from 25–42% of AOD reduction caused by anthropogenic emission changes. For surface PM2.5, the contributions were from 5–10%. The adverse effects are possibly related to the weakening of EASM. In fall, meteorology changes offset 85% of AOD reduction and 11% of surface PM2.5 reduction induced by emission changes in NCP. In contrast, from 25–40% of AOD reduction and 8–35% of surface PM2.5 reduction is attributed to meteorology changes. Sea level pressure and surface pressure are critical to aerosol distribution in fall. In winter, meteorology changes were beneficial to AOD decreasing, but were unfavourable to surface PM2.5 reductions in NCP, YRD, and PRD in between 2006 and 2017. The stable boundary layer in winter suppressed vertical mixing, resulting in a weak correlation of AOD and surface PM2.5 in each region. Thus, meteorological elements controlling the inter-annual variations of PM2.5 and AOD in each region were completely different. The northerly wind speed at 850 hPa explained 72% of the inter-annual variations of AOD in NCP. The increase in this wind (−0.045 ms−<sup>1</sup> yr<sup>−</sup>1, *<sup>p</sup>* < 0.1) lowered AOD in this region (−0.007 yr<sup>−</sup>1). In other regions, the trends were statistical insignificant. Thus, previous estimates, which used meteorological elements to correct surface PM2.5 retrieval in winter, should be used with caution. Our study provides possible meteorological elements to correct surface PM2.5 retrieval from satellite AOD measurements on a seasonal scale.

**Supplementary Materials:** The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/rs14122762/s1, Figure S1: SONET sites and key regions in China: North China Plain (NCP, 35–41◦N, 110–120◦E), the Yangtze River Delta (YRD, 27–35◦N, 116–122◦E), and the Pearl River Delta (PRD, 22–25◦N, 110–117◦E); Figure S2: Monthly mean AOD from SONET (blue line) MODIS (red line) and GEOS-Chem (black line) averaged for 2013–2015. Beijing and Songshan are in the NCP region. Shanghai, Zhoushan, Nanjing and Hefei are in the YRD region. Guangzhou is in the PRD region; Figure S3: GEOS-Chem simulated monthly mean AOD components averaged for 2013–2015; Figure S4: Observed (red line) and GEOS-Chem simulated (BASE, black line) monthly mean surface PM2.5 concentrations (μg m-3) in NCP, YRD and PRD in 2013–2017; Figure S5: Observed (red line) and GEOS-Chem simulated (BASE, black line) annual and seasonal mean surface PM2.5 concentrations (μg m<sup>−</sup>3) in NCP, YRD and PRD in 2013–2017. The vertical lines are standard deviations of daily means in each season in each year; Figure S6: Ratio of annual and seasonal mean AOD relative to their values in 2006 from MODIS (red line) and GEOS-Chem simulations in 2006–2017. Three experiments are shown: varying meteorology and varying emissions (BASE, black line), varying meteorological fields with fixed emissions in 2006 (FIXEMISS, purple line), varying emissions with meteorological fields fixed in 2009 (FIXMET, blue line). See text for details. Figure S7: Similar as Figure S6, but for surface PM2.5. Table S1: Statistics of MODIS observed and GEOS-Chem simulated AOD compared to SONET AOD observations at 16 sites; Table S2: List of abbreviations. References [14,30,31,39,66–76] are cited in the Supplementary Materials.

**Author Contributions:** Conceptualization, L.Q. and S.W.; methodology, D.Y.; software, L.Q.; formal analysis, L.Q.; data curation, H.Z. and D.D.; writing—original draft preparation, L.Q.; writing review and editing, L.Q. and S.W.; funding acquisition, L.Q. and S.W.; supervision, S.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the National Natural Science Foundation of China (No. 21806088), Beijing Natural Science Foundation (No. 8222066), and Fundamental Research Funds for the Central Universities (No. FRF-TP-20-056A1).

**Data Availability Statement:** Data is contained within the article or Supplementary Materials.

**Acknowledgments:** We thank the Samsung Advanced Institute of Technology and National Environmental and Energy Science and Technology International Cooperation Base. Shuxiao Wang acknowledges the support from the Tencent Foundation through the XPLORER PRIZE. The simulations were completed on the "Explorer 100" cluster system of Tsinghua National Laboratory for Information Science and Technology.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Ambient Formaldehyde over the United States from Ground-Based (AQS) and Satellite (OMI) Observations**

**Peidong Wang 1,2, Tracey Holloway 2,3,\*, Matilyn Bindl 2, Monica Harkey <sup>2</sup> and Isabelle De Smedt <sup>4</sup>**


**Abstract:** This study evaluates formaldehyde (HCHO) over the U.S. from 2006 to 2015 by comparing ground monitor data from the Air Quality System (AQS) and a satellite retrieval from the Ozone Monitoring Instrument (OMI). Our comparison focuses on the utility of satellite data to inform patterns, trends, and processes of ground-based HCHO across the U.S. We find that cities with higher levels of biogenic volatile organic compound (BVOC) emissions, including primary HCHO, exhibit larger HCHO diurnal amplitudes in surface observations. These differences in hour-to-hour variability in surface HCHO suggests that satellite agreement with ground-based data may depend on the distribution of emission sources. On a seasonal basis, OMI exhibits the highest correlation with AQS in summer and the lowest correlation in winter. The ratios of HCHO in summer versus other seasons show pronounced seasonal variability in OMI, likely due to seasonal changes in the vertical HCHO distribution. The seasonal variability in HCHO from satellite is more pronounced than at the surface, with seasonal variability 20–100% larger in satellite than surface observations. The seasonal variability also has a latitude dependency, with more variability in higher latitude regions. OMI agrees with AQS on the interannual variability in certain periods, whereas AQS and OMI do not show a consistent decadal trend. This is possibly due to a rather large interannual variability in HCHO, which makes the small decadal drift less significant. Temperature also explains part of the interannual variabilities. Small temperature variations in the western U.S. are reflected with more quiescent HCHO interannual variability in that region. The decrease in summertime HCHO in the southeast U.S. could also be partially explained by a small and negative trend in local temperatures.

**Keywords:** formaldehyde; trend; OMI; satellite; monitor; annual; seasonal; temperature

#### **1. Introduction**

Formaldehyde (HCHO) is a carcinogen and mutagen that has been categorized by the U.S. Environmental Protection Agency (EPA) as one of 187 Hazardous Air Pollutants (HAPs). HCHO can be either primary (emitted) or secondary (chemically produced). Global background HCHO concentration primarily comes from the oxidation of methane or methanol [1,2]. In the continental boundary layer (PBL), HCHO is most often formed by the oxidation of non-methane volatile organic compounds (VOCs) [3,4]. The dominant biogenic precursor of HCHO is isoprene, which comes from vegetation and can quickly oxidize to form HCHO [5,6]. The reaction of HCHO with OH, as well as HCHO photolysis [7], among other loss pathways, results in a lifetime of HCHO on the order of hours [8]. On shorter time scales, biomass burning and wildfires are also sources for instantaneous

**Citation:** Wang, P.; Holloway, T.; Bindl, M.; Harkey, M.; De Smedt, I. Ambient Formaldehyde over the United States from Ground-Based (AQS) and Satellite (OMI) Observations. *Remote Sens.* **2022**, *14*, 2191. https://doi.org/10.3390/ rs14092191

Academic Editors: Maria João Costa and Daniele Bortoli

Received: 24 March 2022 Accepted: 26 April 2022 Published: 4 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

elevated HCHO [9]. The major emission sources for anthropogenic VOCs include fuelwood production, utilization of gasoline, and biomass burning [10,11].

To support air quality management and public health assessments, the U.S. maintains several networks and programs to monitor HCHO at the surface. These include the Interagency Monitoring of Protected Visual Environments (IMPROVE), National Air Toxics Trends Stations (NATTS), and Photochemical Assessment Monitoring Stations (PAMS) networks. Data from these networks are archived through the EPA Air Quality System (AQS) Ambient Monitoring Archive (AMA) for hazardous air pollutants (available at https://www3.epa.gov/ttnamti1/toxdat.html#data; accessed on 30 August 2017). Several methods are used for in situ measurements of HCHO, including spectroscopic, colorimetric, chromatographic, and fluorometric techniques, which are discussed in detail by [12]. Among these techniques, the EPA commonly uses the chromatographic technique with 2,4-Dinitrophenylhydrazine (DNPH) [13] to measure HCHO, though there may be some bias with this method, such as interference from ozone, nitrogen dioxides, and water vapor [14–17].

Between 2006 and 2015, there were 338 ground-based monitors operated by states, local agencies, and tribes throughout the U.S. in compliance with EPA standards on archiving and measuring HCHO. Of these 338 total stations, the measurement frequency of HCHO varies, with stations reporting either 1 h (10 stations), 3 h (41 stations), or 24 h concentrations (322 stations). Most stations measured HCHO by collecting air samples via cartridges coated with DNPH and analyzing those samples using the High-Performance Liquid Chromatography (HPLC) method. Even among these monitors, there are significant gaps in the data records of every station.

Satellite observations offer the potential to complement this limited surface monitoring network. Satellite retrievals of HCHO bear relevance to air quality planning as an indicator of HCHO exposure, as well as VOC emissions, VOC reactivity, and associated ozone formation. This study compares satellite and ground-based HCHO with the goal of assessing spatial and temporal HCHO patterns and the appropriate role of satellite vertical column density (VCD) as a potential proxy for near-surface HCHO.

Currently, four polar-orbiting satellite instruments detect HCHO, including: the Ozone Monitoring Instrument (OMI) onboard the Aura satellite [18], which has a local overpass time in the early afternoon at 13:30 and a spatial resolution of 24 × 13 km2; the Tropospheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5 Precursor, with the same overpass time as OMI, but finer resolution [19] of 3.5 × 5.5 km2 as of August 2019; the Ozone Mapping and Profiler Suite (OMPS) on the Suomi NPP satellite [20], which also has an overpass time of 13:30, but with a spatial resolution of 50 × 50 km2; and the Global Ozone Monitoring Experiment-2 (GOME-2) on the Metop satellite series [21], with a local overpass time of 09:30 and a spatial resolution of 80 × 40 km2. In addition to these polar-orbiting satellites, a number of instruments are planned for or have recently reached geostationary orbit: Tropospheric Emissions: Monitoring of Pollution (TEMPO), from the National Aeronautics and Space Administration (NASA; spatial resolution: 2 km × 4.5 km) in 2022; Sentinel-4 from the European Space Agency (ESA; spatial resolution: 8 km × 8 km) in 2023; and Geostationary Environment Monitoring Spectrometer (GEMS), from the Korea Aerospace Research Institute (KARI; spatial resolution: 7 km × 8 km), which launched successfully in early 2020. These satellites will provide continuous observation of HCHO over North America, Europe, and Asia, respectively [22–24].

Satellite observations of HCHO have been used to advance the understanding of atmospheric chemistry, e.g., refs. [25–33], as well as for air quality management to protect public health [34–37]. Because of HCHO's strong detectability from space, its local footprint due to a short atmospheric lifetime, and its high yield from VOCs, HCHO has been used as an indicator of total VOCs in the atmosphere [38–40]. Combined with satellite derived NO2, HCHO has been used to support the assessment of the ozone production regime [36,41,42] and has even been used in decision-making contexts [43].

As satellite data usage expands, there is interest in the relevance of satellite products to better characterize emissions and near-surface concentrations. Much of work has focused on satellite observations of NO2. For example, ref. [44] used satellite observations of NO2 to constrain emissions of NOx (NO + NO2) at the surface, ref. [45] used a modelderived scaling factor to scale satellite observations of NO2 to near-surface amounts, and ref. [46] found similar responses to weather variables for both surface and column NO2. Ref. [47] found a good correlation between surface and column NO2, discovering that both datasets captured weekly cycles over Leicester, England; ref. [48] found strong seasonal and weekly cycles in both datasets over Israeli cities in 2006; and ref. [49] found that both datasets showed a small weekly cycle in NO2 in Beijing. Similar work with satellite HCHO includes studies by [50,51], who evaluated relationships among 17 years of satellite-based HCHO, biogenic isoprene emissions, and land cover datasets, by [52–54], who characterized anthropogenic emissions, and by [34], who used satellite HCHO to evaluate differing chemical transport model configurations over the continental U.S., and following [45], explored the utility of scaling satellite HCHO to near-surface amounts, and by [55], who derived surface HCHO amounts from TROPOMI and surface monitor observations using a neural network technique.

Fewer studies have compared satellite and surface HCHO observations. The majority of previous studies using both in situ and remote measurements of HCHO have focused on global patterns and VOC emissions. Between 2004 and 2014, ref. [26] found that OMI HCHO decreased in the eastern U.S., central South America, and across Europe, but increased in India and central-eastern China. That same study found that HCHO is highest in the early afternoon in the mid-latitudes by differencing the morning overpass of GOME-2 with the afternoon overpass of OMI [26]. Several studies have used satellite-based HCHO observations to infer the spatial distribution of isoprene over the U.S., e.g., [4,38,56]. Between 2005 and 2014, OMI HCHO increased over the U.S. overall, but decreased in the southeast [57]. HCHO trends have been found to be largely dependent on temperature and fire events [26,56,58–61], as well as anthropogenic emission sources [53,57,62,63].

This study informs potential applications of satellite-based HCHO within the health and air quality communities, which focus on near-surface concentrations. We evaluate the diurnal, seasonal, and interannual trends for HCHO over the U.S. by comparing HCHO from satellite retrievals with those from ground-based measurements. We assess HCHO data availability from EPA monitoring stations (Section 2.1), diurnal cycles (Section 3), seasonal variability (Section 4), and interannual trends (Section 5). In Section 6, we connect our results with previous findings to evaluate mechanisms that could potentially explain some of the observed behaviors. Finally, we conclude our results in Section 7.

#### **2. Data and Methods**

We evaluated HCHO for the U.S. between 2006 and 2015 from ground measurements and satellite retrievals. Ground monitor data for HCHO comes from the EPA Air Quality System (AQS) Ambient Monitoring Archive (AMA) for hazardous air pollutants, which includes data collected from the IMPROVE, NATTS, and PAMS networks. Satellite HCHO observations have been retrieved from the measurements of the NASA Earth Observing System's OMI instrument as a part of the Quality Assurance for Essential Climate Variables project (QA4ECV; available at http://www.qa4ecv.eu/; accessed on 24 July 2019).

#### *2.1. Ground-Based Measurements*

In this study, we used AQS HCHO concentrations (μg m−3) which have been converted to local meteorological conditions (using local pressure and temperature) from standard conditions. Detailed conversion is available in the Quality Assurance Summary Report for HAPs (https://www3.epa.gov/ttn/amtic/files/toxdata/techmemo2017.pdf; accessed on 30 August 2017). As described in this report, we eliminated data that are flagged as non-detect or below measurement detection limit. Of the 338 monitors in the AMA between 2006 and 2015, we only used data from sites with the DNPH/HPLC method. We choose to use publicly available data in a manner consistent with the use of monitoring data by air quality managers and by the EPA for the National Air Toxics Assessment.

There are significant gaps in the data records of every station, which we characterize as the percentage of available data for each available reporting frequency (Table 1). Percent availability is calculated as the number of available measurements divided by the total number of possible measurements at each station's measurement frequency. We find that no station offered more than 50% data coverage across the 10-year period of analysis. Ten stations measured HCHO at an hourly frequency, with one site, located at St. Louis, Missouri, having data coverage of 38% of all days. The other nine stations have below 1% of data coverage. Forty-one sites measured HCHO with a 3 h frequency, but only two stations in Los Angeles County in California (Burbank and Pico Rivera) had data coverage over more than 10% of the days in the study period, and these sites only had measurements in July, August, and September.

**Table 1.** Number of HCHO ground monitoring stations in the United States, with data distributed by the EPA AQS. Stations are grouped based on data availability, and by measurement frequency (hourly, 3 h, and 24 h). Data availability is calculated as the percentage of measurements available from 2006 to 2015 relative to the potential number of measurements during this period at the monitor's reporting frequency. The three sites used for analysis of the diurnal HCHO cycle are marked in bold, which include the 1 h site with 30–40% data availability (St. Louis, MO, USA), and the two 3 h site with 10–20% data availability (Burbank, CA, USA; Pico Rivera, CA, USA). These three sites are marked as triangle in Figure 1.


For our diurnal analysis, we relied on the three stations that offered >10% data coverage in a 1 h or 3 h measurement frequency. For our seasonal and interannual analyses, we used data from sites with a 24 h measurement frequency and six or more samplings of each season throughout 2006 to 2015 continuously. For comparison, only 37 sites have one or more measurements per month, so we chose the seasonal average basis to include more monitors. The threshold of six samples each season was selected to balance temporal and geographic coverage of monitors. For thresholds less than six samples per season, more monitors would be available (58 monitors at a threshold of five samples per season; 64 monitors with a threshold of one sample per season). For thresholds greater than six samples per season, fewer monitors would be available (41 monitors at a threshold of seven samples per season; 10 monitors with a threshold of ten samples per season). We removed five outlier stations which had significantly large maximum over median values in each seasonal average (larger than two standard deviations among all stations), since these sites might not be representative for interannual trend studies and cannot represent regional conditions. This approach yielded 45 ground monitor stations for the seasonal and interannual analyses. The black symbols in Figure 1 identify the locations of the AQS stations used in this study: circles (45 sites) indicate the stations used in seasonal and interannual analyses; triangles (3 sites) indicate sites included in the diurnal analysis.

**Figure 1.** 2006–2015 annual average HCHO vertical column density from OMI. Four U.S. regions are designated for analysis, with region name aside the map. Overall, for studying diurnal patterns, there are three sites: two in the Western U.S. and one in the Midwest. These stations are labeled as triangles. For seasonal and interannual studies, there are 45 AQS stations (10 in the west, 6 in the Midwest, 14 in the southeast and 15 in the northeast) that are labeled as circles. For more details on monitoring sites, please see Section 2.1 and Table 1.

#### *2.2. Satellite Observations*

We used the HCHO Level-3 product, with horizontal resolution at 0.25◦ × 0.25◦, from OMI onboard the Aura satellite, for which the U.S. overpass occurs in the early afternoon. We used the OMI retrieval algorithm from the EU FP7-project QA4ECV (hereafter abbreviated OMI unless otherwise specified) [64,65]. The QA4ECV algorithm utilizes a fitting window ranging from 328.5 to 359 nm from OMI. We removed data with solar zenith angles greater than 70 degrees and cloud fractions greater than 40%. We also removed data that are quality flagged or influenced by the OMI row anomaly (http://projects.knmi.nl/omi/research/product/rowanomaly-background; accessed on 24 July 2019). Detailed descriptions of this algorithm are described by [66].

Compared to other instruments with data covering any of our 2006–2015 study years (GOME2A, GOME2B, OMPS), OMI offers the highest spatial retrieval resolution of HCHO at 24 × 13 km<sup>2</sup> at nadir, as discussed by [67]. Aura's early afternoon overpass time corresponds with the average daily peak amount of HCHO at mid-latitudes [26,68]. Following [69], who compared trends in satellite- and ground-based observations of NO2, we use OMI HCHO VCD. While the total vertical column density indicates the number of molecules between the satellite and ground, tropospheric HCHO accounts for the majority of the total column amount. We use OMI HCHO observations for all seasons from 2006 to 2015 and compare with AQS measurements to evaluate OMI's ability to indicate surface HCHO trends.

OMI has exhibited a positive drift since 2008, possibly due to instrumental degradation [70,71]. The QA4ECV algorithm applied a background correction over the remote

Pacific to reduce HCHO slant column uncertainty. Note that this approach assumes that the remote Pacific HCHO is only due to the oxidation of methane [66].

Ref. [56] notes an instrument detection threshold of ~4 × <sup>10</sup><sup>15</sup> molec cm−2; here, we present all values for thoroughness and for the purpose of evaluating whether winter HCHO values from OMI agree with AQS observations [26,72].

Figure 1 shows oversampled OMI 2006–2015 averaged HCHO for the continental U.S. On average, HCHO column amounts are higher where precursor emissions of isoprene are high, e.g., [38]. In particular, the southeastern U.S. shows elevated HCHO column amounts (≥<sup>9</sup> × <sup>10</sup><sup>15</sup> molec cm−2). Amounts in other regions are lower, except in areas in the Western U.S. corresponding to mountainous terrain and national parks, where average amounts exceed 8 × <sup>10</sup><sup>15</sup> molec cm−2. High values may be caused by isoprene emissions and/or anthropogenic emissions associated with industries, such as oil and gas extraction, or caused by direct and precursor emissions from fires [73].

Unless otherwise noted, all analyses were conducted with seasonal average data over a three-month period (DJF: December January February; MAM: March April May; JJA: June July August; SON: September October November). Although the QA4ECV data would be appropriate for monthly average analysis, we chose to average by season due to the limited availability of data from the AQS monitors. We checked the impact of constraining AQS HCHO data to include only measurements taken when local cloudiness was <40%. As we screened for cloudiness in the OMI dataset, we found a high correlation (r = 0.999) between the seasonal mean of all 24 h ground measurements and the seasonal mean of AQS measurements under the same OMI viewing conditions described above, with the subset data having a small mean bias (−0.005 <sup>μ</sup>g m<sup>−</sup>3; Supplementary Material Figure S1). In order to obtain continuous seasonal data, we used all available AQS data with a 24 h measurement frequency.

#### *2.3. Emissions Data*

We compared AQS and OMI HCHO with the EPA's National Emissions Inventory (NEI, available at https://www.epa.gov/; accessed on 23 October 2017). Every three years, the NEI reports the annual summations from different emissions sources for various air pollutants. We considered NEI HCHO emissions from biogenic, anthropogenic, and wildfire sources to assess their contributions to observed HCHO abundance. Biogenic HCHO comes from vegetation and soil, anthropogenic HCHO is mostly due to fuel combustion and transportation, and wildfire is a single category that contributes significantly to HCHO amounts in the western U.S. We considered the 2008 NEI in this study for the diurnal section in which all three diurnal sites have only 2008 HCHO in common. This NEI includes biogenic emissions calculated using the Biogenic Emission Inventory System version 3.14 (BEIS) with land use data from the Biogenic Emissions Land use Database version 3 (BELD3) [74]. We used 2008 NEI county-level data for each station in our diurnal analysis and state-level data for studying seasonal and interannual trends.

#### **3. Diurnal Cycle of HCHO**

Given the limitations of the observational dataset, we focused our analysis on summer months, when data are consistently available. For our evaluation of HCHO diurnal cycles, we used data from three urban AQS sites (St. Louis, Burbank, and Pico Rivera) with 1- and 3 h measurement frequencies available in July, August, and September (JAS) from 2006 to 2010. The St. Louis site is located in St. Louis County in Missouri, while both the Burbank and Pico Rivera sites are located in Los Angeles County in California. We excluded days without full-day measurements (8 measurements per day for three-hour frequency and 24 measurements per day for hourly frequency). Figure 2 shows JAS mean HCHO for each year at three diurnal sites (note the different y-axes for each plot, due to the large differences in HCHO levels across the three sites).

**Figure 2.** HCHO mixing ratios from ground-based AQS monitors at three sites ((**a**): Burbank, CA; (**b**): St. Louis, MO; (**c**): Pico Rivera, CA) from 2006 to 2010. Solid lines show June, August, September (JAS) average diurnal cycle with complete measurements (24 measurements per day for 1 h site or 8 measurements per day for 3 h site). The shaded area indicates standard deviation in that averaging year. Different years are labeled in different colors, indicated in the bottom right. OMI overpass time (13:30 local time) is labeled with a vertical dashed line at each site.

In general, diurnal patterns in HCHO are affected by direct and precursor emissions, chemical reactions, and vertical and horizontal mixing. Biogenic precursor emissions are expected to peak in the mid- to late afternoon when photosynthetically active radiation and temperature are high [75]. Formaldehyde yields from these precursors tend to peak mid-day with elevated isoprene oxidation [76]. Vertical mixing of trace gases such as HCHO and its precursors will also vary as diurnal heating drives changes in mixed layer depth, e.g., ref. [77].

From 2006 to 2010, Burbank and Pico Rivera show clear diurnal trends with peaks ranging between 5 and 30 μg m−<sup>3</sup> around 11:00 LT (Figure 2a,c). Observations at St. Louis indicate a less significant diurnal pattern, with peak values at 14:00 and 20:00 LT in 2010, 9:00 LT in 2007, and all years showing minima between 4:00 and 5:00 LT (Figure 2b). It is noteworthy that for both 2007 and 2008, Burbank shows HCHO values over 15 μg m<sup>−</sup>3, which are almost three-times greater than concentrations observed in other years (Figure 2a). Since Burbank is 5 km southwest of the Angeles National Forest, these high concentrations could be due to significant wildfire emissions in 2007 and 2008. This wildfire enhancement is consistent with the NEI, which indicates that wildfires yielded 296 tons of HCHO for Los Angeles County in 2008, compared to 48 tons in 2011 and 36 tons in 2014. However, Pico Rivera, another station in Los Angeles County, did not record anomalously high HCHO in 2007 and 2008 (Figure 2c). This could be due to the fact that the Pico Rivera monitor is 20 km south of the Angeles National Forest. In Figure 2, we overlaid the OMI overpass

time (13:30 LT) for each AQS site and found that with the exception of the St. Louis site in 2009 and 2010, OMI generally passes the sites after the peak in HCHO.

To better describe the amount of diurnal variability at each site, we quantified the amplitude of the diurnal pattern in AQS data, presented in Table 2. We differentiated between absolute amplitude (Aabs) and relative amplitude (Arel) and calculated mean amplitudes from 2006 to 2010 at the three sites. Amplitudes for 2008 were compared with NEI emissions reported for that year. We defined Aabs as the difference in value between the daily maximum and minimum, in μg m−3, and Arel as the ratio between Aabs and the daily average value to represent the difference between the two, expressed as a percentage. Overall, Burbank and Pico Rivera, with clear diurnal patterns, show larger amplitudes (Aabs of 5.94 μg m−<sup>3</sup> and 4.70 μg m<sup>−</sup>3, respectively, and Arel of 53.92% and 80.47%, respectively) than St. Louis (Aabs of 2.74 μg m−<sup>3</sup> and Arel of 34.70%; Table 2). These differences are likely the result of differences in HCHO sources. We extracted NEI emissions for total VOCs and for HCHO in 2008 where the three sites have data in common (Table 3). In both Los Angeles and St. Louis counties, direct HCHO emissions only account for a small portion compared to total VOC emissions, regardless of emission sector. Overall Los Angeles has four times more total VOC emissions and ten times more direct HCHO emissions than those in St. Louis. Both counties also indicate more anthropogenic sources of HCHO and total VOCs than biogenic sources, with different anthropogenic to biogenic ratios.

**Table 2.** 2006–2010 and 2008 (NEI reported year) mean June, August, September (JAS) diurnal HCHO amplitudes at the three AQS sites in Figure 2. Aabs is the absolute amplitude, and Arel is the relative amplitude.


**Table 3.** 2008 NEI total VOC and HCHO emissions at St. Louis and Los Angeles county, along with anthropogenic to biogenic emissions ratio.


For total VOC emissions, St. Louis has above five-times more anthropogenic emissions than biogenic emissions, in which the major anthropogenic source is on-road, non-diesel, and light-duty vehicles. Direct HCHO emissions at St. Louis also have a higher ratio between anthropogenic and biogenic sources, in which on-road vehicles from the mobile sector contribute to more than half of the anthropogenic HCHO emissions. Although Los Angeles has much higher emissions than St. Louis, it has a lower ratio of anthropogenic to biogenic VOC and HCHO emissions (~2 versus ~5 in St. Louis). As indicated by [78], the reactivity of anthropogenic VOCs (emissions mostly coming from motor vehicles) remains consistent with temperature, but the reactivity of biogenic VOCs grows exponentially with temperature. Therefore, higher contributions of anthropogenic emissions could explain the lack of a diurnal cycle in St. Louis County compared to the two sites in Los Angeles County.

However, since we only use three sites and they are all in urban areas, these results might have a sample size and location bias. Hourly retrievals of HCHO from geostationary satellites, when available, could be used to evaluate how anthropogenic versus biogenic emissions affect the diurnal cycle of HCHO over wider areas.

#### **4. Regional HCHO Seasonality Analysis**

To assess seasonal patterns in HCHO, we divided the continental U.S. into four geographical regions (shown in Figure 1) generally following regions defined by the U.S. Census. There are 6 AQS stations in the Midwest; these have a humid continental climate (hot summer) and crops as the dominant Plant Functional Type (PFT). There are 15 AQS sites in the Northeast; these have humid continental climate with a cool summer, and broadleaf trees as the dominant PFT. The southeast has 14 stations, a humid subtropical climate, and a mixture of broadleaf and fine leaf trees, shrubs, crops, and grass as the PFTs. The west has 10 sites, a mixture of Mediterranean, semi-arid, and desert climate, and shrubs and grassland as the PFTs. Since AQS sites are not evenly distributed in each zone, they might not be representative of the entire region. For consistency, we used OMI coincident pixels at these 45 AQS sites for all of the AQS-OMI comparison analyses.

#### *4.1. Overall AQS-OMI Seasonal Correlation*

Figure 3 compares HCHO AQS and OMI observations for each season at 45 ground monitor stations. Abundance of HCHO varies seasonally, with greater amounts of biogenic precursors in warm seasons at all sampling sites (corresponding with the 45 ground monitors). To obtain continuous winter averages, we combined January and February data with December data from the previous year. For these seasonal evaluations, we considered 2007 to be the first year of analysis, which includes December 2006. Each symbol represents seasonal HCHO averaged from 2007 to 2015 at one AQS station, with symbols color-coded by region.

As shown in Figure 3, correlations between AQS and OMI peak in summer and drop to a minimum in winter. This result is consistent with [35], who reported a larger contribution of near-surface HCHO in summer months, and a lower vertical gradient in winter. The summer has a larger fraction of column HCHO in the boundary layer, consistent with the positive correlation between AQS and OMI in warm months.

We find a positive correlation between AQS and OMI HCHO in all four geographical regions for every season except for winter, where the correlations becomes negative or insignificant for most regions. This could be due to the fact that winter has (1) greater solar zenith angles in the northern hemisphere, (2) frequent cloud coverage, and (3) lower HCHO emissions that are below the OMI detection limit. AQS and OMI have consistently high agreement in the Northeast (r = 0.49, 0.65, 0.87, and 0.56 for DJF, MAM, JJA, and SON, respectively), and a slightly weaker but consistent correlation in the west (r = 0.25, 0.61, 0.71, and 0.67 for DJF, MAM, JJA, and SON, respectively). AQS and OMI show high agreement in the Midwest and southeast during JJA, but weak or even negative correlation in other seasons.

**Figure 3.** Spatial correlation of 2007–2015 average HCHO from 45 AQS sites (*x*-axis) and coincident OMI pixels (*y*-axis). Each point represents a single site, with OMI and AQS values averaged over the season. Points are color coded to reflect their region, and spatial correlation coefficient r values in different regions are shown in the legend. Each panel represents average values for: (**a**) winter, (**b**) spring, (**c**) summer, and (**d**) autumn.

#### *4.2. Seasonal Variability*

We evaluated seasonal variability by comparing the HCHO ratios of summer to the other three seasons. Previous studies indicate that in the U.S., summertime HCHO amounts are higher than in winter due to higher summertime temperatures, leading to an increase in biogenic VOC emissions and HCHO production [4,58,72]. This increase is characterized by the seasonal variability in surface-level HCHO, estimated by the GEOS-Chem model using the ratio of yearly mean to summer amounts. For both AQS (measured in μg m−3) and OMI (measured in 1015 molec cm−2), we calculated the unitless ratio of the summer JJA average to winter DJF average, JJA to spring MAM average, and JJA to autumn SON average. These ratios are given in Figure 4 for each region as box plots.

**Figure 4.** Summer to all other season ratios from AQS and OMI for the Midwest (**a**), northeast (**b**), west (**c**), and southeast (**d**). Red lines indicate regional median, green triangles indicate regional mean, and cycles represent extreme values.

Overall, as seen in Figure 4, both AQS and OMI HCHO have summer to other season ratios >1 in all four regions, with peak ratios in JJA/DJF and similar ratios between JJA/MAM and JJA/SON, indicating a clear seasonal cycle with maximum HCHO in summer, minimum HCHO in winter, and similar amounts in spring and autumn. For all four regions, OMI shows more pronounced ratios than AQS, which reflects the greater variability in column amounts compared to near-surface amounts seen in Figure 3. Among those regions, OMI ratios mostly bias high in the higher latitude regions in the Midwest and Northeast by a factor or 1.3 to 2.8 depending on different seasons compared to AQS, despite these regions having strong correlation between AQS and OMI in summer from Figure 3. OMI ratios have less overestimation in the two lower latitude regions in the southeast and west, though still bias high with a factor between 1.2 to 1.3, with larger bias in winter (up to 1.9 greater than AQS ratios).

#### **5. Interannual Trends**

In Section 4.2, we showed that HCHO exhibits a strong seasonal cycle. To assess HCHO's interannual variability between 2006 and 2015, we deseasonalized the seasonal mean values for each region, using observations from all 45 AQS stations and collocated OMI data. We also calculated the line of best fit for the deseasonalized data in each region. Endpoints are SON average 2006 and SON 2015 to avoid seasonality affecting the trend. Figure 5 shows the mean deseasonalized HCHO for AQS and OMI in each region (solid lines). These plots are overlaid with collocated seasonal average 2 m temperature from the North American Regional Reanalysis (NARR) [79], to indicate the role of temperature in the variability of each data set. The slopes from the linear regression and the associated standard errors are indicated in each panel.

**Figure 5.** 2006–2015 AQS and OMI HCHO deseasonalized interannual trends averaged by season in the (**a**) Midwest, (**b**) northeast, (**c**) west, and (**d**) southeast. Blue lines indicate AQS trends, red lines represent OMI trends, and yellow represents 2 m temperature from the North American Regional Reanalysis (NARR) data, sampled at the AQS monitor locations. Solid lines are the mean seasonal HCHO abundance or temperatures, and shaded areas are the standard deviation. Region name and number of stations in the region are labeled at the top of each figure, along with the slope of the line of best fit and the standard errors associated with the slopes.

After deseasonlizing the data, AQS and OMI show some consistency in certain periods. For example, both AQS and OMI captured the double peak in HCHO in 2010–2012 over the Midwest (Figure 5a). Both AQS and OMI HCHO also show elevated amounts in the west in summer of 2008 (Figure 5c), possibly corresponding to local fire emissions. However, the vertical and horizontal transport of VOC emissions from fires are more likely captured by satellite observations, and then affect regional trends downstream of fires more in the OMI record than in AQS.

During the entire ten-year period, the slopes from the linear regression on AQS and OMI do not aways agree. Since the standard errors associated with the slopes are rather large, this indicates HCHO did not have a significant and monotonic drift during 2006–2015. There are two exceptions: AQS in the west does indicate a significant increasing trend at a rate of 0.059 ± 0.019 <sup>μ</sup>g m−<sup>3</sup> yr−1, and AQS in the southeast is decreasing at a rate of 0.029 ± 0.027 <sup>μ</sup>g m−<sup>3</sup> yr−<sup>1</sup> from 2006 to 2015. The large standard error in the southeast AQS data could partially come from the winter in 2014, when the regional mean is skewed by a few exceptionally high values.

In all regions, the increase in HCHO is partially explained by temperature trends, as shown by the NARR data. The west, which shows a significant increase in AQS HCHO, also shows the greatest warming at a rate of 0.181 ± 0.046 ◦C yr−1. However, this large increase is not captured by OMI, despite OMI having high agreement in seasonal variability with AQS. The interannual variations in temperature are also more quiescent in the western U.S. than in other regions, which could partially explain smaller year-to-year variations in HCHO in the West.

We used the same best fit linear regression to consider interannual changes for each season (Figure 6) and yearly mean for every grid box from OMI. We see a significant increase in California at around 0.3 × <sup>10</sup><sup>15</sup> molec cm−<sup>2</sup> per year; this is mostly contributed by an increase in MAM. The broader southeast and northeast regions experienced a strong increase in HCHO during DJF but were offset by a strong decrease in JJA. Such a decrease in HCHO could be partially explained by summertime cooling in the southeast U.S. over the past few decades [80,81].

**Figure 6.** Map of the U.S. with 2006–2015 OMI HCHO trend in (**a**) all seasons, (**b**) winter, (**c**) spring, (**d**) summer, and (**e**) autumn.

Though OMI (co-located with AQS sites) indicates a decreasing trend that is opposite to AQS and NARR in the west (Figure 5c), there is a significant increase in OMI HCHO over large area of California, Oregon and Washington (Figure 6). Thus, using the OMI co-located pixels at AQS sites might not be indicative of the entire region.

#### **6. Discussion**

The evaluation of diurnal, seasonal, and interannual HCHO suggests that satellitederived HCHO serves as a useful indicator for surface HCHO change on seasonal to interannual timescales. While surface-to-column agreement varies in space and time, the

combined analysis of these two datasets informs the chemical and meteorological processes that impact HCHO.

This study builds on the earlier work of [72], with a greater focus on seasonality, temperature, and comparison between satellite and ground-based data. Whereas ref. [72] evaluated May-September change of temperature-corrected HCHO, we examine all seasons without removing temperature effects with the goal of characterizing the degree to which satellite data can inform near-surface trends and patterns.

The weighting function for OMI HCHO peaks in the upper troposphere [26], so it is well established that satellite data does not capture the same surface air as measured by the AQS monitors. With such differing physical characteristics between observation methods, it would be challenging to reconstruct ground level HCHO merely from satellite data. For example, ref. [82] reviewed methodologies of calculating surface level PM2.5 from satellite observed aerosol optical depth, which requires either a chemistry transport model or a statistical model based on empirical relations from the existing data. Future work to derive surface level HCHO from satellite observations could be made possible with such large datasets, such as the newly available EPA Air Quality Time Series (EQATES) project (available online, https://www.epa.gov/cmaq/equates; accessed on 21 April 2022).

Additionally, validation studies have shown that satellite HCHO tends to be biased low when column amounts are high (e.g., summer) and biased high when column amounts are low (e.g., winter), which would tend to dampen the OMI HCHO seasonal cycle [83–85]. Our observational results corroborate [35], who found that surface HCHO is a more significant contributor to column HCHO in the summer, based on model simulations. Despite the low agreement between AQS and OMI data in the winter, both datasets capture a consistent seasonal cycle and consistent interannual trends over certain periods.

Our findings suggest that temperature sensitivity of column HCHO is greater than near-surface HCHO. The seasonal amplitude of HCHO is higher in OMI data than in AQS in all regions, which may be due to the larger role of secondary HCHO in the column versus the surface. Furthermore, the warming across the 2006–2015 period leads to a stronger increase in the HCHO column in most regions, and a less-pronounced increase (or decrease) in the AQS monitor data, except in the west. The authors in [46] found that column NO2 is more sensitive to temperature than surface NO2, both in observations (monitors versus satellite) and in a numerical model; the same appears to be true for HCHO. As discussed by [72,86], trends in HCHO are attributable to a range of land use and emissions changes, independent of temperature. The impact of local emissions is evident in our results as well. In our analysis of diurnal HCHO at three sites with sufficient ground-data (Section 3), we found a mid-day peak in HCHO only at sites dominated by biogenic VOC emissions (Burbank and Pico Rivera in California). The increased temperature in the middle of the day accelerates the emission and oxidation of isoprene, consistent with [76]. Although currentgeneration satellites provide daily (or less frequent) HCHO data, the sensitivity of HCHO to temperature, including as a function of biogenic VOC emissions versus anthropogenic VOC emissions, would be a valuable application of future hourly HCHO observations from geostationary satellites.

#### **7. Conclusions**

As a pollutant with direct health impacts and an indicator of ozone formation, HCHO has emerged as an atmospheric species of interest for air quality management. Ground monitoring data of HCHO is limited, and satellite data may complement the sparse network with spatially continuous information on HCHO abundance. Multiple satellites can detect HCHO, but these data are only beginning to be applied to operational and health-relevant applications. For user communities interested in the interpretation of satellite data, a key question is the agreement in spatial and temporal patterns between ground-based and space-based measurements. Our comparison focuses on the utility of satellite data to inform patterns, trends, and processes of ground-based HCHO across the U.S.

Over our study period, HCHO data were available from 338 sites managed by the EPA. However, only 45 of these had continuous seasonal measurements in the time range, and only three stations provided hourly and three-hour measurement frequencies that had more than 10% data available and passed quality control. Of those three sites, the two sites with larger diurnal amplitudes (in Los Angeles County) had a lower anthropogenic/biogenic ratio for both direct HCHO emissions and for total VOC emissions, as compared to the site with the smaller diurnal amplitude (in St. Louis county). These relative diurnal changes indicate that the origin of VOC emissions may be an important driver of diurnal HCHO patterns. In addition to emissions, chemistry and meteorology also play important roles in affecting the diurnal cycle of HCHO. With the upcoming availability of hourly HCHO data from TEMPO, Sentinel-4, and GEMS, it will be interesting to assess how the diurnal amplitude of HCHO changes between areas dominated by biogenic versus anthropogenic emissions. We expect that areas with larger anthropogenic emissions will exhibit a weaker diurnal signal.

On a seasonal basis, OMI exhibits the highest correlation with AQS in summer and the lowest correlation in winter. These results are consistent with past work indicating that boundary layer HCHO is a greater contributor to the summertime HCHO column and less so in the winter [35]. Combining summer to other season ratios showed OMI bias is high compared to AQS in all regions, but with different factors depending on different regions. Seasonal differences across the regions are likely due to the differences in dominant plant types in each area, as well as VOC emissions in young versus mature trees [87]. The overestimation of the ratios from OMI suggests a more pronounced sensitivity to temperature in the HCHO column than in surface HCHO concentrations.

There are emerging opportunities to study HCHO and its trends in the near future from both ground- and space-based platforms. In 2021, U.S. states, in coordination with the EPA, established 27 new Photochemical Assessment Monitoring Stations (PAMS), which provide hourly ground-based HCHO measurements (official and supporting documents available at https://www.regulations.gov/document/EPA-HQ-OAR-2019-0137-0013; accessed on 22 April 2022). Additionally, ozone nonattainment areas have been required to develop Enhanced Monitoring Plans (EMP), which would expand observations of meteorology and VOCs, potentially including observations of columnar VOCs from PANDORA spectrometers [88]. In addition to these monitoring programs, next generation satellitebased observations are expected to provide high-resolution and hourly column HCHO measurements. Further analysis of space- versus ground-based measurements, using these next-generation platforms, will maximize the relevance of Earth-observing satellites to air quality and public health user communities.

**Supplementary Materials:** The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/rs14092191/s1, Figure S1: All available AQS data from 2006 to 2015 calculated for every season (*x*-axis) compared with seasonal mean AQS under same OMI viewing conditions (*y*-axis), with correlation coefficient r = 0.999, mean bias = <sup>−</sup>0.005 <sup>μ</sup>g m<sup>−</sup>3. Red dotted line indicates 1-1 ratio.

**Author Contributions:** Conceptualization, T.H.; Data curation, I.D.S.; Formal analysis, P.W. and M.B.; Funding acquisition, T.H.; Investigation, P.W.; Methodology, T.H.; Resources, M.H. and I.D.S.; Software, P.W.; Supervision, T.H.; Visualization, P.W. and M.B.; Writing—original draft, P.W. and M.H.; Writing—review & editing, T.H., M.B., M.H. and I.D.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research and the APC were funded by the NASA Health and Air Quality Applied Sciences Team (HAQAST), grant numbers NNX16AQ92G and 80NSSC21K0427, as well as the Wisconsin Hilldale Undergraduate/Faculty Research Fellowship.

**Data Availability Statement:** Data and code for this study is available at https://doi.org/10.5281/ zenodo.6499349 (accessed on 15 April 2022).

**Acknowledgments:** The authors acknowledge the processing of AQS data by Madeleine Strum from EPA. The authors appreciate the editorial contributions by Daegan Miller. Ground monitor data for HCHO is available at EPA AMA (https://www3.epa.gov/ttnamti1/toxdat.html#data; accessed on 30 August 2017), Satellite HCHO observations come from QA4ECV (available at http://www.qa4 ecv.eu; accessed on 24 July 2019). EPA's NEI data is available at https://www.epa.gov (accessed on 23 October 2017). NOAA North American Regional Reanalysis data is available at https://www.esrl. noaa.gov (accessed on 28 October 2016).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **The Effect of Urban Form on PM2.5 Concentration: Evidence from China's 340 Prefecture-Level Cities**

**Ying Liu 1, Lijie He 2, Wenmin Qin 3, Aiwen Lin <sup>4</sup> and Yanzhao Yang 1,\***


**Abstract:** Exploring how urban form affects the Particulate Matter 2.5 (PM2.5) concentration could help to find environmentally friendly urbanization. According to the definition of geography, this paper constructs a comprehensive urban form evaluation index system applicable to many aspects. Four urban form metrics, as well as road density and five control variables are selected. Based on 2015 data on China's 340 prefecture-level cities, the spatial regression model and geographically weighted regression model were used to explore the relationship between the urban form evaluation index system and PM2.5 pollution. The main results show that the spatial distribution of PM2.5 in China follows an increasing trend from northwest to southeast. Urban form indicators such as AI, LPI, PLAND, LSI and road density were all significantly related to PM2.5 concentrations. More compact urban construction, lower fragmentation of urban land, and lower density of the road network are conducive factors for improving air quality conditions. In addition, affected by seasonal changes, the correlation between urban form and PM2.5 concentration in spring and winter is higher than that in summer and winter. This study confirmed that a reasonable urban planning strategies are very important for improving air quality.

**Keywords:** urban form; PM2.5; landscape metrics; geographically weighted regression

#### **1. Introduction**

In the past several decades, with the rapid urbanization and industrialization across many regions of the world, atmospheric pollution became increasingly serious and is already a major social problem [1,2]. Especially in China, as the largest developing country, the rate of urbanization increased from 17.9% to 54.8% in years between 1978 and 2015 years and continues to increase [3,4]. The average Particulate Matter 2.5 (PM2.5) concentration in cities reached 62 μgm−3, 60.8 μgm−3, and 57 μgm−<sup>3</sup> in 2013, 2014, and 2015, respectively. China became the most PM2.5-polluted area in the world [5]. PM2.5 is considered one of the most important pollutants because of its indirect impacts on health, agriculture production, atmospheric visibility, and climate environment [6,7]. In 2001–2006, 165 prefectures' annual PM2.5 levels had far and away beyond the national atmosphere quality standard of China (NAQSC, annual average < 35 μgm−3) [8]. Many studies showed that PM2.5 is a key atmosphere pollutant threatening public health [9,10]. An increase of PM2.5 concentration by 10 μg/m3 causes a 0.40% increase in all-course mortality, a 1.43% increase in deaths caused by respiratory failure, and a 0.53% increase in deaths caused by cardiovascular failure [11]. It is estimated that PM2.5 pollution caused 1.2 million premature deaths in China in 2010 and nearly 35% of worldwide deaths [12].

To explore the factors that affect PM2.5 concentration can help to better analyze the effects of PM2.5 pollution. A number of literatures showed that human activities and

**Citation:** Liu, Y.; He, L.; Qin, W.; Lin, A.; Yang, Y. The Effect of Urban Form on PM2.5 Concentration: Evidence from China's 340 Prefecture-Level Cities. *Remote Sens.* **2022**, *14*, 7. https://doi.org/ 10.3390/rs14010007

Academic Editors: Maria João Costa and Daniele Bortoli

Received: 21 October 2021 Accepted: 17 December 2021 Published: 21 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

natural factors act on PM2.5 pollution concentrations either through direct or indirect influences. These influences may be social economy [2,13], the industrial structure [5], climate change [14], seasonality [15], or the prevalence of monsoons [16]. For example, Xu found that economic growth had the large impact on PM2.5 [5]. Most studies found a clear reverse "U-shaped" curve between economic development levels, urbanization, and atmosphere pollution, and with improving economy levels, most cities are at a stage of increasing pollution levels [2,17]. It was confirmed that PM2.5 pollution is influenced by seasons and regions, and the highest levels were found in winter despite differences in temperature and relative humidity among different regions. He and his colleague applied the global regression model, and found that increased fossil energy consumption leads to an increase in PM2.5 concentrations, while elevation, precipitation, temperature, and GDP per capita are all likely to reduce the impact of PM2.5 pollution [18].

Moreover, many studies showed that urban planning exert a positive effect on the reduction of PM2.5 levels over recent years. Examples for these factors are a reasonable urban form, the moderate reduction of road density, building density, population ratios, and improving green spaces [19]. Of course, there are also studies that used comprehensive indicators in the urbanization process to explore the impact on PM2.5, for example, the Liveability and Health Index (LHI) [20]. Urban form, which includes a city area as well as its shape and layout, can be defined as urban land use organization and human activity arrangement [21], and it is usually measured by several landscape indicators of a city. PM2.5 pollution is affected by vehicle use, green land, pollutant diffusion, and the heat island effect [22]. Research proved that higher urban compactness and less fragmentation (i.e., the largest patch index (LPI)) can reduce PM2.5 pollution in China [23,24]. However, other studies argued that motor vehicles are the main cause of atmospheric pollutants emission in cities, and there is a strong correlation between PM2.5 and mortality in the traffic emission [25]; thus, a more compact development alone may still increase local PM2.5 concentrations and also cause more population to be affected by PM2.5 [26]. In the USA, controlling the population, the level of urbanization, and the mixing of different land cover types were found to be important influencing factors between pollutant levels and atmospheric quality [27,28]. In addition, the distance from the main road, the standard deviation of the building floor, and the average floor are the main urban morphological characteristics that affect the spatial variation of a variety of pollutants [29]. In the above analysis, because of the complexity of socioeconomic and natural conditions, the relationship between urban form and PM2.5 may be inconsistent and complex. Scientific urban planning can effectively reduce urban PM2.5. Therefore, it is necessary to explore the effect of urban form on PM2.5.

Investigating PM2.5 concentration is important for research. A large number of experiments used to study the relationship between PM2.5 and urban form to identify better approaches for reducing atmosphere pollution. Most studies used linear regression models to analyze the urban form indexes that are related to PM2.5 pollution and estimated the coefficient of form indexes in the model [30]. In addition, spatial econometric models and the Environmental Kuznets curve (EKC) hypothesis were used to study the socioeconomic and natural factors on urban atmosphere pollution [13,31]. Most studies mainly obtained urban form data through urban land use data and calculated the urban landscape pattern index to represent specific characteristics of the urban form. Most models selected class area (CA), number of patches (NP), patch density (PD), LPI, area-weighted mean shape index (AWMSI), percentage of landscape (PLAND), aggregation index (AI), landscape shape index (LSI), contiguity index (CONTIG), effective mesh size (MESH), interspersion juxtaposition index (IJI), and other landscape pattern indexes [32,33]. PM2.5 data are obtained in three main ways: monitoring real-time data through environmental observation sites, monitoring through experimental instruments, and estimating PM2.5 concentrations using atmospheric aerosol optical depth (AOD) data obtained from remote sensing images [34]. The latter can compensate for the shortcomings of experimental technology and ground monitoring sites and provides large-scale and real-time continuous observation data [35]. Although many studies explored the correlation between urban form and PM2.5 pollution

at different scales from different-sized cities to urban agglomerations to countries, more specific urban form indicators need to be analyzed in the case of China. For example, there is a lack of variables explaining the effects of local meteorological conditions on the spatial aggregation and dispersion of PM2.5 pollution. The geographical conditions of China are complex, and the urban forms of different geographical locations vary greatly; therefore, it is necessary to develop more specific urban form indicators to evaluate the shape of cities from different aspects and to measure the urban forms of China more adequately.

To develop more effective urban development strategies that can alleviate air pollution, as well as integrate the strengths and weaknesses of previous scientific research, this paper assesses the relationship between urban form and PM2.5. For this, multisource data is used to establish an index system of urban form. Estimates of PM2.5 concentrations are based on AOD data. Based on 340 prefecture-level cities in China, the linear regression model is applied to study the correlation between urban form and PM2.5. Next, a geographically weighted regression (GWR) model is used to analyze the geographical differentiation of the impact urban form exerts on pollutant emissions. PM2.5 concentration data and other natural factors are derived from satellite-derived data. Then, a comprehensive evaluation system of urban form indicators was established by using land use data and road network data. Next, the results of the linear regression method and GWR model between urban form and PM2.5 concentration are analyzed. The paper ends with a discussion of the research results and presents relevant policy implications.

#### **2. Data and Methods**

#### *2.1. Data Sources*

Data were collected on the scale of 340 prefecture-level cities to explore the effect of urban form on PM2.5 concentration of this study. Table 1 provides the data framework and variables abbreviations for the study. By reference to other research findings as well as our own experimental results [21,36], this research assumes that the urban form will affect the PM2.5 concentration through road density, AI, LPI, PLAND, LSI, population density, per capita GDP, and the proportion of secondary industry. Compared with that of other studies, these indicators are combined from the perspective of urban form and economic development. Then, these eight aspects are combined to quantify urban form indicators. In China, different regions have large climatic differences. Both temperature and precipitation significantly impact atmosphere pollution [37], and therefore, these are used as explanatory variables.


**Table 1.** Abbreviation summary.

#### 2.1.1. PM2.5 Concentration Data Estimation

Many studies used AOD to estimate the concentration of PM2.5 pollution, and many experiments proved a significant correlation between PM2.5 levels and satellite-obtained AOD. Therefore, satellite remote sensing AOD is an effective tool for PM2.5 pollution monitoring. In this study, MERRA-2 was mainly used to estimate PM2.5 levels. The dataset was compiled by NASA's Goddard Earth Science Data and Information Ser-vice Center (GESDISC, https://daac.gsfc.nasa.gov/ (accessed on 10 August 2021)), used an upgraded version of the Goddard Earth Observing System Model, Version 5 (GEOS-5) data assimilation system. The spatial resolution of MERRA2 data is 0.5◦ × 0.625◦, and the temporal resolution is daily. Combined with the GEOS-5 model, the annual and monthly averages of near-surface PM2.5 concentrations were obtained from the AOD observations in the MERRA-2 data set. For data verification, 200 prefecture-level cities were selected, and their urban pollution point data from the China National Environmental Monitoring Centre were compared to the satellite data. As shown in Figure 1, data verification found a high correlation coefficient (R<sup>2</sup> = 0.688) between the PM2.5 obtained by satellite and field observations. Thus, satellite data can be used to analyze spatial distribution characteristics. Because of the lack of collection and estimation of nitrate aerosols in the adopted data sources and models, the overall PM2.5 value is low (accounting for only 20% of the PM2.5 level on severely polluted days), but this was shown to impose little influence on the overall spatial distribution pattern [38].

**Figure 1.** Scatter plot of observation data vs. satellite-retrieved PM2.5 data. Note: \*\* indicate significance levels at 5% levels.

#### 2.1.2. Urban Form Data

Urban form can be defined as the physical characteristics of urban built-up areas, such as their size, shape, and density. Related research on the relationship between urban form and atmospheric quality showed that atmospheric quality is closely related to the fragmentation, size, shape, accessibility, and continuity of the urban form [23]. The landscape patch index is widely used to describe the characteristics of urban land use, as it can scientifically represent the urban form. In the landscape patch of construction land, total built-up area (TA), patch density (PD), mean patch area (MPA), PLAND, LPI, area weighted mean fractal dimension (AWMFD), edge density (ED), LSI, and AI were used to represent the fragmentation, size, shape, accessibility, and continuity of the urban form [39]. Those indices were calculated based on the land use and land cover dataset (1 × 1 km), obtained by remote sensing classification from Landsat 8 data of 2015.

However, multiple collinearity among these indexes is high. To avoid downstream problems, four indicators were selected based on a variance inflation factor of less than 7.5 [40]. AI indicates the degree of land concentration, and values range within 0–100 (the higher the value, the better the urban land use connectivity). LPI represents the proportion of the maximum patch area to the total land area, with a value range of 0–100 (the higher the value, the lower the city continuity and fragmentation). PLAND is used to measure the size of the area occupied by urban construction land in the whole urban landscape, with a value range of 0–100. LSI indicates the complexity of the shape of a city (the larger the value, the more fragmented urban construction land). These above-mentioned four indicators (i.e., AI, LPI, PLAND, and LSI) represent the expansion of urban construction area, the compactness and fragmentation of construction form, and the complexity of the internal landscape.

In addition, road density is used as a representation of the scale of the urban road network, which is a good measure of urban traffic accessibility. A large amount of traffic increases the concentration of atmosphere pollution, especially in areas next to roads; car exhausts discharge into the atmosphere, and the movement of cars transports dust from the ground into the air, which causes atmospheric pollution [41]. The road network of China was downloaded from OpenStreetMap (https://www.openstreetmap.org/ (accessed on 10 August 2021)) and computed through ArcGIS 10.x platform. Table 2 provides the calculation method and simple description of 5 urban form indexes.


**Table 2.** Main urban form indexes in this study.

#### 2.1.3. Control Variables

The level of air pollution emissions was influenced by many variables indirectly related to urban form. Therefore, it is necessary to employ a more accurate statistical assessment of the association between urban form and air pollution using control variables.

As socioeconomic data, this paper mainly selects GDP per capita, population density, and the secondary industry proportion (SIP). GDP per capita refers to the economic development of a city. Economic development is the ultimate goal of urban development. According to the EKC, China's economy is ahead of the EKC peak; thus, economic development causes more energy consumption and construction activities, which may be the main source of urban PM2.5 pollution [42]. Population density is defined as the number of people per unit of land area, and a significant correlation between population density and atmospheric pollution was found [22]. This paper mainly adopts the population density of urban built-up areas, which is an important indicator of the current status of urban population distribution. SIP refers to the proportion of the output value of the secondary industry within the total industrial output value, which is an important source of urban atmospheric pollutants. The secondary industry encompasses many energy-intensive industries, mainly fossil-fuel power plants, steel mills, cement plants, and chemical plants [2]. The Overall Energy Balance Sheet for National Bureau of Statistics showed that nearly 70% of China's energy consumption is concentrated in the secondary industry. Therefore, compared with that of other industries, the secondary industry emits more atmospheric pollutants [43]. The data of these three indicators all originate from the "China City Statistical Yearbook" of 2015, where missing and partially imputed data were replaced with relevant data from adjacent years (which was the case in 1.5% of the sample).

As natural factors, temperature (TEM) and precipitation (PRE) were selected to measure the city's climatic characteristics. Meteorological factors play an important role in the concentrations of PM2.5 in China (more precipitation and the lower the temperature, the lowed the concentration of pollutants) [37]. Latitude and longitude grid data of China were extracted from the acquired MERRA-2 data set, and interpolation processing and regional statistics at 340 prefecture-level cities were performed in the ArcGIS 10.2 software.

#### *2.2. Statistical Models*

China has obvious characteristics of regional differentiation, and atmosphere pollution also presents typical regional characteristics. Air pollution between regions has a strong spatial correlation; thus, the air pollution concentration of a city will affect the air quality of nearby cities.

As typical global linear regression model, the ordinary least squares (OLS) model is a common method to quantify the statistical relationship between independent and dependent variables. OLS can be used to study the correlation between urban form and PM2.5. However, the OLS model ignores the influence of spatial heterogeneity, which may lead to evaluation bias [39]. Because of the existing spatial correlation among influencing factors, several spatial regression models were selected to solve the problem by controlling these potential spatial correlations. This paper used the spatial lag model (SLM) and the spatial error model (SEM). SLM explains the influence of variables of the surrounding area by adding lag variables to the model, while SEM considers the spatial dependence of dependent variables (that may otherwise be missed) by adding error terms to the model. The OLS model can be described as:

$$\mathbf{S} = \beta\_{\rm n} + \sum\_{m=1}^{\rho} \beta\_{\rm m} \mathbf{a}\_{m} + \varepsilon \tag{1}$$

where S is the dependent variable, *β<sup>n</sup>* is the intercept, *β<sup>m</sup>* is the regression coefficient corresponding to the explanatory variable *m*, and *ε* is the random error value. This model can represent the intensity of the relationship between PM2.5 and urban form indicators.

The SLM model can be expressed as:

$$\mathbf{y} = \beta\_0 + \mu \sum\_{i=1}^{p} \mathcal{W}\_i \mathbf{y} + \sum\_{i=1}^{p} \beta\_i \mathbf{x}\_i + \varepsilon \tag{2}$$

where *μ* is the regression coefficient of the spatial lag term, representing the influencing degree of adjacent spatial units on the spatial unit. This value has certain directivity, and the larger the spatial influence degree, the greater the spatial influence degree. *W* is the spatial weight matrix of *n* × *n*, and *Wi* is the spatial lag dependent variable of the spatial weight matrix *W*. The parameter *β* mainly reflects the influence of the independent variable on the dependent variable and the effect of spatial distance on each spatial unit. In this model, inverse distance was used as the weight of the spatial lag term.

The SEM model can be represented as follows:

$$y = \beta\_0 + \sum\_{i=1}^{p} \beta\_i x\_i + \omega \sum\_{i=1}^{p} W'\_i \varepsilon \tag{3}$$

where *y* is the dependent variable. *W* is the spatial weight matrix, where the inverse distance was used to calculate the spatial error matrix. *β*<sup>0</sup> is a normal distributed random error vector. Parameter *β<sup>i</sup>* is the influence coefficient of independent variable *x* on dependent variable *y*, and *W <sup>i</sup>* is the spatial error coefficient of the dependent variable vector, which represents the spatial autocorrelation of the spatial error.

All results of the three regression models can be used to explain the relationship between dependent and independent variables, and the statistical results can be compared by the measurement coefficient (R2) and Akaike information criterion (AIC) in the model. Both values can be measured relative to a model that is more suitable for this paper. The higher the R2 value, the smaller the AIC value, indicating that the model is more suitable. All calculation procedures are conducted in Geoda 2017 software.

#### *2.3. Geographically Weighted Regression*

In this study, atmospheric pollution presents typical regional characteristics. In other words, the air quality between neighboring cities is geographically closely related. Regression analysis assumes that the regression parameters have no relationship with the geographic location of the sample data, and the spatial characteristics are not considered. The research results do not reflect geographic location characteristics well. In addition, as a spatial autocorrelation index, the results of the bivariate Moran index of PM2.5 and 10 indicators are statistically significant (e.g., LPI, AI, and PLAND) and have obvious spatial autocorrelation.

To identify the influence of spatial location, GWR is used to assess the influence of urban morphology of different regions on PM2.5 concentration. GWR is an extension of the OLS linear regression model. It uses local regression, embeds spatial position information of the data into the regression parameters, establishes the local weight of the spatial position matrix, and estimates the regression parameters point by point through the local weighted least squares method, to quantify spatial heterogeneity. The model construction is expressed as follows:

$$y\_i = \beta\_o(u\_{i\prime}v\_i) + \sum\_{z=1}^n \beta\_z(u\_{i\prime}v\_i)x\_{iz} + \varepsilon\_i \tag{4}$$

where the dependent variable *yi* represents the PM2.5 concentration of city *i*, *βo*(*ui*, *vi*) represents the constant term of city *i*, *xiz* represents the explanatory variable, *βz*(*ui*, *vi*) represents the regression parameter of the independent variable at the data sampling point, and *ε<sup>i</sup>* Represents the accumulation error term.

The parameter *β <sup>f</sup>*(*ui*, *vi*) can be estimated by the following formula:

$$\mathcal{S}\_f(\boldsymbol{u}\_{i\prime}\boldsymbol{v}\_i) = \left(\boldsymbol{X}^T\mathcal{W}(\boldsymbol{u}\_{i\prime}\boldsymbol{v}\_i)\boldsymbol{X}\right)^{-1}\boldsymbol{X}^T\mathcal{W}(\boldsymbol{u}\_{i\prime}\boldsymbol{v}\_i)\boldsymbol{y} \tag{5}$$

where *β <sup>f</sup>*(*ui*, *vi*) is the parameter estimation value of (*ui*, *vi*), *W*(*ui*, *vi*) is an *n* × *n* spatial weight matrix, the nondiagonal original element value of which is 0, and the diagonal element data is the spatial weight of the observation data of city *i*. The choice of the spatial weight function is the core of GWR model estimation, and directly determines the correctness of the model parameter estimation. To avoid estimation error caused by less sample data around individual sampling points, this model uses the Gaussian kernel function as the spatial weight function:

$$\mathcal{W}\_{ij} = \begin{cases} \left[ 1 - \left( d\_{si} / d\_{\text{max}} \right)^2 \right]^2 & d\_{si} \le d\_{\text{max}} \\ 0 & otherwise \end{cases} \tag{6}$$

where *dsi* represents the distance between sampling points *s* and *i*, and *dmax* represents the maximum distance between neighboring cities and the city to be assessed.

For the GWR model, bandwidth is important for determining the spatial weight calculation scheme. The smoothness of the model is controlled by bandwidth. Different spatial weighting functions are used to obtain different bandwidths. Fotheringham proposed how to obtain the optimal bandwidth [44]. The standard is the best bandwidth when the AIC of the GWR model is smallest. Therefore, AIC is used to determine the bandwidth.

The formula of the AIC is shown in the following:

$$\text{AICC} = -2\text{InL}\,\theta L\,\text{x} + 2q\,\text{y} \tag{7}$$

where L ˆ *θL*, *x* is the likelihood function of the model, ˆ *θ<sup>L</sup>* is the maximum likelihood estimation of *θ*, *x* is a random sample, and *q* is the number of unknown parameters. The GWR tool in ArcGIS 10.2 was used to build the model.

#### **3. Results**

*3.1. Spatial Distribution Characteristics of PM2.5*

Figure 2 illustrates the geographic distribution of the average PM2.5 concentrations of China's cities in 2015, clearly indicating that the spatial distribution of PM2.5 is heterogeneous. Overall, cities in eastern China tend to have higher PM2.5 levels than cities in western China, and cities in northern China have higher PM2.5 levels than cities in southern China. Specifically, areas with highest PM2.5 levels are concentrated in the North China Plain and Sichuan Basin, as well as in parts of the Northwest China. Among these, the high PM2.5 level-area of the northwest region is mainly caused by the Taklamakan desert (the world's second largest desert), where perennial wind and sand influx causes extremely rich suspended particulate matter; therefore, the concentration of PM2.5 in the desert area is high. The level of economic development in the North China Plains is high. The development of pollution-intensive industries in North China promotes regional economic development, and therefore, man-made atmospheric pollutant emissions are very large. In the southwest region, area with high PM2.5 pollution is mainly concentrated in the Sichuan Basin, a region that is particularly affected by humidity and precipitation, which causes rich suspended particles in the atmosphere. Moreover, the special structure of the terrain is not conducive to the spread of pollutants, and the population of the region causes high levels of anthropogenic pollution emissions. Because of its elevation, the Qinghai–Tibet Plateau has a thin atmosphere, these conditions are not conducive for the formation and accumulation of particulate matter in the atmosphere. Therefore, the lowest PM2.5 levels were found in the Qinghai–Tibet Plateau. The global Moran's I index for PM2.5 levels were 0.765 (*p* < 0.01), indicating a relatively strong positive spatial correlation. Local indicators on PM2.5 spatial association (LISA) maps show similar typical distributions, with a high PM2.5 cluster in the North China Plains and a low PM2.5 cluster in the Northeast China and the Qinghai-Tibet Plateau regions (Figure 3). Seasonally, winter had the highest PM2.5 level, followed by spring, autumn, and summer. However, North China has always been a region with severe PM2.5 pollution, especially in winter, where the climate is not conducive to the diffusion of atmospheric pollutants [45]. On a seasonal scale, winter had the highest PM2.5 level, followed by spring, autumn, and summer. However, North China was always a region with severe PM2.5 pollution, especially in winter, where the climate is not conducive to the diffusion of atmospheric pollutants. It is worth noting that Southern China always had low PM2.5 pollution because of its advantageous climate (Figure 4).

**Figure 2.** Spatial distribution of PM2.5 levels at prefecture-city level in China.

**Figure 3.** Local indicators on PM2.5 spatial association (or LISA) maps of prefecture-level cities in China.

**Figure 4.** Spatio-temporal distribution of PM2.5 levels of prefecture-level cities in spring, summer, autumn, and winter.

#### *3.2. Correlations between Urban Form and PM2.5*

Table 3 shows the results of global regression model analysis (i.e., OLS, SLM, and SEM). The results of suitability statistics such as R2, AIC, and log-likelihood imply that the spatial analysis technique is more suitable for this data. R2 values of OLS, SLM, and SEM models are 0.601, 0.943, and 0.874, respectively. These results show that the spatial effect is important in regression analysis and thus, ignoring the spatial effect will reduce the effectiveness of the model. In addition, R2, AIC, and log-likelihood test results show that the spatial lag effect is more significant.



Note: \*\*\*, \*\*, or \* indicate significance levels at the 1%, 5%, and 10% levels, respectively. OLS model: R2: 0.601, Log likelihood: −1224.72, *p*-value: 0.000; AIC: 2471.45. SLM: R-squared: 0.943, Log likelihood: −954.714, *p*-value: 0.000, AIC: 1931.43. SEM: R-squared: 0.874, Log likelihood: −1057.64; *p*-value: 0.000, AIC: 2139.28.

The results of OLS indicate that most urban form indicators are significantly correlated with PM2.5 concentrations, and six urban form indicators showed significantly (*p* < 0.01) positive relationships with city-level annual mean PM2.5 levels. Among these six significant factors, LPI, PLAND, and LSI also have a significant impact. LPI has a significant negative correlation with PM2.5 levels, indicating that a better continuity of the urban form leads to a lower fragmentation, and a better atmospheric quality. PLAND and LSI were significantly positively correlated with PM2.5 levels, indicating that the more complex the urban form, the worse the atmospheric quality. AI indicators on PM2.5 concentration is not significant. The four landscape pattern indicators indicate that the fragmentation and complexity of the urban form exert a significant impact on PM2.5 levels, and thus, more attention should be focused on urban form area expansion and the internal composition of fragmentation and complexity. RD is negatively correlated with PM2.5 levels, and the higher the density of the road network, the lower the atmospheric pollution levels. In addition, there is a significant negative correlation between temperature and precipitation and PM2.5 pollution, which confirms that meteorological conditions are conducive to the spread and reduction of atmospheric pollutants. Among other control indicators, SIP has a significantly positive impact on PM2.5 pollution, and PD and PRGDP are positively correlated with PM2.5 levels. Therefore, the development of the secondary industry has a more significant impact on atmospheric pollution.

The SLM model has the best regression results, indicating that urban form and atmospheric pollution have clear spatial dependence. Among the major urban form indicators, LPI was significantly negatively correlated with PM2.5 levels, AI, PLAND, and LSI were significantly positively correlated with PM2.5 pollution, and RD was negatively correlated with PM2.5 levels. In addition, the correlation between temperature and precipitation on PM2.5 was significant (the more precipitation, the lower the temperature), which is beneficial for reducing PM2.5 concentrations in the atmosphere. This is consistent with literature. Therefore, clear spatial correlation exists between urban form indicators and PM2.5 levels. the analysis process of the global regression model analysis has limitations. The GWR model can be used for further analysis by adding the relationship of spatial location.

#### *3.3. Spatial Features of Urban Form on PM2.5*

The coefficient of determination of the GWR model was 0.77, and the results of AIC and variance analysis (F test) showed that the results of the model are statistically significant. All 10 variables show noncollinearity and are used under the AIC minimization standard. The GWR model is superior to the OLS model. Figure 5 shows a distribution map of the regression fitting coefficient R2 in the regression results for prefecture-level cities. The spatial distribution of R<sup>2</sup> shows that the fitting results of the 10 variables of the urban form range between 0.43 and 0.83, indicating that the 10 indicators selected in this paper exert a stronger comprehensive impact on urban PM2.5 levels. Moreover, the R<sup>2</sup> value in the spatial distribution decreases from north to south. Therefore, the urban form system has a stronger explanatory power for the urban form of the northern region. On the one hand, this paper uses temperature and precipitation as control variables of two natural factors. They exert a significant positive effect on reducing atmospheric pollutants. The role of climatic factors is more significant in the south, thus reducing the concentration of urban PM2.5 pollution. On the other hand, this may be due to a lack of estimation of nitrate levels in the PM2.5 data used in this paper, and therefore, the impact of using a large amount of coal for heating in winter in northern regions may be underestimated, resulting in a low degree of fitting for northern regions. The further south a city is located, the more the explanatory power for PM2.5 of the urban form decreases.

**Figure 5.** Local R<sup>2</sup> distribution characteristics according to geographically weighted regression (GWR) model.

In the GWR model, each urban form index has a specific regression coefficient for the influence degree of PM2.5 levels, and each coefficient has a different spatial distribution law (Figure 6). The spatial distribution can more intuitively depict the influencing effect and changing trend between different urban form indicators and cities. In addition, different indicators exert different positive and negative impacts on PM2.5 levels, and their proportions differ. This also indicates that the influence index is not spatially stable and shows spatial heterogeneity (Figure 5). The directions of significant relationships were not the same for the studied cities, even for the same factor.

**Figure 6.** Spatial distribution of local relationship between PM2.5 and 10 factors for prefecture-level cities in China.

The regression coefficient of urban form index decreases in the order of RD, PLAND, LPI, LSI, and AI. The correlation between road density and PM2.5 levels is highest. Construction dust from the construction phase of roads is the main cause of PM2.5 pollution, followed by pollution caused by motor vehicle driving, as well as more harmful gases that are discharged during traffic congestions. The regression coefficient ranges from 7.7 to 4.0 and decreases from northeast to southwest. In China's major urban areas, the density of road networks is negatively correlated with PM2.5 concentrations. Improvement of the road network system can effectively reduce traffic congestions and atmospheric pollution. The correlation coefficient between PLAND and PM2.5 concentrations is high and negative. This coefficient mainly measures the size of the area occupied by urban construction land in the whole urban landscape. The influencing factor follows a decreasing trend from center to surrounding areas. Cities where construction land is the main land use type are more likely to cause atmospheric pollution. Developed areas in the south are less affected but may be

affected by increasing levels of urban construction, where the application of scientific dust reduction measures and the use of green materials are conducive for reducing emissions of atmospheric pollutants such as building dust. Cities in part of the central and western regions are more susceptible to the impact of the area of construction land. Therefore, reasonable increases of urban construction area and improvement of the level of building construction technology (e.g., green building materials and dust reduction) can reduce PM2.5 concentrations to a certain extent. LPI exerts a significant influence on urban PM2.5 concentrations, where a continuous increase of LPI indicates that the landscape dominance of urban construction land increases, the degree of spatial connection increases, and the intensity of human activities also increases. The values range from −1.9 to 2.8 and increase from north to south with the continuous development of urban construction, which causes more atmospheric pollution. In southern China, driven by the reform and opening up policy, the urbanization level grows faster, and human activities are more intense. This indicates that enhancing the connectivity and dominance of urban construction land has a significant impact on reducing PM2.5 concentrations. A higher LSI index indicates stronger fragmentation and more complicated urban areas. Most regression coefficients between LSI and PM2.5 concentrations are positive, with values ranging from −0.12 to 0.69, indicating that a higher LSI value represents higher levels of PM2.5 pollution. The spatial increase from north to south may be due to the complexity and fragmentation of the shape of the landscape of urban construction land, which leads to an increase of people's daily commuting time and distance, thus also increasing the pollution caused by the heavy use of commuting means. The impact degree of the southern region is large, indicating that the complexity of urban landscape in the southern region is more likely to affect the PM2.5 levels. AI is used to measure the agglomeration and compactness of urban construction land. The regression coefficient ranges from −0.17 to 0.31, and it changes from negative to positive from north to south. The higher the compactness of urban construction land, the lower the PM2.5 concentrations. Compact urban construction can shorten people's travel distance, improve the efficiency of land use, and reduce energy consumption. Therefore, in the process of urban development, compact and continuous urban construction is conducive to the improvement of urban atmospheric quality.

Among control indicators, the influences of the three socioeconomic factors on urban PM2.5 pollution show clear spatial differences. In most urban areas, SIP has a significant positive effect on PM2.5 concentrations, indicating that industrial activities aggravate the PM2.5 concentrations in Chinese cities, which is consistent with the literature. The effect of SIP on PM2.5 concentrations in southern China is strong, indicating that reducing the proportion of output of the secondary industry in southern China can significantly improve atmospheric quality. At the same time, the population density in the southeast coastal areas exerts a greater impact on the PM2.5 concentrations. Economic development prompted the migration of many people to the south to work or live. The increased population density has caused more man-made atmospheric pollutant emissions, which strongly impact atmospheric pollution. The overall coefficient of PCGDP is low and its influence is weak. However, in the process of urban development, the improvement of the economic level exerts a significant positive effect on reducing PM2.5 pollution levels.

Reasonable increases in the area of urban construction land and improvements of the level of construction, reducing the fragmentation of urban construction land, compacting urban construction, improving traffic accessibility, applying a reasonable road network density are all very beneficial factors for the improvement of urban atmospheric quality. These can, to a certain extent, reduce the concentration of PM2.5. However, as urbanization continues to increase, the different geographical location, scale and level of construction, and development of the city should be considered according to local conditions, thus providing planning and construction guidance. Urban form can affect PM2.5 concentrations from different aspects. Urban area, geographical location, and economic development level have an impact, and thus also need further discussion and analysis in the future.

#### **4. Discussion**

On an annual scale, urban planning factors (e.g., the area of urban construction land, construction fragmentation, and road network density) all exert a specific influence on PM2.5 levels. At the same time, the economic development of a city is also closely related to PM2.5 levels. China's rapid urbanization led to structural changes of its economy, and large areas of land are being used by energy-intensive and labor-intensive industries. In addition, many people move from rural areas to cities, where the population grows rapidly, which increases the release of large amounts of man-made atmospheric pollutants emission. The study showed that temperature and precipitation (as control variables) were significantly correlated with PM2.5 concentrations in 340 cities in China, and climatic factors played a significant role for reducing atmosphere pollution. Therefore, we further discussed this paper discusses the relationship between urban form and PM2.5 in different seasons. Seasonality can affect atmospheric quality through changes in precipitation, wind, relative humidity, monsoons, and other diffusion conditions [16,30]. When seasonal variations are considered, different seasons exert different impacts on PM2.5 through different urban form indicators. The SLM model achieved the best regression results, and the relationship between urban form and PM2.5 in different seasons is discussed through this model (Table 4).

**Table 4.** Spatial lag model results for different seasons in China during 2015.


Note: \*\*\*, \*\*, or \* indicate significance levels at the 1%, 5%, and 10% levels, respectively.

Table 4 shows the results of the regression model, in which the R2, log-likelihood, AIC, and other statistical results are significantly higher, indicating that the SLM regression model in this study has a relatively good fit. This model can accurately assess the impact of seasonal changes in the urban form on PM2.5 concentrations. In addition, the seasonal analysis according to the GWR model shows that the regression coefficients of the impact of each city's form index on the level of PM2.5 pollution have different spatial distribution patterns (Figure 7). As shown in Table 4, seasonal variation significantly affects the relationship between urban form and PM2.5 concentration.

**Figure 7.** Local R<sup>2</sup> derived from multivariate GWR model in spring, summer, autumn, and winter.

The four main findings of this study are summarized in the following: firstly, a significant correlation exists between urban form and PM2.5 concentrations in all four seasons, with the highest R<sup>2</sup> value in winter. Secondly, the temperature and precipitation in the control variables always exerted a significant impact on PM2.5 concentrations, while other socioeconomic indicators had no significant impact on PM2.5 concentrations. Thirdly, the effect of PLAND and PM2.5 concentrations is significantly positively correlated, while that of LPI and PM2.5 concentrations is significantly negatively correlated in spring, summer, and winter. Fourthly, a negative correlation was found between the density of urban road network and PM2.5 concentrations. Specifically, in spring, LPI, PLAND, and RD significantly impact PM2.5 concentrations, with regression coefficients of −0.289, 0.243, and −1.537, respectively. In the summer, urban compactness (AI = 0.052) can decrease the PM2.5 concentration to some extent. LPI (−0.489) and PLAND (0.320) were all significantly associated with PM2.5 concentrations. In the autumn, the correlation between urban form and PM2.5 concentrations was not significant. In the winter, PLAND (0.240) was significantly positively correlated with the PM2.5 concentrations. LPI (−0.258) and RD (−1.728) were significantly negatively correlated with PM2.5 concentrations.

Therefore, data analysis showed that seasonal change exerts a certain influence on the relationship between urban form indicators and PM2.5 concentrations. Especially in spring and winter, increasing the connectivity of urban construction land and improving the efficiency of land use within a city can effectively decrease urban PM2.5 concentrations [15]. Moreover, increasing the road connectivity also exerts a significant effect on reducing atmosphere pollution. However, the season of autumn does not exert a significant effect on the relationship between urban form and PM2.5 concentrations. In contrast, the indicators between urban form and PM2.5 are more significantly correlated in spring and winter, and relatively less in summer and autumn. Firstly, the previous analysis on an annual scale shows that the compact and continuous urban construction land, the reduction of urban land fragmentation, and the reasonable road network density are all conducive to reducing urban atmospheric pollutant emissions. Secondly, seasonal changes are mainly reflected by different climatic variables. The strong Asian monsoon during summer and the subtropical high during autumn result in favorable weather conditions that clean the air from atmospheric pollutants, as they enhance the mobility of the atmosphere above

urban centers. The summer monsoon climate moves more precipitation to North China, and at the same time increases the wind speed under the subtropical high pressure. These meteorological conditions are conducive to reducing atmosphere pollutants [16]. Under such climatic conditions, small changes in the structure of building areas or land areas near the ground exert little impact on atmospheric pollutants. Finally, low temperatures in winter and spring cause the atmospheric flow to sink and wind speeds are also lower than in summer [15]. In addition, more coal is burnt in the north in winter, and the resulting atmospheric pollutants cannot easily spread and remain concentrated near the ground. Under these conditions, the irregular urban form near the ground has a more significant impact on PM2.5 concentrations. Therefore, seasonal changes exert a significant impact on PM2.5 concentrations. When exploring the relationship between urban form and PM2.5 concentrations, focusing on the results of spring and winter is more effective.

There are many factors that affect the PM2.5 concentration, but our research just focused on the urban form. In addition, industrial areas, greenness area, vegetation coverage, transportation and other factors also have a significant impact on PM2.5 concentration. Therefore, there are some limitations of this study. Firstly, more influencing factors should be considered, such as the emission forces, change of pollution effects, LHI [20], meteorological conditions, the development levels of different cities and environmental conditions that surround the region of interest. These factors have an important effect on PM2.5 and are also associated with PM2.5 through interaction. They can all improve our indicator system. By selecting typical cities for further research, further problems may be identified. Secondly, the classification accuracy of urban land use data can be improved, and the impact of different land use patterns of urban construction land on PM2.5 pollution can be explored. For example, extract the industrial area in city for research. Thirdly, experiment with different research methods and data sources. For example, compare the remote sensing estimated PM2.5 concentration obtained by different sensors; compare the results of different research models; compare the impact levels of more influencing factors on the PM2.5 concentration, and so on.

#### **5. Conclusions**

Exploring the relationship between PM2.5 pollution and urban form helps to better understand the distribution of PM2.5 pollution, provide suggestions for urban planning, and the government with exploring a more sustainable and environmentally friendly development model for cities. Therefore, this study selected 340 prefecture-level cities in China to explore the relationship between urban form and PM2.5 pollution via regression analysis and GWR model. The following lists the main conclusions and recommendations:

Firstly, the distribution of PM2.5 pollution showed spatial heterogeneity, with an increasing trend from northwest to southeast. Areas with high PM2.5 concentrations are mainly located on the North China Plain, which is greatly affected by human activities. The southwest and parts of the northwest are affected by climatic factors, and thus, their PM2.5 concentrations are also high. The climatic and human activity conditions on the Qinghai–Tibet Plateau are not conducive to the accumulation of PM2.5 pollution, and thus, this area was always less polluted. Affected by seasonal changes, the PM2.5 concentration decreases in the respective order of winter, spring, autumn, and summer.

Secondly, most urban form indicators are significantly related to PM2.5 concentration. The results of the GWR model show that the spatial distribution decreases from north to south, and the urban form indicators system exerts a stronger influence on cities in northern regions. In general, better continuity is associated with a lower the degree of fragmentation, and a higher compactness and reasonable density of the road network are conducive to reducing the PM2.5 concentrations. In addition, meteorological conditions are also conducive to the diffusion and reduction of atmospheric pollutants. Thirdly, on a seasonal scale, seasonal changes impact pollution levels, but not all urban form indicators are significantly related to PM2.5 concentrations. Affected by seasonal changes, more urban

form indicators were significantly correlated with PM2.5 concentrations during spring and winter compared with summer and autumn.

This research used a large data set and confirmed that a good urban form exerts a positive effect on reducing PM2.5 concentrations. In the future development, the proportion of the secondary industry in the urban area should be reduced, green industries should be developed, atmospheric pollutant treatment technologies should be improved, and pollution sources should be controlled to reduce emissions. In areas with low pollution values, protection measures should be increased. More importantly, in urban planning, the blind expansion of urban land should be avoided, the compactness of the urban form, the efficiency of land use, and the transportation network in the city should be improved, and various forms of public transportation should be offered.

There are three main limitations of this study. Firstly, many factors influence PM2.5 pollution, and the geographical location, climatic conditions, and development levels of different cities are all different. By selecting typical cities for further research, further problems may be identified. Secondly, the classification accuracy of urban land use data can be improved, and the impact of different land use patterns of urban construction land on PM2.5 pollution can be explored. Thirdly, different research methods and data can be used to improve the accuracy of pollution source data, as well as research models, and different data methods can be utilized to further explore the estimation of PM2.5 levels and relevant influencing factors.

**Author Contributions:** Conceptualization, Y.L. and L.H.; methodology, Y.L.; software, Y.L.; formal analysis validation, Y.L.; formal analysis, Y.L.; writing—original draft investigation, Y.L.; investigation, L.H.; data curation, Y.L.; visualization, W.Q.; writing—review & editing, A.L. and Y.Y.; supervision, Y.Y.; funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA20010201).

**Data Availability Statement:** MERRA-2 dataset is available at https://daac.gsfc.nasa.gov/ (accessed on 10 August 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Space-Time Machine Learning Models to Analyze COVID-19 Pandemic Lockdown Effects on Aerosol Optical Depth over Europe**

#### **Saleem Ibrahim \*, Martin Landa, Ondˇrej Pešek, Karel Pavelka and Lena Halounova**

Department of Geomatics, Faculty of Civil Engineering, Czech Technical University in Prague, 166 29 Prague, Czech Republic; martin.landa@fsv.cvut.cz (M.L.); ondrej.pesek@fsv.cvut.cz (O.P.); karel.pavelka@fsv.cvut.cz (K.P.); lena.halounova@fsv.cvut.cz (L.H.)

**\*** Correspondence: saleem.ibrahim@fsv.cvut.cz

**Abstract:** The recent COVID-19 pandemic affected various aspects of life. Several studies established the consequences of pandemic lockdown on air quality using satellite remote sensing. However, such studies have limitations, including low spatial resolution or incomplete spatial coverage. Therefore, in this paper, we propose a machine learning-based scheme to solve the pre-mentioned limitations by training an optimized space-time extra trees model for each year of the study period. The results have shown that our trained models reach a prediction accuracy up to 95% when predicting the missing values in the MODIS MCD19A2 Aerosol Optical Depth (AOD) product. The outcome of the mentioned scheme was a geo-harmonized atmospheric dataset for aerosol optical depth at 550 nm with 1 km spatial resolution and full coverage over Europe. As an application, we used the proposed machine learning based prediction approach in AOD levels analysis. We compared the mean AOD levels between the lockdown period from March to June in 2020 and the mean AOD values of the same period for the past 5 years. We found that AOD levels dropped over most European countries in 2020 but increased in several eastern and western countries. The Netherlands had the most significant average decrease in AOD levels (19%), while Spain had the highest average increase (10%). Moreover, we analyzed the relationship between the relative percentage difference of AOD and four meteorological variables. We found a positive correlation between AOD and relative humidity and a negative correlation between AOD and wind speed. The value of the proposed prediction scheme is further emphasized by taking into consideration that the reconstructed dataset can be used for future air quality studies concerning Europe.

**Keywords:** aerosol optical depth; CAMS; COVID-19; machine learning; MODIS

#### **1. Introduction**

The Severe Acute Respiratory Syndrome-COronaVIrus Diseases 2019 (SARS-COVID-19) pandemic made humanity reconsider how to adapt their daily activities. By late June 2020, the EU average infection rate was around 160 per million inhabitants [1]. In general, most European countries started applying restrictions in March 2020. These restrictions included lockdown, contain, various kinds of curfew, mandatory face masks, etc. By 18 March 2020, more than 250 million people in Europe were in lockdown [2].

Despite the unfortunate losses in human lives and the economy, there could be a bright side to this pandemic when it comes to air quality. Some studies showed that air quality has improved under the applied restrictions. For example, only two weeks of lockdown reduced urban air pollution in Spain, with essential differences among pollutants. The most considerable reduction was in black carbon and Nitrogen Dioxide (NO2) by 45–51% [3].

According to data released in 2019–2020 by the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA), NO2 was reduced up to 30% in some regions that were highly affected by COVID-19 lockdowns such as Wuhan in

**Citation:** Ibrahim, S.; Landa, M.; Pešek, O.; Pavelka, K.; Halounova, L. Space-Time Machine Learning Models to Analyze COVID-19 Pandemic Lockdown Effects on Aerosol Optical Depth over Europe. *Remote Sens.* **2021**, *13*, 3027. https://doi.org/10.3390/rs13153027

Academic Editors: Maria João Costa and Daniele Bortoli

Received: 22 June 2021 Accepted: 29 July 2021 Published: 2 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

China, Italy, Spain, and the USA [4]. Similar results were found in Poland when comparing air quality observations for the year 2020 in five major cities with the same time periods as in the previous two years. In addition, AOD concentrations were reduced in April and May of 2020 by nearly 23% and 18% as compared to 2018–2019 [5].

During the lockdown in China, there was a significant drop in NO2 (−37%), SO2 (−64%), and AOD (−8%) for the year 2020, when compared with the 11 year mean average (2009–2019) [6]. Another study of the eastern part of China, where AOD levels are usually high (AOD > 0.7), showed that the emission of pollutants in the first three months of 2020 has decreased when compared to the same period of the previous year [7]. In India, the AOD level was greatly decreased (~45%) during the COVID-19 lockdown periods compared to the mean AOD level in the previous 20 years [8]. Similarly, significant reductions in black carbon concentration (~8.4%) and AOD (10.8%) were observed in southern India during the first lockdown period (25 March–14 April 2020) when compared to the pre-lockdown period (1–24 March 2020) over the selected measuring location [9].

In this study, we focused on AOD, which is defined as a measure of the columnar atmospheric aerosol content. High AOD concentrations have a negative impact on all living things by affecting the respiratory system and reducing naked eye visibility. AOD is measured either from ground-based stations or retrieved by satellites measurements. AOD satellite-based products provide a vast spatial coverage compared to the limited number of ground stations [10].

Due to the correlation between AOD and particulate matter (PM), AOD satellite products are commonly used to retrieve surface PM [11–13]. This justifies the increasing interest in AOD satellite products. Many sensors retrieve AOD at different spatial and temporal resolutions [14], such as the Total Ozone Mapping Spectrometer (TOMS) [15], the Ozone Monitoring Instrument (OMI) [16], the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) [17], the Geostationary Operational Environmental Satellite (GOES) [18], the Advanced Himawari Imager (AHI) [19], the Multi-angle Imaging SpectroRadiometer (MISR) [20], and the widely used Moderate Resolution Imaging Spectroradiometer (MODIS) which we used in our study.

MODIS instrumentations have been carried on both the Terra and Aqua satellites in sun-synchronous polar orbits, since 1999 and 2002, respectively. They can record the earth's surface reflectance and emittance with a 2330 km swath every one to two days [21]. MODIS measures 36 spectral bands between 0.4 and 14.4 μm wavelengths at many different spatial resolutions that provide a great opportunity to study the aerosol thickness and parameters characterizing aerosol size from space with good accuracy and on a worldwide scale.

MODIS provides various AOD products based on different aerosol retrieval algorithms. The most common algorithms are the Dark Target (DT) [22,23], the Deep Blue (DB) [24,25], and the Multi-Angle Implementation of Atmospheric Correction for MODIS (MAIAC) [26] which is the algorithm used to generate the MODIS MCD19A2 product with 1 km spatial resolution.

However, AOD satellite-based products have a great number of gaps due to cloud cover and snow reflectance. An analysis of the spatial and temporal distribution of clouds retrieved by MODIS over 12 years of continuous observations from the Terra satellite and over 9 years from the Aqua satellite showed that clouds cover ~67% of the earth's surface worldwide and ~55% over land [27]. To solve this issue, it has become common to use machine learning and deep learning algorithms in developing models that fill the gaps in satellite-based products either by removing the clouds [28], applying spatiotemporal interpolation [29], or merging different sources of data to predict gaps-free images [30]. Therefore, in this study, we propose a machine learning-based scheme to fill the gaps in MODIS MAIAC AOD retrievals and to generate daily, full coverage, high-resolution AOD maps over Europe. Such maps will minimize time series analysis bias and uncertainty while investigating the influence of COVID-19 lockdown on AOD levels.

#### **2. Material and Data**

#### *2.1. Study Area and Period*

The study area is shown in Figure 1. It includes the "Continental EU," hence EEA (European Economic Area), and the United Kingdom, Switzerland, Serbia, Bosnia and Herzegovina, Montenegro, Kosovo, North Macedonia, and Albania [31]. In this paper, we refer to the area of study as "Europe" located inside this coordinates box 26◦ W, 72◦ N, 42◦ E, and 36◦ S. The total study area covers 13,391,504 of 1 km grid cells; 5,450,009 of the total cell number are located over land. The study period covers the months of March–June from the years 2015–2020.

**Figure 1.** The study area with AERONET stations shown as black dots.

#### *2.2. Data*

In this section, we summarize different data used throughout our study.

#### 2.2.1. MODIS Data

MCD19A2 daily product from MODIS collection 6 was released and made publicly available on 30 May 2018. It was generated from both the Aqua and Terra satellites and delivered in Hierarchical Data Format [26]. MCD19A2 has a 1 km spatial resolution and uses the MAIAC algorithm that utilizes time series (TMS) analyses, a set of image-based and pixel processing to enhance the precision of cloud recognition, AOD, and other atmospheric rectification [32,33]. Daily MODIS MCD19A2 data were downloaded, and two science datasets (SDS) were extracted; AOD green band (at 550 nm) and AOD quality assurance layer (AOD\_QA), which was used to retrieve only pixels with the best quality. We created daily mosaics that cover the study area.

#### 2.2.2. Copernicus Atmosphere Monitoring Service (CAMS) Data

In this study, modeled AOD at 550 nm data with 80 km spatial resolution produced by the European center for medium-range weather forecasts Atmospheric Composition Reanalysis 4 (EAC4) was used to fill the gaps in the MODIS MCD19A2 product. Reanalysis merges model data with worldwide observations into a compatible dataset generated by an atmospheric model that uses the laws of physics and chemistry. EAC4 estimates modeled AOD every 3 h using the 4D-Var assimilation method [34].

#### 2.2.3. Digital Elevation Model

The elevation of the grid cells was added as a land predictor in our study. The Japan Aerospace Exploration Agency (JAXA) provides a worldwide digital surface model for scientific research and other geospatial services. It provides a horizontal resolution (~30 m) by the Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM), which was carried on the Advanced Land Observing Satellite "ALOS" [35]. Data was accessed in March 2021 from (https://www.eorc.jaxa.jp/ALOS/).

#### 2.2.4. Ground-Based AOD Data

NASA's Aerosols Robotic Network (AERONET) is considered one of the most reliable aerosol networks [36]. AERONET measures direct solar and sky radiance in various channels every 15 min at the local point to compute columnar AOD at intervals from 350 to 1020 nm with low expected uncertainties ranging between 0.01 to 0.02 under cloud-free conditions [37]. There are several categories of AERONET data: level 1.0 (unscreened), level 1.5 (cloud screened), and level 2.0 (cloud screened and quality assured).

In this study, AERONET level 2.0 quality assurance observations were used from 57 stations over Europe, as shown in Figure 1. Since AERONET stations do not measure AOD at 550 nm, available measurements at the nearest two wavelengths to 550 nm (440 or 500 nm as *λ*<sup>1</sup> and 675 nm as *λ*2) for each station were interpolated to 550 nm using the Ångström's turbidity equation represented in Equation (1) [21,38].

$$
\pi\_a(\lambda) = \beta \lambda^{-a} \tag{1}
$$

where *τa*(*λ*) is the AOD at *λ* wavelength in micrometers, *β* is the Angstrom's turbidity coefficient, and α is the band index represented in Equation (2).

$$\alpha = -\frac{\ln(\pi\_{\mathfrak{a}}(\lambda\_1) / \pi\_{\mathfrak{a}}(\lambda\_2))}{\ln(\lambda\_1 / \lambda\_2)}\tag{2}$$

AOD values at two different wavelengths *λ*1, *λ*<sup>2</sup> are related by Equation (3).

$$
\pi\_{\mathfrak{a}}(\lambda\_1) = \pi\_{\mathfrak{a}}(\lambda\_2) \* \left(\frac{\lambda\_1}{\lambda\_2}\right)^{-\alpha} \tag{3}
$$

#### 2.2.5. European Centre for Medium-Range Weather Forecasts reanalysis (ECMWF)

ERA-5 is the fifth generation of ECMWF reanalysis for the global climate and weather. Hourly data between 10 a.m. and 2 p.m. of U and V wind components, total precipitation, and 2 m surface temperature for the months of March–June of the years 2015–2020 with 0.1◦ spatial resolution were extracted from the ERA-5 land hourly data. Relative humidity data between 10 a.m. and 2 p.m. at 0.25◦ spatial resolution was extracted from the ERA-5 monthly averaged data.

All used data shown in Table 1 were reprojected to the European Terrestrial Reference System 1989 (EPSG:3035), using a 1 km grid cell with bilinear interpolation method for CAMSAOD and ECMWF data and the cubic convolution for the ALOS elevation model. All values of MODISAOD, CAMSAOD, and elevations were assigned to the closest grid cell.

**Table 1.** Summary of data used in this study.


#### **3. Methodology**

In this study, we created a Geo-Harmonized Atmospheric Dataset for Aerosol optical depth (GHADA) that covers the study area. Three stages were applied to generate GHADA: first, we merged the Terra and Aqua datasets of the MODIS MCD19A2 product by applying a simple average for all pixels that passed the quality assurance criteria (QACloudMask = Clear and QAAdjacencyMask = Clear) of this product. Second, we created a machine learning model for every year of the study period to predict AOD values over the study area. MCD19A2 high-quality retrievals were used as the dependent variable, and since the Terra satellite is passing locally around 10:30 a.m. and the Aqua satellite passes around 1:30 p.m., we used the modeled AOD from CAMS at the closest three times per day to the satellites passing (9 a.m., 12 p.m., and 3 p.m.). In addition, the spatiotemporal information for the grid cells was used as independent variables. Finally, we filled MODIS MCD19A2 gaps with the predicted AOD by merging the outputs from stages one and two. We validated the daily maps of GHADA with ground-based observation, and then we utilized this dataset to analyze how the COVID-19 lockdown has affected AOD levels over Europe during the period of March–June 2020 by comparing AOD levels for this period with the average AOD levels in the last five years (2015–2019) for the same months.

#### **4. Space-Time Models**

In this section, we propose a novel approach based on the Extremely Randomized Trees (ET) to predict the missing AOD values in the MODIS MCD19A2 product. First, we illustrate the principles of the ETs and discuss their suitability for the AOD prediction problem. Second, we describe in detail the proposed ET training and parameters setting for AOD prediction.

#### *4.1. Extra Trees Algorithm*

ET is a tree-based ensemble learning method used in our study to deal with the supervised regression and create prediction models for AOD. The idea behind ET is to strongly randomize the selection of both attributes and cut points while splitting a tree node. Unlike the widely used random forest algorithm that chooses the optimum split, ET chooses it randomly, which further reduces bias and variance. When needed, the latter algorithm creates independent randomized trees of learning sample output values [38].

The number of attributes that are randomly selected at each node (K) and the minimum sample size for splitting a node (nmin) are the two main parameters in the ET splitting process. This procedure is applied several times with the whole learning dataset to create an ensemble model that aggregates the predictions of the decision trees to obtain the final estimation by majority vote in classification problems and arithmetic average in regression problems. In addition to accuracy, ET has high computational efficiency [39], which is required when dealing with big data problems.

#### *4.2. Improved Spatiotemporal Information*

To determine the spatial and temporal correlation between MAIACAOD and CAMSAOD, we included the following independent variables. For space, we used both the elevations of the grid cells and the great circle distance (D) between each grid cell and a reference point on a sphere identified by their latitudes and longitudes using the haversine approach (Equations (4)–(6)). For time, we used the day of the year (DOY) to calculate the radian time (Rt) for the grid cells on different days in a year to improve model handling of the seasonal cycle, Equation (7) [40].

$$\theta = f(\lambda\_{\text{i,t}}, \varphi\_{\text{i,t}}) = \text{havas} \sin(\varphi\_1 - \varphi\_2) + \cos(\varphi\_1) \ast \cos(\varphi\_2) \ast \text{havas} \sin(\lambda\_1 - \lambda\_2) \tag{4}$$

$$\text{havasin}(\theta) = \sin^2(\frac{\theta}{2}) = \frac{1 - \cos(\theta)}{2} \tag{5}$$

$$\mathbf{D}\_{\mathbf{i},\mathbf{t}} = \mathbf{r} \* \operatorname{archavas}(\boldsymbol{\theta}) = \mathbf{2} \* \mathbf{r} \* \operatorname{arcsin}\left(\sqrt{\boldsymbol{\theta}}\right) \tag{6}$$

$$\text{Rt}\_{\text{i,t}} = \cos\left(2\pi \, \* \, \frac{\text{DOYi,t}}{\text{T}}\right) \tag{7}$$

where θ is the central angle between two points in space, ϕ<sup>1</sup> and ϕ<sup>2</sup> denote the geographical latitudes in radians of two points in space, λ<sup>1</sup> and λ<sup>2</sup> denote the geographical longitudes in radians of two points in space, r denotes the earth's radius in km, DOY represents the day of the year, T represents the total number of days in the year, for every grid cell (i) on day (t).

For each year between 2015–2020, the model was built using Equation (8).

$$\text{AOD}\_{\text{i,t}} = f(\text{CAMS-9}\_{\text{i,t}}, \text{CAMS-12}\_{\text{i,t}}, \text{CAMS-15}\_{\text{i,t}}, \text{D}\_{\text{i,t}}, \text{H}\_{\text{i,t}}, \text{Rt}\_{\text{i,t}}) \tag{8}$$

where for each grid cell (i) on day (t): AODi,t is the target AOD value, CAMS-x represents the AOD value extracted from CAMS at hour x, Di,t represents the great circle distance, Hi,t represents the elevation, Rti,t represents the temporal information identified by the radian time.

#### **5. Results**

In this section, we present the results of the space-time ET models when predicting the MAIAC AOD values. Then we utilize these models to generate AOD maps over the study area. The validation process is also stated below. Finally, these maps were used to analyze the effects of COVID-19 lockdowns on AOD levels, as discussed in Section 5.4.

#### *5.1. Models*

Due to the great number of MODISAOD -CAMSAOD pairs over land in the study area (on average 380 million pairs per year), representative subsets consisting of ~10% of the whole population (all MODISAOD -CAMSAOD pairs per year) were chosen using the Kolmogorov–Smirnov test to be used as learning dataset for a space-time model for each year. Then for each learning dataset, we used the k-fold cross-validation (where k=5) to train and validate each model. In this method, the learning dataset is divided into 5 folds, which means 80% of the pairs in the learning dataset are used as a training set for the model, and the remaining 20% are used for validation. This procedure was repeated five times to test the model on each fold. Based on learning curve results, we found that increasing the learning dataset size to 15% only increased the accuracy of the models by less than 1%, and the curve reaches a plateau beyond this percentage. Therefore, to decrease the computational complexity, we used ~10% of the whole population as a learning dataset. In other words, a learning dataset size of 10% is enough to reach satisfactory accuracy for each year of the study period. The optimized models (number of trees = 30, maximum depth of the tree = 50) were tested on the remaining ~90% (approximately 340 million pairs) of the population.

The results of the trained models for each year are summarized in Table 2. All models achieved high accuracies when predicting MAIAC AOD with a correlation of determination (R2) ranging between 92.5% to 95% and root mean squared errors from 0.016 to 0.02. These high achieved accuracies with the relatively small errors show the efficiency of our spacetime models in predicting the missing AOD values and emphasize the appropriateness of exploitation modeled AOD with improved spatiotemporal information in improving satellite AOD data.


**Table 2.** Results of the space-time extremely randomized models used to predict the missing AOD in the MODIS MCD19A2 product for each year of the study period.

Feature importance was calculated based on the reduction in sum of squared errors whenever a variable is chosen to split. Mean importance scores were calculated for all selected input variables of the models (see Figure 2). CAMSAOD at 12:00 p.m. is the most influential variable, accounting for ~33% of MODISAOD estimates. The other two modeled AOD at 9:00 a.m. and 3:00 p.m. contributed by 18% and 24%, respectively. The radian time and the great circle distance had almost the same influence (10–10.4%). Finally, the elevation had the lowest influence, with ~5% on MODISAOD estimates.

**Figure 2.** Mean importance scores (%) of independent variables to AOD estimates for the space-time extremely randomized models.

#### *5.2. AOD Maps*

We used the optimized space-time models to predict the missing values in the daily MCD19A2 data of the study period. Then we used these predictions to fill the gaps in this product. The outputs of the previous processes were daily AOD maps with 1 km spatial resolution and full coverage over Europe for the period of March–June in the years 2015–2020. To analyze the COVID-19 lockdown effects on AOD levels, we calculated the average AOD levels for the months' March–June of the years 2015–2019 and compared these levels with the same period of the year 2020 (see Figure 3). Moreover, we generated daily AOD maps for the period of January 2018–June 2020 to validate GHADA through all seasons and not solely during the chosen lockdown months.

**Figure 3.** The average AOD values for the months March–June of (**a**) the years 2015–2019 and (**b**) of the year 2020 during the chosen lockdown period.

#### *5.3. Validation with AERONET*

With the assumption that the aerosol column is relatively uniform within a certain time-space boundary [41], the validation of satellite-based AOD products is usually performed between AOD retrievals within the spatiotemporal window and the corresponding AERONET observations [42]. An acceptable accuracy of AOD products can be achieved when 66% of retrievals fall within expected error envelopes (EE) [23,43]. We used for validation the average AERONET level 2.0 quality assurance observations between 10 a.m. and 2 p.m. from 57 stations across Europe during the period of January 2018–June 2020. We chose two spatial diameters, 20 km and 50 km, with AERONET stations in the center for validation and statistical analysis that extensively uses root-mean-square error (RMSE), mean absolute error (MAE), expected error (EE) envelopes, and the fraction of AOD retrievals of the total number (N) falling within EE envelope (Equations (9)–(13)).

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum \left( AOD\_{GHADA} - AOD\_{AERONET} \right)^2} \tag{9}$$

$$\text{MAE} = \frac{1}{N} \sum |AOD\_{GHADA} - AOD\_{AERONET}| \tag{10}$$

$$\text{Bias} = \frac{1}{N} \sum (AOD\_{GHADA} - AOD\_{AERONET}) \tag{11}$$

$$\text{EE} = \pm \left( 0.05 + 0.15 \, \* \, \text{AOD}\_{\text{AERONET}} \right) \tag{12}$$

$$\text{AOD}\_{\text{AERONET}} - |\text{EE}| \le \text{AOD}\_{\text{GMADA}} \le \text{AOD}\_{\text{AERONET}} + |\text{EE}| \tag{13}$$

The statistical analysis between daily GHADA maps and AERONET observations has shown similar validation results for the two chosen spatial diameters with ~84% of the samples falling within the EE, good correlations R ~ 76–77%, and relatively small RMSE ~ 0.066–0.067, refer to Table 3.

**Table 3.** Validation results of GHADA with AERONET at two spatial diameters, where N is the total number of sample points.


Figure 4 represents the density scatter plots for the validation of AOD at 550 nm from GHADA with the AERONET stations at the two chosen spatial diameters.

**Figure 4.** Density scatter plots of validation AOD at 550 nm from GHADA with 57 AERONET stations between 10 a.m. and 2 p.m. at two spatial diameters of 20 km and 50 km. The colored scale bar stands for the frequency of occurrence.

#### *5.4. AOD Relative Percentage Difference*

The variations in AOD levels were calculated for each grid cell using the Relative Percentage Difference (RPD) Equation (14).

$$\text{RPD} = \frac{\text{AOD}\_{2020} - \text{AOD}\_{2015-2019}}{\text{AOD}\_{2015-2019}} \ast 100 \tag{14}$$

where AOD2020 is the mean AOD value in the study period of 2020 and AOD2015–2019 is the mean AOD value for the study period covering 2015–2019. The changes are presented in Figure 5.

**Figure 5.** Relative percentage difference of AOD over Europe for the months March–June of the year 2020 and the same months of the previous 5 years.

#### **6. Discussion**

In this study, a machine learning-based scheme was used to overcome the limitations in time series analysis concerning AOD. A new dataset for AOD at 550 nm with full coverage over Europe and with 1 km spatial resolution (GHADA) was built. We trained an extra

trees model for each year (2015–2020) using the MODIS MCD19A2 as the target variable and CAMS modeled AOD with improved spatiotemporal information as the independent variables. Results showed that the trained models had high accuracies ranging between 92.5–95% when estimating the missing MAIACAOD retrievals. We compared the AOD550 from GHADA and surface observations at 57 AERONET sites over Europe, with two spatial diameters around these AERONET stations within the period of January 2018–June 2020. The overall comparison with ground-based measurements showed a good correlation, with a bias as low as 0.014 and R ~ 0.76. Then we used GHADA to study the influence of COVID-19 pandemic lockdown on AOD levels over Europe in the months March–June by comparing it to AOD levels in the same months for the past five years (2015–2019). The most important advantage of our study when compared to similar work is that we used daily full-coverage AOD maps with high spatial resolution when calculating the average AOD values before and after the lockdown. Such complete coverage reduces bias and uncertainty in such time-series analyses. As shown above, in Figure 5, we have found that AOD levels decreased by 10–30% over most countries of the study area in 2020, mainly the countries located at the center of the analyzed area, while AOD levels increased over the countries that are located on the boundaries of the study area. In the west, AOD increased over Spain and Portugal; in the east, AOD increased over Romania, Bulgaria, Moldova, and Kosovo; in the north, the level slightly increased over Iceland. The decrease in AOD levels was the greatest in the Netherlands, with an average decrease of 20%, while Spain had the highest average increase in AOD levels by 10%. It must be noted that the five AERONET stations in Spain included in this study did not reflect the average increase in AOD over the whole country due to their limited spatial coverage.

As an attempt to justify the findings in areas of increased AOD, we investigated the relationship between the RPD in AOD for the months March–June of the year 2020 and the previous five years and the RPD for four meteorological variables (relative humidity, wind speed, surface temperature, and total precipitation) calculated for the times of MODIS satellites overpassing (10 a.m. to 2 p.m.). We found a close trend between relative humidity and AOD. Spain, Portugal, northern Norway, eastern Belarus, and southern Bulgaria had higher RPD in both AOD and relative humidity. Spain and Portugal had the highest increase of 10–23% in relative humidity. In agreement, areas of decreased humidity had lower RPD of AOD; however, such correlation is to a lower extent than the effect of increased humidity. An exception to this finding is Romania, where RPD in humidity was decreased however AOD was increased. Regarding wind speed, RPD decreased by ~18% in Spain and Portugal, where AOD had a significant increase. Also, the northern part of Italy and the western part of Austria had a clear inverse trend between AOD and wind speed. The average relative humidity over Spain was 65% during the lockdown period of the year 2020. High relative humidity combined with a low average wind speed of less than ~3 m/s play an important role in increasing AOD. Our findings are consistent with [44], where they associated higher humidity and lower wind speed with higher AOD. We found no direct relationship between RPD of neither surface temperature nor total precipitation and RPD of AOD, all of which strengthens the argument that lowering AOD is a consequence of the lockdown. Although we proved that AOD levels increased over Spain, other pollutants such as NO2 were decreased, which is attributed to the difference in the source of these pollutants as discussed elsewhere [44]. Figure 6 shows the RPD of relative humidity and RPD of wind speed between the lockdown months of the year 2020 and the same period of the previous five years.

**Figure 6.** Relative percentage difference of (**a**) relative humidity and (**b**) wind speed over Europe between 10 a.m. and 2 p.m. for the months March–June of the year 2020 and the same months of the previous 5 years.

Nevertheless, it must be noted that the average AOD levels over Europe are relatively low (AOD < 0.3) compared to other more polluted regions, where more prominent differences in AOD levels can be observed, for example, as published in [8] where AOD levels over India were investigated. In addition, the extent of restrictions imposed and the adherence to them may contribute to the significance of the change in AOD levels.

#### **7. Conclusions**

The advancement of machine learning algorithms provides solutions for AOD satellitebased data drawbacks such as low spatial resolution and gaps caused by persistent clouds, cloud contamination, and high surface reflectance and opens new horizons for studies that can influence decision making. A machine learning-based scheme was used to enhance time series analysis of AOD over the study period. Space-time extremely randomized trees models were built to fill the gaps in the MCD19A2 product of the moderate imaging spectroradiometer (MODIS). The output was a geo-harmonized atmospheric dataset for aerosol optical depth (GHADA) with complete coverage of 1 km spatial resolution over Europe. To the best of our knowledge, GHADA is the first dataset with this coverage and resolution for Europe, and we are the first to analyze how COVID-19 affected AOD levels over Europe with gaps-free AOD maps at high spatial resolution.

We compared AOD levels during the chosen lockdown period to the mean AOD values during the same period in the previous five years. We found a general decrease trend in the countries located at the center of the study area, with the Netherlands scoring the highest average decrease. In contrast, AOD levels increased in the eastern and western European countries as it is distinctly visible in Kosovo and Spain, respectively. We found a correlation between high humidity and low wind speed with AOD increase, which justifies such an increase in countries like Spain and Portugal. We excluded surface temperature and total precipitation as contributing factors to the detected changes in AOD levels, which in return makes COVID-19 lockdown the major cause for the decrease in AOD levels.

Once GHADA is made publicly accessible, it can be used to investigate air quality over Europe with 1 km spatial resolution and improve time series analysis, overcoming the gaps encountered during such studies. The lockdown that happened due to the pandemic generally lowered AOD levels; however, such lockdown is not the ultimate solution to control AOD levels. Cleaner sources of energy and road transport are needed to maintain lower levels of AOD and good air quality. Based on our obtained results, we recommend utilizing machine learning to solve time series analysis limitations and to conduct various applications concerning air quality.

**Author Contributions:** S.I. and L.H. conceptualized the work. S.I., M.L. and O.P. designed and implemented the workflow and processed the data. M.L., O.P., K.P. and L.H. contributed to the improvement of the draft manuscript. Saleem Ibrahim wrote the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is co-financed under Grant Agreement Connecting Europe Facility (CEF) Telecom project 2018-EU-IA-0095 by the European Union and by the Grant Agency of the Czech Technical University in Prague, grant No. SGS21/054/OHK1/1T/11.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data and data analysis methods are available upon request.

**Acknowledgments:** The authors sincerely thank NASA EOSDIS for providing the daily MODIS MAIAC AOD product (MCD19A2) available from the Land Processes Distributed Active Archive Center (LPDAAC), AERONET (https://aeronet.gsfc.nasa.gov/) for providing AOD ground-based observation data (was last accessed in May 2021), the European Center for Medium-Range Weather Forecasts (ECMWF) for providing global reanalysis of atmospheric composition, and the Japan Aerospace Exploration Agency (JAXA) for providing the digital surface model used in this study.

**Conflicts of Interest:** Authors declare no conflict of interest.

#### **References**


## *Article* **Change of CO Concentration Due to the COVID-19 Lockdown in China Observed by Surface and Satellite Observations**

**Minqiang Zhou 1, Jingyi Jiang 2,\*, Bavo Langerock 1, Bart Dils 1, Mahesh Kumar Sha <sup>1</sup> and Martine De Mazière <sup>1</sup>**


**\*** Correspondence: jiangjingyi@bjfu.edu.cn

**Abstract:** The nationwide lockdown due to the COVID-19 pandemic in 2020 reduced industrial and human activities in China. In this study, we investigate atmospheric carbon monoxide (CO) concentration changes during the lockdown from observations at the surface and from two satellites (TROPOspheric Monitoring Instrument (TROPOMI) and Infrared Atmospheric Sounding Interferometer (IASI)). It is found that the average CO surface concentration in 2020 was close to that in 2019 before the lockdown, and became 18.7% lower as compared to 2019 during the lockdown. The spatial variation of the change in the CO surface concentration is high, with an 8–27% reduction observed for Beijing, Shanghai, Chengdu, Zhengzhou, and Guangzhou, and almost no change in Wuhan. The TROPOMI and IASI satellite observations show that the CO columns decreased by 2–13% during the lockdown in most regions in China. However in South China, there was an 8.8% increase in the CO columns observed by TROPOMI and a 36.7% increase observed by IASI, which is contrary to the 23% decrease in the surface CO concentration. The enhancement of the CO column in South China is strongly affected by the fire emissions transported from Southeast Asia. This study provides an insight into the impact of COVID-19 on CO concentrations both at the surface and in the columns in China, and it can be extended to evaluate other areas using the same approach.

**Keywords:** carbon monoxide; COVID-19; China; surface concentration; TROPOMI; IASI

#### **1. Introduction**

The COVID-19 worldwide pandemic has caused millions of deaths, reported by the World Health Organization (WHO) coronavirus disease dashboard. The first COVID-19 patient was detected in Wuhan, Hubei Province, China, in December 2019, and then the disease quickly spread to the whole country before the Chinese New Year 2020 [1]. To prevent the further spread of the outbreak, the Chinese government carried out a nationwide lockdown starting on 23 January 2020 in Wuhan and extending rapidly (in 6 days) to all other provinces [2]. The lockdown outside of Hubei province was eased at the beginning of March, while it continued to 25 March for Hubei province and to 8 April for Wuhan [3].

The strict measures related to COVID-19 had a large impact on economic activities, including energy production, industrial activities, and transportation [4,5]. As a result, the emissions of many atmospheric components were significantly reduced [6–9]. There was a 3.7% decrease in Chinese carbon dioxide (CO2) emissions in the first half of 2020 related to the COVID-19 pandemic [10]. The reduction mainly occurred in January and February, and the CO2 emissions in March returned to the emission level of 2019 [11] as the lockdowns were gradually relaxed. Bauwens et al. reported an average 40% decrease in nitrogen dioxide (NO2) column concentration from satellite measurements over Chinese cities due to measures against the coronavirus outbreak [12]. Based on NO2 surface observations,

**Citation:** Zhou, M.; Jiang, J.; Langerock, B.; Dils, B.; Sha, M.K.; De Mazière, M. Change of CO Concentration Due to the COVID-19 Lockdown in China Observed by Surface and Satellite Observations. *Remote Sens.* **2021**, *13*, 1129. https://doi.org/10.3390/rs13061129

Academic Editor: Maria João Costa

Received: 24 February 2021 Accepted: 13 March 2021 Published: 16 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Feng et al. pointed out that nitrogen oxide (NO*X*) emissions were reduced by 36% in China due to the COVID-19 lockdown measures [13].

CO is a pollutant that also plays an important role in atmospheric chemistry, e.g., the formation of tropospheric ozone. CO is predominantly removed by OH [14], and the lifetime of CO is relatively long (weeks to months) as compared to other air pollutants [15]. The main atmospheric CO sources are anthropogenic emissions and biomass burning [16], primarily when carbon fuels are not burned completely. According to the Emissions Database for Global Atmospheric Research (EDGAR) v5.0 [17], the anthropogenic CO emissions in China are dominated by residential cooking and heating, and combustion for manufacturing, the power industry, and road transportation.

Previous studies have been carried out to understand the reduction in CO surface concentration due to the COVID-19 lockdown in China on city and regional scales. There was an average 22.7% decrease in the CO surface concentration in Wuhan during the lockdown as compared to the period before lockdown [18]. Shi and Brasseur found that the CO surface concentration during the lockdown decreased from 1.2–1.5 to 0.7–1.0 mg/m<sup>3</sup> before the lockdown in northern China [19]. However, there is a large seasonal variation in CO surface concentrations in eastern Asia, with a maximum in winter and a minimum in summer [20], which has not been taken into account in the these studies. The atmospheric compositions can also be observed by the satellite remote sensing technique using their absorption or emission spectra, which has been applied to understand the CO column changes due to the COVID-19 lockdown in China [21,22]. It is important to compare the CO concentration changes observed by the surface and satellite measurements. However, to our knowledge, few studies have been performed to investigate this. Here, we aim at looking into the changes in CO concentration due to the COVID-19 lockdown in China using both surface and satellite observations, and investigating whether CO reduction can be observed by both surface and satellite observations. The data and method are presented in Section 2. To reduce the impact from the seasonal variation of CO, the observations in 2020 are compared to similar observations in 2019. In Section 3, the changes in CO surface concentrations in China and the variations at six megacities are discussed. In addition, the column-averaged dry-air mole fraction of CO (XCO) observed from the TROPOspheric Monitoring Instrument (TROPOMI) onboard the Sentinel 5 Precursor (S5P) satellite and the CO column observed from the Infrared Atmospheric Sounding Interferometer (IASI) onboard the Meteorological Operational (Metop)-B satellite are analyzed and compared to the surface measurements. The discussions about the results as well as the limitations of this study are carried out in Section 4 and the conclusions are drawn in Section 5.

#### **2. Materials and Methods**

#### *2.1. Data*

Hourly CO surface observations are carried out at air pollution monitoring sites by the Ministry of Ecology and Environment of China (http://www.mee.gov.cn/, accessed on 10 March 2021). The CO concentration is reported in units of mg/m3. In this study, we used the sites where observations were available in both 2019 and 2020: we found 1375 sites in China (Figure 1), including 12 sites at Beijing, 10 sites at Shanghai, 11 sites at Wuhan, 10 sites at Chengdu, 9 sites at Zhengzhou, and 12 sites at Guangzhou. Note that few sites were in western China, and most sites were located in highly polluted regions with large CO anthropogenic emissions.

The offline level 2 CO product from the TROPOMI was used in this study, which was downloaded from https://scihub.copernicus.eu/ accessed on 10 March 2021. The XCO product was retrieved from the 2.3 μm spectral range of the shortwave infrared solar radiance measurements under clear-sky conditions; it is sensitive to the tropospheric boundary layer [23]. The spatial resolution of the TROPOMI XCO observations was 7.2 × 7.2 km2 for the footprint at nadir before 6 August 2019 and changed to 7.2 × 5.6 km<sup>2</sup> afterwards. The overpass time was about 13:00. The TROPOMI CO level 2 measurements were filtered out with the qa\_value less than 0.5, which is recommended by the

user guide (https://sentinel.esa.int/documents/247904/3541451/Sentinel-5P-Carbon-Monoxide-Level-2-Product-Readme-File, accessed on 10 March 2021). After that, the daily TROPOMI level 2 observations were binned to 0.05◦ × 0.05◦ (latitude by longitude) grids as the level 3 data, and we studied the CO changes based on these level 3 daily products.

**Figure 1.** The location of the air pollution sites (light gray dots), six megacities (white hexagons) and regions (red boxes), together with the CO anthropogenic emission annual mean in 2015 from the Emissions Database for Global Atmospheric Research (EDGAR) v5.0 inventory.

The IASI level 2 CO column dataset was processed using the Fast Optimal Retrievals on Layers for IASI (FORLI) software [24] by the Université Libre de Bruxelles, Laboratoire Atmosphères, Milieux, Observations Spatiales (ULB-LATMOS) before 14 May 2019 (v20140922) and by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) afterward (v6.5.0), which was downloaded from https://iasi.aeris-data.fr/cos\_iasi\_b\_arch/ accessed on 10 March 2021. The field of view at nadir of the IASI instrument is about 12 km. The CO is retrieved from the thermal infrared spectra in the spectral range 4.58 to 4.69 μm, so that IASI CO product is more sensitive to the mid- and upper-troposphere, and less sensitive to the lower-troposphere [25]. IASI provides both daytime and nighttime CO measurements (9:30 and 21:30). As the diurnal variation in CO at the mid- and upper-troposphere is much weaker than for the surface, we used both daytime and nighttime IASI CO observations to generate the 0.5◦ × 0.5◦ daily product.

As fire emissions are an important source of CO, we used the Visible Infrared Imaging Radiometer Suite (VIIRS) 375 m data [26] onboard the Suomi National Polar-Orbiting Partnership (Suomi NPP) satellite to understand the fire impacts in 2019 and 2020. The VIIRS sensor has a swath width of 3060 km, which is able to provide complete coverage of the Earth everyday. There are 22 spectral channels, between 0.412 μm and 12.01 μm: 16 channels are moderate resolution bands (M-bands), which have a spatial resolution of 750 m at the nadir; 5 channels are imaging resolution bands (I-bands), which have a spatial resolution of 375 m at the nadir; 1 channel is a one day/night panchromatic

band with a spatial resolution of 750 m [27]. The VIIRS fire data were download from https://firms.modaps.eosdis.nasa.gov/ accessed on 10 March 2021.

Apart from the measurements, four emission datasets were used to understand the CO anthropogenic and wildfire fluxes in China. The EDGAR v5.0 and the Regional Emission Inventory in Asia (REAS) v3.2 [28] were used to estimate the CO anthropogenic emissions in China. Note that both the EDGAR v5.0 and the REAS v3.2 only hold data up to 2015 for CO, and there is no information about the CO anthropogenic emissions in 2019 and 2020. The Global Fire Assimilation System (GFAS) [29] and the Fire Inventory from NCAR (FINN) [30] were used to understand the CO wildfire emissions. The GFAS and FINN data are up to date and available for both 2019 and 2020, as they use satellite measurements as the inputs. For both the anthropogenic and wildfire emissions, two datasets were compared to each other to assess the uncertainty.

#### *2.2. Method*

The surface and satellite CO data in 2020 were compared to similar observations in 2019 during four periods: the month before the Chinese New Year (BCNY; before lockdown), the month after the Chinese New Year (ACNY; lockdown), the month between 11 March and 10 April (3/11–4/10), and the month between 11 April and 10 May (4/11–5/10). To reduce the impact of the Spring Festival, the national holidays in 2019 (4 February to 10 February) and 2020 (24 January to 2 February) were not considered in our study. We considered that in 2019, BCNY was between 1 January and 3 February, and ACNY was between 11 February and 10 March, and that in 2020, BCNY was between 1 January and 23 January, and ACNY was between 3 February and 10 March. From 3/11–4/10, the lockdown was relaxed at most places in China except Hubei Province, and from 4/10–5/10, the lockdown was officially ended throughout the whole of China. The four periods in 2019 and 2020 are summarized in Figure 2.

**Figure 2.** The four periods (before the Chinese New Year (BCNY), after the Chinese New Year (ACNY), 3/11–4/10, and 4/11–5/10) in 2019 and 2020. Note that the Chinese New Year (CNY) national holiday was not considered in this study.

According to the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis data, the winds at 850 hPa above China during these four periods, especially for the first three periods, were similar in 2019 and 2020 (Figure S1). The layer at 850 hPa (about 1.5 km a.s.l.) is between the lower troposphere and the free atmosphere, as it is close to the Planetary Boundary Layer (PBL) height. On a small scale, such as in a city, the winds in 2019 and 2020 could be very different, but the winds in 2019 and 2020 were generally similar in both wind speed and wind direction on a large scale, such as for the whole of China. Therefore, it is indicated that the changes in CO concentration during the COVID-19 lockdown on the national scale had limited influence from meteorological conditions.

The relative difference in CO concentration at the surface observed by the air pollution sites or in the column observed by the satellite measurements during these periods between 2020 and 2019 was calculated as (Δ*CO* = (2020 − 2019)/2019 × 100%). Then, the mean and standard deviation (std) of the differences were derived from all measurement locations (sites or grids) within a city, a region, or the whole of China:

$$
\Delta CO\_m = \frac{\sum (\Delta CO\_i)}{N} \,\tag{1}
$$

$$
\Delta \text{CO}\_{std} = \sqrt{\frac{\sum (\Delta \text{CO}\_i - \Delta \text{CO}\_m)^2}{N}},\tag{2}
$$

where *N* is the total number of locations and *i* is the index of the location. To reduce the impact from outliers, we also used the median when comparing CO changes at the surface with those in the column.

#### **3. Results and Discussions**

#### *3.1. CO Surface Concentration*

The CO surface concentrations during the four periods in 2019, together with the relative differences between 2020 and 2019, are shown in Figure 2. The mean CO concentrations were 1.21, 1.00, 0.74, 0.71 mg/m<sup>3</sup> during BCNY, ACNY, 3/11–4/10, and 4/11–5/10 in 2019, respectively. There was a large month-to-month variation, and the average CO concentration during ACNY was about 17% less than that during BCNY in 2019.

The mean and std of the relative difference between 2020 and 2019 at all sites are 1.1 ± 24.3%, −18.7 ± 22.2%, −6.2 ± 20.2%, and −4.8 ± 23.6% during BCNY, ACNY, 3/11–4/10, and 4/11–5/10, respectively (Table 1). The CO concentrations during BCNY in 2019 and 2020 were at the same level. The mean difference during ACNY indicates that there was an 18.7% reduction in CO surface concentration due to the COVID-19 lockdown. The reduction in CO surface concentration is also observed for 3/11–4/10 and 4/11–5/10, but the amplitudes become much weaker as compared to that during ACNY. The large std (20–24%) suggests that the spatial variability of CO surface concentration changes across China is high, as CO is affected by local as well as transported emissions from hundreds and thousands of kilometers away due to its lifetime of weeks to months.


**Table 1.** The mean and standard deviation (std) of the relative change in CO surface concentrations.

As the change in CO surface concentration varied with location (Figure 3), we investigated in detail six megacities (Beijing, Shanghai, Wuhan, Chengdu, Zhengzhou, and Guangzhou). The hourly means and stds of CO surface concentrations in these cities during the four periods in 2019 and 2020, together with their relative changes between 2020 and 2019, are shown in Figure 4. The diurnal variations of CO surface concentrations in these cities are similar, with two peaks around 10:00 and 24:00 local hours. During BCNY, the phase and amplitude of the diurnal variations in 2020 were close to those in 2019. During ACNY, except in Wuhan, the peak-to-peak amplitudes of the diurnal variations became smaller in 2020 as compared to 2019 despite the large stds. Large reductions of the CO surface concentrations of 20–27% are observed during ACNY at Shanghai, Chengdu, Zhengzhou, and Guangzhou. Reductions by 6–25% in the CO surface concentration are also observed from 3/11–4/10 in these cities. A reduction during BCNY is also observed in Guangzhou, but it is less significant as compared to that during ACNY. However in Beijing, the reduction of the CO surface concentration during ACNY was only 8%, which is less than the observed 12% reduction during BCNY and 16% reduction from 3/11–4/10. The relatively low reduction in CO during ACNY in Beijing was affected by the meteorological background. Previous studies found that the wind speed was decreased by

20% and the PBL heights were generally lower during the lockdown period as compared to the climatology for Beijing, leading to higher surface concentrations of atmospheric pollutants [31,32].

**Figure 3.** The mean CO surface concentrations in units of mg/m3 observed at all sites in China during BCNY, ACNY, 3/11–4/10, and 4/11–5/10 in 2019 (**first column**) and 2020 (**second column**), together with their percentage differences between 2020 and 2019 ((2020–2019)/2019×100%) (**third column**). The six megacities are marked as the purple (**a**,**b**) and yellow (**c**) circles.

The city of Wuhan shows a behavior that is different from the five other cities: the CO surface concentration in 2020 was even slightly larger than that in 2019 during ACNY but was about 20% less than that in 2019 during BCNY. As the city was hit heavily by the virus, the most strict measures were carried out in Wuhan. More than a 50% reduction in atmospheric NO2 concentrations was observed from both satellite measurements of column abundances [12] and surface in-situ observations [18] during the lockdown period. Apart from anthropogenic emissions, biomass burning is also an important CO source [33]. The VIIRS satellite observed many fires (burning or combustion at places giving out bright light, heat, and smoke) in Wuhan and in the northern area of Wuhan during BCNY, and the fires were almost extinguished during ACNY in 2019. In contrast to 2019, there was almost no fire observed during BCNY, but more fires existed during ACNY in 2020 (Figure S2). First, we looked at the CO wildfire emissions from the GFAS during BCNY and ACNY in 2019 and 2020. Consistent with VIIRS fire measurements, the CO wildfire emissions during BCNY in 2019 were higher than those in 2020, and the CO wildfire emissions during ACNY

in 2019 were lower than those in 2020. However, the CO wildfire emissions were much lower as compared to the CO anthropogenic emissions from the EDGAR v5.0 and the REAS v3.2 around Wuhan (Figure S3). There were two things to be addressed there: (1) to assess the uncertainty of the CO wildfire emission, we compared the GFAS with the FINN. It was found that the difference between GFAS and FINN CO wildfire emissions around Wuhan was within 20%; (2) the anthropogenic CO emission from EDGAR v5.0 or REAS v3.2 was only available for 2015, and it was decreasing during the last decade in China with an annual change of about 3–4% [34]. Even though we took the 4%/year decrease in the CO anthropogenic emissions into account, the contributions from the CO wildfire emissions were still less than 1.0% of the CO anthropogenic emissions during ACNY and BCNY in 2019 and 2020 within the 1.0◦ × 1.0◦ box around Wuhan. In summary, the change in CO surface concentration in Wuhan cannot be explained by the local wildfire emissions (biomass burning). Second, we looked at the concentrations of other air pollutants (NO2, SO2, PM2.5, PM10) in 2019 and 2020 in Wuhan (Figure S4). The averaged NO2, SO2, PM2.5, PM10 concentrations during BCNY in 2020 were 17%, 9%, 34%, and 31% less than those in 2019. The decreases of those four air pollutants are consistent with the 20% decrease in CO during BCNY in 2020 as compared to 2019. The averaged NO2, PM2.5, PM10 concentrations in Wuhan during ACNY in 2020 were 51%, 43% and 42% less than those in 2019. However, SO2 and CO increased slightly during ACNY in 2020 as compared to 2019. The similar behavior of CO and SO2 suggests that these two gases come from common sources, e.g., the burning of fossil fuels by power plants and other industrial facilities. Finally, we looked at the VIIRS and MODIS fire observations inside Wuhan, where more fires were observed above a large coke factory (Wuhan Pingmei Wugang Joint Coking Company) during ACNY in 2020 as compared to 2019 (Figure S5). According to the sources of SO2, CO, and NO*<sup>X</sup>* in Asia [35], it is inferred that the CO and SO2 emissions from industry (such as the coke factory) during ACNY in 2020 were larger than the reduced emissions from road transportation.

#### *3.2. CO Column Observed from Satellites*

The TROPOMI XCO and IASI CO column measurements in 2019, together with the relative differences between 2020 and 2019 during the four periods, are shown in Figure 5. In general, the TROPOMI and IASI measurements have a similar spatial distribution in China. The means and stds of XCO observed by TROPOMI in 2019 in China are 110.1 ± 24.1, 109.7 ± 24.9, 113.3 ± 26.9, and 112.9 ± 22.8 ppb during the BCNY, ACNY, 3/11–4/10, and 4/11–5/10 periods, respectively. There is almost no change in the mean XCO in China during these four periods, which is different from the large month-to-month variation of CO surface concentration. The means and stds of CO columns observed by IASI in 2019 in China are 1.83 ± 0.47 × 1018, 1.98 ± 0.50 × 1018, 2.18 ± 0.54 × 1018 and 2.25 ± 0.53 × <sup>10</sup><sup>18</sup> molecules/cm2 during BCNY, ACNY, 3/11–4/10, and 4/11–5/10 periods, respectively. The month-to-month change of the CO column is opposite to that observed at the surface.

As satellite measurements are contaminated by cloud, the variability in them is relatively high. To reduce random uncertainty, the satellite measurements (both TROPOMI and IASI) were averaged on regional scales, and we focused on the CO changes in four regions with high values (Figure 5a2,c2; Figure 1): North, East, and Central China (A); South China (B), Sichuan basin (C), and Urumqi region (D). The quantitative estimates of the CO changes are shown in Figure 6 and Table 2. The medians of the XCO relative changes during ACNY in 2020 relative to 2019 observed by the TROPOMI satellite are −10.5%, 8.8%, −1.9%, and −4.6% in regions A, B, C, and D, respectively. The medians of the CO column relative changes during ACNY in 2020 relative to 2019 observed by the IASI satellite are −13.3%, 36.7%, −1.8%, and −3.6% in regions A, B, C, and D, respectively. The largest reduction in CO concentration was found by both satellites in Region A during ACNY in 2020, with a minimum in the region between Zhengzhou and Beijing. The reductions in the CO column during ACNY were also significant in Regions C and D,

especially when we compare the CO changes during ACNY to the changes during BCNY, 3/11–4/10, and 4/11–5/10. However, there was an 8.8% increase in XCO observed by TROPOMI and a 36.7% increase in CO columns observed by IASI for Region B, which was related to the fires in Southeast Asia, and will be discussed later. To compare the satellite with the surface observations, the relative changes in CO surface concentrations for the same regions are also shown in Figure 6. The medians of the relative changes in CO surface concentrations during ACNY in 2020 as compared to that in 2019 are −25.1%, −23.1%, −15.8%, and −18.2%, for Regions A, B, C, and D, respectively. At these regions, the CO surface concentrations decreased dramatically during the lockdown and then increased afterward, with Region A being the most prominent.

**Figure 4.** Upper: the hourly means (solid line) and standard deviations (shadow) of CO surface concentrations observed in Beijing, Shanghai, Wuhan, Chengdu, Zhengzhou, and Guangzhou during BCNY (**first column**), ACNY (**second column**), 3/11–4/10 (**third column**), and 4/11–5/10 (**last column**) in 2019 and 2020. Lower: the relative changes in CO surface concentrations between 2020 and 2019 in these six megacities during BCNY, ACNY, 3/11–4/10, and 4/11–5/10.

**Figure 5.** The TROPOspheric Monitoring Instrument (TROPOMI) satellite XCO observations in units of ppb (**a1**–**a4**) and the Infrared Atmospheric Sounding Interferometer (IASI) CO column observations in unit of molec./cm<sup>2</sup> (**c1**–**c4**) over China during BCNY, ACNY, 3/11–4/10, and 4/11–5/10 in 2019, together with the relative differences between 2020 and 2019 ((2020–2019)/2019 × 100%) (TROPOMI: **b1**–**b4**, IASI: **d1**–**d4**). The six megacities are marked as the purple and black circles. The four regions are marked in (**a2**,**c2**).

The surface and satellite observations both showed reductions during ACNY in Regions A, C, and D, but the reduction in CO columns was less significant as compared to the reduction in the CO surface concentrations. The satellites observe the column CO abundance. The CO partial columns in the PBL only account for 20–40% of the total columns in these regions according to the Copernicus Atmosphere Monitoring Service (CAMS) operational model [36]. Assuming that there is no change in CO partial columns above the PBL, the magnitude of the CO total column reduction is expected to be 2.5–5 times less than that at the surface. In this case, the relative changes in CO during ACNY in 2020 observed by the satellite and surface observations are generally in good agreement for Regions A, C, and D.

A large disagreement between the satellite and surface observations was found in Region B, where the CO surface concentrations were significantly reduced (>20%) during the lockdown in 2020, while the TROPOMI and IASI observations show that the CO during ACNY and 3/11–4/10 in 2020 was much larger than that in 2019. As the weather conditions between January and March (cool and dry) are favorable for burning, there are vast numbers of fires that emerge across the countryside in Southeast Asia (Myanmar, Laos, Thailand, and Cambodia). The VIIRS satellite detected more fires in Southeast Asia during ACNY and 3/11–4/10 in 2020 as compared to 2019 (Figure 7). The CO columns in Southeast Asia observed by TROPOMI and IASI during ACNY and 3/11–4/10 in 2020 were also increased as compared to 2019 (Figure 5). The CO wildfire emissions from GFAS in March 2019 and March 2020 in Southeast Asia (blue box in Figure 7) were 1.75 × <sup>10</sup>−<sup>10</sup> and 9.88 × <sup>10</sup>−<sup>10</sup> kg/m2/s, respectively. Both the absolute values and the variation of CO wildfire emissions in Southeast Asia are comparable to the CO anthropogenic emis-

sion annual means in 2015 in Region B of 1.05 × <sup>10</sup>−<sup>9</sup> kg/m2/s from REAS v3.2 and of 7.13 × <sup>10</sup>−<sup>10</sup> kg/m2/s from EDGAR v5.0. As CO has a lifetime of about weeks to months, CO observed in Region B could be transported from the surrounding areas. The 3-day backward trajectories of 2m-height air at local noon for each day during ACNY and 3/11–4/10 in 2020 were simulated by the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model driven by the National Centers for Environmental/Prediction Global Data Assimilation System (NCEP/GDAS) meteorological data with a 1.0◦ × 1.0◦ (latitude by longitude) spatial resolution. Note that we only plotted the backward trajectories at the center of Region B, as the wind is generally harmonized in this region. The backward trajectories from 2 m height at the center of Region B suggest that the CO surface concentration in this region has little influence from the fire in Southeast Asia. The 3-day backward trajectories from a 2 km height at the center of Region B show that the fire emission in Southeast Asia can be transported to South China, which is consistent with the winds at 750 hPa from the ERA5 reanalysis data. As a result, the CO column in Region B is strongly affected by the fire in Southeast Asia, and more fires in 2020 led to a CO enhancement in the free troposphere in South China during the lockdown observed by the satellite. The CO increase during ACNY and 3/11–4/10 in 2020 observed by IASI was even larger than that observed by TROPOMI, as the IASI retrieval is more sensitive to the mid- and upper-troposphere.

**Table 2.** The median of the relative changes in CO surface concentration observed by surface measurements, and in CO columns observed by TROPOMI and IASI satellite measurements during four periods in each region.


**Figure 6.** Box plots of the CO changes from the surface (**a**), TROPOMI (**b**), and IASI observations (**c**) during 4 periods in 2020 against those in 2019. Each box plot shows the values of relative difference for the maximum (top of solid line), 75th percentile (top of box), median (line through middle of box), 25th percentile (bottom of box) and minimum (bottom of solid line) of the distribution.

**Figure 7.** The number of fires observed by the Visible Infrared Imaging Radiometer Suite (VIIRS) satellite in 0.5◦ × 0.5◦ (latitude by longitude) grids over Southeast Asia (blue box) together with the wind at the 750 hPa from the ERA5 reanalysis data during ACNY and 3/11–4/10 in 2019 (**left**) and the difference in the number of fires between 2020 and 2019 (**right**). The green and black lines in the right panels are 3-day backward trajectories at 12:00 (local time) from 2 m and 2 km heights at the center of Region B (red box) for each day during ACNY and 3/11–4/10 2020 simulated by the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model.

#### **4. Discussions**

In this study, we investigated CO changes based on both CO surface measurements and satellite column measurements. CO reduction was observed by both surface and satellite observations during the COVID-19 lockdown at most places in China. We have highlighted the importance of seasonal variations of CO surface concentration, which must be taken into account when looking at the CO changes during the COVID-19 lockdown, but have not been done in several previous studies [18,19]. In addition, we found that the specific changes in the industrial emissions at the city scale are important to the changes in CO surface concentration at Wuhan, which are suggested by the simultaneous SO2 measurements and VIIRS/MODIS fire measurements. However, the limitation of this study is that the impact of the industrial emission on the CO change was only discussed qualitatively, because up-to-date CO anthropogenic emissions for the year 2020 are not currently available. A further study could focus on the application of the inverse modeling approach with the surface CO measurements as the inputs to optimize each anthropogenic component.

Different from the CO surface concentration, the changes in CO columns during the COVID-19 lockdown in China observed by TROPOMI satellite measurements using the difference between 2019 and 2020 in this study are similar to the results using only 2020 measurements before and after the lockdown [21], because the XCO means from TROPOMI were almost the same during these four periods. However, the month-to-month variation in CO columns observed by IASI cannot be ignored. In order to reduce the uncertainty, the satellite measurements were only discussed with the median values during each period on the regional scale. The changes in the CO columns observed by satellites are generally consistent with those at the surface in most regions in China under the assumption that the CO concentration above the PBL is not greatly changed. The assumption works well for NO*x* [12], as it has a short lifetime of several hours in the atmosphere. However, due to the relatively long lifetime of CO, the assumption does not work for CO in Region B, where the CO concentration above the PBL was strongly affected by the fire emissions transported from Southeast Asia. We addressed the fact that the CO changes in the free atmosphere are important when comparing the surface and satellite measurements.

#### **5. Conclusions**

Surface observations have shown that CO concentrations were at the same level during BCNY in 2019 and 2020, and there was a mean reduction of 18.7% during ACNY in 2020 as compared to 2019, from 1375 sites in China due to the COVID-19 lockdown. Reductions in CO surface concentration were also observed from 3/11–4/10 and 4/11–5/10 in 2020, but they were smaller than the reduction during ACNY. As the spatial variability of CO surface concentration changes across China is high, we investigated the CO changes at six megacities specifically. Large reductions in CO concentration between 20% and 27% during ACNY in 2020 were found in Shanghai, Chengdu, Zhengzhou, and Guangzhou. The CO surface reduction during ACNY in Beijing was only 8%, which may be explained by the exceptional meteorological conditions in that period in 2020. The most strict measures related to COVID-19 were carried out at Wuhan, but there was no decrease in the CO surface concentration observed during the lockdown in 2020 as compared to 2019. By looking at other air pollutants in Wuhan, we found that NO2, PM2.5 and PM10 were significantly reduced (>40%) during ACNY in 2020 as compared to 2019, and SO2 and CO were both slightly increased. The similar behavior of CO and SO2 suggests that they came from common sources, e.g., the burning of fossil fuels by industrial facilities. The TROPOMI and IASI CO column observations captured the reduction in CO columns (by 2 to 13%) during ACNY in Regions A, C, and D, but the reductions in CO columns were less significant than the reductions in the surface CO concentrations. However, the TROPOMI and IASI observations show that there were 8.8% and 36.7% CO column enhancements during ACNY in 2020 in Region B, which is contrary to the significant reduction (>20%) observed in CO surface concentrations.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/2072-4 292/13//1129/s1, Figure S1: the wind above China during four periods in 2019 and 2020. Figure S2: VIIRS fire map around Wuhan during BCNY and ACNY in 2019 and 2020. Figure S3: GFAS CO wildfire emissions around Wuhan during BCNY and ACNY in 2019 and 2020, together with the CO anthropogenic emissions from EDGAR v5.0 and REAS v3.2 around Wuhan in 2015. Figure S4: The time series of CO, SO2, NO*x*, PM2.5 and PM10 in Wuhan. Figure S5: VIIRS fire map inside Wuhan during BCNY and ACNY in 2019 and 2020.

**Author Contributions:** Conceptualization, M.Z. and M.D.M.; methodology, M.Z. and J.J.; writing original draft preparation, M.Z. and J.J.; writing—review and editing, J.J., B.L., B.D., M.K.S., and M.D.M.; visualization, M.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by S5P-MPC (No. 4000117151/16/I-LG) and CAMS84.

**Data Availability Statement:** The TROPOMI CO data are publicly available at ESA Copernicus Open Access Hub https://scihub.copernicus.eu/. The IASI satellite data are publicly available at https://iasi.aeris-data.fr/cos\_iasi\_b\_arch/. The surface CO measurements are publicly available at https://quotsoft.net/air/. The VIIRS fire observations are publicly available at https://firms. modaps.eosdis.nasa.gov/.

**Acknowledgments:** The authors would like to thank the Ministry of Ecology and Environment of China for providing the CO surface observations, ESA for making the TROPOMI satellite data publicly available, ULB-LATMOS and EUMETSAT for providing the IASI data, and NASA for providing the VIIRS fire products.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Satellite-Based Land Use Regression Model of Ambient NO2 with High Spatial Resolution in a Chinese City**

**Lina Zhang 1, Changyuan Yang 1, Qingyang Xiao 2, Guannan Geng 2, Jing Cai 1, Renjie Chen 1, Xia Meng 1,3,\* and Haidong Kan <sup>1</sup>**


**Abstract:** Previous studies have reported that intra-urban variability of NO2 concentrations is even higher than inter-urban variability. In recent years, an increasing number of studies have developed satellite-derived land use regression (LUR) models to predict ground-level NO2 concentrations, though only a few have been conducted at a city scale. In this study, we developed a satellite-derived LUR model to predict seasonal NO2 concentrations at a city scale by including satellite-retrieved NO2 tropospheric column density, population density, traffic indicators, and NOx emission data. The R<sup>2</sup> of model fitting and 10-fold cross validation were 0.70 and 0.61 for the satellite-derived seasonal LUR model, respectively. The satellite-based LUR model captured seasonal patterns and fine gradients of NO2 variations at a 100 m × 100 m resolution and demonstrated that NO2 pollution in winter is 1.46 times higher than that in summer. NO2 concentrations declined significantly with increasing distance from roads and with increasing distance from the city center. In Suzhou, 84% of the total population lived in areas with NO2 concentrations exceeding the annual-mean standard at 40 μg/m3 in 2014. This study demonstrated that satellite-retrieved data could help increase the accuracy and temporal resolution of the traditional LUR models at a city scale. This application could support exposure assessment at a high resolution for future epidemiological studies and policy development pertaining to air quality control.

**Keywords:** satellite-based; NO2; land use regression; exposure assessment

#### **1. Introduction**

Nitrogen dioxide (NO2) is not only a primary pollutant mainly from fossil fuel emissions but also a secondary pollutant arising in large part from a photochemical conversion combining NO with O3 [1,2]. It is a common indicator for traffic-related air pollution and proven to be associated with a myriad of adverse health effects. NO2 has been positively linked to lung cancer mortality in California by the American Cancer Society Cancer Prevention II Study [3]. In China, short-term exposure to NO2 was significantly associated with total natural causes mortality and cardiorespiratory disease mortality across 272 cities [4]. Even at or below the current European Air quality limit values, the associations between NO2 exposure and adverse effects have been found for both short-term and long-term exposure in Europe [5]. In previous epidemiological studies, exposure to NO2 was mostly evaluated using ground-based fixed monitoring data, interpolation methods, or land use regression (LUR) models [6,7].

**Citation:** Zhang, L.; Yang, C.; Xiao, Q.; Geng, G.; Cai, J.; Chen, R.; Meng, X.; Kan, H. A Satellite-Based Land Use Regression Model of Ambient NO2 with High Spatial Resolution in a Chinese City. *Remote Sens.* **2021**, *13*, 397. https://doi.org/10.3390/rs1303 0397

Academic Editor: Vijay Natraj Received: 9 December 2020 Accepted: 21 January 2021 Published: 24 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The concentrations of NO2 may decline at a distance of several hundred meters from emission sources [8], and the spatial distributions of NO2 differ significantly between, and especially within, cities [9,10]. In Canada, variations in NO2 concentrations within a city further showed a stronger association with cause-specific mortality than that between cities [11]. Thus, it is an essential issue to evaluate intra-urban NO2 concentrations with a high spatial resolution for epidemiological studies. The LUR models are one of the most common assessment methods used to capture spatial variability of NO2 with a high spatial resolution, and have been applied in NO2-related cohort studies in Europe and the United States [9,12–15]. Land use regression models also have been developed for predicting NO2 concentrations in Chinese cities, including Shanghai, Tianjin, and Wuhan [16–18]. Traditional LUR models highly depend on land use data and have lower temporal resolution, but these do not satisfy the flexible requirements of exposure assessment in epidemiological studies.

Satellite data have been proven to be one of the key predictors for estimating ambient NO2 concentrations with a high temporal resolution [19–21]. Specifically, a study in Western Europe indicated that the adjusted R2 of LUR models with satellite data was increased by 0.02–0.06 compared to the models without satellite data with the R<sup>2</sup> of 0.48–0.56 [22]. Other studies showed that the satellite-based LUR models could expand the temporal resolution of traditional LUR models for predicting air pollutants' concentrations, from annual level to monthly or seasonal scales [19,23–25]. NO2 column density from the Ozone Monitoring Instrument (OMI) aboard satellite Aura is the most commonly used dataset for establishing satellite-based LUR or machine learning models [26–28]. The satellite-based LUR models not only expanded the temporal resolution of traditional ones [19], but also simultaneously helped improve model performance [22,29,30]. However, in China, most of these studies were conducted at regional or national scales [21,31]; whether satellite data can improve the resolution and model performance of LUR models at a city scale, has not been fully evaluated. In addition, the row anomaly of OMI led to a large amount of missing data at the daily level [32], hence OMI NO2 column density data might be inappropriate to be directly used to assess NO2 exposure levels within a city at a daily scale, and some studies resampled the data at a seasonal scale [33].

Therefore, in this study, we developed a satellite-derived LUR model, in a Chinese metropolis, to capture intra-urban NO2 temporal variations at a seasonal level with a high spatial resolution. This model with a high spatial resolution is expected to capture the finer gradients of NO2 variations within a city at a higher temporal resolution than that of the traditional LUR model, which could provide more accurate exposure assessment for epidemiological studies.

#### **2. Materials and Methods**

#### *2.1. Study Area*

Suzhou is a city located in southeastern Jiangsu Province of East China (Figure 1). It includes five urban districts (Gusu, Huqiu, Wuzhong, Xiangcheng, and Wujiang) and four satellite cities (Changshu, Taicang, Kunshan, and Zhangjiagang). Suzhou is one of five urban locations in the China Kadoorie Biobank (CKB) cohort that have focused on common chronic diseases since 2004 [34]. We developed a satellite-derived LUR model in Suzhou as a case study to establish the methodology for the assessment of exposure to NO2 of the CKB cohort study to support the next phase of air pollution-related epidemiological studies. Suzhou covered 8488.42 km<sup>2</sup> in 2018 and about 42.5% of the total area was covered by waterbody. The total registered population in Suzhou reached 7.04 million by the end of 2018 (http://tjj.suzhou.gov.cn/sztjj/tjnj/2019/zk/indexce.htm). Suzhou is located in a subtropical monsoon climate zone with four distinct seasons.

**Figure 1.** The location of Suzhou in China and the NO2 monitoring sites in Suzhou that were used in this study.

#### *2.2. Data*

The database included data on NO2 monitoring, NO2 tropospheric column density from the OMI instrument, population density, road network, land use parameters, and NOx emissions.

#### 2.2.1. Monitoring Data

Daily NO2 monitoring data of 20 fixed air quality stations were obtained from the National Environmental Monitoring Network, and the locations of the stations are shown in Figure 1. In accordance with the Chinese Ambient Air Quality Standard (GB3095-2012), at least 20 hourly measurements were included to calculate the daily NO2 concentration; at least 27 daily values were needed to calculate monthly concentrations (25 daily values for February); at least 324 daily values were needed to calculate the annual concentration. Most of the fixed stations were located in areas with a relatively high population density to represent the averaged exposure levels for public health.

#### 2.2.2. Satellite Data

The OMI instrument is on board the National Aeronautics and Space Administration (NASA) Aura satellite that was launched in 2004. It measures radiances across 270–500 nm of the ultraviolet and visible waveband. Global tropospheric vertical column NO2 density data of OMI level 2 (OMNO2) product, with a spatial resolution of 13 km × 24 km at nadir [35], are available online at a daily time step and were downloaded from NASA Goddard Earth Sciences Data and Information Services Center (https://earthdata.nasa. gov/). Cloud cover and a dynamic row anomaly problem of OMI were responsible for a significantly high rate of missing values of daily data. The "row anomaly" occurred due to the technical issues of the OMI, which has produced invalid data in the center-right part of each swath of observations since 2008 [32]. Within a city, the high missing rate

might cause low availability of OMI NO2 tropospheric column density data at a daily level. Therefore, seasonal resampling was done by averaging all daily OMI NO2 tropospheric column density data falling inside a 40 km × 40 km grid to fill the gap caused by missing data and smooth the noise [33]. The satellite data were then interpolated to the fixed monitoring stations using an inverse distance weighted (IDW) method.

#### 2.2.3. Other Predictors

#### Land Use Parameters

Land use data (agricultural, forest, grassland, waterbody, urban and built up, and unused land) from 2014 were interpreted from the Landsat TM5 dataset (https://earthexplorer. usgs.gov/) with a 30 m spatial resolution (Figure 2). Specifically, agricultural land included dry land and paddy fields; forest land included dense forests, shrub forests, loose forests, and other forests; grassland included highly-covered grassland; waterbody included rivers, lakes, beaches, bottomlands, and reservoirs; urban and built up land included urban and rural settlements and other built-up land; unused land included bare rock and sand. In Suzhou, the major land use types were urban and built-up land, agricultural land, and waterbody; and agricultural land mainly consisted of paddy fields. To optimize the correlation between NO2 measurements and land use predictors, different buffer distances were applied, from 100 m to 5000 m, at 100-m intervals, around the 20 fixed monitoring sites [10,17,36]. The areas of each land use type were then calculated within these buffer zones separately.

**Figure 2.** The spatial distribution of types of land use in this study in Suzhou in 2014.

#### Road Network

Lengths of major roads and distances to the nearest major road were calculated as indicators of traffic emissions. Types of roads included expressways, national roads, provincial roads, urban expressways, county roads, town roads, and other roads. Then, expressways, national roads, provincial roads, and urban expressways were merged as major roads. Within the buffers from 100 m to 5000 m (at 100 m intervals) around the 20 fixed monitoring sites, the lengths of major roads were then calculated [6,17]. Distance from monitoring sites to the nearest major road, inverse of the distance, and logarithmic transformation of the inverse distance were also calculated as indicators of traffic emissions [6,10].

#### Population Density

Population density data were obtained from the Oak Ridge National Laboratory (ORNL)'s LandScan 2014 global database at 30" × 30" resolution in raster format (http: //www.ornl.gov/sci/landscan/), which were then interpolated to the NO2 monitoring stations using the IDW method. The population data, with an ESRI binary raster format, is approximately at a 1 km × 1 km resolution and each grid represents an average population number within the grid at an annual level (https://landscan.ornl.gov/documentation). Figure 3 shows the spatial distribution of the population in Suzhou in 2014, suggesting that more people tended to live in the center of five urban districts and four satellite cities in Suzhou.

**Figure 3.** The distribution of population and major roads in Suzhou.

#### NOx Emissions

NOx emission inventory data were collected from the Multiresolution Emission Inventory of China (MEIC, http://www.meicmodel.org) at a spatial resolution of 1 km × 1 km. The industrial NO2 emissions from power plants and non-power plants were computed separately within buffer zones of 1 km to 10 km, at 1-km intervals, around each monitoring site.

#### *2.3. Model Development and Evaluation*

A traditional LUR model was developed, as the first step, to select the most optimized predictors from all parameters with a linear regression model [6,10,20,36]. Since the OMI NO2 tropospheric column density was aggregated at a seasonal level to fill the gap caused by the high missing rate of the satellite data [32], this model was developed at a seasonal level [37,38]. First, we set every potential variable a prior direction. Second, manual backward supervised regression was conducted based on NO2 seasonal concentrations to select the most optimized predictor variables. Predictors were kept in the model if they satisfied the criteria proposed by previous studies [6,10,17]: (1) the variables improved the model R<sup>2</sup> by at least 1%; (2) the effect directions of the variables were consistent with the prior directions; (3) the variables that were already in the model did not change their effect directions; (4) the variable would be excluded from the model if the *p* value was less than 0.1. This process continued until there were no more variables meeting the criteria. Variance inflation factors (VIFs) were calculated as an indicator of multicollinearity. Variables with VIF values greater than three were removed from the satellite-based LUR model and this step was repeated.

In the second step, a linear mixed effects model was developed (see Equation (1)) by involving random effects of OMI NO2 tropospheric column density [23,37]. The advantage of employing this model was to include the variability of associations between NO2 concentrations and OMI NO2 tropospheric column density over time. Similar satellite-based models had been developed for predicting PM2.5 concentrations in a national assessment [37] and PM10 concentrations within a city in Shanghai [23]. In this model, the OMI NO2 tropospheric column density had both random effect and fixed effect coefficients, which represented seasonal variability in the association between NO2 measurements and OMI NO2 tropospheric column density and the average effect of satellite measurements on the ground NO2 measurements for the whole year, respectively [23,37]. The model structure can be summarized as:

$$NO\_{2,st} = \left(\beta\_0 + \beta\_0\,'\right) + \left(\beta\_1 + \beta\_1\,'\right) \, OMI\_{st} + \beta\_{is}X\_{is} + \varepsilon\_{st} \tag{1}$$

where *NO*2*,st* indicates the mean observed NO2 concentrations (μg/m3) at the fixed station *s* in season *t*; *OMIst* is the only independent variable with both fixed and random effects, which represents OMI NO2 tropospheric column density data at the fixed station *s* in season *t*; β<sup>0</sup> and β0' are the intercepts of the fixed and season-specific random effects for the model, respectively; β<sup>1</sup> and β1'indicate the fixed and season-specific random slopes for *OMIst*, respectively; *Xis* represents a series of predictors, which are selected by satisfying the criteria from the first step; and βis represents the fixed slope for predictor *i* at the fixed station *s*; and ε*st* is the error term at the fixed station *s* in season *t*.

In the third step, 10-fold cross validation (CV) was applied to evaluate the model performance [17,37]: 90% of the data were randomly selected for model development, which was used to predict NO2 concentrations of the remaining 10% of the data; and this process was repeated 10 times. Root mean squared error (RMSE) was calculate as the standard deviation of the residuals. RMSE and R2 were used to evaluate the model's performance by comparing measured and predicted NO2 concentrations during model development and 10-fold CV, respectively. The relative prediction error (RPE, defined as RMSE divided by the mean NO2 measurements) from 10-fold CV was then calculated to evaluate prediction accuracy.

In the fourth step, seasonal prediction maps of NO2 concentrations in Suzhou were produced based on the satellite-derived LUR models, at a 100 m × 100 m resolution at a seasonal timescale. In addition, we further calculated annual-mean and seasonal-mean population-weighted NO2 concentrations in Suzhou [39] (see Equation (2)).

$$C\_{Pop} = \sum Pop\_i \times C\_i / \sum Pop\_i \tag{2}$$

where *CPop* indicates the annual-mean or seasonal-mean population-weighted NO2 exposure concentrations in Suzhou; *Popi* represents the population density of grid *i*; and *Ci* indicates the estimated annual-mean or seasonal-mean NO2 concentrations of grid *i*.

Figure 4 shows the workflow for the development of the satellite-derived LUR model in our study. Statistical analyses were performed with nlme packages (https://www. rdocumentation.org/packages/nlme/versions/3.1-151/topics/nlme) of R3.6.1.

**Figure 4.** Workflow for the development of the satellite-derived LUR model.

#### **3. Results**

#### *3.1. Descriptive Statistics Analyses*

In 2014, the annual-mean NO2 was 46.23 μg/m<sup>3</sup> in Suzhou, with the lowest concentration of 36.52 μg/m<sup>3</sup> recorded in summer and the highest concentration of 53.22 μg/m3 in winter, as measured at fixed monitoring sites. Among all predictors, the Pearson's correlation coefficient between seasonal OMI NO2 tropospheric column density and seasonal NO2 measurements was highest with the value of 0.65.

#### *3.2. Model Development and Evaluation*

After variable selection, as the results of the first step, the satellite-derived LUR model included four predictors: NO2 tropospheric column density from OMI, population density, log transformed inverse of nearest distances to major roads (Log\_distance), and NO2 non-power plants emissions within a 10-km buffer zone (Table 1). The R2 and RMSE of this model were 0.63 and 5.76 μg/m3, respectively. The R2 and RMSE of the 10-fold CV were 0.59 and 6.09 μg/m3, respectively. The VIFs of the four variables were all less than 2, showing weak multicollinearity among them.

**Table 1.** The traditional land use regression (LUR) model for predicting NO2 concentrations.


The results of the second step, including the estimated coefficients of fixed effects of the four predictor variables, are shown in Table 2. All predictors were positively and significantly associated with measured NO2 concentrations, with *p* values less than 0.05. The absolute contribution (IQR × β), for each influencing predictor, was calculated as the regression coefficient (β) of fixed effects multiplied by the inter-quartile range (IQR) of the corresponding predictor. The results indicated that the non-power emissions within a 10-km buffer zone and OMI NO2 tropospheric column density contributed most to NO2 concentrations, because they had higher IQR × β values (Table 2).

**Table 2.** The fixed effects of the satellite-derived LUR model for predicting NO2 concentrations.


<sup>1</sup> represents the regression coefficient (β) of fixed effects multiplied by the inter-quartile range (IQR) for each predictor at 20 monitoring sites.

The R2 and RMSE of the seasonal satellite-derived LUR model were 0.70 and 5.24 μg/m3, respectively. The R2 and RMSE of the 10-fold CV were 0.61 and 5.91 μg/m3, respectively, for the seasonal model (Figure 5). The RPE from 10-fold CV was 12.78%, which indicated a relatively high predicting accuracy at the seasonal level. The linear mixed effects model performed better than the traditional linear regression model, suggesting the importance of considering the seasonal variability of the association between ground NO2 measurements and OMI NO2 tropospheric column density.

**Figure 5.** Scatter plots of measured and predicted NO2 concentrations from model fitting (**left**), and 10-fold cross validation (**right**), respectively, for the satellite-derived linear mixed effects model at a seasonal timescale.

#### *3.3. Spatiotemporal Trends of Predicting NO2 Concentrations*

Predictive maps of NO2 concentrations with a spatial resolution of 100 m × 100 m were produced at a seasonal timescale (Figure 6). The seasonal pattern of predicted NO2 concentrations agreed well with field measurements. Mean NO2 concentration was highest in winter (47.3 μg/m3) in Suzhou, which was 1.46 times higher than that in summer. The spatial patterns of NO2 predictions were similar at different seasons throughout the year. Maps with high spatial resolution showed that severe NO2 pollution occurred along the major roads and declined significantly with increasing distance from the road. Urban centers with high population density and an intensive road network also experienced higher NO2 concentrations than that of the rural areas (Figure 6). For example, in summer, the maximum NO2 concentration (58.99 μg/m3) that occurred in urban areas was 2.77 times higher than the minimum value (21.33 μg/m3) in rural areas; and in winter, the maximum concentration (76.93 μg/m3) was 2.03 times higher compared to the lowest value (37.91 μg/m3) in rural areas. The results indicated that the NO2 concentration was generally higher in urban areas than that in rural areas both in winter and summer.

The population-weighted annual mean NO2 concentration in 2014 was 44.94 μg/m<sup>3</sup> in Suzhou, higher than the annual-mean predicted concentration of 41.4 μg/m3 and also higher than the annual-mean NO2 standard of 40 μg/m<sup>3</sup> defined in the Chinese National Ambient Air Quality Standards (GB 3095-2012). In winter, 99% of the total population lived in areas with NO2 concentrations exceeding 40 μg/m3 in Suzhou (Table 3).

**Table 3.** Population-weighted NO2 exposure concentrations.


\* Proportion: Proportion of population living in areas with NO2 concentrations exceeding 40 μg/m3.

**Figure 6.** NO2 spatial distribution at the seasonal level in Suzhou, 2014.

#### **4. Discussion**

Our study built a satellite-derived LUR model with OMI NO2 tropospheric column density data to predict NO2 concentrations at seasonal timescales with a high spatial resolution (100 m × 100 m) in Suzhou. The R2 values of model fitting and 10-fold CV were 0.70 and 0.61 at seasonal timescales, respectively, reflecting the relatively high stability of the model.

Our seasonal satellite-derived LUR model performance was comparable with previous satellite-based LUR models on NO2 concentration assessment at global, national, and regional scales. For the global satellite-based LUR model, the R2 and MAE (mean absolute error) for the model were 0.54 and 3.7 ppb at a 100 m×100 m resolution, respectively [20]. The adjusted R<sup>2</sup> values of models with satellite data were 0.48–0.58 in 17 contiguous countries of Western Europe [22]. The R2 of the model fitting and CV were 0.79 and 0.77 of the national satellite-derived LUR in the United States, respectively [19]. Similarly, in China, Xu et al. and Yang et al. developed satellite-derived LUR models at national and regional scales, respectively [21,31]. The R2 of 10-fold cross-validation (CV) was 0.78 for the national model in 2015 [31], and the R2 of model fitting was 0.61 for the regional model [21]. Although increasing studies have used machine learning methods with satellite

data to evaluate NO2 concentrations based on a large number of measurements from fixed monitors at regional or national scales [40–43], the training data may be insufficient to develop machine learning models within a city because of the limited number of fixed stations in this study. The comparison suggested that our satellite-derived LUR model, including satellite-retrieved NO2 tropospheric column density, population density, traffic indicators, and NOx emission data, predicted ground NO2 concentrations with relatively high accuracy based on the fixed stations in Suzhou.

In terms of NO2 concentration, our results exhibited significant spatial variability within a city at a fine spatial resolution (100 m × 100 m), and found a distinctive decline with increasing distance from the roads and significant differences between urban and rural areas. The high variability within a city suggested that exposure assessments of NO2 might be inaccurate if they just depended on measurements of a limited number of fixed monitoring sites. This high spatial heterogeneity may be mainly dependent on NO2 pollution-related sources, such as traffic and industrial emissions. Traffic and industrial emissions are known as the main sources of NO2, contributing to the high spatial heterogeneity of NO2 concentrations along roads and within a city. On one hand, NO2 is emitted as a primary pollutant from these sources. On the other hand, NO2 is also a secondary pollutant [1,2]. In our study, NO2 concentrations were significantly higher along roads and declined gradually with increased distance from roads in Suzhou, consistent with previous results of NO2 spatial heterogeneity along roads [8]. The variables indicating traffic-related sources in our study were also frequently used in the previous LUR models for NO2 concentrations assessment [6,17,36]. Additionally, industrial emissions, an important influencing predictor for NO2 assessment in our model, had also been found to be an important variable in the previous LUR models to predict ground NO2 concentrations within cities such as in Shanghai and Tianjin [16,17]. A recent study observed a notable decrease of NO2 concentrations during the Chinese New Year holiday in 2020 led by the novel coronavirus (COVID-19) lockdown compared to those before or after this period in Suzhou [44]. A sharp decline in traffic emissions and a slight reduction in industry emissions caused by the shut-down policies might be the main contributors to the decrease of NO2 concentrations during the lockdown period in Suzhou [44], suggesting that both traffic and industrial emissions are crucial sources of NO2 in Suzhou. Additionally, our results found that mean NO2 concentrations were higher in winter compared to that in summer. This was consistent with the previous studies on the seasonal pattern of NO2 concentrations in China [24,45]. In winter, NO2-related emissions are stronger due to more emissions from coal combustion for heating; while meteorological conditions are less favorable and could impede the dispersion and transportation of NO2 pollution [44,46,47]. Both of these might be contributors to the higher NO2 concentrations in winter [44,46,47]. Our results in Figure 6 showed an approximately lower ratio between urban and rural NO2 concentrations in winter compared to those in summer. This might be due to more coal combustion for the heating of houses in rural areas in winter compared to that in urban areas [48].

As another influencing factor for NO2 spatial heterogeneity, the spatial pattern of population density was highly consistent with that of NO2 predictions in Suzhou, suggesting that population density can be used as an indicator of anthropogenic emissions that reflects a series of emissions including traffic, industrial process, and heating sources [6]. High population density not only intensified the NO2 pollution, but also resulted in an increased exposure of populations to high NO2 levels. In this study, 84% of the population were exposed to higher NO2 levels than the national annual-mean NO2 standards (40 μg/m3) in Suzhou in 2014; while the proportion of the population exposed to concentrations exceeding the World Health Organization (WHO) annual NO2 standards (40 μg/m3) was only 8% in Western Europe [39], which was much smaller than that in Suzhou. This might be because a high population density and high concentrations of air pollution coexist in Chinese cities. For example, many residential buildings are located along major roads for the convenience of transportation, and residents living in these buildings might be

both influenced by the traffic-related emissions and housing heating emissions, especially during winter in the rural areas. Our results suggested that policy makers should take effective interventions for these areas of higher NO2 concentrations, especially for urban regions with the higher population density, which is an urgent need for the public health.

The satellite-based LUR model also expanded the temporal resolution and improved the accuracy of seasonal NO2 predictions. Land use data, including land cover, road network, and population data, used in traditional LUR models commonly have lower temporal resolution, whereas the NO2 tropospheric column density data could represent temporal variability of NO2 concentration with a strong correlation with ground NO2 concentration. Previous studies mostly employed satellite data to expand the temporal resolution of the LUR model for the assessment of NO2 concentrations to seasonal or monthly timescales at national or regional scales [19,21,30]; however, few satellite-based LUR models on NO2 concentrations assessment have been developed at a city scale considering the local influencing factors with a flexible timescale in China. In this study, we developed a satellite-based LUR model in Suzhou to capture the fine gradients of NO2 concentrations at a spatial resolution of 100 m × 100 m. More importantly, our predictions captured the significant seasonal variability of NO2 concentrations within a city, which could not be achieved by traditional LUR models. These findings suggested that the satellite-derived model could provide exposure assessment of NO2 concentrations at a flexible timescale for epidemiological studies and scientific evidence for protecting residents from NO2 pollution.

Our study has several limitations. First, the OMI NO2 tropospheric column density for spatial prediction was relatively coarse (13 km × 24 km). Satellite-based NO2 data with a higher spatial resolution could help improve the model performance in the future when they are available. Second, our model was developed at a seasonal level rather than a daily level. The cloud cover and row anomaly problem of OMI lead to missing data at a daily level within a city; therefore, we resampled OMI data at a seasonal level to fill the gap. Satellite-based NO2 data with a lower missing rate might help improve the temporal resolution of our model in the future. Third, traffic counts are an ideal predictor to identify the traffic emissions, but these were not accessible for this study. We used major road lengths and distance to the nearest major road as surrogates of traffic counts to indicate the influence of traffic emissions on NO2 concentrations. This was also applied as a traffic variable in NO2 LUR models in the European Study of Cohorts for Air Pollution Effects (ESCAPE) project and other studies of the development of NO2 LUR models [6,36].

#### **5. Conclusions**

In summary, the satellite-derived LUR model could predict seasonal NO2 concentrations at a 100 m × 100 m resolution with relatively high accuracy, at a city scale. This model could capture the fine gradients both along the road and within the urban-rural areas for each season based on the satellite data. According to the predictions, we found that 84% of the city's total population lived in areas with NO2 concentrations exceeding the national annual standard of NO2 of 40 μg/m<sup>3</sup> in Suzhou in 2014. Hence, reducing NO2 concentrations is urgently needed, especially for urban areas with a higher population density. This model and its predictions could support policy developments in the control of air quality and accurate exposure assessment for future epidemiological studies.

**Author Contributions:** Conceptualization, L.Z., Q.X., G.G., J.C., R.C., X.M., and H.K.; methodology, C.Y., Q.X., G.G., J.C., R.C., and X.M.; software, L.Z., and X.M.; validation, L.Z., and X.M.; formal analysis, L.Z., and X.M.; investigation, C.Y.; resources, C.Y., and X.M.; data curation, C.Y.; writing original draft preparation, L.Z.; writing—review and editing, L.Z., and X.M.; visualization, L.Z.; supervision, L.Z., and X.M.; project administration, H.K.; funding acquisition, H.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (91843302, 91643205 and 82003413) and the National Key Research and Development Program of China (2016YFC0206504).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Publicly available datasets were analyzed in this study. These data can be found here: https://earthexplorer.usgs.gov/ (OMI NO2 tropospheric column density data and Landsat TM5 dataset), http://www.ornl.gov/sci/landscan/ (population density data).

**Acknowledgments:** Thanks to Qiang Zhang, University of Tsinghua, for providing the NOx emission inventory data.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Global-Scale Patterns and Trends in Tropospheric NO2 Concentrations, 2005–2018**

**Sadegh Jamali 1,\*, Daniel Klingmyr <sup>2</sup> and Torbern Tagesson 3,4**


Received: 9 October 2020; Accepted: 26 October 2020; Published: 28 October 2020

**Abstract:** Nitrogen dioxide (NO2) is an important air pollutant with both environmental and epidemiological effects. The main aim of this study is to analyze spatial patterns and temporal trends in tropospheric NO2 concentrations globally using data from the satellite-based Ozone Monitoring Instrument (OMI). Additional aims are to compare the satellite data with ground-based observations, and to find the timing and magnitude of greatest breakpoints in tropospheric NO2 concentrations for the time period 2005–2018. The OMI NO2 concentrations showed strong relationships with the ground-based observations, and inter-annual patterns were especially well reproduced. Eastern USA, Western Europe, India, China and Japan were identified as hotspot areas with high concentrations of NO2. The global average trend indicated slightly increasing NO2 concentrations (0.004 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) in 2005–2018. The contribution of different regions to this global trend showed substantial regional differences. Negative trends were observed for most of Eastern USA, Western Europe, Japan and for parts of China, whereas strong, positive trends were seen in India, parts of China and in the Middle East. The years 2005 and 2007 had the highest occurrence of negative breakpoints, but the trends thereafter in general reversed, and the highest tropospheric NO2 concentrations were observed for the years 2017–2018. This indicates that the anthropogenic contribution to air pollution is still a major issue and that further actions are necessary to reduce this contribution, having a substantial impact on human and environmental health.

**Keywords:** tropospheric NO2 concentrations; nitrogen dioxide; OMI; spatio-temporal trends; DBEST; PolyTrend; time-series analysis; breakpoint detection

#### **1. Introduction**

Air pollution is one of the main threats to human health, ecosystems and climate on a global scale [1,2]. The global population is growing substantially, and more than half of the world's population now live in urban areas. Large urban areas and high population densities are hotspots for air pollution [1,3]. According to the World Health Organization (WHO), about 3 million people die annually due to ambient air pollution, mainly in low- and middle-income countries, and about 90% of the world's population are exposed to air that exceeds the WHO air quality guidelines [4].

Nitrogen dioxide (NO2) is one of the most important air pollutants in the atmosphere [5] and linked to a number of both environmental and epidemiological effects [2,6]. It is formed in processes where nitrogen reacts with oxygen in high temperatures, e.g., through lightning and the combustion of fuels [7]. The main anthropogenic sources of NO2 emissions are transport, industry processes and energy production [8]. Some of the main environmental effects linked to high NO2 concentrations are acidification, eutrophication and photochemical formation of ozone (O3) [6,7,9]. NO2 also modifies the radiative balance in the atmosphere and influences the atmospheric lifetime of greenhouse gases [10,11]. NO2 is toxic at high concentrations, and the epidemiological effects include respiratory illnesses such as lung cancer, asthma exacerbations and cardiopulmonary mortality [2,5,7,12]. NO2 has a short atmospheric lifetime, on average 3.8 ± 1.0 h (mean ± 1 standard deviation) [8] as it reacts with sunlight, which triggers the production of hydroxyl radical OH [13]. Therefore, high concentrations of tropospheric NO2 are mainly confined to its emission sources, which in general are urban and industrialized areas [2,5].

Monitoring of NO2 concentrations can be done with ground-based monitoring stations. However, monitoring stations tend to be clustered in city centers, have a small spatial coverage and are often lacking in developing countries [2,14]. Ground-based air quality monitoring is thereby unevenly distributed, and large areas are under-represented [14,15]. An alternative approach to monitor air pollution is the usage of remotely sensed satellite data that increase the spatial coverage. Major advances have been made over the past decades to use satellite sensors to monitor atmospheric pollutants [1]. Satellite monitoring of NO2 started in 1995 with the Global Ozone Monitoring Experiment (GOME) instrument [3]. Since then, other satellite instruments have been used to monitor tropospheric NO2, such as GOME-2, the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY), the Ozone Monitoring Instrument (OMI) and the recent TROPOspheric Monitoring Instrument (TROPOMI) aboard Sentinel-5 Precursor. Out of these instruments, OMI offers the longest continuous monitoring record (ongoing since 2004) and has a relatively high spatial resolution (13 <sup>×</sup> 24 km<sup>2</sup> at nadir) [6,7]. Potential errors in estimating NO2 concentrations from satellite data include uncertainties in surface albedo, aerosols, cloud parameters, slant column density and air mass factor calculations [2,6,16]. Therefore, for satellite-based products to be trustworthy, the data need to be compared against other observations of NO2 concentrations, such as from ground-based monitoring stations [17].

Studies of long-term trends in air pollution provide information about likely changes and distribution patterns that are useful for assessing the effects of emission mitigation efforts [18–20]. Such studies investigating NO2 trends using OMI data and validating derived results against ground-based measurements have been performed previously. For instance, there are studies on NO2 trends over USA [2,15,21], over China [22], Russia [23], in eight European cities [1] and in cities around the globe [24]. These studies have reported declining NO2 trends in their respective study areas and relationship between OMI and ground-based measurements with correlation coefficients ranging between 0.3 and 0.93. NO2 trend studies on a global scale have also been performed previously using various satellite sensors, but these studies have overall found both negative and positive trends [3,5,19,25,26].

For trend analysis, one of the most widely used methods is the ordinary least-squares (OLS) linear regression, as performed in most of the above-mentioned studies. These simple linear models only provide partial insights on the mechanism essential for an appropriate attribution of drivers of changes. Actual changes can abruptly occur caused by climatic extreme events, anthropogenic mitigations efforts or changes in contributing factors to air pollution. These changes may only be visible for a short period in time, despite having long-lasting effects, and will therefore remain undetected using such traditional linear trend models [27–29].

Recent advances in time-series and breakpoint analysis open new possibilities for studying tropospheric NO2 concentrations observed by Earth observation satellites, as they allow for the detection of nonlinear trends and turning points in the concentrations. Nonlinear trend models (e.g., PolyTrend) can separate trends into linear and nonlinear trend types [30]. Piecewise linear models, such as Break For Additive Season and Trend (BFAST) [27] or Detecting Breakpoints and Estimating Segments in Trend (DBEST) [29], allow for separating time-series into individual segments, capturing dynamics in specific explanatory variables [28,31–33]. By using these methods, dynamics in tropospheric NO2 concentrations may be better characterized by capturing specific atmospheric conditions and stages of pollution development through time.

Hence, the main aim of this paper is to analyze global and regional patterns and trends in tropospheric NO2 concentrations using a continuous time-series of tropospheric NO2 concentrations from the OMI instrument from 2005 to 2018 with novel methods within time-series and breakpoint analyses. Specifically, we aim at (1) comparing the OMI data against NO2 concentrations from ground-based monitoring stations, (2) analyzing spatial patterns and temporal (nonlinear) trends, (3) investigating whether regional differences can be found in global NO2 concentrations and (4) spatially explicitly detecting major breakpoints in NO2 concentrations and estimating their timing and magnitude at global scale.

#### **2. Materials and Methods**

#### *2.1. Satellite-Based NO2 Dataset*

Aura is one of the National Aeronautics and Space Administration's (NASA) Earth Observing System (EOS) satellites. It was launched in 2004 with the mission to collect data of global air pollution and to monitor the chemistry and dynamics of Earth's atmosphere on a daily basis [34]. Aboard Aura there are four instruments, one of which is OMI [34,35]. OMI is a nadir-looking push broom hyperspectral imaging spectrometer that measures reflected solar radiation in the ultraviolet and the visible light (UV/VIS) channels of the electromagnetic spectrum (wavelength range of 264–504 nm) with a spectral resolution of 0.42–0.63 nm [36,37].

We used the OMI/Aura level 3 NO2 (OMNO2d) standard product (the cloud screened subset 4) downloaded from NASA's Earth Observation data collection [38]. The OMNO2d product contains composites of daily total tropospheric column NO2 data with a spatial resolution of 0.25◦ × 0.25◦. In this study, we used OMI data from 1 January 2005 until 31 December 2018 (in total, 5092 daily OMNO2d files considering 21 gaps in the daily data files). We also excluded all pixels with less than 50 days of data per year, in order to minimize influences of errors in the retrieval process.

#### *2.2. Ground-Based NO2 Dataset*

The ground-based data are annual averages (*n* = 6093) of daily observations of atmospheric NO2 concentrations (*n* = 1,706,830) from monitoring stations in the USA between the years 2005 to 2018, provided by the United States Environmental Protection Agency (US EPA) [39]. The reference method used by the US EPA for collection of ambient NO2 is chemiluminescence analysis [40] based on the reaction of nitric oxide (NO) with ozone (O3). The principle of the method is that a sample of ambient gas enters a reaction chamber where NO molecules react with O3 to form NO2. The reaction produces a quantity of light, a phenomenon known as chemiluminescence. The intensity of the light, which is proportional to the concentration of NO2, is then measured to determine the concentration of NO2 [40,41].

#### *2.3. Comparison against Ground Observations*

The daily OMI NO2 data were first averaged monthly, and thereafter annually. Annual averages were used since this study focuses on long-term trends, and it is therefore the inter-annual variability that must be validated. The annual averages were then compared to corresponding ground-based NO2 data in order to verify the validity of OMI NO2 product. Since the two datasets use different units (10<sup>15</sup> molecules cm−<sup>2</sup> for the satellite-based data and part per billion (ppb) for the ground-based data), we calculated z-scores using the z-statistic ((data value − average)/standard deviation) for both datasets. The relationships between the two datasets were quantified using the root-mean-square error (RMSE), and by goodness-of-fit when fitting the ordinary least-squares linear regression on the z-scores for the two datasets.

#### *2.4. Analysis of Spatial Patterns and Temporal Trends*

The spatial patterns were analyzed by averaging all OMI NO2 data pixel-wise over the study period. For analyzing the temporal trends, time-series of annual mean NO2 concentration were first calculated. Then we applied PolyTrend to analyze and classify trends in the annual NO2 time-series 2005–2018. We also applied the DBEST program to detect the greatest significant breakpoints in the annual NO2 time-series and estimate their timing and magnitude. The PolyTrend and DBEST analyses

were both performed at pixel level having a statistical significance threshold (α) of 0.05. Pixels with an absolute value of annual average tropospheric NO2 concentration below the OMI detection limit (0.5 <sup>×</sup> <sup>10</sup><sup>15</sup> molec.cm<sup>−</sup>2) [42] were excluded from the trend analyses.

#### 2.4.1. Nonlinear Trend Analysis with PolyTrend

PolyTrend is an automated method with an algorithm that accounts for nonlinear change in a trend [30]. It uses a polynomial fitting-based scheme that divides trends into linear and nonlinear trend behaviors and then subdivides the nonlinear trends into classes of cubic, quadratic, and concealed trend types. The linear trend type means that the trend line has a uniform direction over the study period (either increasing or decreasing). The quadratic trend type is a trend line with one bend in its curve, implying that the cell has experienced one direction-change in its trend line over the study period (i.e., first positive and then negative trend, or vice versa). The cubic trend type means that the trend line has two bends, implying that corresponding cell has experienced more than one change in the trend direction over the study period (i.e., first decreasing followed by increasing and then again decreasing change, or vice versa). The concealed trend type consists of cells with either cubic or quadratic trend types, but with no significant net change in tropospheric NO2 concentrations over the study period. We refer to Jamali et al. [30] for more details.

#### 2.4.2. Breakpoint Analysis with DBEST

DBEST was developed for analyzing time-series of satellite sensor data, and it uses a segmentation method for two main algorithms of trend generalization and change detection [29]. We used its change detection algorithm in order to detect breakpoints with greatest change in tropospheric NO2 concentrations. Our input data in DBEST were the pixel-wise time-series of the annual average NO2 concentrations data.

First, DBEST tests for the occurrence of discontinuities, in this case of tropospheric NO2 concentrations, by analyzing the absolute differences between consecutive data points and comparing this to the first level-shift-threshold set by the user (Table 1). If the difference is greater than the first level-shift-threshold, then it tests whether or not this difference caused a significant shift in the mean level of tropospheric NO2 concentrations and persisted over the duration-threshold. If the mean level before and after this identified discontinuity is greater than the second level-shift-threshold, DBEST considers this a level-shift point. DBEST then repeats this process for all data points, sorts them into descending order based on the absolute value of tropospheric NO2 concentrations difference, and tests if the spacing between a data point and an identified level-shift point is at least the duration-threshold. The trend component of the time-series is then segmented using a peak/valley detector function and a method that draws a straight line through detected peak/valley points and compares perpendicular distances to the non-peak and non-valley points between them with the distance-threshold parameter. If the distance is greater than this threshold, these points are added to the set of detected peak/valley points and level-shift points, all of which are called turning points. Detected turning points are then fit to the tropospheric NO2 concentrations trend using piecewise linear modelling, and those turning points that minimize the Bayesian Information Criterion (BIC) [43] are considered breakpoints. Here, we used the change detection algorithm of DBEST with a set value (2) for the number of significant breakpoints of interest for detection (Table 1), and as such, DBEST identifies a final set of greatest significant breakpoints as requested by user. The results of the change detection algorithm include the starting time of the breakpoints (break date); the change duration, or the temporal period over which this change occurred; the change value, or the amount of change that occurred over this time period; the change type, whether the change is abrupt (level-shift) or non-abrupt; the change significance, based on the statistical significance level (α = 0.05).


**Table 1.** DBEST setting parameters, description and the threshold values used in this study.

Here, the annual average tropospheric NO2 concentrations time-series data were set as non-cyclical type (Table 1). The first level-shift-threshold was set to 0.1 <sup>×</sup> 1015 molecules cm−<sup>2</sup> and the second level-shift-threshold to 0.5 <sup>×</sup> 1015 molecules cm-2. It is recommended that the first level-shift-threshold be set to a smaller value than that for the second level-shift-threshold [29]. Therefore, if a detected change was quick (between two consecutive observations/years) and large enough (0.1 <sup>×</sup> 1015 molecules cm<sup>−</sup>2) to shift the mean over the user-set duration (2 years) by 0.5 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm−<sup>2</sup> before and after the point, it was characterized as an abrupt change, otherwise it was considered a non-abrupt change, provided that it was a significant breakpoint. The distance-threshold is normally set to be a default that is derived internally by DBEST.

#### **3. Results**

#### *3.1. Data Comparison against Ground Observations*

The comparison of OMI data against ground-based observations showed a strong relationship (Pearson's correlation coefficient R = 0.65) that was statistically significant (*p*-value < 0.01) (Figure 1). The relationship was equally strong (R = 0.65) when separating the analysis into a comparison of how well OMI captured the spatial variability (data averaged site-wise; Figure 1b). The OMI data were most successful at reproducing the inter-annual variability (data averaged annually), for which the observations were in a very close relationship with the ground-based observations (R = 0.99) (Figure 1c). The ordinary least-squares linear trend in annual averages of the z-scores in OMI NO2 concentrations (−0.220 <sup>±</sup> 0.027 z-scores y<sup>−</sup>1; R2 <sup>=</sup> 0.85) was very similar to the corresponding trend in the ground-based NO2 concentrations (−0.218 <sup>±</sup> 0.022 z-scores y<sup>−</sup>1; R2 <sup>=</sup> 0.83).

**Figure 1.** Comparison between z-scores from Ozone Monitoring Instrument (OMI)-based and ground-based tropospheric NO2 concentrations. (**a**) All annual averages of the ground-based stations against the annual averages of the corresponding OMI pixels. (**b**) The site-wise average for each ground-based station against the corresponding OMI-based pixels. (**c**) The annual averages of all ground-based stations against the annual averages of all corresponding OMI-based pixels. Included are also the ordinary least-squares linear regression (red) with corresponding regression equation and coefficient of determination (R2), the root-mean-square errors (RMSE) and the number of data points (n). Slope of the linear regression fit indicates Pearson's correlation coefficient (R). The black lines are the one-to-one lines.

#### *3.2. Spatial Patterns*

There is a distinct difference in the NO2 concentration distribution between the northern and southern hemispheres, where the higher concentrations are almost exclusively found in the northern hemisphere (Figure 2a). The primary hotspot areas are USA (Figure 2b), Western Europe (Figure 2c), and India, China and Japan (Figure 2d). While the mean global NO2 concentration was 0.2 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2, The Netherlands, Belgium, Germany, France, UK, Italy and Spain had the highest average NO2 concentration (on average 1.91 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), followed by Japan (0.91 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), India (0.43 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), USA (0.38 <sup>×</sup> 1015 molecules cm<sup>−</sup>2)andChina(0.36 <sup>×</sup> 1015 molecules cm<sup>−</sup>2) (Table 2). The maximum NO2 concentration was for China (28.24 <sup>×</sup> 1015 molecules cm<sup>−</sup>2) followed by Japan (14.28 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), Italy (11.84 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), Germany (11.34 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), USA (11.25 <sup>×</sup> 1015 molecules cm<sup>−</sup>2) and India (9.22 <sup>×</sup> 1015 molecules cm<sup>−</sup>2). Due to their high concentrations in tropospheric NO2, we selected these areas as focus areas used for further analysis in the remaining part of the study.

**Figure 2.** Spatial distribution of tropospheric NO2 concentrations (10<sup>15</sup> molecules cm<sup>−</sup>2) averaged over the years 2005–2018: (**a**) globally; (**b**) USA; (**c**) Europe; (**d**) India, China, Japan. Pixels with less than 50 days of data per year were excluded.

**Table 2.** The average, maximum and range of tropospheric NO2 concentrations (1015 molecules cm<sup>−</sup>2), 2005–2018, for the focus areas. Included are the trends in tropospheric NO2 concentrations averaged country-wise, as well as their strongest positive and negative trend slope (1015 molecules cm−<sup>2</sup> y<sup>−</sup>1).


#### *3.3. Temporal Trends*

Significant trends in NO2 concentrations were observed largely over land and to a much lower degree over oceans along boundaries with lands (Figure 3). With the insignificant no-trends masked out, 79.55% of the remaining cells had positive trend whereas 20.45% had negative trend. The increasing trends were distributed over large parts of land, but the decreasing trends were generally observed over USA (Figure 3a), Western Europe (Figure 3b), Japan and the eastern parts of China (Figure 3c). The global average trend in 2005–2018 was slightly increasing (0.004 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1); however, the regional negative trends were strong enough to compensate for the global rising trend of NO2 concentrations over larger areas. Globally, the strongest negative trend was <sup>−</sup>0.969 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm−<sup>2</sup> <sup>y</sup>−<sup>1</sup> while the strongest positive trend was only 0.363 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup>−<sup>1</sup> (Table 2).

Areas with high average NO2 concentrations, except India and western parts of China (Figure 2), generally showed negative trends (Figure 3; Table 2). On average, the strongest negative trends were found in Europe (Belgium: <sup>−</sup>0.143 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1; Netherlands: <sup>−</sup>0.132 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1; U.K.: <sup>−</sup>0.089 <sup>×</sup> 1015 molecules cm−<sup>2</sup> y−1; Italy: <sup>−</sup>0.070 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) followed by Japan (−0.049 <sup>×</sup> 1015 molecules cm−<sup>2</sup> y−1) and USA (−0.033 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1). The average trend was positive over India and the Middle East. The strongest positive average trend (0.040 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) was for India. Although the strongest negative trend in the focus areas (−0.946 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) was for China, the average trend for the entire country was just slightly increasing (0.014 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) because strong increasing trends (0.363 <sup>×</sup> 1015 molecules cm−<sup>2</sup> <sup>y</sup><sup>−</sup>1) were observed over large parts of the country as well (Figure 3c).

#### 3.3.1. Trend Types

In a global context, the linear trend was the dominant trend type with a spatial coverage of 61.98%, out of which 54.47% were positive and 7.51% negative (Figure 4a, Table 3). The concealed trend was the second trend type with 21.89% spatial coverage and mainly found over east of China and Southwestern Europe (Figure 4c,d). For the remaining trends, 9.77% were quadratic and 6.36% were cubic, out of which the majority was found over the eastern parts of USA (Figure 4a) and west of Europe (Spain and Portugal) (Figure 4b,c). In the focus areas, the dominant trend type was different for different areas. In the USA, the nonlinear trends (67.59%) were spatially more than the linear trends (32.41%) (Figure 4a, Table 3). In the focus areas in Europe, the most common trend type was linear (negative), except for Spain where the nonlinear trends, particularly the quadratic negative trends (57.96%), were dominant (Figure 4c, Table 3). The most common trend type over India was linear (increasing) (84.36%), and over Japan was linear (decreasing) (43.03%). China was the country with the largest proportion of nonlinear concealed trends in NO2 concentration (45.81%); it was also the second country with the highest proportion of linearly increasing trends (39.19%) after India (84.36%) (Figure 4c,d, Table 3).

#### 3.3.2. Breakpoints in Tropospheric NO2 Concentrations

The global tropospheric NO2 concentrations showed a slightly decreasing trend from 2005 to 2008, followed by a small, positive change (0.03 <sup>×</sup> 10<sup>15</sup> molecules cm−2) starting in 2008, and then a gradual increasing trend between 2011 and 2018 (Figure 5a). The annual average reached its highest values towards the end of the period in 2017–2018 (0.66 <sup>×</sup> 10<sup>15</sup> and 0.67 <sup>×</sup> 1015 molecules cm−2). Among the focus areas, only India showed a similar trend behavior but at a higher NO2 level and with a much greater positive change (0.20 <sup>×</sup> 10<sup>15</sup> molecules cm−2) in 2015 (Figure 5d). Japan was also similar in showing a linear long-term trend with only one breakpoint change but different in that the detected breakpoint was a great negative change (−0.47 <sup>×</sup> 10<sup>15</sup> molecules cm<sup>−</sup>2), thus resulting in an overall decreasing trend (Figure 5f). In contrast, the number of the greatest changes detected in NO2 concentrations over USA, Europe and China was two. The two greatest changes of USA (−0.50 <sup>×</sup> 1015 molecules cm−<sup>2</sup> and <sup>−</sup>0.08 <sup>×</sup> 1015 molecules cm−2) as well as Europe (−0.08 <sup>×</sup> 1015 molecules cm−<sup>2</sup> and <sup>−</sup>0.16 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2) were both negative and started either in the beginning (2004–2005) or towards the end of the studied period (2013–2016) (Figure 5b,c). The first greatest change detected over China was positive (0.78 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2) and started in 2008, but then a second big reverse change (−0.81 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2) happened in 2011 (Figure 5e). These two almost equally big but opposite changes (upward and then downward) with no relax time in between caused the overall NO2 trend being insignificant with no net-change in NO2 concentrations throughout the time period over China. This type of significant nonlinear trend was identified as concealed trend type (Figure 5e).

**Figure 3.** The slope of trend in tropospheric NO2 concentrations obtained by using the annual average tropospheric NO2 concentrations data series, 2005–2018, in PolyTrend: (**a**) globally; (**b**) USA; (**c**) Europe; (**d**) India, China, Japan. Insignificant no-trends were masked out (α = 0.05).

**Figure 4.** The type of trend in tropospheric NO2 concentrations obtained by using the annual average tropospheric NO2 concentration data series, 2005–2018, in PolyTrend: (**a**) globally; (**b**) USA; (**c**) Europe; (**d**) India, China, Japan. Insignificant no-trends were masked out (α = 0.05).

**Table 3.** Spatial coverage (%) of the significant increasing and decreasing trend types globally and in the focus areas with hotspots in average NO2 concentration. Insignificant no-trends were masked out (α = 0.05).


<sup>1.</sup> Lin = linear, Quad = quadratic, Cub = cubic, Conc = concealed.

**Figure 5.** Time-series of annual average tropospheric NO2 concentrations, 2005–2018, with a segmented trend estimated by Detecting Breakpoints and Estimating Segments in Trend (DBEST): (**a**) globally; (**b**) USA; (**c**) Europe; (**d**) India; (**e**) China; (**f**) Japan. The line segments in red denote breakpoints with greatest change (1015 molecules cm<sup>−</sup>2), and the dashed curves denote the type of trend estimated by PolyTrend.

Figure 6a shows the greatest breakpoint change detected in the annual average NO2 concentrations at pixel level. The spatial patterns of the detected short-term changes were similar to the long-term overall trends observed over lands (Figure 3): positive breakpoints were found over large areas in all continents (79.4%) and negative breakpoints mainly over the focus areas (20.6%). The greatest negative drop was for China (−12.41 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), followed by USA (−5.60 <sup>×</sup> 1015 molecules cm<sup>−</sup>2), Italy (−3.81 <sup>×</sup> 1015 molecules cm−2) and Japan (−3.78 <sup>×</sup> 1015 molecules cm−2) and then the other focus countries in Europe (Figure 6a, Table 4). The greatest positive change was also for China (6.65 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2) followed by India (2.13 <sup>×</sup> 10<sup>15</sup> molecules cm−2). Range of the change values was therefore the highest for China (19.06 <sup>×</sup> 1015 molecules cm<sup>−</sup>2) and the least for Netherlands (1.59 <sup>×</sup> <sup>10</sup><sup>15</sup> molecules cm<sup>−</sup>2) and Belgium (1.75 <sup>×</sup> 1015 molecules cm−2), where the average changes were high and no positive change was detected at all (Table 4). The type of majority of the detected greatest changes was non-abrupt, indicating that most of the changes occurred gradually over time, except for Belgium where the changes mainly happened abruptly (56.86%).

**Figure 6.** The breakpoint with greatest change in tropospheric NO2 concentrations obtained by using the annual average tropospheric NO2 concentration data series, 2005–2018, in Detecting Breakpoints and Estimating Segments in Trend (DBEST). (**a**) Magnitude of the change. (**b**) Starting time of the change.


**Table 4.** The values of the greatest breakpoint changes in tropospheric NO2 concentrations (10<sup>15</sup> molecules cm<sup>−</sup>2), the within-country average and range of changes, as well as the distribution of the type of the changes detected by Detecting Breakpoints and Estimating Segments in Trend (DBEST).

The starting time of the major drops in tropospheric NO2 concentrations is most often detected during the period of 2005–2009 for USA (89.6% of cells), Japan (78.8%) and Europe (57.8%) (Figure 6b). For India, the greatest positive change started most often during 2015–2017 (41.1% of cells). For China, the biggest positive change started mostly during 2008–2010 (54.3%) and then the greatest drop happened during 2011–2014 (88.7%).

In a global context, the years 2005 and 2007 were by far the years with the highest occurrence of negative breakpoints (27.7% and 17.4% respectively), indicating a major event during this period that had global effects and particularly in the focus areas (Figure 6a; Figure 7a). The time period with high occurrence of global positive breakpoints was 2008 to 2015, and the years 2008 and 2015 had the highest rates (12.4% and 12.2% respectively) (Figure 7b).

**Figure 7.** The temporal distribution of the global breakpoints with the greatest change in tropospheric NO2 concentrations detected over the years 2005–2018. The values on the *y*-axis are in percentage (%). (**a**) The greatest negative changes. (**b**) The greatest positive changes.

#### **4. Discussion**

The relationship between the satellite-based and the ground-based datasets supports previous OMI validation studies. For instance, the Pearson's correlation coefficient R was 0.65, which is within the middle of the range (0.40–0.80) of several other studies [1,2,15,22]. The statistical comparison further indicated that OMI was more successful at estimating the temporal component than the spatial component (Figure 1b,c). This can partially be explained since the ground-based monitoring stations are focused on a certain emission source (e.g., traffic locations), whereas an OMI pixel (13 <sup>×</sup> 24 km2) covers a larger area with potential emission sources both within and downwind from the pixel [1]. The strong relationships with the ground-based observations still indicate that OMI data are useful giving spatially explicit time-series of tropospheric NO2 concentrations to study global patterns and trends.

The spatial distribution of average NO2 concentrations found in this study (Figure 2) resembles those in other studies [5,19,25,26], confirming that the focus areas are indeed the main

hotspots of tropospheric NO2 concentrations globally. According to Krotkov et al. [25], the highest NO2 concentrations coincide with urban areas with high populations and industrialized regions. NO2 concentrations are generally much lower over oceans than that over land since there are no sources of NO2 emissions except for passing ships [44]. This indicates that the trends observed along offshore boundaries are possibly caused by atmospheric deposition of NO2 transported from their source by large-scale circulation [45]. According to Peters et al. [44], satellite instruments have issues with detecting trace gases over oceans because of the low NO2 concentrations often being below the detection limit of the instruments (0.5 <sup>×</sup> 1015 molecules cm<sup>−</sup>2).

The global and regional trends seen (Figure 3) generally agree with the results from previous studies. Previous studies have shown increasing trends over both India and China [5,19,25,26], where our results show increasing trends over both countries too (Figure 3). The decreasing trend with major drop in NO2 that we observed over Eastern USA confirms the previous study by Krotkov et al. [46] reporting a dramatic decrease in OMI NO2 from 2005 to 2015, as a result of both technological improvements and stricter regulation of emissions. In agreement with our trend results derived for Western Europe, recently Wang et al. [47] observed decreasing trends over Netherlands, Belgium, Germany and Italy, as detected in OMI NO2 concentrations for 2012–2018. The trend results seem to be consistent among studies with data used from different satellite instruments and/or study periods [5,19,25,26].

Decreases of NO2 concentrations can primarily be attributed to either local-, regionalor country-level environmental regulations, improvements in emission control technology (e.g., power plants and vehicles), or economic changes and the associated effects in energy usage [24,25]. Since the spatial distribution of average concentrations and significant decreasing trends correlate well, this indicates that environmental regulations and technological improvements in the countries with the most severe pollution have had a positive effect on concentrations of NO2. However, it should also be noted that the two final years of this study period (2017–2018) were the years with the highest average global concentrations. This clearly shows the importance of continuous satellite-based monitoring of global patterns and trends in NO2 concentrations, also for assessing the effects of regional environmental regulations and technological improvements to reduce emissions [48].

Linear regression models assume that changes occur linearly and gradually, which is not always the case [30,49]. Here, a polynomial fitting-based scheme (PolyTrend) was used to account for nonlinear trends. This polynomial approach thus helps to detect nonlinear trends in time-series that would not be identified by an ordinary least-squares (i.e., linear model) approach. The linear trend type was the dominant trend type globally (Figure 4; Table 3) as well as for Europe (except Spain), India and Japan, indicating monotonic (non-decreasing or non-increasing) trends over these areas. The nonlinear trends with a significant slope (quadratic and cubic) were mainly found over eastern parts of USA and Spain. Since the curve of these trends has one (quadratic) or two (cubic) bends, this indicates that the NO2 concentration trends in these areas either started with an increase and then decreased or the opposite started with a decrease and then increased (quadratic), or with even more short-term changes in the direction of the trend (cubic). The latter case is in agreement with the regional trend curve for USA: a cubic trend starting with a short-term downward trend, then an upward trend, and then again another downward trend (Figure 5b). The identified areas with the concealed trends, mainly in the eastern parts of China and south of Spain, are new findings that, up to the best of our knowledge, have not been reported yet. The reason is that the OLS method is often used in trend studies, and such nonlinear trends are not detectable when OLS is applied for the entire studied period. If OLS applies here, no significant trend in 2005–2018 is detected. However, the concealed changes are credible patterns of nonlinear changes such as the greatest breakpoint changes we detected in NO2 concentrations over China.

The majority of the detected significant breakpoints were non-abrupt indicating that the concentrations of NO2 changed gradually, possibly due to stricter environmental regulations or economic cycles, as opposed to abrupt changes (e.g., in Belgium and Netherlands), which could be due to power plants or industries that have been either opened or shut down suddenly. The years 2005 to 2009 were by far the years with the highest occurrence of negative breakpoints, and regional-scale reductions of tropospheric NO2 concentrations were also observed for USA, Europe and Japan during these years (Figures 5–7). It has also previously been pointed out that 2008 was a year of significant reductions in NO2 emissions (e.g., [21,22,50,51]) due to the start of the great economic recession [50,51]. This was an event, which caused large-scale economic reductions and affected anthropogenic activity globally, which in turn reduced the associated emissions of air pollution from, for example, vehicles, power plants and industries. According to the results of this study, the largest change magnitudes in NO2 concentrations during 2005–2008 were found in USA and Japan. The European countries appear to have suffered less, based on the changes in tropospheric NO2 concentrations (Figure 6, Table 4). The negative breakpoint we found over Eastern China with a four-year duration (2011–2014) is in general agreement with Li et al.'s [52] study of analyzing global change of tropospheric NO2 from 2012 to 2017 using data from the Ozone Mapping Profiler Suite (OMPS) Nadir Mapper (NM) onboard the Suomi National Polar Partnership (SNPP). They reported a large decline of NO2 in Eastern China started in 2013 and was almost entirely driven by wintertime decreases, thus indicating a decrease in anthropogenic emissions over the area. Souri et al. [53] in'their study of analyzing long-term trends of OMI NO2 concentration 2005–2014 over East Asia, also found downward trends in Japan and more developed Chinese cities such as Guangzhou and Beijing, and upward trends in the majority of northern regions of China in 2010–2013. This supports the concealed trend (upward–downward) we observed for China. Another study by Krotkov et al. [46] also showed similar severe declines of NO2 in Eastern China in 2011–2014 due to an economic shutdown and government efforts to restrain emissions from the power and industrial sectors. Likewise, the steepest increasing trend we observed was over India, and they reported a fast-growing trend from 2005 to 2015 for India's NO2 level from coal power plants and smelters.

The time-series analysis methods used in this study (PolyTrend and DBEST) benefit from recent developments, as mentioned earlier, but like many other methods they also have weaknesses. They work on a pixel-by-pixel basis, and they consider each pixel's time-series data as an isolated entity in their trend classification and change detection procedure; the spatial behavior of adjacent areas is not used to improve the robustness of trend/change detection [54]. Thus, the obtained trend and breakpoint results should be interpreted with caution.

Future research could include multiple breakpoint detection analyses using data for pre- and post-pandemic phases of COVID-19 to study impacts of possible changes in anthropogenic sources of NO2 emissions (e.g., transport, industry processes and energy production) on air pollution and tropospheric NO2 concentration trends.

#### **5. Conclusions**

This study contributes to the ongoing research regarding spatiotemporal patterns and trends in tropospheric NO2 concentrations using data from the OMI instrument, and it investigates how the tropospheric concentrations have changed globally and regionally over the period of 2005 through 2018. By applying novel techniques for analysis of time-series and their breakpoints, we quantified long-term nonlinear trends and provided information about distribution patterns in the point in time with the greatest changes.


Despite the breakpoint changes detected for the focus areas, the linear trend was the dominant trend type at global scale with a spatial coverage of 61.98%, out of which 54.47% were positive and 7.51% negative. The concealed trends, mainly observed over Eastern China and South Spain, ranked second. The years 2005 and 2007 were the years with the highest occurrence of negative breakpoints (27.7% and 17.4% respectively), indicating a major event during these years that had global effects and in the focus areas in particular. However, the trend thereafter reversed, and throughout the study period, the years 2017–2018 had the highest tropospheric NO2 concentrations. This indicates that the anthropogenic contribution to air pollution is still a major issue, and that further actions are necessary to reduce this contribution. These techniques for analysis of time-series and their breakpoints could be used for studying underlying causes to regional patterns in trends, possibly providing insights to impact of environmental regulations and other actions to prevent air pollution, having substantial impact on human and environmental health.

**Author Contributions:** Conceptualization, T.T. and D.K.; methodology, S.J., T.T. and D.K.; software, S.J.; validation, D.K. and T.T.; formal analysis, S.J. and T.T.; investigation, S.J. and T.T.; data curation, T.T. and D.K.; writing—original draft preparation, D.K.; writing—review and editing, S.J. and T.T.; visualization, S.J., D.K., and T.T.; supervision, S.J. and T.T.; funding acquisition, T.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Swedish National Space Board (SNSB Dnr 95/16). T.T. was also funded by the Danish Council for Independent Research (DFF, Grant ID: DFF-6111-00258).

**Acknowledgments:** We acknowledge NASA Goddard Space Flight Center, Goddard Earth Sciences Data and Information Service Center (GES DISC) for providing OMI/Aura NO2 Cloud-Screened Total and Tropospheric Column L3 (OMNO2d) through EARTHDATA GES DISC data portal (https://disc.gsfc.nasa.gov/). The authors thank the United States Environmental Protection Agency (US EPA) for providing ground-based atmospheric NO2 concentrations data (https://aqs.epa.gov/aqsweb/airdata/download\_files.html#Annual). The authors are very grateful for the constructive feedback from three anonymous reviewers that helped improve the quality of the article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Technical Note*

## **The Cross-Border Transport of PM2.5 from the Southeast Asian Biomass Burning Emissions and Its Impact on Air Pollution in Yunnan Plateau, Southwest China**

**Qingjian Yang 1, Tianliang Zhao 1,\*, Zhijie Tian 2, Kanike Raghavendra Kumar 3, Jiacheng Chang 4, Weiyang Hu 5, Zhuozhi Shu <sup>1</sup> and Jun Hu <sup>6</sup>**


**Abstract:** Southeast Asia is one of the largest biomass burning (BB) regions in the world, and the air pollutants generated by this BB have an important impact on air pollution in southern China. However, the mechanism of the cross-border transport of BB pollutants to neighboring regions is yet to be understood. Based on the MODIS remote sensing products and conventional observation data of meteorology and the environment, the WRF-Chem and FLEXPART-WRF models were used to simulate a typical PM2.5 pollution episode that occurred during 24–26 March 2017 to analyze the mechanism of cross-border transport of BB pollutants over Yunnan Plateau (YP) in southwest China. During this air pollution episode, in conjunction with the flourishing BB activities over the neighboring Indo-China Peninsula (ICP) regions in Southeast Asia, and driven by the southwesterly winds prevailing from the ICP to YP, the cross-border transport of pollutants was observed along the transport pathway with the lifting plateau topography in YP. Based on the proximity to the BB sources in ICP, YP was divided into a source region (SR) and a receptor region (RR) for the cross-border transport, and the negative and positive correlation coefficients (R) between PM2.5 concentrations and wind speeds, respectively, were presented, indicating the different impacts of BB emissions on the two regions. XSBN and Kunming, the representative SR and RR sites in the border and hinterland of YP, respectively, have distinct mechanisms that enhance PM2.5 concentrations of air pollution. The SR site is mainly affected by the ICP BB emissions with local accumulation in the stagnant meteorological conditions, whereas the RR site is dominated by the regional transport of PM2.5 with strong winds and vertical mixing. It was revealed that the large PM2.5 contributions of ICP BB emissions lift from the lower altitudes in SR to the higher altitudes in RR for the regional transport of PM2.5. Moreover, the contributions of regional transport of PM2.5 decrease with the increase in transport distance, reflecting an important role of transport distance between the source–receptor areas in air pollution change.

**Keywords:** Yunnan Plateau; biomass burning; cross-border transport; PM2.5; WRF-Chem

#### **1. Introduction**

Haze pollution caused by aerosol especially PM2.5 has significant adverse effects on environmental change and human health [1,2]. In this regard, research has paid a large

**Citation:** Yang, Q.; Zhao, T.; Tian, Z.; Kumar, K.R.; Chang, J.; Hu, W.; Shu, Z.; Hu, J. The Cross-Border Transport of PM2.5 from the Southeast Asian Biomass Burning Emissions and Its Impact on Air Pollution in Yunnan Plateau, Southwest China. *Remote Sens.* **2022**, *14*, 1886. https://doi.org/ 10.3390/rs14081886

Academic Editors: Maria João Costa and Daniele Bortoli

Received: 10 March 2022 Accepted: 11 April 2022 Published: 14 April 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

amount of attention to the air pollution in emission source areas such as the Yangtze River Delta, the Sichuan Basin, and the North China plain [3–5]. However, the Yunnan Plateau (YP), which is a relatively clean region in Southwest China having an aerosol optical depth (AOD) of ~0.1–0.2 [6,7], has lacked attention and studies on the mechanisms responsible for air pollution change.

The regional transport of atmospheric pollutants is one of the major elements affecting the air quality in China [8,9], and has become a critical part of the field of the atmospheric environment [10–12]. Long-range transport, including cross-border transport of atmospheric pollutants, can influence air quality in a large region [13–15]. As a result, strong winds can easily transport PM2.5 from upstream source regions to downwind areas, resulting in a rise in PM2.5 concentrations [16,17]. Meanwhile, strong winds can also play a role in sweeping local pollutants [18,19]. When the wind speed at source regions increases, air pollutants can be carried to the downstream regions, thus causing a reduction in pollutants in source regions. Therefore, it is an interesting topic of study to understand the role played by winds in the regional atmospheric environment.

Southeast Asia is one of the largest biomass burning (BB) regions in the world, with active fire activities in the spring [20]. As one of the major sources of PM2.5 emissions [21], BB emissions can contribute 70–80% to the total PM2.5 in source regions [22,23]. Meanwhile, PM2.5 generated by a large amount of BB emissions also has an impact on the downwind regions, due to the regional transport driven by atmospheric circulation [24]. Air pollutants produced from Southeast Asian BB emissions can be transported to Southwest China and the Yangtze River Delta over long distances, and even to Taiwan Province, Japan, and the entire East Asia. Although some studies found that air pollutants from BB in Southeast Asia have a significant impact on the atmospheric environment of Southeast China and the northwestern Pacific, few studies have investigated the influence of Southeast Asian BB on air quality in southwest China, and especially the YP region [25–28].

The YP region is located on the southwest border of China, and has a complex topography that gradually decreases from north to south, and southwesterly winds prevail in the YP region throughout the year [29]. The Indo-China Peninsula (ICP) in Southeast Asia, which adjoins the YP region, has shown high AOD values in the spring due to BB emissions [30]. Therefore, the YP region is inevitably influenced by the regional transport of air pollutants from Southeast Asia, especially ICP governed by southwesterly winds. However, the current studies on BB in Southeast Asia have mainly focused on the long-range transport of air pollutants with the effect on atmospheric chemical compositions [31,32], aerosols radiation [33–35], and climatic forcing [36,37]. Due to the lack of studies and analyses on the mechanism of cross-border transport of BB emissions from ICP to the bordering YP, the extent of the influence of BB on air quality in Southwest China is still not well understood. Therefore, in a region such as that of YP, which has the complex terrain of a plateau in a relatively clean environment, the underlying mechanism of air pollution is worthy of in-depth investigation associated with the cross-border or transboundary transport of PM2.5 from the Southeast Asian BB emissions.

In this study, we utilized the satellite-based MODIS remote sensing products and the observational data of meteorology and air pollutants to investigate an air pollution event that occurred during 24–26 March 2017 in the YP region associated with the cross-border transport of PM2.5 from the BB emission sources in ICP. By utilizing the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem) and the Flexible Particle dispersion model (FLEXPART) driven by WRF (FLEXPART-WRF), we simulated the spatial– temporal variations in PM2.5 over Southwest China and Southeast Asia. The present study explored how the regional PM2.5 transported from BB emission sources in ICP affects the air quality in the downwind YP region, and the extent to which the regional transport of PM2.5 from BB emissions affected PM2.5 concentrations in the air pollution episode in YP.

#### **2. Materials and Methods**

#### *2.1. Ground-Based Observation Data*

To investigate the distribution of meteorological factors and PM2.5, and to validate the performance of the WRF-Chem simulation, hourly data of surface PM2.5 concentrations were obtained from the China National Environmental Monitoring Centre (CNEMC). Hourly near-surface wind speed, relative humidity, and air temperature data were derived from the China Meteorological Administration (CMA). The time we mention henceforth is YP's local time (UTC + 08:00 h).

#### *2.2. MODIS Remote Sensing Products*

The MODIS instrument is a multispectral sensor aboard the Aqua and TERRA satellites. It contains 36 wavelength bands from 400 to 1440 nm and allows for the retrieval of aerosol items to cover the entire globe in 1–2 days. The AOD products derived from MODIS have been widely used at global or regional scales [38,39]. In this study, the MODIS Dark-Target/Deep-Blue combined data of Collection 6.1 averaged from Terra and Aqua were utilized to identify AOD's geographic distribution and to validate the performance of the WRF-Chem Model.

#### *2.3. Model Configuration*

#### 2.3.1. WRF-Chem Model

Here, the WRF-Chem online coupling model version 3.9.1 [40] was utilized to simulate an air pollution event that occurred over the YP. Two nested domains were used in the configuration. The coarse domain with a horizontal resolution of 27 × 27 km covered Southwest China and Southeast Asia, wherein the nested fine domain with a 9 × 9 km horizontal resolution included most of the YP and its surrounding ICP regions (Figure 1a). Thirty-two vertical hybrid layers were set from the surface to 50 hPa. The initial and boundary conditions of the WRF-Chem simulation were obtained from the ERA-Interim with a horizontal resolution of 0.75◦ × 0.75◦.

Multiple physical schemes are utilized in the WRF-Chem simulation, such as the YSU boundary layer scheme [41], the Morrison 2 microphysics [42], the RRTMG radiation scheme [43], and the unified Noah land surface model [44]. The RADM2 chemical scheme [45] was selected for the atmospheric gas-phase chemistry mechanism. Table 1 lists the primary parameterization schemes utilized in the modeling configuration.

**Figure 1.** (**a**) Two nesting domains of WRF-Chem modeling with the terrain heights (m in a.s.l.) over the YP (outlined with dash red line) and surrounding regions in southwest China and Southeast Asia. (**b**) Spatial distribution of averaged surface PM2.5 concentrations (color shaded circles with black edges, μg m−<sup>3</sup> in the upper color bar) observed at 16 urban sites (Table 2) from 24 to 26 March 2017 over YP, and spatial distribution of topographic height over YP and surrounding areas (color contours, m in a.s.l. in the lower color bar).


**Table 1.** Parameterization schemes for the WRF-Chem simulation.

**Table 2.** Names of all sites in YP and their corresponding site numbers.


2.3.2. Air Pollutant Emission Inventories

Three emissions were used to drive the WRF-Chem modeling. The anthropogenic emission data were taken from MIX [46], which covers more than 30 different countries and regions in Asia, based on a multi-scale data coupling method to include local source inventories, such as ANL-India (India), CAPSS (Korea), REAS2 (Japan, Taiwan, China), MEIC (anthropogenic sources in mainland China), and PKU-NH3 (ammonia emission inventory in China). MIX consists of emissions from on-road mobile sources, agricultural activities, power plants, industrial processes, and residential combustion.

Using the Model of Emissions of Gases and Aerosols from Nature (MEGAN), which includes more than 20 biogenic species [47], the online biogenic emissions were calculated. The hourly BB emissions were obtained from the Fire Inventory from NCAR (FINN) [48]. The FINN was produced using land cover types and fire point emissions monitored by the MODIS satellites (Terra and Aqua), combined with emission factors and combustible loads, and includes particulate matter and gas emissions from biomass burning in agriculture, forests, etc. The horizontal resolution of FINN is 1 km, and the vertical distribution of the fire emission pollutants is calculated by the online plume-rise parameterization [49,50].

#### 2.3.3. Numerical Experiments

Based on the modeling configurations, two numerical experiments were conducted during 21–27 March 2017, of which the first two days were used as the spin-up time of modeling a PM2.5 pollution episode. The experiments were: (1) a control experiment (CE), with the MIX anthropogenic emission inventory, the MEGAN biogenic emission inventory, and the FINN BB emission inventory in the modeling configuration; (2) a sensitivity experiment (SE), which was the same as CE but with the BB emissions turned off over two domains.

Through the comparison of the PM2.5 concentrations between CE and SE, the contribution rates of BB emissions to PM2.5 concentrations for the air pollution episode were evaluated by Equation (1):

$$\text{Contribution rates} = \frac{\text{PM}\_{2.5\\_CE} - \text{PM}\_{2.5\\_SE}}{\text{PM}\_{2.5\\_CE}} \times 100\% \tag{1}$$

where PM2.5\_CE and PM2.5\_SE represent the results from CE and SE, respectively.

#### 2.3.4. FLEXPART-WRF Models

The Flexible Particle dispersion model (FLEXPART) [51,52] is a Lagrangian particle diffusion model considering the processes of wet and dry depositions, turbulent diffusion, and tracer transport [53]. FLEXPART driven by WRF (FLEXPART-WRF) has been widely utilized to examine the potential sources and the long-distance transport of air pollutants [54–56]. Based on this backward trajectory model, we followed the method proposed by Chen et al. [57] and Yu et al. [55] to identify the upstream emission sources of air pollution in the YP region.

#### *2.4. WRF-Chem Modeling Validation*

A credible simulation of meteorology is essential for the simulation of air pollutants with WRF-Chem [58]. Therefore, the meteorological simulation in typical sites (sites 1–11 in Figure 1b and Table 2 with average surface PM2.5 concentrations over 35 μg m−<sup>3</sup> during the pollution episode) was validated by comparing the modeling results with meteorological observations of 10 m wind speed (WS10), 2 m relative humidity (RH2), and 2 m air temperature (T2). Table 3 and Figure 2a–c list the statistical measures used to compare the observed and simulated meteorological variables. R, RMSE, MB, and ME denote the correlation coefficient, the root mean square, the mean bias, and the mean error, respectively. The CE results of surface PM2.5 concentrations were validated with the observational data in typical sites, and the statistical verification is shown in Table 4 and Figure 2d with the R, RMSE, mean fractional bias (MFB), and mean fractional error (MFE).

**Table 3.** Statistical metrics between observed and simulated meteorological parameters averaged over 11 typical sites during 24–26 March 2017. The "\*" indicates R passed the 99.9% confidence level.


**Figure 2.** Scatter plots between observations and simulations of hourly meteorological parameters and PM2.5 at 11 typical sites during 24–26 March 2017 (blue dots) with liner fitting lines (purple solid lines) passing 99% significance level and 1:1 line (black dash lines) between observations and simulations.

**Table 4.** Statistical metrics of the comparisons from hourly observed and simulated surface PM2.5 concentrations at 11 typical sites during 24–26 March 2017. The "\*" indicates R passed the 99.9% confidence level.


Due to the different physical properties of AOD and PM2.5, there is no good linear relationship between them [59]. Consequently, the vertically integrated concentrations of PM2.5 above 700 hPa averaged from 24 to 26 March 2017 were compared with the MODIS AOD shown in Figure 3b,c. The WRF-Chem simulation was evaluated to reasonably capture the spatial distribution of AOD. Both AOD and PM2.5 concentrations increase gradually from northern to southern YP, reaching maximum values in Laos and Vietnam near the southeastern border of YP, and relatively large values in Beibu Gulf and Guangxi province.

**Figure 3.** (**a**) Spatial distribution of the monthly mean of MODIS AOD in 2017 over East and Southeast Asia. (**b**) Spatial distribution of the daily mean of MODIS AOD and BB PM2.5 emissions from FINN (red dots, emission rate > 3.5 μg m−<sup>3</sup> s−1) during 24–26 March 2017 over Southeast Asia, with the main BB emission sources marked in red rectangles and YP outlined with a bold black line. (**c**) Spatial distribution of the hourly mean of PM2.5 concentrations (μg m<sup>−</sup>3) modeled from 24 to 26 March 2017 over YP and its surrounding areas; the PM2.5 concentrations were obtained by vertical integration from 700 hPa upwards.

On the whole, the validation indicates that the meteorological variations and development of PM2.5 concentrations were reasonably reproduced by the simulation results of WRF-Chem during the air pollution episode, satisfying Boylan's recommendation for good modeling performance [60]. Thus, the WRF-Chem results could be utilized in the investigation of the cross-border transport of PM2.5 from BB sources in Southeast Asia and the influence on air pollution in YP, a clean region in Southwest China.

#### **3. Results and Discussion**

#### *3.1. A Springtime Air Pollution Event Observed in YP*

As shown in Figure 3a, in 2017, the AOD values were high in central-east China compared to the low AOD values in YP and Southeast Asia. Previous studies on the distributions of PM2.5 concentrations and AOD over China showed that the YP, which has low PM2.5 pollution and low AOD, presents a clean atmospheric environment compared to other regions of China [61,62].

However, against the background of such a clean atmospheric environment, an air pollution event occurred over YP during 24–26 March 2017. Based on the observation data obtained from CNEMC (Figure 1b) and MODIS (Figure 3b), the distribution of daily average PM2.5 concentrations and AOD during the pollution period over YP showed decreasing values with increasing distances to the Southeast Asian BB sources and the uplifting topographic heights from the southern to northern YP. The red dots in Figure 3b show the spatial distribution of average BB PM2.5 emissions obtained from FINN during the pollution period. There are few BB emissions in YP, whereas the three major emissionintensive areas are in the neighboring ICP regions, which are in northern Myanmar, eastern Myanmar, and northern Laos. We further summed the intensity of BB emissions in Domain 2 (Figure 1a) to obtain the hourly variation curve of BB emissions (Figure 4a). From 21 to 27 March 2017, the BB emission intensity and the PM2.5 concentrations in Xishuangbanna (XSBN, site 1) showed consistent daily variations. The PM2.5 concentrations in XSBN increased with increasing BB emission intensity (black arrows in Figure 4a). When the intensity of BB emissions decreased and remained at a low level, the PM2.5 concentrations also decreased rapidly. Moreover, the daily maximum PM2.5 concentrations and daily maximum BB emission intensity also had a good one-to-one correspondence. The pollution episode that occurred during 24–26 March over YP is in good agreement with the most active BB activities compared to other days. The lagged correlation between BB emission intensity and PM2.5 concentrations was further calculated with a lag time of 10 h. The changes in the two values were estimated to have a good positive correlation (R = 0.45), passing the significant level of 0.01, indicating the mechanism behind this pollution event in YP has a strong link to BB emissions over ICP.

**Figure 4.** (**a**) Hourly changes in PM2.5 concentrations observed in XSBN (purple line) and BB PM2.5 emission rate averaged in Domain 2 from FINN (red bar) The black arrows indicate the PM2.5 concentrations increase with increasing BB emission intensity. (**b**) Hourly changes in PM2.5 concentrations in three downstream sites, XSBN, Yuxi, and Kunming, from 18:00 of 24 March to 06:00 of 26 March. The black arrow indicates the intervals of the lag time along with the regional PM2.5 transport from XSBN to Kunming during the air pollution episode.

It is noteworthy that the AOD reached its maximum in northern Laos and northwestern Vietnam near the southeastern border of YP (Figure 3b), which is attributed to the convergence of the southeastern and southwestern winds (Figure 5c) and the obstruction effect of the large topographic height in both northern and eastern parts of this area (Figure 1b). The dual effect of these two factors led to the accumulation of pollutants because southwesterly winds prevailed in ICP and YP from the low to high altitude, and the high mountains largely blocked the further transport of pollutants from this area to the YP region. The present work mainly focuses on the effect of BB emissions on the air quality of YP under the prevailing southwesterly winds. The pollution mechanism of this high AOD area under the influence of BB emissions could be the material or objective for further study.

To further understand the influence of prevailing southwesterly winds on the pollution event on YP, we explored the hourly variations in PM2.5 concentrations from 18:00 h local time on 24 March to 06:00 h on 26 March at three observational sites, XSBN, Yuxi, and Kunming, which are in the major pollutants' transport pathway (sites 1,4,11 in Figure 1b). Driven by southwesterly winds, the surface PM2.5 peaks advanced northwards at 02:00 h on 25 March, from XSBN, at 10:00 on 25 March to Yuxi, and at 20:00 on 25 March to Kunming, with a quasi-9 h time lag (Figure 4b). At 02:00 on 25 March (Figure 5a), the PM2.5

concentrations reached more than 150 μg m−<sup>3</sup> in many areas of ICP, and the PM2.5 in XSBN, the southernmost border city in YP, peaked first. At this time, Yuxi and Kunming were relatively clean, and the southwesterly winds prevailing in ICP and YP were conducive to the cross-border transport of pollutants from the upward ICP to the downwind YP region. About 8 h later, the pollutants were transported to Yuxi at 10:00 h on 25 March (Figure 5b). With the strengthening of southwesterly winds, the pollutants were finally transported to Kunming at 20:00 h on 25 March (Figure 5c), and the PM2.5 concentrations in most cities of the central and southern YP reported an increase in PM2.5. This phenomenon reflects an obvious characteristic of transport-type pollution events, and previous studies on transporttype pollution events showed that pollutant concentrations have a good correlation with wind speed [17,55]. This provided a motivation to further understand the relationship between PM2.5 concentrations and wind speed during the pollution event that occurred over YP, which is explained in the next sections.

**Figure 5.** Spatial distribution of surface PM2.5 concentrations and 10 m wind vectors simulated at (**a**) 02:00 of 25 March, (**b**) 10:00 of 25 March, and (**c**) 20:00 of 25 March. The purple arrows highlight the major southwesterly winds, and the green arrow in (**c**) highlights the major southeasterly winds.

#### *3.2. Correlation between Wind Speeds and PM2.5 Concentrations*

The near-surface wind speeds at 10 m are correlated with the PM2.5 concentrations over YP, and the spatial distribution of the correlation coefficients (R) is shown in Figure 6a. The PM2.5 concentrations in XSBN, the closest site to the BB emissions in ICP, have a significant negative correlation with wind speed, with an R-value of −0.71, passing the significance level at 99%. This indicates that, when the near-surface wind speed increases, more PM2.5 concentrations are exported from XSBN, presenting a similar effect over the "source" region (SR) of PM2.5 emissions. The border sites on the southwestern part of YP, which are close to the fire activities, all showed the similar effect. In the hinterland of YP, which is further from the fire activities, positive correlations were observed between the near-surface wind speeds and the PM2.5 concentrations.

Previous studies showed that a major transport channel of BB pollutants from the ICP to southern China exists around 700 hPa [25], and the transport height is elevated with the increase in distance between YP and ICP. Therefore, we further calculated the R between surface PM2.5 concentrations and 700 hPa wind speed (Figure 6b), and the significant positive correlations with R > 0.5 over the central and eastern regions of YP, where the strong winds in the free troposphere play a crucial role in transporting air pollutants to the surface. As a result, these areas are depicted as a "receptor" region (RR) in regional PM2.5 transport, which is in good agreement with the study of Yu et al. [55]. The specific mechanisms of air pollution that occurred in the "source" and "receptor" regions in the regional PM2.5 transport are described in the following sections.

**Figure 6.** Spatial distribution of correlation coefficients (R) between surface PM2.5 concentrations and (**a**) 10 m wind speed and (**b**) 700 hPa wind speed in sites over YP (scatters), and distribution of averaged (**a**) 10 m wind fields and (**b**) 700 hPa wind fields during 24–26 March 2017. The blue and red rectangles in both (**a**,**b**) represent SR and RR, respectively. The red dots are the same as in Figure 2b. The "\*" indicates R passed the 99% confidence level.

#### *3.3. The Different Mechanisms of PM2.5 Pollution in SR and RR*

For XSBN, the representative site in SR, the pollution episode is divided into three periods (Figure 7a), namely the formation period (P1), the maintenance period (P2), and the dissipation period (P3). In P1, the weak wind speed decreased, the boundary layer height was mostly below 1000 m, and the PM2.5 concentrations increased rapidly. Previous studies showed that the transport distance plays a significant role in the regional transport of PM2.5 from BB emissions. On strong BB days, the mean PM2.5 concentrations increase sharply when the distance between the source region and the downwind region is less than 100 km [63]. Meanwhile, Figure 4a shows a synergistic daily variation in PM2.5 in XSBN and BB emission intensity in ICP bordering YP. As a result, the BB emissions can affect the PM2.5 concentrations in XSBN under the weak wind speed and low boundary layer height. The PM2.5 emitted from fire activities is also transported over XSBN by channels at high altitudes. The boundary layer height begins to increase in the second half of P1, once above 3000 m, which is conducive to the development of turbulence. The turbulence further promotes the vertical mixing of PM2.5 and facilitates the diffusion of PM2.5 from high altitude to the ground, further aggravating the pollution.

**Figure 7.** Hourly variations in PM2.5 concentrations (red lines), wind speed (WS, blue lines), and planetary boundary layer height (PBLH, black lines) at two representative sites: (**a**) XSBN in SR and (**b**) Kunming in RR, from 16:00 of 24 March to 03:00 of 26 March. P1, P2, and P3 indicate the formation, maintenance, and dissipation periods of the air pollution episode, respectively.

In P2, the fire activities ended in ICP (Figure 4a), and the vertical mixing process was weakened simultaneously with decreasing boundary layer height. However, the PM2.5 concentrations decreased slowly with the decreasing wind speed and boundary layer height. Hence, XSBN shows a stable meteorological condition that is conducive to the maintenance of air pollution [64,65]. Under this condition, the PM2.5 concentrations were continuously over 150 μg m−<sup>3</sup> and the PM2.5 residue lies on the surface. In P3, the wind speed and boundary layer height increased simultaneously; as a result, the PM2.5 concentrations decreased rapidly under the dual effect of horizontal and vertical diffusion. This resulted in the end of P2. To summarize, the changes in PM2.5 in XSBN are mainly affected by BB emissions, stagnant meteorological conditions, vertical mixing, and strong winds.

In Kunming, the representative site in RR (Figure 7b), although the PM2.5 concentrations increased during the event, the air quality still maintained a good level, which is closely related to the fact that the site is far away from fire activities. In this area, meteorological conditions play a major role in controlling the PM2.5 changes under specific processes. Kunming only experienced two periods i.e., the P1 and P3. With increasing wind speed and boundary layer height, the prevailing southwesterly winds transported PM2.5 over Kunming and then increased the ground PM2.5 concentrations through vertical mixing of turbulence. After that, with decreasing boundary layer height and wind speed, the vertical mixing effect diminished, and the PM2.5 concentrations decreased. As a result, the changes in PM2.5 in Kunming were mainly affected by the regional transport of PM2.5 due to strong winds and vertical mixing.

#### *3.4. Patterns of Regional PM2.5 Transport to Different YP Sites*

The representative SR and RR sites in the border and hinterland of YP, XSBN, and Kunming, respectively, were selected to estimate the contributions of regional PM2.5 transport to air pollution in YP. The estimation was based on the air particle residence time during the pollution period simulated by the FLEXPART-WRF model and three air pollutant emission inventories described in Section 2.3.2. Each simulation was run for a 48-h rearward trajectory of 50,000 air particles being released from two sites, and the air particle residence time was calculated in a 0.1◦ × 0.1◦ horizontal spatial resolution. The air particle residence time was further multiplied with the PM2.5 emission fluxes from three air pollutant emission inventories to quantify the contribution of regional PM2.5 transport to PM2.5 concentrations over the YP. Detailed methods can be found in Chen et al. [57] and Yu et al. [55].

Governed by the prevailing southwesterly winds, the regional transport of PM2.5 from the BB emission source regions in ICP provided a significant contribution to the elevated PM2.5 concentrations over XSBN and Kunming during 24–26 March 2017 (Figure 8). For XSBN, the major pathway is the southwesterly route from southern and eastern Myanmar, wherein the eastern regions of Myanmar bordering XSBN contribute the most. For Kunming, the PM2.5 concentrations are dominated by multiple sources, and the major pathway is the southwesterly route from eastern Myanmar. Moreover, there are two additional minor sources from the northern regions of ICP bordering YP and the domestic area in the east of Kunming. The short-range transport of PM2.5 has a major impact on PM2.5 concentrations over XSBN, whereas the PM2.5 concentrations over Kunming are dominated by the long-range transport of PM2.5.

#### *3.5. Contribution of BB Emissions to PM2.5 Concentrations over YP*

The contribution rates of BB emissions to PM2.5 concentrations in YP were quantitatively estimated by Equation (1). The spatial distribution of the contribution rates to surface PM2.5 concentrations in YP sites (Figure 9a) is highly similar to the distribution of PM2.5 concentrations in Figure 1b. The contribution rates gradually decrease along the transport pathway following the lifting plateau topography in YP. The regional average contribution rate over SR is larger than that in RR, with a difference of 23% (Table 5), and the regional average contribution rate over the whole YP is up to 69%. Three sites with low contribution rates (below 50%) are identified as Yuxi (site 4), Kunming (site 11), and Qujing (site 12). In Yuxi and Kunming, which are the industrial cities of YP, the local anthropogenic emissions are relatively higher, causing the reduction in BB contributions to PM2.5 concentrations. For

Qujing, the farthest city from BB emission sources, the long transport distance and greater topography height reduce the impact of BB contributions to local PM2.5 concentrations. However, the contribution rate is still up to 41% against the clean background, which further confirms the important impact of BB emissions on the air quality over YP.

**Figure 8.** Spatial distribution of contribution rates (color contours) to PM2.5 concentrations in (**a**) XSBN and (**b**) Kunming with the major pathways of regional transport (red dash arrows) simulated by the FLEXPART-WRF model from 08:00 of 24 March to 08:00 of 26 March.

**Figure 9.** Spatial distribution of the contribution rates of BB emissions to PM2.5 concentrations at 12 sites (the numbers 1–12) over YP: (**a**) at the surface and (**b**) at 700 hPa.

**Table 5.** Regional average contribution rates of BB emissions over the Southeast Asian region to PM2.5 concentrations at the surface and 700 hPa over SR and RR, and the regional average increments from the surface to 700 hPa in the two regions.


The pollutants from BB emissions have a characteristic of vertical distribution. As a result, the contribution rates of BB emissions to PM2.5 concentrations at 700 hPa were further analyzed (Figure 9b). Two distinctive features can be noticed: (1) The contribution rates in most sites and the regional average contribution rate of YP increase with increasing altitude. However, the increments in the SR sites are much smaller than those in RR, wherein the regional average increment in SR is 16% from surface to 700 hPa, whereas that in RR is 34% (Table 5). The regional PM2.5 transport at high altitudes has a larger impact on RR sites, which is consistent with the pollution mechanism discussed in Section 3.3. (2) In Yuxi and Kunming, where the contribution rates of BB emissions to PM2.5 concentrations at the surface are relatively small, the contribution rates increase more than 50% from the surface to 700 hPa, indicating that BB emissions have a much greater impact on high-altitude PM2.5 concentrations than anthropogenic emissions.

Moreover, the regional average contribution rate over SR is larger than that in RR at both the surface and 700 hPa (Table 5), but the difference between SR and RR (SR minus RR) at the surface (23%) is much greater than that at 700 hPa (4%) due to the obstruction effect of topographic height along the transport pathway. The contributions of regional transport of PM2.5 from BB activities decrease with increasing transport distance, reflecting an important role of transport distance between the source–receptor areas in changing the air pollution.

#### **4. Conclusions**

Using MODIS remote sensing products and ground-based observations, and conducting model simulations with WRF-Chem and WRF-FLEXPART, the present study examined an air pollution event that occurred over YP, resulting from the cross-border transport of PM2.5 due to BB activities from ICP to YP. The aim was to explore how BB emissions in ICP affect the air quality in the neighboring YP.

Under the prevailing southwesterly winds, the BB sources in ICP have different impacts on the PM2.5 concentrations over SR and RR. XSBN and Kunming, the representative sites in SR and RR, respectively, have distinct mechanisms enhancing PM2.5 concentrations of air pollution. The SR site is mainly affected by Southeast Asian BB emissions with local accumulation in the stagnant meteorological conditions, whereas the RR site is dominated by the regional PM2.5 transport with strong winds and vertical mixing. XSBN and Kunming also have different major pathways of regional PM2.5 transport. The PM2.5 concentrations in XSBN are mainly affected by short-range transport of PM2.5, whereas long-range transport of PM2.5 plays a dominating role in Kunming. The regional average PM2.5 contributions of ICP BB emissions to surface PM2.5 over SR is larger than that in RR, with a difference of 23%; in addition, the regional average increments in the contribution from the surface to 700 hPa are 16% in SR and 34% in RR. It is revealed that the large PM2.5 contributions of ICP BB emissions lift from the lower altitudes in SR to the higher altitudes in RR in regional PM2.5 transport. Moreover, the contributions of regional transport of PM2.5 decrease with an increase in transport distance, reflecting an important role of transport distance between the source–receptor areas in changing the scenario of air pollution.

Based on the investigation of a springtime air pollution event in YP, which differs from other regions such as Eastern China where pollution events happen frequently, the study revealed the underlying mechanism of the pollution episode in YP and the extent to which the regional transport of PM2.5 from BB emissions affects PM2.5 concentrations in YP. However, the MIX anthropogenic emissions in YP and ICP were produced based on data from 2010, which contain more uncertainties compared to those of Eastern China. As a result, future studies involving air pollution simulations can be greatly enhanced by a more accurate emission inventory. To further understand the mechanisms in the regional transport of PM2.5 from BB activities, future exploration can be conducted with the support of multi-source satellite data, long-term ground observations, and a modeling study with refined model schemes and data assimilation.

**Author Contributions:** Data curation, Q.Y. and Z.T.; Funding acquisition, T.Z.; Methodology, Q.Y., T.Z., J.C. and W.H.; Visualization, Z.S. and J.H.; Writing—original draft, Q.Y.; Writing—review and editing, T.Z. and K.R.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was financially funded by grants received from the National Key Research and Development Program of China (2019YFC0214604), the National Natural Science Foundation of China (41830965; 91744209). One of the authors KRK is grateful to the Science and Engineering Research Board (S ERB), a statutory body under the Department of Science and Technology (DST), India for providing financial grants through the Start-Up Research Grant (SRG; File No. SRG/2020/001445) scheme and the DST, Govt. of India, for the award of the DST-FIST Level-1 (Grant No.SR/FST/PS-1/2018/35) scheme to the Department of Physics, KLEF.

**Data Availability Statement:** MODIS L3 Atmosphere products (AOD) are available at https:// ladsweb.modaps.eosdis.nasa.gov/search/ (accessed on 1 March 2022). ERA-Interim reanalysis data are available at https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim (accessed on 1 March 2022). The PM2.5 datasets and near-surface meteorological data are available at http://www.cnemc.cn and http://data.cma.cn/ (accessed on 1 March 2022).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Technical Note* **Aerial Mapping of Odorous Gases in a Wastewater Treatment Plant Using a Small Drone**

**Javier Burgués 1,2,3,\*, María Deseada Esclapez 4, Silvia Doñate 4, Laura Pastor <sup>4</sup> and Santiago Marco 1,2,3**


**Abstract:** Wastewater treatment plants (WWTPs) are sources of greenhouse gases, hazardous air pollutants and offensive odors. These emissions can have negative repercussions in and around the plant, degrading the quality of life of surrounding neighborhoods, damaging the environment, and reducing employee's overall job satisfaction. Current monitoring methodologies based on fixed gas detectors and sporadic olfactometric measurements (human panels) do not allow for an accurate spatial representation of such emissions. In this paper we use a small drone equipped with an array of electrochemical and metal oxide (MOX) sensors for mapping odorous gases in a mid-sized WWTP. An innovative sampling system based on two (10 m long) flexible tubes hanging from the drone allowed near-source sampling from a safe distance with negligible influence from the downwash of the drone's propellers. The proposed platform is very convenient for monitoring hard-to-reach emission sources, such as the plant's deodorization chimney, which turned out to be responsible for the strongest odor emissions. The geo-localized measurements visualized in the form of a twodimensional (2D) gas concentration map revealed the main emission hotspots where abatement solutions were needed. A principal component analysis (PCA) of the multivariate sensor signals suggests that the proposed system can also be used to trace which emission source is responsible for a certain measurement.

**Keywords:** drone; UAV; gas sensors; odour; air pollution; industrial emissions; mapping; environmental monitoring

#### **1. Introduction**

The monitoring of emissions to air is a key element in preventing and reducing pollution from industrial installations, in ensuring a high level of protection of the environment, and in minimizing odor impact to the surrounding population. Industrial activities such as production of energy, intensive rearing of poultry and pigs or waste management are sources of greenhouse gases (GHGs), hazardous air pollutants (HAPs) and offensive odors. In 2017, emissions from waste management sites made up 3% of total GHG emissions and 5% of particulate matter (PM) emissions in Spain [1]. These facilities are also responsible for many citizen complaints to the local authorities regarding odor annoyance episodes [2]. The objectives of monitoring are many and diverse. For example, monitoring can be applied to assess compliance with environmental permit requirements; check the performance of odor abatement systems; determine the relative contribution of different sources to the overall emissions; report emissions for national and international inventories, e.g., the Pollutant Release and Transfer Registers (PRTRs); and many others [3].

**Citation:** Burgués, J.; Esclapez, M.D.; Doñate, S.; Pastor, L.; Marco, S. Aerial Mapping of Odorous Gases in a Wastewater Treatment Plant Using a Small Drone. *Remote Sens.* **2021**, *13*, 1757. https://doi.org/10.3390/ rs13091757

Academic Editor: Maria João Costa

Received: 2 April 2021 Accepted: 29 April 2021 Published: 30 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

In Europe, industrial air emissions are regulated by the Industrial Emissions Directive 2010/75/EU (IED) [4]. The IED and national regulations impose requirements on the monitoring approach to be used for a particular installation, for example the requirement for continuous monitoring of certain pollutants with specific instruments. The accepted monitoring methodologies and reference instruments for each type of gas are described in the Best Available Technique (BAT) document [5]. The quantification of the total emissions of an installation often requires the assessment of channeled (point-like) emissions and diffuse emissions including fugitive emissions. Channeled emissions are relatively easy to monitor with automated measuring systems (AMS) permanently installed on-site. However, the quantification of diffuse emissions might not be easy with AMS and is, in general, labor- and cost-intensive due to the number of potential sources.

To simplify the measurement of diffuse emissions, the European IED specifies that "*measurements techniques based on the use of a transportable measurement platform, despite being less accurate than reference methods, may be used to supplement the information supplied by fixed measurements for the determination of the spatial concentration distribution or for the assessment of diffusive gas emissions*". The advantage of a portable instrument over a set of fixed analyzers installed on different locations of the plant is the lower investment and operational costs, as well as higher spatial resolution of the measurements. However, manually scanning an entire plant with a portable instrument is a tedious and risky task. The use of terrestrial robots may seem the most obvious solution to this problem, however their limited maneuverability hinders their practical application in realistic scenarios which often include obstacles (e.g., buildings, stairs, trees, etc.) and elevated emission sources (e.g., chimneys, flares).

Aerial surveys with small drones (<10 kg) equipped with gas detectors are a promising cost-effective and safe alternative for emission monitoring in industrial plants [6]. Both fixed- and rotary-wing drones can be used, however rotorcrafts are preferred for this application due to key practical advantages such as vertical take-off and landing (VTOL), autonomous hovering, high maneuverability, and low cruise speed. Drones equipped with laser-based methane detectors have been demonstrated with great success in the oil and gas (O&G) industry, e.g., for quantifying whole-site methane emissions [7] and detecting fugitive methane leaks [8–10]. The main O&G companies are already testing this technology in their plants [11–13]. Similar platforms have been recently used in solid waste landfills (SWLs) for identifying surface methane hotspots [14].

Wastewater treatment plants (WWTPs) are another scenario where small drones could improve the monitoring of plant emissions/odors. To the best of our knowledge, there are no reports of drones being used for emission monitoring or odor sensing of WWTPs. In this case, the major emission problem is not methane, but odorous compounds produced during wastewater treatment, such as hydrogen sulfide (H2S), ammonia (NH3), mercaptans, or volatile organic compounds (VOCs) which can produce odor impact in workers and communities living nearby these facilities, even at low concentration levels [3]. Current odor assessment methodologies in WWTPs are mostly based on walkover surveys with portable H2S detectors or via olfactometric measurements involving expensive human panels, which leads to odor measurements with poor temporal and spatial resolution. The idea of using drones to monitor odorous emissions in WWTPs is very interesting because they can measure the concentration of key odorous compounds in different locations of the plant including hard-to-reach locations, and with higher spatial resolution, less risk, and lower cost than existing methods. This information can then be used by plant operators for (i) feedback into the industrial processes, (ii) as input for atmospheric dispersion models to estimate the odor emission rate and then to predict odor impact in the plant vicinity, and (iii) to identify fugitive emissions.

The two main challenges associated with the application of drones for monitoring emissions in WWTPs are (i) the lack of reliable and lightweight sensors to detect the relevant compounds and (ii) the plume distortion produced by the downwash of the rotating propellers. While methane can be selectively detected with laser-based spectrometers

amenable for drone integration, detection of H2S, NH3 or VOCs at the required concentration levels is yet not feasible with lightweight optical analyzers. In this case, the most straightforward approach is to use low-cost chemical sensors, such as electrochemical cells (EC) or metal oxide (MOX or MOS) sensors, which inherently have limited performance [15]. Electrochemical sensors offer decent selectivity (though not comparable to optical analysers) for compounds such as CO, SO2, NH3 or NO/NO2 (among many others) and are often the technology of choice when any of these compounds is targeted [16]. MOX sensors operating in the (default) isothermal mode are not selective but are more sensitive, faster, and cheaper than electrochemical cells [17]. These features make them very popular in robotic studies addressing gas source localization and mapping tasks [18–20] where selectivity is not critical because artificial gas sources releasing a single compound (typically ethanol) are normally used.

Up to now, the use of drones fitted with low-cost chemical sensors has been mostly explored in relatively simple scenarios, such as indoor areas [19] or outdoor environments [21,22], using artificial gas sources. A few exceptions exist at the industrial and academic level. For example, Aeromon (Helsinki, Finland) has been regularly using their BH-12 multi-sensor system (based on electrochemical cells) for monitoring the emission performance of vessels and checking compliance with the new emission regulations regarding fuel sulfur content (FSC). The DR1000 "Flying Lab" from Scentroid (Whitchurch-Stouffville, ON, Canada), which uses EC and MOX sensors, has been used for monitoring the quality of fuel used for domestic heating in Poland. The recently announced Muve C360 from FLIR Systems (Wilsonville, OR, USA) is a multi-gas detector completely integrated in a DJI M210 drone for emergency responders, industrial safety, and environmental monitoring. At the research level, drones equipped with electrochemical sensors have been used for atmospheric research studies, e.g. analysing the composition of volcanic plumes [23], among other applications [6].

Despite the many advantages offered by rotorcrafts, the intense downwash generated by the propellers is a main problem for chemical sensing applications in which the drone has to fly close to point or surface emitters. In these cases, the downwash strongly distorts the gas distribution, leading to gross errors in the sensor readings. This is a well-known problem that has received lots of attention from the research community. The downwash has been simulated by numerical methods (e.g., computer fluid dynamics, CFD) and empirically characterized using smoke tracers, anemometers, and particle tracking velocimetry (PTV) [6]. These studies show that the downwash is particularly strong in the vertical axis underneath the drone where its influence can extend up to several meters (depending on the drone´s take-off weight).

The downwash is the main factor to be considered in the design of gas sampling systems for drones, or for optimizing sensor placement, especially for point-like sensors or closed-path optical analysers. Although the sensing elements can be directly exposed to the air sample, it is more convenient to place them in a sensor chamber with an aspiration system. This provides more flexibility regarding the sampling point and more control in the sample delivery. The few existing commercial systems using low-cost sensors (e.g., Aeromon BH-12, Scentroid DR1000 and FLIR Muve C360) implement a rigid horizontal sampling tube (1–2 m length) to aspirate the gas sample from outside the rotors' influence zone [23,24]. This type of boom is very convenient for monitoring elevated and channeled sources, such as chimneys or flares, but has practical inconveniences for diffusive area sources such as those encountered in WWTPs. In this case, the problem is that a drone implementing a rigid horizontal probe would have to fly very close to the ground or nearby obstacles to sample the space directly above the source, which is risky and leads to a strong mixing (dilution) of the emissions because of the downwash.

The goal of our current research is to develop a drone to monitor and map odorous emissions in WWTPs. For that, we use a commercial drone (DJI Matrice 600) fitted with a custom payload based on an array of low-cost gas sensors (electrochemical and MOX sensors) and an innovative sampling system consisting of an aspiration pump connected to a 10-m sampling tube suspended from the drone. This system allows the drone to sample the emission sources with negligible effect from the downwash and, at the same time, fly at sufficient height above the obstacles to minimize the operational risks. This paper presents the first preliminary set of experiments carried out in a real WWTP in Murcia (Spain). The objectives of these initial measurements are to (i) check if the signals recorded by the drone are consistent with the expected concentrations based on previous measurements with hand-held detectors; (ii) build rough concentration maps of the most relevant compounds to understand their spatial distribution and identify the emission hotspots; and (iii) assess if the different emission sources can be identified based on the multivariate patterns produced by the sensor array. We will discuss some of the challenges encountered during these tests, and how future developments could overcome them.

#### **2. Materials and Methods**

#### *2.1. Test Site*

Field measurements were carried out in the WWTP of Molina del Segura (Murcia, Spain), which is operated by Depuración de Aguas del Mediterráneo (DAM). The plant (Figure 1) has an extension of 35,000 m<sup>2</sup> and serves a population of 290,000 inhabitants. Five emission sources (settler stage A, bioreactor stage A, sludge hoppers, sludge thickener, and deodorization chimney) were suggested by the plant manager as elements with the highest emissions based on previous measurement campaigns using hand-held detectors and olfactometric campaigns (human panels). Therefore, the aerial mapping was focused on a region of ~4500 m2 centered around these sources. An ultrasonic anemometer (Model: WindSonic, Gill Instruments, Lymington, UK) placed at 10 m a.g.l. in a clear area without nearby obstacles continuously measured wind speed and direction.

**Figure 1.** Map of the WWTP of Molina del Segura with the five main emission sources highlighted in red. The aerial mapping was focused on a 4500 m2 squared region centered around these sources.

#### *2.2. Drone and Payload*

A rotary-wing drone was selected for this application due to its ability to hover, slow flight speed and vertical takeoff and landing (VTOL). These characteristics are essential for close-up monitoring of emission sources, safe navigation around the plant infrastructure, and high-resolution mapping. The selected drone was the Matrice 600 Pro (DJI International, Nanshan, Shenzhen, China) which has a high payload capacity (6 kg) and allows for a flight time between 15 min (fully loaded) and 30 min (unloaded). The drone was equipped with a custom gas sensing payload (Figure 2) composed of a custom-made electronic nose (e-nose) and a commercial multi-gas analyzer Dräger X-am 8000 (Drägerwerk AG, Lübeck, Germany).

**Figure 2.** DJI Matrice 600 Pro drone equipped with a custom e-nose and a Dräger X-am 8000 analyzer. The inlets of both systems are connected to 10-m PTFE tubing.

Both sensor systems were attached underneath the drone using a custom mounting plate, and their inlets were connected to 10-m PTFE tubing (hanging vertically from the drone) to sample the region below the drone where the downwash has disappeared or it is greatly reduced. The required length of the tubing was determined by measuring the downwash influence with a hand-held anemometer placed below the loaded drone while it was hovering at multiple altitudes. We prefer this sampling approach over the horizontal tube because it allows the drone to fly over obstacles without risk. However, using a long sampling tube also has practical problems: (i) a delay in the measurements due to the sample transport, (ii) memory effects if some gases stick to the tubing walls, and (iii) tilt of the tube due to wind or drone motion. The delay in the measurements and the tilt of the tube lead to inaccuracies in the GPS marking of the sensor signals. To solve these issues we attached a weight of 150 g to the end of the tube as a plumb bob (to keep the tube as straight as possible during flight) and compensated the delay via software.

Regarding the e-nose architecture (Figure 3), it contains an array of 16 MOX sensors (several TGS models, Figaro Engineering Inc., Osaka, Japan) operated at various temperatures, a combo sensor for temperature, humidity and pressure, a flow sensor, GPS receiver, and long-range ZigBee 868 MHz radio communication. The specifications of the e-nose sensors are summarized in Table 1. A microcontroller reads the sensor signals and the GPS position, and sends them to the base station through the radio link at a sampling frequency of 0.2 Hz. A miniature pump delivers the gas flow to the sensing chamber at a flow rate of 1.8 L/min. Power is provided by a 7.4 V lithium polymer (LiPo) battery with 2200 mAh of capacity, allowing continuous operation for approximately 2 h. The weight of the e-nose including the battery is ~1200 g.

**Figure 3.** Internal architecture of the electronic nose.


**Table 1.** Sensors included in the electronic nose.

The Dräger X-am 8000 is equipped with four electrochemical sensors (for H2S, NH3, mercaptans and amines), a photo-ionization detector (PID) for quantifying total VOCs, an internal pump, an integrated battery, and weighs 550 g (Table 2). The sensor data are logged every second in an internal memory which can store up to 210 h of measurements.

**Table 2.** Sensors included in the Dräger X-am 8000 analyzer.


#### *2.3. Experimental Protocol*

All measurements were carried out in a single day. The e-nose sensors were preheated for 24 h before the start of the measurements to stabilize the sensors' baseline. At the beginning of the experiment the drone was positioned near the entry of the plant (P0 in Figure 1), where no odor was perceivable, and measurements were taken for 7 min to determine the sensors' baseline. The drone took off from there and scanned the target area slowly at a height of approximately 12 m, keeping the inlet of the sampling tube as close as possible to the emission sources. The drone hovered for 5–7 min at each of the five emission sources (highlighted in Figure 1) to capture the variability of the gas concentration over time. The whole experiment took slightly less than 2 h to complete, which required multiple sets of drone batteries.

#### *2.4. Data Processing and Visualization*

A laptop computer with a ZigBee 868 MHz radio antenna and a custom software application developed in MATLAB R2019B (The MathWorks, Natick, MA, USA) was used as base station to receive and log in real-time the data from the e-nose (timestamp, sensor signals and GPS position). The measurement data stored in the internal memory of the Dräger X-am 8000 (timestamp and sensor signals) was downloaded into the base station at the end of the flight (no radio link available for this device). Data from both instruments were merged into a single file, using linear interpolation (MATLAB interp1) to synchronize the data to a common timestamp. Each entry of the log file is a tuple (*t*, *x*, *y*, *z*, *c*1, ..., *c*5,*s*1, ..., *s*16) where *t* is the timestamp, *x*, *y*, *z* the spatial coordinates, *c*1, ... , *c*<sup>5</sup> the concentration (ppmv, parts-per-million in volume) of the five gases measured by the Dräger X-am 8000, and *s*1, ..., *s*<sup>16</sup> the MOX sensor resistances (Ω).

For data visualization, we used MATLAB and the Google Maps Javascript API to produce a heatmap visualization of the geolocalized raw sensor data. In addition, a principal component analysis (PCA) was used to visually determine if the different emission sources could be clustered based on the sensor responses. For that, a PCA model with three principal components was applied to the e-nose signals after logarithmic transformation (to reduce the dynamic range and improve normality) and mean-centering. The PCA modelling was done also in MATLAB.

#### **3. Results and Discussion**

#### *3.1. Weather Conditions*

The weather conditions during the field measurements were favorable, with clear sky, temperature between 18 and 20 ◦C, and 50% relative humidity. The wind direction was predominantly north-west, with average wind speed of 10–15 km/h, and gusts of up to 50 km/h (Figure 4). The effect of wind on the sampling tube can be observed in Figure 5, which shows pictures of the drone hovering above the five emission sources. For example, while measuring at the settlers (P1) and the deodorization chimney (P4) the drone had to be positioned slightly upwind to compensate the tilt of the sampling tube. The GPS signal reception was good throughout the experiment, with more than 12 satellites in line-of-sight (LOS) with the drone.

**Figure 4.** Wind speed and direction during the field measurements.

**Figure 5.** Drone hovering over the selected emission sources (P1–P5).

#### *3.2. Gas Concentration Measurements*

The raw sensor signals throughout the experiment are shown in Figure 6. The highest gas concentrations were recorded near the bioreactor stage A (P2) and the deodorization chimney (P4). The high variability of the sensor signals at the chimney is a consequence of the oscillations of the sampling tube around the chimney outlet due to the wind. The oscillations of the sampling tube were less problematic in the area sources because, since the concentration is more homogeneous, the exact location of the sampling point is not as critical as in ducted (point-like) sources. Very low concentration of all gases was measured near the settlers (P1) despite a strong malodor could be appreciated near this site. Only the response of the MOX sensors was distinguishable from the blank measurements, which may indicate that odor from this source was produced mostly by VOCs rather than by H2S or NH3. A peak of 100 ppm of CO2 above the background level was measured near the sludge hoppers (P3) during sludge discharge into a truck. Finally, low concentrations were measured at the sludge thickener (P5) probably because it was covered.

**Figure 6.** Raw sensor signals during the field measurements.

The measured concentrations were in line with the expected values based on previous measurement campaigns carried out at the same emission sources with a hand-held Xam 8000 detector (Table 3). It is not surprising that the measured values during a single day in very specific conditions (e.g., drone flight) differ from values obtained in other measurement campaigns carried out at a different date. This is because the pattern of

emissions in a WWTP is not stationary and there is a large variability in the emissions depending on process factors (e.g., quality of influent water and flow rate) but also on environmental conditions (wind, temperature, humidity, precipitation, etc.). There are also seasonal trends. Thus, the recorded signals only represent the emissions during the time of sampling. A comprehensive characterization of the emissions, which would require a much more elaborated measurement campaign spanning several months, was out of the scope of this preliminary measurements. Similarly, a precise characterization of the uncertainty associated with the drone measurements is also subject of future experiments. The goals of this preliminary work were less ambitious, e.g., showing that drone-based measurements using the proposed sampling approach provide sensible signals.


**Table 3.** Comparison between drone-based measurements and those performed with a hand-held X-am 8000 detector near the same emission sources.

It should be noted that while the recorded signals give a clear indication of the characteristics of emissions in the different sources, their exact values are subject to various sources of uncertainty. While low-cost sensors can provide relatively good results in the laboratory, their application in field conditions remains challenging. First of all, because the sensors react not only to the target gas but also to interfering compounds. For example, the response of an H2S electrochemical sensor is affected by the presence of SO2 or NH3 because of matrix effects. Uncontrolled or unknown variations in temperature, humidity, and pressure can also affect the sensor signals, as can overheating due to direct sunlight exposure. Strong winds also affect the measurements due to the oscillations of the sampling line.

#### *3.3. Gas Concentration Mapping*

The sensor data was used to produce heatmaps indicative of the concentration of each gas. An example of an H2S map is shown in Figure 7. As it was expected from the analysis of the raw sensor signals, the H2S concentration hotspots are located near the bioreactor stage A (P2) and the deodorization chimney (P4). These hotspots are shifted a few meters with respect to the location of the emission sources due to the inaccuracy of the GPS position (±3 m), the effect of wind on the gas dispersion, and the tilt of the sampling tube with respect to the vertical axis of the drone where the GPS receiver is located. This latter effect can be clearly seen in Figure 5 when the drone is sampling the chimney. In order to keep the inlet of the sampling tube centered above the chimney, the drone must be positioned a few meters upwind to compensate for the effect of wind on the tubing. Because the GPS receiver is placed on the drone and not at the inlet of the tube, the recorded position indicates the location of the drone and not the location where the gas is being sampled. This could be solved in the future by either placing the GPS receiver at the inlet of the tubing or using an on-board camera to track the position of the sampling inlet and compensate the offset via software.

**Figure 7.** Map of H2S concentration obtained from drone measurements.

#### *3.4. Gas Source Identification*

One research question of this work is whether the different emission sources could be distinguished based on the e-nose signals. A PCA score plot of the signals recorded while the drone was hovering over the sources revealed that this is indeed the case, and the different emission sources are clustered in different regions of the PCA space (Figure 8). This suggests that each source has a different gas composition, so the e-nose could be potentially used to identify which source is responsible for a certain measurement. Even the settler (P1) and sludge thickener (P5) could be differentiated from the blank measurements (P0) despite the gas concentrations measured at these sources were very close to the baseline level. This result, which may be a consequence of the low limit of detection (LOD) of MOX sensors, should be confirmed with more measurement campaigns and using external validation (blind) samples.

**Figure 8.** Principal component analysis (PCA) score plot of the e-nose signals.

#### **4. Conclusions**

This study has explored the possibility of using a small drone equipped with an array of low-cost gas sensors for real-time monitoring of odorous emissions in a WWTP. The drone was equipped with an innovative sampling system that allowed the drone to fly at a safe distance from obstacles and minimize the impact of downwash into the sensor signals. The proposed system was useful to measure gas concentrations near previously inaccessible emission sources, such as the deodorization chimney, which turned out to be the main odor source in this plant. The geolocalized sensor signals were used to build H2S concentration maps that highlighted the location of the main emission hotspots.

During these field measurements we faced several challenges that affect the operation of the drone and the quality of the acquired data. The main challenge was the presence of strong winds which affected the drone stability, made the sampling tube oscillate considerably, and induced a high variability in the spatial distribution of the released gases. Adding a weight at the end of the sampling line improved the stability of the measurements. Flying above the obstacles was key to minimize the operational risks considering the strong and unpredictable wind gusts present in our flights. Real-time visual feedback from the sensor signals was very helpful for fine-tuning the position of the sampling inlet close to the different emission sources (especially channeled sources). Nonetheless, the geolocalization of the sensor measurements was inaccurate under strong winds because the GPS receiver and the inlet of the sampling line were not necessarily in the same vertical axis. Two possible solutions to improve this in future experiments are (i) to place the GPS receiver at the inlet of the tubing or (ii) using an on-board camera to track the position of the sampling inlet and compensate the GPS offset via software.

Another problem that we want to address in future works is the quantification of odor concentration (e.g., in standardized units such as ou/m<sup>3</sup> [24]) from drone-based measurements. This is much more challenging than quantification of individual gas concentrations, as the relationship between the components of a gas mixture and the perceived odour concentration is non-linear and subject to synergic and masking effects [25]. We also plan to combine the drone measurements with atmospheric dispersion models, such as CALPUFF [26], to predict the impact outside of the plant. The proposed platform could be applied in the future to other industrial sectors, such as solid waste landfills, composting plants, and animal farms.

**Author Contributions:** Conceptualization, J.B., M.D.E., S.D. and S.M.; Data curation, J.B.; Formal analysis, J.B.; Funding acquisition, L.P. and S.M.; Investigation, J.B., M.D.E. and S.D.; Methodology, J.B., M.D.E. and S.M.; Project administration, L.P. and S.M.; Resources, M.D.E. and S.D.; Software, J.B.; Supervision, S.M.; Validation, J.B.; Visualization, J.B.; Writing—original draft, J.B.; Writing—review & editing, J.B., M.D.E., S.D., L.P. and S.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research has received funding as third party from the ATTRACT project funded by the EC under Grant Agreement 777222.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Restrictions apply to the availability of the data presented in this study. These data can be available on private request from the corresponding author.

**Acknowledgments:** CERCA Programme/Generalitat de Catalunya. The Signal and Information Processing for Sensor Systems group is a consolidated Grup de Recerca de la Generalitat de Catalunya and has support from the Departament d'Universitats, Recerca i Societat de la Informació de la Generalitat de Catalunya (expedient 2017 SGR 1721). We would also like to acknowledge Luis Fernández Romero, Maria José Ibáñez, Lidia Saúco, Ana Maciá and Pilar Pradas for their support during the field campaigns. Authors of this report gratefully acknowledge the cooperation of ESAMUR (Entidad Regional de Saneamiento y Depuración de Murcia).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


## *Letter* **Assessing the Impact of Corona-Virus-19 on Nitrogen Dioxide Levels over Southern Ontario, Canada**

#### **Debora Griffin 1,\*, Chris Anthony McLinden 1,2, Jacinthe Racine 3, Michael David Moran 1, Vitali Fioletov 1, Radenko Pavlovic 3, Rabab Mashayekhi 3, Xiaoyi Zhao <sup>1</sup> and Henk Eskes <sup>4</sup>**


Received: 16 October 2020; Accepted: 11 December 2020; Published: 16 December 2020

**Abstract:** A lockdown was implemented in Canada mid-March 2020 to limit the spread of COVID-19. In the wake of this lockdown, declines in nitrogen dioxide (NO2) were observed from the TROPOspheric Monitoring Instrument (TROPOMI). A method is presented to quantify how much of this decrease is due to the lockdown itself as opposed to variability in meteorology and satellite sampling. The operational air quality forecast model, GEM-MACH (Global Environmental Multi-scale - Modelling Air quality and CHemistry), was used together with TROPOMI to determine expected NO2 columns that represents what TROPOMI would have observed for a non-COVID scenario. Applying this methodology to southern Ontario, decreases in NO2 emissions due to the lockdown were seen, with an average 40% (roughly 10 kt[NO2]/yr) in Toronto and Mississauga and even larger declines in the city center. Natural and satellite sampling variability accounted for as much as 20–30%, which demonstrates the importance of taking meteorology into account. A model run with reduced emissions (from 65 kt[NO2]/yr to 40 kt[NO2]/yr in the Greater Toronto Area) based on emission activity data during the lockdown period was found to be consistent with TROPOMI NO2 columns.

**Keywords:** air pollution; TROPOMI; COVID; nitrogen oxides

#### **1. Introduction**

The outbreak of Coronavirus disease in late 2019 (COVID-19) reached Canada in early 2020, with the first Canadian COVID-related death reported in early March 2020 [1]. By mid-March, provinces were beginning to limit the size of gatherings and initiating an overall lockdown of their populations. In Ontario, the lockdown was announced on 16 March 2020. This greatly disrupted traffic patterns, with traffic density observed to decrease by roughly 50–60% by early April [2]. Travel restrictions also greatly curtailed air travel. These circumstances provided a unique and unprecedented natural experiment where emissions patterns were rapidly and drastically altered, especially in southern Ontario, home to the Greater Toronto Area (GTA), the most populous urban area in Canada [3]. The GTA consists of the City of Toronto and four surrounding regional municipalities (see Supplement Material Figure S1) and includes many limited-access highways and expressways, rail lines, and Toronto Pearson International Airport, Canada's busiest airport [4]. Its population in 2016 was over 6.4 million [3]. Ultimately, changing emissions in the GTA and the rest of southern

Ontario associated with the pandemic allow for testing and refining of emissions from different sectors, most notably those from vehicle traffic.

One pollutant that is associated with combustion processes such as vehicle traffic is nitrogen dioxide (NO*<sup>x</sup>* = NO2 + NO). NO*<sup>x</sup>* has adverse effects on human and environmental health: it is a key ingredient in smog, as precursors to both ozone and particulate matter, and can contribute to acid deposition. NO*x* concentrations strongly correlate with local emission sources due to its short lifetime of a few hours [5,6] and, because of the high and localized enhancements compared to background levels, NO*<sup>x</sup>* is a good tracer of human activity near cities. For example, urban NO*<sup>x</sup>* displays a strong weekly and diurnal cycle resulting from differences in traffic and manufacturing activity on weekends versus weekdays [7,8]. Observed NO2 is not merely a function of NO*<sup>x</sup>* emissions; but is also a function of the local chemical environment and meteorology. For example, it is well known that NO2 impacts its own chemical lifetime [5]. Furthermore, meteorological parameters such as cloud cover, temperature, and wind speed and direction all have a strong effect on local NO2 enhancements [9–11]. Given this temporal and spatial variability in NO2, precisely where and when observations are made is also very important. Taken together, one important challenge when interpreting changes in NO2 lies in disentangling potential changes in emissions from natural and sampling variability.

Satellite observations can help to identify NO*x* emissions and their variation globally. Declines in NO2 emissions, following the lockdown, have previously been observed by satellite instruments in China, India, Europe and North America [12–15]. In this study, observations from the European Space Agency's Sentinel-5p Tropospheric Monitoring Instrument (TROPOMI), in conjunction with forecasts from Environment and Climate Change Canada's (ECCC) operational regional air quality forecast model GEM-MACH (Global Environmental Multi-scale - Modelling Air quality and CHemistry) [16,17], are used to isolate the impact of the COVID associated lockdown on NO2 levels in southern Ontario, Canada. In this study, we show that combining satellite observations and model output, it is possible to determine the impact of meteorology and sampling variability on the observed NO2 column changes. The air quality model is further used to determine how possible lockdown-associated emission reductions impact the NO2 columns, and whether those match the observed changes.

#### **2. Materials and Methods**

In the context of satellite remote sensing, one method, and the most straightforward, to assess the impact of the COVID lockdown on NO2 is to directly compare the COVID period with a non-COVID period, perhaps using the same period from different years [13]. However, in order to completely isolate the COVID signal, this method assumes that among the two periods being compared, (i) baseline emissions do not differ, (ii) natural or seasonal variability in winds, sunlight, temperature, and other meteorological parameters are not important, (iii) differences in satellite sampling do not play any role, and (iv) any differences in the satellite retrieval algorithm are minimal. For many locations, including the Canadian domain studied here, differences in interannual NO*x* emission changes should be small. However, meteorological variability can be important, and given that, sampling variability is also likely to lead to differences between the two periods. In the case of TROPOMI, different retrieval algorithms were used for spring 2019 vs. spring 2020 (v1.2 until April 2019 and thereafter v1.3, differences include the treatment of "negative" cloud fractions and the lower limit of the tropospheric air mass factor (AMF) relaxed influencing the quality flag [18]). While differences tend to be small, it is difficult at present to completely eliminate this as a possible source of difference.

With these confounding factors in mind, the method presented here is the one in which the ECCC's operational GEM-MACH air quality model forecasts are used to control for non-COVID factors such as sampling variability, meteorological variability, and other sources of variability. Furthermore, to limit potential differences in the retrieval algorithm between 2019 and 2020, the two periods considered are consecutive in 2020: a pre-COVID period and the COVID-lockdown period.

#### *2.1. TROPOMI Observations*

Observations of NO2 from TROPOMI (2017-present [19]), an Earth-viewing spectrometer, are used here. TROPOMI has a resolution of 3.5 × 5.5 km<sup>2</sup> (since August 2019, before 3.5 × 7 km2) at nadir and measures back-scattered ultraviolet/visible/solar-infrared sunlight from which NO2 vertical column density (VCD), or the vertically-integrated NO2 number density, can be derived. Details on the retrieval algorithm can be found elsewhere [20], but in short: a spectral fit is performed matching laboratory-measured NO2 absorption cross-sections and other relevant parameters to these observed spectra which provide a determination of the NO2 slant column densities (SCDs), or the number density integrated along the path of the sunlight through the atmosphere. In a second step, the stratospheric component of the SCD is determined using a chemical data assimilated system and subtracted [21]. Finally, the remaining tropospheric SCD is converted to a VCD using an AMF which quantifies the sensitivity of the satellite to a particular scene which depends on factors such as shape of the NO2 profile, surface reflectivity, viewing geometry, and clouds. In this work, an alternative AMF is used which better accounts for the presence of snow and uses higher resolution NO2 profile shapes to improve the effective spatial resolution [22,23]; see Supplement material for more information [24–33]. A radiative transfer model is used to calculate AMFs [34] which depend on the following factors: solar and viewing geometry, surface pressure, the presence and pressure of clouds, scene reflectivity and the vertical distribution of the NO2 via VCD=SCD/AMF. Similar, as in the original TROPOMI AMF, the aerosols are corrected for implicitly [21]. Lastly, the TROPOMI data are filtered to use only the highest quality data (qa\_value > 0.75 and the cloud cover of the pixels is at most 30%). The TROPOMI tropspheric NO2 columns have been validated in a number of studies against ground-based, aircraft and other satellite observations [35–40]. The alternative AMFs have a smaller bias between ground-based and aircraft-borne observations over cities or near industry [23,41,42]. An evaluation of the TROPOMI NO2 observations over the GTA in 2020 shows overall good agreement with ground-based remote-sensing PANDORA [43] measurements (see Figure S8).

#### *2.2. GEM-MACH Air Quality Forecast Model*

The Canadian operational air quality forecast model, GEM-MACH [16,17,44,45], is used in this work. GEM-MACH consists of an on-line chemical transport module that is embedded within ECCC's Global Environmental Multi-scale (GEM), weather forecast model, and is applied over a domain that covers most of North America. It includes emissions, chemistry, dispersion, and removal process representations for 41 gaseous and eight particle chemical species, and provides hourly concentrations between the surface and 0.1 Pa (on 80 hybrid vertical levels) with a 10 × 10 km<sup>2</sup> grid cell. The standard operational model run inputs hourly emissions fields that are prepared using the Sparse Matrix Operator Kernel Emissions (SMOKE) [46] that account for seasonal, weekly and daily variations. The performance of GEM-MACH has previously been evaluated against surface and remote-sensing measurements [16,44,47–51]. A performance evaluation of NO2 forecasts for spring 2019 for Canada by the version of the air quality modelling system used in this study was carried out before it was implemented operationally in September 2019. As an indication of the quality of the pre-pandemic forecasts to be expected in this study, it was that found that NO2 forecasts for Canada for that period had a mean bias of 1.4 ppbv, a correlation coefficient of 0.57, and a root mean square error of 7.8 ppbv [48]. Additionally, an evaluation with PANDORA ground-based measurements was performed over the GTA for 2020, and showed overall good agreement with the model NO2 VCDs (see Figure S9). The current version of the emissions files used by the operational model are based on a Canadian emissions inventory compiled for the 2013 base year and a 2017 projected U.S. inventory [48]. While using year-specific NO*<sup>x</sup>* emissions is ideal, suitable emission inventories are not available in a timely manner. Alternative non-operational runs were also performed for a limited time period between 15 March and 10 May 2020 with projected Canadian 2020 emissions and COVID-modified emissions for vehicle, aircraft, manufacturing, and residential sectors (see Section 3 for details). The Canadian 2020 anthropogenic emissions are based on projected national emission

inventory that was generated by ECCC for policy studies [52]. The projections include expected changes in population, economic activity and energy use over a five-year period, from 2015 to 2020.

GEM-MACH output is used in this study for two purposes. The first is to provide profile shapes which are used in the calculation of revised TROPOMI AMFs as discussed above in Section 2.1, following the method proposed by Palmer et al. [34] and McLinden et al. [22]. These alternative AMFs (not the operational TROPOMI AMFs) are used to convert TROPOMI SCDs into VCDs. Thus, it is possible to carry out the direct comparison between our TROPOMI NO2 VCDs and those obtained from the GEM-MACH (further details can be found in the supplement material). The second is to determine the time evolution of NO2 on standard "business as usual" (BAU) input emissions that do not account for COVID impacts, which can then be contrasted with that observed by TROPOMI. In both cases, NO2 profiles are obtained from operational forecasts, are run at 10 km spatial resolution and are launched every 12 h (and every 24 h for the special runs).

In this study, we integrate the model NO2 profiles to obtain VCD values. The operational GEM-MACH model currently does not include NO*x* sources in the free troposphere (such as lightning and aircraft at cruising altitude); as a consequence the model NO*x* concentrations are near zero above the boundary layer. We obtain a more realistic free tropospheric column from GEOS-Chem [53], a 3-D model of atmospheric chemistry model (monthly averages between 18-21 UTC, from 2 km to 12 km; 0.5 × 0.67◦ resolution, version v8-03-01; http://www.geos-chem.org), these partial columns are on the order of 10<sup>14</sup> molec/cm2 and small compared to the partial columns in the boundary layer (see Figure S7), similar corrections have been applied in previous studies [22,23,49]. The model VCDs are then sampled (and filtered) in space and time at each TROPOMI pixel, and are filtered like the TROPOMI observations.

#### *2.3. Determination of Expected NO*<sup>2</sup>

GEM-MACH model output is used to estimate the impact of: (1) COVID measures on NO2 levels, (2) changes from any other possible sources of variability, including seasonal, inter-annual, or even shorter-term meteorological variability, and (3) the TROPOMI sampling variability. GEM-MACH forecasts using standard emissions inventories for both the pre-lockdown and lockdown periods are sampled at each TROPOMI pixel and overpass time.

Comparing pre-lockdown and lockdown TROPOMI observations together with pre-lockdown and lockdown GEM-MACH predictions provides an estimate of the changes in NO*x* emissions purely due to the lockdown, as this method accounts for effects of meteorology, seasonality, and sampling variability. The expected TROPOMI VCDs, V*T*,*e*, under a BAU scenario, are determined from the TROPOMI VCDs before the lockdown and are adjusted by the relative change seen in the model forecasts (GEM-MACH and free troposphere from GEOS-Chem ) between the two time periods:

$$V\_{T,c}(t\_{cvdid}) = V\_T(t\_{pr\epsilon}) \cdot \frac{V\_{Model}(t\_{cvdid})}{V\_{Model}(t\_{pr\epsilon})} \,. \tag{1}$$

When averaging over time to produce spatially resolved maps, observations from 15 February to 15 March 2020 and 16 March to 8 May 2020 are used for the pre-lockdown and lockdown time periods, respectively. This end date is associated with some traffic rebound and increased emissions throughout May 2020 (see Section 3). When averaging over a larger area to produce a time series, 15-day running means are used (the satellite data need to be averaged over multiple days in order to obtain enough data over this area, approximately 50% of observations are filtered due to clouds). The expected columns for the 15-day running means are estimated as in Equation (1), where *VT*,*e*(*tcovid*) and *VModel*(*tcovid*) are the 15-day means for a specific day.

#### **3. Results and Discussions**

#### *3.1. Spatial Averaging over Southern Ontario*

Figure 1 shows the TROPOMI and operational GEM-MACH NO2 VCDs averaged over the pre-lockdown and lockdown periods. There is excellent agreement between TROPOMI, panel (a), and GEM-MACH, panel (d), across southern Ontario for the pre-lockdown period in terms of both spatial distribution and magnitudes which provides confidence that the NO*x* emissions inventory and the model itself can accurately represent the complex physics and photochemistry of the real world.

**Figure 1.** TROPOMI averaged VCDs over southern Ontario are shown for (**a**) a pre-lockdown (16 February–15 March 2020) and (**b**) a lockdown (16 March–8 May 2020) period. The relative differences ((lockdown-pre-lockdown)/pre-lockdown) are shown in panel (**c**) for areas that exceed <sup>3</sup> <sup>×</sup> <sup>10</sup><sup>15</sup> molec/cm<sup>2</sup> in the pre-lockdown period. Panels (**d**–**f**) are the same but for the operational GEM-MACH model BAU NO2 VCDs, sampled at the time and location of the TROPOMI pixels.

When comparing TROPOMI observations between the pre-lockdown and lockdown periods, panel (a)–(c), there is a large decrease in VCDs over the GTA, the Windsor-Detroit urban area (which straddles the Canada-U.S. border), and virtually the entire domain. Decreases in the urban areas can reach or exceed 50%, and in parts of the GTA the decline can even exceed 60%. However, there is also a decrease predicted by GEM-MACH, despite not accounting for COVID-related emissions reductions as shown in panels (d)–(f). This is due to a combination of a seasonal effect in which increased sunlight means a decrease in NO*<sup>x</sup>* lifetime and less NO*<sup>x</sup>* present as NO2, but also expected seasonal changes in emissions (see Supplement Material Figure S2). This effect is on the order of 25% over the GTA between the two time periods, and is especially large because it occurs during the change from cold season to warm season.

Even using several weeks of TROPOMI observations, meteorological and sampling variability can impact the average. Spring 2020 was colder than 2019 and particularly cloudy over southern Ontario, leading to fewer cloud-free overpasses on which to base an average. This can have an impact on the averages, since approximately 50% of TROPOMI data are removed due to cloud cover, so that the remaining cloud-free observations are more representative of fair weather conditions. To determine the impact of the sampling variability, GEM-MACH averages are determined using all days over the entire domain, versus only those sampled as TROPOMI (qa > 0.75). For the average NO2 between 16 March and 8 May 2020, sampling variability can lead to differences as large as 10% near cities (see Supplement material Figure S3).

As a test of the methodology to create expected TROPOMI columns for the COVID-19 period from the change in the model forecasts, the same procedure was applied to TROPOMI observations and operational GEM-MACH output from 2019. In this case, differences between expected and TROPOMI observations should be minimal, because no unusual emission reductions occurred in 2019. As can been seen in Figure 2d,e, differences are small, suggesting the method is generally reliable. Averaged over the GTA, differences are 0–2%.

**Figure 2.** The figures show the expected and observed TROPOMI averaged NO2 VCDs over southern Ontario for 2020 and 2019. Expected and observed TROPOMI average VCD fields for the lockdown period (16 March–8 May 2020) are shown in panels (**a**,**b**), respectively. The same is shown in panels (**d**,**e**), but for 16 March–8 May 2019. Relative differences ((observed-expected)/expected; for areas that exceed <sup>3</sup> <sup>×</sup> <sup>10</sup><sup>15</sup> molec/cm2) between the TROPOMI observations and the expected columns are shown in panel (**c**,**f**) for 2020 and 2019, respectively. Note that panel (**b**) is the same as Figure 1b.

#### *3.2. COVID-Scenario Model Run*

To help evaluate the difference between expected and observed TROPOMI NO2 columns, as shown in Figure 2, GEM-MACH is re-run using an alternative emissions scenario designed to represent COVID-19 emissions changes: (i) a 30% reduction in industrial NO*<sup>x</sup>* emissions, (ii) a 60% reduction for traffic NO*<sup>x</sup>* emissions, (iii) an 80% reduction in aircraft NO*<sup>x</sup>* emissions (landings and takeoffs), and (iv) a 20% increase of residential fuel NO*<sup>x</sup>* emissions due to people staying at home. Emissions of other air pollutants emitted by these source types (CO, VOC, NH3, SO2, PM2.5, PM10) are also changed by these same percentages. The change of emissions is based on the following: (i) similar emission scenarios from Europe [11], (ii) an estimate of daily driving activities which showed a reduction of 50–65% decrease [2], (iii) the reduction of airline flights which were 79% lower in April 2020 compared to April 2019 [54], and (iv) Google mobility Reports [55] showed an 20% increase spent in residential spaces and thus an increase of 20% is applied to residential emissions. Over the entire GTA, average emissions decline from 65 kt[NO2]/yr pre-lockdown to 40 kt[NO2]/yr lockdown (around noon; see Figures S3, S5, S6, and Table S2). Note that only Canadian emissions are adjusted in this way due to the challenge of representing the complicated mixture of city-, county-, and state-level responses to COVID-19 in the U.S., but given the short atmospheric lifetime of NO*<sup>x</sup>* this is unlikely to make a big difference to NO2 levels except close to the international border (further details can be found in the supplement on the impact of trans-border NO2 transport, Figure S10). The results of this emissions scenario run are shown and compared to TROPOMI observations in Figure 3 (for 1 April to 8 May 2020). Good agreement is evident over much of southern Ontario. The TROPOMI observations are approximately 20–30% higher than the model output in Hamilton

(an industrial city), where industry emissions might be underestimated, and parts of Mississauga, where airport or vehicle traffic emissions might be underestimated in the model run.

**Figure 3.** Model NO2 VCDs from the reduced emissions scenario (**a**) and observed TROPOMI NO2 VCDs (**b**) over southern Ontario averaged over the period 1 April – 8 May 2020. The relative differences ((observations-model)/model) are shown in panel (**c**) for areas that exceed 3 <sup>×</sup> <sup>10</sup><sup>15</sup> molec/cm2. Note that emissions have only been reduced in Canada; thus, large differences can be seen for the US cities near the border, especially Detroit.

#### *3.3. Temporal Changes over Toronto*

An alternative method of considering these various data sources is to average spatially and look at temporal changes. Figure 4 shows a time series of 15-day running average NO2 over the Toronto and Mississauga area (part of the GTA with the highest emissions and population density, this area also includes Toronto Pearson Airport; see Supplement Material Figure S1). TROPOMI observations show a decline after the lockdown was announced (Figure 4a), the expected columns agree well with the TROPOMI observations during the pre-lockdown period, but, differences emerge after the lockdown begins as emissions are reduced, but the model assumes BAU emissions. The alternate model run with reduced emissions (Figure 4b) represents the decline observed by TROPOMI quite well and over the same time period, both the TROPOMI observations and the model predict a drop of roughly 40% over the GTA core (using data from 16 March to 8 May 2020) as a result of the lockdown. When the 2019 and 2020 satellite data are compared directly, however, the drop is only about half as much (20%), as the meteorology and sampling variability of the satellite are largely different in that area between 2019 and 2020. Note that the satellite data indicate that the peak of the emissions decline in Toronto and Mississauga occurred in mid-April. Throughout May 2020, the satellite measurements suggest that the NO*x* emissions began to increase again gradually (Figure 4a), though they are still lower than BAU emissions. Ontario entered Phase 1 of its re-opening on 19 May 2020, when certain restrictions were lifted.

**Figure 4.** Timeseries of 15-day running mean of NO2 VCDs over Toronto and Mississauga for 7 February to 9 June 2020, panel (**a**) shows the TROPOMI observations (navy), the expected columns (magenta). The timeseries of 2019 TROPOMI observations (grey) for the same period is shown as a reference. The red line indicates the percentage emission reductions based on the difference between the TROPOMI observations and expected columns. Panel (**b**) shows NO2 columns from the model predictions sampled like TROPOMI assuming a BAU scenario with 2020 updated emissions (blue) and a 2020 COVID reduced emissions scenario (purple). The percentage decrease in model predicted VCDs (red line) is estimated from the difference between the two model runs, the red dashed line shows the drop for perfect sampling. Average emission reductions are highlighted using observations between 16 March to 8 May 2020. Approximately 200 observations are averaged for the 15-day mean, the resulting standard errors are plotted, however, the standard error is seen to be small and on the order of 1013–1014 molec/cm2.

#### **4. Conclusions**

We present a method to disentangle the effects of meteorology and sampling variability on the observed NO2 changes, from the lockdown-related changes in NO*<sup>x</sup>* emissions. During the period from 16 March to 8 May 2020, NO2 columns in the center of the GTA decreased by nearly 60% compared to the previous month. About 25% of this decrease is associated with meteorological and seasonal changes independent of the COVID-19 pandemic. Even the TROPOMI sampling variability itself can impact the magnitude of the observed NO2 columns over the course of one or two months averaging (∼10%). From the TROPOMI observations and GEM-MACH air quality model results, we estimate that due to the lockdown the NO2 columns in Toronto and Mississauga declined by over 40%. These changes vary spatially, and in certain locations columns declined by over 50%. Applying the same method to 2019 observations leads to a 0–2% decline over the GTA, which is expected as there were no emission declines in spring 2019, which gives confidence that the method is robust.

A special model run with reduced NO*x* emissions of vehicle traffic, aircraft, and industry based on lockdown activity data [2,54,55] compares well with the TROPOMI observations during the lockdown and returned similar NO2 declines in the GTA. Although, spatial patterns over cities are somewhat visible, it is hard to disentangle the emission reductions by sector with our methodology. Nevertheless, emission changes of (i) a 30% reduction in industry, (ii) a 60% reduction for traffic, (iii) an 80% reduction in aircraft landings and takeoffs, and (iv) a 20% increase in residential fuel combustion, represent the TROPOMI NO2 observations well, at least in southern Ontario. In the GTA, NO*<sup>x</sup>* emissions of 40 kt[NO2]/yr represent the observations well, this is a drop of over 37% compared to a BAU scenario. The drop in the input emissions is almost identical to the drop determined from the model NO2 VCDs (36%) over the same area which further indicates that the method presented works well.

This study highlights the importance of considering meteorological and sampling variability when estimating emission reductions. One needs to be cautious when simply comparing two months, since the effects of meteorological and sampling variability are not negligible when only a short series of data is averaged. We show that spring 2019 and 2020 were, with regards to the meteorology, very different years and simply looking at the difference results in about half the NO*x* emission decline as compared to considering the meteorology. Further, the emission decline may vary strongly spatially, especially in cities. This can make it difficult to compare different studies unless the exact same areas are considered. The unique lockdown period associated with the 2020 COVID-19 pandemic can further be used to check and refine our existing emissions inventories for NO*x* and other pollutants by looking at spatial and temporal distributions of available satellite and surface measurements for a number of different urban areas.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2072-4292/12/24/4112/ s1 , Figure S1: Boundaries of the Greater Toronto Area, Figure S2: Operational forecast model's seasonal emission changes, Figure S3: Impact of the sampling on the averaged TROPOMI columns, Figure S4: Model input NO*x* emissions, Figure S5: Model input emissions in Toronto an Mississauga, Figure S6: Correlation between TROPOMI observations and model VCDs, Figure S7: Correlation between TROPOMI observations and model VCDs with and without free-tropospheric column, Figure S8: Comparison between TROPOMI and ground-based PANDORA NO2 measurements, Figure S9: Comparison between model and ground-based PANDORA NO2 measurements, Figure S10: Impact of US NO2 emission changes on the GTA NO2 concentrations, Table S1: Parameters and their reference points in the AMF look-up table, Table S2: Approximate average emissions used for the model runs in the GTA, Table S3: The statistics from the model and TROPOMI comparison.

**Author Contributions:** Conceptualization, D.G., C.A.M., M.D.M., and V.F.; methodology, D.G., C.A.M., M.D.M., V.F., J.R., R.M. and R.P.; software, D.G., and J.R.; validation, C.A.M., M.D.M., V.F., and X.Z.; formal analysis, D.G., J.R., and R.M.;investigation, D.G. and C.A.M.; resources, R.P., and H.E.; data curation, J.R., R.M., and H.E.; writing—original draft preparation, D.G. and C.A.M.; writing—review and editing, all authors; visualization, D.G., C.A.M., and X.Z.; supervision, C.A.M. and R.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work contains modified Copernicus Sentinel data. The Sentinel 5 Precursor TROPOMI Level 2 product is developed with funding from the Netherlands Space Office (NSO) and processed with funding from the European Space. TROPOMI data can be downloaded from https://s5phub.copernicus.eu (last access: 6 June 2020).

**Acknowledgments:** We would like to thank you MSC-REQA employees involved in emission adjustment and modelling: Mourad Sassi, Annie Duhamel and Jessica Miville.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


55. Google. COVID-19 Community Mobility Reports. 2020. Available online: https://www.google.com/ covid19/mobility/ (accessed on 29 September 2020).

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Remote Sensing* Editorial Office E-mail: remotesensing@mdpi.com www.mdpi.com/journal/remotesensing

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel: +41 61 683 77 34

www.mdpi.com

ISBN 978-3-0365-5894-3