*Article* **Use of Simulated and Observed Meteorology for Air Quality Modeling and Source Ranking for an Industrial Region**

**Awkash Kumar <sup>1,2,</sup>\*, Anil Kumar Dikshit <sup>1</sup> and Rashmi S. Patil <sup>1</sup>**


**Abstract:** The Gaussian-based dispersion model American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) is used to predict concentrations for air quality management in several countries. A study was conducted for Chembur, an industrial area of Mumbai city in India, to assess how well AERMOD, driven by observed surface meteorology and by Weather Research and Forecasting (WRF) model output, reproduces ground-level NO<sub>x</sub> and PM<sub>10</sub> concentrations. The model was run with both sets of meteorology and the emission inventory. When the results were compared, the air quality predictions were better with WRF output data than with the observed meteorological data. This study showed that onsite meteorological data can be generated by WRF, which saves resources and time and could be a good option in low- and middle-income countries (LMIC) where meteorological stations are not available. The study also quantifies the source contributions to ambient air quality in the region. Emission loads of both NO<sub>x</sub> and PM<sub>10</sub> were highest from industries; in ambient concentration, however, NO<sub>x</sub> was highest from vehicular sources, while PM<sub>10</sub> was highest from industrial sources. This methodology can help regulatory authorities develop control strategies for air quality management in LMIC.

**Keywords:** meteorology; WRF; air quality; AERMOD; source apportionment

#### **1. Introduction**

Urbanization-related issues have become very prominent across the world [1–3], especially in developing countries like India where cities have started facing an acute air pollution problem due to urbanization [4]. Many Indian megacities, such as Delhi, Mumbai, Bangalore, and Kolkata, are witnessing increasing health problems due to rapid increase in air pollution [5]. The total health cost due to air pollution for Mumbai was about USD 8 billion for the year 2012 [6]. This problem becomes particularly complex to resolve in urban areas because of diverse emission sources such as vehicles, industries, bakeries, hotels, diesel generating sets and combustion of solid fuels in the domestic sector.

Air quality monitoring networks have been installed at various locations in many cities. However, installing and operating a large number of air quality monitoring stations requires considerable government funding, which may not be available in low- and middle-income countries (LMIC). Monitoring data are increasingly used to communicate the existing status of air quality, but they do not contribute to the understanding of sources and meteorological factors. Whereas observed data represent the air quality status at a particular location only, dispersion models can provide information about much larger areas. Further, modeling helps in the determination of concentration plots on spatial and temporal scales and of the contributions from different source types for air pollutants [7–11]. A dispersion model can also be used to identify pollution sources with the help of an emission inventory [12,13], which is very useful in making rational management strategies [7,14–20]. A dispersion model can also determine the contribution of various sources in a region, whereas a receptor model determines source contribution at a particular location [21].

Data required for dispersion models include the emission inventory, geographical data, and meteorological data of the region [22]. Data availability, especially of meteorological data, is an important factor in the assessment of air quality in LMIC because running a meteorological monitoring station requires resources. The use of poor-quality meteorological data in air quality models may have a significant adverse effect on model output quality [23,24]. Meteorological data are generally taken from a nearby meteorological station and applied to the study region. The results of air quality models may contain significant error despite advanced computer technology, numerical modeling techniques, and state-of-the-art performance evaluations [23–27]. A survey of air quality and meteorological models has also been carried out [28].

**Citation:** Kumar, A.; Dikshit, A.K.; Patil, R.S. Use of Simulated and Observed Meteorology for Air Quality Modeling and Source Ranking for an Industrial Region. *Sustainability* **2021**, *13*, 4276. https://doi.org/10.3390/su13084276

Academic Editor: Sara Egemose

Received: 3 February 2021; Accepted: 9 April 2021; Published: 12 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Observed meteorological data from a single monitoring station may not give good performance in air quality modeling for an urban industrial region where several emission sources are present at multiple heights and the topography varies. An alternative is to generate onsite meteorological data using a meteorological model, which could be an effective option in LMIC. A study was conducted on the coupling of the American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) with the Weather Research and Forecasting (WRF) model in Pune city (India) for a single pollutant, PM<sub>10</sub> [29]. However, that study did not compare the concentrations predicted with WRF-generated meteorology against observed values, and it did not estimate the contribution of various sources in the study region. Short-term air quality forecasting has also been carried out for five days for the Chembur region using WRF-forecasted meteorology and AERMOD [30]. In these studies, AERMOD's requirement for horizontally homogeneous hourly surface and upper-air meteorological data was fulfilled from the WRF model. The main objective of this study was to generate onsite meteorological data at mesoscale using the WRF model and to compare the results with observed meteorological data. Both data sets were then used in air quality modeling to evaluate whether coarse-resolution WRF output is a feasible option in LMIC. The study was also extended to rank the contributions of sources to emissions and to ambient concentrations of NO<sub>x</sub> and PM<sub>10</sub>. This will be useful for air quality management of the urban area for regulation purposes [31] in LMIC.

#### **2. Study Area**

The study area, Chembur, represents an industrial site of Mumbai city in India with global coordinates 19.05° N and 72.89° E. The area covers the M East and M West wards of the Municipal Corporation of Greater Mumbai, which is one of the financial centers of India, as shown in Figure 1a. Chembur has a population of 1.2 million. It measures 6.5 km east-to-west and 8.45 km north-to-south, as shown in Figure 1c. The region has a marine alluvium type of soil and north-south running basalt hills to its south [32]. The topographical features are shown in Figure 1d in the Universal Transverse Mercator coordinate system. The elevation ranges from 1 to 200 m; it is maximum in the central part of the study area and minimum along the eastern boundary of the study area. The elevation just above the location of Rashtriya Chemicals and Fertilizers Limited (RCFL) is 100 to 200 m. Major industries in this area are Bharat Petrochemical Corporation Limited (BPCL), Hindustan Petroleum Refinery Corporation Limited (HPCL), Tata Thermal Power Corporation Limited (TPCL) and RCFL. Containers and heavy-duty vehicles from this area use the Port Trust Road, Mahul Road and Ramakrishna Chemburkar Marg (R C Marg). Road conditions are poor due to the continuous movement of heavy vehicles. The north boundary of the study area has a residential zone comprising Chheda Nagar (between points 2 and 3) and Shramjivi Nagar (to the left of point 2), while the south boundary is adjacent to the Tata Thermal Power Plant. The west boundary lies by RCFL and Mahul, and the east boundary is aligned with Shahyadri Nagar and Prayag Nagar.

**Figure 1.** (**a**) Mumbai Area, (**b**) WRF Domain, (**c**) Study Area Chembur (Downloaded from Google Earth), (**d**) Terrain Map. Note: Figure (**d**) is given in Universal Transverse Mercator coordinate system. 1-Eastern Express Highway Road, 2-NGA Marg, 3-Ghatkopar-Mankhurd Road, 4-V N Purav Road, 5-R C Marg, 6-B D Patil Marg, A-RCFL, B-BPCL, C-HPCL, D-TPCL.

Around twenty years ago, Chembur was one of the most polluted regions in Mumbai. With sustained effort and pressure from authorities and industries to implement a series of control measures, the region has witnessed an improvement in air quality. In the last two decades, the characteristics of the region have improved due to the closure of many industries, but residential development and vehicular density have increased [33]. Chembur still needs appropriate air quality studies for developing management strategies, as its ambient air quality is poor when compared with the National Ambient Air Quality Standard 2009 of the Central Pollution Control Board (CPCB), New Delhi (India). CPCB has published a document giving a Comprehensive Environmental Pollution Index (CEPI) score for various industrial regions in the country, in which Chembur has a CEPI score of 69.19 [34]. This score indicates that the region should be rated as a severely polluted area. Hence, the region requires a better understanding of air quality processes so that the effectiveness of the action plan can be realized.

#### **3. Methodology and Data**

The schematic data flow of the study is shown in Figure 2. AERMOD requires an emission inventory and nine hourly meteorological parameters (wind speed, wind direction, rainfall, temperature, humidity, pressure, ceiling height, global horizontal radiation and cloud cover) as input data. These meteorological parameters were generated from the WRF model for the year. The same parameters were also observed at RCFL for the same period, and the two data sets were compared. Concentrations were then predicted using the air quality model (AERMOD) with the observed meteorological data and with the WRF-generated data. The data were arranged in a spreadsheet with one meteorological parameter per column and one time step per row, and this spreadsheet was processed in AERMET, the meteorological pre-processor of AERMOD. Terrain data at 90 m resolution from the Shuttle Radar Topography Mission (SRTM) were used in AERMAP, the terrain pre-processor of AERMOD. The AERMOD model was then used to predict concentrations of NO<sub>x</sub> and PM<sub>10</sub>, as shown in Figure 2. Comparisons were also made for both models, WRF and AERMOD. The meteorological model, the parameterization setup, the dispersion model, the emission load, and the observations are presented section-wise.

**Figure 2.** Schematic Flow for the Study.

#### **4. Meteorological Model**

The mesoscale model used in this study is the Advanced Research WRF model, version 3.2 [35], which is designed to serve both atmospheric research and operational forecasting needs [36]. NCEP FNL (Final) Operational Global Analysis data, prepared operationally every six hours on 1.0° × 1.0° grids, were used as input for WRF. This product comes from the Global Data Assimilation System (GDAS), which continuously collects observational data from the Global Telecommunications System (GTS) and other sources for analyses [37]. WRF is a limited-area, non-hydrostatic primitive equation model with multiple options for various physical parameterization schemes. This version employs Arakawa C-grid staggering for the horizontal grid and a fully compressible system of equations. In the vertical, a terrain-following hydrostatic pressure coordinate with grid stretching is used. The time-split integration uses a third-order Runge–Kutta scheme with a smaller time step for acoustic and gravity wave modes. The physics options used in this study are the WRF Single-Moment 6-class simple ice scheme for microphysics, the Kain–Fritsch scheme for the cumulus convection parameterization, and the Yonsei University planetary boundary layer scheme. The rapid radiative transfer model and the Dudhia scheme are used for longwave and shortwave radiation, respectively, and the Noah land surface model has been selected. Together, these parameterizations constitute a well-tested suite of schemes over the Indian region [38–40].

The model domain extends from 71° E to 81° E zonally and from 11° N to 21° N meridionally, consisting of 100 by 100 grid points with 25 km grid spacing, as shown in Figure 1b. The model has 28 vertical levels with the model top at 10 hPa, and it was run from 1 January to 31 December of the year. Topography and snow cover information were obtained from the United States Geological Survey.

The meteorological parameters were extracted from the WRF model at ground level. Run at 25 km resolution, WRF provides time series of meteorological parameters for a specific period at a particular location. In this study, nine hourly meteorological parameters (cloud cover, temperature, pressure, relative humidity, wind direction, wind speed, ceiling height, rainfall, and global horizontal radiation) were simulated for the year using WRF. WRF writes its output in Network Common Data Form (NetCDF), which the post-processing software GrADS 2.2 can read. The WRF output was fed to GrADS 2.2 to generate digital hourly meteorological data, which were arranged in an Excel spreadsheet with the hourly values of each parameter in a separate column. AERMET, the meteorological pre-processor of AERMOD, required the data in an Excel spreadsheet or other formats, so the spreadsheet was imported into AERMET.
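Extracting a point time series from gridded model output typically involves selecting the grid cell nearest the site and, for wind, converting horizontal wind components into speed and direction. The sketch below illustrates both steps with NumPy. It is not the GrADS procedure used in the study: the regular latitude/longitude grid with 0.25° spacing stands in for the 25 km grid as an assumption, and the function names `nearest_grid_index` and `wind_from_components` are hypothetical.

```python
import numpy as np

def nearest_grid_index(lats, lons, site_lat, site_lon):
    """Return (i, j) of the grid point closest to the site on a regular lat/lon grid."""
    i = int(np.argmin(np.abs(np.asarray(lats) - site_lat)))
    j = int(np.argmin(np.abs(np.asarray(lons) - site_lon)))
    return i, j

def wind_from_components(u, v):
    """Convert eastward (u) and northward (v) wind components to speed and
    meteorological direction (degrees the wind blows FROM, clockwise from north)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return speed, direction

# Illustrative grid spanning the domain stated in the text (71-81 E, 11-21 N).
# The 0.25 degree spacing is an assumption, not the actual WRF projection.
lats = np.arange(11.0, 21.0 + 1e-9, 0.25)
lons = np.arange(71.0, 81.0 + 1e-9, 0.25)

# Chembur coordinates from the Study Area section.
i, j = nearest_grid_index(lats, lons, 19.05, 72.89)

# A purely eastward-moving wind (u > 0, v = 0) blows from the west.
speed, direction = wind_from_components(1.0, 0.0)
```

On a 25 km grid the nearest cell can lie more than 10 km from the site, which is one reason a mesoscale series and a single-mast observation need not agree.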

#### **5. Dispersion Model**

A dispersion model uses emission inventory, geographical and meteorological data to predict concentrations at receptor points in the study region. The format of the input data varies between models. There are many specific models for vehicular and industrial sources, as well as models for a variety of sources [19,41–43]. The Industrial Source Complex model (ISC3), developed by the United States Environmental Protection Agency, is a steady-state Gaussian plume model that can be used to assess pollutant concentrations from a wide variety of sources associated with an industrial source complex [44]. ISC3 operates in both long-term (ISCLT3) and short-term (ISCST3) modes. ISCST3 is the regulatory model in India and has been used in many case studies [45]. It was later updated to AERMOD, whose added advanced algorithms give more accurate results [46]. AERMOD, the air quality model used in this study, has been applied to evaluate the dispersion of several pollutants, including PM<sub>10</sub>, HCN, SO<sub>2</sub>, SF<sub>6</sub>, and VOCs, and is widely recommended by regulatory authorities [47–51].

The study area (as given in Figure 1c) was set up in AERMOD; emission source locations were digitized against real-world surface coordinates, and emission quantities were entered from the estimated emission inventory. Because sources are placed at their actual locations rather than on a grid, there is no resolution concept for the emission inventory in this study. The meteorological output from the WRF model was processed in AERMET, and the AERMET output was fed into AERMOD. AERMAP, the terrain pre-processor of AERMOD, calculates a representative terrain-influence height, also referred to as the terrain height scale, for each receptor. Uniform Cartesian gridded receptors were used in addition to discrete receptors, and all receptors were at 2 m height. The anemometer height was 10 m and the surface roughness length was 1 m in this model run. Building downwash was not considered. AERMOD calculates the concentration for each hour, using hourly meteorological data, for each pollutant separately.
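To make the idea of a steady-state Gaussian plume concrete, the sketch below evaluates the textbook plume equation with ground reflection for a single point source. This is a simplified illustration of the model family described above, not AERMOD's actual formulation, which adds boundary-layer scaling, terrain treatment and other refinements. The function name `gaussian_plume` and all numerical inputs other than the 2 m receptor height are assumptions; the dispersion parameters `sigma_y` and `sigma_z` are supplied by the caller, whereas in practice they depend on downwind distance and atmospheric stability.

```python
import math

def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration (g/m^3) from a continuous point source.

    q: emission rate (g/s); u: wind speed at stack height (m/s);
    y: crosswind offset from the plume centerline (m); z: receptor height (m);
    h: effective stack height (m);
    sigma_y, sigma_z: dispersion parameters (m) at the receptor's downwind distance.
    """
    if u <= 0:
        raise ValueError("wind speed must be positive")
    crosswind = math.exp(-y**2 / (2 * sigma_y**2))
    # Second exponential is the image source that models reflection at the ground.
    vertical = (math.exp(-(z - h)**2 / (2 * sigma_z**2))
                + math.exp(-(z + h)**2 / (2 * sigma_z**2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical

# Illustrative inputs only; the receptor height of 2 m follows the text.
c_centerline = gaussian_plume(q=10.0, u=3.0, y=0.0, z=2.0, h=50.0,
                              sigma_y=80.0, sigma_z=40.0)
c_offset = gaussian_plume(q=10.0, u=3.0, y=100.0, z=2.0, h=50.0,
                          sigma_y=80.0, sigma_z=40.0)
```

Concentration scales linearly with the emission rate and inversely with wind speed, so near-calm hours in the meteorological input produce very high predicted concentrations in this formulation.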

#### **6. Emission and Concentration Data**

Emission loads were computed for point sources (36 stacks of BPCL, 30 stacks of HPCL, 4 stacks of Tata Power and 17 stacks of RCFL), line sources (the 6 roads of Chembur), and area sources (e.g., bakeries, hotels and restaurants, crematoria and the domestic sector). The area source emission loads were taken from a previous study (CSIR-NEERI) [52], which computed them for the M East and M West wards, the part of Chembur where the domestic sector is present. Industrial emission data were collected from the industries, and the vehicular emission inventory was prepared from field survey data for the study period. The vehicular emission rate was estimated using the actual number of vehicles per unit time, emission factors and vehicle kilometers travelled [53]. After the emission inventory of the region was compiled, the percentage contribution of each source type was estimated. PM<sub>10</sub> and NO<sub>x</sub> emissions are mainly caused by industries (94% and 64%, respectively). The remaining 36% of NO<sub>x</sub> emissions is contributed by vehicles (18%), domestic sources (17%) and others (1%). Vehicle emission factors are available for particulate matter, and these were taken as PM<sub>10</sub>. Figure 3 depicts the emission loads of NO<sub>x</sub> and PM<sub>10</sub> from various types of sources in the study area. The observed concentration data for NO<sub>x</sub> and PM<sub>10</sub> were collected at industrial sites, i.e., HPCL and BPCL, at a height of about 3.5–4.0 m. Continuous (hourly) ambient air quality monitoring was done at these sites using Teledyne instruments for NO<sub>x</sub> and PM<sub>10</sub>.

**Figure 3.** Emission load of NO<sub>x</sub> and PM<sub>10</sub> in Chembur (in kg/day).
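The inventory arithmetic described above can be sketched as follows: a road's daily emission is the sum over vehicle classes of vehicle count, emission factor and distance travelled, and source ranking follows from each source type's share of the total load. The function names, vehicle classes, counts and emission factors are hypothetical. The NO<sub>x</sub> loads are placeholder values chosen only so that the shares reproduce the percentages quoted in the text (64%, 18%, 17%, 1%); they are not the study's kg/day figures.

```python
def line_source_emission(counts_per_day, ef_g_per_km, road_km):
    """Daily emission (kg/day) from one road: sum over vehicle classes of
    vehicles/day x emission factor (g/km) x distance travelled on the road (km)."""
    grams = sum(counts_per_day[c] * ef_g_per_km[c] * road_km for c in counts_per_day)
    return grams / 1000.0

def percent_shares(loads):
    """Percentage contribution of each source type to the total emission load."""
    total = sum(loads.values())
    return {name: 100.0 * load / total for name, load in loads.items()}

def rank_sources(loads):
    """Source types ordered from largest to smallest load."""
    return sorted(loads, key=loads.get, reverse=True)

# Hypothetical traffic counts and emission factors for a 2 km road segment.
counts = {"car": 1000, "truck": 200}
factors = {"car": 0.5, "truck": 5.0}
road_load = line_source_emission(counts, factors, road_km=2.0)

# Placeholder NOx loads (arbitrary units) matching the shares stated in the text.
nox_loads = {"industry": 640.0, "vehicles": 180.0, "domestic": 170.0, "others": 10.0}
nox_shares = percent_shares(nox_loads)
nox_ranking = rank_sources(nox_loads)
```

Ranking by emission load and ranking by contribution to ambient concentration can differ, because release height and proximity to receptors also determine how much of an emission reaches ground level.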

#### **7. Results and Discussion**

The results of WRF were used as inputs for the estimation of concentrations by AERMOD. NCEP FNL (Final) Operational Global Analysis data were processed as input for WRF, which produced 30 meteorological parameters at the required time resolution for the study period. AERMET (the pre-processor of AERMOD) requires nine meteorological parameters, and these were extracted from the 30 for air quality modeling. The contributions of the various sources to ambient concentration were then estimated.

#### **8. Validation of Wind and Temperature Time Series**

WRF generated the nine hourly meteorological parameters used in air quality modeling, and this meteorological output was validated. Of the WRF output time series, temperature and wind are the most significant for air pollution. Therefore, point validation of temperature was conducted by comparing the hourly WRF temperature for the year with the observed temperature data of the RCFL industry, Chembur. A fair estimate of the dispersion of pollutants in the atmosphere is possible based on the frequency distribution of wind direction and wind speed. Wind transports pollutants from various sources, causing turbulent mixing and diluting pollution. Boundary layer cumulus clouds vent pollution into the free troposphere, and temperature and humidity levels in the boundary layer affect chemical reactions and the rates at which many dangerous compounds are formed.

#### *8.1. Validation of Temperature*

Figure 4 shows a comparison of surface temperature of the study area, between WRF simulation and observations, and it can be seen that they are in good agreement. The average percentage error between simulated and observed temperature is −5.3%. The standard deviation of percentage error is 10.4.

**Figure 4.** Comparison of Hourly Temperature of WRF with Observed Data.
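The two statistics quoted for temperature can be computed as sketched below. The text does not define the percentage error, so the sketch assumes the common form 100 × (simulated − observed) / observed, under which a negative mean indicates underestimation by the model. The function name and the sample values are made up for illustration.

```python
import statistics

def percentage_errors(simulated, observed):
    """Hourly percentage error, assuming 100 * (simulated - observed) / observed.
    Observed values must be non-zero for this definition to be usable."""
    return [100.0 * (s - o) / o for s, o in zip(simulated, observed)]

# Made-up temperatures (deg C), not data from the study.
sim = [27.0, 30.0, 33.0]
obs = [30.0, 30.0, 30.0]

pe = percentage_errors(sim, obs)
mean_pe = statistics.mean(pe)   # average percentage error
sd_pe = statistics.pstdev(pe)   # population standard deviation of the error
```

Because positive and negative errors cancel in the mean, the standard deviation is the more informative of the two numbers about hour-to-hour agreement.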

#### *8.2. Validation of Wind*

Wind data at a height of 10 m above the ground were derived from the WRF model for a whole year, and wind roses of Chembur were plotted from (a) the WRF simulation and (b) the RCFL observed data for the year (Figure 5). In the WRF simulation, the wind was most persistent from the west and south-west direction, while in the RCFL observed data the wind was found to blow in the west and north-west direction for most of the time. The observed wind data had more calm periods than the data simulated by WRF: the observed wind rose showed calm conditions 62% of the time, whereas the simulated wind rose showed calm conditions 4% of the time over the period. The root mean square error and mean bias error between predicted and observed wind speed were 4.05 and 3.29, respectively. The root mean square error and mean bias error between predicted and observed temperature were 3.69 and −1.5, respectively. The observed wind data were collected at the RCFL industry at a height of 6 m, and they represent the microscale domain. Chembur has considerable variation in topography, as shown in Figure 1d. The topography changes within a few meters, and this causes the wind to divert. Consequently, this variation in topography may cause the mismatch between the WRF and observed wind roses.
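The error statistics and calm fraction reported above follow standard definitions, sketched here in plain Python. The sign convention (predicted minus observed) and the calm threshold of 0.5 m/s are assumptions, since the text states neither; the function names and sample values are illustrative only.

```python
import math

def mean_bias_error(pred, obs):
    """Mean of (predicted - observed); positive means overprediction."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def calm_fraction(speeds, threshold=0.5):
    """Fraction of hours with wind speed below the calm threshold (assumed 0.5 m/s)."""
    return sum(1 for s in speeds if s < threshold) / len(speeds)

# Made-up hourly wind speeds (m/s), not data from the study.
pred = [4.0, 5.0, 3.0, 6.0]
obs = [1.0, 0.0, 3.0, 2.0]

mbe_wind = mean_bias_error(pred, obs)
rmse_wind = rmse(pred, obs)
calm_obs = calm_fraction(obs)
calm_pred = calm_fraction(pred)
```

A large positive bias combined with a high observed calm fraction, as in this toy example, is the pattern expected when a sheltered low-level anemometer is compared with a mesoscale 10 m wind.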

Modeling was performed with both sets of meteorological parameters, the WRF output and the RCFL observed data. WRF reproduces the observed wind poorly because of surface inhomogeneity. However, for the purpose of dispersion modeling, we consider WRF a good representation of the mesoscale flow. The observed wind at RCFL is not representative of the entire Chembur region for dispersion modeling. This may be because the maximum emission in this region is from industrial sources, which are at an elevated height. Hence, for these sources, the mesoscale meteorology generated by the WRF model may be more appropriate than the observed microscale meteorology.

**Figure 5.** Annual Wind Rose Simulated by (**a**) WRF and (**b**) RCFL Observed Data.
