1. Introduction
Today’s way of life, as shaped by prevailing social and economic conditions, continuously exposes people to an atmosphere with adverse effects on health. Research findings from all over the world converge on the destructive consequences that polluted atmospheric air has on humans, agriculture, and infrastructure. The continuous degradation of the climate makes it imperative to find a means of predicting the quality of the atmosphere, as well as of dealing with the production of atmospheric pollutants.
Air pollution is defined as the “contamination of the indoor or outdoor environment by any chemical, physical or biological agent that modifies the natural characteristics of the atmosphere” [
1]. The effects of air pollution have been studied in many countries, both within and outside the EU, with each work of research confirming its catalytic role in the degradation of quality of life. Exposure to PM
10 was found to increase the likelihood of hospitalization for bronchitis symptoms, both for adults and children [
2], while exposure to excess concentrations of PM
2.5 was found to be a cause of hospitalization for cardiovascular and respiratory causes, as well as the cause of death [
3,
4]. Also, exposure to this type of particle can negatively affect infants, even in the prenatal stage, increase the probability of birth defects, obesity, type I diabetes, neurological and behavioral dysfunctions, premature birth, and even neonatal death [
5,
6,
7,
8]. Equally important are the excessive concentrations of sulfur dioxide (SO
2), as an excess of 10 ppb can increase the number of hospitalizations by 1.7% [
8]. Prenatal and postnatal exposure to nitrogen dioxide (NO
2) leads to increased chances of bronchopulmonary infections, pneumonia, as well as obesity [
6,
9]. Research that was performed in areas of Athens (Thrakomakedones and Athens center) for the period 2001–2018 showed that 6% and 7.5% of deaths in these areas, respectively, are due to increased concentrations of ozone (O
3) [
10].
However, the effects of air pollution are not limited to human health. Extensive research carried out in the city of Beijing, China, showed a direct correlation between the concentration of ozone (O
3) as well as nitrogen oxides (NOx) and the reduction in the amount of grain yielded [
11]. These yield losses can reach up to 15% and are predicted to rise to 23% in the coming years [
11,
12,
Correspondingly, acid rain is harmful to infrastructure, as well as to the transportation sector. In particular, research carried out by Hernández et al. [
14] on its effect on car paint found that, of the two aging methods examined (Xenon exposure and an acid rain mixture), the latter had the more destructive consequences. Finally, Ibrahim et al. [
15] in their research on the surface treatment of concrete after exposure to an acidic environment, reported the phenomenon of gypsum formation and its destructive properties on building materials, gradually weakening the ability of cement to withstand compression.
The need to design a means of forecasting pollutant concentrations is imperative. Over time, many researchers have tried to design such predictive models. In 2015, Xiao Feng et al. [
16] created a model that combined a Multi-Layer Perceptron (MLP) neural network with a geographical model based on the air trajectory and the use of the wavelet transformation method to forecast the average daily PM
2.5 concentration for the next two days in Beijing, China. Similar efforts were made by Madhavi Anushka Elangasinghe et al. [
17] in 2014 to forecast the hourly concentration of nitrogen dioxide (NO
2) using an MLP neural network with one hidden layer, trained with the Levenberg–Marquardt algorithm for the Auckland, New Zealand region. In 2016, Yun Bai et al. [
18], using the stationary wavelet transform method and the training of an ANN with a backpropagation algorithm, tried to predict the daily concentrations of PM
10 particles and the concentrations of sulfur dioxide (SO
2) and nitrogen dioxide (NO
2). In 2017, Fabio Biancofiore et al. [
19] also tried to predict the average daily concentrations of PM
2.5 and PM
10 particles for the next one to three days, while in 2018, Fabiana Franceschi et al. [
20] tried through a statistical study and AΝΝ integration to predict the concentrations of PM
2.5 and PM
10 particles in Bogotá, Colombia. According to further research published in 2024, Quanchao Chen et al. [
21] designed an innovative model for the prediction of hourly concentrations for PM
2.5 and PM
10 as well as ozone (O
3) in the city of Beijing by creating an AAMGCRN (Adaptive Adjacency Matrix Graph Convolutional Recurrent Network). Accordingly, Sarmad Dashti Latif et al. [
22] applied multiple machine learning methods to predict the ozone (O
3) concentration for the next 1, 3, 5, and 7 h in the Klang Valley, Malaysia.
This research was conducted in an attempt to design a universal air pollution forecasting model. Notably, over the course of the past decade, no corresponding studies were found that use strictly meteorological variables, and no other pollutants, as training data. The most recent such reference dates to 2014, in the forecasting model created by Madhavi Anushka Elangasinghe et al. [
17], who argued that models in which the inputs consist, among other things, of pollutant concentrations have limited practical use.
2. Materials and Methods
The present study addresses a wide range of atmospheric pollutants, specifically the concentrations of suspended particulate matter with an aerodynamic diameter of 2.5 μm (PM2.5) and 10 μm (PM10), as well as the concentration of sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), and ozone (O3).
The area of interest is the city of Beijing, China, and specifically 12 sub-districts within it. ANNs were developed and used to achieve the requested results. ANNs are a sub-field of machine learning (ML), and what sets them apart from other ML techniques is the distinctive way in which they build their models. ANNs, as their name suggests, try to imitate both the structure and the way in which the human brain operates [
23]. Specifically, they consist of artificial neurons, with “nerves” providing them with the required information (input data). The inputs, when entering the neuron, are multiplied by some “weights” (factors that reduce or enhance the effect that each input has on the final outcome), and after passing through a transfer function, they exit from it (output data) [
23]. The schematic representation of the above process is presented in
Figure 1.
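To make the process of Figure 1 concrete, the following minimal Python sketch implements a single artificial neuron: the inputs are multiplied by their weights, summed with a bias, and passed through a transfer function. The function name and the choice of a logistic (sigmoid) transfer function are illustrative assumptions, not taken from the paper.

```python
import math

def neuron(inputs, weights, bias):
    """Single artificial neuron: weighted sum of inputs plus bias,
    passed through a logistic (sigmoid) transfer function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Example: two inputs with equal-and-opposite weights cancel out,
# so the output is the sigmoid of the bias alone.
print(neuron([1.0, 1.0], [0.5, -0.5], 0.0))  # 0.5
```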
The ANNs that were developed in this work are Multi-Layer Perceptrons (MLPs), which stand out for their use of hidden layers (
Figure 2). Simple perceptrons use no hidden layers, which limits their usefulness due to a lack of reliability. MLPs, however, by using one or more hidden layers, can process data much faster and more efficiently than their predecessors. Their topology and training parameters play a vital role, and if they are not adjusted properly, they can make the training process unsuccessful. If the number of hidden layers is insufficient, then the training process will be incomplete, and the reliability of the final ANN will be low. However, if the number of hidden layers exceeds the required amount, then the ANN may produce highly reliable results for the current dataset but will not work properly on different datasets. This is because of a phenomenon called overfitting or overtraining, in which the ANN training process fails, making the ANN replicate, rather than predict, the exact values of the testing and validation datasets [
23].
The way in which the input data are processed and correlated with the output data makes this particular method ideal for creating forecasting models. In particular, the present work aims to create and evaluate a sufficient number of forecasting ANN models in order to select those that present the best predictive ability.
2.1. Study Area and Data Availability
The study area is the city of Beijing, China (39°54′13″ N, 116°23′17″ E), for which hourly measurements of the concentrations of the pollutants of interest, as well as meteorological conditions, were collected for the time period 1 March 2013 to 28 February 2017. The corresponding values were obtained from a free dataset by Chen Song [
25] and were obtained through the following link:
https://archive.ics.uci.edu/dataset/501/beijing+multi+site+air+quality+data, accessed on 20 December 2023. The following data were given:
Concentration of particulate matter with an aerodynamic diameter of up to 2.5 micrometers, PM2.5.
Concentration of particulate matter with an aerodynamic diameter of up to 10 micrometers, PM10.
Concentration of sulfur dioxide, SO2.
Concentration of nitrogen dioxide, NO2.
Concentration of carbon monoxide, CO.
Ozone concentration, O3.
Ambient air temperature, TEMP.
Dew temperature, DEWP.
Atmospheric pressure, PRES.
Precipitation, RAIN.
Wind speed, WSPM.
Wind direction, WD.
The above values were available for nine (9) total locations of Beijing, including the following:
Location 1: Aotizhongxin;
Location 2: Changping;
Location 3: Dongsi;
Location 4: Guanyuan;
Location 5: Gucheng;
Location 6: Nongzhanguan;
Location 7: Tiantan;
Location 8: Wanliu;
Location 9: Wanshouxigong.
The above locations are also shown in
Figure 3 (locations with an asterisk). It was then decided that the distance between the examined locations should be no more than 20 km. This decision was based on the assumption that, within a distance of 20 km, the examined locations would significantly influence one another, and the selected locations would be closer to the city’s center. Before any data processing, it was necessary to check the available datasets for non-available/missing values.
All preliminary screenings were conducted using Microsoft Excel 2021. For example, in
Table 1, the data completeness for location 3 is presented. In total, across all nine (9) areas, more than 7000 missing values were found out of a total of 420,780 items per area. The standard methodology dictates that, when checking data for missing values, the entire row in which they are located should be deleted. However, such a treatment is only possible when there is a large amount of available data and/or a small number of missing values. For a dataset of insufficient size such as this one, applying that methodology can lead to a data reduction of 20%. Therefore, it was decided not to follow this directive and instead to attempt to cover the missing data through a combination of the methods analyzed below.
2.1.1. Method 1: Using Existing Values and Microsoft Excel Commands
In order to apply this method, a basic assumption is made that consecutive values within 5 h do not differ greatly from each other. Under this assumption, the procedure is quite simple. Initially, a check is made for the existence of values within the previous 5 h and within the next 4 h. From this check, four (4) individual scenarios emerge as follows:
The existence of both values within the time limits given. In this case, the missing value is the average of the two existing values.
The existence of only the previous value within the time limits. In this case, the missing value is equal to the previous value plus/minus a certain number, which is listed in
Table 2 below.
The existence of only the next value within the time limits. In this case, the missing value is equal to the next value plus/minus a specific number, which is listed in
Table 2 below.
The simultaneous absence of two values within the time limits. In this case, the element receives the value “NA” so that it can be processed later.
Although the above method is quite effective, it is not able to cover 100% of the missing values; however, it presents an average efficiency of more than 60%. The number of missing values covered is presented in detail in
Table 3.
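Under the stated assumptions, Method 1 can be sketched as follows in Python (the original was implemented with Microsoft Excel commands; the `offset` parameter stands in for the pollutant-specific adjustments of Table 2 and is a hypothetical placeholder here):

```python
def fill_gaps(series, back=5, fwd=4, offset=0.0):
    """Method 1 sketch: fill each missing value (None) using the nearest
    existing values within `back` hours before and `fwd` hours after.
    `offset` is a placeholder for the Table 2 adjustments."""
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        prev = next((series[j] for j in range(i - 1, max(i - back, 0) - 1, -1)
                     if series[j] is not None), None)
        nxt = next((series[j] for j in range(i + 1, min(i + fwd, len(series) - 1) + 1)
                    if series[j] is not None), None)
        if prev is not None and nxt is not None:
            out[i] = (prev + nxt) / 2          # scenario 1: average of both
        elif prev is not None:
            out[i] = prev + offset             # scenario 2: previous value
        elif nxt is not None:
            out[i] = nxt + offset              # scenario 3: next value
        else:
            out[i] = "NA"                      # scenario 4: processed later
    return out

print(fill_gaps([10.0, None, 20.0]))  # [10.0, 15.0, 20.0]
```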
2.1.2. Method 2: Development of a Code within the MATLAB Programming Environment
This method also requires an assumption: that the concentrations of pollutants and the values of meteorological factors do not change significantly within an area of about ten (10) kilometers or less. Initially, the distance between the nine (9) study areas is calculated, and for each one, the closest areas are selected. Let us take location 3 (Dongsi) as an example. If the distances of the other locations are calculated in relation to location 3, it is found that locations 6, 4, 8, and 1 (in order of proximity) lie within a range of 10 km or less. It is logical that the values of the areas closest to location 3 have a greater influence on the values of interest. In the next step, the relative position of locations 1, 4, 6, and 8 with respect to location 3 (north, east, north-east, etc.) needs to be determined. Finally, the appropriate code is written within the MATLAB environment, version R2022a, which takes the wind direction into account and selects the appropriate location. In each case, the missing value in location 3 is assigned the corresponding value of the selected area. This method, though much more complicated and detailed than the first one, does not cover a large number of missing values, with a success rate of less than 20%. The remaining values are once again set to “NA” so as to be processed at a later time.
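The wind-direction-based selection of Method 2 can be sketched as follows (the original code was written in MATLAB; this Python sketch, with hypothetical station bearings and values, is illustrative only):

```python
def angular_diff(a, b):
    """Smallest absolute difference between two bearings in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def fill_from_neighbor(wind_dir_deg, neighbors):
    """Method 2 sketch: choose the neighboring station whose bearing from
    the target station best matches the wind direction (the upwind side)
    and copy its value. `neighbors` maps station -> (bearing_deg, value)."""
    station = min(neighbors,
                  key=lambda s: angular_diff(wind_dir_deg, neighbors[s][0]))
    return station, neighbors[station][1]

# Hypothetical example: wind from the north-east (45 deg) selects the
# neighbour that lies to the north-east of the target station.
neighbors = {"loc6": (40.0, 33.0), "loc4": (250.0, 28.0), "loc8": (300.0, 31.0)}
print(fill_from_neighbor(45.0, neighbors))  # ('loc6', 33.0)
```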
2.1.3. Method 3: Applying the Linear Regression Methodology
The rationale of this method is directly related to the basic idea behind Method 2. As before, the basic assumption is that the meteorological and atmospheric conditions prevailing in nearby areas describe, with relative accuracy, the conditions in the study area. First, the areas are ranked in order of proximity. Then, the values of paired stations are plotted against each other, producing graphs in which the linear correlation between the values is evident. Finally, the resulting linear equations are extracted and applied, in order of proximity, to the missing values of each region. This method offers 100% coverage of the remaining “NAs”.
Figure 4 shows indicatively the graphs concerning the linear correlation between PM
2.5 concentrations for location 3 in relation to locations 1, 4, 6, and 8, respectively.
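Under the stated assumption, Method 3 reduces to an ordinary least-squares fit between paired stations. A minimal sketch with hypothetical values (not the paper's actual regressions):

```python
def linfit(x, y):
    """Ordinary least squares fit y ≈ a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return a, my - a * mx

def regression_fill(target, neighbor):
    """Method 3 sketch: fit the target station's values against a nearby
    station's values on hours where both exist, then fill the target's
    remaining gaps from the fitted line."""
    pairs = [(nb, tg) for nb, tg in zip(neighbor, target) if tg is not None]
    a, b = linfit([p[0] for p in pairs], [p[1] for p in pairs])
    return [tg if tg is not None else a * nb + b
            for tg, nb in zip(target, neighbor)]

# Hypothetical PM2.5 series: location 3 with one gap, filled from a neighbor.
print(regression_fill([10.0, 12.0, None, 16.0], [5.0, 6.0, 7.0, 8.0]))
```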
2.2. Data Preparation
Before the training process of the ANNs can begin, the appropriate processing of the input data is necessary. Initially, it was necessary to check the correlation of the data variables in order to reveal any existing relationships among them. Therefore, the correlation coefficient between the variables was calculated for each region, and in the end, the average was taken into account, as presented in
Table 3. However, despite the attempt to extract these “hidden” relationships through the correlation coefficient,
Table 3 does not offer much insight. As an alternative method for determining optimal combinations, the Principal Components Analysis (PCA) technique was used, in combination with the k-means clustering method, which indicated that 40% of the information could be described solely by the temperature variables. In practice, however, this did not hold. Although, strictly following the methodology, we could have settled for the PCA results alone, in the present paper further scenarios were analyzed, which proved better than the one proposed by the PCA. After all, Fabiana Franceschi et al. [
20] found a positive correlation between PM
10 concentration and wind direction, while a negative correlation was observed between the concentration of the same pollutant in terms of temperature and wind speed. These findings are further reinforced by Zhang et al. [
26] in earlier research, in Beijing City, China. Accordingly, for PM
2.5, a positive correlation with relative humidity was observed [
20], which is also verified by this specific research work.
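As an illustration of the correlation screening described above, the Pearson correlation coefficient between two variables can be computed as follows (the study performed this step with its own tooling; this Python sketch, with hypothetical temperature and dew-point series, is purely illustrative):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

temp = [10.0, 15.0, 20.0, 25.0]
dewp = [5.0, 9.0, 13.0, 17.0]   # perfectly linear in temp for this example
print(round(pearson(temp, dewp), 6))  # 1.0
```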
2.3. Creation of Scenarios
The scenarios that were created were the same for each location; however, each of them was studied separately. The goal was to train ANN forecasting models for each location and merge them into a universal algorithm. Each location “includes” six (6) pollutants (PM
2.5, PM
10, SO
2, NO
2, CO, and O
3), where each pollutant consists of eight (8) scenarios, and each scenario includes ten (10) different ANN models. In total, 4320 forecasting ANN models were developed (9 locations × 6 pollutants × 8 scenarios × 10 models). The training data for each scenario consisted of a different combination of meteorological variables, using the hourly values of the three previous days as inputs and the next 24 hourly concentrations as outputs. More specifically, the developed ANN models are able to forecast the hourly air pollutant concentration for the next 24 h based on the hourly values of the previous three (3) days. To make this more understandable, let us assume that today is Sunday. At any hour on Sunday, the developed ANN models can give the concentration of each pollutant for the next 24 h (the hourly forecasting step) until Monday (the next day’s forecasting horizon). In any case, the hourly values of the necessary parameters (see the scenarios in
Table 4) of the three previous days, which are Thursday, Friday, and Saturday, are taken into account. The specific structure gives the developed ANN forecasting models an operational interest since the forecast can be made at any time during the day for the next day, giving an advantage in making correct, valid, and timely decisions by the competent agencies.
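The input/output structure described above can be sketched as a simple sliding-window routine. For illustration, a single hourly variable is assumed; the paper's actual models use several meteorological variables per scenario.

```python
def make_samples(hourly, in_hours=72, out_hours=24):
    """Pair each window of the previous three days of hourly values (72 h)
    with the next 24 hourly concentrations to be forecast."""
    samples = []
    for t in range(in_hours, len(hourly) - out_hours + 1):
        x = hourly[t - in_hours:t]      # previous 3 days (inputs)
        y = hourly[t:t + out_hours]     # next 24 h (forecast horizon)
        samples.append((x, y))
    return samples

series = list(range(200))               # stand-in for one hourly variable
samples = make_samples(series)
print(len(samples), len(samples[0][0]), len(samples[0][1]))  # 105 72 24
```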
For a better study and analysis of the data, it was considered appropriate to add one more variable, relative humidity (RH), as it combines dew point temperature (DEWP) and dry bulb temperature (TEMP) data. The developed scenarios are described in
Table 4. RH was calculated using Equation (1) [27].
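Equation (1) itself is not reproduced here; a common formulation relating RH to the dry-bulb (TEMP) and dew-point (DEWP) temperatures is the Magnus approximation, sketched below. The constants are those of a widely used parameterization and may differ from those of the paper's Equation (1).

```python
import math

def relative_humidity(temp_c, dewp_c):
    """Approximate RH (%) from dry-bulb and dew-point temperatures (deg C)
    via the Magnus formula; constants are one common parameterization."""
    a, b = 17.625, 243.04
    e_dew = math.exp(a * dewp_c / (b + dewp_c))   # vapor pressure term
    e_sat = math.exp(a * temp_c / (b + temp_c))   # saturation term
    return 100.0 * e_dew / e_sat

# Saturated air (dew point equals air temperature) gives RH = 100%.
print(round(relative_humidity(20.0, 20.0), 1))  # 100.0
```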
As mentioned, for each scenario, 10 different ANNs were trained, which were then evaluated, and the best one was selected. For reasons of repeatability in the experiment, the architecture of the developed ANNs is listed in
Table 5. The training functions were chosen so that a secondary evaluation of the ways in which the training process could be optimized could also be performed. Specifically, the software’s default training function (Levenberg–Marquardt) was chosen to create the first ANN, with its individual parameters adjusted to require “medium” computational power. The next four (4) ANNs were trained with the Bayesian Regularization training algorithm, which is suitable for training ANNs aimed at pattern recognition. ANNs number 6 to 10 are examples of other training functions, chosen in an effort to find possible functions with better performance. Among the ten (10) developed ANN models, the first five proved to be the most “demanding”, as they needed a lot of computational power. However, as presented in the next section, they consistently offered the most valid results. On the contrary, the second half of
Table 5, although less “demanding”, provided very unstable results, with close to zero practical utility. In fact, none of the last five (5) training functions proved suitable for training ANN models of this kind. The topologies/architectures of the developed ANN models were created in such a way that they could be evaluated by the same standards while still allowing the training process to be carried out on a conventional desktop computer.
Furthermore, the initial dataset was split (randomly) into three subsets: the training subset, containing 70% of the total data volume; the cross-validation subset, containing 15%; and the testing subset, containing the remaining 15%. Finally, for all of the ten developed ANNs in
Table 5, the number of training epochs was equal to 200.
2.4. Software and Infrastructure
The following software were used for the preparation of this work:
MATLAB R2022a;
Microsoft Excel 2021.
The training of ANNs was carried out on a home desktop computer with the following specifications:
CPU: AMD Ryzen 7 5700G (Advanced Micro Devices, Inc., Santa Clara, CA, USA);
RAM: G.Skill Ripjaws V 16GB DDR4-3200MHz (G.SKILL International Enterprise, Taipei, Taiwan);
GPU: N/A;
SSD Kingston NV1 500GB M.2 NVMe (SNVS/500G) (Kingston Technology Corporation, Fountain Valley, CA, USA).
In order to evaluate the above neural networks, it was deemed necessary to calculate eight (8) statistical indices [
28]. These indices and their equations are listed below: Equations (2) to (5) describe the statistical evaluation indices for the performance of the models in predicting the next 24 hourly concentrations, whereas Equations (6) to (9) represent the evaluation indices that determine the predictive ability of each developed ANN model with respect to a certain threshold. More specifically, they determine whether the developed model is able to predict the exceedances, in other words, the cases where the pollutant concentration exceeded a specific threshold value. These threshold concentrations are determined by the reference values for each pollutant according to the WHO [
29].
where (Oi) and (Pi) represent the observed and the predicted values, respectively, (Omean) is the mean of the observed values, and (n) is the number of observations in each case.
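The eight indices are not reproduced as equations here; the sketch below implements standard formulations (MAE, RMSE, Willmott's index of agreement, and threshold-based exceedance indices) consistent with the descriptions above. The exact definitions of the paper's Equations (2)–(9) may differ; in particular, SI is taken here, as an assumption, to be the overall fraction of correctly classified hours.

```python
import math

def mae(o, p):
    """Mean absolute error between observed and predicted series."""
    return sum(abs(a - b) for a, b in zip(o, p)) / len(o)

def rmse(o, p):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(o, p)) / len(o))

def ia(o, p):
    """Willmott's index of agreement (standard formulation; assumed here)."""
    om = sum(o) / len(o)
    num = sum((a - b) ** 2 for a, b in zip(o, p))
    den = sum((abs(b - om) + abs(a - om)) ** 2 for a, b in zip(o, p))
    return 1.0 - num / den

def exceedance_indices(o, p, threshold):
    """TPR, FPR, FAR, and SI (here: overall fraction of correctly
    classified hours) for a given concentration threshold."""
    tp = sum(a > threshold and b > threshold for a, b in zip(o, p))
    fn = sum(a > threshold and b <= threshold for a, b in zip(o, p))
    fp = sum(a <= threshold and b > threshold for a, b in zip(o, p))
    tn = sum(a <= threshold and b <= threshold for a, b in zip(o, p))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    far = fp / (tp + fp) if tp + fp else 0.0
    si = (tp + tn) / len(o)
    return tpr, fpr, far, si

# Hypothetical observed/predicted concentrations for four hours.
obs = [10.0, 30.0, 50.0, 70.0]
pred = [12.0, 28.0, 55.0, 66.0]
print(round(mae(obs, pred), 3), round(rmse(obs, pred), 3))  # 3.25 3.5
```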
3. Results and Discussion
Table 6 depicts the values of the statistical evaluation indices (MAE, RMSE, R, and IA) for the best-developed ANN model for each one of the six air pollutants among the nine examined locations and for all of the eight examined scenarios. According to
Table 6, it seems that for particulate matters and NO
2, ANN#5 has the best predictive performance. For SO
2, O
3, and CO, ANN#4 presents the best predictive ability. In all cases, the most suitable scenarios are S2 and S1 (see
Table 4). Location 8 seems to have the best forecasting ability, especially for air pollutants NO
2, SO
2, O
3, and CO. This may lead to the conclusion that sites with similar zoning to location 8 could provide more reliable data, thus assisting with the better training of ANN forecasting models, which could then serve as the basis for air pollution forecasting in other locations. Concerning the general forecasting ability, the correlation coefficient (R) lies between 0.911 and 0.954, and the index of agreement (IA) lies between 95.31% and 97.64%. Both indicate a very good forecasting ability for the next 24 h with an hourly forecasting step.
Table 7 shows the values of the exceedance statistical evaluation indices (TPR, FPR, FAR, and SI) for the best-developed ANN model and for each one of the six air pollutants among the nine examined locations and for all of the eight examined scenarios.
Concerning the forecasting of exceedances, i.e., the cases where the concentration of the examined pollutants is greater than a given threshold value based on the WHO directives, and whether these are forecasted correctly or not, it seems that in all cases, model ANN#5 gives the best prediction. Also, location 8 could again be considered as a reference location in future works. Furthermore, S1 and S2 were found to be the best training scenarios among the eight examined. Finally, TPR gives the rate of exceedances that are observed and correctly forecasted, lying between 88.16% and 97.75%, while SI shows the overall ability to forecast exceedances correctly, lying between 93.113% and 99.86%. Both indicate that the developed ANN models are able to provide very good and sufficient forecasting of the exceedances.
In an effort to derive the general behavior of the developed ANN models in terms of their forecasting ability, appropriate Box and Whisker graphs were created. The necessary data for the design of these graphs were the maximum, average, and minimum values of IA (
Figure 5) and SI (
Figure 6), respectively, concerning all of the developed ANNs for each air pollutant and for each one of the nine examined locations within the greater area of Beijing, China. More concretely, the data comprised the best performance values for each pollutant and the optimal training function; therefore, 72 values in total were used (9 locations × 8 input data training scenarios).
Figure 5 shows that the mean values of IA for all of the developed ANN models and all of the forecasted air pollutants lie between 0.92 and 0.95, indicating an extremely good forecasting performance. In addition, it seems that the developed ANN models are able to forecast next-day 24 h concentrations of SO
2, PM
2.5, CO, and O
3 in a more sufficient manner than NO
2 and PM
10.
In the same fashion as
Figure 5, but concerning the ability of the developed ANNs to forecast air pollutant concentration exceedances,
Figure 6 shows that the mean values of SI for all of the developed ANN models and all of the examined air pollutants lie between 0.88 and 0.95, indicating an extremely good forecasting performance. In addition, it seems that the developed ANN models are able to forecast the exceedances of next-day SO
2, PM
2.5, CO, and O
3 at a more sufficient level than NO
2 and much more than PM
10.
4. Conclusions
The aim of this work was to create a universal forecasting model of air pollutant concentrations, specifically PM2.5 and PM10, as well as SO2, NO2, CO, and O3. A basic condition for the training of these models is the appropriate pre-processing of the input data in order to extract any hidden relationships, which can facilitate the subsequent creation of scenarios. The gap-filling process was an innovation not found in any of the aforementioned literature and offered satisfactory results. The algorithms created for this process are also universal, and their application to any other dataset is feasible with minor adjustments. Additionally, eight possible input scenarios were compared, and each was trained with ten different training functions (ANN model architectures). The results offered valuable insight into optimizing the training of ANN models with the pollutant of interest held constant. The results were acceptable for the majority of pollutants, with average prediction values well above 80.0% accuracy. Further statistical analysis showed that the use of all of the available meteorological variables enhanced the training performance of the ANNs and did not “confuse” them. It was also observed that the BR4 (ANN#5) structure was the most suitable. The resulting model can be used by both public and private bodies, as the immediate information it provides can assist them in protecting themselves from harmful environmental conditions.
In conclusion, the developed ANN forecasting models show an operational interest since the forecast can be made at any time during the day or for the next day, giving an advantage to competent agencies in making correct, valid, and timely decisions.
Finally, further research is suggested in order to improve the prediction of air pollution so that the developed models achieve an optimal design and performance for public and private authorities’ decision making, aiming at the protection of public health and the avoidance of adverse health effects, as well as adverse effects on construction and infrastructure, while also taking the climate crisis into account.