Article
Peer-Review Record

Air Quality Index Prediction in Six Major Chinese Urban Agglomerations: A Comparative Study of Single Machine Learning Model, Ensemble Model, and Hybrid Model

Atmosphere 2023, 14(10), 1478; https://doi.org/10.3390/atmos14101478
by Binzhe Zhang 1,2,†, Min Duan 2,†, Yufan Sun 2, Yatong Lyu 2, Yali Hou 3,* and Tao Tan 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 12 August 2023 / Revised: 20 September 2023 / Accepted: 21 September 2023 / Published: 24 September 2023
(This article belongs to the Section Air Quality)

Round 1

Reviewer 1 Report

This study proposed a method for predicting air quality using seven single and ensemble learning models and a hybrid learning algorithm. The main aim is to construct multiple classes of machine learning models for predicting the AQI in major urban agglomerations in China. In addition, it compares the effectiveness of different types of algorithmic models in predicting air quality and ultimately provides an effective framework for model performance analysis.

 

The authors gave a relatively thorough literature review. The methods are described clearly, and the results discussion is easy to follow. Overall, the applications of these machine learning models are well elaborated. The idea presented in the paper represents a sizeable effort to extend and develop a tool that provides valuable information for environmental management and sustainable development. However, I do think the current manuscript needs to be improved in several respects, especially the training and introduction sections. The following are my suggestions; please make the necessary modifications.

 

1.     Data set: Why did the authors select these inputs and outputs? It would be helpful if the authors could present a more compelling motivation for why these particular quantities of interest are worth knowing; providing some references would help as well. I understand that the data source from the China National Environmental Monitoring Center includes these variables, but not all inputs are necessarily significant. It would also help if the authors could give some analysis of feature significance, such as the F-score.

 

2.     Data set: What is the size of the training dataset? Moreover, to justify the size of the training set, it would be helpful, for instance, to show a figure of the RMSE on the test set for different training-set sizes (for instance from 300 to 1200). If significant variations are observed, this would justify your choice of training set; otherwise, the choice appears arbitrary.

 

3.     Introduction: Please clearly present the significance of this study. The aims and results mentioned in this section are good, but it would be nice to stress the significance more explicitly.

 

4.     Introduction: The authors give a fairly thorough literature review on the application of machine learning models to AQI prediction, but one significant aim of this study seems to be providing an effective framework for model performance analysis by comparing the effectiveness of different types of algorithmic models, and I agree. Therefore, the authors should add some literature discussion on this model comparison aspect; this paper is a good example to discuss: https://doi.org/10.2514/1.I011047, since it similarly compared multiple ML models and proposed an effective framework for model performance analysis.

 

5.     Introduction: References should be given for the machine learning models mentioned in the Introduction section, such as KNN, XGBT, etc.

 

6.     Model implementation or training: The training process is quite important, and the hyperparameters, especially for the ensemble models, can be complex to tune. Can you add some discussion? What is the tuning process, and what parameter values were applied?

 

7.     Training: Can the authors provide some training time information? Is it CPU time or process time? Moreover, can the authors specify the hardware used (how many CPUs, the type of processor, etc.)?

 

8.     Results: Two suggestions: first, add the feature significance analysis, which should already be reported by the machine learning models; second, include a Taylor diagram graphical presentation for comparing the performance of these models, as shown in the article “Comparison of Machine Learning Models for Data-Driven Aircraft Icing Severity Evaluation”. It is simply a clear presentation of the performance comparison.

 

9.     Results: Overall, I think the cases studied by the authors are very insightful and the authors gave a very thorough description of how the cases are constructed and interpreted the figures very well.

10.  Conclusion: It is good that the limitations are listed, and the future work given by the authors is very thorough, which is a valuable addition to the current work.

 

11.  Please make the necessary modifications based on the comments above.

Author Response

To Reviewer #1

Q1. Data set: Why did the authors select these inputs and outputs? It would be helpful if the authors could present a more compelling motivation for why these particular quantities of interest are worth knowing; providing some references would help as well. I understand that the data source from the China National Environmental Monitoring Center includes these variables, but not all inputs are necessarily significant. It would also help if the authors could give some analysis of feature significance, such as the F-score.

Regarding this issue, we have added some references in the data source section. These studies apply various air pollutants comprehensively to predict air quality, and some also combine meteorological data, which further motivates our choice of six air pollutants and five meteorological variables for this research.

For example, the BA-LSSVM algorithm constructed by Wu et al. considers air pollutant factors such as PM2.5, PM10, CO, SO2, NO2, and O3 and combines the predicted values from each component to obtain AQI predictions. Senthil Kumar et al. not only used the six pollutants but also added nitrogen oxide and benzene pollutants to predict the ambient air quality index of cities in southern India.
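For illustration, a minimal sketch of the kind of F-score feature-significance check the reviewer suggests is shown below, using scikit-learn's f_regression. The file name and column names are hypothetical placeholders for the six pollutants and five meteorological variables, not the exact identifiers used in the paper.

```python
# Minimal sketch: univariate F-score significance of candidate input features
# against the AQI target (file and column names are illustrative placeholders).
import pandas as pd
from sklearn.feature_selection import f_regression

df = pd.read_csv("aqi_daily.csv")  # hypothetical daily records

feature_cols = ["PM2.5", "PM10", "SO2", "NO2", "CO", "O3",
                "temperature", "humidity", "wind_speed", "pressure", "precipitation"]
X = df[feature_cols].values
y = df["AQI"].values

f_scores, p_values = f_regression(X, y)  # F-statistic and p-value per feature
report = (pd.DataFrame({"feature": feature_cols,
                        "F_score": f_scores,
                        "p_value": p_values})
          .sort_values("F_score", ascending=False))
print(report.to_string(index=False))
```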

 

Q2. Data set: What is the size of the training dataset? Moreover, to justify the size of the training set, it would be helpful, for instance, to show a figure of the RMSE on the test set for different training-set sizes (for instance from 300 to 1200). If significant variations are observed, this would justify your choice of training set; otherwise, the choice appears arbitrary.

In Section 4.1 (Data Preparation), we added a comparison of model accuracy under different training-test split proportions in order to select the optimal proportion. We compared two approaches: random splitting and time-series splitting. In the first approach, the data were divided randomly into training and test sets in proportions ranging from 60% training / 40% test to 90% training / 10% test; seven different split proportions were examined, with the training and test samples selected at random. The second approach is a time-series split, where 24:24 means that data from 2017 to 2018 are used as the training set and data from 2019 to 2020 as the test set, and 36:12 means that data from 2017 to 2019 are used as the training set and data from 2020 as the test set. Based on the results, we ultimately used the data from 2017 to 2019 as the training set and the data from 2020 as the test set (a minimal sketch of this comparison is given below).

At the end of this section, we also added the specific sizes of the training and test sets for the different urban agglomerations.
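For illustration, the split comparison described above could be implemented roughly as follows; the estimator, file name, and column names are assumptions for the sketch, not the paper's exact implementation.

```python
# Minimal sketch: test-set RMSE under random split proportions (60/40 to 90/10)
# versus a year-based 36:12 time-series split (names are illustrative placeholders).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

df = pd.read_csv("aqi_daily.csv")              # hypothetical daily records with a 'year' column
X = df.drop(columns=["AQI", "year"]).values
y = df["AQI"].values
years = df["year"].values

def rmse_for_split(X_tr, y_tr, X_te, y_te):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_tr, y_tr)
    return np.sqrt(mean_squared_error(y_te, model.predict(X_te)))

# Random splits: seven proportions from 60% to 90% training data.
for train_frac in np.arange(0.60, 0.91, 0.05):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=float(train_frac), random_state=0)
    print(f"random {train_frac:.0%} train: RMSE = {rmse_for_split(X_tr, y_tr, X_te, y_te):.2f}")

# Time-series 36:12 split: train on 2017-2019, test on 2020.
mask = years <= 2019
print(f"36:12 split: RMSE = {rmse_for_split(X[mask], y[mask], X[~mask], y[~mask]):.2f}")
```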

 

Q3. Introduction: Please clearly present the significance of this study. The aims and results mentioned in this section are good, but it would be nice to stress the significance more explicitly.

We emphasize the significance of this research at the end of the introduction, including:

(1) We have collected air quality data from six major urban agglomerations in China and established four single models (LR, KNN, SVR, LSTM), three ensemble models (RF, XGBT, LGBM), and one hybrid model (LSTM-SVR). Using six air pollutant concentrations and five meteorological factors, we predict AQI values, which allows an effective and comprehensive comparison of different types of algorithmic models in predicting air quality.

(2) The predictive performance of all models was evaluated using RMSE, MAE, and R2. The ensemble model RF and the hybrid model LSTM-SVR performed well, with LSTM-SVR exhibiting lower RMSE in the BTH-UA and CP-UA regions. The constructed hybrid LSTM-SVR model therefore has practical value for predicting air quality in highly polluted areas.
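For reference, the three evaluation metrics named above can be computed as in this minimal scikit-learn sketch; the dummy arrays are illustrative and not results from the paper.

```python
# Minimal sketch: RMSE, MAE, and R2 for comparing predicted and observed AQI.
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate(y_true, y_pred):
    return {
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        "R2": float(r2_score(y_true, y_pred)),
    }

# Example with dummy values (illustrative only).
print(evaluate(np.array([75.0, 80.0, 120.0, 95.0]),
               np.array([70.0, 85.0, 110.0, 90.0])))
```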

 

Q4. Introduction: The authors give a fairly thorough literature review on the application of machine learning models to AQI prediction, but one significant aim of this study seems to be providing an effective framework for model performance analysis by comparing the effectiveness of different types of algorithmic models, and I agree. Therefore, the authors should add some literature discussion on this model comparison aspect; this paper is a good example to discuss: https://doi.org/10.2514/1.I011047, since it similarly compared multiple ML models and proposed an effective framework for model performance analysis.

We are very grateful for this suggestion and for the reference you provided. We have added two references at the end of the fifth paragraph of the introduction to point out that only a small number of studies specifically compare the performance of different types of algorithms while analyzing practical problems.

 

Q5. Introduction: References should be given for the machine learning models mentioned in the Introduction section, such as KNN, XGBT, etc.

We have added a list of commonly used machine learning models in the fourth paragraph of the introduction and provided references to these models.

 

Q6. Model implementation or training: The training process is quite important, and the hyperparameters, especially for the ensemble models, can be complex to tune. Can you add some discussion? What is the tuning process, and what parameter values were applied?

Thank you very much for this suggestion, which has made our research more rigorous.

We have added a new Section 4.2 (Model parameter tuning), which discusses the use of grid search to tune the parameters of the various models, and created Table 7 to display the optimal parameters.
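As an illustration of this kind of grid search (not the paper's exact grids or the values reported in Table 7), a minimal scikit-learn sketch with synthetic stand-in data:

```python
# Minimal sketch: grid search over a few RandomForest hyperparameters.
# The grid values and the synthetic data are illustrative, not those of the paper.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 11))                             # stand-in for 11 input features
y_train = X_train @ rng.normal(size=11) + rng.normal(size=500)   # stand-in AQI target

param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 2, 5],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)
print("best CV RMSE:", -search.best_score_)
```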

 

Q7. Training: Can the authors provide some training time information? Is it CPU time or process time? Moreover, can the authors specify the hardware used (how many CPUs, the type of processor, etc.)?

We have added training time information for each model at the end of Section 4.2 (Model parameter tuning) and provided the relevant explanations.

LR and KNN have the shortest training times. Among the ensemble models, XGBT takes the longest and LGBM the shortest. The hybrid LSTM-SVR model includes the grid search steps as well as the LSTM and SVR predictions, so it has the longest training time of all the models.
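For illustration, wall-clock time and CPU (process) time can be reported side by side as in this minimal sketch; the estimator and the synthetic data are placeholders for any of the trained models.

```python
# Minimal sketch: report both wall-clock and CPU/process time for model training.
import time
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 11))                             # stand-in features
y_train = X_train @ rng.normal(size=11) + rng.normal(size=500)   # stand-in target

model = LinearRegression()        # placeholder; any estimator could be timed this way
wall_start = time.perf_counter()  # wall-clock (elapsed) time
cpu_start = time.process_time()   # CPU time consumed by this process
model.fit(X_train, y_train)
print(f"wall-clock: {time.perf_counter() - wall_start:.3f} s, "
      f"CPU/process: {time.process_time() - cpu_start:.3f} s")
```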

 

Q8. Results: Two suggestions: first, add the feature significance analysis, which should already be reported by the machine learning models; second, include a Taylor diagram graphical presentation for comparing the performance of these models, as shown in the article “Comparison of Machine Learning Models for Data-Driven Aircraft Icing Severity Evaluation”. It is simply a clear presentation of the performance comparison.

We have added a new Section 4.6 (Taylor diagram graphical presentation of the models), in which we provide a Taylor diagram for each urban agglomeration and the corresponding analysis.
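A Taylor diagram summarizes each model by the standard deviation of its predictions, their correlation with the observations, and the centered RMSE. A minimal sketch of computing these statistics is given below (the dummy arrays are illustrative; the actual plotting can then be done with any Taylor diagram utility).

```python
# Minimal sketch: the statistics a Taylor diagram displays for each model.
import numpy as np

def taylor_stats(y_obs, y_pred):
    corr = np.corrcoef(y_obs, y_pred)[0, 1]
    # Centered RMSE: RMSE after removing the mean bias of each series.
    crmse = np.sqrt(np.mean(((y_pred - y_pred.mean()) - (y_obs - y_obs.mean())) ** 2))
    return {"std_obs": float(np.std(y_obs)),
            "std_pred": float(np.std(y_pred)),
            "corr": float(corr),
            "centered_RMSE": float(crmse)}

# Example with dummy arrays (illustrative only).
obs = np.array([60.0, 75.0, 90.0, 110.0, 130.0])
pred = np.array([65.0, 70.0, 95.0, 100.0, 125.0])
print(taylor_stats(obs, pred))
```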

Author Response File: Author Response.docx

Reviewer 2 Report

The paper presents a comparison of data-driven algorithms for AQI prediction in various urban agglomerations of China. The topic is interesting, but the paper should be updated according to the following comments:

1) A formal statement of the prediction problem is missing. This statement should at least describe how many single-day data bundles from the past are used to predict the AQI and for how many days into the future the AQI is predicted.

2) If the machine-learning algorithms are used to solve the same prediction problem, then their descriptions should share common variables and notation. The description of the machine learning methods should be given according to the prediction problem's statement, and it should be clear how y^test is evaluated in each case. Terms like "samples" and "feature space" should be defined in the context of the work. For example, it is not clear what \varphi, f(x), a_{i}, b_{i}, etc. are. Are the \sigma for SVR and the \sigma in Figure 4 the same?

3) The description of LSTM-SVR should be given in Section 3.3.

4) It is not quite clear how to use these predictions in decision making, since it seems that they do not explain the reasons for good or bad AQI (as model-based approaches do). Some discussion should be added regarding this question.

Author Response

Q1: A formal statement of the prediction problem is missing. This statement should at least describe how many single-day data bundles from the past are used to predict the AQI and for how many days into the future the AQI is predicted.

Thanks for your suggestion!

We have adopted a new approach to address this issue. At the end of Section 4.1 (Data Preparation), we added the specific sizes of the training and test sets for the different urban agglomerations.

We have also added the following statement: Based on the above analysis and empirical research on AQI time-series prediction, data from 2017 to 2019 were used as the training set, with a total of 100,754 single-day records used for AQI prediction training; the data from 2020 were used as the test set, with a total of 33,904 single-day records used to predict the AQI.

 

Q2: If the machine-learning algorithms are used to solve the same prediction problem, then their descriptions should share common variables and notation. The description of the machine learning methods should be given according to the prediction problem's statement, and it should be clear how y^test is evaluated in each case. Terms like "samples" and "feature space" should be defined in the context of the work. For example, it is not clear what \varphi, f(x), a_{i}, b_{i}, etc. are. Are the \sigma for SVR and the \sigma in Figure 4 the same?

Because the methods differ, some of the terms differ as well, and not all of them can be expressed solely in terms of the training- and test-set notation. In response to this suggestion, we have therefore added explanations for the notations that were previously unexplained or insufficiently explained.

 

Q3: The description of LSTM-SVR should be given in Section 3.3.

We have placed the description of LSTM-SVR in Section 3.3.8.
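Since the exact coupling is described only in the revised manuscript, the sketch below shows just one common way to chain an LSTM with an SVR (feeding the LSTM's hidden features to an SVR on a synthetic series). The window length, layer size, kernel, and all other parameters are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch of one common LSTM -> SVR hybrid pattern (illustrative only;
# not necessarily the architecture or hyperparameters used in the paper).
import numpy as np
from tensorflow import keras
from sklearn.svm import SVR

def make_windows(series, window=7):
    """Turn a 1-D series into (samples, window, 1) inputs and next-day targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)

series = np.sin(np.linspace(0, 20, 400)) * 50 + 100   # synthetic stand-in for a daily AQI series
X, y = make_windows(series)
split = int(0.8 * len(X))
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

# 1) The LSTM learns a temporal representation of each input window.
inputs = keras.Input(shape=(X.shape[1], 1))
hidden = keras.layers.LSTM(32)(inputs)
output = keras.layers.Dense(1)(hidden)
lstm_model = keras.Model(inputs, output)
lstm_model.compile(optimizer="adam", loss="mse")
lstm_model.fit(X_tr, y_tr, epochs=20, batch_size=32, verbose=0)

# 2) The SVR is then fitted on the LSTM's hidden features instead of the raw window.
encoder = keras.Model(inputs, hidden)
svr = SVR(kernel="rbf", C=10.0)
svr.fit(encoder.predict(X_tr, verbose=0), y_tr)
y_pred = svr.predict(encoder.predict(X_te, verbose=0))
print("test RMSE:", float(np.sqrt(np.mean((y_pred - y_te) ** 2))))
```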

 

Q4: It is not quite clear how to use these predictions in decision making, since it seems that they do not explain the reasons for good or bad AQI (as model-based approaches do). Some discussion should be added regarding this question.

Thank you for your suggestion. We acknowledge that this point is a weakness of our research, and we have already noted the shortcomings at the end of the paper. Based on your suggestion, we further emphasize the limitations of this study:

Fourth, this study only explored the predictive performance of different models and did not go further into measuring feature importance for the AQI with the high-performing models, performing feature selection, or constructing new features. This article therefore mainly offers a model-selection perspective on AQI prediction; it provides little discussion of the causes that affect the AQI or of decision-making recommendations for the treatment of specific pollutants.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have addressed all my questions, and significantly improved the manuscript. I recommend Accept in present form.

Author Response

Thank you very much for your review and modification feedback on our research!
