 
 
Article
Peer-Review Record

Forecasting of PM2.5 Concentration in Beijing Using Hybrid Deep Learning Framework Based on Attention Mechanism

Appl. Sci. 2022, 12(21), 11155; https://doi.org/10.3390/app122111155
by Dong Li 1,2,3,4, Jiping Liu 1,2 and Yangyang Zhao 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 21 September 2022 / Revised: 26 October 2022 / Accepted: 2 November 2022 / Published: 3 November 2022

Round 1

Reviewer 1 Report

The article concerns the application of deep-learning algorithms to air quality prediction. The subject is relevant and fits the scope of the journal. The presented method is interesting, but it requires major corrections to be suitable for publication.

Below are general and detailed remarks:

The application of the Bi-LSTM needs more discussion/motivation, since it suggests that the future state of the atmosphere (meteorology, air quality) influences the previous state, which is contrary to current knowledge of atmospheric processes. Predicting the output based on a future state seems questionable (L187-188). The surprisingly good model performance, in both the short-term and long-term analyses, may suggest that future data are used in the predictions. A more detailed description of the dataflow in the experiment is required.

The application of the attention mechanism seems obvious, since it is widely used across a broad range of environmental modelling techniques. A function describing the different weights assigned to the contributions of states from different past moments is sometimes called a time distribution function. While in many ecosystems such a function appears stable in time (e.g., groundwater systems, river catchments), applying this approach to atmospheric processes needs more argumentation, because a variable synoptic situation may significantly influence and modify such a function (e.g., periods of strong winds and calm periods dramatically change the dispersion conditions in the atmosphere, influencing the air quality).

The analysis of short-term performance is inconsistent and incomplete. The total model performance intercomparison is reported for parameters which are not optimal (WS=24, EP=100), whereas the later parts presenting the dependence on window size and number of epochs report (WS=56, EP=150). Moreover, there are no overfitting analysis results (e.g., a comparison of performance on the training and testing sets).

Similar notes apply to the next section, where total model performance is reported for 13-24 h, while other parts concern forecasts up to 96 h.

The authors declare the use of data from the period 2013-2017, while in the results only data from December 2016 and January 2017 are used. How does the model perform in other seasons?

The discussion section is very superficial and does not discuss the possible reasons why the model proposed by the authors performs better than the others.

L17: LSTM is not defined

L114: “the framework of the framework”

L117-118: From Fig. 1 it looks like, in the first step, the data from every station are analysed by 1D CNNs separately, and in the next step the data are concatenated. It is not clear how such an approach allows the model to “capture trend features between some sites”. Correct the text or Fig. 1.

L144-145: This information has already been included in lines 120-122.

Table 1: Wind direction is expressed symbolically, while the specified unit is ‘°’.

Paragraph 3.3: The network geometry is described without any justification of the reported neuron numbers in the hidden layers. Did the authors perform any optimisation of the geometry?

Figures 3 and 4 can be merged into one, since splitting into months or seasons does not bring any new information.

Figure 5: The picture shows spatial rather than spatiotemporal variability (there is no temporal dimension in the picture). The presented distribution does not seem realistic; in particular, the elevated levels of PM2.5 in the mountainous regions (north east) are strange. The distribution looks as if it was interpolated far beyond the station locations.

L305: All acronyms used for the first time should be expanded.

Figure 11 can be replaced by the table reporting fitting parameters.

Author Response

Comments:

 

Point 1. The application of the Bi-LSTM needs more discussion/motivation, since it suggests that the future state of the atmosphere (meteorology, air quality) influences the previous state, which is contrary to current knowledge of atmospheric processes. Predicting the output based on a future state seems questionable (L187-188). The surprisingly good model performance, in both the short-term and long-term analyses, may suggest that future data are used in the predictions. A more detailed description of the dataflow in the experiment is required.

 

Response 1: Thank you for your suggestion. We agree with the reviewer.

Following the reviewer's suggestions, we have reorganized the previous related work on pollutant prediction and other aspects in the revised manuscript, stated the relevant questions and objectives, and revised the introduction of the paper.
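As a generic illustration only (a minimal sketch with hypothetical window size, feature count, and layer widths, written against the Keras API rather than our exact implementation), the snippet below shows how a Bi-LSTM can be applied solely to the historical input window: "bidirectional" means that past window is read forwards and backwards, not that future observations are fed into the forecast.

```python
# Minimal sketch (hypothetical sizes, Keras-style API): the Bi-LSTM processes
# only the past window of inputs; bidirectionality reads that window in both
# directions and does not use future data as predictors.
import tensorflow as tf
from tensorflow.keras import layers, models

window_size = 24   # hours of past data fed to the model (assumption)
n_features = 12    # pollutant + meteorological variables (assumption)
horizon = 24       # hours of PM2.5 to forecast (assumption)

model = models.Sequential([
    layers.Input(shape=(window_size, n_features)),
    layers.Bidirectional(layers.LSTM(64)),  # reads the historical window forwards and backwards
    layers.Dense(horizon),                  # outputs the next `horizon` PM2.5 values
])
model.compile(optimizer="adam", loss="mse")

# Training pairs are constructed so that inputs end strictly before the target period:
#   X[t] = data[t - window_size : t],  y[t] = pm25[t : t + horizon]
```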

 

 

Point 2. The application of the attention mechanism seems obvious, since it is widely used across a broad range of environmental modelling techniques. A function describing the different weights assigned to the contributions of states from different past moments is sometimes called a time distribution function. While in many ecosystems such a function appears stable in time (e.g., groundwater systems, river catchments), applying this approach to atmospheric processes needs more argumentation, because a variable synoptic situation may significantly influence and modify such a function (e.g., periods of strong winds and calm periods dramatically change the dispersion conditions in the atmosphere, influencing the air quality).

 

Response 2: Thank you for your suggestion. We agree with the reviewer.

It is precisely the presence of extreme weather that makes the model's predictions difficult. This is also mentioned in Section 4.3 of the manuscript; the attention mechanism assigns different weights to different time steps in order to optimize the prediction results.
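As an illustration only (a toy NumPy sketch, not the exact layer used in the paper), temporal attention recomputes the weights over past time steps for every input window, so changing synoptic conditions lead to different weightings:

```python
# Toy sketch of temporal attention: each past hidden state gets a score, the
# softmax of the scores gives the weights, and the context vector is their
# weighted sum. The weights are recomputed for every input window.
import numpy as np

def temporal_attention(H, w):
    """H: (T, d) hidden states over T past steps; w: (d,) learned scoring vector."""
    scores = H @ w                         # one score per time step
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()            # attention weights, summing to 1
    context = alpha @ H                    # weighted combination of past states
    return context, alpha

H = np.random.randn(24, 64)   # e.g. 24 hourly Bi-LSTM states (illustrative values)
w = np.random.randn(64)
context, alpha = temporal_attention(H, w)
```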


Point 3. The analysis of short-term performance is inconsistent and incomplete. The total model performance intercomparison is reported for parameters which are not optimal (WS=24, EP=100), whereas the later parts presenting the dependence on window size and number of epochs report (WS=56, EP=150). Moreover, there are no overfitting analysis results (e.g., a comparison of performance on the training and testing sets).

 

Response 3: Thank you for your suggestion. We agree with the reviewer.

Short-term forecasting is indeed simpler than long-term forecasting, but if the window size is increased, the error in long-term forecasting can be reduced (similar to the experiment shown in Figure 6 in Section 4.2).

Since we carried out a large number of experiments initially, the comparison between training and test data was not covered.


Point 4. Similar notes apply to the next section, where total model performance is reported for 13-24 h, while other parts concern forecasts up to 96 h.

 

Response 4: Thank you for your suggestion. We agree with the reviewer.

 

 

Point 5. The authors declare the use of data from the period 2013-2017, while in the results only data from December 2016 and January 2017 are used. How does the model perform in other seasons?

 

Response 5: Thank you for your suggestion. We agree with the reviewer.

Data from December 2016 and January 2017 were chosen because (1) the PM2.5 concentration values in these two months were high, and the higher the PM2.5 concentration, the more difficult it is to predict; and (2) we referred to other articles, many of which also used December data for their experimental results.

 

Point 6. The discussion section is very superficial and does not discuss the possible reasons why the model proposed by the authors performs better than the others.

Response 6: Thank you for your suggestion. I have made changes.

Please refer to the discussion part of the revised manuscript.


Point 7. L17: LSTM is not defined

Response 7: Thank you for your suggestion. I have made changes.

 

 

Point 8. L114: “the framework of the framework”

Response 8: Thank you for your suggestion. I have made changes.

 

 

Point 9. L117-118: From Fig. 1 it looks like, in the first step, the data from every station are analysed by 1D CNNs separately, and in the next step the data are concatenated. It is not clear how such an approach allows the model to “capture trend features between some sites”. Correct the text or Fig. 1.

Response 9: Thank you for your suggestion. I have made changes in the text.


Point 10. L144-145: This information has already been included in lines 120-122.

Response 10: Thank you for your suggestion. I have made changes in the text.

 

 

Point 11. Table 1: Wind direction is expressed symbolically, while the specified unit is ‘°’.

Response 11: Thank you for your suggestion. We have removed ‘°’.

 

 

Point 12. Paragraph 3.3: The network geometry is described without any justification of the reported neuron numbers in the hidden layers. Did the authors perform any optimisation of the geometry?

Response 12: Thank you for your suggestion. We set the neuron parameters for two reasons: (1) the neuron parameters are the same for all of the deep learning models compared; and (2) training the proposed deep learning model depends heavily on computer performance, and our computer is not particularly powerful, so it can only just run with the parameters as set.


Point 13. Figures 3 and 4 can be merged into one, since splitting into months or seasons does not bring any new information.

Response 13: Thank you for your suggestion. I have made changes.

 

 

Point 14. Figure 5: The picture shows spatial rather than spatiotemporal variability (there is no temporal dimension in the picture). The presented distribution does not seem realistic; in particular, the elevated levels of PM2.5 in the mountainous regions (north east) are strange. The distribution looks as if it was interpolated far beyond the station locations.

Response 14: Thank you for your suggestion. As shown in Figure 2, there are 12 air quality monitoring stations in Beijing. Because only these 12 points provide data for the interpolation, the interpolated PM2.5 concentration in the north-east appears elevated. We tried many interpolation methods to address this; the root cause is that the air quality monitoring stations in Beijing are mainly located in the central area. This problem could be avoided if data from air quality monitoring sites in Hebei Province were available.

 

 

Point 15. L305: All acronyms used for the first time should be expanded.

Response 15: Thank you for your suggestion. I have made changes in the text.

 

 

Point 16. Figure 11 can be replaced by the table reporting fitting parameters.

Response 16: Thank you for your suggestion. Figure 11 shows the prediction results of each model for 24/96 h; the fitting parameters are presented in Figure 9. Figures 10 and 11 are presented to demonstrate the fitting ability of the models. Figure 11 can be deleted if necessary.


Due to time constraints, only the description of language errors in the article will be corrected. We will touch up the article in all aspects next time if necessary.

Author Response File: Author Response.docx

Reviewer 2 Report

This study proposed a hybrid deep learning framework, FPHFA, for PM2.5 concentration forecasting, which learns spatially correlated features and long-term dependencies of time series data related to PM2.5. The authors carried out experimental evaluations using the Beijing dataset, and the outcomes show that the proposed model can effectively handle PM2.5 concentration prediction with satisfactory accuracy. Overall, the manuscript is interesting; however, it needs a major revision. My specific comments are as follows:

 

 

1. Please highlight the novelty of the current work point-wise in the introduction section. From the current version, it is not very clear. Also, discuss how FPHFA is different from the existing deep learning methods.

2. Please provide the section-wise breakup at the end of Section 1.

3. The literature review is not sufficient or up to date; hence, expanding it by discussing different methods used in the literature is highly recommended. Also, discuss the findings of the literature already cited.

4. How is the model in 2.2.2 estimated?

5. Please also provide the mean absolute percentage error (MAPE) in the results.

6. How does your model accommodate different periodicities in the data?

7. R squared cannot be used as a model selection criterion. The author should use the Adj-R squared instead.

8. The time series plots of the data are missing and are highly recommended to be added to the manuscript. Some exemplary plots can be seen (Forecasting next-day electricity demand and prices based on functional models).

9. Table 2: authors should also conduct some statistical tests to ensure the superiority of the proposed approach, i.e., how could authors ensure that their results are superior to others? For example, see (Short-term electricity demand forecasting using components estimation technique).

10. Please provide the ACF and PACF plots of the final residuals.

11. The authors should also discuss other forecasting methods used for time series data for the information of interested readers. Some examples are (Electricity spot prices forecasting based on ensemble learning) and (Short-term electricity prices forecasting using functional time series analysis).

Author Response

Comments:

 

Point 1. Please highlight the novelty of the current work point-wise in the introduction section. From the current version, it is not very clear. Also, discuss how FPHFA is different from the existing deep learning methods.

 

Response 1: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to the section introduction of the revised manuscript.

 

 

Point 2. Please provide the section-wise breakup at the end of section 1.

 

Response 2: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to the section introduction of the revised manuscript.

 

 

Point 3. The literature review is not sufficient or up to date; hence, expanding it by discussing different methods used in the literature is highly recommended. Also, discuss the findings of the literature already cited.

 

Response 3: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to the section introduction of the revised manuscript.

 

Point 4. How is the model in 2.2.2 estimated?

 

Response 4: Thank you for your suggestion. We are sorry, but we did not understand what you meant; Section 2.2.2 is a description of the Bi-LSTM model, not an estimation of the model.

 

Point 5. Please also provide the mean absolute percentage error (MAPE) in the results.

 

Response 5: Thank you for your suggestion. I agree with the reviewer. I have made changes.

 

Please refer to Sections 4.2 and 4.3 of the revised manuscript.
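For reference, a minimal sketch of the standard MAPE definition follows (the epsilon guard is added here only for the sketch, to avoid division by near-zero observed concentrations):

```python
# Minimal MAPE sketch; eps avoids division by very low observed concentrations.
import numpy as np

def mape(y_true, y_pred, eps=1e-6):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / np.maximum(np.abs(y_true), eps)))

print(mape([80, 120, 150], [90, 110, 160]))  # approx. 9.2 %
```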


Point 6. How does your model accommodate different periodicities in the data?

 

Response 6: Thank you for your suggestion. Predicting PM2.5 concentrations is not treated as a periodic problem here; the models are judged by how well they predict PM2.5 concentrations at different times in the future.


Point 7. R squared cannot be used as a model selection criterion. The author should use the Adj-R squared instead.

Response 7: We referred to many articles before choosing R2 as the model evaluation metric. Figure 11 can be deleted if necessary.

 

The corresponding articles are (PM2.5 concentrations forecasting in Beijing through deep learning with different inputs, model structures and forecast time) and (Research on PM2.5 Concentration Prediction Based on the CE-AGA-LSTM Model).

 

Point 8. The time series plots of the data are missing and are highly recommended to be added to the manuscript. Some exemplary plots can be seen (Forecasting next-day electricity demand and prices based on functional models).

Response 8: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to Figure 3 of the revised manuscript.

 

 

Point 9. Table 2, authors should also conduct some statistical tests to ensure the superiority of the proposed approach, i.e., how could authors ensure that their results are superior to others? For example, see (Short-term electricity demand forecasting using components estimation technique)

Response 9: Thank you for your suggestion. We are very sorry, but we did not fully understand your point. We have, however, demonstrated the superiority of the proposed model through numerous experiments.

 

 

Point 10. Please provide the ACF and PACF plots of the final residuals.

Response 10: Thank you for your suggestion. For the sake of completeness of the article, it could not be added in the article. We are very sorry.


Point 11. The authors should also discuss other forecasting methods used for time series data for the information of interested readers. Some examples are (Electricity spot prices forecasting based on ensemble learning) (Short-term electricity prices forecasting using functional time series analysis)

Response 11: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to the section introduction of the revised manuscript.


Due to time constraints, only the description of language errors in the article will be corrected. We will touch up the article in all aspects next time if necessary.

Author Response File: Author Response.docx

Reviewer 3 Report

This study presents a model named FPHFA to predict PM2.5 concentrations. The model forecasts PM2.5 concentrations at the target site based on the available data.

Specific comments:

- Please improve the quality of Figure 1, mainly the right part of the figure, such as the attention layer and the multi-channel 1D CNNs.

Author Response

Comments:

 

- Please improve the quality of Figure 1, mainly the right part of the figure, such as the attention layer and the multi-channel 1D CNNs.

 

 

Response 1: Thank you for your suggestion. I agree with the reviewer. I have increased the DPI of the images.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have improved the quality of the manuscript; however, most of the remarks remain unaddressed.

 

Presentation of the spatial variability of the PM2.5 concentration over the whole region based on measurements performed in a limited urban area does not make sense. The authors should consider the limitation of the analysed area.

 

It is still not clear whether the model performance relates to the full period of data availability or only to the one month presented in the figures.

 

What the authors meant in the last sentence written in Chinese: "Due to time constraints, only descriptions of linguistic errors in the article will be corrected. If necessary, we will touch up all aspects of the article in the next one."?

 

In my opinion the manuscript still needs a major revision.

Author Response

Response to comments by Reviewer #1:

 

We would like to gratefully thank the reviewer for the constructive comments and recommendations for improving the paper. A point-by-point response to the comments raised by the reviewer follows.

 

Comments:

 

Point 1. Presentation of the spatial variability of the PM2.5 concentration over the whole region based on measurements performed in a limited urban area does not make sense. The authors should consider the limitation of the analysed area.

Response 1: Thank you for your suggestion. We agree with the reviewer.

We have added a description to the text.

Please refer to Section 4.1 of the revised manuscript.

 

 

Point 2. It is still not clear whether the model performance relates to the full period of data availability or only to the one month presented in the figures.

 

Response 2: Thank you for your suggestion. We agree with the reviewer.

We divided the data into seasons and conducted experiments on each of them separately.

We have added experiments to demonstrate that our model obtains good prediction results in any period.

Please refer to Section 4.3 of the revised manuscript.
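A minimal sketch (hypothetical file and column names) of how the hourly data can be split into meteorological seasons before training and evaluating the model on each subset:

```python
# Hedged sketch of splitting an hourly dataset into meteorological seasons
# so that a separate experiment can be run on each subset.
import pandas as pd

def season_of(month):
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"

df = pd.read_csv("beijing_pm25.csv", parse_dates=["datetime"])  # assumed file/column names
df["season"] = df["datetime"].dt.month.map(season_of)
for season, part in df.groupby("season"):
    print(season, len(part))  # each subset is then used for a separate experiment
```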

 

Point 3. What the authors meant in the last sentence written in Chinese: "Due to time constraints, only descriptions of linguistic errors in the article will be corrected. If necessary, we will touch up all aspects of the article in the next one."?

 

Response 3: Thank you for your suggestion. We agree with the reviewer.

We are sorry we did not make this clear. We have corrected the language errors.

Author Response File: Author Response.docx

Reviewer 2 Report

1. As I previously requested, the authors should also conduct some statistical tests to ensure the superiority of the proposed approach, i.e., how can the authors ensure that their results in Tables 2 and 3 are superior to the other competing models? How statistically significant are the differences between them? For example, use the Diebold-Mariano test; see (Short-term electricity demand forecasting using components estimation technique).

 

2. As requested previously, please add the ACF and PACF plots of the final model residuals to see whether they are white noise or still contain some autocorrelation. This is not difficult to do. Unfortunately, the authors' reply, “For the sake of completeness of the article, it could not be added in the article. We are very sorry.”, is not satisfactory.

 

 

3. R squared cannot be used as a model selection criterion. Adding more explanatory variables to the model will always increase the R squared value, even if the added variable is insignificant. This is a proven fact. Hence, the authors should use the Adj-R squared instead.

Author Response

Point 1. As I previously requested, the authors should also conduct some statistical tests to ensure the superiority of the proposed approach, i.e., how can the authors ensure that their results in Tables 2 and 3 are superior to the other competing models? How statistically significant are the differences between them? For example, use the Diebold-Mariano test; see (Short-term electricity demand forecasting using components estimation technique).

Response 1: Thank you for your suggestion. I agree with the reviewer. I have made changes.

The following table was produced from the forecast data in Table 2. However, we had never used the Diebold-Mariano test to verify a model before, and we are not sure whether we applied it correctly; therefore, we did not put the table in the article. We include the table here and hope you can help us correct it.

 

 

Table:

Models     LSTM   GRU    CNN-LSTM   DAQFF   FPHFA
LSTM       -      0.46   0.28       0.40    0.17
GRU        0.54   -      0.44       0.35    0.07
CNN-LSTM   0.72   0.56   -          0.46    0.15
DAQFF      0.60   0.65   0.54       -       0.18
FPHFA      0.83   0.93   0.85       0.82    -

 

Please refer to the section introduction of the revised manuscript.
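For reference, a basic Diebold-Mariano test can be sketched as follows (squared-error loss, h-step long-run variance, and a normal approximation for the p-value); this is a generic sketch, not necessarily the exact procedure in the cited reference, and we would welcome corrections:

```python
# Hedged sketch of a basic Diebold-Mariano test comparing two forecast error
# series e1 and e2 over the same test period.
import numpy as np
from scipy import stats

def dm_test(e1, e2, h=1):
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2       # loss differential
    T = len(d)
    d_bar = d.mean()
    # autocovariances of d up to lag h-1 (long-run variance for h-step forecasts)
    gammas = [d.var()] + [np.cov(d[k:], d[:T - k])[0, 1] for k in range(1, h)]
    var_d = (gammas[0] + 2 * sum(gammas[1:])) / T
    dm_stat = d_bar / np.sqrt(var_d)
    p_value = 2 * (1 - stats.norm.cdf(abs(dm_stat)))
    return dm_stat, p_value

# e1, e2 would be the forecast errors of, e.g., FPHFA and one baseline model.
```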

 

 

Point 2. As requested previously, please add the ACF and PACF plots of the final model residuals to see whether they are white noise or still contain some autocorrelation. This is not difficult to do. Unfortunately, the authors' reply, “For the sake of completeness of the article, it could not be added in the article. We are very sorry.”, is not satisfactory.

 

Response 2: Thank you for your suggestion. I agree with the reviewer. I have made changes.

Please refer to Section 4.2 of the revised manuscript.
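As a generic illustration (statsmodels plotting helpers; the residual series here is only a placeholder standing in for observed minus predicted PM2.5 on the test set), such diagnostics can be produced as follows:

```python
# Sketch of ACF/PACF plots of the final model residuals.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(0)
residuals = rng.normal(size=500)   # placeholder; in practice y_observed - y_predicted

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(residuals, lags=48, ax=axes[0])    # autocorrelation up to 48 h
plot_pacf(residuals, lags=48, ax=axes[1])   # partial autocorrelation up to 48 h
plt.tight_layout()
plt.show()
```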

 

 

Point 3. R squared cannot be used as a model selection criterion. Adding more explanatory variables to the model will always increase the R squared value, even if the added variable is insignificant. This is a proven fact. Hence, the authors should use the Adj-R squared instead.

 

Response 3: Thank you for your suggestion. We agree with the reviewer. We are aware that adding more explanatory variables to a model always increases the R² value. The formula for the adjusted R² is:

R²(adj) = 1 - (1 - R²) * (n - 1) / (n - p - 1)

where p is the number of variables and n is the number of samples. In each experiment p = 12 and n is the same, so p and n are invariant across the compared models; the adjusted R² is then a fixed monotone transform of R², and R² can therefore serve as the model selection criterion. Since the results are essentially unchanged, we do not think it is necessary to replace R².
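A small numerical sketch of this argument (the value of n is illustrative only): with p and n fixed across all compared models, the adjusted R² is a monotone transform of R², so the model ranking is unchanged.

```python
# Adjusted R-squared as a monotone transform of R-squared when n and p are fixed.
def adjusted_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

for r2 in (0.80, 0.85, 0.90):
    # with p = 12 and an illustrative n, the adjustment barely changes the values
    print(r2, round(adjusted_r2(r2, n=8760, p=12), 4))
```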

 

 

Author Response File: Author Response.docx
