Article
Peer-Review Record

Multi-Site and Multi-Pollutant Air Quality Data Modeling

Sustainability 2024, 16(1), 165; https://doi.org/10.3390/su16010165
by Min Hu 1, Bin Liu 2,* and Guosheng Yin 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 10 November 2023 / Revised: 4 December 2023 / Accepted: 19 December 2023 / Published: 23 December 2023
(This article belongs to the Section Air, Climate Change and Sustainability)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript titled "Multi-site multi-pollutant air quality data modeling," submitted to Sustainability, presents a new approach: a de-trending graph convolutional Long Short-Term Memory (LSTM) model. This model not only captures spatial dependencies among multiple stations but also incorporates de-trending signals to handle nonstationary data.

The manuscript is consistent with the "Aims & Scope" of Sustainability.

However, there are two fundamental observations that require an answer from the authors before considering the possibility of publication.

Major observations

In my opinion, the 'Introduction' and 'Literature Review' sections should be revised. Additionally, the 'Problem and Research Gap' section should be incorporated at the end of the literature review. This arrangement would allow the authors to better highlight the current models used for this purpose after examining the state of the art. They can then emphasize the primary issues and challenges to address and illustrate how their model surpasses current limitations.

The authors state that their model includes the spatial aspect, but it's unclear how this aspect can be utilized within the model.

Minor observations

·       Lines 26-27 “Based on this air quality data matrix, an air quality index (AQI) can be calculated to inform public the air quality at present.” A reference is needed.

·       Line 46: Figure 2 in the text appears before Figure 1.

·       Lines 52-54: “The diffusion convolution [11] captures spatial dependency using bidirectional random walks on the meteorological monitoring sites graph G = (V, E, A) as shown in Fig 1 (b).” What do V, E, and A represent? The authors should provide a better description of this in the model section.

·       The authors should avoid repetition. For instance: Lines 60 – 61 mention the extensive applications of LSTM in natural language processing (NLP), and the same idea is reiterated in Lines 109-110. In Lines 141 – 142, the authors mention the success of LSTM in handling sequential data and its applications in NLP and video analysis [27]. In these instances, the authors should explain how referencing the use of LSTM in NLP contributes to the development of the methodology in the article.

·       Figure 1 - Location of meteorological stations. It would be better to define them as monitoring stations.

·       Figure 1c and Figure 2 - Include units of measurement on the y-axis.

·       In the Results section, it would be beneficial for the authors to include an example of time pattern graphs. This addition would help readers better comprehend the accuracy and quality of the predictions made by the model.
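
As context for the comment on Lines 52-54 above, the sketch below shows one common formulation of diffusion convolution with bidirectional random walks on a graph. It assumes the usual convention that V is the set of nodes (here, monitoring sites), E the set of edges, and A a weighted adjacency matrix; the manuscript's own definitions may differ, and all function and parameter names are hypothetical.

```python
def row_normalize(matrix):
    """Divide each row by its sum, giving a random-walk transition matrix."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def transpose(matrix):
    """Return the transpose of a square list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def matvec(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def diffusion_conv(adjacency, signal, theta_fwd, theta_bwd):
    """Sum over k of theta_fwd[k] * P_f^k x + theta_bwd[k] * P_b^k x.

    P_f = D_out^{-1} A is the forward random walk and
    P_b = D_in^{-1} A^T is the backward random walk (assumed formulation).
    """
    forward = row_normalize(adjacency)
    backward = row_normalize(transpose(adjacency))
    out = [0.0] * len(signal)
    x_f = list(signal)  # P_f^k x, starting at k = 0
    x_b = list(signal)  # P_b^k x, starting at k = 0
    for t_f, t_b in zip(theta_fwd, theta_bwd):
        out = [o + t_f * a + t_b * b for o, a, b in zip(out, x_f, x_b)]
        x_f = matvec(forward, x_f)
        x_b = matvec(backward, x_b)
    return out


# Hypothetical example: three sites on a directed cycle 0 -> 1 -> 2 -> 0.
A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
x = [1.0, 2.0, 3.0]  # one pollutant reading per site
one_step_forward = diffusion_conv(A, x, theta_fwd=[0.0, 1.0], theta_bwd=[0.0, 0.0])
```

In this toy graph each site receives the reading of the site it points to under the forward walk, and of the site pointing to it under the backward walk, which is the sense in which the two walk directions capture upstream and downstream influence.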

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper presents a study focused on modeling air quality data in major industrialized cities, particularly addressing multi-site and multi-pollutant data. The authors, Min Hu, Bin Liu, and Guosheng Yin, propose an advanced model for predicting air quality using a modified Long Short-Term Memory (LSTM) network, enriched with a de-trending operation to handle nonstationary data and a diffusion graph convolution to capture spatial correlations among different monitoring sites.

The article begins by discussing the importance of air quality as a public health and sustainable development issue, emphasizing that predicting future air quality is more beneficial to the population than just real-time reporting. The authors identify limitations in existing methods, such as their inability to model nonstationarity and spatial correlations among multiple stations, and propose their model, named Long-Short De-trending Graph Convolutional Network (LS-deGCN), as a solution. To improve the analysis of the state of the art, the authors could cite works such as "Pollution dispersion from a fire using a Gaussian plume model" (doi: 10.18280/ijsse.100401).

The LS-deGCN model is designed to capture both spatial and temporal correlations in air quality data, using an LSTM network to analyze temporal dependencies and graph diffusion convolution for spatial correlations. The data used comes from monitoring stations in Chengdu and seven other major cities in China, covering various pollutants like NO2, CO, PM2.5, and PM10.

To evaluate the model, the authors compare LS-deGCN with three baseline methods: linear regression, support vector regression, and LSTM sequence-to-scalar. They use metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) for assessment. The results show that the LS-deGCN model outperforms the baseline methods and demonstrates promising capability in predicting air quality.
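
For readers less familiar with the two error metrics named above, a minimal Python sketch follows. It assumes the standard definitions of RMSE and MAE; the function names `rmse` and `mae` and the sample values are illustrative and are not taken from the manuscript.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted concentrations (arbitrary units).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares the residuals before averaging, a single large error raises it more than it raises MAE, so reporting both gives some indication of how errors are distributed.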

- The abstract and introduction provide a good overview of the problem and the proposed solution. However, they could be enhanced by more explicitly stating the unique contributions of this study compared to existing methods. For instance, emphasizing how the proposed model overcomes the limitations of traditional methods in handling spatio-temporal data and nonstationarity in air quality prediction could provide clearer context and highlight the novelty of the research.

- While the paper introduces the Long-Short De-trending Graph Convolutional Network (LS-deGCN) and its variants, more detailed explanations or visual representations could help in understanding the model's architecture and functioning better. This could include more detailed schematics or flowcharts that illustrate how the model processes data and makes predictions.

- The paper mentions the use of diffusion graph convolution to capture spatial correlations among air quality data from multiple sites. It would be beneficial to expand on how this method specifically addresses and models these spatial correlations, possibly including comparisons or examples that illustrate the effectiveness of this approach compared to traditional methods.

- The paper could delve deeper into the rationale behind choosing specific tuning parameters like the time lag (l) and window width (∆t). Discussing how different values of these parameters impact the model's performance could provide deeper insights into the model's sensitivity and robustness.

-The baseline models used for comparison (Linear Regression, Support Vector Regression, LSTM sequence-to-scalar) are well-chosen. However, a more in-depth comparative analysis, perhaps including graphical representations of performance metrics like RMSE and MAE, could help in better understanding the advantages of the proposed model over these baselines.

- While the paper mentions using linear interpolation for dealing with missing values and normalizing data, a more comprehensive explanation of these preprocessing steps could be beneficial. This might include discussing the choice of these methods over others and how they impact the final model performance. Additionally, explaining any data augmentation or feature engineering steps undertaken could provide a more complete understanding of the data preparation process.
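
The preprocessing steps mentioned in the last comment above (linear interpolation of missing values, then normalization) can be sketched as follows. This is an illustration only: the manuscript's exact procedure is not reproduced in this record, min-max scaling is an assumed choice of normalization, and the function names are hypothetical.

```python
def interpolate_missing(series):
    """Fill None entries by linear interpolation between observed neighbours.

    Gaps at either end are filled with the nearest observed value
    (an assumption; other edge policies are possible).
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def min_max_normalize(series):
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


# Hypothetical hourly readings with one missing value.
raw = [10.0, None, 30.0]
clean = interpolate_missing(raw)
scaled = min_max_normalize(clean)
```

In practice the scaling constants would normally be computed on the training split only and reused for validation and test data, which is one of the details a fuller description of the preprocessing could state.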

Comments on the Quality of English Language

- In some sections, the language could be made clearer and more concise. This includes reducing complex sentences and removing unnecessary repetitions. A more straightforward and succinct language would facilitate comprehension for readers, especially those who are non-native English speakers.

- Ensure that technical and scientific terminology is used consistently throughout the document. This includes standardizing specific industry terms, acronyms, and units of measurement. Consistent terminology helps maintain clarity and technical accuracy in the document.

- Some sentences could benefit from revisions to improve fluency and grammatical structure. In particular, check for the correct use of verb tenses, the structure of subordinate clauses, and the placement of prepositions. These improvements would contribute to making the text more readable and professional.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

I think the paper is well written and interesting, and it proposes a useful idea for tackling multi-site, multi-pollutant air quality modelling.

However, below are some suggestions for improvement:

- Introduction: some references to state-of-the-art techniques for air quality modelling are missing, such as kriging statistical approaches (https://www.sciencedirect.com/science/article/abs/pii/S136481521000318X) or the use of Chemical Transport Models together with machine learning techniques to increase spatial resolution (https://iopscience.iop.org/article/10.1088/2515-7620/ac17f7). All these techniques are also available for forecasting and should be mentioned here.

- Section 4.3: in Table 1 the authors show the partitioning into training, validation, and testing sets. While training uses data for full years, validation relates to only one season. Is this a problem for the quality of the results? Please elaborate on this.

- Figure 5: please add a legend for the figures' colors.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I recommend that the revised paper be accepted in its present form.
