Article
Peer-Review Record

Dynamic Spatio-Temporal Adaptive Graph Convolutional Recurrent Networks for Vacant Parking Space Prediction

Appl. Sci. 2024, 14(13), 5927; https://doi.org/10.3390/app14135927
by Liangpeng Gao 1,2, Wenli Fan 1 and Wenliang Jian 1,3,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 24 May 2024 / Revised: 2 July 2024 / Accepted: 4 July 2024 / Published: 7 July 2024
(This article belongs to the Special Issue Intelligent Transportation System in Smart City)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The article concerns dynamic spatio-temporal adaptive graph convolutional recurrent networks for vacant parking space prediction. This innovative approach combines seasonal and periodic trends with daily and weekly information to accurately predict vacant parking spaces, ultimately reducing traffic congestion and saving drivers time.

The article in general is very interesting, but while reading I encountered some issues:

Many words have a dash in the middle. At first I thought that it was supposed to be like that, but words, like: cor-relation, con-volutional, dy-namic, de-notes, func-tion (and so on), do not need a dash, so please go through the paper and correct it.

I highly recommend reading IEEE Math Typesetting Guide available online: https://conferences.ieeeauthorcenter.ieee.org/wp-content/uploads/sites/8/IEEE-Math-Typesetting-Guide-for-LaTeX-Users.pdf. For example matrices and vectors should be written in bold font. Equations (5) and (6) are missing operations, it should be a circle with a dot inside? Functions like MLP, Concat, ResNet, and so on, should be written in simple font not italic (the same as sin or cos). Equations (9) are also missing operations/symbols.

Figure 4 is of poor quality. Can the authors improve it? ‘tanh’ in this figure should also be in a simple font.

Figures 5, 6, and 7 are of poor quality. Can the authors improve them?

I have one significant substantive question: The authors show a diagram of the parking layout in Figure 1. The article contains only a note, "constructed based on distance or semantic similarity, for example," regarding the adjacency matrix between nodes, but it is not clearly specified which option was used in this paper. It would be helpful to have a graph illustration with concrete information on what the connections look like. Graphs, in turn, are characterized by their nodes having specific attributes, and this information is missing here. Therefore, I missed two things: in Figure 1, it is unclear why there is no connection, for example, between parking lots P1 and P2; and how are the attributes of individual nodes defined?

 

What are the future directions of this research?

Comments on the Quality of English Language

I included all of my concerns in the suggestions for Authors.

Author Response

  1. Many words have a dash in the middle. At first I thought that it was supposed to be like that, but words, like: cor-relation, con-volutional, dy-namic, de-notes, func-tion (and so on), do not need a dash, so please go through the paper and correct it.

Response: Thank you. We apologize for the formatting error due to typesetting issues. We have corrected it in the revised manuscript.

  2. I highly recommend reading IEEE Math Typesetting Guide available online: https://conferences.ieeeauthorcenter.ieee.org/wp-content/uploads/sites/8/IEEE-Math-Typesetting-Guide-for-LaTeX-Users.pdf. For example matrices and vectors should be written in bold font. Equations (5) and (6) are missing operations, it should be a circle with a dot inside? Functions like MLP, Concat, ResNet, and so on, should be written in simple font not italic (the same as sin or cos). Equations (9) are also missing operations/symbols.

Response: Thank you. We have made a correction regarding the missing symbols.

  3. Figure 4 is of a poor quality. Can Authors improve it? ‘tanh’ in this figure should also be in a simple font. Figures 5, 6, 7 are of a poor quality. Can Authors improve it?

Response: Thank you for your suggestion. We have improved the resolution of these four figures.

  4. I have one significant substantive question: The authors show a diagram of the parking layout in Figure 1. The article contains only a note, "constructed based on distance or semantic similarity, for example," regarding the adjacency matrix between nodes, but it is not clearly specified which option was used in this paper. It would be helpful to have a graph illustration with concrete information on what the connections look like. Graphs, in turn, are characterized by their nodes having specific attributes, and this information is missing here. Therefore, I missed two things: in Figure 1, it is unclear why there is no connection, for example, between parking lots P1 and P2; and how are the attributes of individual nodes defined?

Response: Figure 1 is a schematic diagram of the parking lot distribution. Many studies set a threshold on the distance between parking lots to determine whether two parking lots are connected. The geographical distance between parking lots is calculated from the longitude and latitude of each parking lot. First, the longitude and latitude of each parking lot are converted to radians, and then the haversine formula is used to calculate the geographical distance between parking lots. The calculation method is shown in the following formula:

In the formula, the indices denote the parking lots, and the angles are the longitude and latitude of each parking lot converted to radians. After calculating the geographical distance between each pair of parking lots, a threshold can be defined according to actual needs to determine whether the parking lots are connected. The calculation steps are shown in the following formula:

If the geographical distance between two parking lots is less than the threshold, it is considered that the two parking lots are connected. Otherwise, the two parking lots are not connected. The attributes of each node generally include the number of available parking spaces and capacity of the parking lot. In this study, we only use the attribute of the number of available parking spaces in the parking lot.
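The distance-threshold construction described in this response can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the mean Earth radius of 6371 km, the treatment of self-connections, and the 2 km threshold are all assumptions chosen for the example. The two coordinate pairs are the ones the authors give for the Guangzhou parking lots in their response to Reviewer 2 (GPSX is longitude, GPSY is latitude).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    # Convert degrees to radians first, as described in the response.
    phi1, lam1, phi2, lam2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin((lam2 - lam1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def threshold_adjacency(coords, threshold_km):
    """Binary adjacency: 1 if two distinct lots are closer than the threshold."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Self-connections are left at 0 here; that is an assumption.
            if i != j and haversine_km(*coords[i], *coords[j]) < threshold_km:
                adj[i][j] = 1
    return adj


# (latitude, longitude) of the two Guangzhou lots quoted in the record.
lots = [(23.12339328, 113.3437434), (23.11409799, 113.3283144)]
distance = haversine_km(*lots[0], *lots[1])
adjacency = threshold_adjacency(lots, threshold_km=2.0)  # hypothetical threshold
```

With these inputs the two lots lie a little under 2 km apart, so they are connected under the hypothetical 2 km threshold and would be disconnected under a 1 km threshold.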

However, constructing a graph based on distance or semantic similarity can only show part of the spatial correlation between parking lots. In this study, we refer to the Graph WaveNet and AGCRN methods to construct static graphs. On this basis, we adopt a method that combines daily information, weekly information and node embedding to construct a dynamic graph, so as to learn the hidden interdependencies between different parking lots. In Section 4.2, we describe the construction process of the dynamic graph.
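For readers unfamiliar with the adaptive graph in AGCRN, which this response cites as a starting point, the static part can be sketched as below. AGCRN derives a row-normalised adjacency from learnable node embeddings as softmax(ReLU(E·Eᵀ)). The embedding values here are made up, and the sketch does not cover how the authors combine daily and weekly information with the node embedding, since the record does not give that detail.

```python
import math


def adaptive_adjacency(embeddings):
    """Row-normalised adjacency softmax(ReLU(E @ E^T)) from node embeddings."""
    # Pairwise inner products of node embeddings, clipped at zero (ReLU).
    scores = [[max(0.0, sum(a * b for a, b in zip(ei, ej))) for ej in embeddings]
              for ei in embeddings]
    adjacency = []
    for row in scores:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - shift) for s in row]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


# Three parking lots with 2-dimensional embeddings (illustrative values only;
# in a trained model these would be learned parameters).
node_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = adaptive_adjacency(node_embeddings)
```

Each row of `adj` sums to 1, so the matrix can be used directly as a normalised propagation matrix in a graph convolution.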

  5. What are the future directions of this research?

Response: In future work, we plan to integrate more external factors (such as temperature, weather, POI, and accidents) into the model to further improve the performance and reliability of parking space prediction.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

 

The subject treated by the authors is a topical one.

The title and the intentions declared in the abstract correspond to the contents of the paper. The paper contains an abstract and introduction which is in fact a critical review of the state of the art.

The objectives have been covered and the paper reads well and is adequately structured.

The research methodology presented by the authors reveals a good knowledge of the researched field. The results obtained with the developed STDGCRN model are positive, compared to other models.

     In conclusion, the paper is good in terms of scientific contribution.

Some questions for the authors:

1. In Section 5.1, the authors specify that the parking spaces in the 3 cities were downloaded from the available platforms.

- The data on available parking spaces according to [37] are up to the year 2022, the authors specify in table 1 the year 2023;

- Likewise for Guangdong [39] – it is an article and not information about parking places on public platforms

- Are the 5 minutes taken for analysis during rush hour, during the week or on the weekend?

2. In lines 366-374, the authors talk about the use of the DSTAGCRN and STGNCDE methods for 2 commercial parking lots:

- The location of the 2 parking lots

- What is the time frame?

- What was the vehicle traffic in that interval (peak hour factor)

Author Response

  1. At 5.1, the authors specify that the parking spaces in the 3 cities were downloaded from the available platforms.

- The data on available parking spaces according to [37] are up to the year 2022, the authors specify in table 1 the year 2023;

- Likewise for Guangdong [39] – it is an article and not information about parking places on public platforms

- Are the 5 minutes taken for analysis during rush hour, during the week or on the weekend?

Response: Thank you. The download link provided in reference [37] points to the website that stores the Zurich parking data. We have updated the download URL. Available parking space data for any required time period can be downloaded through the API.

The Guangzhou data was downloaded according to the link provided in the article, so we cited the paper.

The 5 minutes mentioned in this study means that each 24-hour day is divided into 5-minute time slices, giving 288 slices per day. Given their time span, the three parking datasets we provide therefore include data from peak hours, weekdays, and weekends.
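The slicing arithmetic in this response can be checked with a few lines. The helper names are hypothetical; only the 5-minute interval comes from the record. The second helper also reproduces the authors' later statement that 1000 consecutive data points span about 3.5 days.

```python
SLICE_MINUTES = 5
SLICES_PER_DAY = 24 * 60 // SLICE_MINUTES  # 1440 minutes / 5 = 288 slices


def slice_index(hour, minute):
    """Index (0..287) of the 5-minute slice containing a time of day."""
    return (hour * 60 + minute) // SLICE_MINUTES


def span_in_days(n_points):
    """Length in days of a run of consecutive 5-minute observations."""
    return n_points / SLICES_PER_DAY


morning_peak = slice_index(8, 30)      # slice containing 08:30
validation_span = span_in_days(1000)   # roughly 3.47 days
```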

  2. In lines 366-374, the authors talk about the use of the DSTAGCRN and STGNCDE methods for 2 commercial parking lots:

- The location of the 2 parking lots

- What is the time frame?

- What was the vehicle traffic in that interval (peak hour factor)

 

Response: Thank you. The purpose of comparing the prediction results for two commercial parking lots is to show that DSTAGCRN can effectively capture the spatial correlation between semantically similar parking lots. The superior performance of DSTAGCRN over STGNCDE shows that our model not only adapts accurately to rapidly changing trends but also captures similar trends in the available parking space data, yielding more accurate predictions.

 

The coordinates of parking lot 1 are: (GPSX: 113.3437434, GPSY: 23.12339328). The coordinates of parking lot 2 are: (GPSX: 113.3283144, GPSY: 23.11409799)

 

The time range shown in Figure 7 is 1000 data points in the validation set, which is about 3.5 days.

 

Due to the lack of data, we have not yet conducted research on the vehicle flow within the interval. At this stage, we have only studied the available parking space data of the parking lot. As for the impact of the traffic flow around the parking lot on the available parking space data of the parking lot, we will further study it in the future when we get the surrounding traffic flow.

 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1. The proposal requires a thorough review of the writing style and spelling.

2. The document makes a lot of reference to methods for forecasting demand for empty spaces and little about the mechanisms for detecting said spaces (sensors, for example).

3. Provide a brief introduction to the concept of the average pooling operation, for example, explaining that it performs down sampling by dividing the input into pooling regions and calculating the average value of each region.

4. The meaning of the statement between lines 215 to 220 is unclear. Try to give a clearer explanation of it.

5. In equation 10 use an appropriate size in the grouping parentheses

6. How do you correct forecast errors when the forecast horizon becomes large over time? For example, from 8 minutes of observations.

7. Make some mention about the methodology for obtaining real-time data to feed the forecast models.

Comments on the Quality of English Language

Minor corrections in style and grammar are necessary.

Author Response

  1. The proposal requires a thorough review of the writing style and spelling.

Response: Thank you. This paper was edited for proper English language, grammar, punctuation, spelling, and overall style by one or more of the highly qualified native English speaking editors at AJE.

 

  2. The document makes a lot of reference to methods for forecasting demand for empty spaces and little about the mechanisms for detecting said spaces (sensors, for example).

Response: Thank you. The main purpose of this study is to improve the performance of the available parking space prediction model. Since data acquisition sensors are rarely mentioned in the field of traffic flow prediction, this study mainly focuses on the prediction model rather than the data acquisition device.

 

  3. Provide a brief introduction to the concept of the average pooling operation, for example, explaining that it performs down sampling by dividing the input into pooling regions and calculating the average value of each region.

Response: Thank you. Average pooling is a common pooling operation in deep learning, which is used to reduce the spatial dimension of feature maps while retaining their important features. In convolutional neural networks (CNNs), pooling layers usually follow convolutional layers.

 

Its operation process is as follows: The input feature map is divided into non-overlapping pooling regions (usually rectangular regions). The feature values in each pooling region are aggregated. In average pooling, the aggregation operation is to find the average of all feature values in the region. The average value after aggregation is used as the value of the corresponding position in the output feature map.

The main function of average pooling is to reduce the spatial dimension of the feature map, thereby reducing the number of parameters of subsequent neural network layers and speeding up calculations. It also helps to extract robust features, because average pooling is insensitive to small local changes and is conducive to maintaining the position invariance of features (translation invariance).

 

In summary, average pooling performs downsampling by calculating the average value of the feature values in each pooling region. It is a common operation in convolutional neural networks, used to reduce the dimension of feature maps and extract important features. However, we do not consider the average pooling operation to be a focus of this paper, so we did not add any content on it to the revised manuscript.
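The downsampling described in this response can be illustrated on a one-dimensional series. This is a generic sketch of non-overlapping average pooling, not the pooling configuration used in the paper; the function name, the window size, and the sample values are assumptions, and dropping a trailing partial window is one of several possible conventions.

```python
def avg_pool_1d(values, window):
    """Non-overlapping average pooling; a trailing partial window is dropped."""
    usable = len(values) - len(values) % window
    # Each output value is the mean of one pooling region of `window` inputs.
    return [sum(values[i:i + window]) / window for i in range(0, usable, window)]


# Six 5-minute readings of available spaces, pooled into three 10-minute means.
available = [10, 12, 20, 22, 30, 32]
pooled = avg_pool_1d(available, window=2)
```

Here `pooled` is `[11.0, 21.0, 31.0]`: the series is half as long, and small fluctuations within each region are averaged out.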

 

  4. The meaning of the statement between lines 215 to 220 is unclear. Try to give a clearer explanation of it.

Response: Thank you. We have made some modifications to this content in order to construct a dynamic graph. We refer to the construction of the adaptive graph in AGCRN [1] and introduce daily and weekly information to obtain a new spatiotemporal embedding.

[1] Bai, Lei, et al. "Adaptive graph convolutional recurrent network for traffic forecasting." Advances in neural information processing systems 33 (2020): 17804-17815.

 

  5. In equation 10 use an appropriate size in the grouping parentheses.

Response: Thank you. We have made some modifications to this content.

 

  6. How do you correct forecast errors when the forecast horizon becomes large over time? For example, from 8 minutes of observations.

Response: Thank you. This study predicts the available parking space data for the next 12 time steps, that is, 1 hour. The larger the prediction time range, the greater the error. Therefore, the model needs to have good robustness. This study decomposes the original parking data into periodic trends and seasonal trends, extracts long-term stable trends from hidden variables, and improves the prediction performance. In addition, a multi-head attention mechanism module is introduced. This method can self-adjust according to the dynamic information of each time step to explore more effective hidden information. The ablation experiment proves the effectiveness of these components, which shows that the model has great potential in exploring the spatiotemporal structure of available parking spaces. Compared with other existing prediction models, this paper conducts extensive experiments on three real-world datasets, and the results show that the proposed DSTAGCRN can model the available parking space volume more accurately and efficiently.

 

  7. Make some mention about the methodology for obtaining real-time data to feed the forecast models.

Response: Thank you. In the field of traffic flow prediction, there is currently no method to obtain real-time data to supply prediction models, and most of the literature uses historical data to train models.

 

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Abstract: The abstract provides a good overview but some results of the study as well as limitations will enhance the abstract.

Introduction: Please explain the main advantages of your approach to the previous methods dealing with this issue.

Literature Review

The literature review could be extended with more references and organized better to differentiate more clearly between different types of methods (e.g., statistical, machine learning, graph neural networks).

Methodology

It is not explicitly stated why, besides other forecasting models, the authors opted for the convolutional recursive network.

It is not clear how many parking spaces were considered, i.e. how many in this case make up the sample for which the models were applied.

When creating the model, I believe that it will not be clear to the readers which data is the input and which is the output of the model. It is recommended that the authors explicitly state which data and to what extent represent the inputs and outputs of the model.

Given that the exact sample, i.e. volume of data, was not specified, on what basis did you define the data sets for training, test, and validation as a ratio of 60 - 20 - 20?

Why did the authors not include parameters such as accuracy, precision, and percentage of correct and incorrectly predicted data when presenting the results? Also, in this way, it would be indicated how well the models can accurately predict the data, which is not the case here.

The authors did not specify in which software they created the model.

 

Conclusion

The discussion should delve deeper into the implications of the findings and potential applications in smart cities, including potential limitations.

Author Response

  1. Abstract: The abstract provides a good overview but some results of the study as well as limitations will enhance the abstract.

Response: Thank you. We have revised the Abstract. In the revised manuscript, we denote revised sections in blue font.

 

  2. Introduction: Please explain the main advantages of your approach to the previous methods dealing with this issue.

Response: Thank you. In lines 60-70, we introduce the advantages of our model. In addition, we describe this advantage in detail in Methodology and Experiment.

 

  3. Literature Review: The literature review could be extended with more references and organized better to differentiate more clearly between different types of methods (e.g., statistical, machine learning, graph neural networks).

Response: Thank you. We have modified this section.

 

  4. Methodology: It is not explicitly stated why, besides other forecasting models, the authors opted for the convolutional recursive network.

Response: Thank you. In addition to spatial dependencies, parking data also have complex temporal patterns, such as periodicity and correlation between consecutive time steps. Li et al. replaced the linear layers in the GRU with a static graph convolution to capture spatiotemporal features simultaneously [1]. As shown in Figure 4, this paper uses a parameter-learning graph convolutional recurrent module, which integrates a dynamic parameter learning module with a GRU, as the basic unit of spatiotemporal modeling. It replaces the linear layers with a dynamic graph convolution to learn the dynamic associations between nodes.

[1] Li, Yaguang, et al. "Diffusion convolutional recurrent neural network: Data-driven traffic forecasting." arXiv preprint arXiv:1707.01926 (2017).
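The idea of replacing the linear layers of a GRU with a graph convolution can be sketched in plain Python as below. This is a simplified illustration under stated assumptions, not the authors' module: it uses a single one-hop graph convolution of the form A·X·W with a fixed adjacency `adj`, omits the dynamic parameter learning described in the response, and follows one common gate convention. All names and shapes are hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def graph_layer(adj, feats, weight, bias, activation):
    """activation(adj @ feats @ weight + bias), applied element-wise."""
    mixed = matmul(matmul(adj, feats), weight)
    return [[activation(v + b) for v, b in zip(row, bias)] for row in mixed]


def graph_gru_cell(adj, x, h, p):
    """One recurrent step; x is N x C, h is N x H, each weight is (C + H) x H."""
    xh = [xi + hi for xi, hi in zip(x, h)]  # concatenate input and state per node
    z = graph_layer(adj, xh, p["Wz"], p["bz"], sigmoid)  # update gate
    r = graph_layer(adj, xh, p["Wr"], p["br"], sigmoid)  # reset gate
    x_rh = [xi + [rv * hv for rv, hv in zip(ri, hi)] for xi, ri, hi in zip(x, r, h)]
    cand = graph_layer(adj, x_rh, p["Wh"], p["bh"], math.tanh)  # candidate state
    # Blend the previous state and the candidate, gated by z.
    return [[zv * hv + (1.0 - zv) * cv for zv, hv, cv in zip(zi, hi, ci)]
            for zi, hi, ci in zip(z, h, cand)]
```

Because every gate sees `adj @ features` rather than the raw features, each node's hidden state is updated from its neighbours' inputs and states as well as its own, which is how spatial and temporal dependencies are captured in one unit.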

 

  5. It is not clear how many parking spaces were considered, i.e. how many in this case make up the sample for which the models were applied.

Response: Thank you. This paper aims to explore the spatiotemporal characteristics of multiple parking lots. Therefore, one parking lot is used as a node in the graph. The specific description of the dataset is shown in Table 1.

 

  6. When creating the model, I believe that it will not be clear to the readers which data is the input and which is the output of the model. It is recommended that the authors explicitly state which data and to what extent represent the inputs and outputs of the model.

Response: Thank you. The historical data of available parking spaces in multiple parking lots is the input of the model, and the prediction results are the output of the model.

 

  7. Given that the exact sample, i.e. volume of data, was not specified, on what basis did you define the data sets for training, test, and validation as a ratio of 60 - 20 - 20?

Response: Thank you. The specific description of the data is shown in Table 1. One observation is recorded every 5 minutes, so there are 288 observations per day. In the field of traffic flow prediction, the data are usually divided into a training set, a test set, and a validation set in a 6:2:2 ratio.
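A 6:2:2 split of a time series can be sketched as follows. The record does not say how the authors order the three subsets, so the chronological train/validation/test order used here is an assumption, as are the function name and the ten-day example length.

```python
def chronological_split(series, ratios=(6, 2, 2)):
    """Split a time-ordered sequence into train/validation/test without shuffling."""
    total = sum(ratios)
    n = len(series)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return series[:train_end], series[train_end:val_end], series[val_end:]


# Ten days of 5-minute observations (288 per day), represented by their indices.
observations = list(range(288 * 10))
train, val, test = chronological_split(observations)
```

For this example the three subsets contain 1728, 576, and 576 observations. Keeping the order chronological means the model is always evaluated on data later than anything it was trained on.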

 

  8. Why did the authors not include parameters such as accuracy, precision, and percentage of correct and incorrectly predicted data when presenting the results? Also, in this way, it would be indicated how well the models can accurately predict the data, which is not the case here.

Response: Thank you for your suggestion. In prediction tasks, MAE, RMSE, and MAPE are the most common and practical evaluation metrics, and they reflect the quality of the model. In addition, these three metrics are the ones generally used in the field of traffic flow prediction. Therefore, this paper chooses MAE, RMSE, and MAPE as evaluation metrics.
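The three metrics named in this response are regression error measures, which is why they apply to predicted counts of available spaces where classification measures such as accuracy and precision do not. A minimal sketch of their standard definitions follows; the function names and sample values are illustrative, and skipping zero targets in MAPE is an assumed convention, since the record does not say how the authors handle them.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, skipping targets equal to zero."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


actual = [10, 20, 30]      # observed available spaces (made-up values)
predicted = [12, 18, 33]   # model output for the same time steps
scores = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are expressed in parking spaces, with RMSE penalising large errors more heavily, while MAPE is a percentage and so is comparable across parking lots of different sizes.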

 

  9. The authors did not specify in which software they created the model.

Response: Thank you. We describe the experimental setup in Section 5.1, where our model is constructed based on the PyTorch framework.

 

  10. Conclusion: The discussion should delve deeper into the implications of the findings and potential applications in smart cities, including potential limitations.

Response: Thank you. We have modified this section.

 

Author Response File: Author Response.pdf

Round 2

Reviewer 4 Report

Comments and Suggestions for Authors

 

Introduction: Please explain the main advantages of your approach to the previous methods dealing with this issue.

Authors did not make requested changes.

Literature Review

The literature review could be extended with more references and organized better to differentiate more clearly between different types of methods (e.g., statistical, machine learning, graph neural networks).

Authors did not make requested changes.

 

 

Methodology

It is not clear how many parking spaces were considered, i.e. how many in this case make up the sample for which the models were applied.

 

Why were Singapore, Zurich, and Guangzhou taken into account? I believe there is a significant reason for this that should be highlighted. Do these cities have a problem with the occupancy of parking spaces? It is necessary to devote one section to the description of the area from the aspect of parking, the data of which were selected for further model development. Also, it is unclear which parking spaces are considered, are they street or off-street?

 

Are you comparing your model with models whose results you took from other references or did you create them yourself? Did all the models use the same data covering the same period and the same cities?

 

When creating the model, I believe that it will not be clear to the readers which data is the input and which is the output of the model. It is recommended that the authors explicitly state which data and to what extent represent the inputs and outputs of the model.

 

What do the prediction results mean? Is it the number of free parking spaces in a given time interval? Please provide specific data for the output of the model.

 

Given that the exact sample, i.e. volume of data, was not specified, on what basis did you define the data sets for training, test, and validation as a ratio of 60 - 20 - 20?

 

It is necessary to point out why exactly that ratio was taken, considering that with smaller samples, a larger share of the data is taken for training to obtain better results. Have you tried a different ratio (e.g., 7:2:1), or did this one give you the best results?

 

 

Why did the authors not include parameters such as accuracy, precision, and percentage of correct and incorrectly predicted data when presenting the results? Also, in this way, it would be indicated how well the models can accurately predict the data, which is not the case here.

 

Chapter 5.4.: Why are the data shown only for Zurich? How do the displayed algorithms work? Did those studies use the same data that you used? It is not clear where the results were obtained, as well as where the algorithms were implemented. Before presenting Table 4, it is necessary to explain the above.

 

 

Part of the authors' answer: “Therefore, this paper chooses MAE, RMSE, and MAPE as evaluation indicators.”

All right, but you have not specified whether the reported results refer to the model's performance on the test data set or on the training data set. Model performance can vary significantly depending on the data set, which is where underfitting or overfitting can best be seen. Please specify which results are reported.

 

 

Conclusion

The discussion should delve deeper into the implications of the findings and potential applications in smart cities, including potential limitations.

The conclusion is very brief, and apart from the description of the model you applied, there is no reference to the data used, the cities, or any discussion. It would be important to compare your results, through discussion, with the results of other authors who used a similar or the same model, to point out similarities and differences. It would be good to point out the performance of the model in the conclusion, to highlight its predictive ability. It is necessary to extend the conclusion accordingly.

 

 

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 4 Report

Comments and Suggestions for Authors

I have no additional requests for manuscript improvement.
