Article
Peer-Review Record

Railway Freight Demand Forecasting Based on Multiple Factors: Grey Relational Analysis and Deep Autoencoder Neural Networks

Sustainability 2023, 15(12), 9652; https://doi.org/10.3390/su15129652
by Chengguang Liu 1, Jiaqi Zhang 2, Xixi Luo 2, Yulin Yang 3 and Chao Hu 1,*
Submission received: 9 May 2023 / Revised: 12 June 2023 / Accepted: 14 June 2023 / Published: 16 June 2023
(This article belongs to the Special Issue Future-Proofing Study in Sustainable Railway Transportation Systems)

Round 1

Reviewer 1 Report

The presented research, entitled "Railway Freight Demand Forecasting Based on Multiple Factors Grey Relational Analysis and Deep Auto Encoder Neural Networks", has a strong scientific justification, and the need for it is evident. By applying the factor analysis method, the research identifies key factors that can help railway company managers better understand how railway freight demand changes. The output of the main cargo types (coal, oil, grain) is shown to be a key factor, and railway locomotives and vehicles also have a significant influence on the demand trend. The paper is well structured and well written. The methodology is well presented, and the results are clear and represent the research well. My only personal remark concerns the excessive length of the manuscript, which makes it hard to hold the reader's attention, but this does not diminish the quality of the manuscript.

 

I propose that the paper be published with the following minor suggestions for corrections:

1.      The abstract is very well and sensibly written. I suggest that the aim of the research should be stated in one sentence in the abstract.

2.      The word "modelling" - UK, is more commonly used than the word "modeling" - US?

3.      It is recommended that the formula in line 326 be numbered with the ordinal number 11 for formulas.

4.      I suggest reducing the number of repetitions of certain words, sentences and parts of the text. For example, the following sentences are repeated two times (lines 289-293 and 361-364):  After dimensionless processing, the indicator values are all within the [0,1] range. This article combines all indicators in references [32-35] to construct an alternative set of influencing factors for railway freight demand prediction. We have divided it into three aspects for factor selection: macroeconomics, related industry output, and competitive environment.

5.      I suggest that the following sentence (lines 132-133) be corrected to read: Research [31] established a semi-supervised learning DAE model considering label constraints for classification tasks.

6.      The sentence in lines 86-87 seems confusing to those who do not know the black box method and its meaning. I suggest that the same sentence be corrected to read: However, the black box (of every unexplored object) characteristic of deep neural networks (DNN) results in poor interpretability of prediction results.

7.      First, it is necessary to mention Figure 7 in the text and then show it.

8.      The presentation and quality of the references are good, but they need to be tidied and formatted consistently.

9.      It would be valuable to tell future readers something about the reliability of rail transport. In the introductory part, I suggest that the authors say more about the reliability of railway transport, i.e., availability, sustainability, security and competitiveness. I suggest the authors consult the following literature: https://doi.org/10.46793/adeletters.2022.1.4.3 etc. It would also be worthwhile to point out the characteristics of China's transport, because China has the longest and most complex high-speed rail transport. These additions would significantly increase readers' interest.

 

 

Author Response

Response to Reviewer 1 Comments

Dear Reviewer,

Thank you for the comments concerning our manuscript entitled “Railway Freight Demand Forecasting Based on Multiple Factors Grey Relational Analysis and Deep Auto Encoder Neural Networks” (Manuscript ID: sustainability-2414936).

Those comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and have made corrections that we hope will meet with approval. The main revised portion is marked in red in the file named “Revised Manuscript-sustainability-2414936”. The main corrections in the paper and the responses to the comments are as follows:

Point 1: The abstract is very well and sensibly written. I suggest that the aim of the research should be stated in one sentence in the abstract.

 Response 1: We are grateful for the comment.

We have revised the expression of the research objectives in the abstract. This study combines Grey Relational Analysis (GRA) with Deep Neural Network (DNN) to propose a more interpretable method for predicting railway freight demand (see lines 17-18). At the same time, it explains the role of the proposed method in the real world. The method proposed in this study not only enables accurate prediction of railway freight demand but also helps railway transportation companies better understand the key factors influencing demand changes. (see lines 30-32)

Point 2: The word "modelling" - UK, is more commonly used than the word "modeling" - US?

Response 2: We are grateful for the comment.

We have searched the entire manuscript and corrected all words. At the same time, we also checked and corrected similar problems.

Point 3: It is recommended that the formula in line 326 be numbered with the ordinal number 11 for formulas.

Response 3: We are grateful for the comment.

We have added the numbering to the formulas and provided explanations for the formula parameters. Due to the addition of the problem formulation section earlier in the manuscript, the formula numbering in this section has been reorganized as (12) (refer to lines 353-357).

Point 4: I suggest reducing the number of repetitions of certain words, sentences, and parts of the text. For example, the following sentences are repeated two times (lines 289-293 and 361-364):  After dimensionless processing, the indicator values are all within the [0,1] range. This article combines all indicators in references [32-35] to construct an alternative set of influencing factors for railway freight demand prediction. We have divided it into three aspects for factor selection: macroeconomics, related industry output, and competitive environment.

Response 4: We are grateful for the comment.

Indeed, the occurrence of identical sentences in the manuscript is inappropriate. We appreciate the reviewer for pointing out this problem. We have made the necessary correction to the second statement as follows: "The influencing factors related to railway freight demand forecasting, specifically in the aspects of macroeconomic conditions, relevant industry output, and competitive environment, were selected according to the method described in Section 2.3." (Modification can be found in lines 389-392).
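The dimensionless processing quoted in Point 4 maps every indicator onto the [0,1] range. One common way to do this is min-max scaling; the sketch below assumes that choice, since this record does not reproduce the manuscript's exact formula. The function name and the sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale a sequence of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no variation to rescale; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical indicator values, for illustration only (not data from the paper).
example = [120.0, 150.0, 180.0, 240.0]
scaled = min_max_scale(example)
```

After scaling, the smallest observation of each indicator becomes 0 and the largest becomes 1, so indicators measured in different units can be compared directly.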

Point 5: I suggest that the following sentence (lines 132-133) be corrected to read: Research [31] established a semi-supervised learning DAE model considering label constraints for classification tasks.

Response 5: We are grateful for the comment.

We have made the following modification to the statement: "Research [34] established a semi-supervised learning DAE model considering label constraints for classification tasks." (Modification can be found in lines 142-143).

Point 6: The sentence in lines 86-87 seems confusing to those who do not know the black box method and its meaning. I suggest that the same sentence be corrected to read: However, the black box (of every unexplored object) characteristic of deep neural networks (DNN) results in poor interpretability of prediction results.

Response 6: We are grateful for the comment.

We have revised the statement as suggested by the reviewer: " However, the black box (of every unexplored object) characteristic of deep neural networks (DNN) results in poor interpretability of prediction results." (Modification can be found in lines 89-91).

Point 7: First, it is necessary to mention Figure 7 in the text and then show it.

Response 7: We are grateful for the comment.

We have placed the mention of Figure 7 above the figure, as suggested. Previously, to minimize white space, we placed some of the mentions after the figures and tables. We have optimized the layout of the manuscript to ensure that each figure or table is mentioned before it appears.

Point 8: The presentation and quality of the references are good, but they need to be tidied and formatted consistently.

Response 8: We are grateful for the comment.

We have rechecked all the references and completed the missing information for each reference. Additionally, we have edited the reference format according to the requirements of the journal.

Point 9: It would be valuable to tell future readers something about the reliability of rail transport. In the introductory part, I suggest that the authors say more about the reliability of railway transport, i.e., availability, sustainability, security, and competitiveness. I suggest the authors consult the following literature: https://doi.org/10.46793/adeletters.2022.1.4.3 etc. It would also be worthwhile to point out the characteristics of China's transport, because China has the longest and most complex high-speed rail transport. These additions would significantly increase readers' interest.

Response 9: We are grateful for the comment.

We have supplemented the discussion on the reliability of railway transportation in the introduction section. The reviewer's suggestion is highly valuable for further elucidating the research significance of this paper. The objective of this study is to enhance the reliability of service supply through an accurate prediction method for railway freight demand (modification made in lines 40-50). Additionally, we have further elaborated on the impact of China's high-speed rail on the freight capacity of existing lines (modification made in lines 50-52). Finally, we have appropriately cited the corresponding references about the added content (modification made in reference [1]).

 

Special thanks to you for your good comments.

We sincerely appreciate your kind work. We have tried our best to revise our manuscript according to the comments, and we hope that the corrections will meet with approval. Once again, thank you very much for your comments and suggestions.

 

Yours sincerely,

Chao Hu

Central South University

Author Response File: Author Response.docx

Reviewer 2 Report

This paper presents a railway freight demand forecasting method based on a combination of machine learning methods. Although the topic is interesting, there are a few issues that should be addressed:

1. There are a number of freight types in practice, and different types of goods have distinct transport demand. As such, the freight demand forecasting may be undertaken for different types of goods.

2. There exist a number of prediction methods for traffic demand. The comparison is not comprehensive enough. It would be better to compare the proposed method to more existing methods.

3. The literature review is not comprehensive enough. Some recent studies are missing, for example: Predicting Bus Passenger Flow and Prioritizing Influential Factors Using Multi-Source Data: Scaled Stacking Gradient Boosting Decision Trees. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(4), 2510-2523.

The language is good but it can be improved.

Author Response

Response to Reviewer 2 Comments

 

Dear Reviewer,

Thank you for the comments concerning our manuscript entitled “Railway Freight Demand Forecasting Based on Multiple Factors Grey Relational Analysis and Deep Auto Encoder Neural Networks” (Manuscript ID: sustainability-2414936).

Those comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and have made corrections that we hope will meet with approval. The main revised portion is marked in red in the file named “Revised Manuscript-sustainability-2414936”. The main corrections in the paper and the responses to the comments are as follows:

Point 1: There are several freight types in practice, and different types of goods have distinct transport demands. As such, freight demand forecasting may be undertaken for different types of goods.

 Response 1: We are grateful for the comment.

The reviewer's suggestions have greatly helped us improve the quality of our research. In the real world, railway transportation encompasses a wide range of sources of goods. These goods can be categorized into various types and subdomains, each with distinct requirements regarding transportation conditions, time constraints, and cost-effectiveness. As a result, decision-makers may need both a comprehensive understanding of overall changes in freight demand and specific insights into the demand fluctuations for each category of goods. In this study, we propose a railway freight demand prediction method that incorporates explanatory variables. This method can be used not only to forecast overall freight demand but also, by adjusting the prediction target and selecting different explanatory variables, to predict the demand for specific categories of goods. We have supplemented this aspect in our manuscript (lines 567-573), and it also serves as a direction for our future work. In future research, using finer-grained data may require the model to be fine-tuned, and including flow data of core explanatory variables among goods regions may provide more detailed prediction results (lines 616-619).

 

Point 2: There exist several prediction methods for traffic demand. The comparison is less comprehensive. It would be better to compare the proposed method to more existing methods.

Response 2: We are grateful for the comment.

This modification suggestion is crucial for improving the reliability of our experimental study. Comprehensive model comparison and evaluation are essential for validating the effectiveness and advancement of the proposed model. We conducted another round of literature review on existing traffic demand forecasting methods, specifically focusing on demand prediction methods for goods transportation with time series characteristics. Additionally, we included two baseline models (Feedforward Neural Networks and General Recurrent Neural Networks) in the comparative experiments.

The current set of seven comparative experiments includes the classical ARIMA model based on statistical principles, the SVR model based on both statistics and machine learning, as well as the widely researched deep learning and artificial neural network models. Since our proposed method is based on artificial neural networks, we focused more on the selection of comparative models in this aspect. For the comparative experiments of artificial neural network models, we selected models that have demonstrated good performance in time series data prediction tasks, namely GRU, FC-LSTM, DNN (closest to the proposed DAE-NN), FNN, and GRNN.

Of course, it is not possible to exhaustively cover all model comparisons, but we strived to ensure the comprehensiveness of the comparative experiments while maintaining a certain focus (supplementary comparative experiments are detailed in Section 3.3.3). Furthermore, we supplemented the ablation experiments to verify the GRA module (covered in Section 3.3.4).

 

Point 3: The literature review is less comprehensive. Some recent studies are missing, to name a few, Predicting Bus Passenger Flow and Prioritizing Influential Factors Using Multi-Source Data: Scaled Stacking Gradient Boosting Decision Trees. IEEE Transactions on Intelligent Transportation Systems, 2021, 22(4), 2510-2523.

Response 3: We are grateful for the comment.

We have supplemented the literature review with recent studies (detailed in lines 100-106 of the manuscript and in references [17] and [18]):

[17] Predicting peak load of bus routes with supply optimization and scaled Shepard interpolation: A newsvendor model. Transportation Research Part E-Logistics and Transportation Review. 2020, 142: 102041.

[18] Predicting Bus Passenger Flow and Prioritizing Influential Factors Using Multi-Source Data: Scaled Stacking Gradient Boosting Decision Trees. IEEE Transactions on Intelligent Transportation Systems. 2021, 22(4), 2510-2523.

Under the condition of multiple data sources, scholars have used the newsvendor model [17] and the decision tree method [18] to predict bus passenger flow. These methods demonstrate advantages in terms of prediction accuracy and stability. They also provide stronger explanatory power in determining the relative contribution and priority of influencing factors to the prediction. These studies provide useful reference and support for this study.

 

Point 4: The language is good but it can be improved.

Response 4: We are grateful for the comment.

We have consulted native English readers, and the whole manuscript has been polished accordingly. We hope the revised version meets the English presentation standard. All modifications are marked in red in the manuscript.

 

Special thanks to you for your comments.

We sincerely appreciate your kind work. We have tried our best to revise our manuscript according to the comments, and we hope that the corrections will meet with approval. Once again, thank you very much for your comments and suggestions.

 

Yours sincerely,

Chao Hu

Central South University

Author Response File: Author Response.docx

Reviewer 3 Report

This paper reads well. I have the following concerns: 

1. There is no ablation study on the GRA. If you wish to demonstrate GRA is useful, then you should compare the prediction performance w/ and w/o the GRA module. 

2. It's unclear how the experimental results support the claims made in this paper. 

3. The problem formulation is missing, making it hard for readers to identify the problem setup. 

4. The size of the training data is unclear, making it hard to identify if there is any overfitting. 

Author Response

Response to Reviewer 3 Comments

 

Dear Reviewer,

 

Thank you for the comments concerning our manuscript entitled “Railway Freight Demand Forecasting Based on Multiple Factors Grey Relational Analysis and Deep Auto Encoder Neural Networks” (Manuscript ID: sustainability-2414936).

 

Those comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and have made corrections that we hope will meet with approval. The main revised portion is marked in red in the file named “Revised Manuscript-sustainability-2414936”. The main corrections in the paper and the responses to the comments are as follows:

 

Point 1: There is no ablation study on the GRA. If you wish to demonstrate GRA is useful, then you should compare the prediction performance w/ and w/o the GRA module.

 Response 1: We are grateful for the comment.

To verify the effectiveness of the proposed model, we have supplemented ablation experiments (supplementary experiments are detailed in Section 3.3.3). In the ablation experiment, we trained and tested models with and without the GRA module on the same dataset. The prediction model without the GRA module was unable to perform selection on the initial set of railway freight demand-related factors. Therefore, we used all 22 variables as explanatory inputs for the model training. The results show that the GRA-DAE-NN model demonstrates higher prediction accuracy compared to the DAE-NN model. In terms of fitting historical data, the DAE-NN model only exhibits smaller errors for the years 2004, 2005, 2007, and 2009.

The main difference between these two models lies in whether the GRA module is used to filter the explanatory variables. Through analysis, it was found that the explanatory variables excluded by GRA have a non-significant contribution to railway freight demand. From the perspective of the railway transportation industry, the main source of railway freight in China is bulk goods. Retail goods and express deliveries are less commonly transported by railways. In recent years, most fixed asset investments have been focused on the construction of high-speed railway lines, which are primarily used for passenger transportation. The number of railway employees includes multiple departments such as operations, mechanical, engineering, electrical, and vehicles, and it does not directly reflect the operational situation of the freight department. The screening results of the GRA module align with the actual situation of railway transportation. During the training of the prediction model, irrelevant features and noise can easily interfere, impacting the accuracy of the model.  Therefore, the ablation experiment validates the importance of the GRA module in the GRA-DAE-NN model. (lines 530-546)
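The screening step discussed in this response can be illustrated with a minimal sketch. It assumes the standard Deng grey relational grade with distinguishing coefficient rho = 0.5 and a simple grade threshold for selection. The function names, the threshold value, and the toy series are hypothetical; this record does not state the manuscript's exact GRA variant or cutoff.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Return {name: grade} for each factor series against the reference.

    All series are assumed to be already scaled to a common range.
    rho is the distinguishing coefficient (0.5 is a conventional default).
    """
    # Absolute difference between the reference and each factor, point by point.
    deltas = {
        name: [abs(r - x) for r, x in zip(reference, series)]
        for name, series in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor coincides with the reference.
        return {name: 1.0 for name in factors}
    grades = {}
    for name, ds in deltas.items():
        # Grey relational coefficient at each point, then its mean (the grade).
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


def select_factors(grades, threshold):
    """Keep factors whose grade reaches the (assumed) threshold, best first."""
    kept = [name for name, g in grades.items() if g >= threshold]
    return sorted(kept, key=lambda name: grades[name], reverse=True)


# Toy, already-normalized series (hypothetical, not data from the paper).
reference = [0.0, 0.5, 1.0]          # stands in for freight demand
factors = {
    "factor_a": [0.0, 0.5, 1.0],     # moves exactly with the reference
    "factor_b": [1.0, 0.5, 0.0],     # moves against the reference
}
grades = grey_relational_grades(reference, factors)
selected = select_factors(grades, threshold=0.6)
```

In an ablation of the kind described above, the model would be trained once on the `selected` factors and once on all candidate factors, with everything else held fixed, so that any difference in error can be attributed to the screening step.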

 

Point 2: It's unclear how the experimental results support the claims made in this paper.

Response 2: We are grateful for the comment.

This question indicates that the presentation of experimental results and conclusions in the manuscript was not sufficiently refined (lines 27-32, 164-171, 508-510, 521-527, 594-608, and 607-612). We have made revisions to these sections and corrected some statements in the abstract and introduction. The main objective of this study is to achieve an accurate prediction of railway freight demand and to attempt to interpret the prediction results from the perspective of transportation organizations, emphasizing both predictive accuracy and interpretability. The discussion on the accuracy of the proposed prediction method is mainly focused on Sections 3.3.2-3.3.4, which include the analysis of research results, comparative experiments, and the newly added ablation experiments. Under the same dataset conditions, the proposed model exhibits smaller prediction errors. The interpretability of the prediction results is demonstrated through the GRA selection of explanatory variables and the comparative analysis between explanatory variables and demand trends. We have elaborated on the correlation evaluation and ranking of explanatory variables in the latter part of the ablation experiments (Section 3.3.1) as well as the analysis of explanatory variables in Section 3.3.5. Additionally, we have also made corrections to sentences in the abstract and conclusion of the manuscript, using more rigorous language.

 

Point 3: The problem formulation is missing, making it hard for readers to identify the problem setup.

Response 3: We are grateful for the comment.

This suggestion helped us improve the structure of the manuscript. We have added Section 2.1, where we elaborate on the problem formulation of this study and how we abstract real-world prediction problems into time series modeling and regression analysis forecasting.

Accurately predicting freight demand is of great significance in the current railway transportation industry for optimizing resource allocation, improving service reliability, and promoting the sustainable development of the railway transport sector. The problem addressed in this study is to propose a comprehensive and interpretable method for predicting railway freight demand using multiple data sources, aiming to meet the needs of railway transport enterprises for accurate prediction and resource planning. The problem of railway freight demand forecasting can be formulated as a nonlinear modeling and regression problem. The model consists of explanatory variables, a prediction target, and the nonlinear relationship between them. The goal of GRA is to optimize the explanatory variables and the DAE-NN is to learn the nonlinear relationship from data samples for the railway freight demand prediction. (lines 181-200)
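The formulation described in this response can be written compactly. The following is only a sketch in notation introduced here for illustration; the symbols $y_t$, $x_{i,t}$, $S$, $f$ and $\varepsilon_t$ are assumptions and may differ from those used in Section 2.1 of the manuscript.

```latex
% y_t      : railway freight demand in period t (the prediction target)
% x_{i,t}  : i-th candidate explanatory variable in period t, i = 1, ..., m
% S        : index set of the explanatory variables retained by GRA
% f        : nonlinear relationship learned by the DAE-NN
% \varepsilon_t : residual error
\begin{align}
  S   &\subseteq \{1, \dots, m\}
      && \text{(factors retained by GRA)} \\
  y_t &= f\bigl(\{x_{i,t}\}_{i \in S}\bigr) + \varepsilon_t
      && \text{(regression learned from data samples)}
\end{align}
```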

 

Point 4: The size of the training data is unclear, making it hard to identify if there is any overfitting.

Response 4: We are grateful for the comment.

This suggestion reminded us of the potential risk of overfitting in the model. We have provided explanations of the data and model parameters in Sections 3.1 and 3.2. The data used in this study covers the period from 2000 to 2018 and includes 23 indicators related to railway freight. We divided the dataset into a training set and a test set. During model training, we used data from 2000 to 2017, and we then validated the model on the data from 2018 to assess its generalization ability. The results showed that the trained model maintained good predictive accuracy on the test set.
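The split described in this response can be sketched as follows, assuming one observation per year. The placeholder series, the function names, and the use of mean absolute percentage error (MAPE) as the error measure are assumptions for illustration; this record does not name the metric used in the manuscript.

```python
def split_by_year(records, last_train_year):
    """Split {year: value} records chronologically, without shuffling."""
    train = {y: v for y, v in records.items() if y <= last_train_year}
    test = {y: v for y, v in records.items() if y > last_train_year}
    return train, test


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed metric)."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Placeholder annual series for 2000-2018 (hypothetical, not data from the paper).
records = {year: 100.0 + 5.0 * (year - 2000) for year in range(2000, 2019)}
train, test = split_by_year(records, last_train_year=2017)

# With annual data this leaves 18 training points and a single test point.
n_train, n_test = len(train), len(test)
```

Under the one-observation-per-year assumption, the held-out set is a single year, so an error measured on it is a one-point estimate of generalization rather than an average over many test samples.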

 

Special thanks to you for your comments.

We sincerely appreciate your kind work. We have tried our best to revise our manuscript according to the comments, and we hope that the corrections will meet with approval. Once again, thank you very much for your comments and suggestions.

 

Yours sincerely,

Chao Hu

Central South University

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

I am happy with the response.

Reviewer 3 Report

The authors have addressed my concerns and comments. I suggest accepting the paper as it is.
