Peer-Review Record

Regression Machine Learning Models for the Short-Time Prediction of Genetic Algorithm Results in a Vehicle Routing Problem

World Electr. Veh. J. 2024, 15(7), 308; https://doi.org/10.3390/wevj15070308

by Ivan Kristianto Singgih and Moses Laksono Singgih

Reviewer 1: Nikolay Shilov

Reviewer 2: Anonymous

Submission received: 29 June 2024 / Revised: 4 July 2024 / Accepted: 8 July 2024 / Published: 14 July 2024

Round 1

Reviewer 1 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

The authors have addressed most of the comments. The paper has been improved and now it looks more specific and sound.

However, I still have a concern regarding the relevance of the paper to the journal. Please justify the relevance of the word "capacitated" in your "capacitated vehicle routing problem". How does it differ from the regular "vehicle routing problem", taking into account that regular vehicles also have limited range and their routing has to be economical and environment-friendly (so it is not just a shorter range)?

In the abstract you write "This study allows predicting the objective of the vehicle routing problem directly" - this sentence also eliminates the boundary between the "vehicle routing problem" and "capacitated vehicle routing problem".

Author Response

Comment:

Response:

We have replaced every occurrence of “capacitated vehicle routing problem” with “vehicle routing problem”.

Reviewer 2 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

The article presents a framework named Operations Research Problem Solving using Machine Learning (OpReMaL) for predicting the objective values of Capacitated Vehicle Routing Problems (CVRPs) solved using Genetic Algorithms (GAs). This method aims to reduce computational time by replacing the initial solution generation with regression machine learning models. The paper thoroughly explains the proposed framework, the regression models used, and the results of the numerical experiments.

The OpReMaL framework is a novel approach that combines operations research and machine learning to solve CVRPs efficiently. This is a significant contribution to the field, as it addresses the computational challenges associated with large-sized problems.

The methodology is well-detailed, including the description of the regression models, the GA for CVRP, and the step-by-step process of the OpReMaL framework.

The article provides extensive numerical experiments, evaluating different regression models and comparing their performance based on Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE).
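For readers less familiar with the three error metrics named above, the following minimal sketch shows how they are computed from GA objective values and model predictions. The function names and the sample numbers are illustrative assumptions, not values or code from the manuscript.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical numbers for illustration only (not taken from the paper).
ga_objectives = [100.0, 200.0, 300.0]   # objective values obtained by the GA
predictions = [110.0, 190.0, 330.0]     # objective values predicted by a model

print(mae(ga_objectives, predictions))
print(mse(ga_objectives, predictions))
print(rmse(ga_objectives, predictions))
```

Because MSE and RMSE square the residuals, a single large miss weighs more heavily in them than in MAE, which is one reason the three metrics can rank models differently.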

Points for Improvement:

1)Some sections could benefit from clearer explanations. For instance, the process of selecting the best regression model for different sets of instances (Tables 4, 5, and 6) is critical but can be confusing. A more detailed walkthrough of these tables and their implications would enhance understanding.

2)While the paper claims that the proposed method is different from existing collaborative studies between operations research and machine learning, a more detailed comparison with specific studies would strengthen the argument. Highlighting specific advantages and potential drawbacks in comparison to other frameworks would provide a clearer context.

3)The scalability of the OpReMaL framework should be discussed in more detail. While the paper mentions that it significantly reduces computational time, it would be beneficial to understand its performance on much larger datasets or more complex CVRP scenarios.

4)Including a section on potential real-world applications or case studies could make the research more tangible. Demonstrating how this framework can be applied to actual logistics problems would add practical relevance.

5)Some of the figures, such as Figures 6, 7, and 8, could be enhanced for better readability. Using clearer labels and perhaps breaking down complex figures into simpler sub-figures could help in conveying the results more effectively.

6)The section on future work (p. 13, lines 375-385) is quite brief. Expanding on specific future research directions, such as exploring other types of routing problems or integrating more advanced machine learning techniques, would be valuable.

Minor comments:

1)Ensure there are no typographical errors throughout the document. For instance, "best-trained regression machine learning model" (p. 11, line 324) should be reviewed for grammar and clarity.

2)Maintain consistency in the terminology used for describing machine learning models and operations research methods. For example, terms like "metaheuristics," "genetic algorithms," and "regression models" should be used consistently to avoid confusion.

3)Ensure all references are up-to-date and correctly cited. For instance, the reference to "Scikit-learn 1. Supervised Learning" (p. 16, line 437) should include a more specific URL if possible and check the access date for accuracy.

Comments on the Quality of English Language

Moderate editing of the English language is required.

Author Response

Comment:

1)Some sections could benefit from clearer explanations. For instance, the process of selecting the best regression model for different sets of instances (Tables 4, 5, and 6) is critical but can be confusing. A more detailed walkthrough of these tables and their implications would enhance understanding.

Response:

An introduction about the experiments on different sets was presented as follows:

“When compared with the results in Figure 6, the best model is not always the same. It could be concluded that it is necessary to apply different regression machine learning models for each set of instances (Table 4). It would allow producing a better prediction, rather than considering instances from all sets simultaneously (Figure 6).”

We have added the following explanations at the end of Section 4:

“Experiments with MAE, MSE, and RMSE metrics show that the best models are the Random Forest Regression (for Set 1), the PoissonRegressor (for Set 2), the PoissonRegressor (for Set 3), and the RidgeCV (for Set 4).”
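The per-set selection described in this response can be sketched as picking, for each set of instances, the model with the lowest error. This is only an illustration of the idea under stated assumptions: the helper name `best_model_per_set`, the model labels, and all error values are hypothetical and do not reproduce the figures in Table 4.

```python
def best_model_per_set(errors):
    """Return the lowest-error model for each instance set.

    errors maps a set name to a dict of {model name: error value},
    where the error could be MAE, MSE, or RMSE.
    """
    return {
        set_name: min(model_errors, key=model_errors.get)
        for set_name, model_errors in errors.items()
    }


# Hypothetical error values, invented for this sketch.
example_errors = {
    "Set 1": {"model_A": 12.0, "model_B": 9.5},
    "Set 2": {"model_A": 4.0, "model_B": 7.25},
}

print(best_model_per_set(example_errors))
```

With these made-up numbers the winner differs between the two sets, which mirrors the authors' point that a single model chosen over all sets at once need not be the best choice for each individual set.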

Comment:

2)While the paper claims that the proposed method is different from existing collaborative studies between operations research and machine learning, a more detailed comparison with specific studies would strengthen the argument. Highlighting specific advantages and potential drawbacks in comparison to other frameworks would provide a clearer context.

Response:

We mentioned examples of each type in Section 1 as follows:

“The first framework applies machine learning techniques to predict input data for the operations research problem. An application is estimating the energy consumption of electric vehicles on different paths and routes before solving the routing problem [1]. Another application is clustering flights based on the similarity in the working crews before solving the flight connecting optimization problem [2]. The last example is predicting the demand for cash transportation between bank branches based on historical data and calendar information before determining the transportation schedules [3]. A review of this first framework can be observed in [4].

The second framework applies operations research techniques to optimize machine learning method results. The examples are: (1) using differential flower pollination metaheuristic to optimize hyper-parameters of the support vector machine model for an image processing-based pavement condition observation [5] and (2) using firefly algorithm to optimize hyper-parameters in a support vector regression machine learning model used for predicting building energy consumption level [6]. A recent review of this type of study is presented in [7]. It shows that such research with such a framework is still rare.

The third framework applies machine learning models to improve the quality of operations research models. The first category in this framework is using machine learning methods (e.g. reinforcement learning) to find the best operator in metaheuristics, as stated in a recent review [7]. The second category in this framework is using machine learning to improve the quality of operations research methods. Two examples are (1) using a decision tree to differentiate bad and good vehicle routing problem solutions [8], and (2) using machine learning techniques to select bins in a stochastic bin packing problem considering various features (the bin’s capacity, the reduced cost, and variable values in the relaxed version of the optimization problem) [9].”

In Section 1, we also explained our proposed collaboration framework, which differs from all approaches above:

“Despite the continuous growth in machine learning studies in various fields and the development of numerous operations research techniques, collaboration between machine learning and operations research fields is still in a growing phase. As mentioned in [10], most of the proposed machine learning methods have not yet been applied to solve vehicle routing problem variants, one of the most studied topics in operations research.

Machine learning was used by Arnold and Sörensen [8] to extract important features and develop a problem-specific decision tree. It opened an opportunity to design heuristics with good knowledge of the studied problem. However, their approach still required developing and running the heuristics. Different from all of the three frameworks above, this study introduced a more general framework that could be used to predict the results of a solution method given an operations research problem without running an operations research algorithm. Such a situation is required when the decision-makers need to predict the system’s behavior without waiting for long computational times. The prediction is important before making any related decisions. As an example, after the decision-makers predict the total travel times of the trucks, they could measure how much energy (gasoline or electricity) is spent for the deliveries and possibly solve another follow-up optimization problem, e.g., (1) determining the number of energy supply centers to locate within the area and (2) allocating the trucks to the energy supply centers, to ensure the trucks run smoothly.”

Comment:

3)The scalability of the OpReMaL framework should be discussed in more detail. While the paper mentions that it significantly reduces computational time, it would be beneficial to understand its performance on much larger datasets or more complex CVRP scenarios.

Response:

We have added the following explanations in Section 4:

“Using any regression machine learning model, an objective value prediction of any instance could be conducted within less than one second. When compared with the computational time presented in Table 1, objective value prediction using the regression machine learning model is up to 1,800 times faster than when the objective values are calculated using the GA, especially when dealing with large-sized problems.”
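A speed-up factor of this kind is simply the ratio of the solver's running time to the prediction time. The sketch below shows one way such a ratio might be measured; the helper names `timed` and `speedup` are assumptions for illustration, and the manuscript does not state how its timings were collected.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def speedup(solver_seconds, predictor_seconds):
    """How many times faster the predictor is than the solver."""
    if predictor_seconds <= 0:
        raise ValueError("predictor_seconds must be positive")
    return solver_seconds / predictor_seconds


# Timing an arbitrary stand-in function (not a GA or a regression model).
result, elapsed = timed(sum, [1, 2, 3])
print(result, elapsed)
```

Note that this ratio covers only the per-instance prediction step. It does not include the time needed to generate training data or to fit the model.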

Comment:

4)Including a section on potential real-world applications or case studies could make the research more tangible. Demonstrating how this framework can be applied to actual logistics problems would add practical relevance.

Response:

We have added the following explanations in Section 5:

“We considered up to 700 customers in the numerical experiments, which is larger than the size of the real problem (e.g., 385 customers in [32]). It shows that the proposed method could deal with real-world problems effectively.”

Comment:

5)Some of the figures, such as Figures 6, 7, and 8, could be enhanced for better readability. Using clearer labels and perhaps breaking down complex figures into simpler sub-figures could help in conveying the results more effectively.

Response:

We have enlarged Figures 6-8.

We have added Table 4 which lists all the MAE, MSE, and RMSE values for clarity.

Comment:

6)The section on future work (p. 13, lines 375-385) is quite brief. Expanding on specific future research directions, such as exploring other types of routing problems or integrating more advanced machine learning techniques, would be valuable.

Response:

In the Conclusions section, we suggested some routing topics. However, we have added some more problem types and revised the statement as follows:

“For future studies, it is interesting to observe more operations research problems (e.g., location routing problem [32], routing problem for shared logistics [33], electric vehicle relocation problem [14], multi-altitude drone routing problem for post-disaster observation [34]), more solution methods (e.g., beetle swarm optimization [35], hybrid metaheuristics [36,37]), and show how the proposed OpReMaL framework could also obtain good solutions.”

Reviewer 2’s suggestion on the use of more advanced machine learning techniques has been incorporated into the Conclusions section as follows:

“Next studies could also consider testing more advanced machine learning techniques, e.g., ensemble machine learning models [38].”

Comment:

Minor comments:

1)Ensure there are no typographical errors throughout the document. For instance, "best-trained regression machine learning model" (p. 11, line 324) should be reviewed for grammar and clarity.

Response:

The sentence has been revised. The whole manuscript has been checked.

Comment:

2)Maintain consistency in the terminology used for describing machine learning models and operations research methods. For example, terms like "metaheuristics," "genetic algorithms," and "regression models" should be used consistently to avoid confusion.

Response:

The term “metaheuristics” is used in the introduction and in the explanation of the proposed framework, whereas “genetic algorithm” is used in the case study. We prefer to keep this distinction, because it gives readers a broader understanding of the proposed framework and its potential application to other problems. We have used the same term, “regression machine learning model”, throughout the manuscript.

Comment:

3)Ensure all references are up-to-date and correctly cited. For instance, the reference to "Scikit-learn 1. Supervised Learning" (p. 16, line 437) should include a more specific URL if possible and check the access date for accuracy.

Response:

We have revised the links for all website references.

Comment:

Moderate editing of the English language is required.

Response:

We have carefully checked and revised the whole manuscript.

Round 2

Reviewer 2 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

The authors have addressed the comments well.

I have no additional remarks on the revised version.

Comments on the Quality of English Language

Minor editing of the English language is required.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents an approach to modelling a genetic algorithm via several regression models. The approach is demonstrated on an example routing problem.

It is absolutely unclear why the authors write about operation research. The modelling approach looks problem-agnostic. All text about operations research is irrelevant.

There is a similar concern related to the relevance of the capacitated vehicles to the presented approach. The routing problem considered equally applies to regular (gasoline-driven) vehicles.

There are some issues in the approach description.

When the authors present the work of the genetic algorithm, they do not present its results, so it is not possible to understand how good the regression models are. At the same time, the authors calculate MAE for the regression models. How did they get the ground truth for them? Did they solve the routing problem algorithmically? Do the regression models indeed model the GA results or just learn to solve the routing problem?

Comparing the problem-solving time is not fair. Regression models require training samples, which have to be calculated somehow (probably using the GA), which is time-consuming. The GA can solve a problem having only one task at hand. What happens if the routing problem structure changes? In the case of regression models, a new training set would be needed; in the case of the GA, the problem can be solved immediately.
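The timing concern above can be made concrete with a simple break-even calculation: how many new instances must be predicted before the upfront cost of generating training data with the GA is recovered. The sketch below is an illustration under simplifying assumptions that are not taken from the manuscript: every GA run takes the same time, the model is trained once, and all names and numbers are hypothetical.

```python
import math


def break_even_queries(n_train, ga_seconds, fit_seconds, predict_seconds):
    """Smallest number of new instances k for which

        n_train * ga_seconds + fit_seconds + k * predict_seconds
            < k * ga_seconds

    that is, the point where training once and then predicting becomes
    cheaper in total time than running the GA on every new instance.
    """
    if ga_seconds <= predict_seconds:
        raise ValueError("no break-even point unless prediction is faster than the GA")
    upfront = n_train * ga_seconds + fit_seconds
    saving_per_query = ga_seconds - predict_seconds
    # Smallest integer strictly greater than upfront / saving_per_query.
    return math.floor(upfront / saving_per_query) + 1


# Hypothetical numbers: 100 training instances, 60 s per GA run,
# 30 s to fit the model, 1 s per prediction.
print(break_even_queries(100, 60.0, 30.0, 1.0))  # 103
```

Under these assumed numbers the regression approach only pays off after more than about a hundred new instances. If the problem structure changes and the training set must be regenerated, the upfront cost is incurred again.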

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The article presents a novel framework called Operations Research Problem Solving using Machine Learning (OpReMaL), which integrates regression machine learning models to predict the outcomes of operations research problems, specifically focusing on the capacitated vehicle routing problem (CVRP) solved using genetic algorithms (GA).

The framework aims to reduce the computational time required to obtain high-quality solutions by using machine learning to predict objective values directly from input data, bypassing the need for running extensive optimization algorithms.

The article provides a detailed description of the methodology, including the genetic algorithm process, machine learning models used, and the specific characteristics of the problem instances.

Points to be addressed:

1)Conduct further validation of the OpReMaL framework on different operations research problems to demonstrate its applicability and robustness across various scenarios.

2)The article tests several regression models but does not provide a clear rationale for the selection of these specific models. Including a broader comparison with other advanced ML techniques could strengthen the findings.

3) Discuss the computational overheads of the initial data generation process and propose potential solutions to enhance scalability for large-scale real-world applications.

4) Include more examples and case studies to illustrate the framework’s application to other vehicle routing problems and different types of optimization challenges.

5)Incorporate additional performance metrics in the evaluation of regression models to offer a more comprehensive analysis of their predictive capabilities.

6)Expand the discussion to cover practical implications, limitations, and future research directions.

7)Conduct thorough proofreading to eliminate any grammatical or typographical errors.

Comments on the Quality of English Language

Moderate editing of the English language is required.

Author Response

Please see the attachment

Author Response File: Author Response.pdf
