Article

A Comparative Study of Vehicle Velocity Prediction for Hybrid Electric Vehicles Based on a Neural Network

1 Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, China
2 Hubei Research Center for New Energy & Intelligent Connected Vehicle Engineering, Wuhan University of Technology, Wuhan 430070, China
3 Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, China
4 Hubei Longzhong Laboratory, Wuhan University of Technology, Xiangyang 441000, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(4), 575; https://doi.org/10.3390/math12040575
Submission received: 14 January 2024 / Revised: 11 February 2024 / Accepted: 12 February 2024 / Published: 14 February 2024

Abstract: Vehicle velocity prediction (VVP) plays a pivotal role in determining the power demand of hybrid electric vehicles, which is crucial for establishing effective energy management strategies and, subsequently, improving fuel economy. Neural networks (NNs) have emerged as a powerful tool for VVP due to their robustness and non-linear mapping capabilities. This paper presents a comprehensive exploration of NN-based VVP methods employing both qualitative theoretical analysis and quantitative numerical simulations. The methodology involved extracting key feature parameters for the model inputs using Pearson correlation coefficients and the random forest (RF) method. Subsequently, three distinct NN-based VVP models were constructed: a backpropagation neural network (BPNN) model, a long short-term memory (LSTM) model, and a generative pre-training (GPT) model. Simulation experiments were conducted to investigate various factors, such as the feature parameters, sliding window length, and prediction horizon, and the prediction accuracy and computation time were identified as the key performance metrics for VVP. Finally, the relationship between the model inputs and velocity prediction performance was revealed through various comparative analyses. This study not only facilitated the identification of an optimal NN model configuration to balance prediction accuracy and computation time, but also serves as a foundational step toward enhancing the energy efficiency of hybrid electric vehicles.

1. Introduction

Vehicle velocity prediction (VVP) has significant theoretical value and widespread use in many vehicular applications, especially for energy saving and security control in new energy-based intelligent connected vehicles. For example, short-term VVP information can be applied in the energy management strategies (EMSs) and adaptive cruise control of hybrid electric vehicles [1]. To improve fuel economy and driver safety, short-term VVP algorithms have been given top priority to enhance the effectiveness and viability of predictive energy management strategies (PEMSs). Predicting vehicle velocity in a timely and precise manner is of significant importance, providing useful instructions before implementing receding horizon control. However, the velocity of a vehicle is influenced by a variety of factors, including driving style, driving pattern, and traffic conditions, and accurate VVP remains one of the key bottlenecks for the application of PEMSs. Thus, VVP has become an important research focus related to PEMSs.
The mainstream VVP methods for PEMSs reported in the literature can be generally divided into two groups: stochastic approaches and deterministic approaches [2]. These two kinds of VVP methods predict the future vehicle velocity time series as a probability distribution interval and a single trajectory curve, respectively. A Markov chain (MC) is the most representative stochastic VVP algorithm [3]. A multi-level MC model is considered to be an effective measure for improving the prediction accuracy of an MC; however, the size of the transition probability matrix grows exponentially with additional model inputs, which leads to a high computational burden when using this algorithm, making it impossible to cover all potential Markov states [4,5]. Deterministic VVP algorithms can be further divided into parametric methods and non-parametric methods [6]. Parametric methods perform VVP by building models with various parameters, such as an auto-regressive and moving average model (ARMA), an exponential model (EM), or a gray model (GM). Parametric methods require pre-calibration of the model parameters for the target data. The randomness and non-linearity of a vehicle’s velocity hinder the parameter calibration process; accordingly, the predictive errors of parametric methods are higher than those of non-parametric methods [7]. In contrast, non-parametric methods are also known as data-driven methods, as they employ historical data to make predictions [8]. Data-driven methods are preferred when testing random driving cycles, and neural networks (NNs) have emerged as a powerful tool due to their robustness and non-linear mapping capabilities. Under the same or similar driving conditions, vehicle velocity changes are similar or even the same. Accurate VVP can be obtained by training an NN with a reasonable number of samples, modifying the weight ratio of the neuronal functions in the hidden layer and the output layer, storing the non-linear characteristics of velocity change in the NN model as a black box, and reproducing the non-linear characteristics in prediction behavior. Extensive and in-depth research has been conducted on different NN-based VVP methods. From the perspective of the network architecture, neural networks that have been most frequently studied in the field of VVP are commonly classified into four groups—namely, feedforward architectures, recurrent architectures, hybrid architectures, and attention mechanisms—as shown in Figure 1.
In a unidirectional feedforward architecture, information flows from the input layer to the output layer. Two widely used configurations that fall under this category are backpropagation neural networks (BPNNs) and radial basis function neural networks (RBFNNs). The prediction accuracy of a BPNN is greatly affected by its initial weights and thresholds, making it vulnerable to local minimum issues. An RBFNN only needs to adjust a few significant weights that have an impact on the output, and the activation function used by the hidden layer's nodes is a Gaussian radial basis function that is symmetric about the center, thus avoiding the local optimization problem and allowing faster training than a BPNN. Lin et al. [9] developed a BPNN-based velocity prediction method for the predictive control strategy of fuel cell electric vehicles. Xiang et al. [10] proposed a vehicle velocity predictor based on an RBFNN for real-time energy management, and their simulation results showed that the RBFNN had a relatively high accuracy in the short-term prediction horizon. To improve the prediction performance of BPNNs and RBFNNs, some novel structures have also been applied to VVP. Based on the network architecture of an RBFNN, a general regression neural network (GRNN) can be established to achieve a higher training speed. Wang et al. [11] utilized a GRNN as an upper layer to perform velocity prediction for an MPC-based EMS, taking advantage of the GRNN's short training time and ability to yield accurate forecasts with a limited number of training samples. Based on the BPNN architecture, a non-linear auto-regressive model with exogenous inputs (NARX) has been proposed, adding delay and feedback mechanisms to enhance the memory of historical data; it converges more rapidly and generalizes well, with a lower sensitivity to long-term dependencies. Zhang et al. [12] constructed a VVP model with a NARX NN, and the model's prediction accuracy was validated by means of simulation comparison. Although feedforward neural networks (FNNs) and their variants are good at modeling non-linear characteristics, as shown in Figure 2a, they are only applicable to one-to-one prediction. Thus, recurrent architectures with feedback loops have been proposed to handle one-to-many prediction with input repetitions over time.
As shown in Figure 2b, the three representative structures of a recurrent neural network (RNN) are a standard RNN, a long short-term memory (LSTM), and a gated recurrent unit (GRU). As the standard RNN model suffers from gradient exploding or vanishing when dealing with longer sequences [13], LSTM and GRU, as promising variants of RNNs, have become effective methods for predicting future velocity with enhanced structures. LSTM adds an additional memory component to avoid the vanishing gradient problem, to some extent [14]. With a simpler internal configuration to balance prediction accuracy and computation time, a GRU exhibits better prediction performance than an LSTM. Du et al. [15] compared the predictive errors of an RNN and an LSTM model in each prediction step, and the results showed that the LSTM model exhibited a better prediction effect in the velocity time series. Wu et al. [16] established an LSTM model to perform medium-term prediction of driving cycles. Shin et al. [17] compared three VVP models—an RNN model, an LSTM model, and a GRU model—and the results indicated that the average prediction error of the GRU model was lower than that of the RNN model by 45.1% and that of the LSTM model by 11.4%.
With the development of deep learning, attention or self-attention mechanisms have emerged, which can achieve sequence-to-sequence correlation learning. NNs based on an attention mechanism have been introduced into the VVP field, and transformer modules with self-attention stacks have garnered extensive attention as a typical representative of this architecture; a component diagram is depicted in Figure 3. Xu et al. [18] devised a transformer-based model that integrated the features of multiple vehicles to predict the velocity of driving vehicles. Shen et al. [19] proposed a novel, deterministic, transformer-based NN to predict the acceleration and deceleration behaviors of drivers and implemented a stochastic MC-based Monte Carlo method to forecast the velocity trajectory.
As the above descriptions indicate, the various NN architectures that have been applied in the field of VVP have their own advantages and limitations, as summarized in Table 1.
Prediction accuracy and computational efficiency are considered the two primary issues in the implementation of NN-based VVP methods. To improve these two performance indices, both the optimization of NN model parameters and hybrid architectures have been extensively studied. On the one hand, optimization algorithms, such as genetic algorithms (GAs) and particle swarm optimization (PSO), are frequently employed to adjust the model parameters to enhance the prediction performance of NN models. For example, Liu et al. [20] applied a GA and PSO to optimize the model parameters of a BPNN-based VVP model. Hou et al. [21] used the fixed-order Akaike information criterion (AIC) to optimize the network parameters of an RBFNN-based vehicle velocity predictor. Bharti et al. [22] employed PSO to search for the best parameters of an LSTM-based VVP model on a global scale. On the other hand, hybrid architectures composed of feedforward, recurrent, or attention mechanism architectures have been designed to balance prediction accuracy and computational intensity. For instance, in [23,24], a VVP method combining an MC and a BPNN algorithm was proposed. Upadhyaya et al. [25] proposed a velocity prediction technique combining a BPNN and an RBFNN, wherein the RBFNN was adopted to compensate for the predictive errors of the BPNN. Li et al. [26] presented a mixed BP-LSTM prediction approach for performing velocity prediction in different driving scenarios. A blended convolutional neural network (CNN) and GRU model with an attention mechanism was proposed in [27] for VVP. Cao et al. [28] devised a CNN-LSTM-based model for traffic speed prediction.
Aside from network architectures and model parameters, the number and type of model inputs, such as the historical velocity, acceleration, date, location, weather, gradient, and traffic signals, also have a great impact on the prediction performance of NN-based VVP methods. The findings reported in [29] indicate that, after adding a vehicle's position information, relative velocity, and distance from the vehicle in front, the prediction accuracy of the proposed LSTM-based VVP model increased by 18.7%. Zhang et al. [30] utilized the distance from the first vehicle to the traffic light and the leading vehicle's velocity as the model inputs for a CNN-based VVP model. Even though the prediction accuracy and generalization performance of VVP models have improved with the inclusion of more information characteristics, difficulties related to information acquisition and poor data stability remain challenges in the field.
In summary, the model structures, optimization algorithms, and model inputs of NN-based VVP methods have been widely studied by domestic and foreign scholars to improve velocity prediction performance, and they have been proven effective through simulations and experiments. However, the quantitative impact of the above-mentioned measures on prediction performance is usually overlooked in existing studies. Thus, the optimal NN configuration for VVP remains unknown, and knowledge regarding its maximum improvement potential is also missing. To bridge this research gap, the primary objective of this paper is to explore the influencing mechanism of model inputs on the prediction performance of NN-based VVP models by means of a comparative analysis, thus laying a theoretical foundation for further improvement in the prediction performance of NN-based VVP models.
The major contributions of this paper are the following:
(1)
The feature parameters of the NN-based VVP model inputs were extracted using the random forest (RF) method and Pearson's correlation coefficient (PCC).
(2)
In view of the three typical network architectures of NNs, three VVP models were constructed based on a BPNN, an LSTM, and generative pre-training (GPT).
(3)
The simulation setup was designed on the basis of feature parameters, sliding window length, and prediction horizon, and the mean absolute error (MAE), goodness of fit (R2), and computation time were adopted as the main performance metrics for quantifying the prediction accuracy and real-time performance of the proposed VVP models.
(4)
The effects of feature parameters, sliding window length, and prediction horizon on the prediction performance of the VVP models were analyzed through numerical simulations, following which the relationship between vehicle velocity and model inputs was examined.
The remainder of this paper is organized as follows. Section 2 describes the extraction and preprocessing of driving feature parameters as the VVP model inputs. Section 3 illustrates the prediction frameworks of the BPNN-, LSTM-, and GPT-based VVP models. Section 4 introduces the simulation setup and performance metrics of the VVP models for the comparative experiments, followed by a presentation of the results and discussion in Section 5. Section 6 presents the main conclusions and directions for future works.

2. Vehicle Velocity Feature Selection and Preprocessing

This section introduces the various input features used for VVP. The parameters that represent features of the vehicle driving conditions are based on vehicle velocity and can be utilized to supplement vehicle velocity as the model inputs. Through an investigation of the relevant literature [31,32,33], a total of 17 feature parameters were selected as the initial inputs of the VVP models, as illustrated in Table 2. The major parameters of the feature parameter equations are summarized in Table 3.
An appropriate number of extracted feature parameters should be determined with simultaneous consideration of prediction accuracy and computational intensity; too few parameters lead to low prediction accuracy, while too many bring about a high computational burden. Therefore, it was necessary to process the above 17 feature parameters through feature selection to remove unnecessary and redundant features. The PCC is a general approach for evaluating the degree of correlation between two variables, and its calculation is presented in Equation (1). A larger correlation coefficient indicates a stronger correlation between two feature parameters. In this study, the correlation coefficient between two feature parameters was evaluated against a threshold of 0.8 [32,33]. The formula for calculating the correlation coefficient is the following:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{1}$$
where $X$ and $Y$ represent the two variables under consideration, and $\bar{X}$ and $\bar{Y}$ are the average values of the corresponding variables. After performing the correlation analysis for the 17 feature parameters, it was evident that the pairs of feature parameters with a strong correlation were $v_{mr}$ and $v_{me}$, $a_{\max}$ and $a_{me1}$, and $a_{\min}$ and $a_{me2}$, as described in Figure 4. Therefore, it was crucial to avoid selecting similar features repeatedly when choosing feature parameters.
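As a minimal sketch of this screening step (assuming the 17 feature parameters are stored as columns of a pandas DataFrame with hypothetical column names), the pairwise correlation matrix of Equation (1) can be computed and filtered against the 0.8 threshold:

```python
import pandas as pd

def correlated_pairs(features: pd.DataFrame, limit: float = 0.8):
    """Return feature pairs whose absolute Pearson correlation exceeds the limit."""
    corr = features.corr(method="pearson")  # Equation (1) for every column pair
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(corr.iloc[i, j]) > limit:
                pairs.append((cols[i], cols[j], float(corr.iloc[i, j])))
    return pairs

# Hypothetical usage: df holds the 17 feature parameters, one column each,
# e.g. ["v_mr", "v_me", "a_max", "a_me1", "a_min", "a_me2", ...]
# for a, b, r in correlated_pairs(df):
#     print(f"{a} ~ {b}: r = {r:.2f}")
```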
Apart from the correlation analysis of the 17 feature parameters, an RF algorithm was also employed to further extract the feature parameters with the most significant impact on the predicted velocity. An RF algorithm calculates the contribution of each feature in every decision tree of the RF and then averages these contributions to obtain the importance ranking of the feature parameters. For the RF algorithm in the scikit-learn library, feature importance scores were calculated according to the Gini index. The relevant calculation formulae are as follows.
$$GI_m = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk}) \tag{2}$$
$$VIM_{jm}^{Gini} = GI_m - GI_l - GI_r \tag{3}$$
$$VIM_{ij}^{Gini} = \sum_{m=1}^{M} VIM_{jm}^{Gini} \tag{4}$$
$$VIM_j^{Gini} = \frac{1}{n} \sum_{i=1}^{n} VIM_{ij}^{Gini} \tag{5}$$
In Equation (2), $GI_m$ is the Gini index of node $m$, $K$ is the number of categories in the sample set, and $\hat{p}_{mk}$ represents the probability estimate that node $m$ belongs to class $k$. In Equation (3), $VIM_{jm}^{Gini}$ is the importance of feature $X_j$ in node $m$, and $GI_l$ and $GI_r$ represent the Gini indices of the two new nodes split from node $m$. In Equation (4), $VIM_{ij}^{Gini}$ represents the importance of the feature $X_j$ occurring $M$ times in the $i$th decision tree. In Equation (5), $VIM_j^{Gini}$ is the final importance score for the feature $X_j$, and $n$ is the number of decision trees.
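A sketch of this ranking with scikit-learn is given below. One caveat: the Gini formulas above describe the classification case, whereas for a continuous velocity target RandomForestRegressor computes impurity-based importances from the reduction in mean squared error; the averaging over trees in Equation (5) is the same.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rank_features(X, y, names, n_trees=100):
    """Rank features by impurity-based importance averaged over all trees."""
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=0)
    rf.fit(X, y)
    # feature_importances_ realizes Equation (5): per-tree scores averaged
    order = np.argsort(rf.feature_importances_)[::-1]
    return [(names[i], float(rf.feature_importances_[i])) for i in order]

# Hypothetical usage: X is an (n_samples, 17) array of feature parameters and
# y the target velocity; the top entries correspond to the ranking in Table 4.
```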
As shown in Table 4, the top five features in terms of importance were $v_{mr}$, $v_{me}$, $va_{\max}$, $a_{\min}$, and $a$. Combined with the above feature correlation analysis based on the PCC, five feature parameters, namely $v_{mr}$, $va_{\max}$, $a_{\min}$, $a$, and $f_v$, were finally selected. Furthermore, as a typical time series, velocity shows a certain coherence over time, and previous vehicle velocity information provides some guidance on the future direction of vehicle velocity. At the same time, prediction performance is affected by the length of the historical vehicle velocity time window. Based on the above analysis, the current vehicle velocity was determined to be the primary feature output, and the five feature parameters and historical velocity were selected as the sub-feature inputs for the VVP models.

3. VVP Principle and Models

In this section, the specific implementation process of VVP is introduced, and the construction of the three VVP models based on the BPNN, LSTM, and GPT is described.

3.1. The Principle of VVP

After extracting the input feature parameters as described in the preceding section, the generation of target samples for model training and prediction began. Three VVP models based on the BPNN, LSTM, and GPT were constructed in the form of multiple inputs and a single output. The training step of these NN-based VVP models is shown in Figure 5, and the specific procedures are described below, where $M$ is the time length of the sliding window.
(1) Collect the standard driving cycle conditions of heavy vehicles to form a historical velocity database. Extract the velocity over the historical $M$ seconds to form a sliding window $[v_{t-M}, v_{t-M+1}, \ldots, v_{t-1}, v_t]$. Calculate the required feature parameters within the sliding window, including the average driving velocity $v_{mr}$, the maximum of velocity times acceleration $va_{\max}$, the minimum acceleration $a_{\min}$, the acceleration $a$, and the velocity variance $f_v$. Combine the velocity of the sliding window and the above calculated feature parameters to construct the input vector $N_{in}$, and take the $(t+1)$th second velocity as the output vector $N_{out}$:
$$N_{in} = [v_{t-M}, v_{t-M+1}, \ldots, v_{t-1}, v_t, v_{mr}, va_{\max}, a_{\min}, a, f_v] \tag{6}$$
$$N_{out} = v_{t+1} \tag{7}$$
(2) Based on step (1), the 1st vector of the input matrix is the velocity from the 1st second to the $M$th second and the corresponding driving feature parameters, and the velocity of the $(M+1)$th second is then used as the 1st ground truth of the output matrix. With 1 s as the sliding step size, the 2nd vector of the input matrix is the velocity from the 2nd second to the $(M+1)$th second and the corresponding driving feature parameters, and the velocity of the $(M+2)$th second is then used as the 2nd ground truth of the output matrix. After performing this data processing $(N-M+1)$ times, the final model input $N_{in\_F}$ and output $N_{out\_F}$ are formulated as indicated in Equation (8):
$$N_{in\_F} = \begin{bmatrix} v_1 & \cdots & v_M & \cdots & f_{v_M} \\ v_2 & \cdots & v_{M+1} & \cdots & f_{v_{M+1}} \\ \vdots & & \vdots & & \vdots \\ v_{N-M-1} & \cdots & v_{N-2} & \cdots & f_{v_{N-2}} \\ v_{N-M} & \cdots & v_{N-1} & \cdots & f_{v_{N-1}} \end{bmatrix}, \quad N_{out\_F} = \begin{bmatrix} v_{M+1} \\ v_{M+2} \\ \vdots \\ v_{N-1} \\ v_N \end{bmatrix} \tag{8}$$
(3) Set up the parameters of the model structure and train the three NN-based models using the above training data set, including the BPNN-, LSTM-, and GPT-based models.
(4) Import the first sequence of the test data sets into the trained model and forecast the 1st second velocity in the future. To obtain a multi-step output, the predictive velocity is filled into the next sliding window of historical velocity, and the sliding window is shifted to the right by one step. In the meantime, the feature parameters of the current sliding window are extracted and combined with the velocity of the current sliding window as the input to predict the 2nd second velocity in the future. As illustrated in Figure 6, a future multi-step output can be attained through these rolling prediction operations.
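A minimal sketch of the sample construction in steps (1) and (2) and the rolling multi-step prediction in step (4) is shown below. The inline feature definitions are simplified placeholders (the exact definitions follow Table 2), and the model is assumed to expose an sklearn-style predict method.

```python
import numpy as np

def window_features(window: np.ndarray):
    """Simplified placeholder versions of the five selected feature parameters."""
    acc = np.diff(window)
    return [window.mean(),             # v_mr: average driving velocity
            np.max(window[1:] * acc),  # va_max: max of velocity times acceleration
            acc.min(),                 # a_min: minimum acceleration
            acc[-1],                   # a: current acceleration
            window.var()]              # f_v: velocity variance

def build_samples(v: np.ndarray, M: int):
    """Build training pairs (N_in, N_out) per Equations (6)-(8)."""
    X, y = [], []
    for t in range(M, len(v)):
        window = v[t - M:t]
        X.append(np.concatenate([window, window_features(window)]))
        y.append(v[t])  # next-second ground truth
    return np.array(X), np.array(y)

def rolling_predict(model, window: np.ndarray, horizon: int):
    """Multi-step prediction: feed each one-step output back into the window."""
    window = window.copy()
    preds = []
    for _ in range(horizon):
        x = np.concatenate([window, window_features(window)])[None, :]
        v_next = float(model.predict(x)[0])
        preds.append(v_next)
        window = np.append(window[1:], v_next)  # shift the window right one step
    return preds
```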

3.2. The BPNN Model

Combining the multi-layer feedforward structure with an error backpropagation algorithm, the BPNN model is composed of an input layer, one or more hidden layers, and an output layer. The input layer receives external inputs, the hidden layer utilizes an activation function to accomplish the non-linear mapping of information, and the output layer converts the hidden layer’s output into a specific form of output data. The BPNN model, with a hyperbolic tangent S-function, is depicted in Figure 2a, and its activation function is defined as shown in Equation (9):
$$a^1 = \mathrm{tansig}(n) = \frac{e^n - e^{-n}}{e^n + e^{-n}} \tag{9}$$
$$n = W a^0 + b \tag{10}$$
where $a^1$ and $a^0$ are the neuronal outputs of the current and previous layers, respectively; $n$ is the accumulated weighted input; $W$ is the weight value; and $b$ is the bias value.
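For illustration, Equations (9) and (10) amount to the following forward pass for a single hidden layer; the weight and bias arrays are hypothetical and would be obtained through backpropagation training.

```python
import numpy as np

def tansig(n):
    """Hyperbolic tangent S-function, Equation (9); equivalent to np.tanh(n)."""
    return (np.exp(n) - np.exp(-n)) / (np.exp(n) + np.exp(-n))

def bpnn_forward(x, W1, b1, W2, b2):
    """Equation (10) followed by Equation (9), then a linear output layer."""
    hidden = tansig(W1 @ x + b1)  # non-linear mapping in the hidden layer
    return W2 @ hidden + b2       # output layer converts to the predicted value
```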

3.3. The LSTM Model

As an enhancement of the RNN configuration, an LSTM has a sophisticated information transmission framework. Specifically, an LSTM contains three gate units, namely an input gate $i_t$, a forget gate $f_t$, and an output gate $o_t$, as well as a memory unit $C_t$ at the structure's core. Information on past states is stored in the memory unit, while the input, output, and forgetting of information are managed by the gate units.
A typical LSTM network structure is shown in Figure 2b, and the three gate units are explained below.
(1) The forget gate determines how much information from the previous cell state can be transferred to the current moment, as shown in Equation (11):
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{11}$$
where $W_f$ is the forget gate's weight matrix; $h_{t-1}$ is the last cell's output; $x_t$ is the current moment's input; $b_f$ denotes the bias vector; and $\sigma$ is the sigmoid activation function.
(2) The input gate decides how much of the newly generated information from the current input can be stored in the cell state, and its two factors $i_t$ and $\tilde{C}_t$ are calculated according to Equations (12) and (13):
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{12}$$
$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \tag{13}$$
where $W_i$ and $W_c$ are the input gate's weight matrices, and $b_i$ and $b_c$ are the bias vectors. The formula for generating the updated cell state information $C_t$ can be written as the following:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{14}$$
(3) The output gate outputs the current information based on the updated cell state information, as shown in Equation (15):
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{15}$$
where $W_o$ is the output gate's weight matrix and $b_o$ is the bias vector. The output of the hidden layer at the current moment, $h_t$, is calculated according to Equation (16):
$$h_t = o_t \odot \tanh(C_t) \tag{16}$$
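Written out directly, Equations (11)-(16) define one LSTM time step as follows; this is an illustrative sketch with hypothetical per-gate weight matrices and biases, not the trained implementation used later.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM step; W and b are dicts of per-gate weight matrices and biases."""
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate, Equation (11)
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate, Equation (12)
    C_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate state, Equation (13)
    C_t = f_t * C_prev + i_t * C_tilde      # cell state update, Equation (14)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate, Equation (15)
    h_t = o_t * np.tanh(C_t)                # hidden output, Equation (16)
    return h_t, C_t
```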

3.4. The GPT Model

The GPT model is made up of multi-layer unidirectional transformer decoder elements. As shown in Figure 7, each layer of the decoder is primarily composed of a feedforward NN module and a masked multi-head self-attention module.
The attention score for every input vector is calculated by the self-attention mechanism using scaled dot-product attention [34,35]. First, for every input vector, the query, key, and value matrices are created. Next, the dot products between the query and key matrices are computed, scaled, and passed through the softmax function to obtain the attention scores. Finally, the value matrix is weighted by the attention scores and summed, as described in Equation (17):
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^T}{\sqrt{d_k}}\right) V \tag{17}$$
$$Q = W^q I \tag{18}$$
$$K = W^k I \tag{19}$$
$$V = W^v I \tag{20}$$
where $Q$ is the query matrix; $K$ is the key matrix; $V$ is the value matrix; $I$ is the input matrix; $W^q$, $W^k$, and $W^v$ are the corresponding weight matrices; and $d_k$ is the dimension of $Q$, $K$, and $V$.
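A plain numpy sketch of Equation (17) is given below; the optional mask argument anticipates the causal masking of the GPT decoder discussed shortly.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Equation (17): softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        scores = scores + mask  # -inf entries zero out after the softmax
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```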
The multi-head attention module splits the single-head attention input matrix equally, and then each scaled dot product attention head focuses separately on information from different representation subspaces at different positions, as illustrated in Equation (21):
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) W^O \tag{21}$$
$$\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V) \tag{22}$$
where $W_i^Q$, $W_i^K$, and $W_i^V$ are the weight matrices of the $i$th attention head; $W^O$ is the multi-head attention weight matrix; $h$ is the number of attention heads; and the Concat function splices together the output values calculated by each attention head.
The GPT model takes a given time series as the input and, for each timestamp, generates a new prediction for the following timestamp. The generated prediction sequence is then compared with the corresponding true sequence to calculate the training loss, as demonstrated in Figure 8. To accomplish this, a mask must be employed to ensure that, at each step, the model only has access to the tokens preceding the current position in the sequence. Before the softmax function is applied, an extra matrix is added to prevent the model from cheating by looking ahead; this matrix has a value of negative infinity in the upper triangle and a value of 0 on the diagonal and in the lower triangle.
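The masking matrix described above can be built in one line and passed to the attention sketch from earlier:

```python
import numpy as np

def causal_mask(seq_len: int):
    """Negative infinity above the diagonal; 0 on and below it (added pre-softmax)."""
    return np.triu(np.full((seq_len, seq_len), -np.inf), k=1)

# e.g., scaled_dot_product_attention(Q, K, V, mask=causal_mask(Q.shape[0]))
# ensures each position attends only to itself and earlier positions.
```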
To prevent vanishing gradients, a residual connection is inserted around each decoder sublayer. Layer normalization is also utilized to accelerate network convergence, as depicted in Equation (23):
$$o = \mathrm{LayerNorm}(x + \mathrm{Sublayer}(x)) \tag{23}$$
where Sublayer denotes the function implemented inside each sublayer, such as the fully connected feedforward NN processing function, while LayerNorm is the layer normalization processing function.
Scheduled sampling is employed to train the GPT model. Unlike the traditional training method of teacher forcing, scheduled sampling selects the ground truth information with a higher probability as the model inputs in the early stage of model training, and it gradually employs the predicted outputs as the model inputs in the later stage of model training to avoid the problem of exposure bias [36,37]. The sampling rate is determined by using a probability decay function, and the general probability decay functions include linear decay, exponential decay, and inverse sigmoid decay, as illustrated in Figure 9. The inverse sigmoid decay presented in Equation (24) was chosen for this study.
$$P_i = \frac{k}{k + e^{i/k}} \tag{24}$$
where $k$ is used to fine-tune the rate of decay and $i$ is the number of training epochs.
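As a small illustration, Equation (24) and its use in scheduled sampling might look as follows; the decay constant k = 10 is an arbitrary example value.

```python
import numpy as np

def inverse_sigmoid_decay(i, k=10.0):
    """Equation (24): probability of feeding the ground truth at epoch i."""
    return k / (k + np.exp(i / k))

# During training (sketch): with probability P_i use the ground-truth token as
# the next decoder input, otherwise feed back the model's own prediction.
# use_truth = np.random.rand() < inverse_sigmoid_decay(epoch)
```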

4. Experimental Setup and Performance Metrics

4.1. Experimental Setup

Various data sets representing the typical standard driving cycle conditions of heavy vehicles were chosen as the data sets for this study. As shown in Figure 10, the training data set consisted of the following driving cycles: MANHATTAN_CYC, CHTC_HT, CHTC_B, CHTC_C, CHTC_TT, WVUSUB_CYC, WVUINTER_CYC, UDDSHDV_CYC, and HWFET_CYC. NYCTRUCK_CYC, NYCBUS_CYC, and C_WTVC were employed as three test data sets. Moreover, 70% of the data were included in the training set, while 30% of the data were included in the test sets.
As indicated in Table 5, with the consideration of the driving feature parameters, sliding window, and prediction horizon, a series of experiments were designed to explore the impact of various model inputs on the prediction performance of the NN-based VVP models.
The parameters of the trained models should be configured properly to mitigate overfitting and underfitting of the NN-based models. For the hidden layers of the BPNN and LSTM models, a three-layer network structure is sufficient for fitting the non-linear curves of simple predictive regression projects; therefore, the number of hidden layers of the BPNN and LSTM models was set to 1. In regard to the number of hidden units, it was initially determined for the BPNN model by using the empirical formula depicted in Equation (25), whereas it was set to a power of two for the LSTM model. Eventually, the critical parameters of the three VVP models were set as shown in Table 6 and Table 7.
$$n = \sqrt{i + o} + a \tag{25}$$
where $n$ is the number of hidden units, $i$ is the number of input neurons, $o$ is the number of output neurons, and $a$ is an adjustable parameter between 1 and 10.
Furthermore, the BPNN model was trained with the trainlm function, which obtains the current values of the weights and biases from the neural network and minimizes the mean square error (MSE) through the Levenberg–Marquardt algorithm to update the weight and bias values. In addition, the LSTM model employed the adaptive moment estimation (Adam) optimization function to update the weight and bias values of the network adaptively, while the GPT model utilized the stochastic gradient descent (SGD) algorithm to optimize the parameters of the network structure.

4.2. Performance Metrics

VVP models are typically tested in terms of their prediction accuracy and computation time. In this study, the MAE and R2 were employed to quantify the velocity prediction accuracy, and their expressions are shown in Equations (26) and (27):
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{26}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (\bar{y} - y_i)^2} \tag{27}$$
where $\hat{y}_i$ and $y_i$ are, respectively, the predicted velocity and the actual velocity at the $i$th second; $\bar{y}$ represents the average value of the actual velocity; and $n$ is the total number of velocity points.
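Both metrics translate directly into code; this sketch assumes equal-length one-dimensional arrays of predicted and actual velocities.

```python
import numpy as np

def mae(y_pred, y_true):
    """Equation (26): mean absolute error."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    return np.mean(np.abs(y_pred - y_true))

def r_squared(y_pred, y_true):
    """Equation (27): goodness of fit."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true.mean() - y_true) ** 2)
    return 1.0 - ss_res / ss_tot
```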
The real-time performance of a prediction method is commonly evaluated based on its computation time. For NN-based VVP models, the computation time includes the inference time and the training time. The inference time refers to the execution time of a single-step prediction, while the training time reflects the complexity of an NN-based model. Moreover, the computation time of NN-based models is greatly affected by the computer hardware. As a result, every simulation group was run on the same hardware: a Lenovo laptop with an Intel(R) Core(TM) i7-7700HQ CPU at 2.8 GHz. On the basis of the above analysis, the model training time $T_{train}$ and the model inference time $T_{pre}$ were employed to measure the computational efficiency of the VVP models.

5. Simulation Results and Discussion

In this study, the above proposed performance metrics were employed to evaluate the performance of the three VVP models, and their prediction accuracy was analyzed in terms of three aspects: the driving feature parameters, the sliding window, and the prediction horizon. Since a BPNN model tends to fall into local optima easily, which leads to substantial variation in the results and hinders comparison, each group of experiments for the BPNN model was run five times in this study, and the results were averaged to obtain the final results.

5.1. Analysis of Prediction Accuracy

(1)
Driving feature parameters
Firstly, the effects of the driving feature parameters as model inputs were analyzed by comparing the simulation results of groups A1 and A2, B1 and B2, and C1 and C2, and the results are shown in Table 8. It is clear that, on the one hand, the BPNN, LSTM, and GPT models all showed an improvement in prediction accuracy after adding the driving feature parameters as model inputs, with the LSTM model showing the greatest improvement and the GPT model the smallest. On the other hand, in the simulation groups where the feature parameters were not chosen as model inputs, the BPNN model predicted better than the LSTM and GPT models.
The quantitative impact of the feature parameters on the performance of the three VVP models for groups A1 and A2 is charted in Figure 11. The MAE values for the 1 s, 5 s, and 10 s prediction horizons in group A2 were 5.4%, 17.5%, and 50.9% lower than the corresponding MAE values in group A1 for the BPNN model, respectively; 43.2%, 56.3%, and 71.2% lower for the LSTM model; and 40.0%, 20.6%, and 20.9% lower for the GPT model. When the model input contained only the historical velocity, as demonstrated in Figure 12a, the MAE values for the 1 s, 5 s, and 10 s prediction horizons for the BPNN model in group A1 were 54.3%, 40.4%, and 34.2% lower than the corresponding MAE values for the LSTM model, respectively, and 32.7%, 42.4%, and 42.4% lower than those for the GPT model. When the feature parameters were added to the model inputs, as shown in Figure 12b, the MAE values for the 5 s and 10 s prediction horizons for the LSTM model in group B2 were lower than the corresponding MAE values for the BPNN model by 19.0% and 10.9%, respectively, and lower than those for the GPT model by 50.0% and 70.3%.
Furthermore, it can be seen from the comparison of groups C2–C5 and D1 in Table 9 that the number of driving feature parameters also had an effect on the prediction accuracy of the VVP models. As depicted in Figure 13, except for the 1 s prediction horizon, the MAE of the BPNN, LSTM, and GPT models first decreased and then increased as the number of driving feature parameters decreased, and the MAE was smallest under the model inputs of group C3. This phenomenon indicates that the prediction accuracy of the BPNN, LSTM, and GPT models can only be greatly improved when the model inputs contain the acceleration parameter $a$. Meanwhile, the comparison of the MAE in groups C2 and C3 demonstrated that the addition of the velocity variance $f_v$ did not further improve the prediction performance. In addition, the prediction performance of the BPNN and LSTM models was more sensitive to the acceleration parameter $a$ than that of the GPT model.
Additionally, the influence of the type of driving feature parameters on prediction accuracy was also revealed through a comparison of groups D1–D5 and C1 using the three test data sets, and the results are shown in Table 10. The MAE was minimal in group D4 for all three VVP models. These comparison results imply that the acceleration parameter $a$ was the key feature parameter for improving the prediction accuracy of the VVP models. Compared to the results of group C1, as displayed in Figure 14, the MAE for the 10 s prediction horizon in group D4 decreased by 56.1% for the BPNN model, by 74.3% for the LSTM model, and by 18.6% for the GPT model when tested on the C_WTVC data set. Meanwhile, the impact of the driving feature parameters $v_{mr}$, $va_{\max}$, $a_{\min}$, and $f_v$ on the prediction performance of the three VVP models was not the same when tested on different test data sets. For example, taking the results for group C1 as the benchmark, it was observed that the driving feature parameter $v_{mr}$ improved the prediction accuracy of the LSTM and GPT models when tested on the C_WTVC data set, but reduced the prediction accuracy of the BPNN model. Regarding prediction performance on the three test data sets, the BPNN and LSTM models performed better when tested on the C_WTVC data set than on the NYCBUS_CYC and NYCTRUCK_CYC data sets under the same model inputs.
The above analysis shows that the acceleration parameter $a$ improved the prediction performance the most. However, the ranking of feature importance obtained using the RF method was, in order, $v_{mr}$, $v_{me}$, $va_{\max}$, $a_{\min}$, and $a$, which indicates that $v_{mr}$ should be the most significant feature parameter. In view of this difference, the variability in data distribution between the test data sets and the training data set should be considered one of the important factors affecting the prediction performance of a model. Kernel density estimation (KDE) is a non-parametric estimation method that does not require any prior knowledge and fits the distribution based on the characteristics of the data themselves; hence, it was adopted in this study to analyze the distribution of the feature parameters. The KDE can be easily visualized through a plot of the kernel probability density function to study the distributional information of the data. The relevant formulae are shown in Equations (28)–(30):
$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \tag{28}$$
$$K_h(x) = \frac{1}{h} K\!\left(\frac{x}{h}\right) \tag{29}$$
$$K(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \tag{30}$$
where $K_h(x)$ is the scaled kernel function; $K(x)$ is the Gaussian kernel, which satisfies the probability density function property; $x_i$ denotes the independent and identically distributed sample points; $n$ is the number of sample points; and $h$ is the bandwidth.
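As a sketch of this distribution comparison, scipy's gaussian_kde implements a Gaussian-kernel estimate in the sense of Equations (28)–(30) with an automatically selected bandwidth; the array names below are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_curve(samples, grid):
    """Evaluate the Gaussian-kernel density estimate of samples on a grid."""
    return gaussian_kde(np.asarray(samples))(np.asarray(grid))

# Hypothetical usage: overlay the KDE of a feature (e.g., the acceleration a)
# for the training set and a test set to compare their distributions:
# grid = np.linspace(a_all.min(), a_all.max(), 200)
# train_pdf, test_pdf = kde_curve(a_train, grid), kde_curve(a_test, grid)
```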
Figure 15a shows that the test data set C_WTVC and the training data set exhibit the smallest difference in the distribution of historical velocity, while the test data set NYCBUS_CYC shows the most significant difference. Regarding the distribution of the driving feature parameters, Figure 15b–f demonstrate that the distribution difference between the test data set C_WTVC and the training data set was minimal. The training data set and the test data sets NYCBUS_CYC and NYCTRUCK_CYC show relatively significant distribution differences in the feature parameters $v_{mr}$, $va_{\max}$, and $a_{\min}$, while all three test data sets and the training data set show minor distribution differences in the feature parameter $a$.
(2)
Sliding window length
The sliding window length is associated with the dimensionality of the model inputs, which affects the complexity of the model structure. Comparing the simulations of groups A1, B1, C1, E, and F, the velocity prediction results for different sliding window lengths within the 10 s prediction horizon when tested on the three test data sets are presented in Table 11. The MAE initially declined and then increased with an increase in the sliding window length. As illustrated in the shaded portion of Figure 16, the optimal sliding window length for the BPNN model was 15 s for the NYCBUS_CYC test set and 10 s for the NYCTRUCK_CYC and C_WTVC test sets; the optimal sliding window length for the LSTM model was 20 s for the NYCBUS_CYC test set and 15 s for the NYCTRUCK_CYC and C_WTVC test sets; and the optimal sliding window length for the GPT model was 15 s for the NYCTRUCK_CYC test set and 20 s for the NYCBUS_CYC and C_WTVC test sets.
In general, a short sliding window length makes the model layer structure simpler, while a long sliding window length contains more historical information and inadvertently contributes to an increase in the structural complexity of the model. Therefore, it is crucial to balance the influencing elements of the sequence information within the sliding window and the corresponding complexity of the NN model when choosing an optimal sliding window to obtain the best prediction results.
(3)
Prediction horizon
Table 12 displays the results obtained with different prediction horizons under the model inputs of group C2 for the three VVP models, which illustrates that the MAE of the VVP models increased rapidly with an increase in prediction horizon and that the LSTM model had the best performance in each prediction horizon except 1 s. Specifically, as shown in Figure 17, Figure 18 and Figure 19, the BPNN model did not effectively predict subsequent velocity changes during stages of rapid velocity change over a long-term prediction horizon, while the predicted velocity of the LSTM model fit the target velocity curve better when the target velocity changed sharply. The GPT model performed relatively well only within the 1 s and 5 s prediction horizons, and its R2 over the long-term prediction horizon was the smallest, which shows that the GPT model exhibited the worst curve-fitting performance over long-term prediction horizons.

5.2. Analysis of Computation Time

The above section mainly evaluated the prediction accuracy of the three VVP models under different model inputs from the perspective of the MAE. To assess their suitability for real-time applications, the computational efficiencies of the three VVP models were also compared. Table 13 displays the training time and inference time under the model inputs of group D1. It can be seen from Table 13 that the BPNN model had the shortest training time due to its simple network structure, followed by the relatively complex LSTM model, and the GPT model had the longest training time. Nevertheless, in terms of inference time, the single-step prediction time of the GPT model had the smallest average value. In short, for a complex NN-based velocity prediction model, online training cannot be achieved under the conditions of an existing vehicle-embedded system in complex and constantly updated driving scenarios. However, this difficulty may be solved in the near future, as the computing power of embedded systems is steadily improving.

6. Conclusions

In this study, a comparative analysis of NN-based VVP methods was conducted qualitatively based on theory and quantitatively based on simulations. As three representative NN-based models, BPNN, LSTM, and GPT models were constructed for VVP after extracting the driving feature parameters from historical vehicle velocity data using the PCC and RF methods. The effects of the model inputs, including the feature parameters, sliding window length, and prediction horizon, on the prediction performance of the three VVP models were analyzed through multiple simulation experiments. The main conclusions are summarized as follows: (1) Model inputs should match the model structure; the BPNN model (with the simplest model structure) performs better when the model input is a single historical vehicle velocity parameter, while the LSTM model performs better when the model input contains driving feature parameters. (2) Prediction accuracy declines with an increase in the prediction horizon. The BPNN model achieved the most accurate prediction in the 1 s prediction horizon, while the LSTM model presented the best prediction accuracy in both the 5 s and 10 s prediction horizons with the addition of feature parameters to the model inputs. The GPT model made accurate predictions in the 1 s prediction horizon with different model inputs, but performed poorly over a long-term prediction horizon. (3) The acceleration parameter $a$ was the most crucial feature parameter for enhancing the model prediction accuracy, while the number and type of feature parameters, as well as the distribution of the feature parameters between the training and test data sets, had a significant impact on VVP model performance. (4) In terms of the sliding window length, the three VVP models achieved a relatively higher prediction accuracy when the sliding window was between 15 and 20 s.
In future works, some potential improvements can be made in the following aspects: (1) the parameters and structures of NN-based VVP models can be identified by employing optimization algorithms to improve the computational efficiency of these prediction models; (2) to further improve the prediction accuracy, the model input used for VVP should not only be limited to historical vehicle velocity information, but should also contain other historical vehicle state information or external ITS information; (3) the appropriate number of training samples should be identified, in order to balance prediction accuracy with computation time; and (4) the prediction performance of VVP models should be further validated through a combination with a PEMS.

Author Contributions

P.Z.: Conceptualization, supervision, funding acquisition, writing—original draft, and writing—review and editing. W.L.: Formal analysis, software, investigation, validation, writing—original draft, and writing—review and editing. C.D.: Methodology, writing—review and editing, and supervision. J.H.: Writing—review and editing, supervision, and funding acquisition. F.Y.: Writing—review and editing and supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (52305069), Key R&D Project of Hubei Province (2022BAA076), and Independent Innovation Projects of the Hubei Longzhong Laboratory (2022ZZ-21).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available if requested from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Wasserburger, A.; Schirrer, A.; Didcock, N.; Hametner, C. A probability-based short-term velocity prediction method for energy-efficient cruise control. IEEE Trans. Veh. Technol. 2020, 69, 14424–14435. [Google Scholar] [CrossRef]
  2. Liu, K.; Asher, Z.; Gong, X.; Huang, M.; Kolmanovsky, I. Vehicle Velocity Prediction and Energy Management Strategy Part 1: Deterministic and Stochastic Vehicle Velocity Prediction using Machine Learning; 0148-7191; SAE Technical Paper; SAE International: Warrendale, PA, USA, 2019. [Google Scholar] [CrossRef]
  3. Shin, J.; Sunwoo, M. Vehicle speed prediction using a Markov Chain with speed constraints. IEEE Trans. Intell. Transp. Syst. 2019, 20, 3201–3211. [Google Scholar] [CrossRef]
  4. Chao, S.; Xiaosong, H.; Moura, S.J.; Fengchun, S. Velocity predictors for predictive energy management in hybrid electric vehicles. IEEE Trans. Control. Syst. Technol. 2015, 23, 1197–1204. [Google Scholar] [CrossRef]
  5. Liu, H.; Li, X.; Wang, W.; Han, L.; Xiang, C. Markov velocity predictor and radial basis function neural network based real-time energy management strategy for plug-in hybrid electric vehicles. Energy 2018, 152, 427–444. [Google Scholar] [CrossRef]
  6. Lefevre, S.; Sun, C.; Bajcsy, R.; Laugier, C. Comparison of parametric and non-parametric approaches for vehicle speed prediction. In Proceedings of the 2014 American Control Conference, Portland, OR, USA, 4–6 June 2014; pp. 3494–3499. [Google Scholar] [CrossRef]
  7. Jing, J.; Filev, D.; Kurt, A.; Ozatay, E.; Michelini, J.; Ozguner, U. Vehicle speed prediction using a cooperative method of fuzzy Markov model and auto-regressive model. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; pp. 881–886. [Google Scholar] [CrossRef]
  8. Rosolia, U.; Zhang, X.; Borrelli, F. Data-driven predictive control for autonomous systems. Annu. Rev. Control. Robot. Auton. Syst. 2018, 1, 259–286. [Google Scholar] [CrossRef]
  9. Lin, X.; Wang, Z.; Wu, J. Energy management strategy based on velocity prediction using back propagation neural network for a plug-in fuel cell electric vehicle. Int. J. Energy Res. 2020, 45, 2629–2643. [Google Scholar] [CrossRef]
  10. Xiang, C.; Ding, F.; Wang, W.; He, W. Energy management of a dual-mode power-split hybrid electric vehicle based on velocity prediction and nonlinear model predictive control. Appl. Energy 2017, 189, 640–653. [Google Scholar] [CrossRef]
  11. Wang, W.; Guo, X.; Yang, C.; Zhang, Y.; Zhao, Y.; Huang, D.; Xiang, C. A multi-objective optimization energy management strategy for power split HEV based on velocity prediction. Energy 2022, 238, 121714. [Google Scholar] [CrossRef]
  12. Zhang, Y.; Gao, M.; Hua, G.; Xie, Q.; Guo, Y.; Zheng, R. Multisource fusion of exogenous inputs based NARXs neural network for vehicle speed prediction between urban road intersections. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2023, 09544070231186186. [Google Scholar] [CrossRef]
  13. Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 2018, 8, 6085. [Google Scholar] [CrossRef]
  14. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
  15. Du, Y.; Cui, N.; Li, H.; Nie, H.; Shi, Y.; Wang, M.; Li, T. The vehicle’s velocity prediction methods based on RNN and LSTM neural network. In Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; pp. 99–102. [Google Scholar] [CrossRef]
  16. Wu, Y.; Huang, Z.; Zheng, Y.; Liu, Y.; Li, H.; Che, Y.; Peng, J.; Teodorescu, R. Spatial–temporal data-driven full driving cycle prediction for optimal energy management of battery/supercapacitor electric vehicles. Energy Convers. Manag. 2023, 277, 116619. [Google Scholar] [CrossRef]
  17. Shin, J.; Yeon, K.; Kim, S.; Sunwoo, M.; Han, M. Comparative study of Markov chain with recurrent neural network for short term velocity prediction implemented on an embedded system. IEEE Access 2021, 9, 24755–24767. [Google Scholar] [CrossRef]
  18. Xu, M.; Lin, H.; Liu, Y. A deep learning approach for vehicle velocity prediction considering the influence factors of multiple lanes. Electron. Res. Arch. 2023, 31, 401–420. [Google Scholar] [CrossRef]
  19. Shen, H.; Wang, Z.; Zhou, X.; Lamantia, M.; Yang, K.; Chen, P.; Wang, J. Electric vehicle velocity and energy consumption predictions using transformer and Markov-chain Monte carlo. IEEE Trans. Transp. Electrif. 2022, 8, 3836–3847. [Google Scholar] [CrossRef]
  20. Liu, J.; Chen, Y.; Zhan, J.; Shang, F. An on-line energy management strategy based on trip condition prediction for commuter plug-in hybrid electric vehicles. IEEE Trans. Veh. Technol. 2018, 67, 3767–3781. [Google Scholar] [CrossRef]
  21. Hou, J.; Yao, D.; Wu, F.; Shen, J.; Chao, X. Online vehicle velocity prediction using an adaptive radial basis function neural network. IEEE Trans. Veh. Technol. 2021, 70, 3113–3122. [Google Scholar] [CrossRef]
  22. Redhu, P.; Kumar, K. Short-term traffic flow prediction based on optimized deep learning neural network: PSO-Bi-LSTM. Phys. A Stat. Mech. Its Appl. 2023, 625, 129001. [Google Scholar] [CrossRef]
  23. Zhang, L.; Liu, W.; Qi, B. Energy optimization of multi-mode coupling drive plug-in hybrid electric vehicles based on speed prediction. Energy 2020, 206, 118126. [Google Scholar] [CrossRef]
  24. Shen, P.; Zhao, Z.; Zhan, X.; Li, J.; Guo, Q. Optimal energy management strategy for a plug-in hybrid electric commercial vehicle based on velocity prediction. Energy 2018, 155, 838–852. [Google Scholar] [CrossRef]
  25. Upadhyaya, A.; Mahanta, C. Improving velocity prediction in electric vehicles using hybrid artificial neural network (ANN). In Proceedings of the 2022 IEEE 10th Conference on Systems, Process & Control (ICSPC), Malacca, Malaysia, 17 December 2022; pp. 94–99. [Google Scholar] [CrossRef]
  26. Yufang, L.; Mingnuo, C.; Wanzhong, Z. Investigating long-term vehicle speed prediction based on BP-LSTM algorithms. IET Intell. Transp. Syst. 2019, 13, 1281–1290. [Google Scholar] [CrossRef]
  27. Jiao, X.; Wang, Z.; Zhang, Z. Vehicle Speed Prediction Using a Combined Neural Network of Convolution and Gated Recurrent Unit with Attention. Res. Sq. 2022. [Google Scholar] [CrossRef]
  28. Cao, M.; Li, V.O.; Chan, V.W. A CNN-LSTM model for traffic speed prediction. In Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020; pp. 1–5. [Google Scholar] [CrossRef]
  29. Yeon, K.; Min, K.; Shin, J.; Sunwoo, M.; Han, M. Ego-vehicle speed prediction using a long short-term memory based recurrent neural network. Int. J. Automot. Technol. 2019, 20, 713–722. [Google Scholar] [CrossRef]
  30. Zhang, F.; Xi, J.; Langari, R. Real-time energy management strategy based on velocity forecasts using V2V and V2I communications. IEEE Trans. Intell. Transp. Syst. 2017, 18, 416–430. [Google Scholar] [CrossRef]
  31. Huang, X.; Tan, Y.; He, X. An intelligent multifeature statistical approach for the discrimination of driving conditions of a hybrid electric vehicle. IEEE Trans. Intell. Transp. Syst. 2011, 12, 453–465. [Google Scholar] [CrossRef]
  32. Montazeri-Gh, M.; Fotouhi, A.; Naderpour, A. Driving patterns clustering based on driving feature analysis. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2011, 225, 1301–1317. [Google Scholar] [CrossRef]
  33. Montazeri-Gh, M.; Fotouhi, A. Traffic condition recognition using the k-means clustering method. Sci. Iran. 2011, 18, 930–937. [Google Scholar] [CrossRef]
  34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
  35. Sun, S.; Liu, Y.; Li, Q.; Wang, T.; Chu, F. Short-term multi-step wind power forecasting based on spatio-temporal correlations and transformer neural networks. Energy Convers. Manag. 2023, 283, 116916. [Google Scholar] [CrossRef]
  36. Bengio, S.; Vinyals, O.; Jaitly, N.; Shazeer, N. Scheduled sampling for sequence prediction with recurrent neural networks. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar] [CrossRef]
  37. Mihaylova, T.; Martins, A.F. Scheduled sampling for transformers. arXiv 2019, arXiv:1906.07651. [Google Scholar]
Figure 1. Commonly used NN-based approaches to conducting VVP.
Figure 2. (a) Commonly used feedforward NN architectures and (b) commonly used recurrent NN architectures.
Figure 3. Attention mechanism and transformer.
Figure 4. Matrix of correlation coefficients for each pair of driving feature parameters.
Figure 5. Training step for velocity prediction.
Figure 6. Prediction process of sliding windows.
Figure 7. Structure of the GPT model.
Figure 8. Training loss calculation of the GPT model.
Figure 9. Linear, exponential, and inverse sigmoid decay curves.
Figure 10. Training data set and three test data sets.
Figure 11. Comparison of MAE in groups A1 and A2 for the three VVP models.
Figure 12. (a) Comparison of MAE in group A1 for the three VVP models and (b) comparison of MAE in group B2 for the three VVP models.
Figure 13. Comparison of MAE in the 10 s prediction horizons in groups C2–C5 and D1 when tested on the C_WTVC test set.
Figure 14. Comparison of the MAE within the 10 s prediction horizons for groups C1 and D1–D5 when tested on the C_WTVC test set.
Figure 15. The distributions of driving feature parameters.
Figure 16. Comparison of MAE for different sliding window lengths when tested on three test data sets.
Figure 17. Vehicle velocity prediction curves over different prediction horizons for the BPNN model.
Figure 18. Vehicle velocity prediction curves over different prediction horizons for the LSTM model.
Figure 19. Vehicle velocity prediction curves over different prediction horizons for the GPT model.
Table 1. Comparison of different types of NNs.

NN Type | Advantage | Limitation
BPNN | Simple structure with strong non-linear mapping capability | Long training time; slow convergence; sensitive to initial weights and thresholds; prone to falling into local minima
RBFNN | No local-minima problem and faster convergence than BPNNs | Does not work well when data are insufficient
GRNN | Faster convergence than RBFNNs, with high fault tolerance and robustness | High computational load and high storage space requirements
NARX | Adds delay and feedback mechanisms; suitable for time series problems | Backpropagation-based training is time-consuming
RNN | Can memorize time series information, though only over short spans | Gradient vanishing problem
LSTM | Stronger information memorization than RNNs; mitigates the gradient vanishing problem | Complex structure, large number of parameters, and slow training speed
GRU | Simpler structure and faster training speed than LSTMs | Predictive performance may fall short of LSTMs on complex tasks
Transformer | Strong memorization of long time series, with parallel computing capability | Complex structure; requires positional encoding to represent sequence order; may not perform well on simple prediction tasks
Table 2. Vehicle driving feature parameters.

Feature Parameter | Denotation | Unit | Calculation Equation
Average velocity | $v_{me}$ | km/h | $v_{me} = \frac{1}{n}\sum_{i=1}^{n} v_i$
Average driving velocity | $v_{mr}$ | km/h | $v_{mr} = \frac{1}{k}\sum_{m=1}^{k} v_m$
Average positive acceleration | $a_{me1}$ | m/s² | $a_{me1} = \frac{1}{m}\sum_{i=1}^{m} a_i$
Average negative acceleration | $a_{me2}$ | m/s² | $a_{me2} = \frac{1}{b}\sum_{j=1}^{b} a_j$
Velocity variance | $f_v$ | m²/s² | $f_v = \frac{1}{n}\sum_{i=1}^{n}(v_i - v_{me})^2$
Acceleration variance | $f_a$ | m²/s⁴ | $f_a = \frac{1}{n}\sum_{i=1}^{n}(a_i - a_{me})^2$
Variance in velocity times acceleration | $f_{va}$ | m⁴/s⁶ | $f_{va} = \frac{1}{n}\sum_{i=1}^{n}(va_i - va_{me})^2$
Acceleration time ratio | $P_a$ | % | $P_a = \frac{t_a}{T} \times 100$
Deceleration time ratio | $P_d$ | % | $P_d = \frac{t_d}{T} \times 100$
Uniform time ratio | $P_c$ | % | $P_c = \frac{t_c}{T} \times 100$
Idling time ratio | $P_i$ | % | $P_i = \frac{t_i}{T} \times 100$
Maximum acceleration | $a_{\max}$ | m/s² | $a_{\max} = \max\{a_1, a_2, \ldots, a_T\}$
Minimum acceleration | $a_{\min}$ | m/s² | $a_{\min} = \min\{a_1, a_2, \ldots, a_T\}$
Maximum value of velocity times acceleration | $va_{\max}$ | m²/s³ | $va_{\max} = \max\{va_1, va_2, \ldots, va_T\}$
Minimum value of velocity times acceleration | $va_{\min}$ | m²/s³ | $va_{\min} = \min\{va_1, va_2, \ldots, va_T\}$
Velocity first-order difference (acceleration) | $a$ | m/s² | $a_i = v_i - v_{i-1},\ a_0 = 0$
Velocity second-order difference | $\delta a$ | m/s³ | $\delta a_i = a_i - a_{i-1},\ \delta a_0 = 0$
Table 3. The major parameters of the feature parameter equations.

Symbol | Name
$n$, $T$ | Sampling time length
$k$ | Time length of non-zero velocity within the sampling period
$m$ | Time length of positive acceleration within the sampling period
$b$ | Time length of negative acceleration within the sampling period
$t_a$ | Acceleration time length within the sampling period
$t_d$ | Deceleration time length within the sampling period
$t_c$ | Time length of uniform velocity within the sampling period
$t_i$ | Time length of idling within the sampling period
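Because the Table 2 definitions are terse, a short numerical sketch may help. The following is a minimal illustration of computing a subset of the feature parameters over one sampling window, assuming a 1 Hz velocity trace in km/h; the function and variable names (driving_features, v_kmh, and so on) are illustrative, not taken from the paper's code.

```python
import numpy as np

def driving_features(v_kmh: np.ndarray) -> dict:
    """Selected Table 2 feature parameters computed over one sampling window."""
    v_ms = v_kmh / 3.6                           # km/h -> m/s
    a = np.diff(v_ms, prepend=v_ms[0])           # first-order difference, a_0 = 0
    va = v_ms * a                                # velocity times acceleration, m^2/s^3
    driving = v_kmh > 0                          # non-zero-velocity samples
    return {
        "v_me": v_kmh.mean(),                                      # average velocity (km/h)
        "v_mr": v_kmh[driving].mean() if driving.any() else 0.0,   # average driving velocity
        "a_me2": a[a < 0].mean() if (a < 0).any() else 0.0,        # average negative acceleration
        "f_v": v_ms.var(),                                         # velocity variance (m^2/s^2)
        "va_max": va.max(),                                        # max of velocity times acceleration
        "a_min": a.min(),                                          # minimum acceleration
        "P_d": 100.0 * (a < 0).sum() / len(v_kmh),                 # deceleration time ratio (%)
    }

window = np.array([0.0, 10.0, 20.0, 25.0, 25.0, 20.0])  # toy 6 s velocity window in km/h
print(driving_features(window))
```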
Table 4. Importance scores of the first eight feature parameters.

Feature parameter | $v_{mr}$ | $v_{me}$ | $va_{\max}$ | $a_{\min}$ | $a$ | $f_v$ | $a_{me2}$ | $P_d$
Importance score | 0.5312 | 0.4184 | 0.0199 | 0.0131 | 0.0127 | 0.0013 | 0.0012 | 0.0009
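The Table 4 scores come from the random forest importance ranking. A hedged sketch of how such a ranking is typically produced, here with scikit-learn's RandomForestRegressor on placeholder data (the shapes, seeds, and targets are stand-ins, not the paper's training windows):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 17))   # placeholder: 17 Table 2 parameters per window
y = rng.normal(size=1000)         # placeholder: future-velocity prediction target

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1][:8]   # top-8 ranking, cf. Table 4
print(order, rf.feature_importances_[order])
```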
Table 5. Experimental setup.

Group | Model Input
A1 | 5 s historical velocity
A2 | 5 s historical velocity + $v_{mr}$, $va_{\max}$, $a_{\min}$, $a$, $f_v$
B1 | 10 s historical velocity
B2 | 10 s historical velocity + $v_{mr}$, $va_{\max}$, $a_{\min}$, $a$, $f_v$
C1 | 15 s historical velocity
C2 | 15 s historical velocity + $v_{mr}$, $va_{\max}$, $a_{\min}$, $a$, $f_v$
C3 | 15 s historical velocity + $v_{mr}$, $va_{\max}$, $a_{\min}$, $a$
C4 | 15 s historical velocity + $v_{mr}$, $va_{\max}$, $a_{\min}$
C5 | 15 s historical velocity + $v_{mr}$, $va_{\max}$
D1 | 15 s historical velocity + $v_{mr}$
D2 | 15 s historical velocity + $va_{\max}$
D3 | 15 s historical velocity + $a_{\min}$
D4 | 15 s historical velocity + $a$
D5 | 15 s historical velocity + $f_v$
E | 20 s historical velocity
F | 30 s historical velocity
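Each group in Table 5 pairs a sliding window of historical velocity with an optional set of feature parameters computed over that window (cf. Figure 6). A minimal sketch of building such samples, assuming 1 Hz data; make_samples and the toy trace are illustrative names, not the paper's pipeline:

```python
import numpy as np

def make_samples(v, H=15, P=10, feat_fn=None):
    """Slide an H-step history window over velocity trace v; targets are the next P steps."""
    X, Y = [], []
    for t in range(H, len(v) - P + 1):
        window = v[t - H:t]
        extras = feat_fn(window) if feat_fn else []      # optional feature parameters
        X.append(np.concatenate([window, extras]))
        Y.append(v[t:t + P])
    return np.asarray(X), np.asarray(Y)

v = np.abs(np.cumsum(np.random.default_rng(1).normal(size=200)))   # toy velocity trace
# e.g. group D1: append average driving velocity v_mr to the window
X, Y = make_samples(v, H=15, P=10,
                    feat_fn=lambda w: [w[w > 0].mean() if (w > 0).any() else 0.0])
print(X.shape, Y.shape)   # (176, 16) (176, 10)
```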
Table 6. Related parameter settings for the BPNN and LSTM models.

Model | Group | Epochs | Number of Hidden Layers | Number of Hidden Units | Learning Rate
BPNN | A1 | 500 | 1 | 6 | 0.05
BPNN | A2\B1 | 500 | 1 | 11 | 0.05
BPNN | B2\C1 | 500 | 1 | 15 | 0.05
BPNN | C2\C3\E\F | 500 | 1 | 21 | 0.05
BPNN | C4\C5 | 500 | 1 | 18 | 0.05
BPNN | D1~D5 | 500 | 1 | 16 | 0.05
LSTM | A1~F | 500 | 1 | 32 | 0.05
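To make the Table 6 settings concrete, here is a hedged PyTorch sketch of the two architectures for group C2 (one hidden layer with 21 units for the BPNN, 32 hidden units for the LSTM). This is an illustrative reconstruction, not the authors' implementation; in particular the sigmoid activation is an assumption, as the table does not restate it.

```python
import torch
import torch.nn as nn

H, F, P = 15, 5, 10             # group C2: 15 s history + 5 feature parameters, 10 s horizon

bpnn = nn.Sequential(            # one hidden layer with 21 units, per Table 6
    nn.Linear(H + F, 21),
    nn.Sigmoid(),                # activation assumed for illustration
    nn.Linear(21, P),
)

class LSTMPredictor(nn.Module):
    def __init__(self, hidden=32, horizon=P):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, horizon)
    def forward(self, x):                       # x: (batch, H, 1) velocity sequence
        out, _ = self.lstm(x)
        return self.head(out[:, -1])            # last hidden state -> P future steps

print(bpnn(torch.randn(4, H + F)).shape)            # torch.Size([4, 10])
print(LSTMPredictor()(torch.randn(4, H, 1)).shape)  # torch.Size([4, 10])
```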
Table 7. Related parameter settings for the GPT model.

Group | Epochs | Batch Size | N | h | k | Dropout
A1 | 1000 | 100 | 3 | 6 | 500 | 0.1
A2\B1 | 1000 | 100 | 3 | 11 | 500 | 0.1
B2\C1 | 1000 | 100 | 3 | 16 | 500 | 0.1
C2\E | 1000 | 100 | 3 | 21 | 500 | 0.1
C3 | 1000 | 100 | 3 | 20 | 500 | 0.1
C4 | 1000 | 100 | 3 | 19 | 500 | 0.1
C5 | 1000 | 100 | 3 | 18 | 500 | 0.1
D1~D5 | 1000 | 100 | 3 | 17 | 500 | 0.1
F | 1000 | 100 | 3 | 31 | 500 | 0.1
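A speculative sketch of a small decoder-only (GPT-style) velocity predictor follows. Reading N = 3 in Table 7 as the number of decoder blocks and k = 500 as the feedforward width is an assumption, and h is left here as a generic head count; none of these interpretations is confirmed by the table itself, so treat the code as a shape-compatible illustration only.

```python
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    """Decoder-only (GPT-style) predictor with causal self-attention over the window."""
    def __init__(self, d_model=32, n_heads=4, n_layers=3, d_ff=500,
                 dropout=0.1, horizon=10, max_len=64):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                         # scalar velocity -> d_model
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))  # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, d_ff, dropout, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)       # n_layers decoder blocks
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):                                          # x: (batch, T, 1)
        T = x.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=x.device), diagonal=1)
        h = self.blocks(self.embed(x) + self.pos[:, :T], mask=causal)
        return self.head(h[:, -1])                                 # next `horizon` steps

print(TinyGPT()(torch.randn(8, 15, 1)).shape)   # torch.Size([8, 10])
```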
Table 8. MAE of three VVP models with/without driving feature parameters on C_WTVC.

Group | Prediction Horizon P (s) | MAE (BPNN) | MAE (LSTM) | MAE (GPT)
A1 | 1 | 0.37 | 0.81 | 0.55
A1 | 5 | 1.37 | 2.38 | 2.38
A1 | 10 | 2.79 | 4.24 | 4.84
A2 | 1 | 0.35 | 0.46 | 0.33
A2 | 5 | 1.13 | 1.04 | 1.89
A2 | 10 | 1.37 | 1.22 | 3.83
B1 | 1 | 0.36 | 0.83 | 0.60
B1 | 5 | 1.35 | 2.26 | 2.32
B1 | 10 | 2.70 | 3.94 | 4.72
B2 | 1 | 0.34 | 0.41 | 0.40
B2 | 5 | 1.21 | 0.98 | 1.96
B2 | 10 | 1.42 | 1.16 | 3.91
C1 | 1 | 0.35 | 0.82 | 0.55
C1 | 5 | 1.32 | 2.19 | 2.13
C1 | 10 | 2.62 | 3.82 | 4.69
C2 | 1 | 0.35 | 0.44 | 0.37
C2 | 5 | 1.25 | 0.87 | 1.85
C2 | 10 | 1.54 | 1.14 | 3.82
Table 9. Comparison of MAE of three VVP models with different numbers of driving feature parameters when tested on the C_WTVC test set.

Group | Prediction Horizon P (s) | MAE (BPNN) | MAE (LSTM) | MAE (GPT)
C2 | 1 | 0.35 | 0.44 | 0.37
C2 | 5 | 1.25 | 0.87 | 1.85
C2 | 10 | 1.54 | 1.14 | 3.82
C3 | 1 | 0.34 | 0.40 | 0.32
C3 | 5 | 1.11 | 0.78 | 1.76
C3 | 10 | 1.32 | 1.08 | 3.61
C4 | 1 | 0.35 | 0.72 | 0.48
C4 | 5 | 1.31 | 1.73 | 1.98
C4 | 10 | 2.54 | 2.33 | 3.88
C5 | 1 | 0.38 | 0.88 | 0.55
C5 | 5 | 1.34 | 1.96 | 2.11
C5 | 10 | 2.68 | 3.25 | 4.35
D1 | 1 | 0.36 | 0.82 | 0.56
D1 | 5 | 1.34 | 2.04 | 2.08
D1 | 10 | 2.71 | 3.59 | 4.45
Table 10. The MAE of different types of driving feature parameters when examined using the three test data sets.

Group | P (s) | BPNN MAE (Cycle 1 / 2 / 3) | LSTM MAE (Cycle 1 / 2 / 3) | GPT MAE (Cycle 1 / 2 / 3)
C1 | 1 | 1.04 / 0.93 / 0.35 | 1.86 / 1.40 / 0.82 | 0.59 / 0.44 / 0.55
C1 | 5 | 3.25 / 2.46 / 1.32 | 4.57 / 2.58 / 2.19 | 2.35 / 1.90 / 2.13
C1 | 10 | 5.38 / 3.98 / 2.62 | 7.12 / 4.09 / 3.82 | 3.89 / 3.53 / 4.69
D1 | 1 | 1.03 / 0.94 / 0.36 | 2.03 / 1.52 / 0.82 | 0.50 / 0.34 / 0.56
D1 | 5 | 3.25 / 2.48 / 1.34 | 4.39 / 3.15 / 2.04 | 2.34 / 1.88 / 2.08
D1 | 10 | 5.28 / 4.17 / 2.71 | 6.57 / 3.97 / 3.59 | 3.85 / 3.50 / 4.45
D2 | 1 | 1.00 / 0.95 / 0.36 | 1.79 / 1.31 / 0.78 | 0.56 / 0.35 / 0.45
D2 | 5 | 3.45 / 2.27 / 1.33 | 3.88 / 2.83 / 1.91 | 2.44 / 1.78 / 2.05
D2 | 10 | 5.68 / 3.64 / 2.66 | 5.53 / 3.94 / 3.28 | 3.89 / 3.27 / 4.38
D3 | 1 | 1.07 / 0.94 / 0.35 | 1.90 / 1.25 / 0.80 | 0.60 / 0.37 / 0.46
D3 | 5 | 3.53 / 2.39 / 1.35 | 4.20 / 2.64 / 2.08 | 2.53 / 1.77 / 2.11
D3 | 10 | 5.69 / 3.75 / 2.73 | 5.96 / 3.92 / 3.47 | 3.83 / 3.22 / 4.55
D4 | 1 | 1.04 / 0.92 / 0.35 | 1.18 / 1.11 / 0.39 | 0.40 / 0.35 / 0.39
D4 | 5 | 1.90 / 1.27 / 1.02 | 1.30 / 1.20 / 0.72 | 2.23 / 1.73 / 1.85
D4 | 10 | 2.63 / 1.57 / 1.15 | 1.58 / 1.40 / 0.98 | 3.63 / 3.12 / 3.82
D5 | 1 | 1.04 / 0.92 / 0.37 | 2.12 / 1.68 / 0.69 | 0.56 / 0.39 / 0.43
D5 | 5 | 3.66 / 2.43 / 1.38 | 4.63 / 2.98 / 2.11 | 2.63 / 1.88 / 1.90
D5 | 10 | 5.47 / 4.11 / 2.80 | 7.23 / 4.44 / 3.77 | 3.98 / 3.61 / 3.99
Note: Cycle 1, Cycle 2, and Cycle 3 represent NYCBUS_CYC, NYCTRUCK_CYC, and C_WTVC, respectively.
Table 11. MAE for different sliding window lengths within the 10 s prediction horizon when tested on the three test data sets.

Length of Sliding Window | BPNN MAE (Cycle 1 / 2 / 3) | LSTM MAE (Cycle 1 / 2 / 3) | GPT MAE (Cycle 1 / 2 / 3)
5 s | 5.49 / 4.09 / 2.79 | 7.37 / 4.33 / 4.24 | 5.59 / 5.72 / 4.84
10 s | 5.42 / 3.86 / 2.70 | 7.31 / 4.23 / 3.94 | 4.14 / 3.72 / 4.72
15 s | 5.38 / 3.98 / 2.62 | 7.12 / 4.09 / 3.82 | 3.89 / 3.53 / 4.69
20 s | 5.40 / 4.19 / 2.81 | 5.07 / 4.29 / 3.89 | 3.60 / 3.61 / 3.95
30 s | 5.68 / 4.32 / 2.86 | 6.69 / 4.70 / 4.22 | 3.67 / 3.66 / 4.15
Table 12. The MAE over different prediction horizons for the three VVP models.

Prediction Horizon | BPNN MAE | BPNN R² | LSTM MAE | LSTM R² | GPT MAE | GPT R²
1 s | 0.35 | 0.999 | 0.44 | 0.999 | 0.37 | 0.999
5 s | 1.25 | 0.996 | 0.87 | 0.998 | 1.85 | 0.986
10 s | 1.54 | 0.995 | 1.14 | 0.996 | 3.82 | 0.947
15 s | 3.87 | 0.946 | 3.24 | 0.992 | 5.79 | 0.913
20 s | 6.63 | 0.916 | 5.28 | 0.971 | 8.02 | 0.844
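For reference, the MAE and R² in Table 12 follow their standard definitions, $\mathrm{MAE} = \frac{1}{n}\sum_i |y_i - \hat{y}_i|$ and $R^2 = 1 - \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$. A small worked example (generic, not the paper's evaluation script):

```python
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)                 # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)          # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = np.array([10.0, 12.0, 15.0, 14.0])                 # toy observed velocities
y_pred = np.array([10.5, 11.5, 15.5, 13.0])                 # toy predictions
print(mae(y_true, y_pred), r2(y_true, y_pred))              # 0.625, approx. 0.8814
```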
Table 13. Training and prediction times of the three VVP models.

Model | $T_{train}$ (s) | $T_{pre}$ (s)
BPNN | 3.566 | 0.011
LSTM | 82.197 | 0.004
GPT | 423.619 | 0.003