3.2.1. Dynamic Programming
The objective of this economic velocity planning strategy is to solve the optimal control problem of a powertrain system with terminal constraints within a limited distance. During this process, the DP algorithm is employed, which is described in detail as follows:
The state variables describe the state of each sub-stage. The state of the $k$-th stage is denoted as $x_k$, which belongs to the state space $X_k$. The decision variables describe the control decisions that transfer the system from one sub-stage to the next, and the decision variable for the $k$-th stage is denoted as $u_k$, which belongs to the decision space $U_k$. Both the state variable $x_k$ and the control variable $u_k$ are bounded and discrete, meaning they can only take a finite set of values within their respective domains. The state equation of this discrete system can be represented as
$$x_{k+1} = f(x_k, u_k)$$
The transition cost from the current state to the next state in each sub-stage is represented by a cost function, and the cost for the $k$-th stage is denoted as $L_k$. It can be expressed as
$$L_k = L(x_k, u_k)$$
The cost function for the entire optimal control process can be expressed as
$$J = \sum_{k=0}^{N-1} L(x_k, u_k) + \phi(x_N)$$
where $\phi(x_N)$ represents the terminal cost of the system.
During the solving process, the backward induction starts from the $N$-th sub-stage, where the cost (terminal cost) of the $N$-th sub-stage can be represented as
$$J_N^*(x_N) = \phi(x_N)$$
The optimal cost function for the $k$-th stage is $J_k^*(x_k)$, which can be represented as
$$J_k^*(x_k) = \min_{u_k \in U_k} \left[ L(x_k, u_k) + J_{k+1}^*(x_{k+1}) \right]$$
After the backward induction is completed, for the control system with the initial state $x_0$, the optimal cost function for the entire process is $J_0^*(x_0)$, and the optimal control decision sequence is $\{u_0^*, u_1^*, \ldots, u_{N-1}^*\}$.
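The backward induction described above can be illustrated with a minimal tabular sketch. It is not the implementation used in this study: the function names (`backward_induction`, `optimal_sequence`), the toy velocity-index grid, and the quadratic stage cost are all assumptions chosen only to make the recursion concrete.

```python
def backward_induction(states, controls, n_stages, transition, stage_cost, terminal_cost):
    """Tabular DP: returns per-stage value tables and per-stage policy tables."""
    inf = float("inf")
    value = [dict() for _ in range(n_stages + 1)]
    policy = [dict() for _ in range(n_stages)]

    # Terminal stage: the cost-to-go is the terminal cost.
    for x in states:
        value[n_stages][x] = terminal_cost(x)

    # Backward sweep over the remaining stages.
    for k in range(n_stages - 1, -1, -1):
        for x in states:
            best_cost, best_u = inf, None
            for u in controls:
                x_next = transition(k, x, u)
                if x_next not in value[k + 1]:
                    continue  # transition leaves the discrete grid
                cost = stage_cost(k, x, u) + value[k + 1][x_next]
                if cost < best_cost:
                    best_cost, best_u = cost, u
            value[k][x] = best_cost
            policy[k][x] = best_u  # stored for direct table lookup
    return value, policy


def optimal_sequence(x0, policy, transition):
    """Forward pass: read the stored decisions starting from the initial state."""
    x, sequence = x0, []
    for k, table in enumerate(policy):
        u = table[x]
        sequence.append(u)
        x = transition(k, x, u)
    return sequence


# Toy problem (hypothetical): velocity index 0..5, each stage changes it by
# -1, 0, or +1, and the terminal cost strongly prefers ending at index 3.
def step(k, x, u):
    return x + u


value, policy = backward_induction(
    states=range(6),
    controls=(-1, 0, 1),
    n_stages=4,
    transition=step,
    stage_cost=lambda k, x, u: float(u * u),
    terminal_cost=lambda x: 0.0 if x == 3 else 1.0e6,
)
controls_from_rest = optimal_sequence(0, policy, step)
```

The forward pass only reads the stored tables, which is the table-lookup behavior that lets DP avoid redundant calculations.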
The essence of solving the problem using the DP algorithm is to traverse all possible decisions for every state at each stage within the state space. It calculates the optimal decisions for all feasible states and stores them for direct table lookup, thereby avoiding redundant calculations. In the economic velocity planning for the PHET, the original decision variables include gearbox gears, operating modes, and power allocation in the different operating modes. Because the vehicle considered in this study has a complex structure, with four gearbox gears and three operating modes, a conventional DP algorithm would have to traverse all of these factors as decision variables. The number of decision combinations to evaluate grows exponentially with the number of decision variables, which sharply increases the computational burden and reduces the efficiency of the algorithm.
In addition, in the conventional DP algorithm, the range of the state space and the discretization step size are fixed. However, velocity—as a state variable in economic velocity planning—has a significant range of fluctuations and is highly sensitive to temporal changes. If the conventional DP algorithm is used for economic velocity planning, there will be a large number of invalid state spaces and incorrect state transitions, which will affect the accuracy of the algorithm.
Therefore, it is necessary to optimize the DP algorithm before its application. The efficiency and accuracy of the DP algorithm depend on the number of state variables and decision variables, as well as the corresponding discrete grid. Since this study focuses on velocity planning, there is only one state variable, which is velocity. The optimization of the DP algorithm’s efficiency and accuracy can be achieved by reducing the number of decision variables and dynamically adjusting the discrete grid, as well as limiting the state space.
3.2.2. Optimization of Decision Variables
The PHET considered in this study has three operating modes: Single Electric Vehicle (SEV), Dual Electric Vehicle (DEV), and Hybrid Electric Vehicle (HEV). The operating modes of the PHET are shown in Table 3. Additionally, its gearbox has four gears. Due to the complexity of the operating modes and the vehicle structure, simplification is required before economic velocity planning. During the simplification process, both the vehicle’s power performance and fuel economy need to be considered to allow the powertrain system to achieve optimal performance.
It can be seen from Table 3 that the SEV mode exhibits the weakest power performance among the operating modes, while the power performance comparison between DEV and HEV varies depending on the gear position and vehicle velocity. Figure 8 shows the comparison of the maximum acceleration of the vehicle on a straight road in second gear. Before point A in the graph, DEV outperforms HEV in terms of power performance, while the opposite is true after that point. Additionally, due to the limitations imposed by the planetary gear structure and the maximum rotational speed of each power source, the maximum velocities differ between DEV and HEV in second gear, as indicated by segments B and C in Figure 8. Therefore, when shifting gear and selecting the operating mode, the power performance of each component in the powertrain system needs to be considered comprehensively.
 In terms of fuel economy, according to 
Figure 2, it can be observed that 
 exhibits better fuel economy around 1200 rpm when 
 is constant. Due to the characteristics of the dual planetary gears, in the HEV mode, the motor MG1 can control 
 at 1200 rpm to ensure relatively good fuel economy. When controlling 
 using MG1, based on the speed characteristics of PG1 (Equation (6)), the 
 can be expressed as
          
The 
 can be expressed as
          
According to Equations (17) and (18), it can be observed that 
 is negatively correlated with vehicle velocity (
) when 
 is fixed, with 
 increasing at lower 
. Based on the external characteristic curve of MG1 (
Figure 3), 
 decreases with an increase in 
 after reaching 
. Additionally, according to Equations (8)–(10), the maximum output torque (
) of the engine at this time can be expressed as:
          where, 
 will be subject to restrictions imposed by 
 (segments D in 
Figure 8) and 
 (segments E in 
Figure 8). In summary, in the HEV mode, the engine may not be able to output maximum torque according to its external characteristics at lower 
. Therefore, rational gear shifting and operating mode selection are needed to maintain fuel economy while avoiding this problem.
In this study, to address the aforementioned issues, gear shifting and operating mode selection are correlated with the AP and velocity, as shown in Figure 9. The gear shift velocity is determined from the speed characteristics (Table 1) of the electric motor and the engine, as well as the driver’s driving behavior. The operating mode is determined from the speed and torque characteristics (Table 1) of the electric motor and the engine, as well as the driver’s driving behavior.
By employing this approach, the computational efficiency of DP is improved by reducing the decision variables, while ensuring that the vehicle achieves optimal power performance when transitioning between different operating modes.
To differentiate between driving styles, they are classified based on the AP and the operating modes. The specific classification (I to IV in Figure 9) is as follows:
- I. Economical Driving (0–40%): In economical driving, the vehicle has relatively weak power demand. The PHET operates only in the SEV mode, where only motor MG2 is engaged. There is no need to consider operating mode selection or power allocation during the drive.
- II. Comfortable Driving (40–60%): In comfortable driving, the vehicle has moderate power demand. The PHET operates only in the DEV mode, with both motor MG1 and motor MG2 engaged. There is no need to consider operating mode selection, but power allocation between the two motors needs to be taken into account.
- III. Aggressive Driving (60–80%): In aggressive driving, the vehicle has strong power demand. The PHET alternates between the DEV and HEV modes, with a higher proportion in the DEV mode. Operating mode selection and power allocation between different components of the powertrain system need to be considered in this driving style.
- IV. Dangerous Driving (80–100%): In dangerous driving, the vehicle has the strongest power demand. The driving situation of the PHET is similar to aggressive driving, but with a higher proportion in the HEV mode. Additionally, when the vehicle’s battery is low, gear shifting and operating mode selection follow the rules of this driving style, although the power allocation strategy may differ.
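The classification above amounts to a lookup from AP to a driving style and its permitted operating modes. The following sketch shows one way to encode it. The names `DrivingStyle` and `classify_driving_style` are hypothetical, and two details are assumptions rather than statements from this study: each range is treated as closed at its lower bound (so an AP of exactly 40% falls in style II), and the low-battery rule is modeled as a simple flag.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DrivingStyle:
    label: str                     # "I" to "IV", as in the classification
    name: str
    modes: Tuple[str, ...]         # permitted operating modes, dominant mode first
    needs_mode_selection: bool
    needs_power_allocation: bool


# (upper AP bound in percent, style); boundary handling is an assumption.
_STYLES = (
    (40.0, DrivingStyle("I", "Economical", ("SEV",), False, False)),
    (60.0, DrivingStyle("II", "Comfortable", ("DEV",), False, True)),
    (80.0, DrivingStyle("III", "Aggressive", ("DEV", "HEV"), True, True)),
    (100.0, DrivingStyle("IV", "Dangerous", ("HEV", "DEV"), True, True)),
)


def classify_driving_style(ap_percent: float, battery_low: bool = False) -> DrivingStyle:
    """Map accelerator pedal position (0-100 %) to a driving style."""
    if not 0.0 <= ap_percent <= 100.0:
        raise ValueError("AP must lie between 0 and 100 percent")
    if battery_low:
        # With a low battery, gear and mode selection follow the style IV rules.
        return _STYLES[-1][1]
    for upper, style in _STYLES:
        if ap_percent < upper:
            return style
    return _STYLES[-1][1]  # ap_percent == 100
```

Because the style fixes the set of permitted modes, the DP no longer needs to enumerate every gear and mode combination at each grid point, which is the source of the reduction in decision variables.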
3.2.3. Optimization of Step Size
When applying the DP algorithm to solve problems, it is common to divide the problem into multiple decision stages in the time domain, and then calculate the optimal performance indicators and decision variables for each sub-stage. In the context of economic velocity planning, the system disturbances are based on spatial factors such as road gradient and lane speed limits. If the optimization problem is discretized in the time domain, significant variations in vehicle velocity can occur in the spatial domain across different sub-stages, thereby affecting the planning results. Additionally, in this study, it is necessary to obtain road condition information ahead of time through V2X communication before conducting economic velocity planning. 
With time-domain discretization alone, the stages would not align accurately with positions along the road, and hence with this spatially indexed road information. For this reason, a DP algorithm based on spatial-domain discretization is employed in this study for the economic velocity planning strategy of the PHET. When determining the discretization step size (
) in the 
 stage, setting it as a constant value can lead to certain issues. For instance, if 
 is set to be too large, it can result in significant differences in velocity between adjacent nodes, thus impacting the accuracy of velocity planning and the effectiveness of the actual plans. On the other hand, if 
 is too small, it would require a larger number of steps in the planning space, thereby increasing the computational complexity. To effectively address this issue, this study sets 
 as a mathematical model that is correlated with the predicted velocity 
, and a gear adjustment factor 
 in the 
 stage:
          where 
 will be discussed in the next section, while 
 is positively correlated with the predicted gear position.
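The trade-off described above can be made concrete with a small sketch. The step-size model of this study is not reproduced here; the form below, in which the step is the distance covered at the predicted velocity over a fixed time headway, scaled by the gear factor and clamped to a minimum and maximum, is purely an assumption. The function name, the headway, and the bounds are hypothetical.

```python
def adaptive_step_size(
    v_pred: float,               # predicted velocity in m/s
    gear_factor: float,          # gear adjustment factor, larger for higher gears
    time_headway: float = 2.0,   # assumed time span covered by one stage, in s
    ds_min: float = 5.0,         # assumed minimum step size, in m
    ds_max: float = 50.0,        # assumed maximum step size, in m
) -> float:
    """Return a spatial step size that grows with predicted velocity and gear."""
    if v_pred < 0.0 or gear_factor <= 0.0:
        raise ValueError("v_pred must be non-negative and gear_factor positive")
    raw = gear_factor * v_pred * time_headway
    # Clamp so that slow driving does not produce an excessive number of stages
    # and fast driving does not produce large velocity jumps between nodes.
    return min(max(raw, ds_min), ds_max)
```

With these assumed values, a predicted velocity of 10 m/s in a gear with factor 1.0 gives a 20 m step, while very low or very high velocities fall back to the bounds.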
3.2.4. Optimization of State Space
Before optimizing the DP algorithm’s state space, it is necessary to determine the terminal state. In the context of economic velocity planning, the terminal state refers to the desired final vehicle velocity. The variation in vehicle velocity during travel is influenced by spatial domain information. Therefore, in the process of economic velocity planning, the terminal velocity is affected by the vehicle’s current state, driver’s driving style, and road information. In this study, the terminal velocity within each planning cycle is determined using a velocity prediction equation. It assumes that within each stage, the road slope and speed limit remain constant, and any variations in driving resistance due to changes in velocity are ignored. Additionally, the vehicle acceleration is assumed to be constant. The velocity prediction equation can be expressed as
          
          where 
 is the predicted vehicle acceleration in the stage 
. It can be expressed as:
          where 
, 
, and 
 represent the predicted rolling resistance, predicted air resistance, and predicted gradient resistance for the 
 stage, respectively. These values can be computed by incorporating road condition information obtained from V2X and Equation (11). 
 represents the predicted driving force on the tire in the 
 stage, and can be expressed as
          
          where 
 represents the torque relaxation factor, which is correlated with the driving style and road surface information. Its purpose is to avoid prolonged operation of the powertrain system under high-load conditions, which could impact the lifespan of the motor or engine. 
 represents the predicted torque output of the powertrain system during the 
 stage.
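The prediction chain described above (torque to driving force, force balance to acceleration, acceleration to next-stage velocity) can be sketched as follows. This is an illustration under stated assumptions, not the equations of this study: the resistance terms use the standard longitudinal vehicle model in place of Equation (11), rotating inertia is ignored, and every parameter value in `VehicleParams` is hypothetical. The velocity update uses the constant-acceleration relation $v_{k+1}^2 = v_k^2 + 2 a \Delta s$, which follows from the constant-acceleration assumption over a spatial step.

```python
import math
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass(frozen=True)
class VehicleParams:
    # All values are illustrative placeholders, not data from this study.
    mass: float = 18000.0          # kg
    rolling_coeff: float = 0.008   # dimensionless
    drag_area: float = 6.0         # drag coefficient times frontal area, m^2
    air_density: float = 1.2       # kg/m^3
    wheel_radius: float = 0.5      # m
    driveline_eff: float = 0.92    # dimensionless


def predicted_drive_force(torque, total_ratio, relaxation, p):
    """Driving force at the tire from powertrain torque, scaled by the relaxation factor."""
    return relaxation * torque * total_ratio * p.driveline_eff / p.wheel_radius


def predicted_acceleration(v, drive_force, grade_rad, p):
    """Force balance with rolling, air, and gradient resistance held fixed over the stage."""
    f_roll = p.mass * G * p.rolling_coeff * math.cos(grade_rad)
    f_air = 0.5 * p.air_density * p.drag_area * v * v
    f_grade = p.mass * G * math.sin(grade_rad)
    return (drive_force - f_roll - f_air - f_grade) / p.mass


def predict_velocity(v, accel, ds):
    """Velocity after travelling ds metres at constant acceleration, floored at zero."""
    return math.sqrt(max(v * v + 2.0 * accel * ds, 0.0))
```

A relaxation factor below 1 lowers the predicted driving force, which in this sketch plays the role of keeping the predicted operating point away from sustained full-load operation.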
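The constraint conditions are as follows:
$$\begin{cases}
T_{\mathrm{MG1,min}} \le T_{\mathrm{MG1}} \le T_{\mathrm{MG1,max}} \\
T_{\mathrm{MG2,min}} \le T_{\mathrm{MG2}} \le T_{\mathrm{MG2,max}} \\
n_{\mathrm{MG1,min}} \le n_{\mathrm{MG1}} \le n_{\mathrm{MG1,max}} \\
n_{\mathrm{MG2,min}} \le n_{\mathrm{MG2}} \le n_{\mathrm{MG2,max}} \\
T_{\mathrm{e,min}} \le T_{\mathrm{e}} \le T_{\mathrm{e,max}} \\
P_{\mathrm{bat,min}} \le P_{\mathrm{bat}} \le P_{\mathrm{bat,max}} \\
a_{\mathrm{pre,min}} \le a_{\mathrm{pre},k} \le a_{\mathrm{pre,max}} \\
\Delta s_{\min} \le \Delta s_k \le \Delta s_{\max}
\end{cases}$$
where $T_{\mathrm{MG1,min}}$ and $T_{\mathrm{MG2,min}}$ ($T_{\mathrm{MG1,max}}$ and $T_{\mathrm{MG2,max}}$) are the minimum (maximum) output torques provided by motors MG1 and MG2, respectively. $n_{\mathrm{MG1,min}}$ and $n_{\mathrm{MG2,min}}$ ($n_{\mathrm{MG1,max}}$ and $n_{\mathrm{MG2,max}}$) are the minimum (maximum) output speeds provided by motors MG1 and MG2, respectively. $T_{\mathrm{e,min}}$ and $T_{\mathrm{e,max}}$ are the minimum and maximum output torques, respectively, that are provided by the engine. $P_{\mathrm{bat,max}}$ and $P_{\mathrm{bat,min}}$ are the upper and lower limits of battery power, respectively. $a_{\mathrm{pre,min}}$ and $a_{\mathrm{pre,max}}$ are the minimum and maximum predicted acceleration, respectively. $\Delta s_{\min}$ and $\Delta s_{\max}$ are the minimum and maximum step size, respectively.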
In the DP algorithm, the number of grid points is crucial in determining the computation results. Increasing the number of discrete grid points yields more accurate results but a longer computation time. Conversely, reducing the number of grid points shortens the computation time but may distort the results. To address this issue, this study proposes an IDP algorithm that reduces the number of grid points by constraining the state space without compromising the overall optimization performance. The principle is illustrated in Figure 10, where the discrete state variables of each stage in the spatial domain are transformed from the global state space (
) to a local predictive state space (
). The spatial constraints consist of two boundary components, namely the predictive boundary (
) and the planning boundary (
). Among them, 
 represents the boundaries predicted during the forecasting stage based on road information and driving style. The upper predictive boundary 
 and the lower predictive boundary 
 can be expressed as
          
The upper planning boundary 
 and lower planning boundary 
 are determined based on terminal velocity 
, and are calculated by reverse planning. They can be expressed as
          
Please refer to Equation (21) for the specific calculation method.
Combining 
, 
, 
AP, and road velocity limit (
 and 
), the 
 for 
 stage in the IDP algorithm can be obtained. The upper bound 
 and lower bound 
 of 
 can be expressed as:
          where 
 represents the maximum vehicle velocity constrained by the accelerator pedal in the 
 stage, as referenced in 
Figure 9. 
 and 
 are the maximum and minimum velocity limits of the road, respectively, which can be obtained through V2X.
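One plausible reading of how these boundaries combine is an intersection: the local upper bound is the smallest of the upper limits, and the local lower bound is the largest of the lower limits. The sketch below encodes that reading and then discretizes the resulting interval. The intersection rule, the function names, and the grid resolution are assumptions made for illustration and are not taken from the equations of this study.

```python
def local_velocity_bounds(pred, plan, v_ap_max, road):
    """Intersect the velocity limits for one stage.

    pred, plan, road: (lower, upper) tuples in m/s for the predictive boundary,
    the planning boundary, and the road velocity limits.
    v_ap_max: maximum velocity permitted by the accelerator pedal position.
    """
    upper = min(pred[1], plan[1], v_ap_max, road[1])
    lower = max(pred[0], plan[0], road[0])
    if lower > upper:
        raise ValueError("the boundaries leave no feasible velocity")
    return lower, upper


def velocity_grid(lower, upper, dv):
    """Discretize [lower, upper] with spacing dv, starting at the lower bound."""
    if dv <= 0.0:
        raise ValueError("dv must be positive")
    n_points = int((upper - lower) / dv + 1e-9) + 1
    return [lower + i * dv for i in range(n_points)]


# Hypothetical stage: the global space 0..25 m/s shrinks to 10..16 m/s.
bounds = local_velocity_bounds(pred=(8.0, 20.0), plan=(10.0, 18.0),
                               v_ap_max=16.0, road=(0.0, 25.0))
grid = velocity_grid(bounds[0], bounds[1], dv=2.0)
```

In this hypothetical stage the global range would need 13 grid points at the same 2 m/s spacing, whereas the local space needs 4, which illustrates where the reduction in computation comes from.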
By constraining the state space, the number of grid points can be effectively reduced. However, in this study, gear shifting and operating mode selection are based on the vehicle velocity and AP, so different gears and operating modes may be present within the local predicted state space. During the backward planning process, the vehicle state of the previous stage is derived from the vehicle state of the subsequent stage. As a result, some states of the previous stage may not be able to reach the subsequent stage during the forward transition, as shown in Figure 11.
In 
Figure 11, the 
 of stage 
 is derived by backward propagation from 
 of stage 
. However, due to the differences in the vehicle’s working mode and gear during forward transitions and backward propagation, there can be discrepancies between the true state space (
) and 
 of stage 
. To address this issue, this study introduces a penalty function (
) to adjust the cost function (
). The underlying principle can be expressed as
          
          where 
 is dependent on the state variables 
 and control variables 
 at stage 
. During the computation process of the IDP algorithm, if an infeasible state transition occurs, the introduction of the penalty function can significantly increase the overall cost. This allows for the avoidance of such situations in determining the optimal control decisions, and enhances the robustness of the algorithm during state transitions.
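The effect of the penalty can be shown with a minimal sketch. It assumes, for illustration only, that a transition counts as infeasible when the resulting velocity falls outside the range that is actually reachable in the next stage, and that the penalty is a large constant. The function names and the magnitude `1.0e6` are hypothetical.

```python
def penalized_stage_cost(base_cost, v_next, reachable, penalty=1.0e6):
    """Stage cost plus a large penalty when the transition is infeasible."""
    lower, upper = reachable
    feasible = lower <= v_next <= upper
    return base_cost if feasible else base_cost + penalty


def best_control(candidates, reachable):
    """Pick the control with the lowest penalized cost.

    candidates: iterable of (control, base_cost, v_next) tuples.
    """
    return min(
        candidates,
        key=lambda c: penalized_stage_cost(c[1], c[2], reachable),
    )[0]


# Hypothetical stage: coasting is cheaper but ends below the reachable range.
candidates = [("coast", 1.0, 9.0), ("accelerate", 2.5, 12.0)]
choice = best_control(candidates, reachable=(11.0, 15.0))
```

Without the penalty the cheaper but infeasible "coast" decision would win. With it, the minimization selects "accelerate", so infeasible transitions are excluded from the optimal decision sequence without removing them from the grid.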
The structure of the IDP algorithm after the optimizations described in the previous sections is shown in Figure 12. In this figure, 
 represents the number of stages in each planning cycle, 
 denotes the number of discrete state variables, 
 represents the number of discrete decision variables, 
 represents the discrete state variable function, 
 represents the discrete decision variable function, 
 represents the cost function of each sub-stage, and 
 represents the overall cost function.
First, the terminal velocity and the step size of each sub-stage are obtained through predictive calculations. Second, the predictive and planning boundaries are estimated using forward prediction and backward planning. Third, the local state space that satisfies the imposed constraints is discretized to obtain the discrete state variables, and the discrete decision variables are determined from the characteristics of these states. Each state is then re-evaluated to check whether it can actually complete the state transition. If it cannot, the cost at that stage is amplified by the penalty function. Finally, the minimum overall cost for that state is calculated.