Platooning Cooperative Adaptive Cruise Control for Dynamic Performance and Energy Saving: A Comparative Study of Linear Quadratic and Reinforcement Learning-Based Controllers
Abstract
1. Introduction
2. Model Description
2.1. Vehicle Model
2.2. Spacing Policy
3. LQ Controller
3.1. Open Loop System: Model Linearization
3.2. Control: Closed Loop System
3.2.1. Tuning of Q and R Matrices and Time Headway
- Comfortable drive: to ensure comfort for each vehicle, threshold values for longitudinal acceleration and jerk are set to 2 m/s² and 0.9 m/s³, respectively (the limits reported in the table below), as specified by the authors of [26].
- Safety: the control system must be able to safely stop each vehicle in the platoon, i.e., avoid rear-end collisions, when an emergency braking manoeuvre occurs at the maximum speed permitted by the regulations, e.g., the highway speed limit for HDVs (80 km/h). A minimal sketch of how the feedback gain follows from the chosen Q and R is given after this list.
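The paper's tuned Q, R, and time headway values are not reproduced in this outline; as a minimal sketch of the mechanics, the snippet below computes the LQ state-feedback gain for an illustrative two-state follower model (spacing error and relative velocity). All matrices here are placeholder assumptions, not the tuned values from this work.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    """Solve the continuous-time algebraic Riccati equation and
    return the optimal state-feedback gain K (u = -K @ x)."""
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

# Illustrative follower model: x = [spacing error, relative velocity],
# u = follower acceleration command. These matrices are assumptions,
# not the paper's linearized platoon model.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 0.5])  # heavier weight on the spacing error
R = np.array([[0.1]])    # weight on control effort (limits acceleration/jerk)

K = lqr_gain(A, B, Q, R)
print(K)
```

Raising the entries of R (or lowering those of Q) trades tracking accuracy for smoother acceleration and jerk profiles, which is how the comfort and safety requirements above drive the tuning.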
3.2.2. WLTP Class 3 Driving Cycle
3.2.3. Emergency Braking Manoeuvre
Vehicle | RMS jerk [m/s³] (limit 0.9) | | | RMS acceleration [m/s²] (limit 2) | |
---|---|---|---|---|---|---
Follower 1 | 0.156 | 0.113 | 0.088 | 0.49 | 0.423 | 0.373
 | 0.154 | 0.111 | 0.086 | 0.495 | 0.428 | 0.374
 | 0.148 | 0.106 | 0.082 | 0.505 | 0.431 | 0.374
Follower 2 | 0.122 | 0.073 | 0.049 | 0.452 | 0.36 | 0.294
 | 0.117 | 0.070 | 0.047 | 0.453 | 0.359 | 0.293
 | 0.108 | 0.063 | 0.042 | 0.456 | 0.354 | 0.287
4. RL Control System
4.1. Reinforcement Learning for Control
4.2. The DDPG Algorithm
- The actor, which receives observations and returns actions (thus playing the role of the policy).
- The critic, which receives the observations and the actions taken by the actor, evaluates them, and returns a prediction of the discounted long-term reward.
Algorithm 1: DDPG Algorithm
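The contents of the algorithm box are not reproduced above; as a hedged sketch of the core update described by Lillicrap et al. [28], the snippet below implements one DDPG gradient step with actor and critic networks, target networks, and Polyak averaging. Network sizes, learning rates, and dimensions are illustrative assumptions, not the paper's hyperparameters.

```python
import copy
import torch
import torch.nn as nn

# Illustrative dimensions and architectures (placeholders, not the paper's agent).
obs_dim, act_dim = 6, 1
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_target = copy.deepcopy(actor)
critic_target = copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma, tau = 0.99, 0.005  # discount factor and soft-update rate

def ddpg_update(obs, act, rew, next_obs, done):
    """One DDPG gradient step on a minibatch of transitions."""
    # Critic: regress Q(s, a) onto the target y = r + gamma * Q'(s', mu'(s')).
    with torch.no_grad():
        next_act = actor_target(next_obs)
        y = rew + gamma * (1.0 - done) * critic_target(
            torch.cat([next_obs, next_act], dim=-1))
    q = critic(torch.cat([obs, act], dim=-1))
    critic_loss = nn.functional.mse_loss(q, y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: follow the deterministic policy gradient, i.e., maximize Q(s, mu(s)).
    actor_loss = -critic(torch.cat([obs, actor(obs)], dim=-1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Soft (Polyak) update of both target networks.
    with torch.no_grad():
        for net, tgt in ((actor, actor_target), (critic, critic_target)):
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.mul_(1.0 - tau).add_(tau * p)
```

During training, transitions sampled from a replay buffer are fed to `ddpg_update`, and exploration noise is added to the actor's output when interacting with the environment.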
4.3. Agent Structure
- The time headways of follower 1 with respect to the leader and of follower 2 with respect to follower 1;
- The velocity and acceleration of the leader;
- The velocities of follower 1 and follower 2.
- The reward is set to the maximum negative value (−10);
- The corresponding training episode is terminated (an illustrative sketch of this penalty logic follows below).
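Only the penalty value (−10) and the early termination are specified above; the tracking terms and the unsafe-distance threshold in the sketch below are assumptions for illustration. The weights w1 = w2 = 1 follow the reward-function weights reported in Appendix A.

```python
def step_reward(h_err_1, h_err_2, jerk, d_12, d_23,
                d_min=5.0, w1=1.0, w2=1.0):
    """Illustrative reward with a safety penalty: the tracking terms and the
    unsafe-distance threshold d_min are assumptions, not the paper's exact
    reward function."""
    if d_12 < d_min or d_23 < d_min:
        # Unsafe inter-vehicle distance: max negative reward, stop the episode.
        return -10.0, True
    reward = -(w1 * (h_err_1**2 + h_err_2**2) + w2 * jerk**2)
    return reward, False
```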
4.4. Evaluation of the RL Agent
- Percentage energy savings with respect to the leader vehicle;
- RMS of the time headway;
- RMS of the time headway error with respect to the desired value;
- RMS of the jerk (a computation sketch follows this list).
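As a minimal sketch, these indices can be computed from logged simulation traces as follows; the signal names are placeholders.

```python
import numpy as np

def rms(x):
    """Root mean square of a sampled signal."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x**2)))

def follower_indices(E_follower, E_leader, headway, headway_ref, jerk):
    """Evaluation indices for one follower, from logged traces."""
    return {
        "energy_saving_pct": 100.0 * (E_leader - E_follower) / E_leader,
        "rms_headway": rms(headway),
        "rms_headway_error": rms(np.asarray(headway) - np.asarray(headway_ref)),
        "rms_jerk": rms(jerk),
    }
```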
5. Results
- Standard driving cycle: the leader vehicle tracks the FTP75 driving cycle (saturated to a minimum speed of 2 m/s to avoid backward motion of the platoon), while the followers try to keep the reference inter-vehicle distance defined by a linear spacing policy and the reference platoon velocity (a sketch of this spacing policy follows the list).
- Cut-in scenario: a vehicle not belonging to the platoon cuts into the lane occupied by the platoon, altering the equilibrium of the controlled system; this scenario verifies the controller's ability to react when an unexpected obstacle breaks the platoon's equilibrium.
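The linear spacing policy referenced in the first scenario can be sketched as below. With the 1.4 s time headway from Appendix A, a standstill offset of about 5 m (an assumed value, not reported in this outline) reproduces the 36 m nominal inter-vehicle distance at the 80 km/h nominal speed.

```python
def reference_distance(v_follower, h=1.4, d0=5.0):
    """Linear (constant time headway) spacing policy: d_ref = d0 + h * v.
    h = 1.4 s is the time headway from Appendix A; the standstill offset
    d0 is an assumption chosen for consistency with the nominal values."""
    return d0 + h * v_follower

print(reference_distance(80 / 3.6))  # ~36.1 m at the 80 km/h nominal speed
```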
5.1. String Stability
5.2. LQC—RL Comparison
5.2.1. Driving Cycle Results
5.2.2. Cut-In Scenario Results
6. Conclusions
- The proposed model of the truck platoon includes the dependency of the aerodynamic drag on the inter-vehicle distance, which is vital to quantify the energy savings of each vehicle in the platoon.
- The virtual environment that has been developed enables the tuning and training of both classical and AI-based controllers and the assessment of platoon performance under different driving cycles.
- The LQC controller is string stable across the entire frequency range, while the RL-based controller may have a limited string-stability bandwidth (a minimal frequency-domain check is sketched after this list).
- Regardless of the type of controller, a linear spacing policy proved to be a suitable choice to meet all the requirements (dynamic performance and energy savings).
- The trained RL agent provides satisfactory results even on driving cycles different from those used during the learning phase.
- The simulation results of an RL-based controller are affected by nonlinearities; this control-related nonlinear behaviour comes from the combined operation of the safety-related penalty on the reward function and the activation functions of the neural network.
- The comparison through the selected performance indices (i.e., acceleration and jerk, final SOC, and energy consumption) during a standard driving cycle showed that, by properly selecting the reward function, RL and LQC achieve similar dynamic and energetic targets.
- The comparison during a cut-in simulation scenario showed that both controllers properly reduced the vehicle speed when the disturbance occurred, avoiding accidents.
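The string-stability claim above can be verified with a frequency-domain criterion in the spirit of Naus et al. [11]: disturbances must not amplify along the platoon, i.e., the magnitude of the predecessor-to-follower transfer function must not exceed one at any frequency. The sketch below checks this condition on a frequency grid; the example transfer function is illustrative, not the closed-loop dynamics derived in the paper.

```python
import numpy as np
from scipy import signal

def is_string_stable(num, den, w_min=1e-2, w_max=1e2, n=2000):
    """Check |Gamma(j*omega)| <= 1 on a log-spaced frequency grid, where
    Gamma(s) = num/den is the predecessor-to-follower transfer function."""
    w = np.logspace(np.log10(w_min), np.log10(w_max), n)
    _, h = signal.freqs(num, den, worN=w)
    return bool(np.all(np.abs(h) <= 1.0 + 1e-9))

# Example: a first-order lag propagates velocity without amplification.
print(is_string_stable([1.0], [0.5, 1.0]))  # True
```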
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix B
References
- Alam, A.; Gattami, A.; Johansson, K.H.; Tomlin, C.J. Guaranteeing safety for heavy duty vehicle platooning: Safe set computations and experimental evaluations. Control Eng. Pract. 2014, 24, 33–41.
- Vahidi, A.; Sciarretta, A. Energy saving potentials of connected and automated vehicles. Transp. Res. Part C Emerg. Technol. 2018, 95, 822–843.
- Tsugawa, S.; Jeschke, S.; Shladover, S.E. A review of truck platooning projects for energy savings. IEEE Trans. Intell. Veh. 2016, 1, 68–77.
- Bergenhem, C.; Pettersson, H.; Coelingh, E.; Englund, C.; Shladover, S.; Tsugawa, S. Overview of platooning systems. In Proceedings of the 19th ITS World Congress, Vienna, Austria, 22–26 October 2012.
- Ellis, M.; Gargoloff, J.I.; Sengupta, R. Aerodynamic Drag and Engine Cooling Effects on Class 8 Trucks in Platooning Configurations. SAE Int. J. Commer. Veh. 2015, 8, 732–739.
- Kaluva, S.T.; Pathak, A.; Ongel, A. Aerodynamic drag analysis of autonomous electric vehicle platoons. Energies 2020, 13, 4028.
- McAuliffe, B.; Lammert, M.; Lu, X.Y.; Shladover, S.; Surcel, M.D.; Kailas, A. Influences on Energy Savings of Heavy Trucks Using Cooperative Adaptive Cruise Control. In SAE Technical Papers; SAE International: Warrendale, PA, USA, 2018.
- Zhang, R.; Li, K.; Wu, Y.; Zhao, D.; Lv, Z.; Li, F.; Chen, X.; Qiu, Z.; Yu, F. A Multi-Vehicle Longitudinal Trajectory Collision Avoidance Strategy Using AEBS with Vehicle-Infrastructure Communication. IEEE Trans. Veh. Technol. 2022, 71, 1253–1266.
- Wu, C.; Xu, Z.; Liu, Y.; Fu, C.; Li, K.; Hu, M. Spacing policies for adaptive cruise control: A survey. IEEE Access 2020, 8, 50149–50162.
- Gunter, G.; Gloudemans, D.; Stern, R.E.; McQuade, S.; Bhadani, R.; Bunting, M.; Monache, M.L.D.; Lysecky, R.; Seibold, B.; Sprinkle, J.; et al. Are Commercially Implemented Adaptive Cruise Control Systems String Stable? IEEE Trans. Intell. Transp. Syst. 2021, 22, 6992–7003.
- Naus, G.J.L.; Vugts, R.P.A.; Ploeg, J.; Van De Molengraft, M.J.G.; Steinbuch, M. String-stable CACC design and experimental validation: A frequency-domain approach. IEEE Trans. Veh. Technol. 2010, 59, 4268–4279.
- Besselink, B.; Johansson, K.H. String Stability and a Delay-Based Spacing Policy for Vehicle Platoons Subject to Disturbances. IEEE Trans. Autom. Control 2017, 62, 4376–4391.
- Sugimachi, T.; Fukao, T.; Suzuki, Y.; Kawashima, H. Development of autonomous platooning system for heavy-duty trucks. In IFAC Proceedings Volumes (IFAC-PapersOnline); IFAC Secretariat: Laxenburg, Austria, 2013; pp. 52–57.
- Turri, V.; Besselink, B.; Johansson, K.H. Cooperative Look-Ahead Control for Fuel-Efficient and Safe Heavy-Duty Vehicle Platooning. IEEE Trans. Control Syst. Technol. 2017, 25, 12–28.
- Ward, J.W.; Stegner, E.M.; Hoffman, M.A.; Bevly, D.M. A Method of Optimal Control for Class 8 Vehicle Platoons Over Hilly Terrain. J. Dyn. Syst. Meas. Control 2022, 144, 011108.
- Gao, W.; Gao, J.; Ozbay, K.; Jiang, Z.P. Reinforcement-Learning-Based Cooperative Adaptive Cruise Control of Buses in the Lincoln Tunnel Corridor with Time-Varying Topology. IEEE Trans. Intell. Transp. Syst. 2019, 20, 3796–3805.
- Yang, J.; Liu, X.; Liu, S.; Chu, D.; Lu, L.; Wu, C. Longitudinal tracking control of vehicle platooning using DDPG-based PID. In Proceedings of the 2020 4th CAA International Conference on Vehicular Control and Intelligence, CVCI 2020, Hangzhou, China, 18–20 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 656–661.
- Peake, A.; McCalmon, J.; Raiford, B.; Liu, T.; Alqahtani, S. Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control. In Proceedings of the International Conference on Tools with Artificial Intelligence, ICTAI, Baltimore, MD, USA, 9–11 November 2020; IEEE Computer Society: Washington, DC, USA, 2020; pp. 15–22.
- Chu, T.; Kalabić, U. Model-based deep reinforcement learning for CACC in mixed-autonomy vehicle platoon. In Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France, 11–13 December 2019; pp. 4079–4084.
- Lin, Y.; McPhee, J.; Azad, N.L. Comparison of Deep Reinforcement Learning and Model Predictive Control for Adaptive Cruise Control. IEEE Trans. Intell. Veh. 2021, 6, 221–231.
- Hussein, A.A.; Rakha, H.A. Vehicle Platooning Impact on Drag Coefficients and Energy/Fuel Saving Implications. IEEE Trans. Veh. Technol. 2022, 71, 1199–1208.
- Vogel, K. A comparison of headway and time to collision as safety indicators. Accid. Anal. Prev. 2003, 35, 427–433.
- Ostertag, E. Mono- and Multivariable Control and Estimation: Linear, Quadratic and LMI Methods; Springer Science & Business Media: Berlin, Germany, 2011.
- Dimauro, L.; Tota, A.; Galvagno, E.; Velardocchia, M. Torque Allocation of Hybrid Electric Trucks for Drivability and Transient Emissions Reduction. Appl. Sci. 2023, 13, 3704.
- Barata, J.C.A.; Hussein, M.S. The Moore-Penrose Pseudoinverse: A Tutorial Review of the Theory. Braz. J. Phys. 2012, 42, 146–165.
- Bae, I.; Moon, J.; Jhung, J.; Suk, H.; Kim, T.; Park, H.; Cha, J.; Kim, J.; Kim, D.; Kim, S. Self-Driving like a Human driver instead of a Robocar: Personalized comfortable driving experience for autonomous vehicles. arXiv 2020, arXiv:2001.03908.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
- Acquarone, M.; Borneo, A.; Misul, D.A. Acceleration control strategy for Battery Electric Vehicle based on Deep Reinforcement Learning in V2V driving. In Proceedings of the 2022 IEEE Transportation Electrification Conference & Expo (ITEC), Anaheim, CA, USA, 15–17 June 2022.
- Wei, S.; Zou, Y.; Zhang, T.; Zhang, X.; Wang, W. Design and experimental validation of a cooperative adaptive cruise control system based on supervised reinforcement learning. Appl. Sci. 2018, 8, 1014.
- MathWorks. Train DDPG Agent for Adaptive Cruise Control; MATLAB & Simulink Documentation; MathWorks Italia: Torino, Italy, 2023.
Quantity | Value
---|---
Equivalent vehicle mass | 13,175 kg
Wheel radius | 0.5715 m
Electric motor maximum power | 300 kW
Electric motor maximum torque | 600 Nm
Total transmission efficiency | 95%
Total transmission ratio | 19.74
Battery maximum capacity | 693 Ah
Battery nominal voltage | 500 V
Initial SOC | 80%
Isolated vehicle drag coefficient | 0.57
Air density | 1.2 kg/m³
Vehicle frontal area | 8.9 m²
Road-tyre friction coefficient | 0.9
Rolling resistance coefficient (constant term) | 0.0041
Rolling resistance coefficient (speed-dependent term) | 0
Time headway | 1.4 s
Nominal speed | 80 km/h
Nominal inter-vehicle distance | 36 m
Nominal road slope | 5°
LQC: Q matrix | 
LQC: R matrix | 
RL: reward function weight 1 | 1
RL: reward function weight 2 | 1
RL: actor learning rate | 
RL: critic learning rate | 