Online Charging Strategy for Electric Vehicle Clusters Based on Multi-Agent Reinforcement Learning and Long–Short Memory Networks
Abstract
1. Introduction
- We propose an LSTM-based prediction strategy that supplies accurate forecasts of the floating electricity tariff to the charging strategy algorithm, greatly improving the algorithm's accuracy.
- We propose a multi-agent reinforcement learning-based online charging strategy for EV clusters that accounts for uncertainties such as variable electricity prices and user vehicle usage.
- We design a centralized-training, distributed-execution approach that coordinates the charging strategies of the charging piles within a community home charging station. It maintains a cooperative and competitive relationship among the charging piles while controlling and coordinating them globally, minimizing users' charging costs and maintaining load balance on the community grid.
2. Real-Time Grid Price Forecasting Model Based on Long Short-Term Memory Networks
2.1. RNN-Based LSTM Network
2.2. LSTM-Based Real-Time Grid Price Prediction Model
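The price predictor described in this section can be sketched as a small PyTorch LSTM regressor that maps a window of past spot prices to the next-step price. This is an illustrative sketch, not the paper's exact network: the window length, hidden size, and layer count below are assumptions.

```python
import torch
import torch.nn as nn

class PriceLSTM(nn.Module):
    """Minimal LSTM regressor: a window of past prices -> next-step price."""
    def __init__(self, n_features=1, hidden=64, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, layers, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window, n_features)
        out, _ = self.lstm(x)         # out: (batch, window, hidden)
        return self.head(out[:, -1])  # predict from the last time step

# One training step on a dummy batch of 24-step price windows
model = PriceLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(8, 24, 1)            # 8 windows of 24 past prices
y = torch.randn(8, 1)                # next-step target prices
loss = nn.MSELoss()(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```

In deployment, the realized real-time price at each step would be appended to the window and used as a new training target, so the forecaster keeps adapting online.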
3. A Multi-Agent Reinforcement Learning-Based Model for Electric Vehicle Cluster Charging Strategy
3.1. Introduction to Multi-Agent Reinforcement Learning
3.2. Electric Vehicle Cluster Charging Strategy Model
3.2.1. Cluster Charging Behavior Analysis
3.2.2. EV Cluster Charging Strategy: Markov Decision Process
- State space: The states of EV cluster charging fall into two groups: (1) the single-agent charging state, which is primarily responsible for ensuring that an individual EV's charging demand is satisfied; (2) the multi-agent charging state, a collective state mainly responsible for maintaining overall stability.
- As shown in Table 1, the EV battery's state of charge is obtained directly when the charging pile starts a charging task, and the charging status is updated at the same time. To further reduce the user's cost and extend battery life, our idea is to set a charging threshold for daytime charging according to the community users' past usage and to charge the remaining power during the low-consumption period. At the same time, a discharge threshold in V2G mode ensures that users still have electricity available for an unplanned trip.
| | Single-Agent Charging State | Multi-Agent Charging State |
|---|---|---|
| Real-time data | | |
| Historical data | | |
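The daytime charging threshold and V2G discharge floor described above can be sketched as a simple set-point rule; the specific threshold values and power limits below are illustrative assumptions, not the paper's parameters.

```python
def charging_action(soc, daytime, day_threshold=0.8, v2g_floor=0.3,
                    max_charge_kw=7.0, max_discharge_kw=7.0):
    """Threshold rule sketch (thresholds and power limits are assumed values).

    - During the daytime, charge only up to `day_threshold` of capacity;
      the remainder is deferred to the low-consumption (off-peak) period.
    - In V2G mode, never discharge below `v2g_floor`, so the user still
      has range for an unplanned trip.
    Returns a signed power set-point in kW (+charge / -discharge).
    """
    if daytime:
        if soc < day_threshold:
            return max_charge_kw       # below the daytime cap: charge
        if soc > v2g_floor:
            return -max_discharge_kw   # surplus above the floor can feed the grid
        return 0.0                     # at the V2G floor: hold
    # Off-peak: finish charging to full
    return max_charge_kw if soc < 1.0 else 0.0

print(charging_action(0.5, daytime=True))   # -> 7.0  (charge toward cap)
print(charging_action(0.9, daytime=True))   # -> -7.0 (V2G discharge)
print(charging_action(0.9, daytime=False))  # -> 7.0  (off-peak top-up)
```

In the learned strategy, the RL agents choose the output power directly; this rule only illustrates the constraint structure the thresholds impose.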
- Action space: The action space of EV cluster charging is the output power of each charging pile.
- Reward: The reward provides delayed feedback on an agent's action decisions, enabling it to continually optimize its policy; the reward function therefore strongly influences the quality of the learned decisions. In the multi-agent reinforcement learning algorithm proposed in this paper, the reward function is set as a negative penalty function, which helps the neural networks converge quickly.
- Observation: To give the central processor a better grasp of the global information, an observation is defined for each agent, as in Equation (14):
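As a minimal illustration of a negative-penalty reward that combines user cost with grid load balancing (the paper's actual reward is given by its own equations; the weights and functional form here are assumptions):

```python
def reward(price, power_kw, cluster_load_kw, target_load_kw,
           alpha=1.0, beta=0.1, dt_h=0.5):
    """Negative-penalty reward sketch (alpha, beta, dt_h are assumed values).

    Penalizes (1) this agent's charging cost over one time step and
    (2) the squared deviation of the total cluster load from a target
    level, which encourages load balancing on the community grid.
    """
    cost = price * power_kw * dt_h                 # user electricity cost
    imbalance = (cluster_load_kw - target_load_kw) ** 2
    return -(alpha * cost + beta * imbalance)      # reward = -penalty
```

Because the penalty is never negative, the reward is bounded above by zero, so the agents are driven toward the zero-penalty operating point from below, which in practice aids stable convergence.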
3.2.3. Multi-Agent Reinforcement Learning-Based Charging Strategy Process for Electric Vehicle Clusters
Algorithm 1: MADDPG Electric Vehicle Cluster Charging Strategy Process
1: for episode = 1 to the maximum number of episodes do
2:  Initialize the random process for action exploration
3:  Receive the initial state
4:  for t = 1 to the maximum time period do
5:   Forecast the electricity price with the LSTM network
6:   Each agent selects an action according to its current policy
7:   Execute the actions; observe the rewards and the new state
8:   Return the realized real-time electricity price to the LSTM network for optimization
9:   Store the transition in the experience pool
10:  Update the current state to the new state
11:  for agent i = 1 to the number of agents do
12:   Draw a minibatch of samples from the experience pool
13:   Update the critic network and the actor network
14:  end for
15:  Update the target network parameters of each charging pile agent
16: end for
17: end for
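The centralized-training, distributed-execution update at the core of Algorithm 1 can be sketched in PyTorch: each actor acts on its own observation only, while each critic is trained on the joint observations and actions of all agents. This is a minimal sketch of one actor update on dummy data; the network sizes and two-agent setup are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

OBS, ACT, N_AGENTS = 6, 1, 2          # illustrative sizes, not the paper's

def mlp(n_in, n_out):
    return nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(), nn.Linear(64, n_out))

# Distributed execution: each actor sees only its own observation.
actors = [nn.Sequential(mlp(OBS, ACT), nn.Tanh()) for _ in range(N_AGENTS)]
# Centralized training: each critic sees all observations and all actions.
critics = [mlp(N_AGENTS * (OBS + ACT), 1) for _ in range(N_AGENTS)]

obs = [torch.randn(32, OBS) for _ in range(N_AGENTS)]   # minibatch from pool
# For agent 0's actor update, only its own action carries gradient;
# the other agents' actions are treated as fixed (detached).
acts = [actors[0](obs[0])] + [actors[i](obs[i]).detach()
                              for i in range(1, N_AGENTS)]

# Gradient ascent on agent 0's centralized Q-value (descent on its negative)
q_input = torch.cat(obs + acts, dim=-1)
actor_loss = -critics[0](q_input).mean()
opt = torch.optim.Adam(actors[0].parameters(), lr=1e-3)
opt.zero_grad(); actor_loss.backward(); opt.step()
```

The critic update (regressing Q toward the reward plus the discounted target-network Q) and the soft target-network update in line 15 follow the standard MADDPG recipe and are omitted here for brevity.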
4. Algorithm Analysis
4.1. Analysis of LSTM-Based Electricity Price Prediction Algorithm
4.1.1. Data Description
4.1.2. Parameter Setting
4.1.3. Analysis of Results
4.2. Analysis of a Cluster Charging Strategy Algorithm for Community Home Electric Vehicles Based on Multi-Agent Reinforcement Learning
4.2.1. Experiment Description
4.2.2. Parameter Setting
4.2.3. Analysis of Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Experimental Configuration | Specific Parameters |
---|---|
CPU | Intel Core i7-11700K @ 5.0 GHz |
GPU | NVIDIA GeForce GTX 1660 Super |
Memory | 16 GB |
Operating System | Windows 10 |
Programming Environment | Python 3.7, PyTorch 1.10.2 + cu102 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shen, X.; Zhang, Y.; Wang, D. Online Charging Strategy for Electric Vehicle Clusters Based on Multi-Agent Reinforcement Learning and Long–Short Memory Networks. Energies 2022, 15, 4582. https://doi.org/10.3390/en15134582