Dynamic Allocation of C-V2X Communication Resources Based on Graph Attention Network and Deep Reinforcement Learning
Abstract
1. Introduction
1.1. Motivation
1.2. Related Work
1.3. Contributions
- Firstly, we adopt a graph attention network (GAT) to extract global features, shifting the optimization objective from the performance of individual vehicles to system-wide optimality across the entire vehicular network.
- Secondly, we dynamically update neighbor relationships based on real-time vehicle positions, so that the graph accurately captures the current interference patterns between vehicles (a minimal sketch of this update follows the list).
- Thirdly, we propose a novel GAT-Advantage Actor–Critic (GAT-A2C) RL framework, pioneering the integration of GAT with the Advantage Actor–Critic (A2C) algorithm. The architecture dynamically adapts to positional changes, communication states, and interference fluctuations among neighboring vehicles, enabling optimized resource allocation for both V2V and V2N links.
- Lastly, we conduct extensive experimental evaluations across vehicular scenarios with varying densities. The results demonstrate that our GAT-A2C framework outperforms existing methods on key metrics, including the V2N rate and the V2V success ratio, and excels particularly in high-density environments. The solution also exhibits robust adaptability and superior scalability across all tested vehicle densities.
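As a concrete illustration of the dynamic neighbor update in the second contribution, the sketch below rebuilds the adjacency matrix from current vehicle positions using a distance threshold (150 m in our experiments). This is a minimal sketch under assumed conventions: the function name, the Euclidean-distance metric, and the array layout are illustrative rather than part of the proposed framework.

```python
import numpy as np

def build_dynamic_adjacency(positions: np.ndarray, threshold: float = 150.0) -> np.ndarray:
    """Rebuild the graph adjacency matrix from current vehicle positions.

    positions: (N, 2) array of vehicle (x, y) coordinates in meters.
    Returns an (N, N) 0/1 adjacency matrix with self-loops, so each node
    also attends to its own features during GAT aggregation.
    """
    diff = positions[:, None, :] - positions[None, :, :]   # pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)                   # pairwise distances
    adj = (dist <= threshold).astype(np.float32)           # neighbor if within range
    np.fill_diagonal(adj, 1.0)                             # keep self-loops
    return adj
```

Because this matrix is rebuilt at every step rather than fixed at initialization, the attention weights always reflect the interference neighborhood induced by the vehicles' current positions.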
1.4. Organization
2. System Model, Problem Formulation, and Solution Scheme
2.1. System Model
2.2. Problem Formulation
2.3. Overall Solution Scheme
3. Design of Graph Attention Network
3.1. Graph Construction
3.1.1. Principle of Graph Construction
3.1.2. Graph Node State
3.1.3. Dynamic Adjacency Matrix for Graph Expression
3.2. GAT Model
- performing a linear transformation on the features of the node itself and of all neighboring nodes;
- calculating an attention score between the node and each neighbor;
- normalizing the attention scores into weights using softmax;
- applying the weights to aggregate the neighbor features and obtain the updated node features (a minimal sketch of a single attention head follows this list).
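A single attention head implementing these four steps can be sketched as follows, in the style of the GAT of Veličković et al. This is an illustrative sketch rather than our exact model; the class name, the dense (N, N) scoring, and the LeakyReLU slope of 0.2 are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATHead(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)   # step 1: linear transform
        self.a = nn.Linear(2 * out_dim, 1, bias=False)    # step 2: attention scoring

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops.
        z = self.W(h)                                     # (N, out_dim)
        n = z.size(0)
        # Step 2: score every (i, j) pair from the concatenated transformed features.
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)  # (N, N)
        # Step 3: softmax over neighbors only (non-edges are masked out).
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)
        # Step 4: weighted aggregation of neighbor features.
        return F.elu(alpha @ z)
```

In the full model, eight such heads run in parallel and their outputs are combined, as described in Section 3.2.3.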
3.2.1. Normalization and Linear Transformation on the Features
3.2.2. Attention Score Computation and Aggregation Result of a Single Attention Head
3.2.3. Multi-Head Attention Mechanism
3.3. Loss Function of GAT
4. The GAT-A2C Model for Resource Allocation Problems
4.1. The Design of Key Elements in RL and A2C
4.1.1. State Space
4.1.2. Action Space
4.1.3. Reward Function
4.1.4. Actor Network
4.1.5. Critic Network
4.2. Overall Framework of GAT-A2C Model
Algorithm 1 ResourceAllocationAlgorithm()

Require: Local state s_t; GAT model; Actor–Critic model; maximum iteration count T; current iteration counter t; convergence threshold ε; Done ← False
Ensure: GAT network parameters θ_g; Actor network parameters θ_μ; Critic network parameters θ_v; target network parameters θ_μ′, θ_v′; replay memory buffer D
1:  while t < T and Done = False do
2:      Initialize the environment and get the local state s_t
3:      // Step 1: Build the adjacency matrix (dynamic graph)
4:      Build the dynamic graph G_t from current vehicle positions
5:      // Step 2: GAT aggregates neighbor information
6:      h_t ← GAT(s_t, G_t; θ_g)
7:      // Step 3: State concatenation
8:      ŝ_t ← [s_t, h_t]
9:      // Step 4: Action selection (the Actor outputs resource blocks and power)
10:     a_t ← Actor(ŝ_t; θ_μ)
11:     // Step 5: The Critic scores the current state
12:     V(ŝ_t) ← Critic(ŝ_t; θ_v)
13:     // Step 6: The agent executes a_t in the environment, and the environment returns feedback
14:     Get the reward r_t and the next state ŝ_{t+1}
15:     // Step 7: Store the experience
16:     Store (ŝ_t, a_t, r_t, ŝ_{t+1}) into the replay memory buffer D
17:     // Step 8: Update the networks
18:     if the update period is reached then
19:         // Sample a batch of B instances
20:         Batch ← sample B instances from the replay memory buffer D
21:         // Calculate the TD (temporal-difference) target
22:         y ← r + γ V(ŝ′; θ_v′)
23:         // Update the Critic network by minimizing the TD error
24:         θ_v ← θ_v − α_v ∇_{θ_v} (y − V(ŝ; θ_v))²
25:         // Calculate the advantage
26:         A(ŝ, a) ← y − V(ŝ; θ_v)
27:         // Update the Actor network by maximizing the expected advantage
28:         θ_μ ← θ_μ + α_μ A(ŝ, a) ∇_{θ_μ} log π(a | ŝ; θ_μ)
29:         // Soft-update the target network parameters
30:         θ_μ′ ← τ θ_μ + (1 − τ) θ_μ′
31:         θ_v′ ← τ θ_v + (1 − τ) θ_v′
32:     end if
33:     t ← t + 1
34: end while
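The core of Step 8 can also be written compactly in code. The following is a minimal PyTorch sketch of the batched update, not the authors' exact implementation: the network objects, the optimizers, and the `actor.log_prob` helper are assumptions for illustration, while γ = 0.5 and τ = 0.01 follow the experimental settings table.

```python
import torch
import torch.nn.functional as F

def a2c_update(batch, actor, critic, target_critic,
               actor_opt, critic_opt, gamma=0.5, tau=0.01):
    s, a, r, s_next = batch  # tensors sampled from the replay buffer D

    # TD target: y = r + gamma * V(s'; theta_v') from the target critic.
    with torch.no_grad():
        y = r + gamma * target_critic(s_next).squeeze(-1)

    # Critic update: minimize the squared TD error.
    v = critic(s).squeeze(-1)
    critic_loss = F.mse_loss(v, y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Advantage A = y - V(s), detached so it acts as a constant weight.
    advantage = (y - v).detach()
    log_prob = actor.log_prob(s, a)   # assumed helper returning log pi(a|s)
    actor_loss = -(log_prob * advantage).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Soft target update: theta' <- tau * theta + (1 - tau) * theta'.
    for p, p_t in zip(critic.parameters(), target_critic.parameters()):
        p_t.data.mul_(1 - tau).add_(tau * p.data)
```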
4.3. Time Complexity Analysis and Limitation Discussion
4.3.1. Time Complexity of GAT-A2C Model
4.3.2. Limitation Discussion
5. Experiment
5.1. Experimental Settings
5.2. Experiment Results
5.2.1. Training Loss and Effectiveness of the GAT-A2C Model
5.2.2. Performance Analysis of GAT-A2C at Different Densities
5.2.3. Performance Analysis Compared with Other Methods
5.2.4. The Effect of GAT in the Proposed Model
5.2.5. The Effect of Attention Head Number and Attention Layer Number in the Proposed Model
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lu, N.; Cheng, N.; Zhang, N.; Shen, X.; Mark, J. Connected Vehicles: Solutions and Challenges. IEEE Internet Things J. 2014, 1, 289–299.
- Wen, X.; Chen, J.; Hu, Z.; Lu, Z. A p-Opportunistic Channel Access Scheme for Interference Mitigation Between V2V and V2I Communications. IEEE Internet Things J. 2020, 7, 3706–3718.
- Fang, Y. Connected Vehicles Make Transportation Faster, Safer, Smarter, and Greener! IEEE Trans. Veh. Technol. 2015, 64, 5409–5410.
- Sehla, K.; Nguyen, T.M.T.; Pujolle, G.; Velloso, P.B. Resource Allocation Modes in C-V2X: From LTE-V2X to 5G-V2X. IEEE Internet Things J. 2022, 9, 8291–8314.
- Gonzalez-Martín, M.; Sepulcre, M.; Molina-Masegosa, R.; Gozalvez, J. Analytical Models of the Performance of C-V2X Mode 4 Vehicular Communications. IEEE Trans. Veh. Technol. 2019, 68, 1155–1166.
- Yuan, Y.; Zheng, G.; Wong, K.K.; Letaief, K.B. Meta-Reinforcement Learning Based Resource Allocation for Dynamic V2X Communications. IEEE Trans. Veh. Technol. 2021, 70, 8964–8977.
- Ji, B.; Dong, B.; Li, D.; Wang, Y.; Yang, L.; Tsimenidis, C.; Menon, V.G. Optimization of Resource Allocation for V2X Security Communication Based on Multi-Agent Reinforcement Learning. IEEE Trans. Veh. Technol. 2023, 74, 1849–1861.
- Yang, H.; Xie, X.; Kadoch, M. Intelligent Resource Management Based on Reinforcement Learning for Ultra-Reliable and Low-Latency IoV Communication Networks. IEEE Trans. Veh. Technol. 2019, 68, 4157–4169.
- Gyawali, S.; Qian, Y.; Hu, R.Q. Resource Allocation in Vehicular Communications Using Graph and Deep Reinforcement Learning. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Big Island, HI, USA, 9–13 December 2019; pp. 1–6.
- Li, R.; Zhao, Z.; Chen, X.; Palicot, J.; Zhang, H. TACT: A Transfer Actor-Critic Learning Framework for Energy Saving in Cellular Radio Access Networks. IEEE Trans. Wirel. Commun. 2014, 13, 2000–2011.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903.
- Ye, H.; Li, G.Y.; Juang, B.H.F. Deep Reinforcement Learning Based Resource Allocation for V2V Communications. IEEE Trans. Veh. Technol. 2019, 68, 3163–3173.
- Zhao, D.; Qin, H.; Song, B.; Zhang, Y.; Du, X.; Guizani, M. A Reinforcement Learning Method for Joint Mode Selection and Power Adaptation in the V2V Communication Network in 5G. IEEE Trans. Cogn. Commun. Netw. 2020, 6, 452–463.
- Nguyen, K.K.; Duong, T.Q.; Vien, N.A.; Le-Khac, N.A.; Nguyen, L.D. Distributed Deep Deterministic Policy Gradient for Power Allocation Control in D2D-Based V2V Communications. IEEE Access 2019, 7, 164533–164543.
- Miao, J.; Chai, X.; Song, X.; Song, T. A DDQN-Based Energy-Efficient Resource Allocation Scheme for Low-Latency V2V Communication. In Proceedings of the 2022 IEEE 5th International Electrical and Energy Conference (CIEEC), Nanjing, China, 27–29 May 2022; pp. 53–58.
- Gao, A.; Wang, Q.; Wang, Y.; Du, C.; Hu, Y.; Liang, W.; Ng, S.X. Attention Enhanced Multi-Agent Reinforcement Learning for Cooperative Spectrum Sensing in Cognitive Radio Networks. IEEE Trans. Veh. Technol. 2024, 73, 10464–10477.
- Chen, T.; Zhang, X.; You, M.; Zheng, G.; Lambotharan, S. A GNN-Based Supervised Learning Framework for Resource Allocation in Wireless IoT Networks. IEEE Internet Things J. 2022, 9, 1712–1724.
- Guo, J.; Yang, C. Learning Power Allocation for Multi-Cell-Multi-User Systems With Heterogeneous Graph Neural Networks. IEEE Trans. Wirel. Commun. 2022, 21, 884–897.
- He, Z.; Wang, L.; Ye, H.; Li, G.Y.; Juang, B.H.F. Resource Allocation Based on Graph Neural Networks in Vehicular Communications. In Proceedings of the GLOBECOM 2020—2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–5.
- Ji, M.; Wu, Q.; Fan, P.; Cheng, N.; Chen, W.; Wang, J.; Letaief, K.B. Graph Neural Networks and Deep Reinforcement Learning-Based Resource Allocation for V2X Communications. IEEE Internet Things J. 2025, 12, 3613–3628.
- Yuan, C.; Zhao, H.; Yan, W.; Hou, L. Resource Allocation with Multi-Level QoS for V2X Based on GNN and RL. In Proceedings of the 2023 International Conference on Information Processing and Network Provisioning (ICIPNP), Beijing, China, 26–27 October 2023; pp. 50–55.
- Zhang, M.; Chen, Y. Link Prediction Based on Graph Neural Networks. arXiv 2018, arXiv:1802.09691.
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907.
- Zhang, M.; Cui, Z.; Neumann, M.; Chen, Y. An End-to-End Deep Learning Architecture for Graph Classification. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015; Volume 37, pp. 448–456.
- Wu, Z.; Bartoletti, S.; Martinez, V.; Bazzi, A. Adaptive Repetition Strategies in IEEE 802.11bd V2X Networks. IEEE Trans. Veh. Technol. 2023, 72, 8262–8266.
- Todisco, V.; Bartoletti, S.; Campolo, C.; Molinaro, A.; Berthet, A.O.; Bazzi, A. Performance Analysis of Sidelink 5G-V2X Mode 2 Through an Open-Source Simulator. IEEE Access 2021, 9, 145648–145661.
- Li, Z.; Wang, Y.; Zhao, J. Reliability Evaluation of IEEE 802.11p Broadcast Ad Hoc Networks on the Highway. IEEE Trans. Veh. Technol. 2022, 71, 7428–7444.
- Bansal, G.; Kenney, J.B.; Weinfield, A. Cross-Validation of DSRC Radio Testbed and NS-2 Simulation Platform for Vehicular Safety Communications. In Proceedings of the 2011 IEEE Vehicular Technology Conference (VTC Fall), San Francisco, CA, USA, 5–8 September 2011; pp. 1–5.
- 3GPP. 3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Further advancements for E-UTRA physical layer aspects (Release 9). Technical Report (TR) 36.814. 2017. Version 9.2.0. Available online: https://www.3gpp.org/ftp//Specs/archive/36_series/36.814/36814-920.zip (accessed on 19 August 2025).
| Description | Specification | Description | Specification |
|---|---|---|---|
| Scenario | Intersection | Distance threshold for neighbor vehicles | 150 m |
| Number of lanes | | Weight coefficients | [0.3, 1.0] |
| Vehicle speed | km/h | GAT input feature dimension | 60 |
| Packet size | 1500 bytes | GAT output embedding dimension | 20 |
| Avg. V2V packet generation rate | 20 Hz | Number of GAT attention heads | 8 |
| Carrier frequency | 2 GHz | GAT dropout rate | 0.6 |
| Total number of RBs | 20 | State input dimension to actor–critic | 102 (82 base + 20 GAT) |
| Antenna gain of vehicle & BS | 3 dBi & 8 dBi | Replay memory capacity | 1 million |
| Antenna height of vehicle & BS | 1.5 m & 25 m | Replay batch size | 2000 |
| Noise figure of vehicle & BS | 9 dB & 5 dB | Learning rate | 0.01 (min , decay 0.96) |
| Noise power | −114 dBm | Discount factor | 0.5 |
| Maximum delay for V2V link | 100 ms | Soft target update rate | 0.01 |
| Transmission power levels | [23, 10, 5] dBm | Training steps | 10,000 |
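For convenience, the hyperparameters above can be collected into a single configuration object. The sketch below is illustrative only: the key names are ours, and entries whose values are elided in the table (number of lanes, vehicle speed range, minimum learning rate) are omitted rather than guessed.

```python
# Key experimental settings, mirrored from the table above.
GAT_A2C_CONFIG = {
    "neighbor_distance_m": 150.0,
    "packet_size_bytes": 1500,
    "v2v_packet_rate_hz": 20,
    "carrier_freq_ghz": 2.0,
    "num_resource_blocks": 20,
    "gat_input_dim": 60,
    "gat_output_dim": 20,
    "gat_attention_heads": 8,
    "gat_dropout": 0.6,
    "actor_critic_input_dim": 102,   # 82 base + 20 GAT embedding
    "replay_capacity": 1_000_000,
    "batch_size": 2000,
    "learning_rate": 0.01,           # decayed by factor 0.96; minimum value not given
    "discount_factor": 0.5,
    "soft_update_tau": 0.01,
    "max_v2v_delay_ms": 100,
    "tx_power_levels_dbm": [23, 10, 5],
    "training_steps": 10_000,
}
```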