Carbon-Neutral Cellular Network Operation Based on Deep Reinforcement Learning
Abstract
1. Introduction
1.1. Related Works
1.2. Our Contributions
1.3. Organization
2. System Model
2.1. Network Model
2.2. Power Consumption Model
3. Objective Function
4. Implementation of the DDPG Algorithm
4.1. Preliminaries
4.2. Problem Formulation
- State: Since the cellular network operates in real time, the proposed DDPG algorithm should run as fast as possible. To this end, we keep the state simple and effective (an illustrative implementation sketch of the full state-action-reward formulation is given after this list). The state of our DDPG algorithm is given by
- Action: As explained in Section 3, we adjust the SBS transmission power and on/off switching to achieve our goal. Therefore, the action is defined as follows:
- Reward: The reward is directly given by the objective function in Section 3. Thus, the reward is defined as
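To make the formulation above concrete, the following minimal sketch shows one possible realization of the state, action, and reward in code. The exact state components, the mapping from the actor output to SBS power and on/off decisions, and the weighting inside the reward are not reproduced from the paper; the per-SBS loads, the harvested-energy term, the on/off threshold, and the QoS weight `w` are illustrative assumptions, and only the SBS count and maximum SBS power are taken from the simulation parameters.

```python
import numpy as np

N_SBS = 6          # SBSs per MBS (from the simulation parameters)
P_MAX_SBS = 0.5    # maximum SBS transmission power in watts (from the simulation parameters)

def build_state(sbs_loads, harvested_energy):
    """State s_t: kept deliberately compact so the agent can act in real time.
    Assumed here to be the per-SBS traffic loads plus the available renewable energy."""
    return np.concatenate([np.asarray(sbs_loads, dtype=np.float32),
                           [np.float32(harvested_energy)]])

def apply_action(raw_action, on_threshold=0.05):
    """Action a_t: one continuous value per SBS from the actor network.
    Each value is mapped to a transmit power in [0, P_MAX_SBS]; powers below
    `on_threshold` (an assumed constant) switch the corresponding SBS off."""
    tx_power = np.clip(np.asarray(raw_action, dtype=np.float32), 0.0, 1.0) * P_MAX_SBS
    on_mask = tx_power >= on_threshold
    return tx_power * on_mask, on_mask

def reward(grid_power, qos_violation, w=1.0):
    """Reward r_t: follows the objective of Section 3 in spirit, i.e., penalize
    power drawn from the carbon-emitting grid and any QoS degradation.
    The linear combination and the weight `w` are illustrative assumptions."""
    return -(grid_power + w * qos_violation)
```

In the full DDPG loop of Section 4.3, the actor network would output `raw_action` directly, and the critic would be trained on (state, action, reward, next state) tuples drawn from the experience replay buffer.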
4.3. Operation of DDPG Algorithm for Carbon Neutrality
5. Simulation Results and Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
DDPG | Deep deterministic policy gradient
DPG | Deterministic policy gradient
UDN | Ultra-dense network
BS | Base station
MBS | Macro base station
SBS | Small base station
UE | User equipment
RES | Renewable energy source
DRL | Deep reinforcement learning
DQN | Deep Q network
SINR | Signal-to-interference-and-noise ratio
NN | Neural network
References
- Ge, X.; Tu, S.; Mao, G.; Wang, C.X.; Han, T. 5G ultra-dense cellular networks. IEEE Wirel. Commun. 2016, 23, 72–79.
- Malmodin, J.; Lundén, D. The energy and carbon footprint of the global ICT and E&M sectors 2010–2015. Sustainability 2018, 10, 3027.
- Roll out 5G without Increasing Energy Consumption. Available online: https://www.ericsson.com/en/about-us/sustainability-and-corporate-responsibility/environment/product-energy-performance (accessed on 13 May 2022).
- Moon, S.; Kim, H.; Yi, Y. BRUTE: Energy-efficient user association in cellular networks from population game perspective. IEEE Trans. Wirel. Commun. 2015, 15, 663–675.
- Lee, G.; Kim, H. Green small cell operation using belief propagation in wireless networks. In Proceedings of the 2014 IEEE Globecom Workshops (GC Wkshps), Austin, TX, USA, 8–12 December 2014; pp. 1266–1271.
- Jeong, J.; Kim, H. On Optimal Cell Flashing for Reducing Delay and Saving Energy in Wireless Networks. Energies 2016, 9, 768.
- Choi, Y.; Kim, H. Optimal scheduling of energy storage system for self-sustainable base station operation considering battery wear-out cost. Energies 2016, 9, 462.
- Liu, C.; Natarajan, B.; Xia, H. Small cell base station sleep strategies for energy efficiency. IEEE Trans. Veh. Technol. 2015, 65, 1652–1661.
- Oh, E.; Son, K. A unified base station switching framework considering both uplink and downlink traffic. IEEE Wirel. Commun. Lett. 2016, 6, 30–33.
- Son, K.; Kim, H.; Yi, Y.; Krishnamachari, B. Base station operation and user association mechanisms for energy-delay tradeoffs in green cellular networks. IEEE J. Sel. Areas Commun. 2011, 29, 1525–1536.
- Feng, M.; Mao, S.; Jiang, T. BOOST: Base station on-off switching strategy for energy efficient massive MIMO HetNets. In Proceedings of the IEEE INFOCOM 2016—The 35th Annual IEEE International Conference on Computer Communications, San Francisco, CA, USA, 10–15 April 2016; pp. 1–9.
- Peng, C.; Lee, S.B.; Lu, S.; Luo, H. GreenBSN: Enabling energy-proportional cellular base station networks. IEEE Trans. Mob. Comput. 2014, 13, 2537–2551.
- Celebi, H.; Yapıcı, Y.; Güvenç, I.; Schulzrinne, H. Load-based on/off scheduling for energy-efficient delay-tolerant 5G networks. IEEE Trans. Green Commun. Netw. 2019, 3, 955–970.
- Salem, F.E.; Altman, Z.; Gati, A.; Chahed, T.; Altman, E. Reinforcement learning approach for advanced sleep modes management in 5G networks. In Proceedings of the 2018 IEEE 88th Vehicular Technology Conference (VTC-Fall), Chicago, IL, USA, 27–30 August 2018; pp. 1–5.
- El-Amine, A.; Hassan, H.A.H.; Iturralde, M.; Nuaymi, L. Location-Aware sleep strategy for Energy-Delay tradeoffs in 5G with reinforcement learning. In Proceedings of the 2019 IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Istanbul, Turkey, 8–11 September 2019; pp. 1–6.
- Pujol-Roigl, J.S.; Wu, S.; Wang, Y.; Choi, M.; Park, I. Deep reinforcement learning for cell on/off energy saving on wireless networks. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–7.
- Ye, J.; Zhang, Y.J.A. DRAG: Deep reinforcement learning based base station activation in heterogeneous networks. IEEE Trans. Mob. Comput. 2019, 19, 2076–2087.
- Giannopoulos, A.; Spantideas, S.; Kapsalis, N.; Karkazis, P.; Trakadas, P. Deep reinforcement learning for energy-efficient multi-channel transmissions in 5G cognitive hetnets: Centralized, decentralized and transfer learning based solutions. IEEE Access 2021, 9, 129358–129374.
- Iqbal, A.; Tham, M.L.; Chang, Y.C. Double deep Q-network-based energy-efficient resource allocation in cloud radio access network. IEEE Access 2021, 9, 20440–20449.
- Kim, E.; Jung, B.C.; Park, C.Y.; Lee, H. Joint Optimization of Energy Efficiency and User Outage Using Multi-Agent Reinforcement Learning in Ultra-Dense Small Cell Networks. Electronics 2022, 11, 599.
- Kim, S.; Son, J.; Shim, B. Energy-Efficient Ultra-Dense Network Using LSTM-based Deep Neural Networks. IEEE Trans. Wirel. Commun. 2021, 20, 4702–4715.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
- Qiu, C.; Hu, Y.; Chen, Y.; Zeng, B. Deep deterministic policy gradient (DDPG)-based energy harvesting wireless communications. IEEE Internet Things J. 2019, 6, 8577–8588.
- Lu, Y.; Lu, H.; Cao, L.; Wu, F.; Zhu, D. Learning deterministic policy with target for power control in wireless networks. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 9–13 December 2018; pp. 1–7.
- Auer, G.; Giannini, V.; Desset, C.; Godor, I.; Skillermark, P.; Olsson, M.; Imran, M.A.; Sabella, D.; Gonzalez, M.J.; Blume, O.; et al. How much energy is needed to run a wireless network? IEEE Wirel. Commun. 2011, 18, 40–49.
- Silver, D.; Lever, G.; Heess, N.; Degris, T.; Wierstra, D.; Riedmiller, M. Deterministic policy gradient algorithms. In Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 387–395.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
- Yu, L.; Xie, W.; Xie, D.; Zou, Y.; Zhang, D.; Sun, Z.; Zhang, L.; Zhang, Y.; Jiang, T. Deep reinforcement learning for smart home energy management. IEEE Internet Things J. 2019, 7, 2751–2762.
Network Parameter | Value | Hyperparameter | Value
---|---|---|---
Simulation count | 50,000 | Learning rate | Actor: 0.0005 / Critic: 0.001
Time step | 1 s | Discount factor | 0.99
Carrier frequency | 2 GHz | Mini-batch size | 32
System bandwidth | 20 MHz | Size of experience replay buffer | 50,000
MBS deployment | 1-tier hexagonal | Soft update weight | 0.05
SBS deployment | 6 per MBS | Exploration standard deviation | 0.1
Maximum transmission power | MBS: 40 W / SBS: 0.5 W | |
UE deployment | 40 | |
Mobility | Uniform in [1, 10] m/s | |
Path loss exponent | 4 | |
Antenna pattern | Omnidirectional | |
Maintenance power (ON) | 6.8 W | |
Amplifier efficiency | 0.25 | |
Maintenance power (OFF) | 4.3 W | |
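For reproducibility, the snippet below sketches how the hyperparameters listed in the table could be wired into a standard DDPG setup in the sense of Lillicrap et al.; the use of PyTorch, the configuration keys, and the function names are assumptions, while the numeric values are taken directly from the table. The `soft_update` helper illustrates the Polyak averaging controlled by the soft update weight.

```python
import torch

# Hyperparameters from the table above; keys and structure are illustrative.
CONFIG = {
    "actor_lr": 5e-4,             # learning rate (actor)
    "critic_lr": 1e-3,            # learning rate (critic)
    "gamma": 0.99,                # discount factor
    "batch_size": 32,             # mini-batch size
    "replay_buffer_size": 50_000, # size of experience replay buffer
    "tau": 0.05,                  # soft update weight
    "exploration_std": 0.1,       # std. dev. of Gaussian exploration noise
}

def soft_update(target_net: torch.nn.Module, online_net: torch.nn.Module,
                tau: float = CONFIG["tau"]) -> None:
    """Polyak averaging of the target-network parameters, as used in DDPG:
    theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    with torch.no_grad():
        for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
            t_param.mul_(1.0 - tau).add_(tau * o_param)
```

After each mini-batch update of the online actor and critic, `soft_update` would be applied to the corresponding target networks, which is what the soft update weight of 0.05 controls.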
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).