Reinforcement Learning-Based Time-Slotted Protocol: A Reinforcement Learning Approach for Optimizing Long-Range Network Scalability
Abstract
1. Introduction
- First, the introduction of a lightweight hybrid reinforcement learning (RL) solution that addresses the limitations of both fully centralized and fully decentralized RL approaches, particularly in the context of LoRa networks. Fully centralized RL approaches, while effective at leveraging global information, impose a high computational burden on the gateway and are severely constrained by duty cycle limitations: in large-scale, high-density networks, the time required to update nodes in duty-cycled environments becomes impractically long. Fully decentralized RL approaches, though they bypass the duty cycle constraint, suffer from slow convergence because nodes lack full knowledge of network conditions; moreover, placing the entire computational load on energy-constrained sensor nodes can lead to significant resource exhaustion. To overcome these challenges, we propose a hybrid RL-based approach that (1) reduces the gateway's role to broadcasting the congestion level of each slot as a compact vector, rather than computing optimal parameters for every node, and (2) lets nodes use this global congestion vector locally to converge quickly toward conflict-free, mutually exclusive slot assignments (see the sketch after this list).
- Second, to achieve an optimal overall configuration while respecting duty cycle constraints, our slot allocation mechanism is built on top of a geographic distribution of spreading factors (SFs), transmission powers (TPs), and channels. Unlike recent hybrid solutions in the literature, which use RL to optimize SF, TP, and channel selection simultaneously and thereby place a high computational overhead on resource-limited nodes, our approach statically assigns these parameters based on geographical position, as proposed in [4]. This focuses RL solely on TDMA slot optimization, making the solution significantly more lightweight and scalable.
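To make the node-side mechanism concrete, here is a minimal sketch, assuming a single-frame state and illustrative names (`congestion`, `q_table`, `choose_slot`, and `update` are not identifiers from the paper); it shows how a node could combine the gateway's broadcast congestion vector with a local Q-table to converge on a slot:

```python
import random

NUM_SLOTS = 10        # T: number of TDMA slots per frame (assumed)
ALPHA = 0.1           # learning rate
GAMMA = 0.9           # discount factor
EPSILON = 0.1         # exploration probability

# One Q-value per slot: in this simplified sketch the node's state is
# the current frame, and an action is the choice of a transmission slot.
q_table = [0.0] * NUM_SLOTS

def choose_slot(congestion):
    """Pick a slot epsilon-greedily, biased away from congested slots.

    `congestion` is the compact vector broadcast by the gateway:
    congestion[t] is the observed load of slot t in the last frame.
    """
    if random.random() < EPSILON:
        return random.randrange(NUM_SLOTS)
    # Prefer the slot with the best learned value; break ties toward
    # the least congested slot reported by the gateway.
    return max(range(NUM_SLOTS),
               key=lambda t: (q_table[t], -congestion[t]))

def update(slot, congestion):
    """One Q-learning step after transmitting in `slot` (illustrative reward)."""
    reward = 1.0 if congestion[slot] == 0 else -congestion[slot]
    best_next = max(q_table)
    q_table[slot] += ALPHA * (reward + GAMMA * best_next - q_table[slot])
```

Because the gateway only ever broadcasts one compact vector per frame, the per-node message cost is independent of network size, which is the property the hybrid design targets.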
2. Related Work
2.1. Optimizing the Distribution of Transmission Parameters
2.2. Time Slot-Based Solutions
2.3. RL-Based Solutions
2.3.1. Centralized Approaches
2.3.2. Decentralized Approaches
2.3.3. Hybrid Approaches
3. Protocol Description
3.1. Setting the Environment
3.2. Learning Algorithm
The Q-value update follows the standard Q-learning rule:

$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$

where:

- $Q(s, a)$ is the current Q-value of the state–action pair.
- $\alpha$ is the learning rate, determining the extent to which newly acquired information overrides old information. A smaller learning rate leads to slower convergence but more stable learning.
- $r$ is the immediate reward obtained after taking action $a$ in state $s$.
- $\gamma$ is the discount factor, as described earlier.
- $\max_{a'} Q(s', a')$ is the maximum Q-value achievable from the next state over all possible actions $a'$.
- $s'$ is the state reached after taking action $a$ from state $s$.
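As a purely illustrative numerical walk-through of this rule (the values are arbitrary, not results from the paper): with $\alpha = 0.1$, $\gamma = 0.9$, $Q(s,a) = 0.5$, $r = 1$, and $\max_{a'} Q(s',a') = 0.8$, a single update yields

$Q(s,a) \leftarrow 0.5 + 0.1 \left[ 1 + 0.9 \times 0.8 - 0.5 \right] = 0.5 + 0.1 \times 1.22 = 0.622,$

so the estimate moves a fraction $\alpha$ of the way toward the reward-plus-discounted-future target.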
3.3. Collision Detection Algorithm
4. Performance Evaluation
4.1. Parameter Settings
4.2. RL-TS Evaluation Results
4.2.1. Q-Learning Algorithm Evaluation
4.2.2. Case 1: The Number of Available TDMA Slots Equals the Number of Nodes
4.2.3. Case 2: Constant Number of Available TDMA Slots
4.3. Comparison
4.3.1. Comparing with LoRa
4.3.2. Comparing with RL-Based Protocol by X. Huang et al.
The action-selection probability is computed as

$P(a \mid s) = \dfrac{R(s, a)}{\sum_{a' \in A} R(s, a')}$

where:

- $P(a \mid s)$: the probability of selecting action $a$ in state $s$.
- $R(s, a)$: the reward for taking action $a$ in state $s$.
- $\sum_{a' \in A} R(s, a')$: the total reward over all actions $a'$ in state $s$.
- $A$: the set of all actions available in state $s$.
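A minimal sketch of this probability-matching selection rule (Python; the function and variable names are illustrative, not taken from either paper's implementation):

```python
import random

def select_action(rewards):
    """Probability-matching selection: each action is chosen with
    probability proportional to its accumulated reward R(s, a).

    `rewards` maps each action (e.g., a slot index) to its accumulated
    reward in the current state; values are assumed non-negative.
    """
    total = sum(rewards.values())
    if total == 0:
        return random.choice(list(rewards))  # no signal yet: uniform choice
    return random.choices(list(rewards), weights=rewards.values())[0]

# Example: slot 2 has earned the most reward, so it is selected most often.
print(select_action({0: 1.0, 1: 0.5, 2: 3.5}))
```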
Algorithm 1: RL-TS PHC-enhanced.
4.3.3. Decentralized Multi-Agent Learning
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Qadir, Q.M.; Rashid, T.A.; Al-Salihi, N.K.; Ismael, B.; Kist, A.A.; Zhang, Z. Low power wide area networks: A survey of enabling technologies, applications and interoperability needs. IEEE Access 2018, 6, 77454–77473.
- Ayoub, W.; Samhat, A.E.; Nouvel, F.; Mroue, M.; Prévotet, J.C. Internet of mobile things: Overview of LoRaWAN, DASH7, and NB-IoT in LPWANs standards and supported mobility. IEEE Commun. Surv. Tutorials 2018, 21, 1561–1581.
- Mekki, K.; Bajic, E.; Chaxel, F.; Meyer, F. A comparative study of LPWAN technologies for large-scale IoT deployment. ICT Express 2019, 5, 1–7.
- Alahmadi, H.; Bouabdallah, F.; Al-Dubai, A.; Ghaleb, B. A Novel Autonomous Adaptive Frame Size for Time-Slotted LoRa MAC Protocol. IEEE Trans. Ind. Inform. 2024, 20, 12284–12293.
- Li, X.; Xu, J.; Li, R.; Jia, L.; You, J. Advancing Performance in LoRaWAN Networks: The Circular Region Grouped Bit-Slot LoRa MAC Protocol. Electronics 2024, 13, 621.
- Gkotsiopoulos, P.; Zorbas, D.; Douligeris, C. Performance determinants in LoRa networks: A literature review. IEEE Commun. Surv. Tutorials 2021, 23, 1721–1758.
- Zorbas, D.; Papadopoulos, G.Z.; Maille, P.; Montavont, N.; Douligeris, C. Improving LoRa network capacity using multiple spreading factor configurations. In Proceedings of the 2018 25th International Conference on Telecommunications (ICT), Saint-Malo, France, 26–28 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 516–520.
- Reynders, B.; Wang, Q.; Tuset-Peiro, P.; Vilajosana, X.; Pollin, S. Improving reliability and scalability of LoRaWANs through lightweight scheduling. IEEE Internet Things J. 2018, 5, 1830–1842.
- Gopal, S.R.; Prabhakar, V. A hybrid approach to enhance scalability, reliability and computational speed in LoRa networks. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 463–469.
- Jain, A.; Haque, M.A.; Saifullah, A.; Zhang, H. Burst-MAC: A MAC Protocol for Handling Burst Traffic in LoRa Network. In Proceedings of the 2024 IEEE Real-Time Systems Symposium (RTSS), York, UK, 10–13 December 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 148–160.
- Chen, C.; Lion, S.; Jansang, A.; Jaikaeo, C.; Phonphoem, A.; Tangtrongpairoj, W. Dynamic Slot Allocation Protocol for Multi-Channel LoRa Communication. In Proceedings of the 2023 20th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), Nakhon Phanom, Thailand, 9–12 May 2023; pp. 1–4.
- Zorbas, D.; Abdelfadeel, K.; Kotzanikolaou, P.; Pesch, D. TS-LoRa: Time-slotted LoRaWAN for the industrial Internet of Things. Comput. Commun. 2020, 153, 1–10.
- Alahmadi, H.; Bouabdallah, F.; Al-Dubai, A. A novel time-slotted LoRa MAC protocol for scalable IoT networks. Future Gener. Comput. Syst. 2022, 134, 287–302.
- Wang, R.; Song, T.; Ren, J.; Wang, X.; Xu, S.; Wang, S. D-LoRa: A Distributed Parameter Adaptation Scheme for LoRa Network. arXiv 2025, arXiv:2501.12589.
- Baimukhanov, B.; Gilazh, B.; Zorbas, D. Autonomous Lightweight Scheduling in LoRa-Based Networks Using Reinforcement Learning. In Proceedings of the 2024 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), Tbilisi, Georgia, 24–27 June 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 268–271.
- Zhong, H.; Ning, L.; Wang, J.; Suo, S.; Chen, L. Optimization of LoRa SF allocation based on deep reinforcement learning. Wirel. Commun. Mob. Comput. 2022, 2022, 1690667.
- Hamdi, R.; Baccour, E.; Erbad, A.; Qaraqe, M.; Hamdi, M. LoRa-RL: Deep reinforcement learning for resource management in hybrid energy LoRa wireless networks. IEEE Internet Things J. 2021, 9, 6458–6476.
- Sandoval, R.M.; Garcia-Sanchez, A.J.; Garcia-Haro, J. Optimizing and updating LoRa communication parameters: A machine learning approach. IEEE Trans. Netw. Serv. Manag. 2019, 16, 884–895.
- Onishi, T.; Li, A.; Kim, S.J.; Hasegawa, M. A reinforcement learning based collision avoidance mechanism to superposed LoRa signals in distributed massive IoT systems. IEICE Commun. Express 2021, 10, 289–294.
- Huang, X.; Jiang, J.; Yang, S.H.; Ding, Y. A reinforcement learning based medium access control method for LoRa networks. In Proceedings of the 2020 IEEE International Conference on Networking, Sensing and Control (ICNSC), Nanjing, China, 30 October–2 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
- Ivoghlian, A.; Salcic, Z.; Wang, K.I.K. Adaptive wireless network management with multi-agent reinforcement learning. Sensors 2022, 22, 1019.
- Zhao, G.; Lin, K.; Chapman, D.; Metje, N.; Hao, T. Optimizing Energy Efficiency of LoRaWAN-based Wireless Underground Sensor Networks. Internet Things 2023, 22, 100776.
- Park, G.; Lee, W.; Joe, I. Network resource optimization with reinforcement learning for low power wide area networks. EURASIP J. Wirel. Commun. Netw. 2020, 2020, 176.
- Semtech Corporation. LoRa® and LoRaWAN®. 2024. Available online: https://www.semtech.com/uploads/technology/LoRa/lora-and-lorawan.pdf (accessed on 8 April 2025).
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
| Ref. | Methodology | Advantages | Limitations | Comparison with RL-TS |
|---|---|---|---|---|
| [12] | Deterministic time slot allocation | Predictable latency, no collisions | Less flexibility for variable traffic loads | RL-based time slot allocation |
| [13] | Divides the network into cells, sub-cells, and sectors for sector-based time slot allocation | Reduces collisions, improves PDR and throughput | High coordination and scheduling overhead | Nodes autonomously select their SFs and channels based on geographical coordination |
| [16,17,18] | RL at the gateway for resource allocation | Improves spectral efficiency and throughput | High communication overhead | RL at the nodes for slot allocation |
| [19,20,21] | Collaborative learning at end devices | Reduces collisions, improves robustness | Slow convergence | Hybrid approach for faster convergence |
| [22,23] | Hybrid RL for SF, TP, and channel optimization | Reduces collisions, improves scalability | High computational overhead on energy-constrained nodes | Geography-based allocation of SFs, TPs, and channels; hybrid RL for slot allocation |
| SF | Range |
|---|---|
| SF 12 | 12 km |
| SF 11 | 10 km |
| SF 10 | 8 km |
| SF 9 | 6 km |
| SF 8 | 4 km |
| SF 7 | 2 km |
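Under the geographic assignment adopted from [4], a node can derive its SF directly from its distance to the gateway using the ranges above. A minimal sketch, assuming the table's thresholds (the function name is hypothetical):

```python
def sf_for_distance(d_km):
    """Return the smallest SF whose range from the table covers d_km."""
    # (max range in km, SF) pairs from the table, nearest ring first
    ranges = [(2, 7), (4, 8), (6, 9), (8, 10), (10, 11), (12, 12)]
    for max_km, sf in ranges:
        if d_km <= max_km:
            return sf
    raise ValueError("distance exceeds SF 12 coverage (12 km)")
```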
| Step | Description |
|---|---|
| 1. Input | Node $n$, current time slot $t$, maxCollision[T]. |
| 2. Output | The time slot assigned to node $n$. |
| 3. Check collisions at $t$ | If the collision count of slot $t$ is below maxCollision[$t$], assign slot $t$ to node $n$. |
| 4. Otherwise | Iterate over all time slots $1$ to $T$ in search of a slot whose collision count is below its threshold. |
| 5. End loop | |
| 6. End algorithm | |
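A sketch of how steps 3 and 4 could look in code; this is one reading of the table above, with illustrative names (`collisions`, `max_collision`, `reassign_slot`), not the authors' implementation:

```python
def reassign_slot(current, collisions, max_collision):
    """Keep the current slot if its collision count is acceptable;
    otherwise scan all T slots for the first acceptable one.

    collisions[t]    -- observed collision count for slot t
    max_collision[t] -- tolerated collision threshold for slot t
    """
    if collisions[current] < max_collision[current]:
        return current                        # step 3: slot still usable
    for t in range(len(collisions)):          # step 4: scan slots 1..T
        if collisions[t] < max_collision[t]:
            return t
    # No slot is under its threshold: fall back to the least congested one.
    return min(range(len(collisions)), key=collisions.__getitem__)
```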
| Parameter | Value |
|---|---|
| Number of nodes in a given cell–sector intersection | 20, 50, 80, 110, 140, 170, 200 |
| SF | 9 |
| SF bit rate (bps) | 1757 |
| Packet size (bits) | 200 |
| Packet duration (ms) | 113 |
| Channel | 2 |
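As a quick consistency check, the listed duration follows directly from the packet size and the SF9 bit rate (ignoring preamble and header overhead):

$T_{\text{packet}} = \frac{L}{R_b} = \frac{200\ \text{bits}}{1757\ \text{bps}} \approx 113.8\ \text{ms} \approx 113\ \text{ms}$

The 1757 bps figure itself is consistent with SF9 at a 125 kHz bandwidth and a 4/5 coding rate, since $R_b = \mathrm{SF} \cdot \frac{BW}{2^{\mathrm{SF}}} \cdot \mathrm{CR} = 9 \cdot \frac{125000}{512} \cdot 0.8 \approx 1758$ bps; the bandwidth and coding rate are an assumption, as they are not listed in the table.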