An Efficient Framework for Peer Selection in Dynamic P2P Network Using Q Learning with Fuzzy Linear Programming
Abstract
1. Introduction
- Introduce a Q-learning-based peer selection framework that adapts to real-time network changes, handles frequent peer arrivals and departures efficiently, ensures stable and optimized peer selection, and improves network reliability and performance.
- Integrate Q learning with fuzzy logic-based constraints to manage uncertainties in peer attributes like bandwidth, availability, and processing power, with the aim of providing a multi-objective optimization strategy, balancing throughput, latency, and reliability.
- Design a scalable and adaptive framework capable of handling large-scale P2P networks efficiently and demonstrate significant improvements in network performance, reducing latency and enhancing data transfer rates.
- Apply the system to file-sharing networks and decentralized cloud storage, ensuring scalable and efficient peer communication.
2. Related Work
2.1. Traditional Peer Selection Techniques
2.2. Heuristic-Based Peer Selection Methods
3. Proposed System
3.1. Objective Function and Constraints
3.2. MDP Framework
3.2.1. State Representation in P2P Networks
3.2.2. Transition Probability Matrix (TPM)
Algorithm 1: State Representation and Transition Probability Matrix Analysis in a P2P Network
Input: peer attributes, content storage and demand matrices
Output: expected state time, network performance metrics
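
The quantities named in Algorithm 1 can be illustrated on a small example. The sketch below is not the authors' procedure: it assumes a discrete-time Markov chain, uses a hypothetical 3-state matrix `P`, and approximates the steady-state distribution π by power iteration. The expected time in a state is taken as the mean geometric holding time 1/(1 − P_ii), which holds under the same discrete-time assumption.

```python
def steady_state(P, iters=1000, tol=1e-12):
    """Approximate pi with pi = pi * P by repeated multiplication (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def expected_sojourn(P):
    """Mean number of consecutive steps spent in each state: 1 / (1 - P[i][i])."""
    return [1.0 / (1.0 - P[i][i]) for i in range(len(P))]

# Hypothetical chain (illustrative values, not taken from the article).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = steady_state(P)
sojourn = expected_sojourn(P)
```

Each row of `P` must sum to 1, and the chain must be irreducible and aperiodic for the iteration to converge to a unique π.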
3.3. Fuzzy Linear Programming (FLP) for Peer Selection
Algorithm 2: Fuzzy Linear Programming for Peer Selection
Input: Network parameters: download rate, latency, resource allocation
Decision variables: peer selection variables
Output: Optimal peer selection strategy
Step 1: Define the optimization problem: maximize download rate, minimize latency, and optimize resource allocation
Step 2: Define the decision variable, which equals 1 if peer j is selected and 0 otherwise
Step 3: Formulate the objective function
Step 6: Aggregate the membership functions using criteria
Step 7: Defuzzification
Step 8: Solve the FLP problem using an optimization solver
Step 9: Return transition probability matrix P, steady-state distribution π, expected time in each state, network performance metrics, optimized peer selection strategy
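
The fuzzy selection idea can be sketched on a handful of candidate peers. This is a simplified stand-in, not the article's formulation: it assumes linear membership functions, aggregates them with the min operator (a max-min approach), and replaces the optimization solver with plain enumeration, which is exact when a single peer is chosen from a finite set. The bounds 1 to 10 Mbps and 10 to 100 ms come from the fuzzy parameter table; the peers and their attribute values are hypothetical.

```python
def rising(x, lo, hi):
    """Linear membership: 0 at or below lo, 1 at or above hi (higher is better)."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def falling(x, lo, hi):
    """Linear membership: 1 at or below lo, 0 at or above hi (lower is better)."""
    return 1.0 - rising(x, lo, hi)

def satisfaction(peer):
    """Overall satisfaction is the weakest of the individual memberships."""
    mu_rate = rising(peer["rate"], 1.0, 10.0)          # download rate in Mbps
    mu_latency = falling(peer["latency"], 10.0, 100.0)  # latency in ms
    return min(mu_rate, mu_latency)

# Hypothetical candidates (illustrative values only).
peers = {
    "p1": {"rate": 8.0, "latency": 40.0},
    "p2": {"rate": 10.0, "latency": 91.0},
    "p3": {"rate": 4.0, "latency": 19.0},
}
scores = {name: satisfaction(attrs) for name, attrs in peers.items()}
best = max(scores, key=scores.get)
```

The min operator favors balanced peers: `p2` has the highest rate but is penalized by its latency, so the balanced `p1` is preferred.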
3.4. Learning for Peer Selection Optimization
Algorithm 3: Q Learning for Peer Selection Optimization
Define state S, action A, and reward R.
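
A minimal sketch of tabular Q learning over states, actions, and rewards, using the standard update rule with the learning rate 0.1 and discount factor 0.9 from the parameter table. The state and action labels and the helper names `q_update` and `epsilon_greedy` are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor from the parameter table

def q_update(Q, s, a, r, s_next, actions):
    """Standard Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
    return Q[(s, a)]

def epsilon_greedy(Q, s, actions, eps, rng):
    """Explore with probability eps, otherwise pick the highest-valued action."""
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

actions = ["A1", "A2"]           # hypothetical action labels
Q = defaultdict(float)           # initial Q-value of 0 for every pair
first = q_update(Q, "S1", "A1", 100, "S2", actions)   # reward for a successful download
second = q_update(Q, "S1", "A1", 100, "S2", actions)
greedy = epsilon_greedy(Q, "S1", actions, eps=0.0, rng=random.Random(0))
```

With a zero-initialized table, repeated successful downloads raise Q("S1", "A1") step by step toward the reward, and the greedy choice follows it.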
3.5. Optimization of Hyper Parameters
3.6. Integration of Fuzzy Linear Programming (FLP) and Q Learning
Algorithm 4: Integration of FLP and Q Learning
The integration of FLP and Q learning combines the advantages of handling uncertainty with fuzzy logic and the learning capability of reinforcement learning for optimized peer selection in P2P networks.
Inputs include the discount factor γ.
Output: Optimized peer selection set for P2P networks.
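
One plausible way to couple the two components is sketched here. It is an assumption, not the integration described in the article: fuzzy satisfaction scores act as an admissibility filter, and the learned Q-values rank the peers that pass it. The threshold, scores, and Q-values are hypothetical.

```python
def admissible(scores, threshold):
    """Peers whose fuzzy satisfaction meets the threshold."""
    return [p for p, mu in scores.items() if mu >= threshold]

def select_peer(q_values, scores, threshold=0.5):
    """Pick the highest-valued admissible peer; fall back to all peers if none pass."""
    candidates = admissible(scores, threshold) or list(scores)
    return max(candidates, key=lambda p: q_values.get(p, 0.0))

# Hypothetical fuzzy scores and learned values (illustrative only).
scores = {"p1": 0.67, "p2": 0.10, "p3": 0.55}
q_values = {"p1": 4.0, "p2": 9.0, "p3": 6.5}

chosen = select_peer(q_values, scores)                   # p2 is filtered out
fallback = select_peer(q_values, scores, threshold=0.9)  # nobody passes the filter
```

The filter keeps the learner from exploiting a peer such as `p2` that has a high learned value but poor current attributes.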
4. Performance Evaluation
4.1. Parameters
4.2. Existing Systems
4.3. Simulation
4.4. Dataset Structure
4.4.1. State Space
4.4.2. Action Space
4.4.3. Rewards
4.4.4. Transition Probability
4.5. Results and Discussion
4.6. Computational Complexity and Scalability Analysis
5. P2P Optimization in Sensor Networks and IoT
5.1. Optimized Data Dissemination in IoT
5.2. Adaptive Node Selection in Sensor Networks
5.3. Dynamic Resource Allocation in Heterogeneous Networks
5.4. Implementation of Proposed Work in SANs
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Category | Techniques | Advantages | Limitations |
---|---|---|---|
Traditional Peer Selection | - Random Selection - Round-Robin - Latency-Based Selection - Proximity-Based Selection | - Simple implementation - Low computational cost | - Inefficient for dynamic networks - Cannot adapt to changing network conditions - High churn rate issues |
Heuristic-Based Peer Selection | - Game-Theoretic Models - Graph-Based Selection (MST, Clustering) - Multi-Criteria Decision Making (AHP, TOPSIS) | - More efficient than traditional methods - Optimized for specific scenarios - Reduces latency and improves connectivity | - Requires manual parameter tuning - Less adaptable to real-time network fluctuations - Limited scalability |
AI-Driven Peer Selection | - Supervised Learning (Prediction Models) - Reinforcement Learning (Q-Learning, DQN) - Fuzzy Logic-Based Selection (FLP) - Hybrid AI Models (Q-Learning + FLP) | - Self-learning and adaptive - Handles uncertainties and real-time changes - Optimized peer selection strategies - Scalable and efficient for large networks | - Higher computational requirements - Requires sufficient training data - Complexity in implementation |
Symbol | Definition |
---|---|
Q-Learning Parameters (Reinforcement Learning for Peer Selection) | |
s | Current state of the network (peer selection scenario) |
s′ | Next state after an action is taken |
S | Set of all possible states |
a | Action taken (selecting a peer) |
A | Set of all possible actions (peer selection choices) |
ai | Action selecting peer i |
Q(s,a) | Q-value, representing the expected cumulative reward for taking action a in state s |
r | Immediate reward based on peer selection quality |
α | Learning rate in Q learning |
γ | Discount factor for future rewards in Q learning |
max_{a′} Q(s′,a′) | Maximum expected Q-value for the next state |
π(s) | Policy function that determines the best action for state s |
R(s,a) | Reward function for selecting action a in state s |
Pij | Probability of transitioning from state i to state j |
Peer Attributes and Selection Criteria (Fuzzy Logic Components) | |
N | Total number of available peers |
Pi | Peer i in the network |
Bi | Bandwidth of peer i |
Li | Latency of peer i |
Ti | Trust score of peer i |
Ai | Availability of peer i (1 if available, 0 otherwise) |
Ei | Energy consumption of peer i (if applicable) |
Ci | Computational power of peer i (if applicable) |
Ri | Peer reputation score (aggregated trust score) |
Fuzzy Membership and Normalization Functions (Handling Uncertainty in Attributes) | |
μi | Fuzzy membership function representing preference for peer i |
w1, w2, w3, w4 | Weight coefficients for different peer attributes (sum to 1) |
Bmin | Minimum required bandwidth |
Bmax | Maximum bandwidth available |
Lmin | Minimum latency observed |
Lmax | Maximum allowable latency |
Tmax | Maximum possible trust score |
Tthreshold | Minimum required trust score for selection |
Amin | Minimum availability requirement (usually 1) |
eb, el, eT | Tolerance levels for fuzzy constraints |
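
For reference, the symbols above fit together as follows. The first two lines are the standard Q-learning update and greedy policy. The third line is an assumption, not a formula confirmed by the source: one common way to combine min-max normalized attributes into a single preference score using the weights w1 to w4.

```latex
\begin{align}
Q(s,a) &\leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right] \\
\pi(s) &= \arg\max_{a \in A} Q(s,a) \\
% Assumed weighted aggregation; the article's exact membership functions are not given here.
\mu_i &= w_1 \frac{B_i - B_{\min}}{B_{\max} - B_{\min}}
       + w_2 \frac{L_{\max} - L_i}{L_{\max} - L_{\min}}
       + w_3 \frac{T_i}{T_{\max}}
       + w_4 A_i,
\qquad w_1 + w_2 + w_3 + w_4 = 1
\end{align}
```

Under this assumed form, μi lies in [0, 1] whenever each attribute stays within its stated bounds.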
Parameter | Value |
---|---|
Simulation Duration | 100 s |
Number of Peers | 100 to 600 |
Network Topology | Erdős-Rényi graph |
Content Repository Size | 10 GB |
Bandwidth | 100 Mbps |
Peer Upload Capacity | 10 Mbps |
Peer Download Capacity | 20 Mbps |
Max/Min Arrival Rate | 50/10 peers per minute |
Max/Min Departure Rate | 30/5 peers per minute |
Traffic Model | Constant Bit Rate (CBR) |
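
An Erdős-Rényi overlay of the kind listed above can be generated with the standard library alone. The sketch uses the G(n, p) model, where each pair of peers is linked independently with probability p. The value p = 0.05 and the seed are assumptions, since the table does not state an edge probability.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Undirected G(n, p) graph as an adjacency dict of neighbor sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:  # link each unordered pair with probability p
            adj[u].add(v)
            adj[v].add(u)
    return adj

# 100 peers is the lower end of the simulated range; p is a hypothetical choice.
overlay = erdos_renyi(100, 0.05, seed=1)
edge_count = sum(len(nbrs) for nbrs in overlay.values()) // 2
```

The expected number of edges is p · n(n − 1)/2, so larger peer counts need a smaller p to keep the average degree fixed.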
Parameter | Value |
---|---|
Learning Rate (α) | 0.1 |
Discount Factor (γ) | 0.9 |
Exploration Rate (ε) | 0.2 |
Exploration Decay Rate | 0.99 |
Initial Q-Value | 0 |
Number of Episodes | 1000 |
Maximum Steps per Episode | 100 |
Reward for Successful Download | 100 |
Penalty for Failed Download | −10 |
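
The exploration settings above imply a schedule that can be worked out directly. The sketch assumes the decay rate is applied multiplicatively once per episode, which the table does not state, so the exploration rate after k episodes is 0.2 · 0.99^k.

```python
EPS0, DECAY = 0.2, 0.99  # exploration rate and decay rate from the table

def epsilon_at(episode):
    """Exploration rate after `episode` multiplicative decays (assumed per episode)."""
    return EPS0 * DECAY ** episode

def first_episode_below(threshold):
    """Smallest episode index at which the exploration rate drops below threshold."""
    k = 0
    while epsilon_at(k) >= threshold:
        k += 1
    return k

eps_end = epsilon_at(1000)           # rate at the end of the 1000 training episodes
k_small = first_episode_below(0.01)  # when exploration falls under 1%
```

Under this assumption exploration falls below 1% within the first few hundred episodes, so most of the 1000 episodes run close to greedily.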
Parameter | Value |
---|---|
Max Download Speed | 10 Mbps |
Min Download Speed | 1 Mbps |
Max Reliability | 0.9 |
Min Reliability | 0.5 |
Max Latency | 100 ms |
Min Latency | 10 ms |
Max Completion Rate | 95% |
Min Completion Rate | 80% |
Membership Functions | Triangular functions |
Weights | Equal |
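
The table specifies triangular membership functions and equal weights without giving the breakpoints. The sketch below shows the general shape on the latency range of 10 to 100 ms. Placing the peak at the midpoint is an assumption, and `aggregate_equal` is a hypothetical helper that averages memberships under equal weights.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def aggregate_equal(memberships):
    """Equal-weight aggregation: the arithmetic mean of the memberships."""
    return sum(memberships) / len(memberships)

LAT_MIN, LAT_MAX = 10.0, 100.0        # latency bounds from the table (ms)
LAT_PEAK = (LAT_MIN + LAT_MAX) / 2    # assumed peak position

mu_peak = triangular(55.0, LAT_MIN, LAT_PEAK, LAT_MAX)
mu_half = triangular(32.5, LAT_MIN, LAT_PEAK, LAT_MAX)
mu_edge = triangular(100.0, LAT_MIN, LAT_PEAK, LAT_MAX)
combined = aggregate_equal([mu_peak, mu_half, mu_edge])
```

A membership that peaks mid-range describes a "medium latency" set. A preference for low latency would instead put the peak at the lower bound.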
State ID | Active Peers | Network Load (%) | Bandwidth Availability (Mbps) | Peer Trust Level |
---|---|---|---|---|
S1 | 100 | 50 | 100 | High |
S2 | 200 | 60 | 80 | Medium |
S3 | 300 | 40 | 120 | Low |
S4 | 400 | 70 | 90 | High |
S5 | 500 | 55 | 110 | Medium |
S6 | 600 | 65 | 95 | High |
Action ID | Action Description |
---|---|
A1 | Select Peer Based on Bandwidth |
A2 | Select Peer Based on Trust Level |
A3 | Select Nearest Peer |
A4 | Select Peer with Least Load |
A5 | Random Peer Selection |
State ID | Action ID | Reward (Q-Value) |
---|---|---|
S1 | A1 | 10 |
S1 | A2 | 7 |
S1 | A3 | 5 |
S1 | A4 | 8 |
S1 | A5 | 3 |
S2 | A1 | 6 |
S2 | A2 | 9 |
S2 | A3 | 4 |
S2 | A4 | 7 |
S2 | A5 | 2 |
S3 | A1 | 8 |
S3 | A2 | 6 |
S3 | A3 | 7 |
S3 | A4 | 5 |
S3 | A5 | 4 |
S4 | A1 | 9 |
S4 | A2 | 8 |
S4 | A3 | 6 |
S4 | A4 | 7 |
S4 | A5 | 3 |
S5 | A1 | 10 |
S5 | A2 | 9 |
S5 | A3 | 8 |
S5 | A4 | 7 |
S5 | A5 | 5 |
S6 | A1 | 12 |
S6 | A2 | 10 |
S6 | A3 | 9 |
S6 | A4 | 8 |
S6 | A5 | 6 |
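
Reading the values above as a Q-table, the greedy policy π(s) is the action with the largest value in each state. The sketch copies the tabulated values into a dictionary and extracts that policy. Treating the column as learned Q-values rather than immediate rewards is an interpretation of the column header.

```python
# Values copied from the state/action table above.
q_table = {
    "S1": {"A1": 10, "A2": 7, "A3": 5, "A4": 8, "A5": 3},
    "S2": {"A1": 6, "A2": 9, "A3": 4, "A4": 7, "A5": 2},
    "S3": {"A1": 8, "A2": 6, "A3": 7, "A4": 5, "A5": 4},
    "S4": {"A1": 9, "A2": 8, "A3": 6, "A4": 7, "A5": 3},
    "S5": {"A1": 10, "A2": 9, "A3": 8, "A4": 7, "A5": 5},
    "S6": {"A1": 12, "A2": 10, "A3": 9, "A4": 8, "A5": 6},
}

# Greedy policy: for each state, the action with the highest tabulated value.
policy = {state: max(values, key=values.get) for state, values in q_table.items()}
```

Bandwidth-based selection (A1) is preferred in every state except S2, where trust-based selection (A2) has the highest value, and random selection (A5) is ranked last throughout.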
Current State | Action | Next State | Probability |
---|---|---|---|
S1 | A1 | S2 | 0.4 |
S1 | A1 | S3 | 0.6 |
S2 | A2 | S4 | 0.7 |
S2 | A2 | S5 | 0.3 |
S3 | A3 | S1 | 0.5 |
S3 | A3 | S2 | 0.5 |
S4 | A4 | S3 | 0.8 |
S4 | A4 | S5 | 0.2 |
S5 | A5 | S1 | 0.6 |
S5 | A5 | S4 | 0.4 |
S6 | A1 | S2 | 0.5 |
S6 | A1 | S4 | 0.5 |
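
The transition entries above can be used directly as a simulator. The sketch checks that the probabilities for each listed state-action pair sum to 1 and then samples next states. The helper names and the seed are illustrative choices.

```python
import random

# Entries copied from the transition table above: (state, action) -> {next_state: probability}.
transitions = {
    ("S1", "A1"): {"S2": 0.4, "S3": 0.6},
    ("S2", "A2"): {"S4": 0.7, "S5": 0.3},
    ("S3", "A3"): {"S1": 0.5, "S2": 0.5},
    ("S4", "A4"): {"S3": 0.8, "S5": 0.2},
    ("S5", "A5"): {"S1": 0.6, "S4": 0.4},
    ("S6", "A1"): {"S2": 0.5, "S4": 0.5},
}

def check_rows(table, tol=1e-9):
    """True if every listed (state, action) distribution sums to 1."""
    return all(abs(sum(dist.values()) - 1.0) <= tol for dist in table.values())

def sample_next(table, state, action, rng):
    """Draw a next state according to the tabulated probabilities."""
    dist = table[(state, action)]
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

rng = random.Random(42)
rows_ok = check_rows(transitions)
draws = [sample_next(transitions, "S1", "A1", rng) for _ in range(1000)]
share_s3 = draws.count("S3") / len(draws)
```

Only one action per state is listed in the table, so a full simulator would need transition entries for the remaining state-action pairs.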
Criteria | Traditional Methods | Proposed Method | Improvement (%) |
---|---|---|---|
Handling Uncertainty | Fixed thresholds, high sensitivity to fluctuations | Fuzzy constraints provide smooth decision making | 30% Lower selection variability |
Resource Utilization | Load balancing inefficient, often leads to bottlenecks | Optimized allocation using learned policies | +40% Better load distribution |
Convergence Speed | Slow adaptation (Avg: 5000 iterations) | Faster convergence (Avg: 2000 iterations) | 60% Reduction in convergence time |
Success Rate in Peer Connections | 75% (High failure under churn) | 92% (Stable connections) | +22% Higher success rate |
Throughput (Mbps) | 25 Mbps | 45 Mbps | +21% Higher throughput |
Network Latency (ms) | 150 ms | 90 ms | 40% Lower latency |
Stability in High Churn | Unstable, frequent disconnections | Robust, maintains connections efficiently | 30% Higher stability |
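
Where both method columns are numeric, the improvement can be recomputed as the percentage change relative to the traditional baseline. The sketch assumes that definition, since the table does not state which one it uses. Under it, the convergence, success rate, and latency rows reproduce the listed figures, while the throughput row (25 to 45 Mbps) evaluates to +80% rather than the listed +21%, so that entry presumably rests on a different basis.

```python
def relative_change(baseline, proposed):
    """Percentage change of `proposed` relative to `baseline`."""
    return 100.0 * (proposed - baseline) / baseline

# (traditional, proposed) pairs copied from the numeric rows of the table.
rows = {
    "convergence iterations": (5000, 2000),
    "connection success rate (%)": (75, 92),
    "throughput (Mbps)": (25, 45),
    "latency (ms)": (150, 90),
}
changes = {name: round(relative_change(b, p), 1) for name, (b, p) in rows.items()}
```

Negative values indicate a reduction, which is the desired direction for convergence time and latency.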
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Anandaraj, M.; Albalawi, T.; Alkhatib, M. An Efficient Framework for Peer Selection in Dynamic P2P Network Using Q Learning with Fuzzy Linear Programming. J. Sens. Actuator Netw. 2025, 14, 38. https://doi.org/10.3390/jsan14020038