Cloud–Edge Collaborative Model Adaptation Based on Deep Q-Network and Transfer Feature Extraction
Abstract
Featured Application
Abstract
1. Background
2. Preliminaries
2.1. Definition
2.2. Problem Statement
3. Model Framework
3.1. Reinforcement Learning Decision and Strategy Mechanism
3.1.1. State Vector Construction
3.1.2. Reinforcement Learning Strategy
3.1.3. Reward Design and Decision-Making Mechanism
3.2. Local Transfer Mechanism of Edge Model
3.2.1. Local Classification Head Feature Transfer
3.2.2. Patch Fusion and Model Update
3.2.3. Edge Inference and State Update
3.3. The Algorithm Flowchart and Its Corresponding Pseudocode
Algorithm 1. Adaptive cloud-edge collaborative model optimization algorithm.

Initialize DQN agent; load initial detection results from result.xlsx
For each class c in the category set do:
    Step 1: Construct the state vector for reinforcement learning
        Compute edge-side detection accuracy acc_edge_c
        Compute cloud-side fusion accuracy acc_cloud_c
        Compute the accuracy gap Δc = acc_cloud_c − acc_edge_c
        Form the state vector Sc = {acc_edge_c, acc_cloud_c, Δc}
    Step 2: Query the DQN agent: action_c = DQN.predict(Sc)
        If action_c == 1:
            Load class-specific training samples (labeled XML)
            Freeze all Faster R-CNN parameters except the classification head for class c
            Compute the classification loss Lc and apply gradient updates only to
                head.score.weight[c] and head.score.bias[c]
            Save the updated parameters as a class patch (.pth file)
            Merge the patch into the base edge model
            Step 3: Perform edge inference and update the state
                Run inference on the multi-view (4-angle) satellite images
                Update result.xlsx with the new edge-side predictions
                Recalculate acc_edge_c
        Else:
            Retain the previous model state (no parameter update)
    Step 4: Reinforcement learning update
        Compute the reward r_c = |acc_cloud_c − acc_edge_c|
        Store the experience tuple (Sc, action_c, r_c, S_{c+1}) in the replay buffer
        Update the DQN by minimizing the Bellman loss L_Q on sampled experiences
    Step 5: Check the termination condition
        If |acc_cloud_c − acc_edge_c| < threshold:
            Mark class c as "converged" and skip further transfer for this class
        Else:
            Continue iterative transfer learning for class c
End For
Output: the updated edge model and the final result.xlsx with optimized edge-side detection results
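As a minimal illustration, the per-class control loop of Algorithm 1 can be sketched in Python. The `agent` interface, the `patch_fn` callback, and the convergence `THRESHOLD` are hypothetical placeholders (the algorithm references a threshold without fixing its value here), and accuracies are read from plain dicts rather than result.xlsx.

```python
from collections import deque

THRESHOLD = 0.05  # hypothetical convergence threshold; not specified in the text


def run_class_adaptation(categories, acc_edge, acc_cloud, agent, patch_fn):
    """Sketch of the per-class loop from Algorithm 1.

    acc_edge / acc_cloud: dicts mapping class name -> accuracy in [0, 1].
    agent: placeholder with predict(state) -> action and update(buffer) methods.
    patch_fn: placeholder that fine-tunes the class-c scoring head, merges the
              patch, re-runs edge inference, and returns the new edge accuracy.
    """
    replay = deque(maxlen=10_000)  # replay buffer size from the DQN settings
    converged = set()
    for c in categories:
        if c in converged:
            continue
        # Step 1: state vector S_c = (acc_edge, acc_cloud, gap)
        state = (acc_edge[c], acc_cloud[c], acc_cloud[c] - acc_edge[c])
        # Step 2: the DQN decides whether to transfer (1) or skip (0)
        action = agent.predict(state)
        if action == 1:
            # Step 3: patch head.score.weight[c] / bias[c], re-infer, update accuracy
            acc_edge[c] = patch_fn(c)
        next_state = (acc_edge[c], acc_cloud[c], acc_cloud[c] - acc_edge[c])
        # Step 4: reward r_c = |acc_cloud_c - acc_edge_c|, as written in Algorithm 1
        reward = abs(acc_cloud[c] - acc_edge[c])
        replay.append((state, action, reward, next_state))
        agent.update(replay)
        # Step 5: stop transferring once cloud and edge agree closely
        if abs(acc_cloud[c] - acc_edge[c]) < THRESHOLD:
            converged.add(c)
    return acc_edge
```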
4. Experiment
4.1. Dataset Description
4.2. Experimental Environment
4.3. Consistency Verification of Cloud-Edge Collaboration Implemented Through Reinforcement Learning
4.4. Ablation and Baseline Comparison Experiments
4.4.1. Analysis of the Impact on Performance After Removing the Reinforcement Learning Mechanism
4.4.2. Comparison of Different Feature Migration Mechanisms
4.4.3. Comparison Between the Proposed Migration Mechanism and Mainstream Model Adaptation Methods
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Parameter | Description |
|---|---|
| Input Dimension | 3 (state vector: acc_edge, acc_cloud, accuracy gap Δ) |
| Output Dimension | 2 (actions: perform transfer / skip) |
| Number of Hidden Layers | 2 |
| Neurons Per Layer | 64 |
| Activation Function | ReLU |
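The table above fully determines the Q-network topology (3 → 64 → 64 → 2 with ReLU on the hidden layers). A NumPy sketch of the forward pass and ε-greedy action selection follows; the random initialization scale and seed are illustrative assumptions, not the paper's training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from the table: input 3, two hidden layers of 64, output 2
SIZES = [3, 64, 64, 2]

# Hypothetical small random initialization; the paper does not state an init scheme
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(SIZES[:-1], SIZES[1:])]
biases = [np.zeros(n) for n in SIZES[1:]]


def q_values(state):
    """Forward pass of the DQN: ReLU on the hidden layers, linear output."""
    x = np.asarray(state, dtype=float)
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)  # ReLU activation
    return x @ weights[-1] + biases[-1]  # two Q-values: skip (0) / transfer (1)


def choose_action(state, epsilon):
    """Epsilon-greedy selection over the two actions."""
    if rng.random() < epsilon:
        return int(rng.integers(2))  # explore: random action
    return int(np.argmax(q_values(state)))  # exploit: greedy action
```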
| Parameter | Value | Description |
|---|---|---|
| Learning Rate (lr) | 1 × 10⁻³ | Initial learning rate for the Adam optimizer |
| Discount Factor (γ) | 0.99 | Degree of consideration for future rewards |
| Initial ε Value | 1.0 | Initial exploration rate (fully random) |
| Minimum ε Value | 0.05 | Lower bound of ε to maintain some exploration |
| ε Decay Factor | 0.995 | ε is multiplied by this factor after each update |
| Replay Buffer Size | 10,000 | Stores past state–action transitions for replay |
| Batch Size | 64 | Number of samples per training iteration |
| Target Network Update Frequency | 50 | Target network parameters are synchronized every 50 steps |
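Two of these hyperparameters can be shown directly in code: the multiplicative ε decay with its 0.05 floor, and the discounted Bellman target that the loss L_Q drives the Q-network toward. The helper names are ours; the table's values are used as defaults.

```python
def epsilon_at(step, eps0=1.0, eps_min=0.05, decay=0.995):
    """Exploration rate after `step` decay updates: eps0 * decay^step, floored at eps_min."""
    return max(eps_min, eps0 * decay ** step)


def td_target(reward, next_q, gamma=0.99, done=False):
    """Bellman target r + γ · max_a' Q(s', a') used when minimizing the DQN loss L_Q.

    next_q: iterable of Q-values for the next state (here, one per action).
    """
    return reward if done else reward + gamma * max(next_q)
```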
| Module Name | Effect | Representative Shape |
|---|---|---|
| extractor | Feature extraction (ResNet) | Multi-level convolution and residual blocks |
| rpn | Region proposal | Candidate-box classification and position regression |
| head.classifier | Feature encoding | Three bottleneck blocks |
| head.cls_loc.weight | Position regression head weights | (4 × 12, 2048) |
| head.cls_loc.bias | Position regression head bias | (4 × 12, 1) |
| head.score.weight | Classification score head weights | (12, 2048) |
| head.score.bias | Classification score head bias | (12, 1) |
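Given these shapes, merging a class patch into the edge model touches only one row of head.score.weight and one entry of head.score.bias, leaving every other parameter frozen. A NumPy sketch follows; the dict-based patch format is an assumption for illustration (the paper stores patches as .pth files).

```python
import numpy as np

NUM_CLASSES, FEAT_DIM = 12, 2048  # shapes of head.score.* from the table above


def apply_class_patch(score_weight, score_bias, patch, c):
    """Merge a single-class patch into the classification head.

    Only row c of head.score.weight (shape (12, 2048)) and entry c of
    head.score.bias are replaced; all other parameters stay unchanged,
    mirroring the class-local transfer of Section 3.2.
    `patch` is a hypothetical dict: {"weight": (2048,) array, "bias": float}.
    """
    merged_w = score_weight.copy()  # copy so the base edge model is untouched
    merged_b = score_bias.copy()
    merged_w[c] = patch["weight"]
    merged_b[c] = patch["bias"]
    return merged_w, merged_b
```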
| Category | Quantity |
|---|---|
| A220 | 10,420 |
| A321 | 4159 |
| A330 | 2502 |
| A350 | 1613 |
| ARJ21 | 319 |
| Boeing737 | 6040 |
| Boeing747 | 2658 |
| Boeing777 | 2005 |
| Boeing787 | 2523 |
| C919 | 260 |
| Other-airplane | 16,930 |
| Category | Model Used | Precision | Recall | AP |
|---|---|---|---|---|
| A321 | Original Model | 70.30% | 60.68% | 53.66% |
| A321 | Without Reinforcement Learning | 67.44% | 55.77% | 47.17% |
| A330 | Original Model | 32.26% | 55.56% | 52.38% |
| A330 | Without Reinforcement Learning | 28.57% | 50.00% | 45.83% |
| Other-airplane | Original Model | 44.12% | 57.69% | 37.85% |
| Other-airplane | Without Reinforcement Learning | 41.43% | 55.77% | 34.63% |
| Category | Transferred Parameters | Precision | Recall | AP |
|---|---|---|---|---|
| A321 | Original model | 70.30% | 60.68% | 53.66% |
| A321 | head.score.bias | 70.30% | 60.68% | 52.71% |
| A321 | head.score.bias + head.score.weight | 71.00% | 60.68% | 53.71% |
| A321 | head.score.bias + head.score.weight + head.cls_loc.weight + head.cls_loc.bias | 31.28% | 60.68% | 34.87% |
| A330 | Original model | 32.26% | 55.56% | 52.38% |
| A330 | head.score.bias | 32.22% | 55.56% | 52.38% |
| A330 | head.score.bias + head.score.weight | 39.62% | 55.56% | 22.02% |
| A330 | head.score.bias + head.score.weight + head.cls_loc.weight + head.cls_loc.bias | 36.85% | 55.56% | 20.47% |
| Other-airplane | Original model | 44.12% | 57.69% | 37.85% |
| Other-airplane | head.score.bias | 44.11% | 57.69% | 37.85% |
| Other-airplane | head.score.bias + head.score.weight | 32.94% | 59.40% | 30.26% |
| Other-airplane | head.score.bias + head.score.weight + head.cls_loc.weight + head.cls_loc.bias | 33.33% | 59.83% | 30.26% |
| Category | Model Used | Precision | Recall | AP |
|---|---|---|---|---|
| A321 | Original Model | 70.30% | 60.68% | 53.66% |
| A321 | Pruning | 67.44% | 55.77% | 48.43% |
| A321 | Distillation | 70.28% | 60.68% | 53.71% |
| A330 | Original Model | 32.26% | 55.56% | 52.38% |
| A330 | Pruning | 33.33% | 50.00% | 42.71% |
| A330 | Distillation | 32.22% | 55.56% | 52.38% |
| Other-airplane | Original Model | 44.12% | 57.69% | 37.85% |
| Other-airplane | Pruning | 43.70% | 56.73% | 37.61% |
| Other-airplane | Distillation | 44.11% | 57.69% | 37.85% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chen, J.; Cheng, X.; Jia, Y.; Tan, S. Cloud–Edge Collaborative Model Adaptation Based on Deep Q-Network and Transfer Feature Extraction. Appl. Sci. 2025, 15, 8335. https://doi.org/10.3390/app15158335