A Systematic Review on Reinforcement Learning for Industrial Combinatorial Optimization Problems
Abstract
1. Introduction
1.1. Reinforcement Learning Foundations
1.2. Previous Reviews
1.3. Research Questions
- RQ1: how to sufficiently encode the problem (state representation);
- RQ2: what control over the problem is necessary (action space);
- RQ3: how to guide the agent towards the desired behavior (reward design), as illustrated in the sketch after this list;
- RQ4: what the main limitations are for practical implementation;
- RQ5: which future developments are needed.
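RQ1–RQ3 correspond one-to-one to the components of the agent–environment interface that every reviewed paper must define. The following minimal sketch, assuming a toy single-machine scheduling environment (the class name `SchedulingEnv`, the slack-based features, and the tardiness-based reward are illustrative assumptions, not the method of any surveyed work), shows one way these three design decisions materialize in code:

```python
# A minimal illustrative sketch of RQ1-RQ3 for a toy single-machine
# scheduling problem; all names and design choices here are assumptions
# for illustration, not taken from any of the reviewed papers.
import random

class SchedulingEnv:
    """Toy single-machine scheduling: pick one unscheduled job per step."""

    def __init__(self, proc_times, due_dates):
        assert len(proc_times) == len(due_dates)
        self.proc_times = proc_times
        self.due_dates = due_dates
        self.reset()

    def reset(self):
        self.unscheduled = set(range(len(self.proc_times)))
        self.clock = 0.0
        return self._state()

    def _state(self):
        # RQ1 (state representation): each remaining job is encoded by its
        # processing time and slack; already-scheduled jobs are masked out.
        return [
            (self.proc_times[j], self.due_dates[j] - self.clock)
            if j in self.unscheduled else (0.0, 0.0)
            for j in range(len(self.proc_times))
        ]

    def step(self, action):
        # RQ2 (action space): the action is the index of an unscheduled job.
        if action not in self.unscheduled:
            raise ValueError(f"job {action} is already scheduled")
        self.unscheduled.remove(action)
        self.clock += self.proc_times[action]
        # RQ3 (reward design): negative tardiness of the completed job, so
        # maximizing return mirrors minimizing the total-tardiness objective.
        reward = -max(0.0, self.clock - self.due_dates[action])
        done = not self.unscheduled
        return self._state(), reward, done

# Usage: roll out a random policy; an RL agent would replace random.choice.
env = SchedulingEnv(proc_times=[3, 1, 4], due_dates=[4, 2, 9])
state, done, total = env.reset(), False, 0.0
while not done:
    action = random.choice(sorted(env.unscheduled))
    state, reward, done = env.step(action)
    total += reward
print("total reward (negative total tardiness):", total)
```

In this sketch, the state masks scheduled jobs (RQ1), the action selects the next job index (RQ2), and the reward mirrors the objective function so that return maximization coincides with tardiness minimization (RQ3).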
2. Review Methodology
2.1. Search Criteria
2.2. Eligibility Criteria
- Out of scope: topics outside industrial applications or not involving combinatorial optimization (221);
- Insufficient details: papers lacking the explicit description of the agent–environment interaction needed for reproducibility (99);
- Reviews and surveys (64).
3. State Representation
3.1. Resource Features
3.2. Entity Features
3.3. Time-Related
3.4. Solution
3.5. Static Parameters
3.6. Problem-Specific
3.7. Multi-Agent
3.8. Hybrid Strategies
3.9. State Format
4. Action Space
4.1. Q-Value Estimation
4.2. List Selection
4.3. Sequential Decisions
4.4. Heuristic Selection
4.5. Multiple Selections
4.6. Variable-Sized Output
4.7. Number Estimation
5. Reward Design
5.1. Mirror Objective Function
5.2. Relative Rewards
5.3. Extra Terms
5.4. Conditional Rewards
5.5. Problem-Specific Objective
5.6. Multi-Agent
5.7. Alternative Goals
6. Discussion
6.1. Research Questions
6.1.1. RQ1—State Representation
6.1.2. RQ2—Action Space
6.1.3. RQ3—Reward Design
6.1.4. RQ4—Limitations
6.1.5. RQ5—Future Developments
6.2. Popular Algorithms
6.3. The Limitations of This Review
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2020; p. 282. [Google Scholar]
- Patel, P.P.; Jhaveri, R.H. Soft computing techniques to address various issues in wireless sensor networks: A survey. In Proceedings of the IEEE International Conference on Computing, Communication and Automation, ICCCA 2016, Greater Noida, India, 29–30 April 2016; pp. 399–404. [Google Scholar] [CrossRef]
- Cheng, L.; Yu, T. A new generation of AI: A review and perspective on machine learning technologies applied to smart energy and electric power systems. Int. J. Energy Res. 2019, 43, 1928–1973. [Google Scholar] [CrossRef]
- Cunha, B.; Madureira, A.M.; Fonseca, B.; Coelho, D. Deep Reinforcement Learning as a Job Shop Scheduling Solver: A Literature Review. In Proceedings of the 18th International Conference on Hybrid Intelligent Systems (HIS 2018), Porto, Portugal, 13–15 December 2018. [Google Scholar] [CrossRef]
- Morocho-Cayamcela, M.E.; Lee, H.; Lim, W. Machine learning for 5G/B5G mobile and wireless communications: Potential, limitations, and future directions. IEEE Access 2019, 7, 137184–137206. [Google Scholar] [CrossRef]
- Khan, S.; Farnsworth, M.; McWilliam, R.; Erkoyuncu, J. On the requirements of digital twin-driven autonomous maintenance. Annu. Rev. Control 2020, 50, 13–28. [Google Scholar] [CrossRef]
- Naeem, M.; Rizvi, S.T.H.; Coronato, A. A Gentle Introduction to Reinforcement Learning and its Application in Different Fields. IEEE Access 2020, 8, 209320–209344. [Google Scholar] [CrossRef]
- Quach, H.N.; Yeom, S.; Kim, K. Survey on reinforcement learning based efficient routing in SDN. In Proceedings of the 9th International Conference on Smart Media and Applications, Jeju, Republic of Korea, 17–19 September 2020; pp. 196–200. [Google Scholar] [CrossRef]
- Frikha, M.S.; Gammar, S.M.; Lahmadi, A.; Andrey, L. Reinforcement and deep reinforcement learning for wireless Internet of Things: A survey. Comput. Commun. 2021, 178, 98–113. [Google Scholar] [CrossRef]
- Xiao, Y.; Liu, J.; Wu, J.; Ansari, N. Leveraging Deep Reinforcement Learning for Traffic Engineering: A Survey. IEEE Commun. Surv. Tutor. 2021, 23, 2064–2097. [Google Scholar] [CrossRef]
- Esteso, A.; Peidro, D.; Mula, J.; Díaz-Madroñero, M. Reinforcement learning applied to production planning and control. Int. J. Prod. Res. 2023, 61, 5772–5789. [Google Scholar] [CrossRef]
- Torres, A.d.R.; Andreiana, D.S.; Roldán, Á.O.; Bustos, A.H.; Galicia, L.E.A. A Review of Deep Reinforcement Learning Approaches for Smart Manufacturing in Industry 4.0 and 5.0 Framework. Appl. Sci. 2022, 12, 12377. [Google Scholar] [CrossRef]
- Ogunfowora, O.; Najjaran, H. Reinforcement and deep reinforcement learning-based solutions for machine maintenance planning, scheduling policies, and optimization. J. Manuf. Syst. 2023, 70, 244–263. [Google Scholar] [CrossRef]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
- Haddaway, N.R.; Page, M.J.; Pritchard, C.C.; McGuinness, L.A. PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst. Rev. 2022, 18, e1230. [Google Scholar] [CrossRef] [PubMed]
- Bai, Y.; Lv, Y. Reinforcement Learning-based Job Shop Scheduling for Remanufacturing Production. In Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Kuala Lumpur, Malaysia, 7–10 December 2022; pp. 246–251. [Google Scholar] [CrossRef]
- Bretas, A.M.; Mendes, A.; Chalup, S.; Jackson, M.; Clement, R.; Sanhueza, C. Addressing deadlock in large-scale, complex rail networks via multi-agent deep reinforcement learning. Expert Syst. 2023, 42, e13315. [Google Scholar] [CrossRef]
- Chang, J.; Yu, D.; Zhou, Z.; He, W.; Zhang, L. Hierarchical Reinforcement Learning for Multi-Objective Real-Time Flexible Scheduling in a Smart Shop Floor. Machines 2022, 10, 1195. [Google Scholar] [CrossRef]
- Chen, Q.; Huang, W.; Peng, Y.; Huang, Y. A Reinforcement Learning-Based Framework for Solving the IP Mapping Problem. IEEE Trans. Very Large Scale Integr. Syst. 2021, 29, 1638–1651. [Google Scholar] [CrossRef]
- Danino, T.; Ben-Shimol, Y.; Greenberg, S. Container Allocation in Cloud Environment Using Multi-Agent Deep Reinforcement Learning. Electronics 2023, 12, 2614. [Google Scholar] [CrossRef]
- Geng, N.; Lan, T.; Aggarwal, V.; Yang, Y.; Xu, M. A Multi-agent Reinforcement Learning Perspective on Distributed Traffic Engineering. In Proceedings of the 2020 IEEE 28th International Conference on Network Protocols (ICNP), Madrid, Spain, 13–16 October 2020. [Google Scholar] [CrossRef]
- Han, B.A.; Yang, J.J. Research on adaptive job shop scheduling problems based on dueling double DQN. IEEE Access 2020, 8, 186474–186495. [Google Scholar] [CrossRef]
- Huang, Y.; Hao, C.; Mao, Y.; Zhou, F. Dynamic Resource Configuration for Low-Power IoT Networks: A Multi-Objective Reinforcement Learning Method. IEEE Commun. Lett. 2021, 25, 2285–2289. [Google Scholar] [CrossRef]
- Islam, M.T.; Karunasekera, S.; Buyya, R. Performance and Cost-Efficient Spark Job Scheduling Based on Deep Reinforcement Learning in Cloud Computing Environments. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 1695–1710. [Google Scholar] [CrossRef]
- Li, X.; Wang, J.; Sawhney, R. Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems. Eur. J. Oper. Res. 2012, 221, 99–109. [Google Scholar] [CrossRef]
- Li, X.; Fang, Y.; Pan, C.; Cai, Y.; Zhou, M. Resource Scheduling for UAV-Assisted Failure-Prone MEC in Industrial Internet. Drones 2023, 7, 259. [Google Scholar] [CrossRef]
- Liu, W.; Wu, S.; Zhu, H.; Zhang, H. An Integration Method of Heterogeneous Models for Process Scheduling Based on Deep Q-Learning Integration Agent. In Proceedings of the 16th IEEE Conference on Industrial Electronics and Applications, ICIEA 2021, Chengdu, China, 1–4 August 2021; pp. 1966–1971. [Google Scholar] [CrossRef]
- Ma, S.; Ilyushkin, A.; Stegehuis, A.; Iosup, A. Ananke: A Q-Learning-Based Portfolio Scheduler for Complex Industrial Workflows. In Proceedings of the 2017 IEEE International Conference on Autonomic Computing, ICAC 2017, Columbus, OH, USA, 17–21 July 2017; pp. 227–232. [Google Scholar] [CrossRef]
- Martins, M.S.; Viegas, J.L.; Coito, T.; Firme, B.M.; Sousa, J.M.; Figueredo, J.; Vieira, S.M. Reinforcement learning for dual-resource constrained scheduling. IFAC-PapersOnLine 2020, 53, 10810–10815. [Google Scholar] [CrossRef]
- Moon, J.; Yang, M.; Jeong, J. A novel approach to the job shop scheduling problem based on the deep Q-network in a cooperative multi-access edge computing ecosystem. Sensors 2021, 21, 4553. [Google Scholar] [CrossRef] [PubMed]
- Siddesha, K.; Jayaramaiah, G.V.; Singh, C. A novel deep reinforcement learning scheme for task scheduling in cloud computing. Clust. Comput. 2022, 25, 4171–4188. [Google Scholar] [CrossRef]
- Silva, T.; Azevedo, A. Production flow control through the use of reinforcement learning. Procedia Manuf. 2019, 38, 194–202. [Google Scholar] [CrossRef]
- Williem, R.S.; Setiawan, K. Reinforcement learning combined with radial basis function neural network to solve job-shop scheduling problem. In Proceedings of the APBITM 2011—2011 IEEE International Summer Conference of Asia Pacific Business Innovation and Technology Management, Dalian, China, 10–12 July 2011; pp. 29–32. [Google Scholar] [CrossRef]
- Wang, T.; Hu, X.; Zhang, Y. A DRL based approach for adaptive scheduling of one-of-a-kind production. Comput. Oper. Res. 2023, 158, 106306. [Google Scholar] [CrossRef]
- Xu, N.; Bu, T.M. Policy network for solving flexible job shop scheduling problem with setup times and resource constraints. In Proceedings of the GECCO 2022 Companion—2022 Genetic and Evolutionary Computation Conference, Boston, MA, USA, 9–13 July 2022; pp. 208–211. [Google Scholar] [CrossRef]
- Yang, Y.; Chen, X.; Yang, M.; Guo, W.; Jiang, P. Designing an Industrial Product Service System for Robot-Driven Sanding Processing Line: A Reinforcement Learning Based Approach. Machines 2024, 12, 136. [Google Scholar] [CrossRef]
- Yuan, M.; Huang, H.; Li, Z.; Zhang, C.; Pei, F.; Gu, W. A multi-agent double Deep-Q-network based on state machine and event stream for flexible job shop scheduling problem. Adv. Eng. Inform. 2023, 58, 102230. [Google Scholar] [CrossRef]
- Yu, L.; Yu, P.S.; Duan, Y.; Qiao, H. A resource scheduling method for reliable and trusted distributed composite services in cloud environment based on deep reinforcement learning. Front. Genet. 2022, 13, 964784. [Google Scholar] [CrossRef]
- Zhang, C.; Odonkor, P.; Zheng, S.; Khorasgani, H.; Serita, S.; Gupta, C.; Wang, H. Dynamic Dispatching for Large-Scale Heterogeneous Fleet via Multi-agent Deep Reinforcement Learning. In Proceedings of the 2020 IEEE International Conference on Big Data, Big Data 2020, Atlanta, GA, USA, 10–13 December 2020; pp. 1436–1441. [Google Scholar] [CrossRef]
- Zhang, M.; Lu, Y.; Hu, Y.; Amaitik, N.; Xu, Y. Dynamic Scheduling Method for Job-Shop Manufacturing Systems by Deep Reinforcement Learning with Proximal Policy Optimization. Sustainability 2022, 14, 5177. [Google Scholar] [CrossRef]
- Zhao, Y.; Wang, Y.; Tan, Y.; Zhang, J.; Yu, H. Dynamic Jobshop Scheduling Algorithm Based on Deep Q Network. IEEE Access 2021, 9, 122995–123011. [Google Scholar] [CrossRef]
- Zhao, C.; Deng, N. An actor-critic framework based on deep reinforcement learning for addressing flexible job shop scheduling problems. Math. Biosci. Eng. 2024, 21, 1445–1471. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Yang, P.; Ren, S.; Zhao, Z.; Cao, X.; Wu, D. Enhancing AIoT Device Association With Task Offloading in Aerial MEC Networks. IEEE Internet Things J. 2024, 11, 174–187. [Google Scholar] [CrossRef]
- Gao, Y.; Wu, W.; Nan, H.; Sun, Y.; Si, P. Deep Reinforcement Learning based Task Scheduling in Mobile Blockchain for IoT Applications. In Proceedings of the ICC 2020—2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020. [Google Scholar]
- Geurtsen, M.; Adan, I.; Atan, Z. Dynamic Scheduling of Maintenance by a Reinforcement Learning Approach—A Semiconductor Simulation Study. In Proceedings of the Winter Simulation Conference, Singapore, 11–14 December 2022; pp. 3110–3121. [Google Scholar] [CrossRef]
- Gong, Y.; Sun, S.; Wei, Y.; Song, M. Deep Reinforcement Learning for Edge Computing Resource Allocation in Blockchain Network Slicing Broker Framework. In Proceedings of the IEEE Vehicular Technology Conference, Helsinki, Finland, 25–28 April 2021. [Google Scholar] [CrossRef]
- Hao, Y.; Li, F.; Zhao, C.; Yang, S. Delay-Oriented Scheduling in 5G Downlink Wireless Networks Based on Reinforcement Learning With Partial Observations. IEEE/ACM Trans. Netw. 2023, 31, 380–394. [Google Scholar] [CrossRef]
- Lamprecht, R.; Wurst, F.; Huber, M.F. Reinforcement Learning based Condition-oriented Maintenance Scheduling for Flow Line Systems. In Proceedings of the IEEE International Conference on Industrial Informatics (INDIN), Palma de Mallorca, Spain, 21–23 July 2021; pp. 1–7. [Google Scholar] [CrossRef]
- Lei, K.; Guo, P.; Wang, Y.; Xiong, J.; Zhao, W. An End-to-end Hierarchical Reinforcement Learning Framework for Large-scale Dynamic Flexible Job-shop Scheduling Problem. In Proceedings of the International Joint Conference on Neural Networks, Padua, Italy, 18–23 July 2022; pp. 1–8. [Google Scholar] [CrossRef]
- Li, Y.L.; Fadda, E.; Manerba, D.; Roohnavazfar, M.; Tadei, R.; Terzo, O. Online Single-Machine Scheduling via Reinforcement Learning. In Recent Advances in Computational Optimization; Springer: Cham, Switzerland, 2022. [Google Scholar] [CrossRef]
- Li, K.; Ni, W.; Dressler, F. LSTM-Characterized Deep Reinforcement Learning for Continuous Flight Control and Resource Allocation in UAV-Assisted Sensor Network. IEEE Internet Things J. 2022, 9, 4179–4189. [Google Scholar] [CrossRef]
- Marchesano, M.G.; Guizzi, G.; Popolo, V.; Converso, G. Dynamic scheduling of a due date constrained flow shop with Deep Reinforcement Learning. IFAC-PapersOnLine 2022, 55, 2932–2937. [Google Scholar] [CrossRef]
- Meng, T.; Huang, J.; Li, H.; Li, Z.; Jiang, Y.; Zhong, Z. Q-Learning Based Optimisation Framework for Real-Time Mixed-Task Scheduling. Cyber-Phys. Syst. 2022, 8, 173–191. [Google Scholar] [CrossRef]
- Palacio, J.C.; Jiménez, Y.M.; Schietgat, L.; Doninck, B.V.; Nowé, A. A Q-Learning algorithm for flexible job shop scheduling in a real-world manufacturing scenario. Procedia CIRP 2022, 106, 227–232. [Google Scholar] [CrossRef]
- Raeissi, M.M.; Brooks, N.; Farinelli, A. A Balking Queue Approach for Modeling Human-Multi-Robot Interaction for Water Monitoring. In Proceedings of the PRIMA 2017: Principles and Practice of Multi-Agent Systems—20th International Conference, Nice, France, 30 October–3 November 2017; LNAI Vol. 10621, pp. 212–223. [Google Scholar] [CrossRef]
- Tan, Q.; Tong, Y.; Wu, S.; Li, D. Modeling, planning, and scheduling of shop-floor assembly process with dynamic cyber-physical interactions: A case study for CPS-based smart industrial robot production. Int. J. Adv. Manuf. Technol. 2019, 105, 3979–3989. [Google Scholar] [CrossRef]
- Tassel, P.; Kovács, B.; Gebser, M.; Schekotihin, K.; Kohlenbrein, W.; Schrott-Kostwein, P. Reinforcement Learning of Dispatching Strategies for Large-Scale Industrial Scheduling. In Proceedings of the International Conference on Automated Planning and Scheduling, ICAPS, Virtual, 13–24 June 2022; Volume 32, pp. 638–646. [Google Scholar] [CrossRef]
- Tripathy, S.S.; Bebortta, S.; Gadekallu, T.R. Sustainable Fog-Assisted Intelligent Monitoring Framework for Consumer Electronics in Industry 5.0 Applications. IEEE Trans. Consum. Electron. 2024, 70, 1501–1510. [Google Scholar] [CrossRef]
- Thomas, T.E.; Koo, J.; Chaterji, S.; Bagchi, S. Minerva: A reinforcement learning-based technique for optimal scheduling and bottleneck detection in distributed factory operations. In Proceedings of the 2018 10th International Conference on Communication Systems and Networks, COMSNETS 2018, Bengaluru, India, 3–7 January 2018; pp. 129–136. [Google Scholar] [CrossRef]
- Valet, A.; Altenmüller, T.; Waschneck, B.; May, M.C.; Kuhnle, A.; Lanza, G. Opportunistic maintenance scheduling with deep reinforcement learning. J. Manuf. Syst. 2022, 64, 518–534. [Google Scholar] [CrossRef]
- Xing, Y.; Yang, L.; Hu, X.; Mei, C.; Wang, H.; Li, J. 6G Deterministic Network Technology Based on Hierarchical Reinforcement Learning Framework. In Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, BMSB, Beijing, China, 14–16 June 2023; pp. 1–6. [Google Scholar] [CrossRef]
- Wang, Z.; Liao, W. Smart scheduling of dynamic job shop based on discrete event simulation and deep reinforcement learning. J. Intell. Manuf. 2023, 35, 2593–2610. [Google Scholar] [CrossRef]
- Yang, D.; Gong, K.; Zhang, W.; Guo, K.; Chen, J. enDRTS: Deep Reinforcement Learning Based Deterministic Scheduling for Chain Flows in TSN. In Proceedings of the 2023 International Conference on Networking and Network Applications (NaNA), Qingdao, China, 18–21 August 2023; pp. 239–244. [Google Scholar] [CrossRef]
- Zhang, Z.; Li, S.; Yan, X.; Zhang, L. Self-organizing network control with a TD learning algorithm. In Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Singapore, 10–13 December 2018; pp. 2159–2163. [Google Scholar] [CrossRef]
- Zhang, T.; Shen, S.; Mao, S.; Chang, G.K. Delay-aware Cellular Traffic Scheduling with Deep Reinforcement Learning. In Proceedings of the GLOBECOM 2020—2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–5. [Google Scholar] [CrossRef]
- Zhang, L.; Yang, C.; Yan, Y.; Hu, Y. Distributed Real-Time Scheduling in Cloud Manufacturing by Deep Reinforcement Learning. IEEE Trans. Ind. Inform. 2022, 18, 8999–9007. [Google Scholar] [CrossRef]
- Zhang, F.; Han, G.; Liu, L.; Zhang, Y.; Peng, Y.; Li, C. Cooperative Partial Task Offloading and Resource Allocation for IIoT Based on Decentralized Multi-Agent Deep Reinforcement Learning. IEEE Internet Things J. 2024, 11, 5526–5544. [Google Scholar] [CrossRef]
- Zhu, Y.; Sun, L.; Wang, J.; Huang, R.; Jia, X. Deep Reinforcement Learning-Based Joint Scheduling of 5G and TSN in Industrial Networks. Electronics 2023, 12, 2686. [Google Scholar] [CrossRef]
- Aissani, N.; Bekrar, A.; Trentesaux, D.; Beldjilali, B. Dynamic scheduling for multi-site companies: A decisional approach based on reinforcement multi-agent learning. J. Intell. Manuf. 2012, 23, 2513–2529. [Google Scholar] [CrossRef]
- Amaral, P.; Simoes, D. Deep Reinforcement Learning Based Routing in IP Media Broadcast Networks: Feasibility and Performance. IEEE Access 2022, 10, 62459–62470. [Google Scholar] [CrossRef]
- Bulbul, N.S.; Fischer, M. Reinforcement Learning assisted Routing for Time Sensitive Networks. In Proceedings of the GLOBECOM 2022—2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022; pp. 3863–3868. [Google Scholar] [CrossRef]
- Chen, B.; Wan, J.; Lan, Y.; Imran, M.; Li, D.; Guizani, N. Improving cognitive ability of edge intelligent IIoT through machine learning. IEEE Netw. 2019, 33, 61–67. [Google Scholar] [CrossRef]
- Dahl, T.S.; Matarić, M.; Sukhatme, G.S. Multi-robot task allocation through vacancy chain scheduling. Robot. Auton. Syst. 2009, 57, 674–687. [Google Scholar] [CrossRef]
- Farahani, A.; Genga, L.; Dijkman, R. Online Multimodal Transportation Planning using Deep Reinforcement Learning. In Proceedings of the 2021 IEEE International Conference on Systems, Man and Cybernetics, Melbourne, Australia, 17–20 October 2021; pp. 1691–1698. [Google Scholar] [CrossRef]
- Haliem, M.; Mani, G.; Aggarwal, V.; Bhargava, B. A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching Using Deep Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 2021, 22, 7931–7942. [Google Scholar] [CrossRef]
- Hazra, A.; Amgoth, T. CeCO: Cost-Efficient Computation Offloading of IoT Applications in Green Industrial Fog Networks. IEEE Trans. Ind. Inform. 2022, 18, 6255–6263. [Google Scholar] [CrossRef]
- Höpken, A.; Pargmann, H.; Schallner, H.; Galczynski, A.; Gerdes, L. Delivery scheduling in meat industry using reinforcement learning. Procedia CIRP 2023, 118, 68–73. [Google Scholar] [CrossRef]
- Huang, J.P.; Gao, L.; Li, X.Y. A Cooperative Hierarchical Deep Reinforcement Learning based Multi-agent Method for Distributed Job Shop Scheduling Problem with Random Job Arrivals. Comput. Ind. Eng. 2023, 185, 109650. [Google Scholar] [CrossRef]
- Hubbs, C.D.; Li, C.; Sahinidis, N.V.; Grossmann, I.E.; Wassick, J.M. A deep reinforcement learning approach for chemical production scheduling. Comput. Chem. Eng. 2020, 141, 106982. [Google Scholar] [CrossRef]
- Lei, K.; Guo, P.; Wang, Y.; Zhang, J.; Meng, X.; Qian, L. Large-Scale Dynamic Scheduling for Flexible Job-Shop With Random Arrivals of New Jobs by Hierarchical Reinforcement Learning. IEEE Trans. Ind. Inform. 2024, 20, 1007–1018. [Google Scholar] [CrossRef]
- Li, J.; Ma, Y.; Gao, R.; Cao, Z.; Lim, A.; Song, W.; Zhang, J. Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem. IEEE Trans. Cybern. 2022, 52, 13572–13585. [Google Scholar] [CrossRef]
- Li, H.; Assis, K.D.R.; Yan, S.; Simeonidou, D. DRL-Based Long-Term Resource Planning for Task Offloading Policies in Multiserver Edge Computing Networks. IEEE Trans. Netw. Serv. Manag. 2022, 19, 4151–4164. [Google Scholar] [CrossRef]
- Li, K. Optimizing warehouse logistics scheduling strategy using soft computing and advanced machine learning techniques. Soft Comput. 2023, 27, 18077–18092. [Google Scholar] [CrossRef]
- Liang, W.; Xie, W.; Zhou, X.; Wang, K.I.; Ma, J.; Jin, Q. Bi-Dueling DQN Enhanced Two-stage Scheduling for Augmented Surveillance in Smart EMS. IEEE Trans. Ind. Inform. 2022, 19, 8218–8228. [Google Scholar] [CrossRef]
- Liu, Z.; Long, C.; Lu, X.; Hu, Z.; Zhang, J.; Wang, Y. Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing Using Deep Reinforcement Learning. IEEE Access 2019, 7, 107744–107756. [Google Scholar] [CrossRef]
- Lu, Y.; Huang, X.; Zhang, K.; Maharjan, S.; Zhang, Y. Communication-Efficient Federated Learning and Permissioned Blockchain for Digital Twin Edge Networks. IEEE Internet Things J. 2021, 8, 2276–2288. [Google Scholar] [CrossRef]
- Méndez-Hernández, B.M.; Rodríguez-Bazan, E.D.; Martinez-Jimenez, Y.; Libin, P.; Nowé, A. A Multi-objective Reinforcement Learning Algorithm for JSSP. In Proceedings of the 28th International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019; pp. 567–584. [Google Scholar] [CrossRef]
- Mhaisen, N.; Fetais, N.; Massoud, A. Real-Time Scheduling for Electric Vehicles Charging/Discharging Using Reinforcement Learning. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies, ICIoT 2020, Doha, Qatar, 2–5 February 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Paraschos, P.D.; Koulinas, G.K.; Koulouriotis, D.E. A reinforcement learning/ad-hoc planning and scheduling mechanism for flexible and sustainable manufacturing systems. Flex. Serv. Manuf. J. 2024, 36, 714–736. [Google Scholar] [CrossRef]
- Park, I.B.; Huh, J.; Kim, J.; Park, J. A Reinforcement Learning Approach to Robust Scheduling of Semiconductor Manufacturing Facilities. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1420–1431. [Google Scholar] [CrossRef]
- Roy, S.B.; Tan, E. Multi-hop Computational Offloading with Reinforcement Learning for Industrial IoT Networks. In Proceedings of the 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring), Florence, Italy, 20–23 June 2023; pp. 1–5. [Google Scholar] [CrossRef]
- Schneider, J.G.; Boyan, J.A.; Moore, A.W. Stochastic Production Scheduling to meet Demand Forecasts. In Proceedings of the 37th IEEE Conference on Decision & Control, Tampa, FL, USA, 18 December 1998; pp. 2722–2727. [Google Scholar]
- Shen, X.; Liu, S.; Zhou, B.; Wu, T.; Zhang, Q.; Bao, J. Digital Twin-Driven Reinforcement Learning Method for Marine Equipment Vehicles Scheduling Problem. IEEE Trans. Autom. Sci. Eng. 2024, 21, 2173–2183. [Google Scholar] [CrossRef]
- Song, W.; Mi, N.; Li, Q.; Zhuang, J.; Cao, Z. Stochastic Economic Lot Scheduling via Self-Attention Based Deep Reinforcement Learning. IEEE Trans. Autom. Sci. Eng. 2024, 21, 1457–1468. [Google Scholar] [CrossRef]
- Tong, Z.; Wang, J.; Wang, Y.; Liu, B.; Li, Q. Energy and Performance-Efficient Dynamic Consolidate VMs Using Deep-Q Neural Network. IEEE Trans. Ind. Inform. 2023, 19, 11030–11040. [Google Scholar] [CrossRef]
- Vivekanandan, D.; Wirth, S.; Karlbauer, P.; Klarmann, N. A Reinforcement Learning Approach for Scheduling Problems with Improved Generalization through Order Swapping. Mach. Learn. Knowl. Extr. 2023, 5, 418–430. [Google Scholar] [CrossRef]
- Yu, X.; Wang, R.; Hao, J.; Wu, Q.; Yi, C.; Wang, P.; Niyato, D. Priority-Aware Deployment of Autoscaling Service Function Chains based On Deep Reinforcement Learning. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 1050–1062. [Google Scholar] [CrossRef]
- Wang, S.; Bi, S.; Zhang, Y.A. Reinforcement Learning for Real-Time Pricing and Scheduling Control in EV Charging Stations. IEEE Trans. Ind. Inform. 2021, 17, 849–859. [Google Scholar] [CrossRef]
- Zhang, H.; Feng, L.; Liu, X.; Long, K.; Karagiannidis, G.K. User Scheduling and Task Offloading in Multi-Tier Computing 6G Vehicular Network. IEEE J. Sel. Areas Commun. 2023, 41, 446–456. [Google Scholar] [CrossRef]
- Zhang, F.; Han, G.; Li, A.; Lin, C.; Liu, L. QoS-Driven Distributed Cooperative Data Offloading and Heterogeneous Resource Scheduling for IIoT. IEEE Internet Things Mag. 2023, 6, 118–124. [Google Scholar] [CrossRef]
- Zhang, J.; Kong, L.; Zhang, H. Coordinated Ride-hailing Order Scheduling and Charging for Autonomous Electric Vehicles Based on Deep Reinforcement Learning. In Proceedings of the 2023 IEEE IAS Industrial and Commercial Power System Asia, I and CPS Asia 2023, Chongqing, China, 7–9 July 2023; pp. 2038–2044. [Google Scholar] [CrossRef]
- Chen, Z.; Wang, H.; Wang, B.; Yang, L.; Song, C.; Zhang, X.; Lin, F.; Cheng, J.C. Scheduling optimization of electric ready mixed concrete vehicles using an improved model-based reinforcement learning. Autom. Constr. 2024, 160, 105308. [Google Scholar] [CrossRef]
- Fu, F.; Kang, Y.; Zhang, Z.; Yu, F.R.; Wu, T. Soft Actor-Critic DRL for Live Transcoding and Streaming in Vehicular Fog-Computing-Enabled IoV. IEEE Internet Things J. 2021, 8, 1308–1321. [Google Scholar] [CrossRef]
- Gao, Y.; Zhang, C.; Xie, Z.; Qi, Z.; Zhou, J. Cost-Efficient and Quality-of-Experience-Aware Player Request Scheduling and Rendering Server Allocation for Edge-Computing-Assisted Multiplayer Cloud Gaming. IEEE Internet Things J. 2022, 9, 12029–12040. [Google Scholar] [CrossRef]
- Huang, Y.; Sun, Y.; Ding, Z. Renewable Energy Integration Driven Charging Scheme for Electric Vehicle Based Large Scale Delivery System. In Proceedings of the 2022 IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia), Shanghai, China, 8–11 July 2022; pp. 1251–1256. [Google Scholar] [CrossRef]
- Ingalalli, A.; Kamalasadan, S.; Dong, Z.; Bharati, G.; Chakraborty, S. An Extended Q-Routing-based Event-driven Dynamic Reconfiguration of Networked Microgrids. In Proceedings of the 2022 IEEE Industry Applications Society Annual Meeting (IAS), Detroit, MI, USA, 9–14 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
- Kim, S. Multi-Agent Learning and Bargaining Scheme for Cooperative Spectrum Sharing Process. IEEE Access 2023, 11, 47863–47872. [Google Scholar] [CrossRef]
- Lee, Y.H.; Lee, S. Deep reinforcement learning based scheduling within production plan in semiconductor fabrication. Expert Syst. Appl. 2022, 191, 116222. [Google Scholar] [CrossRef]
- Lei, J.; Hui, J.; Chang, F.; Dassari, S.; Ding, K. Reinforcement learning-based dynamic production-logistics-integrated tasks allocation in smart factories. Int. J. Prod. Res. 2023, 61, 4419–4436. [Google Scholar] [CrossRef]
- Li, M.; Chen, C.; Hua, C.; Guan, X. Learning-Based Autonomous Scheduling for AoI-Aware Industrial Wireless Networks. IEEE Internet Things J. 2020, 7, 9175–9188. [Google Scholar] [CrossRef]
- Ong, K.S.H.; Wang, W.; Hieu, N.Q.; Niyato, D.; Friedrichs, T. Predictive Maintenance Model for IIoT-Based Manufacturing: A Transferable Deep Reinforcement Learning Approach. IEEE Internet Things J. 2022, 9, 15725–15741. [Google Scholar] [CrossRef]
- Onishi, T.; Takahashi, E.; Nishikawa, Y.; Maruyama, S. AppDAS: An Application QoS-Aware Distributed Antenna Selection for 5G Industrial Applications. In Proceedings of the 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2023; pp. 1027–1032. [Google Scholar] [CrossRef]
- Peng, S.; Xiong, G.; Ren, Y.; Shen, Z.; Liu, S.; Han, Y. A Parallel Learning Approach for the Flexible Job Shop Scheduling Problem. IEEE J. Radio Freq. Identif. 2022, 6, 851–856. [Google Scholar] [CrossRef]
- Redhu, S.; Hegde, R.M. Cooperative Network Model for Joint Mobile Sink Scheduling and Dynamic Buffer Management Using Q-Learning. IEEE Trans. Netw. Serv. Manag. 2020, 17, 1853–1864. [Google Scholar] [CrossRef]
- Rjoub, G.; Bentahar, J.; Abdel Wahab, O.; Saleh Bataineh, A. Deep and reinforcement learning for automated task scheduling in large-scale cloud computing systems. Concurr. Comput. Pract. Exp. 2021, 33, e5919. [Google Scholar] [CrossRef]
- Ruiz Rodríguez, M.L.; Kubler, S.; de Giorgio, A.; Cordy, M.; Robert, J.; Le Traon, Y. Multi-agent deep reinforcement learning based Predictive Maintenance on parallel machines. Robot. Comput. Integr. Manuf. 2022, 78, 102406. [Google Scholar] [CrossRef]
- Song, W.; Chen, X.; Li, Q.; Cao, Z. Flexible Job-Shop Scheduling via Graph Neural Network and Deep Reinforcement Learning. IEEE Trans. Ind. Inform. 2023, 19, 1600–1610. [Google Scholar] [CrossRef]
- Tan, L.; Hai, X.; Ma, K.; Fan, D.; Qiu, H.; Feng, Q. Digital Twin-Enabled Decision-Making Framework for Multi-UAV Mission Planning: A Multiagent Deep Reinforcement Learning Perspective. In Proceedings of the IECON 2023—49th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 16–19 October 2023; pp. 14–19. [Google Scholar] [CrossRef]
- Waschneck, B.; Reichstaller, A.; Belzner, L.; Altenmuller, T.; Bauernhansl, T.; Knapp, A.; Kyek, A. Deep reinforcement learning for semiconductor production scheduling. In Proceedings of the 2018 29th Annual SEMI Advanced Semiconductor Manufacturing Conference, ASMC 2018, Saratoga Springs, NY, USA, 30 April–3 May 2018; pp. 301–306. [Google Scholar] [CrossRef]
- Xia, M.; Liu, H.; Li, M.; Wang, L. A dynamic scheduling method with Conv-Dueling and generalized representation based on reinforcement learning. Int. J. Ind. Eng. Comput. 2023, 14, 805–820. [Google Scholar] [CrossRef]
- Xie, R.; Gu, D.; Tang, Q.; Huang, T.; Yu, F.R. Workflow Scheduling in Serverless Edge Computing for the Industrial Internet of Things: A Learning Approach. IEEE Trans. Ind. Inform. 2022, 19, 8242–8252. [Google Scholar] [CrossRef]
- Xu, Y.; Zhao, J. Actor-Critic with Transformer for Cloud Computing Resource Three Stage Job Scheduling. In Proceedings of the 2022 7th International Conference on Cloud Computing and Big Data Analytics, ICCCBDA 2022, Chengdu, China, 22–24 April 2022; pp. 33–37. [Google Scholar] [CrossRef]
- Yan, K.; Shan, H.; Sun, T.; Hu, H.; Wu, Y.; Yu, L.; Zhang, Z.; Quek, T.Q. Reinforcement Learning-Based Mobile Edge Computing and Transmission Scheduling for Video Surveillance. IEEE Trans. Emerg. Top. Comput. 2022, 10, 1142–1156. [Google Scholar] [CrossRef]
- Wang, S.; Li, J.; Luo, Y. Smart Scheduling for Flexible and Hybrid Production with Multi-Agent Deep Reinforcement Learning. In Proceedings of the 2021 IEEE 2nd International Conference on Information Technology, Big Data and Artificial Intelligence, ICIBA 2021, Chongqing, China, 17–19 December 2021; Volume 2, pp. 288–294. [Google Scholar] [CrossRef]
- Wang, Z.; Liao, W. Job Shop Scheduling Problem Using Proximal Policy Optimization. In Proceedings of the 2023 IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2023, Singapore, 18–21 December 2023; pp. 1517–1521. [Google Scholar] [CrossRef]
- Wei, Z.; Li, M.; Wei, Z.; Cheng, L.; Lyu, Z.; Liu, F. A novel on-demand charging strategy based on swarm reinforcement learning in WRSNs. IEEE Access 2020, 8, 84258–84271. [Google Scholar] [CrossRef]
- Wu, D.; Liu, T.; Li, Z.; Tang, T.; Wang, R. Delay-Aware Edge-Terminal Collaboration in Green Internet of Vehicles: A Multiagent Soft Actor-Critic Approach. IEEE Trans. Green Commun. Netw. 2023, 7, 1090–1102. [Google Scholar] [CrossRef]
- Yan, L.; Shen, H.; Kang, L.; Zhao, J.; Zhang, Z.; Xu, C. MobiCharger: Optimal Scheduling for Cooperative EV-to-EV Dynamic Wireless Charging. IEEE Trans. Mob. Comput. 2023, 22, 6889–6906. [Google Scholar] [CrossRef]
- Zisgen, H.; Miltenberger, R.; Hochhaus, M.; Stöhr, N. Dynamic Scheduling of Gantry Robots using Simulation and Reinforcement Learning. In Proceedings of the 2023 Winter Simulation Conference (WSC), San Antonio, TX, USA, 10–13 December 2023; pp. 3026–3034. [Google Scholar] [CrossRef]
- Zhao, Y.; Luo, X.; Zhang, Y. The application of heterogeneous graph neural network and deep reinforcement learning in hybrid flow shop scheduling problem. Comput. Ind. Eng. 2024, 187, 109802. [Google Scholar] [CrossRef]
- Zhou, J.; Zheng, L.; Fan, W. Multirobot collaborative task dynamic scheduling based on multiagent reinforcement learning with heuristic graph convolution considering robot service performance. J. Manuf. Syst. 2024, 72, 122–141. [Google Scholar] [CrossRef]
- Felder, M.; Steiner, D.; Busch, P.; Trat, M.; Sun, C.; Bender, J.; Ovtcharova, J. Energy-Flexible Job-Shop Scheduling Using Deep Reinforcement Learning. In Proceedings of the Conference on Production Systems and Logistics, Santiago de Querétaro, Mexico, 28 February–2 March 2023; pp. 353–362. [Google Scholar] [CrossRef]
- Lara-Cárdenas, E.; Silva-Gálvez, A.; Ortiz-Bayliss, J.C.; Amaya, I.; Cruz-Duarte, J.M.; Terashima-Marín, H. Exploring Reward-based Hyper-heuristics for the Job-shop Scheduling Problem. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; pp. 3133–3140. [Google Scholar]
- Qu, S.; Jie, W.; Shivani, G. Learning adaptive dispatching rules for a manufacturing process system by using reinforcement learning approach. In Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, Berlin, Germany, 6–9 September 2016. [Google Scholar] [CrossRef]
- Teng, Y.; Li, L.; Song, L.; Yu, F.R.; Leung, V.C. Profit Maximizing Smart Manufacturing over AI-Enabled Configurable Blockchains. IEEE Internet Things J. 2022, 9, 346–358. [Google Scholar] [CrossRef]
- Wang, X.; Zhang, L.; Liu, Y.; Zhao, C. Logistics-involved task scheduling in cloud manufacturing with offline deep reinforcement learning. J. Ind. Inf. Integr. 2023, 34, 100471. [Google Scholar] [CrossRef]
- Klein, N.; Prunte, J. A New Deep Reinforcement Learning Algorithm for the Online Stochastic Profitable Tour Problem. In Proceedings of the 2022 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Kuala Lumpur, Malaysia, 7–10 December 2022; pp. 635–639. [Google Scholar] [CrossRef]
- Liu, Y.; Yang, C.; Jiang, L.; Xie, S.; Zhang, Y. Intelligent Edge Computing for IoT-Based Energy Management in Smart Cities. IEEE Netw. 2019, 33, 111–117. [Google Scholar] [CrossRef]
- Müller, A.; Grumbach, F.; Kattenstroth, F. Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production. IEEE Access 2024, 12, 11388–11399. [Google Scholar] [CrossRef]
- Wang, Y.; Chen, X.; Wang, L. Deep Reinforcement Learning-Based Rescue Resource Distribution Scheduling of Storm Surge Inundation Emergency Logistics. IEEE Trans. Ind. Inform. 2023, 19, 10004–10013. [Google Scholar] [CrossRef]
- Yan, H.; Cui, Z.; Chen, X.; Ma, X. Distributed Multiagent Deep Reinforcement Learning for Multiline Dynamic Bus Timetable Optimization. IEEE Trans. Ind. Inform. 2023, 19, 469–479. [Google Scholar] [CrossRef]
- Zhou, Y.; Li, X.; Luo, J.; Yuan, M.; Zeng, J.; Yao, J. Learning to Optimize DAG Scheduling in Heterogeneous Environment. In Proceedings of the IEEE International Conference on Mobile Data Management, Paphos, Cyprus, 6–9 June 2022; pp. 137–146. [Google Scholar] [CrossRef]
- Chen, Q.; Zheng, Z.; Hu, C.; Wang, D.; Liu, F. Data-driven task allocation for multi-task transfer learning on the edge. In Proceedings of the International Conference on Distributed Computing Systems, Dallas, TX, USA, 7–10 July 2019; pp. 1040–1050. [Google Scholar] [CrossRef]
- Choi, G.; Jeon, S.; Cho, J.; Moon, J. A Seed Scheduling Method with a Reinforcement Learning for a Coverage Guided Fuzzing. IEEE Access 2023, 11, 2048–2057. [Google Scholar] [CrossRef]
- Du, H.; Xu, W.; Yao, B.; Zhou, Z.; Hu, Y. Collaborative optimization of service scheduling for industrial cloud robotics based on knowledge sharing. Procedia CIRP 2019, 83, 132–138. [Google Scholar] [CrossRef]
- Fechter, J.; Beham, A.; Wagner, S.; Affenzeller, M. Approximate Q-Learning for Stacking Problems with Continuous Production and Retrieval. Appl. Artif. Intell. 2019, 33, 68–86. [Google Scholar] [CrossRef]
- Fu, F.; Kang, Y.; Zhang, Z.; Yu, F.R. Transcoding for live streaming-based on vehicular fog computing: An actor-critic DRL approach. In Proceedings of the IEEE INFOCOM 2020—IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 6–9 July 2020; pp. 1015–1020. [Google Scholar] [CrossRef]
- Iwamura, K.; Mayumi, N.; Tanimizu, Y.; Sugimura, N. A study on real-time scheduling for holonic manufacturing systems—Determination of utility values based on multi-agent reinforcement learning. In Proceedings of the 4th International Conference on Industrial Applications of Holonic and Multi-Agent Systems, HoloMAS 2009, Linz, Austria, 31 August–2 September 2009; pp. 135–144. [Google Scholar] [CrossRef]
- Lei, W.; Ye, Y.; Xiao, M. Deep Reinforcement Learning-Based Spectrum Allocation in Integrated Access and Backhaul Networks. IEEE Trans. Cogn. Commun. Netw. 2020, 6, 970–979. [Google Scholar] [CrossRef]
- Li, X.; Luo, W.; Yuan, M.; Wang, J.; Lu, J.; Wang, J.; Lu, J.; Zeng, J. Learning to optimize industry-scale dynamic pickup and delivery problems. In Proceedings of the International Conference on Data Engineering, Chania, Greece, 19–22 April 2021; pp. 2511–2522. [Google Scholar] [CrossRef]
- Liu, Y.; Yang, M.; Guo, Z. Reinforcement learning based optimal decision making towards product lifecycle sustainability. Int. J. Comput. Integr. Manuf. 2022, 35, 1269–1296. [Google Scholar] [CrossRef]
- Ma, S.; Ruan, J.; Du, Y.; Bucknall, R.; Liu, Y. An End-to-End Deep Reinforcement Learning Based Modular Task Allocation Framework for Autonomous Mobile Systems. IEEE Trans. Autom. Sci. Eng. 2024, 1–15. [Google Scholar] [CrossRef]
- Melnik, M.; Dolgov, I.; Nasonov, D. Hybrid intellectual scheme for scheduling of heterogeneous workflows based on evolutionary approach and reinforcement learning. In Proceedings of the IJCCI 2020—12th International Joint Conference on Computational Intelligence, Budapest, Hungary, 2–4 November 2020; pp. 200–211. [Google Scholar] [CrossRef]
- Muller-Zhang, Z.; Kuhn, T. A Digital Twin-based Approach Performing Integrated Process Planning and Scheduling for Service-based Production. In Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, Stuttgart, Germany, 6–9 September 2022. [Google Scholar] [CrossRef]
- Phiboonbanakit, T.; Horanont, T.; Huynh, V.N.; Supnithi, T. A Hybrid Reinforcement Learning-Based Model for the Vehicle Routing Problem in Transportation Logistics. IEEE Access 2021, 9, 163325–163347. [Google Scholar] [CrossRef]
- Song, G.; Xia, M.; Zhang, D. Deep Reinforcement Learning for Risk and Disaster Management in Energy-Efficient Marine Ranching. Energies 2023, 16, 6092. [Google Scholar] [CrossRef]
- Szwarcfiter, C.; Herer, Y.T.; Shtub, A. Balancing Project Schedule, Cost, and Value under Uncertainty: A Reinforcement Learning Approach. Algorithms 2023, 16, 395. [Google Scholar] [CrossRef]
- Troch, A.; Mannens, E.; Mercelis, S. Solving the Storage Location Assignment Problem Using Reinforcement Learning. In Proceedings of the 2023 the 8th International Conference on Mathematics and Artificial Intelligence, Chongqing, China, 7–9 April 2023; pp. 89–95. [Google Scholar] [CrossRef]
- Troia, S.; Alvizu, R.; Maier, G. Reinforcement learning for service function chain reconfiguration in NFV-SDN metro-core optical networks. IEEE Access 2019, 7, 167944–167957. [Google Scholar] [CrossRef]
- Zhang, J.; Lv, Y.; Li, Y.; Liu, J. An Improved QMIX-Based AGV Scheduling Approach for Material Handling Towards Intelligent Manufacturing. In Proceedings of the 2022 IEEE 20th International Conference on Embedded and Ubiquitous Computing, EUC 2022, Wuhan, China, 9–11 December 2022; pp. 54–59. [Google Scholar] [CrossRef]
- Guo, Y.; Li, J.; Xiao, L.; Allaoui, H.; Choudhary, A.; Zhang, L. Efficient inventory routing for Bike-Sharing Systems: A combinatorial reinforcement learning framework. Transp. Res. Part E Logist. Transp. Rev. 2024, 182, 103415. [Google Scholar] [CrossRef]
- Kumar, A.; Dimitrakopoulos, R. Production scheduling in industrial mining complexes with incoming new information using tree search and deep reinforcement learning. Appl. Soft Comput. 2021, 110, 107644. [Google Scholar] [CrossRef]
- Liu, S.; Wang, W.; Zhong, S.; Peng, Y.; Tian, Q.; Li, R.; Sun, X.; Yang, Y. A graph-based approach for integrating massive data in container terminals with application to scheduling problem. Int. J. Prod. Res. 2024, 62, 5945–5965. [Google Scholar] [CrossRef]
- Lu, Y.; Fang, S.; Niu, T.; Chen, G.; Liao, R. Battery Swapping Strategy for Electric Transfer-Vehicles in Seaport: A Deep Q-Network Approach. In Proceedings of the 2023 IEEE/IAS 59th Industrial and Commercial Power Systems Technical Conference (I&CPS), Las Vegas, NV, USA, 21–25 May 2023; pp. 1–6. [Google Scholar] [CrossRef]
- Ruan, J.H.; Wang, Z.X.; Chan, F.T.; Patnaik, S.; Tiwari, M.K. A reinforcement learning-based algorithm for the aircraft maintenance routing problem. Expert Syst. Appl. 2021, 169, 114399. [Google Scholar] [CrossRef]
- Sun, Y.; Long, Y.; Xu, L.; Tan, W.; Huang, L.; Zhao, L.; Liu, W. Long-Term Matching Optimization With Federated Neural Temporal Difference Learning in Mobility-on-Demand Systems. IEEE Internet Things J. 2023, 10, 1426–1445. [Google Scholar] [CrossRef]
- Wang, G.; Qin, Z.; Wang, S.; Sun, H.; Dong, Z.; Zhang, D. Towards Accessible Shared Autonomous Electric Mobility with Dynamic Deadlines. IEEE Trans. Mob. Comput. 2024, 23, 925–940. [Google Scholar] [CrossRef]
- Zhang, L.; Yan, Y.; Hu, Y. Deep reinforcement learning for dynamic scheduling of energy-efficient automated guided vehicles. J. Intell. Manuf. 2024, 35, 3875–3888. [Google Scholar] [CrossRef]
- Zhang, L.; Yang, C.; Yan, Y.; Cai, Z.; Hu, Y. Automated guided vehicle dispatching and routing integration via digital twin with deep reinforcement learning. J. Manuf. Syst. 2024, 72, 492–503. [Google Scholar] [CrossRef]
- Gankin, D.; Mayer, S.; Zinn, J.; Vogel-Heuser, B.; Endisch, C. Modular Production Control with Multi-Agent Deep Q-Learning. In Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, Vasteras, Sweden, 7–10 September 2021; pp. 1–8. [Google Scholar] [CrossRef]
- Stöckermann, P.; Immordino, A.; Altenmüller, T.; Seidel, G. Dispatching in Real Frontend Fabs With Industrial Grade Discrete-Event Simulations by Deep Reinforcement Learning with Evolution Strategies. In Proceedings of the 2023 Winter Simulation Conference (WSC), San Antonio, TX, USA, 10–13 December 2023; pp. 1–23. [Google Scholar] [CrossRef]
- Liu, R.; Piplani, R.; Toro, C. A deep multi-agent reinforcement learning approach to solve dynamic job shop scheduling problem. Comput. Oper. Res. 2023, 159, 106294. [Google Scholar] [CrossRef]
- Farag, H.; Gidlund, M.; Stefanovic, C. A Deep Reinforcement Learning Approach for Improving Age of Information in Mission-Critical IoT. In Proceedings of the 2021 IEEE Global Conference on Artificial Intelligence and Internet of Things, GCAIoT 2021, Dubai, United Arab Emirates, 12–16 December 2021; pp. 14–18. [Google Scholar] [CrossRef]
- Lee, S.; Cho, Y.; Lee, Y.H. Injection mold production sustainable scheduling using deep reinforcement learning. Sustainability 2020, 12, 8718. [Google Scholar] [CrossRef]
- Alitappeh, R.J.; Jeddisaravi, K. Multi-robot exploration in task allocation problem. Appl. Intell. 2022, 52, 2189–2211. [Google Scholar] [CrossRef]
- Ao, W.; Zhang, G.; Li, Y.; Jin, D. Learning to Solve Grouped 2D Bin Packing Problems in the Manufacturing Industry. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 3713–3723. [Google Scholar] [CrossRef]
- Arishi, A.; Krishnan, K.; Arishi, M. Machine learning approach for truck-drones based last-mile delivery in the era of industry 4.0. Eng. Appl. Artif. Intell. 2022, 116, 105439. [Google Scholar] [CrossRef]
- Fang, J.; Rao, Y.; Luo, Q.; Xu, J. Solving One-Dimensional Cutting Stock Problems with the Deep Reinforcement Learning. Mathematics 2023, 11, 1028. [Google Scholar] [CrossRef]
- Liu, H.; Zhou, L.; Yang, J.; Zhao, J. The 3D bin packing problem for multiple boxes and irregular items based on deep Q-network. Appl. Intell. 2023, 53, 23398–23425. [Google Scholar] [CrossRef]
- Palombarini, J.; Martínez, E. SmartGantt—An intelligent system for real time rescheduling based on relational reinforcement learning. Expert Syst. Appl. 2012, 39, 10251–10268. [Google Scholar] [CrossRef]
- Palombarini, J.A.; Martínez, E.C. End-to-end on-line rescheduling from Gantt chart images using deep reinforcement learning. Int. J. Prod. Res. 2022, 60, 4434–4463. [Google Scholar] [CrossRef]
- Saroliya, U.; Arima, E.; Liu, D.; Schulz, M. Hierarchical Resource Partitioning on Modern GPUs: A Reinforcement Learning Approach. In Proceedings of the IEEE International Conference on Cluster Computing, ICCC, Santa Fe, NM, USA, 31 October–3 November 2023; pp. 185–196. [Google Scholar] [CrossRef]
- Servadei, L.; Zheng, J.; Arjona-Medina, J.; Werner, M.; Esen, V.; Hochreiter, S.; Ecker, W.; Wille, R. Cost optimization at early stages of design using deep reinforcement learning. In Proceedings of the MLCAD 2020—2020 ACM/IEEE Workshop on Machine Learning for CAD, Reykjavik, Iceland, 16–20 November 2020; pp. 37–42. [Google Scholar] [CrossRef]
- Wang, X.; Ren, T.; Bai, D.; Chu, F.; Yu, Y.; Meng, F.; Wu, C.C. Scheduling a multi-agent flow shop with two scenarios and release dates. Int. J. Prod. Res. 2023, 62, 421–443. [Google Scholar] [CrossRef]
- Wang, J.; Xing, C.; Liu, J. Intelligent preamble allocation for coexistence of mMTC/URLLC devices: A hierarchical Q-learning based approach. China Commun. 2023, 20, 44–53. [Google Scholar] [CrossRef]
- Wang, Z.; Chen, Y.; Liu, C.; Lin, W.; Yang, L. Guided Reinforce Learning Through Spatial Residual Value for Online 3D Bin Packing. In Proceedings of the IECON 2023—49th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 16–19 October 2023; pp. 1–5. [Google Scholar] [CrossRef]
- Wang, C.; Shen, X.; Wang, H.; Xie, W.; Zhang, H.; Mei, H. Multi-Agent Reinforcement Learning-Based Routing Protocol for Underwater Wireless Sensor Networks with Value of Information. IEEE Sens. J. 2024, 24, 7042–7054. [Google Scholar] [CrossRef]
- Wu, Y.; Song, W.; Cao, Z.; Zhang, J.; Lim, A. Learning Improvement Heuristics for Solving Routing Problems. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 5057–5069. [Google Scholar] [CrossRef]
- Xiong, H.; Ding, K.; Ding, W.; Peng, J.; Xu, J. Towards reliable robot packing system based on deep reinforcement learning. Adv. Eng. Inform. 2023, 57, 102028. [Google Scholar] [CrossRef]
- Yuan, J.; Zhang, J.; Cai, Z.; Yan, J. Towards Variance Reduction for Reinforcement Learning of Industrial Decision-making Tasks: A Bi-Critic based Demand-Constraint Decoupling Approach. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 3162–3172. [Google Scholar] [CrossRef]
- Zhang, H.; Hao, J.; Li, X. A method for deploying distributed denial of service attack defense strategies on edge servers using reinforcement learning. IEEE Access 2020, 8, 78482–78491. [Google Scholar] [CrossRef]
- Zhao, F.; Jiang, T.; Wang, L. A Reinforcement Learning Driven Cooperative Meta-Heuristic Algorithm for Energy-Efficient Distributed No-Wait Flow-Shop Scheduling with Sequence-Dependent Setup Time. IEEE Trans. Ind. Inform. 2022, 19, 8427–8440. [Google Scholar] [CrossRef]
- Zheng, X.; Chen, Z. An improved deep Q-learning algorithm for a trade-off between energy consumption and productivity in batch scheduling. Comput. Ind. Eng. 2024, 188, 109925. [Google Scholar] [CrossRef]
- Zhang, J.; Liu, Y.; Qin, X.; Xu, X. Energy-Efficient Federated Learning Framework for Digital Twin-Enabled Industrial Internet of Things. In Proceedings of the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC, Helsinki, Finland, 13–16 September 2021; pp. 1160–1166. [Google Scholar] [CrossRef]
- Yang, Z.; Yang, S.; Song, S.; Zhang, W.; Song, R.; Cheng, J.; Li, Y. PackerBot: Variable-Sized Product Packing with Heuristic Deep Reinforcement Learning. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 27 September–1 October 2021; pp. 5002–5008. [Google Scholar] [CrossRef]
- Grumbach, F.; Badr, N.E.A.; Reusch, P.; Trojahn, S. A Memetic Algorithm with Reinforcement Learning for Sociotechnical Production Scheduling. IEEE Access 2023, 11, 68760–68775. [Google Scholar] [CrossRef]
- Kallestad, J.; Hasibi, R.; Hemmati, A.; Sörensen, K. A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems. Eur. J. Oper. Res. 2023, 309, 446–468. [Google Scholar] [CrossRef]
- Liu, C.; Zhu, H.; Tang, D.; Nie, Q.; Zhou, T.; Wang, L.; Song, Y. Probing an intelligent predictive maintenance approach with deep learning and augmented reality for machine tools in IoT-enabled manufacturing. Robot. Comput. Integr. Manuf. 2022, 77, 102357. [Google Scholar] [CrossRef]
- Ran, Y.; Zhou, X.; Hu, H.; Wen, Y. Optimizing Data Center Energy Efficiency via Event-Driven Deep Reinforcement Learning. IEEE Trans. Serv. Comput. 2023, 16, 1296–1309. [Google Scholar] [CrossRef]
- Shafiq, S.; Mayr-Dorn, C.; Mashkoor, A.; Egyed, A. Towards Optimal Assembly Line Order Sequencing with Reinforcement Learning: A Case Study. In Proceedings of the IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, Vienna, Austria, 8–11 September 2020; pp. 982–989. [Google Scholar] [CrossRef]
- Tong, Z.; Liu, B.; Mei, J.; Wang, J.; Peng, X.; Li, K. Data Security Aware and Effective Task Offloading Strategy in Mobile Edge Computing. J. Grid Comput. 2023, 21, 41. [Google Scholar] [CrossRef]
- Wang, X.; Wang, J.; Liu, J. Vehicle to Grid Frequency Regulation Capacity Optimal Scheduling for Battery Swapping Station Using Deep Q-Network. IEEE Trans. Ind. Inform. 2021, 17, 1342–1351. [Google Scholar] [CrossRef]
- Zhang, Z.Q.; Wu, F.C.; Qian, B.; Hu, R.; Wang, L.; Jin, H.P. A Q-learning-based hyper-heuristic evolutionary algorithm for the distributed flexible job-shop scheduling problem with crane transportation. Expert Syst. Appl. 2023, 234, 121050. [Google Scholar] [CrossRef]
- Hou, J.; Chen, G.; Li, Z.; He, W.; Gu, S.; Knoll, A.; Jiang, C. Hybrid Residual Multiexpert Reinforcement Learning for Spatial Scheduling of High-Density Parking Lots. IEEE Trans. Cybern. 2024, 54, 2771–2783. [Google Scholar] [CrossRef]
- Wang, H.; Bai, Y.; Xie, X. Deep Reinforcement Learning Based Resource Allocation in Delay-Tolerance-Aware 5G Industrial IoT Systems. IEEE Trans. Commun. 2024, 72, 209–221. [Google Scholar] [CrossRef]
- Yeh, Y.H.; Chen, S.Y.H.; Chen, H.M.; Tu, D.Y.; Fang, G.Q.; Kuo, Y.C.; Chen, P.Y. DPRoute: Deep Learning Framework for Package Routing. In Proceedings of the 2023 28th Asia and South Pacific Design Automation Conference (ASP-DAC), Tokyo, Japan, 16–19 January 2023; pp. 277–282. [Google Scholar] [CrossRef]
- Perin, G.; Nophut, D.; Badia, L.; Fitzek, F.H. Maximizing Airtime Efficiency for Reliable Broadcast Streams in WMNs with Multi-Armed Bandits. In Proceedings of the 2020 11th IEEE Annual Ubiquitous Computing, Electronics and Mobile Communication Conference, UEMCON 2020, New York, NY, USA, 28–31 October 2020; pp. 472–478. [Google Scholar] [CrossRef]
- Yaakoubi, Y.; Dimitrakopoulos, R. Learning to schedule heuristics for the simultaneous stochastic optimization of mining complexes. Comput. Oper. Res. 2023, 159, 106349. [Google Scholar] [CrossRef]
- Lin, C.C.; Deng, D.J.; Chih, Y.L.; Chiu, H.T. Smart Manufacturing Scheduling with Edge Computing Using Multiclass Deep Q Network. IEEE Trans. Ind. Inform. 2019, 15, 4276–4284. [Google Scholar] [CrossRef]
- Paeng, B.; Park, I.B.; Park, J. Deep Reinforcement Learning for Minimizing Tardiness in Parallel Machine Scheduling with Sequence Dependent Family Setups. IEEE Access 2021, 9, 101390–101401. [Google Scholar] [CrossRef]
- Yang, F.; Tian, J.; Feng, T.; Xu, F.; Qiu, C.; Zhao, C. Blockchain-Enabled Parallel Learning in Industrial Edge-Cloud Network: A Fuzzy DPoSt-PBFT Approach. In Proceedings of the 2021 IEEE Globecom Workshops, GC Wkshps 2021, Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
- Zhao, Y.; Zhang, H. Application of machine learning and rule scheduling in a job-shop production control system. Int. J. Simul. Model. 2021, 20, 410–421. [Google Scholar] [CrossRef]
- Chen, J.; Yi, C.; Wang, R.; Zhu, K.; Cai, J. Learning Aided Joint Sensor Activation and Mobile Charging Vehicle Scheduling for Energy-Efficient WRSN-Based Industrial IoT. IEEE Trans. Veh. Technol. 2023, 72, 5064–5078. [Google Scholar] [CrossRef]
- Dai, B.; Ren, T.; Niu, J.; Hu, Z.; Hu, S.; Qiu, M. A Distributed Computation Offloading Scheduling Framework based on Deep Reinforcement Learning. In Proceedings of the 19th IEEE International Symposium on Parallel and Distributed Processing with Applications, 11th IEEE International Conference on Big Data and Cloud Computing, 14th IEEE International Conference on Social Computing and Networking and 11th IEEE International Conference on Sustainable Computing and Communications (ISPA/BDCloud/SocialCom/SustainCom 2021), New York, NY, USA, 30 September–3 October 2021; pp. 413–420. [Google Scholar] [CrossRef]
- Qu, S.; Wang, J.; Govil, S.; Leckie, J.O. Optimized Adaptive Scheduling of a Manufacturing Process System with Multi-skill Workforce and Multiple Machine Types: An Ontology-based, Multi-agent Reinforcement Learning Approach. Procedia CIRP 2016, 57, 55–60. [Google Scholar] [CrossRef]
- Antuori, V.; Hebrard, E.; Huguet, M.J.; Essodaigui, S.; Nguyen, A. Leveraging Reinforcement Learning, Constraint Programming and Local Search: A Case Study in Car Manufacturing. In Principles and Practice of Constraint Programming; Simonis, H., Ed.; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 657–672. [Google Scholar] [CrossRef]
- Johnson, D.; Chen, G.; Lu, Y. Multi-Agent Reinforcement Learning for Real-Time Dynamic Production Scheduling in a Robot Assembly Cell. IEEE Robot. Autom. Lett. 2022, 7, 7684–7691. [Google Scholar] [CrossRef]
- Rudolf, T.; Flögel, D.; Schürmann, T.; Süß, S.; Schwab, S.; Hohmann, S. ReACT: Reinforcement Learning for Controller Parametrization Using B-Spline Geometries. In Proceedings of the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Honolulu, Oahu, HI, USA, 1–4 October 2023; pp. 3385–3391. [Google Scholar] [CrossRef]
- Sun, M.; Wang, X.; Liu, X.; Wu, S.; Zhou, X.; Ouyang, C.X. A Multi-agent Reinforcement Learning Routing Protocol in Mobile Robot Network. In Proceedings of the 2021 4th International Conference on Information Communication and Signal Processing, ICICSP 2021, Shanghai, China, 24–26 September 2021; pp. 469–475. [Google Scholar] [CrossRef]
- Sun, B.; Theile, M.; Qin, Z.; Bernardini, D.; Roy, D.; Bastoni, A.; Caccamo, M. Edge Generation Scheduling for DAG Tasks using Deep Reinforcement Learning. IEEE Trans. Comput. 2024, 73, 1034–1047. [Google Scholar] [CrossRef]
- Yang, S.; Song, S.; Chu, S.; Song, R.; Cheng, J.; Li, Y.; Zhang, W. Heuristics Integrated Deep Reinforcement Learning for Online 3D Bin Packing. IEEE Trans. Autom. Sci. Eng. 2024, 21, 939–950. [Google Scholar] [CrossRef]
- Zhang, J.; Shuai, T. Online Three-Dimensional Bin Packing: A DRL Algorithm with the Buffer Zone. Found. Comput. Decis. Sci. 2024, 49, 63–74. [Google Scholar] [CrossRef]
- Zhou, Y.; Yan, S.; Peng, M. Content placement with unknown popularity in fog radio access networks. In Proceedings of the IEEE International Conference on Industrial Internet Cloud, ICII 2019, Orlando, FL, USA, 11–12 November 2019; pp. 361–367. [Google Scholar] [CrossRef]
- Chen, S.; Jiang, C.; Li, J.; Xiang, J.; Xiao, W. Improved deep q-network for user-side battery energy storage charging and discharging strategy in industrial parks. Entropy 2021, 23, 1311. [Google Scholar] [CrossRef] [PubMed]
- Ding, L.; Lin, Z.; Yan, G. Multi-agent Deep Reinforcement Learning Algorithm for Distributed Economic Dispatch in Smart Grid. In Proceedings of the IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 18–21 October 2020; pp. 3529–3534. [Google Scholar] [CrossRef]
- Li, J.; Zhou, C.; Liu, J.; Sheng, M.; Zhao, N.; Su, Y. Reinforcement Learning-Based Resource Allocation for Coverage Continuity in High Dynamic UAV Communication Networks. IEEE Trans. Wirel. Commun. 2024, 23, 848–860. [Google Scholar] [CrossRef]
- Tan, Y.; Shen, Y.; Yu, X.; Lu, X. Low-carbon economic dispatch of the combined heat and power-virtual power plants: A improved deep reinforcement learning-based approach. IET Renew. Power Gener. 2023, 17, 982–1007. [Google Scholar] [CrossRef]
- Van Den Bovenkamp, N.; Giraldo, J.S.; Salazar Duque, E.M.; Vergara, P.P.; Konstantinou, C.; Palensky, P. Optimal Energy Scheduling of Flexible Industrial Prosumers via Reinforcement Learning. In Proceedings of the 2023 IEEE Belgrade PowerTech, PowerTech 2023, Belgrade, Serbia, 25–29 June 2023; pp. 1–6. [Google Scholar] [CrossRef]
- Villaverde, B.C.; Rea, S.; Pesch, D. InRout—A QoS aware route selection algorithm for industrial wireless sensor networks. Ad Hoc Netw. 2012, 10, 458–478. [Google Scholar] [CrossRef]
- Xu, J.; Zhu, K.; Wang, R. RF aerially charging scheduling for UAV Fleet: A Q-learning approach. In Proceedings of the 2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks, MSN 2019, Shenzhen, China, 11–13 December 2019; pp. 194–199. [Google Scholar] [CrossRef]
- Ludeke, R.; Heyns, P.S. Towards a Deep Reinforcement Learning based approach for real time decision making and resource allocation for Prognostics and Health Management applications. In Proceedings of the 2023 IEEE International Conference on Prognostics and Health Management, ICPHM 2023, Montreal, QC, Canada, 5–7 June 2023; pp. 20–29. [Google Scholar] [CrossRef]
- Huang, S.; Wang, Z.; Zhou, J.; Lu, J. Planning Irregular Object Packing via Hierarchical Reinforcement Learning. IEEE Robot. Autom. Lett. 2023, 8, 81–88. [Google Scholar] [CrossRef]
- Puche, A.V.; Lee, S. Online 3D Bin Packing Reinforcement Learning Solution with Buffer. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Kyoto, Japan, 23–27 October 2022; pp. 8902–8909. [Google Scholar] [CrossRef]
- Wu, Y.; Yao, L. Research on the Problem of 3D Bin Packing under Incomplete Information Based on Deep Reinforcement Learning. In Proceedings of the 2021 International Conference on E-Commerce and E-Management, ICECEM 2021, Dalian, China, 24–26 September 2021; pp. 38–42. [Google Scholar] [CrossRef]
- Chen, G.; Chen, Y.; Du, J.; Du, L.; Mai, Z.; Hao, C. A Hybrid DRL-Based Adaptive Traffic Matching Strategy for Transmitting and Computing in MEC-Enabled IIoT. IEEE Commun. Lett. 2024, 28, 238–242. [Google Scholar] [CrossRef]
- Ho, T.M.; Nguyen, K.K.; Cheriet, M. Game Theoretic Reinforcement Learning Framework For Industrial Internet of Things. In Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC, Austin, TX, USA, 10–13 April 2022; pp. 2112–2117. [Google Scholar] [CrossRef]
- Li, J.; Wang, R.; Wang, K. Service Function Chaining in Industrial Internet of Things With Edge Intelligence: A Natural Actor-Critic Approach. IEEE Trans. Ind. Inform. 2023, 19, 491–502. [Google Scholar] [CrossRef]
- Ong, K.S.H.; Wang, W.; Niyato, D.; Friedrichs, T. Deep-Reinforcement-Learning-Based Predictive Maintenance Model for Effective Resource Management in Industrial IoT. IEEE Internet Things J. 2022, 9, 5173–5188. [Google Scholar] [CrossRef]
- Akbari, M.; Abedi, M.R.; Joda, R.; Pourghasemian, M.; Mokari, N.; Erol-Kantarci, M. Age of Information Aware VNF Scheduling in Industrial IoT Using Deep Reinforcement Learning. IEEE J. Sel. Areas Commun. 2021, 39, 2487–2500. [Google Scholar] [CrossRef]
- Bao, Q.; Zheng, P.; Dai, S. A digital twin-driven dynamic path planning approach for multiple automatic guided vehicles based on deep reinforcement learning. Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 2024, 238, 488–499. [Google Scholar] [CrossRef]
- Gowri, A.S.; Shanthi Bala, P. An agent based resource provision for IoT through machine learning in Fog computing. In Proceedings of the 2019 IEEE International Conference on System, Computation, Automation and Networking, ICSCAN 2019, Pondicherry, India, 29–30 March 2019; pp. 12–16. [Google Scholar] [CrossRef]
- Li, B.; Zhang, R.; Tian, X.; Zhu, Z. Multi-Agent and Cooperative Deep Reinforcement Learning for Scalable Network Automation in Multi-Domain SD-EONs. IEEE Trans. Netw. Serv. Manag. 2021, 18, 4801–4813. [Google Scholar] [CrossRef]
- Lin, L.; Zhou, W.; Yang, Z.; Liu, J. Deep reinforcement learning-based task scheduling and resource allocation for NOMA-MEC in Industrial Internet of Things. Peer-to-Peer Netw. Appl. 2023, 16, 170–188. [Google Scholar] [CrossRef]
- Liu, M.; Teng, Y.; Yu, F.R.; Leung, V.C.; Song, M. A Deep Reinforcement Learning-Based Transcoder Selection Framework for Blockchain-Enabled Wireless D2D Transcoding. IEEE Trans. Commun. 2020, 68, 3426–3439. [Google Scholar] [CrossRef]
- Liu, C.L.; Chang, C.C.; Tseng, C.J. Actor-critic deep reinforcement learning for solving job shop scheduling problems. IEEE Access 2020, 8, 71752–71762. [Google Scholar] [CrossRef]
- Liu, X.; Wang, G.; Chen, K. Option-Based Multi-Agent Reinforcement Learning for Painting With Multiple Large-Sized Robots. IEEE Trans. Intell. Transp. Syst. 2022, 23, 15707–15715. [Google Scholar] [CrossRef]
- Liu, P.; Wu, Z.; Shan, H.; Lin, F.; Wang, Q.; Wang, Q. Task offloading optimization for AGVs with fixed routes in industrial IoT environment. China Commun. 2023, 20, 302–314. [Google Scholar] [CrossRef]
- Lu, R.; Li, Y.C.; Li, Y.; Jiang, J.; Ding, Y. Multi-agent deep reinforcement learning based demand response for discrete manufacturing systems energy management. Appl. Energy 2020, 276, 115473. [Google Scholar] [CrossRef]
- Peng, Z.; Lin, J. A multi-objective trade-off framework for cloud resource scheduling based on the Deep Q-network algorithm. Clust. Comput. 2020, 23, 2753–2767. [Google Scholar] [CrossRef]
- Thanh, P.D.; Hoan, T.N.K.; Giang, H.T.H.; Koo, I. Packet Delivery Maximization Using Deep Reinforcement Learning-Based Transmission Scheduling for Industrial Cognitive Radio Systems. IEEE Access 2021, 9, 146492–146508. [Google Scholar] [CrossRef]
- Wang, S.; Yuen, C.; Ni, W.; Guan, Y.L.; Lv, T. Multiagent Deep Reinforcement Learning for Cost- and Delay-Sensitive Virtual Network Function Placement and Routing. IEEE Trans. Commun. 2022, 70, 5208–5224. [Google Scholar] [CrossRef]
- Budak, A.F.; Bhansali, P.; Liu, B.; Sun, N.; Pan, D.Z.; Kashyap, C.V. DNN-Opt: An RL Inspired Optimization for Analog Circuit Sizing using Deep Neural Networks. In Proceedings of the Design Automation Conference, San Francisco, CA, USA, 5–9 December 2021; pp. 1219–1224. [Google Scholar] [CrossRef]
- Cao, Z.; Lin, C.; Zhou, M.; Huang, R. Scheduling Semiconductor Testing Facility by Using Cuckoo Search Algorithm with Reinforcement Learning and Surrogate Modeling. IEEE Trans. Autom. Sci. Eng. 2019, 16, 825–837. [Google Scholar] [CrossRef]
- Chalumeau, F.; Coulon, I.; Cappart, Q.; Rousseau, L.M. SeaPearl: A Constraint Programming Solver Guided by Reinforcement Learning. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research; Stuckey, P.J., Ed.; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 392–409. [Google Scholar]
- Lin, C.R.; Cao, Z.C.; Zhou, M.C. Learning-Based Grey Wolf Optimizer for Stochastic Flexible Job Shop Scheduling. IEEE Trans. Autom. Sci. Eng. 2022, 19, 3659–3671. [Google Scholar] [CrossRef]
- Lin, C.R.; Cao, Z.C.; Zhou, M.C. Learning-Based Cuckoo Search Algorithm to Schedule a Flexible Job Shop With Sequencing Flexibility. IEEE Trans. Cybern. 2023, 53, 6663–6675. [Google Scholar] [CrossRef] [PubMed]
- Tang, H.; Xiao, Y.; Zhang, W.; Lei, D.; Wang, J.; Xu, T. A DQL-NSGA-III algorithm for solving the flexible job shop dynamic scheduling problem. Expert Syst. Appl. 2024, 237, 121723. [Google Scholar] [CrossRef]
- Wang, T.; Zhao, J.; Xu, Q.; Pedrycz, W.; Wang, W. A Dynamic Scheduling Framework for Byproduct Gas System Combining Expert Knowledge and Production Plan. IEEE Trans. Autom. Sci. Eng. 2023, 20, 541–552. [Google Scholar] [CrossRef]
- Wang, X.; Yao, H.; Mai, T.; Guo, S.; Liu, Y. Reinforcement Learning-Based Particle Swarm Optimization for End-to-End Traffic Scheduling in TSN-5G Networks. IEEE/ACM Trans. Netw. 2023, 31, 3254–3268. [Google Scholar] [CrossRef]
- Zhao, F.; Zhou, G.; Xu, T.; Zhu, N.; Jonrinaldi. A knowledge-driven cooperative scatter search algorithm with reinforcement learning for the distributed blocking flow shop scheduling problem. Expert Syst. Appl. 2023, 230, 120571. [Google Scholar] [CrossRef]
- Ma, N.; Wang, Z.; Ba, Z.; Li, X.; Yang, N.; Yang, X.; Zhang, H. Hierarchical Reinforcement Learning for Crude Oil Supply Chain Scheduling. Algorithms 2023, 16, 354. [Google Scholar] [CrossRef]
- Goh, S.L.; Kendall, G.; Sabar, N.R. Simulated annealing with improved reheating and learning for the post enrolment course timetabling problem. J. Oper. Res. Soc. 2019, 70, 873–888. [Google Scholar] [CrossRef]
- Fairee, S.; Khompatraporn, C.; Prom-on, S.; Sirinaovakul, B. Combinatorial Artificial Bee Colony Optimization with Reinforcement Learning Updating for Travelling Salesman Problem. In Proceedings of the 2019 16th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), Pattaya, Thailand, 10–13 July 2019; pp. 93–96. [Google Scholar] [CrossRef]
- Durst, P.; Jia, X.; Li, L. Multi-Objective Optimization of AGV Real-Time Scheduling Based on Deep Reinforcement Learning. In Proceedings of the 42nd Chinese Control Conference, Tianjin, China, 24–26 July 2023; pp. 5535–5540. [Google Scholar] [CrossRef]
- Wang, L.; Yang, C.; Wang, X.; Li, J.; Wang, Y.; Wang, Y. Integrated Resource Scheduling for User Experience Enhancement: A Heuristically Accelerated DRL. In Proceedings of the 2019 11th International Conference on Wireless Communications and Signal Processing, WCSP 2019, Xi’an, China, 23–25 October 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Carvalho, J.P.; Dimitrakopoulos, R. Integrating short-term stochastic production planning updating with mining fleet management in industrial mining complexes: An actor-critic reinforcement learning approach. Appl. Intell. 2023, 53, 23179–23202. [Google Scholar] [CrossRef]
- Soman, R.K.; Molina-Solana, M. Automating look-ahead schedule generation for construction using linked-data based constraint checking and reinforcement learning. Autom. Constr. 2022, 134, 104069. [Google Scholar] [CrossRef]
- Wei, W.; Fu, L.; Gu, H.; Zhang, Y.; Zou, T.; Wang, C.; Wang, N. GRL-PS: Graph embedding-based DRL approach for adaptive path selection. IEEE Trans. Netw. Serv. Manag. 2023, 20, 2639–2651. [Google Scholar] [CrossRef]
- Zhang, P.; Wang, C.; Kumar, N.; Liu, L. Space-Air-Ground Integrated Multi-Domain Network Resource Orchestration Based on Virtual Network Architecture: A DRL Method. IEEE Trans. Intell. Transp. Syst. 2022, 23, 2798–2808. [Google Scholar] [CrossRef]
- Zhang, Z.Q.; Qian, B.; Hu, R.; Yang, J.B. Q-learning-based hyper-heuristic evolutionary algorithm for the distributed assembly blocking flowshop scheduling problem. Appl. Soft Comput. 2023, 146, 110695. [Google Scholar] [CrossRef]
- Chen, R.; Li, W.; Yang, H. A Deep Reinforcement Learning Framework Based on an Attention Mechanism and Disjunctive Graph Embedding for the Job-Shop Scheduling Problem. IEEE Trans. Ind. Inform. 2023, 19, 1322–1331. [Google Scholar] [CrossRef]
- Elsayed, E.K.; Elsayed, A.K.; Eldahshan, K.A. Deep Reinforcement Learning-Based Job Shop Scheduling of Smart Manufacturing. Comput. Mater. Contin. 2022, 73, 5103–5120. [Google Scholar] [CrossRef]
- Farahani, A.; Elzakker, M.V.; Genga, L.; Troubil, P.; Dijkman, R. Relational Graph Attention-Based Deep Reinforcement Learning: An Application to Flexible Job Shop Scheduling with Sequence-Dependent Setup Times. In Proceedings of the 17th International Conference, LION 17, Nice, France, 4–8 June 2023; pp. 150–164. [Google Scholar] [CrossRef]
- Gan, X.M.; Zuo, Y.; Zhang, A.S.; Li, S.B.; Tao, F. Digital twin-enabled adaptive scheduling strategy based on deep reinforcement learning. Sci. China Technol. Sci. 2023, 66, 1937–1951. [Google Scholar] [CrossRef]
- Huang, J.P.; Gao, L.; Li, X.Y. An end-to-end deep reinforcement learning method based on graph neural network for distributed job-shop scheduling problem. Expert Syst. Appl. 2024, 238, 121756. [Google Scholar] [CrossRef]
- Lee, J.H.; Kim, H.J. Imitation Learning for Real-Time Job Shop Scheduling Using Graph-Based Representation. In Proceedings of the 2022 Winter Simulation Conference, Singapore, 11–14 December 2022; pp. 3285–3296. [Google Scholar] [CrossRef]
- Liu, C.L.; Huang, T.H. Dynamic Job-Shop Scheduling Problems Using Graph Neural Network and Deep Reinforcement Learning. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 6836–6848. [Google Scholar] [CrossRef]
- Zhao, X.; Song, W.; Li, Q.; Shi, H.; Kang, Z.; Zhang, C. A Deep Reinforcement Learning Approach for Resource-Constrained Project Scheduling. In Proceedings of the 2022 IEEE Symposium Series on Computational Intelligence, SSCI 2022, Singapore, 4–7 December 2022; pp. 1226–1234. [Google Scholar] [CrossRef]
- Chilukuri, S.; Pesch, D. RECCE: Deep Reinforcement Learning for Joint Routing and Scheduling in Time-Constrained Wireless Networks. IEEE Access 2021, 9, 132053–132063. [Google Scholar] [CrossRef]
- Vijayalakshmi, V.; Saravanan, M. Reinforcement learning-based multi-objective energy-efficient task scheduling in fog-cloud industrial IoT-based systems. Soft Comput. 2023, 27, 17473–17491. [Google Scholar] [CrossRef]
- Wang, C.; Shen, X.; Wang, H.; Xie, W.; Mei, H.; Zhang, H. Q Learning-Based Routing Protocol with Accelerating Convergence for Underwater Wireless Sensor Networks. IEEE Sens. J. 2024, 24, 11562–11573. [Google Scholar] [CrossRef]
- Yan, Z.; Du, H.; Zhang, J.; Li, G. Cherrypick: Solving the Steiner Tree Problem in Graphs using Deep Reinforcement Learning. In Proceedings of the 16th IEEE Conference on Industrial Electronics and Applications, ICIEA 2021, Chengdu, China, 1–4 August 2021; pp. 35–40. [Google Scholar] [CrossRef]
- Yuan, Y.; Li, H.; Ji, L. Application of Deep Reinforcement Learning Algorithm in Uncertain Logistics Transportation Scheduling. Comput. Intell. Neurosci. 2021, 2021, 5672227. [Google Scholar] [CrossRef] [PubMed]
- Zhong, C.; Jia, H.; Wan, H.; Zhao, X. DRLS: A Deep Reinforcement Learning Based Scheduler for Time-Triggered Ethernet. In Proceedings of the International Conference on Computer Communications and Networks, ICCCN, Athens, Greece, 19–22 July 2021; pp. 1–11. [Google Scholar] [CrossRef]
- Chen, H.; Hsu, K.C.; Turner, W.J.; Wei, P.H.; Zhu, K.; Pan, D.Z.; Ren, H. Reinforcement Learning Guided Detailed Routing for Custom Circuits. Proc. Int. Symp. Phys. Des. 2023, 1, 26–34. [Google Scholar] [CrossRef]
- Da Costa, P.; Zhang, Y.; Akcay, A.; Kaymak, U. Learning 2-opt Local Search from Heuristics as Expert Demonstrations. In Proceedings of the International Joint Conference on Neural Networks, Shenzhen, China, 18–22 July 2021; pp. 1–8. [Google Scholar] [CrossRef]
- He, X.; Zhuge, X.; Dang, F.; Xu, W.; Yang, Z. DeepScheduler: Enabling Flow-Aware Scheduling in Time-Sensitive Networking. In Proceedings of the IEEE INFOCOM 2023—IEEE Conference on Computer Communications, New York, NY, USA, 17–20 May 2023; pp. 1–10. [Google Scholar] [CrossRef]
- Wu, Y.; Zhou, J.; Xia, Y.; Zhang, X.; Cao, Z.; Zhang, J. Neural Airport Ground Handling. IEEE Trans. Intell. Transp. Syst. 2023, 24, 15652–15666. [Google Scholar] [CrossRef]
- Baek, J.; Kaddoum, G. Online Partial Offloading and Task Scheduling in SDN-Fog Networks with Deep Recurrent Reinforcement Learning. IEEE Internet Things J. 2022, 9, 11578–11589. [Google Scholar] [CrossRef]
- Elsayed, M.; Erol-Kantarci, M. Deep Reinforcement Learning for Reducing Latency in Mission Critical Services. In Proceedings of the 2018 IEEE Global Communications Conference, GLOBECOM 2018, Abu Dhabi, United Arab Emirates, 9–13 December 2018. [Google Scholar] [CrossRef]
- Servadei, L.; Lee, J.H.; Medina, J.A.; Werner, M.; Hochreiter, S.; Ecker, W.; Wille, R. Deep Reinforcement Learning for Optimization at Early Design Stages. IEEE Des. Test 2023, 40, 43–51. [Google Scholar] [CrossRef]
- Solozabal, R.; Ceberio, J.; Sanchoyerto, A.; Zabala, L.; Blanco, B.; Liberal, F. Virtual Network Function Placement Optimization with Deep Reinforcement Learning. IEEE J. Sel. Areas Commun. 2020, 38, 292–303. [Google Scholar] [CrossRef]
- Zou, Y.; Wu, H.; Yin, Y.; Dhamotharan, L.; Chen, D.; Tiwari, A.K. An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem. Ann. Oper. Res. 2024, 339, 517–536. [Google Scholar] [CrossRef]
- Ahmed, B.S.; Enoiu, E.; Afzal, W.; Zamli, K.Z. An evaluation of Monte Carlo-based hyper-heuristic for interaction testing of industrial embedded software applications. Soft Comput. 2020, 24, 13929–13954. [Google Scholar] [CrossRef]
- Li, Y.; Fadda, E.; Manerba, D.; Tadei, R.; Terzo, O. Reinforcement Learning Algorithms for Online Single-Machine Scheduling. In Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, FedCSIS 2020, Sofia, Bulgaria, 6–9 September 2020; Volume 21, pp. 277–283. [Google Scholar] [CrossRef]
- Ma, X.; Xu, H.; Gao, H.; Bian, M.; Hussain, W. Real-Time Virtual Machine Scheduling in Industry IoT Network: A Reinforcement Learning Method. IEEE Trans. Ind. Inform. 2023, 19, 2129–2139. [Google Scholar] [CrossRef]
- Wang, Y.; Wang, K.; Huang, H.; Miyazaki, T.; Guo, S. Traffic and Computation Co-Offloading with Reinforcement Learning in Fog Computing for Industrial Applications. IEEE Trans. Ind. Inform. 2019, 15, 976–986. [Google Scholar] [CrossRef]
- Xu, Z.; Han, G.; Liu, L.; Martinez-Garcia, M.; Wang, Z. Multi-energy scheduling of an industrial integrated energy system by reinforcement learning-based differential evolution. IEEE Trans. Green Commun. Netw. 2021, 5, 1077–1090. [Google Scholar] [CrossRef]
- Jiang, T.; Zeng, B.; Wang, Y.; Yan, W. A New Heuristic Reinforcement Learning for Container Relocation Problem. J. Phys. Conf. Ser. 2021, 1873, 012050. [Google Scholar] [CrossRef]
- De Mars, P.; O’Sullivan, A. Applying reinforcement learning and tree search to the unit commitment problem. Appl. Energy 2021, 302, 117519. [Google Scholar] [CrossRef]
- Revadekar, A.; Soni, R.; Nimkar, A.V. QORAl: Q Learning based Delivery Optimization for Pharmacies. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies, ICCCNT 2020, Kharagpur, India, 1–3 July 2020. [Google Scholar] [CrossRef]
- Kuhnle, A.; Kaiser, J.P.; Theiß, F.; Stricker, N.; Lanza, G. Designing an adaptive production control system using reinforcement learning. J. Intell. Manuf. 2021, 32, 855–876. [Google Scholar] [CrossRef]
- Arredondo, F.; Martinez, E. Learning and adaptation of a policy for dynamic order acceptance in make-to-order manufacturing. Comput. Ind. Eng. 2010, 58, 70–83. [Google Scholar] [CrossRef]
- Guan, W.; Zhang, H.; Leung, V.C. Customized Slicing for 6G: Enforcing Artificial Intelligence on Resource Management. IEEE Netw. 2021, 35, 264–271. [Google Scholar] [CrossRef]
- Kan, H.; Shuai, L.; Chen, H.; Zhang, W. Automated Guided Logistics Handling Vehicle Path Routing under Multi-Task Scenarios. In Proceedings of the 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020, Beijing, China, 13–16 October 2020; pp. 1173–1177. [Google Scholar] [CrossRef]
- Ghaleb, M.; Namoura, H.A.; Taghipour, S. Reinforcement Learning-based Real-time Scheduling under Random Machine Breakdowns and Other Disturbances: A Case Study. In Proceedings of the Annual Reliability and Maintainability Symposium, Orlando, FL, USA, 24–27 May 2021; pp. 1–8. [Google Scholar] [CrossRef]
- Hu, H.; Jia, X.; He, Q.; Fu, S.; Liu, K. Deep reinforcement learning based AGVs real-time scheduling with mixed rule for flexible shop floor in industry 4.0. Comput. Ind. Eng. 2020, 149, 106749. [Google Scholar] [CrossRef]
- Luo, S.; Zhang, L.; Fan, Y. Real-Time Scheduling for Dynamic Partial-No-Wait Multiobjective Flexible Job Shop by Deep Reinforcement Learning. IEEE Trans. Autom. Sci. Eng. 2022, 19, 3020–3038. [Google Scholar] [CrossRef]
- Wu, J.; Zhang, G.; Nie, J.; Peng, Y.; Zhang, Y. Deep Reinforcement Learning for Scheduling in an Edge Computing-Based Industrial Internet of Things. In Wireless Communications and Mobile Computing; John Wiley and Sons Ltd.: Hoboken, NJ, USA, 2021. [Google Scholar]
- Song, Q.; Lei, S.; Sun, W.; Zhang, Y. Adaptive federated learning for digital twin driven industrial internet of things. In Proceedings of the IEEE Wireless Communications and Networking Conference, WCNC, Nanjing, China, 29 March–1 April 2021. [Google Scholar] [CrossRef]
- Li, Y.; Liao, C.; Wang, L.; Xiao, Y.; Cao, Y.; Guo, S. A Reinforcement Learning-Artificial Bee Colony algorithm for Flexible Job-shop Scheduling Problem with Lot Streaming. Appl. Soft Comput. 2023, 146, 110658. [Google Scholar] [CrossRef]
- Naghibi-Sistani, M.B.; Akbarzadeh-Tootoonchi, M.R.; Javidi-Dashte Bayaz, M.H.; Rajabi-Mashhadi, H. Application of Q-learning with temperature variation for bidding strategies in market based power systems. Energy Convers. Manag. 2006, 47, 1529–1538. [Google Scholar] [CrossRef]
- Kunzel, G.; Indrusiak, L.S.; Pereira, C.E. Latency and Lifetime Enhancements in Industrial Wireless Sensor Networks: A Q-Learning Approach for Graph Routing. IEEE Trans. Ind. Inform. 2020, 16, 5617–5625. [Google Scholar] [CrossRef]
- Lu, H.; Zhang, X.; Yang, S. A Learning-based Iterative Method for Solving Vehicle Routing Problems. In Proceedings of the International Conference on Learning Representations, Virtual, 26 April–1 May 2020. [Google Scholar]
- Nain, Z.; Musaddiq, A.; Qadri, Y.A.; Nauman, A.; Afzal, M.K.; Kim, S.W. RIATA: A Reinforcement Learning-Based Intelligent Routing Update Scheme for Future Generation IoT Networks. IEEE Access 2021, 9, 81161–81172. [Google Scholar] [CrossRef]
- Zheng, K.; Luo, R.; Liu, X.; Qiu, J.; Liu, J. Distributed DDPG-Based Resource Allocation for Age of Information Minimization in Mobile Wireless-Powered Internet of Things. IEEE Internet Things J. 2024, 11, 29102–29115. [Google Scholar] [CrossRef]
- Liu, X.; Xu, J.; Zheng, K.; Zhang, G.; Liu, J.; Shiratori, N. Throughput Maximization with an AoI Constraint in Energy Harvesting D2D-enabled Cellular Networks: An MSRA-TD3 Approach. IEEE Trans. Wirel. Commun. 2024; early access. [Google Scholar] [CrossRef]
| Reference | Type | Methods | Area | Objective |
| --- | --- | --- | --- | --- |
| [2] | Survey | Soft computing | Wireless sensor networks | Approach overview |
| [3] | Review | ML | Smart energy and electric power systems | Approach overview |
| [4] | Review | DRL and evolutionary | Job shop scheduling | Overview |
| [5] | Review | ML | 5G wireless communications | Potential solutions for area |
| [6] | Vision article | RL and digital twins | Maintenance | Approach overview |
| [7] | Review | RL | Multiple topics | MDP, RL algorithms and theory |
| [8] | Survey | RL | Software-defined network routing | Identifying and analyzing recent studies |
| [9] | Survey | RL and DRL | IoT communication and networking | Application analysis |
| [10] | Survey | DRL | Traffic engineering, routing and congestion | Application and approach overview |
| [11] | Review | RL | Production planning and control | Characteristics, algorithms and tools |
| [12] | Review | DRL | Intelligent manufacturing | DRL applicability versus alternatives |
| [13] | Review | RL and DRL | Maintenance planning and optimization | Application taxonomy |
| Format | Content | References | Insights |
| --- | --- | --- | --- |
| Entity lists | Averages | [16,18,27,34,35,37,42,78,203,264] | Compact and simple, loses individual details |
| | Per resource | [20,35,37,74,95,101,108,122,127,169,205,226,247,265] | Granular details but less scalable |
| Spatial representations | Matrices | [22,70,71,76,112,120,152,166,167,172,193,194,195,208,210,212,221,228,232,233,234,266,267] | For structured environments and spatial reasoning |
| | Heightmaps | [195,221,222,232,233,234] | Capture 3D variations in a 2D representation (see the sketch after this table) |
| | Convolutional approaches | [22,61,173,181,198,205,232,234,266] | Automatic feature extraction |
| Graph solutions | Undirected and directed | [19,21,142,219,220,228,246,254,268,269,270] | Symmetric and asymmetric relational dependencies |
| | Disjunctive graphs | [49,78,80,117,130,271,272,273,274,275,276,277,278] | Complex relationships and multi-entity interactions |
| | Graph node features | [81,219,229,246,261,268,278,279,280,281,282,283] | Per-entity attributes |
| | Graph edge features | [19,21,113,136,150,220,229,239,273,284] | Capture relationships and dependencies between entities |
| | Graph NN | [21,70,78,80,130,142,188,219,246,254,261,268,269,270,271,274,275,277,278,279,283,285,286,287,288] | Process graph states, improving generalization at computational cost |
| Variable-sized | | [141,179,181,220] | Flexible, adapts to dynamic state spaces |
| Recurrent neural networks | | [20,47,65,104,115,136,156,178,184,198,238,243,271,288,289,290,291,292,293] | Capture temporal dependencies in sequential decisions |
| Fuzzy | | [83,246,258] | Model uncertainty, useful for imprecise states |
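To make the heightmap row above concrete, the following minimal Python sketch shows a heightmap state for online 3D bin packing: the container's top surface is encoded as a 2D grid of column heights, and each placement raises the covered columns. The class, its methods, and the container dimensions are illustrative assumptions, not code from any reviewed paper.

```python
import numpy as np

# Hypothetical sketch of a heightmap state for online 3D bin packing.
# All names and dimensions are illustrative assumptions.

class HeightmapState:
    """Encodes the container's top surface as a 2D grid of column heights."""

    def __init__(self, length: int, width: int):
        # heights[x, y] = tallest occupied level at grid cell (x, y)
        self.heights = np.zeros((length, width), dtype=np.int32)

    def place(self, x: int, y: int, dx: int, dy: int, dz: int) -> None:
        """Drop a dx*dy*dz box with its corner at grid cell (x, y)."""
        footprint = self.heights[x:x + dx, y:y + dy]   # view into the grid
        base = footprint.max()        # box rests on the tallest column below it
        footprint[:] = base + dz      # every covered column rises to the new top

    def observation(self, next_box: tuple[int, int, int]) -> np.ndarray:
        """Flatten the heightmap and append the incoming box dimensions,
        yielding a fixed-size vector for the agent."""
        return np.concatenate([self.heights.ravel(), np.array(next_box)])

state = HeightmapState(length=10, width=10)
state.place(x=0, y=0, dx=4, dy=3, dz=2)    # first box on the floor
state.place(x=2, y=1, dx=3, dy=3, dz=1)    # overlaps the first -> stacks on top
print(state.observation(next_box=(2, 2, 2)).shape)  # (103,)
```

The raveled heightmap can be consumed as a flat vector, or the 2D grid can be fed to a convolutional network instead, which is why the convolutional approaches in the table pair naturally with matrix and heightmap states.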
| RL Role | Advantages | Disadvantages | Examples |
| --- | --- | --- | --- |
| Tabular methods | Simple implementation. Explainable results. | Limited state representation. Must sufficiently explore all states. | [45,166,175,294,296] |
| Iterative list selection | Very flexible. Widespread literature adoption. | Single-use actions. Often requires action masking (see the sketch after this table). | [31,40,49,217,232] |
| Hybrid approaches | Simplify agents’ decisions. Enhance other methods with RL decision making. | RL only optimizes a subset. External methods or tools might be insufficient to optimize. | [18,123,184,254,258] |
| Specific neural network models | Variable-sized inputs and outputs. Relative positions of pixels, nodes and tokens provide extra context. | Frameworks are harder to train. Scalability issues. | [141,188,232,238,271] |
| Multi-agent | Agents can make simpler decisions. Specialized agents for smaller tasks. | More complex frameworks. Decentralization requires communication protocols. | [49,116,217,246,251] |
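As a complement to the iterative list selection row, this hedged Python sketch illustrates the action-masking pattern it mentions: single-use candidates are masked out before the greedy choice, so the policy can only pick feasible entities. All names and values are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of action masking for iterative list selection.
# Each candidate (e.g., a job to dispatch) is a single-use action that
# must be excluded from the choice once taken.

def masked_greedy_action(q_values: np.ndarray, feasible: np.ndarray) -> int:
    """Pick the feasible action with the highest Q-value.

    q_values : per-candidate action values produced by the agent
    feasible : boolean mask, False for actions already consumed
    """
    masked = np.where(feasible, q_values, -np.inf)  # infeasible -> never selected
    return int(np.argmax(masked))

# One greedy rollout over a 5-job candidate list.
q = np.array([0.7, 1.2, 0.3, 0.9, 0.5])
feasible = np.ones_like(q, dtype=bool)
schedule = []
for _ in range(len(q)):
    a = masked_greedy_action(q, feasible)
    schedule.append(a)
    feasible[a] = False  # action consumed; masked for the remaining steps
print(schedule)  # [1, 3, 0, 4, 2]
```

In practice the Q-values would be re-estimated from the updated state after every selection; the fixed vector here only keeps the masking mechanics visible.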