1. Introduction
The Flexible Job Shop Scheduling Problem (FJSP) is a non-deterministic polynomial (NP)-hard problem that extends the classic job shop scheduling problem (JSP) [1]. In FJSP, each operation of a job can be processed on any of several machines, so the assignment of operations to machines is not fixed in advance [2]. With the diversification of user needs, flexible job-shops face frequent insertion of new jobs. When a new job is inserted at a certain point, reducing the makespan requires dynamically forming a new scheduling scheme by reasonably arranging the processing sequence of the new job and the old jobs, as well as the assignment of operations to machines, so as to improve economic efficiency and machine utilization. This is the dynamic FJSP (DFJSP) studied in this paper, which is a further study of FJSP. As with other combinatorial optimization problems, DFJSP can be solved by an efficient meta-heuristic algorithm. In addition, reinforcement learning (RL) and other learning-based algorithms can be used to tune the meta-heuristic, which improves the accuracy of flexible job-shop scheduling and yields a better scheduling scheme.
With the continuous development of computer technology, many intelligent algorithms have been applied to combinatorial optimization problems, such as the particle swarm optimization (PSO) algorithm [3,4], genetic algorithm (GA) [5,6], artificial fish swarm algorithm (AFSA) [7], Bayesian algorithm [8], ant colony optimization (ACO) algorithm [9], gray wolf optimization (GWO) algorithm [10], lion swarm algorithm [11], and artificial bee colony (ABC) algorithm [12,13]. Sabharwal et al. [14] proposed an improved GA with excellent performance, which generated the initial population probabilistically and avoided the error caused by the interaction between the input parameters, solving the combinatorial optimization problem well. Wang et al. [15] optimized the ACO algorithm by changing the pheromone update mechanism to minimize the makespan of the FJSP, which overcame the tendency of the ACO algorithm to fall into local optima and improved its computational efficiency. Yao et al. [16] improved the population state of the GWO algorithm and proposed the IGWO algorithm through a position-based learning strategy, which enhanced the global search ability of the GWO algorithm. Wang et al. [17] proposed an adaptive multi-objective PSO algorithm with cost and tardiness as objective functions. The algorithm adopts an elite strategy and a small-probability mutation mechanism to avoid premature convergence of PSO and improve the convergence accuracy. Ge et al. [7] improved the AFSA with the goal of minimizing the makespan, and increased the diversity of the population by adjusting the arrangement mechanism of machines and operations. The algorithm improves its local and global search ability through attracting behavior and a path search strategy, which improves its performance in solving FJSP. Park et al. [18] applied the same crossover strategy to the categorical part and the sequential part, and proposed a unified GA, which simplified the structure of GA and effectively explored the search space of FJSP. The above literature shows that the key parameters of most algorithms are set in advance, and that their adjustment methods are difficult to determine reasonably [19]. In addition, some improved algorithms have too many input parameters, and their stability needs further research and verification [20].
The artificial bee colony (ABC) algorithm was proposed by Karaboga et al. [21] in 2005. Owing to its strong stability and few parameters, the ABC algorithm has received extensive attention from researchers [20]. For more than ten years, many researchers have applied the ABC algorithm and its improved variants to FJSP with practical results. Li et al. [22] proposed an improved ABC algorithm using a two-dimensional vector coding method, aiming at the minimum energy consumption and the shortest makespan and taking into account important factors such as job preparation time, which improved the local and global search capabilities of the ABC algorithm. Pan et al. [23] developed an adaptive strategy aiming at the minimum total earliness and tardiness, which improved the diversity of the population and enhanced the local reinforcement ability, and solved the job-shop scheduling problem using the discrete artificial bee colony (DABC) algorithm. Meng et al. [24] proposed a hybrid ABC algorithm with the goal of minimizing the total flowtime, which balances local and global search by dynamically adjusting the search range and increases the diversity of the population. Zhang et al. [25] improved the ABC algorithm to solve stochastic FJSP, obtaining the solution with the shortest lateness by screening the solutions and reducing the computational load with the K-armed bandit model. Zheng et al. [26] optimized the population of the ABC algorithm by using chaos theory, a left-shift strategy, and a crossover operation with the goal of minimum makespan. This method accelerates the convergence of the algorithm, improves its global search ability, and solves the fuzzy FJSP. Gu [27] used an adaptive neighborhood search strategy and a greedy method to optimize the population of ABC and retain the optimal solution, which prevented the loss of the optimal solution and solved the multi-objective low-carbon FJSP. However, the above studies show that most of the parameters of the improved ABC algorithms are set in advance or left unchanged, and are not adjusted as the population state changes, which limits the performance improvement of the ABC algorithm. The Q-learning algorithm, as a reinforcement learning algorithm, focuses on online learning and can maintain a balance between exploration and exploitation. It learns and updates its parameters from the rewards that the environment returns for its actions [28]. Therefore, by virtue of its learning ability, the Q-learning algorithm makes it possible to dynamically adjust the update dimension of the ABC algorithm. In addition, in view of the frequent insertion of new jobs in actual flexible job-shops, it is necessary to study an algorithm that solves the dynamic scheduling problem of the job-shop.
To address the problems that the update dimension of the ABC algorithm cannot be dynamically adjusted and that new jobs are frequently inserted into the flexible job-shop, this paper proposes the DSLABC algorithm with the goal of minimizing the makespan. Firstly, to dynamically adjust the update dimension m, this paper combines the ABC algorithm with the Q-learning algorithm and proposes the SLABC algorithm. In each iteration of the ABC algorithm, the Q-learning algorithm uses its balance of exploration and exploitation to select an appropriate update dimension according to the population state, which realizes the dynamic adjustment of the update dimension and improves the convergence accuracy of the ABC algorithm. Secondly, this paper determines the specific method of dynamic scheduling and proposes the DSLABC algorithm. When a new job is inserted at a certain time, the DSLABC algorithm reschedules the new job together with the operations that have not started processing, and dynamically generates a new scheduling scheme, which solves the rescheduling problem in the flexible job-shop and reduces the makespan. Finally, this paper compares the DSLABC algorithm with other meta-heuristic algorithms, and demonstrates its convergence performance and advantages by solving 10 Brandimarte instances. The feasibility of the DSLABC algorithm for solving DFJSP is demonstrated through a specific instance. In conclusion, the DSLABC algorithm proposed in this paper realizes the dynamic adjustment of the update dimension, improves the convergence accuracy of the ABC algorithm, and solves the DFJSP.
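The idea of letting Q-learning pick the update dimension m in each ABC iteration can be illustrated with a minimal Python sketch. It is not the DSLABC implementation: the state labels, the reward value, the parameters alpha, gamma, and epsilon, the epsilon-greedy rule, and the real-valued neighbour move are all assumptions made for illustration, since the actual state, reward, and action-selection equations are defined in Section 4.

```python
import random

def choose_dimension(q_table, state, actions, epsilon, rng):
    """Epsilon-greedy choice of the update dimension m (assumed selection rule)."""
    if rng.random() < epsilon:
        return rng.choice(actions)                      # explore a random m
    return max(actions, key=lambda a: q_table[state][a])  # exploit the best-known m

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update of Q(state, action)."""
    best_next = max(q_table[next_state].values())
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error

def neighbor(solution, partner, m, rng):
    """Basic ABC-style move applied to m randomly chosen dimensions.

    Uses a real-valued vector for simplicity; an FJSP encoding would need
    a discrete move instead (assumption for illustration only).
    """
    candidate = list(solution)
    for j in rng.sample(range(len(solution)), m):
        phi = rng.uniform(-1.0, 1.0)
        candidate[j] = solution[j] + phi * (solution[j] - partner[j])
    return candidate

# Toy run with hypothetical population states and candidate dimensions.
rng = random.Random(0)
actions = [1, 2, 3]
q_table = {s: {a: 0.0 for a in actions} for s in ("improving", "stagnating")}

m = choose_dimension(q_table, "stagnating", actions, epsilon=0.1, rng=rng)
trial = neighbor([0.2, 0.5, 0.9, 0.4], [0.6, 0.1, 0.3, 0.8], m, rng)
q_update(q_table, "stagnating", m, reward=1.0, next_state="improving")
```

In this sketch the Q-table row for the current population state is the only thing that decides m, so a dimension that led to improvement in a given state becomes more likely to be chosen the next time that state occurs.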
The remainder of this paper is organized as follows. Section 2 introduces the flexible job shop scheduling problem with new job insertion, including the mathematical representation, constraint conditions, and coding of the problem. Section 3 introduces the two basic algorithms on which the DSLABC algorithm is built: the ABC and Q-learning algorithms. Section 4 introduces the flow and specific settings of the DSLABC algorithm. Section 5 verifies the effectiveness and practicability of the DSLABC algorithm. Section 6 concludes the paper.
6. Conclusions
Given the frequent insertion of new jobs in flexible job-shops, this paper determines a specific method of job-shop rescheduling and verifies it by experiments. First, a dynamic scheduling model is proposed. By comparing the insertion time with the start time of each operation in the original scheduling scheme, the operations that need to be rescheduled are screened out, and the coding method between the operations and the machines is designed. Second, the DSLABC algorithm for dynamic scheduling is proposed, which combines the ABC and Q-learning algorithms. The specific equations for the state, reward, and action selection strategy are designed. Through the Q-value table, the algorithm selects an appropriate action according to the specific state of the population during each iteration, which improves the convergence accuracy and stability of the algorithm. Finally, a comparison of 11 algorithms on the Brandimarte instances verifies that the SLABC algorithm (i.e., DSLABC) has good convergence accuracy and stability. The dynamic scheduling of specific examples verifies the effectiveness of the DSLABC algorithm in solving dynamic problems, and provides a new high-precision method for the dynamic scheduling of flexible job-shops.
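The screening step described above, which compares the insertion time with the start time of each operation, can be sketched in Python as follows. The `Op` record, the function names, the toy schedule, and the tie rule (an operation starting exactly at the insertion time is treated as not yet started) are assumptions for illustration rather than the paper's formulation.

```python
from collections import namedtuple

# Hypothetical record for one scheduled operation.
Op = namedtuple("Op", "job index machine start end")

def split_schedule(schedule, insert_time):
    """Separate operations that keep their slot from those to be rescheduled."""
    fixed = [op for op in schedule if op.start < insert_time]     # finished or in progress
    pending = [op for op in schedule if op.start >= insert_time]  # not started yet
    return fixed, pending

def machine_ready_times(fixed, machines, insert_time):
    """Earliest time each machine can accept rescheduled work."""
    ready = {m: insert_time for m in machines}
    for op in fixed:
        ready[op.machine] = max(ready[op.machine], op.end)
    return ready

# Toy original schedule: two jobs, two machines.
schedule = [
    Op("J1", 1, "M1", 0, 4),
    Op("J2", 1, "M2", 0, 3),
    Op("J1", 2, "M2", 4, 9),   # in progress when the new job arrives
    Op("J2", 2, "M1", 6, 8),   # not started when the new job arrives
]

fixed, pending = split_schedule(schedule, insert_time=5)
ready = machine_ready_times(fixed, ["M1", "M2"], insert_time=5)
```

Here only the second operation of J2 would be rescheduled together with the operations of the new job; M1 is free from time 5, while M2 stays occupied until time 9 by the operation already in progress.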
The novelties of this paper are: 1. Since the update dimension of the ABC algorithm cannot be adjusted dynamically, this paper uses the Q-learning algorithm to let the ABC algorithm obtain a suitable update dimension in each iteration, which improves the convergence accuracy of the ABC algorithm. 2. Since real flexible job-shops face the insertion of new jobs, this paper reschedules the new job and the operations that have not started processing to generate a new scheduling scheme. The proposed DSLABC algorithm realizes the dynamic scheduling of the flexible job-shop, and its effectiveness and advantages are demonstrated by experiments.
The next research plan is as follows: 1. Research on dynamic scheduling with multiple jobs and multi-period insertion to further broaden the application scope of the DSLABC algorithm. 2. Research on scheduling problems with multiple objectives, such as priority and energy consumption. 3. Verification of the algorithm on platforms such as Python. 4. Application of the algorithm to an actual flexible job-shop to verify its practical performance.