**1. Introduction**

The increasing demands in the logistics industry all over the world have driven researchers and engineers to focus on developing intelligent transportation systems aimed at enhancing logistics efficiency [1,2]. One prominent application in this domain is unmanned warehouse systems [3]. As a typical component in an unmanned warehouse, an autonomous forklift transports parcels more efficiently than one driven manually because the former does not induce objective mistakes such as fatigue, anxiety, impatience, or anger [4]. Multiple autonomous forklifts should work together when the delivery burden is heavy [5]. Deploying multiple autonomous forklifts enhances delivery efficiency if the inter-forklift cooperation potential can be maximized. The typical modules that influence delivery efficiency include delivery task dispatch [6], cooperative trajectory planning [7,8], and control [9]. This paper is focused on the dispatching and cooperative trajectory planning schemes.

**Citation:** Zhang, T.; Li, H.; Fang, Y.; Luo, M.; Cao, K. Joint Dispatching and Cooperative Trajectory Planning for Multiple Autonomous Forklifts in a Warehouse: A Search-and-Learning-Based Approach. *Electronics* **2023**, *12*, 3820. https://doi.org/10.3390/ electronics12183820

Academic Editor: Felipe Jiménez

Received: 12 July 2023 Revised: 1 September 2023 Accepted: 7 September 2023 Published: 9 September 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### *1.1. Related Work*

#### 1.1.1. Dispatching Methods for Multiple Forklifts

A complete multi-forklift delivery planning system consists of two functions: delivery task dispatch and cooperative trajectory planning. Before cooperative trajectory planning, the goal point of each forklift is assigned in the dispatching phase [10]. Weidinger et al. [6] proposed a metaheuristic-based method in which assignment candidates are pruned a priori to facilitate the solution process [11]. A similar idea was proposed by Zhang et al. [12] to dispatch multiple automated guided vehicles (AGVs) in a matrix manufacturing workshop. However, both methods run slowly; thus, they cannot meet the real-time computation demand in a warehouse [6,12]. Lin et al. built a multi-AGV dispatching system via network structure together with simplex decision variables; in this system, an evolutionary algorithm minimizes the completion time of all AGVs in a formulated network optimization problem. However, References [6,12,13] shared a common limitation of assuming a uniform speed for AGVs. Moreover, the inter-vehicle collision avoidance problem is reduced to an oversimplified constraint, ensuring solely nonoverlapping time intervals per stopover.

Furthermore, recent studies have focused on the task integration of one AGV instead of considering how multiple AGVs cooperatively operate within the confined area. Bao et al. [14] proposed a heuristic method based on an auction strategy for a multi-AGV task dispatch scheme considering complex factors (such as pod repositioning). The concerned dispatch scheme is inherently an optimization problem with complex cost terms and constraints facilitated by the proposed auction strategy. Lee et al. [15] proposed a two-stage dispatching method. In particular, the first stage deals coarsely with the delivery efficiency and delivery flow balance by solving a bi-objective optimization problem. The result indicates how the parcels to be picked can be clustered. At stage 2, vehicles are dispatched to complete the clustered missions. Dividing the original scheme into two stages largely reduces the number of dispatch candidates without losing the optimum. Machinelearning-based dispatching methods have also been proposed [16]. The formulated reward functions efficiently simplify the dispatch scheme, particularly when complex factors are considered [17,18]. However, few vehicle kinematics is considered in [14,15], and the dispatching phase is fully separated from the trajectory planning strategy. This approach results in difficulties in maximizing overall delivery efficiency.

#### 1.1.2. Cooperative Trajectory Planning Methods for Multiple Forklifts

Cooperative trajectory planning follows the aforementioned dispatching phase. The prevalent cooperative trajectory planners are based on model predictive control [19], which is highlighted by its fast feature while strictly satisfying safety-related constraints. The artificial potential field method is similarly widely applied in trajectory planning, but it may encounter difficulty finding paths through narrow passages [20].

Ma et al. [21] converted constrained time-varying nonlinear programming problems to general unconstrained optimization problems by properly designing a penalty function. Thereafter, a particle swarm optimization method was employed to plan the motion of multiple robots sequentially in a double warehouse with two elevators. However, the optimization phase, along with other methods based on optimization, can significantly increase the computational burden [22–24], which can be reduced by forming model-based paths because warehouses are generally structured.

Yang et al. [1] proposed a strategy in which a time-varying dynamic evaluation function is formed based on a network congestion diffusion model to quantify the degree of road congestion. Hereafter, an improved A\* search algorithm and a time window algorithm were combined as a hierarchical planning method to search the idle path and avoid collisions. A path planning framework was designed by Zhou et al. [25] to simultaneously reduce the cost of operation and the path for AGVs in airport parcel loading scenarios. An ant colony optimization method was used to optimize the parcel pickup sequencing by ignoring other moving vehicles, while Dijkstra's algorithm was employed to determine the shortest route of each AGV. Zacharia et al. developed a joint routing and motion

planning method for AGVs that addresses uncertainties in demands and travel times. Their approach combines a scheduler for updating destination resources during navigation and integrates a fuzzy-based genetic algorithm with A\* search to handle capacity and distance variations [26]. Nonetheless, the vehicle kinematics considered in [1,25,26] remain oversimplified for industrial use even in the trajectory planning phase.

In other studies, the paths of vehicles were assumed to be predefined, and only the longitude trajectories were investigated. Kneissl et al. [27] formulated a method in which potential collision zones are continuously detected. Moreover, the right-of-way is granted to the first arriving vehicle while all the other vehicles involved stop and wait. Dresner et al. proposed confronting the analogous problem of conflict zones with a reservation-based system; in this system, vehicles request and receive time slots from the intersection while they pass [28]. Similarly, a discrete-event logic, which is comparable with a conventional right-handed bidirectional traffic system, was designed by Guney et al. [29] to handle the priorities of the AGVs in a warehouse dynamically. Thus, the need for computationally demanding heuristic searches is eliminated to ease strategy implementation in real-life industrial applications. Furthermore, Digani et al. [30] proposed an obstacle-free path generation method to deal with local deviations from the predetermined path. In the proposed method, new paths are generated via polar spline curves. However, the aisles in a warehouse cannot be fully exploited when certain traffic laws in [27–30] are strictly enforced.

#### 1.1.3. Joint Dispatching and Planning Methods for Multiple Forklifts

Most of the existing dispatching studies, e.g., [6,10–18], cannot accurately evaluate the candidate choices, possibly preventing the downstream planning module from achieving global optimality [1,21–30]. Thus, combining the dispatching and planning phases is naturally considered. The multi-agent path finding (MAPF) problem in its classical form is an effective approach for simplifying complex warehouse scenarios and facilitating cooperative solutions for dispatching and planning. In the MAPF problem, time is discretized into steps, allowing vehicles to either move or wait during each step [31]. Consequently, it becomes challenging to plan trajectories for vehicles with varying velocities or based on specific kinematic constraints. To address this limitation, researchers have explored extensions of the MAPF problem to accommodate such complexities. Among those extensions, Zhang et al. [32] designed a joint strategy to deal with an automatic valet parking system, in which a travel-distance-related reward function combined with a deep reinforcement learning technique was used to allocate the target parking spaces. The parking lot was segmented into local regions, and a rule-based right-of-way assignment strategy was applied to solve collisions and deadlocks. A simplified trajectory planning algorithm based on the car-following model [33] served as a tool to solve the trajectories of multiple AGVs when no potential collision was involved. A similar strategy was proposed by Lee et al. [34] for a supply-chain-connected warehouse. In their work, a cloud-based semiautomatic warehouse management system assigns tasks to mobile robots to optimize resource allocation. A robot control system executes an improved A\* search algorithm to generate the path of each AGV. Then, potential collisions, named stay-on, head-on, and cross-conflict, are identified and solved by following certain priority-based rules. Redispatchment of the AGV with low priority is triggered as the conflict cannot be prevented by those basic rules.

With regard to joint strategies, the studies above [32,34] can deal with large-scale AGV-based scenarios. However, these studies were concerned with vehicles possessing the simple kinematics of unicycle (differential-drive) robots [35] and generally focused on the construction of maneuverable systems, ignoring the overall optimal solutions concerning warehouse operations. Furthermore, during the trajectory planning phase, they initially planned only the paths and ignored other moving obstacles. Such considerations substantially reduce the risk of collisions and simplify the evaluation of the traveling difficulties pertaining to one potential task relative to the corresponding traveling distance. In other

words, it remains unknown the specific trajectory pertaining to each AGV as the jointed strategy is finished, and this trajectory is dependent on local scenes when conflict zones are involved. Thus, the results become unreliable when such strategies are applied to a warehouse of automated forklifts with complex kinematics. As a conclusion of this subsection, it deserves to develop a joint dispatching and planning method to balance the forks' motion quality and reaction speed in an unmanned warehouse.

### *1.2. Motivations*

This study aims to substantially improve the efficiency of cooperative operations among multiple autonomous forklifts by seamlessly integrating the dispatching and cooperative trajectory planning phases. Our primary objective is to address the limitations of existing dispatching methods, which often overlook low-level forklift kinematic capability. To overcome this challenge, we opt for the implementation of a graph search process in this phase. Moreover, to ensure a robust solution that avoids getting trapped in local optima, we chose to incorporate a machine-learning-based technique. In the trajectory planning phase, we recognize that optimization-based methods are computationally expensive. As such, our secondary objective is to develop an alternative search algorithm that employs a modelbased approach. This algorithm is designed to be both velocity-aware and sequentially solvable, striking a balance between accuracy and computational efficiency.
