*1.3. Contributions*

The core contribution of this paper is the proposal of a joint framework, which is promising to reduce the computational burden because all formulations involved are explicitly expressed. Concretely, the dispatching stage can enhance the multi-vehicle task solution quality because it considers the future trajectory pertaining to each forklift. Moreover, the kinematically feasible and safe trajectory of each forklift can be quickly generated through our proposed method at the trajectory planning stage, due to the removal of optimization-based methods.

#### *1.4. Organization*

In the rest of this paper, Section 2 formulates the in-warehouse delivery problem. Section 3 provides the score-based dispatching technique, in which ANN is applied to avoid deadlocks in evaluating the cost of each candidate dispatching option. Section 4 introduces the trajectory planning method, namely a model-based velocity-aware hybrid A\* search algorithm. Section 5 integrates the two aforementioned methods to develop a joint dispatching and cooperative trajectory planning framework, followed by Section 6, where comparative simulation results are present. Conclusions are drawn in Section 7, finally.

#### **2. Problem Statement**

Forklifts are used to deliver goods between fixed picking stations and predetermined shelf areas during delivery tasks in warehouses. The passages are generally designed to be narrow, and they merely allow turning maneuvers with a minimum radius and the passing of only two vehicles. Hence, conflicts arise when multiple forklifts cooperatively operate within a single warehouse.

Within one subtask during the filling of one warehouse, there are two stages: first, the initial pose and the final one should be assigned to one forklift as the dispatching stage; second, the trajectory planning stage generates a trajectory by avoiding collisions with any static or moving objects and satisfying the vehicle kinematics.

### *2.1. Warehouse Layout*

A typical small warehouse layout is schematized in Figure 1. In this warehouse, six separated shelf clusters, denoted as *sd* (*d* = 1, 2, ··· , 6), are placed and provide areas to store goods. Continuous lines indicate shelf walls, through which forklifts cannot move. Two firewalls, represented by rows of grey squares, are present between shelf cluster *s*<sup>1</sup> and

*s*<sup>3</sup> as well as cluster *s*<sup>2</sup> and *s*4. Meanwhile, four picking stations, marked with slender solid rectangles and *pr* (*r* = 1, 2, 3, 4), are located in both extremities of the vertical and wide passage in Figure 1. Four forklifts can enter the passage of each row of the shelf clusters, as long as neither stored stacks nor other vehicles block the route.

**Figure 1.** Schematic of a warehouse layout.

As presented in Figure 2, each shelf cluster has the capacity to accommodate varying numbers of goods stacks with strategically positioned notches in the arrangement designed to suit forklift kinematics during turns. This aspect will be elaborated on in Section 2.2. Each stack is marked with a number in Figure 2 to represent the order in which a shelf cluster can be filled. The shelf filling state and the vehicle state can be effectively expressed by noting the covered grids when the warehouse is divided by the squares outlined by grey dashed lines in Figure 2. In addition, when one forklift is unloading goods within a shelf cluster, its fork side should point to the stack position (cf. upper left forklift schematic and within shelf cluster *s*<sup>1</sup> in Figure 1). Similarly, when one forklift is picking goods at a picking station, the fork side should point to the station position (cf. upper forklift schematic and picking station *p*<sup>1</sup> in Figure 1).

**Figure 2.** Filling sequence pertaining to each shelf cluster.

#### *2.2. Kinematics of a Forklift Vehicle*

As reported in Figure 3, a forklift can be described as a front-steering vehicle if the fork part of the vehicle is treated as the rear side. The corresponding kinematic formulas write:

$$\begin{cases} \frac{dx(t)}{dt} = v(t) \cdot \cos\theta(t) \\ \frac{dy(t)}{dt} = v(t) \cdot \sin\theta(t) \\ \frac{dv(t)}{dt} = a(t) \\ \frac{d\theta(t)}{dt} = \frac{v(t) \cdot \tan\phi(t)}{l} \\ \frac{d\phi(t)}{dt} = \omega(t) \end{cases} \tag{1}$$

where *t* is time; *P*, located at coordinate (*x*, *y*), indicates the mid-point of the rear wheel axis; and *θ*, *v*, *a*, *φ*, and *ω* respectively denote the orientation angle, linear velocity pertaining to point *P*, acceleration, steering angle of the front wheels, and steering rate. Furthermore, *l* stands for the wheelbase length, *m* denotes the rear overhang length, *n* refers to the front overhang length, and 2*b* is the car width. Given that the initial values as well as *ω*(*t*) and *a*(*t*) are provided, the state variables can be calculated through integration over the dynamic process.

**Figure 3.** Parametric notations related to vehicle size and kinematics.

Meanwhile, a few boundaries are imposed on the state profiles over the entire simulation period throughout all dynamic maneuvers:

$$\begin{cases} |a(t)| \le a\_{\text{max}} \\ |v(t)| \le v\_{\text{max}} \\ |\phi(t)| \le \phi\_{\text{max}} \\ |\omega(t)| \le \omega\_{\text{max}} \end{cases} \tag{2}$$

where *amax*, *vmax*, *φmax*, and *ωmax* respectively indicate the upper limits of the corresponding variables.

#### **3. ANN-Combined Score-Based Dispatching Approach**

Filling a warehouse in an orderly manner requires several forklifts to perform multiple deliveries and return subtasks. In the current work, subtasks are assigned to different vehicles sequentially. With regard to such subtasks, selecting which vehicle will be used to plan the new trajectory and determining which goal coordinates the forklift is going to should comprise the fundamental initialization. Therefore, a dispatching system is necessary to solve these problems.

The dispatching approach (cf. Algorithm 1) utilizes the planned trajectories **T** of all vehicles, along with map information *map*, and vector **F** representing filled stacks for different clusters, as its inputs. The core of this approach lies within a while loop, wherein the potential subtask undergoes iterative updates (with a preset maximum iteration number *iterdispatch*) until it reaches an optimal state, as defined by the proposed method.

Outside of the loop, the function rank() sorts all vehicles based on **t**, arranging them in ascending order from first to last. This sorting process generates a ranking vector **R** consisting of four vehicle indices. If *f ail* = 1, indicating a failure in the trajectory search between *Pi*(*xi*, *yi*) at instant *ti* and *Pf xf* , *yf* , the corresponding vehicle is repositioned at the end in **R** and flagged as having been selected as *ncurrent*. In the loop, the algorithm is divided into two parts. The first one (lines 4 to 13) concerns the selection of the current investigated vehicle with an index of *ncurrent*, and the second part (lines 14 to 30), featuring a scoring system combined with the results of an ANN method, determines the goal coordinate for the current subtask. Notably, Algorithm 1 is applied when the ANN correction system, which is elaborated on in Section 3.2, is enforced.

```
Algorithm 1: ANN combined score-based dispatching algorithm

     ncurrent, Pi, Pf , ti

                    ← Dispatch

                                T, F, f ail, Pi, Pf , map
1. Initialize α ← 0 ;
2. R ← rank

              T, Pi, Pf , f ail
                          ;
3. while iter < iterdispatch, do
4. if α = 0, then
5. [ncurrent, ti] ← SelectInitialState(R);
6. else
7. if CheckSelection(R) > 0, then
8. [ncurrent, ti] ← SelectAlteredState(R);
9. else
10. [ncurrent, ti] ← SelectBackupState(R);
11. end if
12. end if
13. Pi = SetInitialPose(T, ncurrent, map);
14. if CheckDeliverTask(Pi) is true, then
15. Sd = PreAstarDeliver(Pi, F, map);
16. if max(Sd) > 0, then
17. Pf = SetFinalPose(Sd, map);
18. return;
19. else
20. α ← 1 ;
21. end if
22. else
23. Sr = PreAstarReturn(Pi, map);
24. if max(Sr) > 0, then
25. Pf = SetFinalPose(Sr, map);
26. return;
27. else
28. α ← 1 ;
29. end if
30. end if
31. end while
32. return;
```
#### *3.1. Vehicle Selection and Initial Pose of a New Subtask*

The function SelectInitialState() selects the vehicle, ranking the first one as *ncurrent*, and sets *ti*, which is the initial instant of the trajectory to be planned, as the ending instant of the last trajectory pertaining to *ncurrent*. Failure may occur in the determination of the new trajectory for the fork *ncurrent* in the new subtask because other forklifts may block the only corresponding route for a considerable time. Under such circumstances, a flag variable *α* is set to 1, and all vehicles in **R** are checked to see if they have been selected as *ncurrent* once for the current subtask by CheckSelection(). The function SelectAlteredState()

is then utilized. In this function, the first motion-finished vehicle is discarded, and the other forklifts are subsequently selected in turn as *ncurrent* on the basis of the rankings in **R** until the trajectory can be formed. Furthermore, no trajectory can be successfully planned for all forklifts at certain moments. In this case, SelectBackupState() is applied, in which the vehicle that ranks last in **R** is selected, and the *ti* of the new trajectory is postponed for a fixed time length of Δ*ti* relative to the end of the last subtask for vehicle *ncurrent*. The vehicle *ncurrent* final stopping pose *Pi*(*xi*, *yi*, *θi*) is set as the initial pose of the new subtask by function SetInitialPose().

#### *3.2. Scoring System*

A scoring system is applied to decide the goal coordinate of the new subtask. First, the function CheckDeliverTask(*Pi*) is initially employed to ascertain whether the planned trajectory involves heading to clusters or returning. The functions PreAstarDeliver() and PreAstarReturn() are then applied separately depending on whether the goal is a rack cluster or a picking station. In both functions, the grid networks, outlined by light colors in Figure 2, indicate the nodes used to define the location of a vehicle and stacks. The resolution of such nodes is purposely reduced with the aim of lowering computation costs. Given that the nodes are defined, a time dimension involved preliminary A\* search algorithm, whose expansion manner is presented in Figure 4, is used to generate preliminary trajectories that link the starting pose *Pi*(*xi*, *yi*) at instant *ti* to each potential target coordinate. Five patterns in total for this search algorithm are applied to vaguely indicate the possible maneuvers a vehicle could perform. In particular, manner 5 in Figure 4 expands only in the time dimension, representing the stopping condition of a virtual forklift. The time consumed derived from this algorithm for a virtual vehicle represented by one node is then applied as a parameter to evaluate the difficulty grades of reaching different goal poses. Other vehicles and walls are treated as obstacles during the search. Notably, the orientation angles *θ* for the initial and final poses are not required to be determined in such a system. Thus, *θ* is not considered a dimension in this search for the sake of calculation simplification.

**Figure 4.** Expansion manner of the preliminary hybrid A\* search.

In the function PreAstarDeliver(), the function *Sd*,0 is applied to evaluate the scores pertaining to different target stack locations *sd* (cf. Figure 1) when a subtask with one stack as the goal is considered. It writes:·

$$\begin{cases} \begin{aligned} \mathcal{S}\_{d,0} &= \mathbb{C}\_{1}\mathbf{G}\_{d} + \mathbb{I}\_{d,0} \\ \begin{aligned} I\_{d,0} &= \begin{cases} H + \mathbb{C}\_{2}I\_{d} + \mathbb{C}\_{3} \text{ preliminary trajectory is planned} \\ \mathbb{C}\_{4} & \text{prelinearity trajectory planning is failed within } iter\_{pre} \end{aligned} \end{cases} \end{cases} \end{cases}$$

$$\begin{aligned} H = -\Delta t\_{cover} \\ \begin{cases} t\_{find} - t\_{i} + \mathbb{C}\_{5}s\_{d} \in \{s\_{1}, s\_{2}\} \\ t\_{find} - t\_{i} & \text{s}\_{d} \in \{s\_{3}, s\_{4}, s\_{5}, s\_{6}\} \end{cases} \end{cases} \tag{3}$$

where *Gd* denotes the number of stacks to be filled/emptied in the target shelf cluster *sd* in order to balance the warehouse filling/emptying mission in different clusters. *Jd*,0 stands for the approximate difficulty to reach different goal coordinates in shelf cluster *sd*. *C*<sup>4</sup> is a negative constant applied when the goal in practice is not reachable. *H* indicates time length Δ*tcover*. Δ*tcover* counts the time units during which other vehicles occupy the shelf entrance node (cf. the grey grid for the upper left forklift in *s*<sup>1</sup> in Figure 1) of the goal pose within a fixed time length Δ*tentry* after the virtual vehicle entering the rack passage. The reason Δ*tcover* is added as a parameter is that, in practice, the availability of the above-mentioned node is critical during the planning of the final trajectory that considers kinematics. *Id* is a function of the time length *tfind* − *ti* consumed to arrive at the target in this preliminary search. Notably, this function is built to normalize the results for different shelf clusters because shelf clusters *s*<sup>1</sup> and *s*<sup>2</sup> are far from the picking stations, and delivering goods to these locations consumes much time. Furthermore, a negative constant *C*<sup>4</sup> is assigned to *Jd*,0 when the target cannot be reached through the search within a predefined maximum iteration number *iterpre*. Meanwhile, *C*1, *C*2,..., *C*<sup>5</sup> are calibration parameters. Among these, a substantial weighting coefficient, *C*1, is allocated to regulate the stacks filled in each shelf cluster; aiming for balance, *C*2, *C*3, *C*4, and *C*<sup>5</sup> are designed to quantitatively assess scores with respect to time considerations. If *Sd*,0 ≤ 0 is derived within *iterpre*, the vehicle kinematics-considered trajectory is difficult to find. Thus, the flag variable *α* is set to 1 in the initialization cycle, where ANN is not enforced, and *ncurrent* should be reassigned.

Similarly, in the function PreAstarReturn(), only the time consumed with respect to different picking stations *pr* (cf. Figure 1) in the A\* search is used in the evaluation of *Sr*, which is expressed as:

$$S\_{\tau} = \begin{cases} \begin{array}{c} I\_{\tau} \ \text{preliminary trajectory is planned} \\ 0 \ \text{preliminary trajectory planning is failed within } iter\_{\text{prre}} \end{array} \end{cases} \tag{4}$$

When ANN is not enforced, the potential targets are initially scored solely by means of Equations (3) and (4). The greater the functions to be evaluated are, the greater the likelihood of subsequent trajectory planning is and the faster the entire warehouse can be filled. In this case, arg max *Sd*,0 and arg max *Sr* are selected as the goal poses for delivery and return subtasks, respectively.

#### *3.3. ANN Correction Method*

The function PreAstarDeliver() is employed to refine goal score evaluations through a multilayer perceptron (MLP) network, which is elaborated upon as follows.

Figure 5 presents a typical MLP network of ANN with one hidden layer. Mathematically, with the trajectory planning states as known variables, the MLP network of the type reported in Figure 5 can be expanded step by step as follows:

$$\mathcal{G}(w, W) = \mathcal{F}(\sum\_{j=1}^{m} \mathcal{W}\_j \mathbf{h}\_j(w) + \mathcal{W}\_0) = \mathcal{F}(\sum\_{j=1}^{m} \mathcal{W}\_j \mathbf{f}\_j(\sum\_{i=1}^{n} w\_{ji} z\_i + w\_{j0}) + \mathcal{W}\_0) \tag{5}$$

where *wji* and *Wj* denote the weights assigned to the connection of the neurons. *Wo* and *wj*<sup>0</sup> are linked to the bias, whose values are simply the constant 1.

The ANN correction in the current study is designed for the scoring system for the goal pose determination of delivery subtasks. In the initialization phase of the ANN correction system, excluding the filling balance parameter of *Gd* in Equation (3), {*J*1,0, *J*2,0, ··· , *J*6,0} respectively denote the base values of the output elements in **J** = {*J*1,*est*, *J*2,*est*, ··· , *J*6,*est*} (cf. yˆ in Figure 5) for six MLP networks. Among the six elements, the one whose corresponding pose is selected as the target for vehicle kinematics-considered trajectory planning is further fixed based on the corresponding trajectory variables. The values of the other elements remain as unchanged as the results in Equation (3). Suppose that *sd* is the shelf cluster investigated within a subtask. Given that the base value *Sd*,0 is derived with Equation (3), if *sd* that corresponds to arg max *Sd*,0 is selected as the goal to determine the trajectory, then

the trajectory with the goal of *sd* will be planned, and the corresponding element in **J** will be derived with Equation (6) as follows:

$$J\_{d, \text{ext}} = \begin{cases} J\_{d,0} - \mathbb{C}\_6 (t\_f - t\_i - \overline{t} + \mathbb{C}\_7) & \text{trajector} \text{ is planned, and } d \in \{1, 2\} \\\ J\_{d,0} - \mathbb{C}\_6 (t\_f - t\_i - \overline{t} + \mathbb{C}s) & \text{trajector} \text{ is planned, and } d \in \{3, 4, 5, 6\} \\\ & \text{C}\_5 & \text{trajector} \text{ planning is failed within } i \text{ter}\_{pr\varepsilon} \\\ J\_{d,0} & \text{trajector} \text{ is not planned} \end{cases} \tag{6}$$

where *tf* is the ending instant of the trajectory, *t* stands for the average value of the time length pertaining to all previously derived trajectories, *C*<sup>6</sup> is a calibrated constant intended to balance *Jd*,0, and the latter solely accounts for time consumed to reach a goal without factoring in the distance covered. *C*<sup>7</sup> and *C*<sup>8</sup> are respectively used to normalize the difference in distances corresponding to various shelf clusters and the expected moving time period of each subtask.

**Figure 5.** Schematic of an MLP network.

Similar to the parameter *Gd* in Equation (3), the inputs (cf. *z*1, *z*2, ··· , *zn* in Figure 5) of the ANN system are the number of stacks filled within each shelf cluster. A vector **U** with six elements {*u*1, *u*2, ··· , *u*<sup>6</sup> }, which correspond to six shelf clusters *sd*, is used. Suppose that *ud* ∈ **U** is considered, it can be formulated as

$$u\_d = \begin{cases} -G\_d & \text{delivery subttask is linked} \\ -G\_d & \text{return subttask or subtasks of both types are linked} \\ 0 & \text{no subttask is linked} \end{cases} \tag{7}$$

The integers other than 0 can represent the filling states of the shelf clusters, which are associated with the currently moving forklifts. The vehicle motion states are also observed. This input of the MLP network vaguely provides information associated with the possible area the vehicles may be located in, given that each shelf cluster should be filled by following certain orders. Furthermore, at least two shelf clusters are not connected to any subtask because only four forklifts in the warehouse are employed. Evidently, these shelf clusters have no impact on the trajectory planning, and this situation is in line with the circumstances, where the investigated shelf group is filled. Thus, 0 is assigned to *ud* in this case.

As the number of hidden neurons is set to 12 according to an empirical technique [36], the MLP system used to score shelf cluster *sd* can be expressed in the form of Equation (5) as

$$\mathcal{J}\_{d, \text{ext}}(w\_d, \mathcal{W}\_d) = \mathcal{F}(\sum\_{j=1}^{12} \mathcal{W}\_{d, \text{j}} \mathbf{h}\_j(w\_d) + \mathcal{W}\_0) \\ = \mathcal{F}(\sum\_{j=1}^{12} \mathcal{W}\_{d, \text{j}} \mathbf{f}\_j(\sum\_{i=1}^6 w\_{d, \text{j}} u\_{d, i} + w\_{d, \text{j}}) + \mathcal{W}\_0) \tag{8}$$

The Levenberg-Marquardt method is then used to train the MLP and the values assigned to the elements of **W** and **w**.

In the following warehouse filling cycles and the final dispatching system, the scoring system described in Equation (3) is replaced by the expression below to determine the optimal goal pose for vehicle *ncurrent*. Equation (9) is the final scoring equation of the discussed dispatching system, wherein *C*<sup>9</sup> and *C*<sup>10</sup> act as the calibration constants. These constants are determined via a trial-and-error approach to yield results. The MLP contributes without excessively disrupting performance concerning warehouse filling/emptying time.

$$\begin{cases} S\_d = \mathbb{C}\_1 \mathbb{G}\_d + f\_d \\\ J\_d = \mathbb{C}\_9 \mathbb{I}\_{d,0} + \mathbb{C}\_{10} \hat{\mathbb{J}}\_{d,\text{est}}(\mathbf{w}\_d, \mathbf{W}\_d) \end{cases} \tag{9}$$

After the initialization cycle of warehouse filling (referring to Figure 6), the training process can continue until *Cy* cycles are finished. In the following cycles, the MLP has already been established depending on the data pertaining to the previous cycles, and the saved values of **W** and **w** are used to estimate the values ˆJ*d*,*est*() via Equation (8). Therefore, the base values of the MLP outputs *Jd*,0 are replaced by *Jd* in Equation (9) when shelf cluster *sd* is considered. The expression of the corresponding element in **J** for the following cycles writes:

$$J\_{d, \text{ext}} = \begin{cases} J\_d - \mathbb{C}\_6(t\_f - t\_i - \overline{t} + \mathbb{C}\_7) & \text{trajectory is planned, and } d \in \{1, 2\} \\\ J\_d - \mathbb{C}\_6(t\_f - t\_i - \overline{t} + \mathbb{C}\_8) & \text{trajectory is planned, and } d \in \{3, 4, 5, 6\} \\\ \mathbb{C}\_5 & \text{trajectory planning is failed within } iter\_{pr\epsilon} \\\ J\_d & \text{trajectory is not planned} \end{cases} \tag{10}$$

**Figure 6.** Flowchart of MLP training.

#### *3.4. Final Pose Selection*

The final pose of each subtask is determined using the SetFinalPose() function. For subtasks with the purpose of delivery, the destination coordinate *Pf xf* , *yf* is chosen based on the arg max *Sd* criterion; for return subtasks, the coordinate *Pf xf* , *yf* corresponding to arg max *Sr* is selected. Finally, the corresponding *θ* information should be added to complete the destination pose.

#### **4. Improved Hybrid A\* Search Algorithm**

Hybrid A\* algorithm [37] is an extension of the conventional 2D A\* search algorithm due to its consideration of kinematics during node expansions over time. Different from the original hybrid A\* search, the velocity variation is considered, and the node expansion manner is determined based on action purposes in the algorithm (cf. Algorithm 2) applied in this study. Furthermore, this improved heuristic method directly determines trajectory details throughout a subtask without an optimization stage.

Algorithm 2 is used to expand a parent node (*Pp*, *tp*) in one manner *Nc* through the improved hybrid A\* search with the target of *Pf* . Focusing on the motion of one vehicle, *Listopen* is used to store the data pertaining to all nodes (*Popen*, *topen*), which has been explored and can be further expanded. The information contains the corresponding parent nodes, expansion manner (cf. Section 4.1), time expansion data (cf. Section 4.2), and costs (cf. Section 4.4). On the contrary, the node (*Pclosed*, *tclosed*) cannot be expanded anymore and is stored in *Listclosed*. Furthermore, the function DetectCollision() is discussed in Section 4.3. In addition, the function AddNode() is used to add a node with its affiliating data into *Listopen* or *Listclosed*.

#### **Algorithm 2:** Improved hybrid A\* search algorithm *<sup>σ</sup>*, *Listopen* <sup>←</sup> SearchAStar **<sup>T</sup>**, *Nc*, *Pp*, *Pf* , *tp*, *Listopen*, *Listclosed*, *map* 1. *thigh*,*c*, *tmid*,*c*, *tzero*,*<sup>c</sup>* ← FixMovingTime *Listopen*, *Pp*, *tp*, *Nc* ; 2. *Listopen* ← SetCost *Listopen*, *Pp*, *tp*, *Nc* ; 3. *γ* ← DetectCollision **T**, *Nc*, *Pp*, *tp*, *Listopen*, *map* ; 4. **if** (*Pc*, *tzero*,*c*) ∈ *Listclosed* , **then** 5. **return;** 6. **end if** 7. **if** (*Pc*, *tzero*,*c*) ∈ *Listopen* **and** *γ* = 0, **then** 8. **if** *fc* < *fpre*, **then** 9. *Listopen* ← ReplaceNode *Listopen*, *Pp*, *tp* ; 10. **end if** 11. **else** 12. **if** *γ* = 0, **then** 13. *Listopen* ← AddNode *Listopen*, *Pc*, *tzero*,*c* ; 14. **if** *Pc* = *Pf* , **then** 15. *σ* ← 1 ; 16. **return**; 17. **end if** 18. **else if** *γ* = 1, **then** 19. *Listclosed* ← AddNode(*Listclosed*, *Pc*, *tzero*,*c*); 20. **return;** 21. **end if** 22. **end if** 23. **return**;

#### *4.1. Node Expansion Method*

This section elaborates on the various possible expansion manners denoted as *Nall*, with each individual possibility represented by *Nc*. The drivable area is initially mapped with the above-discussed grid networks (cf. Figure 2). The dimensions of the grid should be skillfully coupled with the size of the vehicle, with the aim of reducing occupied cells during a certain action and enhancing the utility rate of the space. The case demonstrated in Figure 7 is a well-designed example. In this case, one vehicle covers two grid cells. Thus, on a 2D space domain, the orientation angle *θ* involved in forklift movements can be easily described. In addition, through one maneuver, the vehicle body occupies a small number of grids. This can reduce the possibility of interference with other forklifts during motion planning.

**Figure 7.** Node expansion manners of improved hybrid A\* algorithm: (**a**) Stopping and going straight; (**b**) lane changing and turning.

As reported in Figure 7a,b, the nodes expanding manners *Nall* in the space domain can be divided into eleven patterns, which are summarized as four categories covering all possible maneuvers a forklift may intend to conduct. These include stopping, going straight (cf. semitransparent solid rectangles in Figure 7a), lane changing (cf. rectangles with diagonal stripes in Figure 7b), and turning (cf. solid rectangles in Figure 7b).

As far as the expanding manners are concerned, when *v* remains 0, one vehicle stops. As *v* is other than 0 with steering angle *φ* = 0, one fork can go or reverse straight. Figure 8 demonstrates one modeled forklift path depicted by consecutive outlines, when the vehicle goes upward and turns to the left from the right lower side to the center left. Through this maneuver, *φ* is varied to gain an identical final location relative to grids as the initial state, despite the change of *π*/2 in *θ*. Under such circumstances, the consecutive motions can be easily established. Meanwhile, the modeled maneuver of reverse turning to the left can also be noted in Figure 8, when the initial and final body outlines are exchanged.

**Figure 8.** Path of turning.

Similar to turning actions, a typical lane-changing maneuver is also modeled and outlined in Figure 9. The forklift goes forward and changes to the left lane, which is shown on the top row in Figure 9. The other possible node-expanding paths of turning and lane changing are modeled by mirroring or rotating the examples reported in Figs. 8 and 9.

**Figure 9.** Path of lane changing.

The smoothness of the modeled paths can be enhanced by adopting advanced techniques, and the identical heuristic trajectory search rule of this work can also be applied to the newly modeled paths.
