#### 3.4.1. Navigation Control

This module receives a target for the LHD navigation, which could be a relevant location within the mine, such as an extraction/draw point or a dumping point. This target is usually defined by a dispatch system, or in some cases by a human operator. It consists of a destination node and a topological route, which is computed using the TMM as explained in Section 3.2. This route is represented as a sequence of TMM nodes, each one containing information about the position, orientation, and heading direction of the vehicle, and an indication of whether the vehicle must go through the node or come to a full stop on it.

Given that the topological route is composed of TMM nodes that originate from tunnel and intersection nodes, this module has two navigation modes: *tunnel tramming*, used inside tunnels, and *path-following*, used inside intersections. In tunnel tramming mode, the navigation modules follow the path of the tunnel's walls, while in path-following mode, the navigation modules follow a trajectory in an area where more than a single path can be taken. In addition, inside tunnel nodes, intermediate sub-goals may be generated depending on the defined waypoints (see Section 3.2).

To achieve smooth navigation across TMM nodes, transitions between different goals must be seamless. A naive way to identify when a goal has been reached would be to check when the global localization estimation equals the current goal, but this often makes the movement of the vehicle discontinuous and clumsy. A better approach requires that *Navigation Control* anticipates when the LHD is going to reach a certain goal. In order to do this, a set of conditions is applied in addition to monitoring the global localization estimation:

$$P\left(\stackrel{\rightarrow}{\mathbf{x}}\_{LHD} = \stackrel{\rightarrow}{\mathbf{x}}\_{T}\right) = \frac{1}{2\pi\sqrt{|\sigma|}}e^{-\frac{1}{2}\left[\stackrel{\rightarrow}{\mathbf{x}}\_{LHD} - \stackrel{\rightarrow}{\mathbf{x}}\_{T}\right]^{T}\sigma^{-1}\left[\stackrel{\rightarrow}{\mathbf{x}}\_{LHD} - \stackrel{\rightarrow}{\mathbf{x}}\_{T}\right]\_{1,1}} > P\left(\stackrel{\rightarrow}{\mathbf{x}}\_{LHD} = \stackrel{\rightarrow}{\mathbf{x}}\_{T}\right)\_{MIN} \tag{2}$$

$$\dot{d}\left(\stackrel{\rightarrow}{\mathbf{x}}\_{LHD}, \stackrel{\rightarrow}{\mathbf{x}}\_{T}\right) > \dot{d}\_{MIN} \tag{3}$$

$$|\theta\_{LHD} - \theta\_T| < \Delta \theta\_{MAX} \tag{4}$$

where:

$\vec{\mathbf{x}}\_{LHD}$ = 2D position estimation of the LHD.

$\vec{\mathbf{x}}\_{T}$ = 2D position of the current navigation target.

$\sigma$ = 2D self-localization estimation variance (without orientation estimation).

$P\left(\vec{\mathbf{x}}\_{LHD} = \vec{\mathbf{x}}\_{T}\right)\_{MIN}$ = Minimum 2D target reached likelihood threshold.

$\dot{d}$ = Derivative of the Euclidean distance function with respect to time.

$\dot{d}\_{MIN}$ = Minimum Euclidean distance function derivative threshold.

$\theta\_{LHD}$ = LHD orientation (heading) estimation.

$\theta\_{T}$ = Current target orientation (heading).

$\Delta\theta\_{MAX}$ = Maximum orientation difference threshold.

Condition (2) is the probabilistic estimation of actually reaching the desired target position. Condition (3) measures whether the LHD is actually getting closer to the target, and condition (4) measures the difference between the LHD's orientation and the current target's orientation. When navigating in *tunnel tramming* mode, only condition (2) is used, but when navigating in *path-following* mode, conditions (2)–(4) must be met. This way, lower values for $P\left(\vec{\mathbf{x}}\_{LHD} = \vec{\mathbf{x}}\_{T}\right)\_{MIN}$ in condition (2) can be used (which helps to anticipate transitions and obtain a smooth movement), because conditions (3) and (4) indicate that the vehicle is going to the target goal (often a tunnel entrance) in an intersection.

In *tunnel tramming* mode, *Navigation Control* also checks that the LHD does not miss the tunnel end, using the following conditions:

$$d\_{ODOM}(t) - d\_{TUNNEL} > e\_{MAX}^{ODOM} \tag{5}$$

$$\frac{d\_{ODOM}(t)}{d\_{TUNNEL}} > e\_{MAX\,\%}^{ODOM} \tag{6}$$

where:

$d\_{ODOM}(t)$ = Accumulated linear odometry of the current tunnel.

$d\_{TUNNEL}$ = Total length of the tunnel.

$e\_{MAX}^{ODOM}$ = Maximum odometry error magnitude threshold.

$e\_{MAX\,\%}^{ODOM}$ = Maximum odometry error percentage threshold.

If both of these conditions are true, *Navigation Control* stops the vehicle and requests assistance from the operator/supervisor of the system. Both conditions are required because each one alone is unreliable at one extreme of tunnel length: for short tunnels, the percentage-based condition (6) can trigger false alarms, while for long tunnels, the magnitude-based condition (5) can trigger false alarms.
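The checks in conditions (2)–(6) can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used on the LHD: all function and parameter names are hypothetical, $\dot{d}$ is taken as an input rather than estimated, its sign convention follows Equation (3) as printed, and wrapping the heading difference to $[-\pi, \pi]$ is an added assumption.

```python
import math


def gaussian_likelihood(x_lhd, x_t, cov):
    """Condition (2), left-hand side: 2D Gaussian density at the target."""
    dx, dy = x_lhd[0] - x_t[0], x_lhd[1] - x_t[1]
    (a, b), (c, d) = cov
    det = a * d - b * c  # assumed strictly positive
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))


def angle_diff(a, b):
    """Difference a - b wrapped to [-pi, pi] (assumption, not in the text)."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def target_reached(mode, x_lhd, x_t, cov, p_min,
                   d_dot=0.0, d_dot_min=0.0,
                   theta_lhd=0.0, theta_t=0.0, dtheta_max=math.pi):
    """Tunnel tramming uses condition (2); path-following uses (2)-(4)."""
    cond2 = gaussian_likelihood(x_lhd, x_t, cov) > p_min
    if mode == "tunnel_tramming":
        return cond2
    cond3 = d_dot > d_dot_min  # Equation (3), sign convention as printed
    cond4 = abs(angle_diff(theta_lhd, theta_t)) < dtheta_max
    return cond2 and cond3 and cond4


def missed_tunnel_end(d_odom, d_tunnel, e_max, e_max_pct):
    """Conditions (5) and (6) must both hold; d_tunnel is assumed > 0."""
    return (d_odom - d_tunnel > e_max) and (d_odom / d_tunnel > e_max_pct)
```

With hypothetical thresholds of 3 m and 1.02, an overshoot of 1 m in a 10 m tunnel exceeds only the percentage threshold, and an overshoot of 4 m in a 1000 m tunnel exceeds only the magnitude threshold, so neither case raises an alarm on its own.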

Other important information stored in the TMM is the maximum speed at which a node should be transited, and an indicator forcing the vehicle to drive closer to one of the walls of the road (instead of trying to remain in the center of the road). Both of these parameters can be manually tuned to optimize the way the vehicle approaches certain curves or traverses through the mine.

#### 3.4.2. Deliberative Path Planning

This module receives the next target position, which needs to be reached with a certain speed, as a relative pose from *Navigation Control*. Then, it calculates the path to be followed between the current pose and the desired destination, as a spline $S(t) = \left(S\_x(t), S\_y(t)\right)$. The desired steering speed ($\omega$) and speed limit ($v\_{MAX}$) are then computed and sent to *Guidance* (see Figure 9).

In order to calculate the spline's coefficients, the following boundary conditions are used:

$$\mathcal{S}\left(t=0\right) = X\_0; \mathcal{S}\left(t=t^\*=\frac{\vec{d}}{v}\right) = X\_1\tag{7}$$

$$\dot{X}\_0 = (v\cos\gamma, v\sin\gamma);\ \dot{X}\_1 = (v\cos\theta, v\sin\theta) \tag{8}$$

where:

$X\_0$, $\dot{X}\_0$ = Position and speed of the front bumper of the vehicle (see Figure 10).

$X\_1$, $\dot{X}\_1$ = Position and speed at the desired target destination (see Figure 10).

$d$ = Estimated distance between $X\_0$ and $X\_1$.

**Figure 10.** Graphic representation of the kinematic variables of the vehicle.

Using the calculated derivative of the spline, and the vehicle's kinematic model, given by Equations (10) and (11), the desired steering rate $\dot{\gamma}$ can be calculated as:

$$\dot{\gamma} = \frac{\left(L\_f \cos \gamma + L\_r\right) \dot{\alpha}(t=0) - \upsilon\_i \sin \gamma}{L\_r} \tag{9}$$

with $\dot{\alpha}(t = 0)$ the calculated angle derivative of the spline, evaluated at $t = 0$; $L\_r$ the length from the LHD's pivot to the rear wheel axis; $L\_f$ the length from the LHD's pivot to the front wheel axis; $v\_i$ the linear speed of the LHD; $\gamma$ the steering angle in the LHD's pivot; and $\omega = \dot{\gamma}$ the steering speed in the LHD's pivot.
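As an illustration of Equations (7)–(9), the sketch below fits a spline to the boundary conditions and evaluates the steering rate. The text does not state the spline order, so a cubic (Hermite) polynomial per axis is assumed here. The tangent angle rate is computed as $\dot{\alpha} = (\dot{S}\_x \ddot{S}\_y - \dot{S}\_y \ddot{S}\_x)/(\dot{S}\_x^2 + \dot{S}\_y^2)$, and all function names are hypothetical.

```python
import math


def hermite_coeffs(p0, p1, v0, v1, T):
    """Per-axis cubic a + b*t + c*t^2 + d*t^3 with S(0)=p0, S(T)=p1,
    S'(0)=v0 and S'(T)=v1 (cubic order is an assumption)."""
    coeffs = []
    for k in range(2):
        delta = p1[k] - p0[k]
        c = 3.0 * delta / T**2 - (2.0 * v0[k] + v1[k]) / T
        d = -2.0 * delta / T**3 + (v0[k] + v1[k]) / T**2
        coeffs.append((p0[k], v0[k], c, d))
    return coeffs


def steering_rate(p0, p1, v, gamma, theta, L_f, L_r):
    """Desired steering rate from Equation (9); requires v > 0 and p0 != p1."""
    dist = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    T = dist / v  # t* in Equation (7)
    v0 = (v * math.cos(gamma), v * math.sin(gamma))  # Equation (8)
    v1 = (v * math.cos(theta), v * math.sin(theta))
    (_, bx, cx, _), (_, by, cy, _) = hermite_coeffs(p0, p1, v0, v1, T)
    # At t = 0: S' = (bx, by) and S'' = (2*cx, 2*cy).
    alpha_dot = (bx * 2.0 * cy - by * 2.0 * cx) / (bx**2 + by**2)
    return ((L_f * math.cos(gamma) + L_r) * alpha_dot
            - v * math.sin(gamma)) / L_r
```

For a target straight ahead with aligned headings, the spline is a straight line and the sketch returns a zero steering rate. A target offset to the left yields a positive rate.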

#### 3.4.3. Guidance

This module selects the appropriate commands for the machine's actuators, given the high-level general directives of the expected motion, while ensuring that the LHD will not hit any obstacles or mine infrastructure. For that purpose, a model-based predictive control (MPC) scheme was implemented using the vehicle's kinematic equations and a cost function that simultaneously considers the following: the high-level reference commands, the distance to the walls of the tunnel, and the smooth variation of the actuator commands over time.

The kinematic model of a center-articulated vehicle has been presented in a number of previous publications, such as in [11]. Equations (10) and (11) show an incremental model for the machine's pose.

$$\Delta[\mathbf{x}, \mathbf{y}, \theta] = \Delta t \cdot \left[ \upsilon \cos(\theta), \upsilon \sin(\theta), (\upsilon \sin(\gamma) + L\_r \omega) / \left( L\_f \cos(\gamma) + L\_r \right) \right] \tag{10}$$

$$
\Delta \gamma = \Delta t \cdot \omega \tag{11}
$$

where:

$[x, y, \theta]$ = pose of the LHD (2D position and angle).

$\Delta t$ = sampling time of the discrete model.

$v$ = linear speed of the LHD.

$\gamma$ = steering angle in the LHD's pivot.

$\omega = \dot{\gamma}$ = steering speed in the LHD's pivot.

$L\_r$ = length from the LHD's pivot to the rear wheel axis.

$L\_f$ = length from the LHD's pivot to the front wheel axis.
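The incremental model of Equations (10) and (11) translates directly into a one-step update that can be rolled out over a command sequence. The following is a minimal sketch with hypothetical function names; the geometry values used with it are illustrative only.

```python
import math


def step(pose, gamma, v, omega, dt, L_f, L_r):
    """One Euler step of Equations (10) and (11)."""
    x, y, theta = pose
    theta_dot = (v * math.sin(gamma) + L_r * omega) / (L_f * math.cos(gamma) + L_r)
    new_pose = (x + dt * v * math.cos(theta),
                y + dt * v * math.sin(theta),
                theta + dt * theta_dot)
    return new_pose, gamma + dt * omega


def predict(pose, gamma, commands, dt, L_f, L_r):
    """Roll the model forward over a list of (v, omega) commands."""
    trajectory = []
    for v, omega in commands:
        pose, gamma = step(pose, gamma, v, omega, dt, L_f, L_r)
        trajectory.append((pose, gamma))
    return trajectory
```

Calling `predict` with one `(v, omega)` pair per time step gives the predicted poses over the horizon, which is the quantity a cost function would then be evaluated on.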

The previous model is used in the MPC to predict the trajectory of the machine over a predefined timespan. Then, the optimization process is carried out, in which the best actuator command (*u* = [*uv*, *uω*]) for each time step is selected to minimize the following cost function:

$$Q = Q\_{map} + Q\_{steering} + Q\_{smooth} \tag{12}$$

This equation shows that the cost function is composed of three parts: one for keeping the vehicle away from the tunnel walls (*Qmap*), another (*Qsteering*) for following the high-level reference commands, and a final one to smooth the optimization result over time (*Qsmooth*).

In Equation (13), it can be seen that the cost associated with keeping the machine away from the walls relies on maximizing the distance between certain key points of the vehicle and the closest data point in the registered point cloud of the environment. These key points are the corners of the front and rear vehicle bodies.

$$Q\_{map} = \sum\_{i=1}^{n} R\_F \frac{|D\_{FL,i} - D\_{FR,i}|}{D\_{FL,i}^2 D\_{FR,i}^2} + R\_M \frac{|D\_{ML,i} - D\_{MR,i}|}{D\_{ML,i}^2 D\_{MR,i}^2} + R\_R \frac{|D\_{RL,i} - D\_{RR,i}|}{D\_{RL,i}^2 D\_{RR,i}^2} \tag{13}$$

Here, $D\_{FL,i}$ is the distance between the front left corner of the machine and the closest point of the tunnel walls, predicted at time step $i$ of the optimization process. Similarly, $D\_{FR}$, $D\_{ML}$, $D\_{MR}$, $D\_{RL}$, and $D\_{RR}$ refer to the distances from the front right, middle left, middle right, rear left, and rear right corners of the vehicle, respectively. The cost function weights, $R\_F$, $R\_M$, and $R\_R$, are selected to obtain proper behavior.
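A minimal sketch of Equation (13) is given below. It assumes that the sum over prediction steps applies to all three terms and that every distance is strictly positive. The function name and the tuple layout of the input are hypothetical.

```python
def q_map(distances, R_F, R_M, R_R):
    """Wall-clearance cost of Equation (13).

    `distances` holds one tuple per predicted time step, ordered as
    (D_FL, D_FR, D_ML, D_MR, D_RL, D_RR); all values must be > 0.
    """
    total = 0.0
    for d_fl, d_fr, d_ml, d_mr, d_rl, d_rr in distances:
        total += R_F * abs(d_fl - d_fr) / (d_fl**2 * d_fr**2)
        total += R_M * abs(d_ml - d_mr) / (d_ml**2 * d_mr**2)
        total += R_R * abs(d_rl - d_rr) / (d_rl**2 * d_rr**2)
    return total
```

Each term vanishes when the left and right clearances are equal, and grows both with the left/right imbalance and as either wall gets closer.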

Equation (14) details the cost related to following the command directives issued from the high-level software modules. Here, only the reference for the steering speed (*ω*) is considered, since the reference for the machine's maximum speed (*vMAX*) is directly set as an upper bound restriction for the optimization function. Again, the cost function weight *Rω* is selected to obtain proper behavior.

$$Q\_{steering} = \sum\_{i=1}^{n} R\_{\omega} \left| \omega\_i - \overline{\omega} \right| \tag{14}$$

Finally, the smoothing component of the cost function (*Qsmooth*) is intended to ensure that the command has a controlled variation (i.e., it limits the change in the command between time steps), and that a newly computed optimal command vector has some degree of continuity after the time span for which it was selected. Accordingly, the *Qsmooth* component comprises two terms, as shown in Equation (15).

$$Q\_{smooth} = Q\_{acc} + Q\_{proj} \tag{15}$$

The first term assigns an additional cost to commands that cause a linear or steering acceleration above predefined limits, as stated in Equation (16), while the second term, shown in Equation (18), rewards commands that, when maintained past their time horizon, for up to twice as long as originally intended, will not cause a collision with a tunnel wall.

$$Q\_{acc} = \sum\_{i=2}^{n} R\_{\delta\omega} \cdot f(\omega\_i - \omega\_{i-1}, \Delta\omega\_m, \Delta\omega\_M) + R\_{\delta v} \cdot f(v\_i - v\_{i-1}, \Delta v\_m, \Delta v\_M) \tag{16}$$

$$f(x, x\_{\min}, x\_{\max}) = \begin{cases} x - x\_{\max} & \text{if } x\_{\max} < x \\ 0 & \text{if } x\_{\min} < x < x\_{\max} \\ x\_{\min} - x & \text{if } x < x\_{\min} \end{cases} \tag{17}$$

$$Q\_{proj} = \sum\_{j=1}^{m} R\_p \frac{\Delta \left[ x\_j, y\_j, \theta\_j \middle| v\_n, \omega\_n \right] \cdot \operatorname{crank} \left( x\_j, y\_j \right)}{\Delta \left[ x\_j, y\_j, \theta\_j \middle| v\_n, \omega\_n \right]} \tag{18}$$

$$\operatorname{crank}(x, y) = \begin{cases} 1 & \text{if position } (x, y) \text{ is in collision} \\ 0 & \text{if position } (x, y) \text{ is not in collision} \end{cases} \tag{19}$$

where $R\_{\delta\omega}$, $R\_{\delta v}$, and $R\_p$ are the cost weights, selected for proper behavior; $\Delta\omega\_m$, $\Delta\omega\_M$, $\Delta v\_m$, and $\Delta v\_M$ are the parameters for the minimum and maximum steering acceleration and linear acceleration, respectively; and $\Delta\left[x\_j, y\_j, \theta\_j \middle| v\_n, \omega\_n\right]$ is the displacement given by the kinematic model of the machine at time step $j$ when the last optimization command of the previous process is applied.
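The piecewise penalty of Equation (17) and the acceleration cost of Equation (16) can be sketched as follows. Function and argument names are hypothetical, and the boundary values $x = x\_{\min}$ and $x = x\_{\max}$ are assigned zero cost, which agrees with the limits of the neighboring branches.

```python
def band_penalty(x, x_min, x_max):
    """Equation (17): zero inside [x_min, x_max], linear outside."""
    if x > x_max:
        return x - x_max
    if x < x_min:
        return x_min - x
    return 0.0


def q_acc(omegas, speeds, R_dw, R_dv, dw_min, dw_max, dv_min, dv_max):
    """Equation (16): penalize step-to-step command changes outside limits."""
    total = 0.0
    for i in range(1, len(omegas)):
        total += R_dw * band_penalty(omegas[i] - omegas[i - 1], dw_min, dw_max)
        total += R_dv * band_penalty(speeds[i] - speeds[i - 1], dv_min, dv_max)
    return total
```

Command sequences whose successive differences stay within the limits incur no cost from this term, so it only shapes the optimization when a candidate command would change too abruptly.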

The outcome of the former process is a command vector for every time step in the selected timespan ($u = \left[u\_{t\_0}, \ldots, u\_{t\_f}\right]$), in which each element ($u\_{t\_i} = [u\_v, u\_\omega, t\_i]$) represents a speed and steering command pair, alongside the timestamp at which this command is to be executed.

#### 3.4.4. Command Executor

In contrast to the traditional philosophy of an MPC, the result of the *Guidance* module is not directly fed to the machine's actuators. It is first filtered and merged with previous results of the optimization process in order to always keep a consistent queue of commands that will sustain the operation of the vehicle for a short period of time. This filtering is carried out by the *Command Executor* module. The goal of this module is to ensure that the signals sent to the actuators will be appropriate, both for avoiding long-term damage to the devices involved and for keeping the operation running as expected.

The *Command Executor's* input is a "trajectory" of commands to be executed at specific times. Each command of the trajectory is inserted in a command queue. The queue insertion process entails finding the time at which the current command is to be inserted, erasing any command previously queued from that moment onwards. Then, the new command is appended at the end of the queue, effectively overriding outdated directives.
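The queue insertion described above can be sketched as follows. The class and method names are hypothetical, and commands are represented as `(t, u_v, u_omega)` tuples kept sorted by timestamp. "From that moment onwards" is interpreted here as removing every queued command whose timestamp is greater than or equal to that of the new command.

```python
from bisect import bisect_left


class CommandQueue:
    """Time-ordered queue in which newer commands override later ones."""

    def __init__(self):
        self._items = []  # (t, u_v, u_omega), sorted by t

    def insert(self, t, u_v, u_omega):
        # Find where time t falls in the queue, drop everything queued
        # from that time onwards, then append the new command at the end.
        idx = bisect_left([item[0] for item in self._items], t)
        del self._items[idx:]
        self._items.append((t, u_v, u_omega))

    def insert_trajectory(self, trajectory):
        for t, u_v, u_omega in trajectory:
            self.insert(t, u_v, u_omega)

    def times(self):
        return [item[0] for item in self._items]
```

For example, if commands are queued at 0, 1, 2, and 3 s and a new trajectory arrives with commands at 1.5 and 2.5 s, the queue ends up holding commands at 0, 1, 1.5, and 2.5 s.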

Before the *Command Executor* issues a new command to the machine actuators, the upcoming command is filtered. The velocity command *uv* is limited to a maximum value *uv*,*max* and a "dead zone" is applied to the steering command *uω*, namely:

$$u\_{\upsilon} = \begin{cases} u\_{\upsilon} & \text{if } u\_{\upsilon} < u\_{\upsilon, \text{max}} \\ u\_{\upsilon, \text{max}} & \text{if } u\_{\upsilon, \text{max}} < u\_{\upsilon} \end{cases} \tag{20}$$

$$u\_{\omega} = \begin{cases} u\_{\omega} & \text{if } u\_{\omega} < -u\_{\omega, \min} \text{ or } u\_{\omega, \min} < u\_{\omega} \\ 0 & \text{if } -u\_{\omega, \min} < u\_{\omega} < u\_{\omega, \min} \end{cases} \tag{21}$$

where *uω*,*min* is a predefined constant value for the steering command "dead zone" and *uv*,*max* is computed so that, if the machine were commanded to stop at the present time, it would effectively stop before the last queued command. That is, given a command queue with a total duration of *Qdt* seconds and a machine deceleration of *Dv* meters per second squared, *uv*,*max* = *Dv*·*Qdt*. Here, *Dv* is the mean deceleration of the machine when a full brake is applied, a parameter that can be determined experimentally.
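The filtering step of Equations (20) and (21), together with the speed bound *uv*,*max* = *Dv*·*Qdt*, can be sketched as follows. Function names and the numeric values in the usage note are hypothetical.

```python
def max_speed_command(D_v, Q_dt):
    """Highest speed from which the machine can stop within the queue
    duration Q_dt [s], given a mean braking deceleration D_v [m/s^2]."""
    return D_v * Q_dt


def filter_command(u_v, u_omega, u_v_max, u_omega_min):
    """Saturate the speed command (20) and apply the steering dead zone (21)."""
    u_v = min(u_v, u_v_max)
    if -u_omega_min < u_omega < u_omega_min:
        u_omega = 0.0
    return u_v, u_omega
```

With an assumed deceleration of 2 m/s² and a 1.5 s queue, the speed command is capped at 3 m/s, and a steering command of 0.01 inside a 0.05 dead zone is set to zero.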

A diagram of the described process is shown in Figure 11 for a single command of the input command trajectory. As mentioned, the same steps are executed for all elements.

**Figure 11.** Command management process of the command executor node.

#### **4. Development and Testing Methodology**

The methodology used for the development and testing of the proposed navigation system consists of four steps. The first is the development of the automation system in a simulated environment, which is a safe and cost-efficient platform for that purpose. The second is the use of scale models to verify the behaviors that are too complex or impractical to be tested in a simulated environment. The third is validation and testing on real equipment, using a safe location intended for that purpose. The fourth is validation and testing in a real operation environment under controlled conditions before moving on to production. Details are discussed in the following subsections.

#### *4.1. Development in a Simulated Environment*

The system was initially developed and tested in a simulated environment using Gazebo [29], integrated with ROS [30]. At first, an underground scenario with wide tunnels and perfect self-localization, using the real position from the simulator, was used as a testing environment. Once the system performed reasonably well, the wide tunnels were replaced by realistic tunnels built from laser scans acquired in a real underground mine. The realistic tunnels were much narrower and had irregular shapes. Finally, when the challenges of the new scenario were solved, the system was tested with a functional self-localization module, and with other factors that added complexity, such as a simulation of the LHD's controller, in order to validate all low-level communication and security schemes.

#### *4.2. Development Using Scale Models*

Not all of the functions of the system can be tested in a simulated environment, either because of the complexity of the problem, which makes the simulation approach impractical, or because not enough data are available to simulate certain interactions between the equipment and the environment. To address this issue, a scale model can be built in order to validate some of the design assumptions before implementing the solution on a commercial vehicle. The scale model needs to be sufficiently similar to the real equipment in the aspects related to the phenomena that need to be validated. In the case described here, a 1:5 scale model was built based on a commercial 5 yd³ LHD, shown in Figure 12, with an electric power train and hydraulic actuation for the steering and bucket movements, mimicking real equipment. A scaled-down ore extraction point was built, including ore from an actual mine. The scale model was used to perform navigation in the laboratory before installing the control system in the commercial LHD.

**Figure 12.** 1:5 scaled LHD built for testing and validation.
