*4.4. Representing the Control Action*

From the previous sections, we have the following ODE:

$$\begin{aligned} \dot{x}^{\{A\}} &= v\_{x}^{\{B\}} \cos(\theta) - v\_{y}^{\{B\}} \sin(\theta) \\ \dot{y}^{\{A\}} &= v\_{x}^{\{B\}} \sin(\theta) + v\_{y}^{\{B\}} \cos(\theta) \\ \dot{\theta}^{\{A\}} &= \omega \end{aligned} \tag{8}$$
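As a small illustration, Equation (8) can be evaluated directly; the function name below is ours, not from the paper:

```python
import math

def world_frame_velocity(vx_b, vy_b, omega, theta):
    """Map body-frame {B} velocities to world-frame {A} rates (Equation (8))."""
    x_dot = vx_b * math.cos(theta) - vy_b * math.sin(theta)
    y_dot = vx_b * math.sin(theta) + vy_b * math.cos(theta)
    return x_dot, y_dot, omega
```

For example, driving forward in the body frame while the platform heading is *θ* = *π*/2 yields motion purely along the world *y*-axis.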

We need to incorporate the different wheel configurations; that is, to add the different wheels into the formulation. From the discussion above, there are limitations on the turning of each wheel as well as singular points. The key question is how to constrain the control to ensure that it is feasible. Equation (6) above is a function that satisfies the property stated in Equation (3):

$$(\phi\_i, v\_i) = f(\mathbf{p}\_i, \mathbf{v}, \omega)\tag{9}$$

That is, Equation (6) computes the orientation and velocity of each wheel. However, it is necessary to ensure that the velocity *v<sub>i</sub>* of each wheel stays within bounds and does not exceed a maximum value. Similarly, there is also a limitation on the turning rate *ϕ*˙<sub>i</sub> of each wheel.

The first step is to find a more suitable representation of the control **u**<sub>p</sub> that is better aligned with the wheel configuration, such as the steering angle of the wheels. Therefore, we utilize the following control variable as proposed in [33]:

$$\mathbf{u}\_p = (v, \phi, \omega) \tag{10}$$

where *v* = √(*v<sub>x</sub>*² + *v<sub>y</sub>*²) and *ϕ* = atan2(*v<sub>y</sub>*, *v<sub>x</sub>*). In essence, we change the basis of the linear velocity to contain the linear velocity direction *ϕ* as well as the speed *v* along this direction. As it is easy to convert between the two linear velocity representations (*v<sub>x</sub>* = *v* cos(*ϕ*), *v<sub>y</sub>* = *v* sin(*ϕ*)), they are used interchangeably in this work.
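The conversion between the two representations can be sketched as follows (helper names are illustrative, not from the paper):

```python
import math

def to_polar(vx, vy):
    """(v_x, v_y) -> (v, phi): speed v = sqrt(v_x^2 + v_y^2), direction phi = atan2(v_y, v_x)."""
    return math.hypot(vx, vy), math.atan2(vy, vx)

def to_cartesian(v, phi):
    """(v, phi) -> (v_x, v_y): v_x = v cos(phi), v_y = v sin(phi)."""
    return v * math.cos(phi), v * math.sin(phi)
```

Round-tripping through both mappings recovers the original Cartesian velocity, which is what allows the two representations to be used interchangeably.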

If we now only have a linear velocity (*ω* = 0), each steering wheel angle is given by *ϕ*, that is, *ϕ<sub>i</sub>* = *ϕ* for each wheel *i*. Given this representation we can now add limitations on the change of the control *ϕ*˙ to better reflect the limitation on the change in steering angle. One key situation is when approaching the goal: the control action must be limited, as otherwise small changes in linear velocities would require extremely large changes in wheel steering angles *ϕ*˙<sub>i</sub>. Note that this is not the same problem as discussed above regarding the ICR, as when *ω* = 0 the ICR is at infinity and hence far from the wheels' locations **p**<sub>i</sub>.

Another benefit of the changed linear velocity representation is that the speed is separated from the orientation components. This allows a more intuitive way of formulating an acceleration profile. Furthermore, a profile on the change of linear direction, which strongly influences the steering wheel angles, can also be formulated. Specifically, the following limitations are added on the control variables:

$$\begin{aligned} -\dot{\upsilon}^{\text{max}} &\le \dot{\upsilon} \le \dot{\upsilon}^{\text{max}} \\ -\dot{\phi}^{\text{max}} &\le \dot{\phi} \le \dot{\phi}^{\text{max}} \\ -\dot{\omega}^{\text{max}} &\le \dot{\omega} \le \dot{\omega}^{\text{max}} \end{aligned} \tag{11}$$

Note that these limits are not directly connected to any physical limitation of the steer and drive wheels. For example, the change in steering wheel angle is often fast (on the platform used in the evaluation, the changes exceed 20 rad/s). However, these limitations are useful for obtaining smooth driving characteristics.
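A minimal sketch of how the rate bounds of Equation (11) can be enforced on a single control variable between controller ticks (the helper name and per-tick formulation are our illustration, not the paper's implementation):

```python
def rate_limit(desired, previous, max_rate, dt):
    """Clamp a control change so that |(new - previous) / dt| <= max_rate,
    mirroring the symmetric bounds of Equation (11). Illustrative helper."""
    max_step = max_rate * dt
    step = min(max(desired - previous, -max_step), max_step)
    return previous + step
```

Applying this to *v*, *ϕ*, and *ω* separately yields the three constraint pairs of Equation (11).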

There is also a maximum velocity limitation on the drive wheel. This limitation is also explicitly considered and is detailed in the next section.

#### *4.5. Limitations on the Control Action Due to Maximum Velocity of the Drive Wheels*

The linear and angular velocities of the wheels are connected. Informally speaking, the platform cannot both drive quickly and turn quickly at the same time because, as with differential-drive platforms, the same wheel velocities *v<sub>i</sub>* are utilized both to obtain the linear velocity **v** and the angular velocity *ω* of the platform. However, for a differential-drive platform the ICR point lies on the line that connects the left and right wheels, while for the platform used here the ICR can be placed arbitrarily. The maximum rotational velocity a wheel can achieve without any linear velocity (that is, with the ICR at (0, 0)) is given by:

$$\omega\_i^{\text{max}} = \frac{v\_i^{\text{max}}}{||\mathbf{p}\_i||} \tag{12}$$

It is clear that the rotational speed depends greatly on the distance between the ICR and the wheels. For a differential-drive platform moving forward while turning right, it is the left wheel that reaches the *v<sub>i</sub>*<sup>max</sup> boundary first. Hence, it is the wheel furthest from the ICR that imposes the limitation. In principle, for the platform at hand, it would be possible to have a larger linear velocity with (*v<sub>x</sub>*, *v<sub>y</sub>*) = (*v*, 0) (along the *x*-axis) than with (*v<sub>x</sub>*, *v<sub>y</sub>*) = (*v*/√2, *v*/√2), as the distance between the ICR and the wheel furthest away is longer in the latter case.
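Equation (12) is straightforward to evaluate; for instance, a wheel mounted 0.5 m from the platform center with a 1 m/s drive limit allows at most 2 rad/s of pure rotation. A sketch (function name is ours):

```python
import math

def max_spin_rate(v_wheel_max, p):
    """omega_i^max = v_i^max / ||p_i|| (Equation (12)); p is the wheel position
    in the platform frame."""
    return v_wheel_max / math.hypot(p[0], p[1])
```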

In the proposed approach this difference is neglected, and we instead consider a combined bound on the linear and rotational velocities as follows:

$$\begin{aligned} -\omega^{\text{max}} \le \omega - \frac{v}{d} \le \omega^{\text{max}}\\ -\omega^{\text{max}} \le \omega + \frac{v}{d} \le \omega^{\text{max}} \end{aligned} \tag{13}$$

where *d* = *d<sub>i</sub>* = ||**p**<sub>i</sub>||, which is assumed to be the same for all wheels *i*. In practice, this corresponds to always assuming the worst-case scenario in which one wheel is at the furthest possible distance from the ICR.
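A feasibility check for the combined bound of Equation (13) can be sketched as (illustrative helper, not from the paper):

```python
def combined_velocity_ok(v, omega, d, omega_max):
    """Worst-case combined linear/rotational bound of Equation (13),
    with d = ||p_i|| assumed equal for all wheels."""
    return (-omega_max <= omega - v / d <= omega_max
            and -omega_max <= omega + v / d <= omega_max)
```

For non-negative *v*, the two constraint pairs together amount to |*ω*| + *v*/*d* ≤ *ω*<sup>max</sup>, making the trade-off between speed and turning explicit.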

#### **5. Defining the Optimization Problem**

This section outlines the non-linear optimization problem that forms the basis of the MSDU Local Planner. The core objective of the MSDU Local Planner is to obtain a locally feasible trajectory that drives the platform towards a goal. This goal is obtained from a global planner that is updated periodically (on the order of 1 Hz). Hence, despite its local nature, the local plan does not have to consider getting stuck.

We here formulate a non-linear optimization problem that generates a feasible trajectory. Due to the computational complexity that comes with non-linear solving, care has to be taken to formulate an optimization problem that is fast enough to solve. This impacts how the problem is formulated, as well as restricting the size of certain parameters; for example, the look-ahead distance and the sampling resolution. On the positive side, the non-linear formulation gives us much more freedom in selecting the objective function.

#### *5.1. Problem Formulation*

Two different approaches are used to steer the trajectory generation by the optimization. The first is to add factors to an objective function. The second is to impose constraints on the different variables. Constraints are a very powerful and intuitive way to steer the optimization, but can be problematic if constraints are imposed that simply cannot be satisfied. One example would be the goal pose **P**<sub>goal</sub> that we would like to arrive at. Typically, given other constraints on, for example, the acceleration, velocity, and turning speed limits, it is not guaranteed that we can reach the goal. Instead, we place the distance from the current and future poses to the goal pose in the objective function. If the goal cannot be reached, that is acceptable: no constraints are violated, and at the same time the minimization of the cost function (that is, the distance to the goal) drives the vehicle towards the goal.

The state **s** consists of the vehicle pose (*x*, *y*, *θ*) and the linear velocity, direction, and angular velocity (*v*, *ϕ*, *ω*), whereas the optimization control variables **u** are (*dv*, *dϕ*, *dω*). Note that the control action **u**<sub>p</sub> used to drive the vehicle is actually part of the state; however, we use additional optimization variables for the derivatives of the control values to fulfill additional requirements such as limiting the maximum permitted acceleration; see Equation (10).

The model of the vehicle dynamics (**s**˙ = *f*(**s**, **u**)) is described as:

$$\begin{aligned} \dot{x} &= v \cos(\varphi) \cos(\theta) - v \sin(\varphi) \sin(\theta) \\ \dot{y} &= v \cos(\varphi) \sin(\theta) + v \sin(\varphi) \cos(\theta) \\ \dot{\theta} &= \omega \\ \dot{v} &= dv \\ \dot{\varphi} &= d\varphi \\ \dot{\omega} &= d\omega \end{aligned} \tag{14}$$
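The model in Equation (14) translates directly into code; the following sketch uses our own naming:

```python
import math

def dynamics(s, u):
    """Continuous-time model s_dot = f(s, u) of Equation (14).
    State s = (x, y, theta, v, phi, omega); control u = (dv, dphi, domega)."""
    x, y, theta, v, phi, omega = s
    dv, dphi, domega = u
    return (
        v * math.cos(phi) * math.cos(theta) - v * math.sin(phi) * math.sin(theta),
        v * math.cos(phi) * math.sin(theta) + v * math.sin(phi) * math.cos(theta),
        omega,
        dv,
        dphi,
        domega,
    )
```

By the angle-addition identity, the first two rows equal *v* cos(*ϕ* + *θ*) and *v* sin(*ϕ* + *θ*): the platform translates along direction *ϕ* expressed in the body frame.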

Given the dynamics, we formulate a constrained optimal control problem (OCP):

$$\begin{aligned} \text{minimize}\_{\mathbf{s},\mathbf{u}} \qquad & \quad \phi(T) + \int\_{0}^{T} l(\mathbf{s}(t), \mathbf{u}(t))\, dt \\ \text{subject to} \qquad & \quad \mathbf{s}(0) = \hat{\mathbf{s}}\_{0} \\ & \quad \dot{\mathbf{s}}(t) = f(\mathbf{s}(t), \mathbf{u}(t)), \quad t \in [0, T] \\ & \quad h(\mathbf{s}(t), \mathbf{u}(t)) \le 0, \quad t \in [0, T] \\ & \quad d(\mathbf{s}(t), \mathbf{o}, \mathbf{c}) \le 0, \quad t \in [0, T], \mathbf{o} \in \mathcal{O}, \mathbf{c} \in \mathcal{C} \end{aligned} \tag{15}$$

where *T* is the horizon length in seconds, *φ*(*T*) is the terminal cost, *l* is the cost at time *t*, **ŝ**<sub>0</sub> is the initial state, *f* is the vehicle dynamics function (Equation (14)), *h* denotes the path constraints, containing limits on the inputs **u** (such as maximum accelerations) as well as pure state constraints on **s** (such as bounds on maximum velocities), and finally *d* provides a means to ensure collision-free state poses given a set of obstacle points O as well as a set of circles C representing the shape of the vehicle. Both the obstacle point **o** = [*o<sub>x</sub>*, *o<sub>y</sub>*] and the circle **c** = [*c<sub>x</sub>*, *c<sub>y</sub>*, *c<sub>R</sub>*] are given in the vehicle frame, where *c<sub>R</sub>* is the radius of the circle.

To solve the OCP defined above, we discretize it into a non-linear program (NLP) using multiple shooting [34]. The trajectory consists of vehicle states at discrete timestamps and holds *N* steps covering *T* seconds, which means each increment brings us *dt* = *T*/*N* seconds into the future.
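The discrete dynamics *F* (Equation (20)) are not reproduced in this section; as an illustration, a common choice in multiple-shooting formulations is a fixed-step Runge-Kutta-4 integrator, sketched here under that assumption:

```python
def rk4_step(f, s, u, dt):
    """One RK4 step approximating s_{k+1} = F(s_k, u_k, dt).
    NOTE: assumed integrator; the paper's exact discretization (Equation (20))
    is not shown in this section."""
    k1 = f(s, u)
    k2 = f([si + 0.5 * dt * ki for si, ki in zip(s, k1)], u)
    k3 = f([si + 0.5 * dt * ki for si, ki in zip(s, k2)], u)
    k4 = f([si + dt * ki for si, ki in zip(s, k3)], u)
    return [si + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]
```

Chaining *N* such steps, with **s**<sub>k+1</sub> constrained to match the integrated value, yields the shooting constraints of the NLP.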

The discrete decision variable is *ζ* = {**s**<sub>i</sub>, **u**<sub>i</sub>}<sup>N</sup><sub>i=1</sub>, and the non-linear program is written as:

$$\begin{aligned} \text{minimize}\_{\zeta} \qquad & \phi(\zeta\_N) + \sum\_{k=0}^{N-1} l(\zeta\_k) \\ \text{subject to} \qquad & \mathbf{s}\_0 = \hat{\mathbf{s}}\_0 \\ & \mathbf{s}\_{k+1} = F(\mathbf{s}\_k, \mathbf{u}\_k, dt), \quad k = 0 \dots N - 1 \\ & h(\mathbf{s}\_k, \mathbf{u}\_k) \le 0, \quad k = 0 \dots N \\ & d(\mathbf{s}\_k, \mathbf{o}, \mathbf{c}) \le 0, \quad k = 0 \dots N, \mathbf{o} \in \mathcal{O}, \mathbf{c} \in \mathcal{C} \end{aligned} \tag{16}$$

where the objective function is described in Equation (19), *F* is the discrete model of the dynamics (see Equation (20)), the path constraints *h* are given in Equations (21) and (22) and finally the constraints to ensure collision-free motions *d* are described in Equation (24).

As we are primarily interested in the next control action to take, we follow the classical model-predictive control (MPC) scheme and use the obtained decision variables *ζ* to extract the next control action. Depending on the inherent lag in the system, it is also possible to take not the first available control action but a future one.

Because the problem is formulated as a standard non-linear program, it can be integrated straightforwardly into existing non-linear solvers. In this work, the formulation was implemented with CasADi [35], which here utilizes the Ipopt library [36] to solve the posed non-linear problem.

The objective and constraints are discussed further in the following sections.

#### *5.2. Inputs and Outputs*

As described above, we continuously receive a global plan (at approximately 1 Hz), from which we extract the next local "goal" based on our current localization estimate. The localization estimate is also provided continuously at 50 Hz; the localization system itself runs slower, at a rate dependent on the translational and rotational distance traveled, but it is augmented with odometry readings obtained at 50 Hz. To simplify the formulation, the local goal is converted into the robot frame {*B*}, which allows us to assume that we always start at pose (0, 0, 0). The continuous sensory input (at 10 Hz) is already provided in the robot frame {*B*}, as the sensory setup is mounted on the robot itself.

The controller, or local planner, is queried at 10 Hz, at which point the latest received plan, localization estimate, and sensory data are used.

The output is the next control action to be executed, **u**<sub>p</sub> = (*v<sub>x</sub>*, *v<sub>y</sub>*, *ω*).

#### *5.3. Objectives*

As discussed above, the force that drives the robot towards the goal **g** = (*g<sub>x</sub>*, *g<sub>y</sub>*, *g<sub>θ</sub>*) lies in the cost objective, which contains the distance between the goal and each pose in the *N*-step trajectory (*x*, *y*, *θ*)<sub>1...N</sub>. The goal part of the objective is:

$$\mathbf{J}^{\text{goal}} = \sum\_{i=1}^{N} w\_i^x (\mathbf{g}\_x - \mathbf{x}\_i)^2 + w\_i^y (\mathbf{g}\_y - \mathbf{y}\_i)^2 + w\_i^\theta (\mathbf{g}\_\theta - \theta\_i)^2 \tag{17}$$

where we have different weighting factors *w*<sub>1...N</sub><sup>x</sup>, *w*<sub>1...N</sub><sup>y</sup> and *w*<sub>1...N</sub><sup>θ</sup>. These weights can be selected and tuned as needed, but in the evaluation presented in this paper all position weights were set to be the same. It is also possible to assign a lower cost to the intermediate weights in the range (1 ... *N* − 1) compared to the terminal state weights *w<sub>N</sub>*. The core idea is that we want to steer the optimization towards the goal as quickly as possible. Additional constraints to limit the velocities and accelerations were also added, as discussed in Section 5.4 below.
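Equation (17) amounts to a weighted sum of squared goal errors along the trajectory; an illustrative sketch (names are ours):

```python
def goal_cost(goal, poses, w_x, w_y, w_theta):
    """Goal term J^goal of Equation (17): weighted squared distance from each
    trajectory pose (x, y, theta) to the goal g = (g_x, g_y, g_theta).
    The weights are per-step lists of length N."""
    gx, gy, gtheta = goal
    return sum(w_x[i] * (gx - x) ** 2
               + w_y[i] * (gy - y) ** 2
               + w_theta[i] * (gtheta - th) ** 2
               for i, (x, y, th) in enumerate(poses))
```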

Another cost relates to the magnitude of the control actions utilized. This was found to be particularly important for limiting the amount of turning when driving the platform close to the goal. The cost on the decision variables, related to the derivatives of the generated control output, is defined as:

$$\mathbf{J}^{\text{control}} = \sum\_{i=1}^{N} w\_i^{dv} (dv\_i)^2 + w\_i^{d\varphi} (d\varphi\_i)^2 + w\_i^{d\omega} (d\omega\_i)^2 \tag{18}$$

Our objective is the sum of the above and can be rewritten as:

$$\boldsymbol{\phi}(\boldsymbol{\zeta}\_N) + \sum\_{k=0}^{N-1} \boldsymbol{l}(\boldsymbol{\zeta}\_k) = \sum\_{i=1}^N \mathbf{x}\_i^T \mathbf{Q}\_i \mathbf{x}\_i + \sum\_{i=1}^N \mathbf{u}\_i^T \mathbf{R}\_i \mathbf{u}\_i \tag{19}$$

where **x**<sub>i</sub> = [*g<sub>x</sub>* − *x<sub>i</sub>*, *g<sub>y</sub>* − *y<sub>i</sub>*, *g<sub>θ</sub>* − *θ<sub>i</sub>*]<sup>T</sup>, and **Q**<sub>i</sub> together with **R**<sub>i</sub> are diagonal weighting matrices.
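Since **Q**<sub>i</sub> and **R**<sub>i</sub> are diagonal, each stage term of Equation (19) reduces to the weighted squared sums of Equations (17) and (18); a minimal sketch (illustrative naming):

```python
def quadratic_stage_cost(err, q_diag):
    """err^T Q err for a diagonal Q (one stage of Equation (19)); identical to
    one weighted squared-error term of Equation (17) or (18)."""
    return sum(q * e ** 2 for q, e in zip(q_diag, err))
```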
