9: **end while**

### **2. Proposed Approach**

#### *2.1. Symbols and Notations*

We use lowercase normal-font letters to represent scalars, while their bold variants represent vectors. Matrices are represented by uppercase bold letters. The subscript *t* denotes the time stamp of variables and vectors, and the superscript *T* denotes the transpose of a matrix.

#### *2.2. Argmin Differentiation for Unconstrained Parametric Optimization*

We consider the optimal joint trajectories to be the solution of the following bound-constrained optimization problem with parameter **p**.

$$\boldsymbol{\xi}^*(\mathbf{p}) = \arg\min_{\boldsymbol{\xi}} f(\boldsymbol{\xi}, \mathbf{p}) \tag{9a}$$

$$\boldsymbol{\xi}_{lb} \le \boldsymbol{\xi} \le \boldsymbol{\xi}_{ub} \tag{9b}$$

We are interested in computing the Jacobian of *ξ*<sup>∗</sup>(**p**) with respect to **p**. If we ignore the bound constraints for now, we can follow the approach presented in [4] to obtain it in the following form.

$$\nabla_{\mathbf{p}} \boldsymbol{\xi}^* = - \left(\nabla_{\boldsymbol{\xi}}^2 f(\boldsymbol{\xi}^*, \mathbf{p})\right)^{-1} \begin{bmatrix} \nabla_{\boldsymbol{\xi}, p_1} f(\boldsymbol{\xi}^*, \mathbf{p}), & \dots, & \nabla_{\boldsymbol{\xi}, p_n} f(\boldsymbol{\xi}^*, \mathbf{p}) \end{bmatrix} \tag{10}$$
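For completeness, (10) can be recovered from the first-order optimality condition. The following is a brief sketch, assuming that *f* is twice continuously differentiable and that its Hessian is invertible at the optimum (assumptions that are not stated explicitly above). At an unconstrained minimizer, the gradient with respect to *ξ* vanishes for all **p** in a neighborhood:

$$\nabla_{\boldsymbol{\xi}} f(\boldsymbol{\xi}^*(\mathbf{p}), \mathbf{p}) = \mathbf{0}$$

Differentiating this identity with respect to each component *p<sub>i</sub>* and applying the chain rule gives

$$\nabla_{\boldsymbol{\xi}}^2 f \, \frac{\partial \boldsymbol{\xi}^*}{\partial p_i} + \nabla_{\boldsymbol{\xi}, p_i} f = \mathbf{0} \quad \Longrightarrow \quad \frac{\partial \boldsymbol{\xi}^*}{\partial p_i} = - \left(\nabla_{\boldsymbol{\xi}}^2 f\right)^{-1} \nabla_{\boldsymbol{\xi}, p_i} f$$

Stacking these columns for *i* = 1, ..., *n* yields the Jacobian in (10).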

Using (10), we can derive a local model for the optimal solution corresponding to a perturbation Δ**p** as

$$\boldsymbol{\xi}^*(\overline{\mathbf{p}}) = \boldsymbol{\xi}^*(\mathbf{p}) + \nabla_{\mathbf{p}} \boldsymbol{\xi}^* \overbrace{(\overline{\mathbf{p}} - \mathbf{p})}^{\Delta \mathbf{p}} \tag{11}$$

Intuitively, (11) signifies a step of length Δ**p** along the gradient direction. However, for (11) to be valid, the step length needs to be small. In other words, the perturbed parameter **p̄** needs to be in the vicinity of **p**. Although it is difficult to characterize the notion of "small" mathematically, in the following we attempt a practical definition based on the optimal cost.

**Definition 1.** *A valid* |Δ**p**| *is one that satisfies the following relationship*

$$f(\boldsymbol{\xi}^*(\overline{\mathbf{p}} = \mathbf{p} + \Delta\mathbf{p}), \mathbf{p} + \Delta\mathbf{p}) \le f(\boldsymbol{\xi}^*, \mathbf{p} + \Delta\mathbf{p}) \tag{12}$$

The underlying intuition in (12) is that, for the parameter **p** + Δ**p**, the perturbed solution should lead to a cost no higher than that of the unadapted solution *ξ*<sup>∗</sup> evaluated at the same perturbed parameter.
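The interplay between (10), (11), and Definition 1 can be illustrated on a scalar toy problem. The sketch below does not use the trajectory cost of this paper: the cost f(ξ, p) = ½(ξ − p)² + ¼ξ⁴, the Newton solver, and all function names are assumptions, chosen so that the required derivatives are available in closed form.

```python
def cost(xi, p):
    """Toy cost f(xi, p) (assumed for illustration, not the paper's cost)."""
    return 0.5 * (xi - p) ** 2 + 0.25 * xi ** 4


def solve(p, xi=0.0, iterations=50):
    """Minimize the toy cost over xi with Newton's method on df/dxi = 0."""
    for _ in range(iterations):
        gradient = (xi - p) + xi ** 3
        hessian = 1.0 + 3.0 * xi ** 2  # always >= 1, so the division is safe
        xi -= gradient / hessian
    return xi


def argmin_jacobian(xi_star):
    """Scalar instance of (10): -(d2f/dxi2)^-1 * d2f/(dxi dp)."""
    hessian = 1.0 + 3.0 * xi_star ** 2
    mixed = -1.0  # derivative of df/dxi with respect to p
    return -mixed / hessian


def local_model(xi_star, p, p_bar):
    """First-order prediction of the perturbed optimum, as in (11)."""
    return xi_star + argmin_jacobian(xi_star) * (p_bar - p)


def is_valid_step(xi_star, p, delta_p):
    """Check the inequality (12) of Definition 1 for a given perturbation."""
    p_bar = p + delta_p
    predicted = local_model(xi_star, p, p_bar)
    return cost(predicted, p_bar) <= cost(xi_star, p_bar)


p = 1.0
xi_star = solve(p)
for delta_p in (0.1, 0.5, 2.0, 8.0):
    predicted = local_model(xi_star, p, p + delta_p)
    resolved = solve(p + delta_p)
    print(delta_p, predicted, resolved, is_valid_step(xi_star, p, delta_p))
```

On this toy cost, a small perturbation passes the test in (12) while a sufficiently large one fails it, and the gap between the predicted and re-solved optimum grows with the size of the perturbation.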

#### *2.3. Line Search and Incremental Adaptation*

Algorithm 1 couples the concept from Definition 1 with a basic line search to incrementally adapt (11) to a large Δ**p**. The algorithm begins by initializing the optimal solution *<sup>k</sup>ξ* and the parameter *<sup>k</sup>***p** with prior values for iteration *k* = 0. These variables are then used to initialize the Hessian and Jacobian matrices. The core computation takes place in line 2, wherein we compute the least amount of scaling that needs to be applied to the step length *<sup>k</sup>*Δ**p** = *<sup>k</sup>***p** − **p** to guarantee a reduction in the cost. At line 3, we update the optimal solution based on the step length *η* *<sup>k</sup>*Δ**p** obtained in line 2, followed by a simple projection at line 4 to satisfy the minimum and maximum bounds. At line 5, we perform the so-called forward roll-out of the solution to update the parameter set. For example, if the parameter **p** models the position of the end-effector at the final time instant of a trajectory, then line 5 computes how close *<sup>k+1</sup>ξ*<sup>∗</sup> takes the end-effector to the perturbed goal position **p̄**. On lines 7 and 8, we update the Hessian and Jacobian matrices based on the updated parameter set and optimal solution.
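The loop described above can be sketched on a scalar toy problem. This is a simplified illustration rather than Algorithm 1 itself: the cost function, the backtracking rule (halving *η* until the test of Definition 1 passes), the sign convention Δ**p** = target − current parameter, and the replacement of the forward roll-out by a direct parameter update are all assumptions made for the example.

```python
def cost(xi, p):
    """Toy cost f(xi, p) (assumed for illustration, not the paper's cost)."""
    return 0.5 * (xi - p) ** 2 + 0.25 * xi ** 4


def argmin_jacobian(xi):
    """Scalar instance of (10) for the toy cost."""
    hessian = 1.0 + 3.0 * xi ** 2  # d2f/dxi2
    mixed = -1.0                   # d2f/(dxi dp)
    return -mixed / hessian


def adapt(xi, p, p_bar, lower, upper, max_iter=20, tol=1e-9, min_eta=1e-6):
    """Incrementally adapt the solution xi from parameter p towards p_bar."""
    for _ in range(max_iter):
        delta_p = p_bar - p  # remaining perturbation (assumed sign convention)
        if abs(delta_p) <= tol:
            break
        jacobian = argmin_jacobian(xi)  # refreshed at every iteration
        eta = 1.0
        while True:
            # Candidate parameter; stands in for the forward roll-out.
            p_new = p + eta * delta_p
            # Scaled step of the local model (11), then projection on bounds.
            xi_new = min(max(xi + eta * jacobian * delta_p, lower), upper)
            # Accept once the candidate is no worse than the current solution
            # at the candidate parameter, or when eta becomes negligible.
            if cost(xi_new, p_new) <= cost(xi, p_new) or eta <= min_eta:
                break
            eta *= 0.5
        xi, p = xi_new, p_new
    return xi, p


xi_adapted, p_reached = adapt(0.0, 0.0, 3.0, lower=-5.0, upper=5.0)
print(xi_adapted, p_reached, cost(xi_adapted, 3.0), cost(0.0, 3.0))
```

Because each step relies on a first-order model, the adapted solution is in general only an approximation of the re-solved optimum; the line search merely ensures that every accepted step satisfies the test of Definition 1.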

### **3. Task-Constrained Joint Trajectory Optimization**

This section formulates various examples of the task-constrained trajectory optimization problem and uses the previous section's results for optimal adaptation of joint trajectories under task perturbation. To formulate the underlying costs, we adopt the way-point parametrization and represent the joint angles at time *t* as **q**<sub>*t*</sub>. Furthermore, we will use (**x**<sub>*e*</sub>(**q**<sub>*t*</sub>), **o**<sub>*e*</sub>(**q**<sub>*t*</sub>)) to describe the end-effector position and its orientation (in terms of Euler angles), respectively.

#### *3.1. Orientation Constrained Interpolation between Joint Configurations*

The task here is to compute an interpolation trajectory between a given initial joint configuration **q**<sub>0</sub> and a final joint configuration **q**<sub>*m*</sub> while maintaining a specified orientation **o**<sub>*d*</sub> for the end-effector at all times. We model it through the following cost function.

$$\sum_{t} f_{s}(\mathbf{q}_{t-k:t}) + \left\| \begin{bmatrix} \mathbf{q}_{t_{1}} - \mathbf{q}_{0} \\ \mathbf{q}_{t_{m}} - \mathbf{q}_{m} \end{bmatrix} \right\|_{2}^{2} + \sum_{t} \left\| \mathbf{o}_{e}(\mathbf{q}_{t}) - \mathbf{o}_{d} \right\|_{2}^{2} \tag{13}$$

The first term of the cost function models smoothness in terms of the joint angles from *t* − *k* to *t* [15]. For example, for *k* = 1, the smoothness is defined as the first-order finite difference of the joint positions at subsequent time instants. Similarly, *k* = 2 and *k* = 3 model higher-order smoothness through second- and third-order finite differences, respectively. We consider all three finite differences in our smoothness cost term. The second term ensures that the interpolation trajectory is close to the given initial and final points. The final term in the cost function maintains the required orientation of the end-effector.
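As an illustration of the smoothness term, the sketch below sums squared finite differences of orders 1 to 3 along a way-point trajectory. The exact form and weighting of *f<sub>s</sub>* follow [15] and are not reproduced here; the unweighted sum of squares and the function names are assumptions.

```python
def finite_difference(values, order):
    """Apply the first-order difference `order` times to a sequence."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def smoothness_cost(trajectory, orders=(1, 2, 3)):
    """Sum of squared finite differences over all joints and given orders.

    `trajectory` is a list of way-points, each a list of joint angles.
    """
    total = 0.0
    for joint_over_time in zip(*trajectory):  # one joint at a time
        for k in orders:
            differences = finite_difference(list(joint_over_time), k)
            total += sum(d * d for d in differences)
    return total


# A single joint moving at constant velocity: only the first-order
# differences are non-zero.
constant_velocity = [[0.0], [1.0], [2.0], [3.0]]
print(smoothness_cost(constant_velocity))
```

Penalizing the higher-order differences as well discourages abrupt changes in velocity and acceleration, not just large displacements between consecutive way-points.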

We can cast (13) in the form of (9a) by defining *ξ* = (**q**<sub>*t*<sub>1</sub></sub>, **q**<sub>*t*<sub>2</sub></sub>, ..., **q**<sub>*t*<sub>*m*</sub></sub>). The bounds correspond to the maximum and minimum limits on the joint angles at each time instant. We define the parameter set as **p** = (**q**<sub>0</sub>, **q**<sub>*m*</sub>). That is, we are interested in computing the adaptation when either or both of **q**<sub>0</sub> and **q**<sub>*m*</sub> are perturbed.

#### Applications

Adaptation of the solution *ξ*<sup>∗</sup> of (13) to different **q**<sub>0</sub>, **q**<sub>*m*</sub> has applications in learning-from-demonstration settings, where the human provides only the initial and/or final joint configuration and the manipulator then computes a smooth interpolation trajectory between the boundary configurations by adapting a previously computed trajectory.

Figure 1 presents an example of the adaptation discussed above. The previously computed trajectory is shown in blue. This is then adapted to two different final joint configurations. The trajectory computed through Algorithm 1 is shown in green, while that obtained by re-solving the optimization problem (with warm-starting) is shown in red.

#### *3.2. Orientation-Constrained Trajectories through Way-Points*

The task in this example is to make the end-effector move through given way-points while maintaining the orientation at **o**<sub>*d*</sub>. Let **x**<sub>*d*<sub>*t*</sub></sub> represent the desired way-point of the end-effector at time *t*. Thus, we can formulate the following cost function for the current task.

$$\sum_{t} f_{s}(\mathbf{q}_{t-k:t}) + \sum_{t} \left\| \mathbf{o}_{e}(\mathbf{q}_{t}) - \mathbf{o}_{d} \right\|_{2}^{2} + \sum_{t} \left\| \mathbf{x}_{e}(\mathbf{q}_{t}) - \mathbf{x}_{d_{t}} \right\|_{2}^{2} \tag{14}$$

The first two terms in the cost function are the same as in the previous example. The change appears in the final term, which minimizes the *l*<sub>2</sub> norm of the distance between the end-effector and the desired way-point. The definition of *ξ* remains the same as before. However, the parameter set is now defined as **p** = (**x**<sub>*d*<sub>1</sub></sub>, **x**<sub>*d*<sub>2</sub></sub>, ..., **x**<sub>*d*<sub>*m*</sub></sub>).
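The two task terms of (14) can be evaluated as follows for a planar two-link arm. The arm model, the unit link lengths, the scalar orientation (taken as the sum of the joint angles), and the function names are hypothetical and serve only to make the example self-contained; the manipulator used in the experiments is not modelled here.

```python
import math


def forward_kinematics(q, link_lengths=(1.0, 1.0)):
    """Planar two-link arm: end-effector position (x, y) and orientation."""
    l1, l2 = link_lengths
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return (x, y), q1 + q2


def task_cost(trajectory, waypoints, desired_orientation):
    """Orientation and way-point terms of (14), summed over time."""
    total = 0.0
    for q, x_d in zip(trajectory, waypoints):
        (x, y), orientation = forward_kinematics(q)
        total += (orientation - desired_orientation) ** 2
        total += (x - x_d[0]) ** 2 + (y - x_d[1]) ** 2
    return total


# One way-point reached exactly with a horizontal end-effector, then the
# same configuration evaluated against a shifted way-point.
q = (math.pi / 2, -math.pi / 2)
print(task_cost([q], [(1.0, 1.0)], 0.0))
print(task_cost([q], [(1.5, 1.0)], 0.0))
```

In this setting the way-points play the role of the parameter **p**: shifting a way-point changes the cost seen by a fixed trajectory, which is what the adaptation has to compensate for.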

#### Applications

**Collision Avoidance:** As shown in Figure 2, a key application of the adaptation problem discussed above is collision avoidance. A reactive planner such as [16] can provide new via-points for the manipulator to avoid collision. Algorithm 1 can then use the cost function (14) to adapt the prior trajectory shown in blue to that shown in green. For comparison, the trajectory obtained by re-solving the trajectory optimization is shown in red.

**Figure 1.** Prior trajectory shown in blue is used to adapt the joint motions to move towards two different final joint configurations while maintaining the horizontal orientation of the end-effector at all times.

**Figure 2.** Collision avoidance by perturbing the mid-point of the prior computed end-effector trajectory.

**Human–Robot Handover:** Algorithm 1 with cost function (14) also finds application in human–robot handover tasks. An example is shown in Figure 3, where the manipulator adapts the prior trajectory (blue) to a new estimate of the handover position. As before, the trajectory obtained with Algorithm 1 is shown in green, while the one shown in red corresponds to a re-solve of the trajectory optimization with warm-start initialization.

**Figure 3.** Perturbation in the final position of the end-effector.
