Maximum Principle and Second-Order Optimality Conditions in Control Problems with Mixed Constraints

Arutyunov, Aram V.; Karamzin, Dmitry Yu.; Pereira, Fernando Lobo

doi:10.3390/axioms11020040


by Aram V. Arutyunov ¹,†, Dmitry Yu. Karamzin ²,*,† and Fernando Lobo Pereira ³,*,†

¹ V.A. Trapeznikov Institute of Control Sciences, Russian Academy of Sciences, 117997 Moscow, Russia

² Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, 119333 Moscow, Russia

³ Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Axioms 2022, 11(2), 40; https://doi.org/10.3390/axioms11020040

Submission received: 28 December 2021 / Revised: 11 January 2022 / Accepted: 15 January 2022 / Published: 20 January 2022

(This article belongs to the Special Issue Calculus of Variations, Optimal Control, and Mathematical Biology: A Themed Issue Dedicated to Professor Delfim F. M. Torres on the Occasion of His 50th Birthday)


Abstract

This article concerns the optimality conditions for a smooth optimal control problem with endpoint and mixed constraints. Under the normality assumption, which corresponds to the full-rank condition of the associated controllability matrix, a simple proof of the second-order necessary optimality conditions based on the Robinson stability theorem is derived. The main novelty of this approach compared to the known results in this area is that only a local regularity with respect to the mixed constraints, that is, a regularity in an ε-tube about the minimizer, is required instead of the conventional stronger global regularity hypothesis. This affects the maximum condition: the normal set of Lagrange multipliers in question satisfies the maximum principle, albeit with a modified maximum condition in which the maximum is taken over a reduced feasible set. In the second part of this work, we address the case of abnormal minimizers, that is, the case in which the full-rank condition on the controllability matrix fails. The same type of reduced maximum condition is obtained.

Keywords:

optimal control; maximum principle; mixed constraints

1. Introduction

In this article, second-order necessary optimality conditions for an optimal control problem with mixed equality and inequality constraints are investigated. Under the normality condition, which is ensured by the full rank of the controllability matrix, a rather simple proof of the optimality conditions is proposed based on Robinson’s theorem on the metric regularity for set-valued mappings. For the case in which the normality condition is violated, the second-order conditions are derived based on the index approach. This means that some reduced cone of Lagrange multipliers is invoked, which is defined by using the index of the quadratic form of the Lagrange function; see, e.g., [1,2,3].

In [4], two notions of regularity of an admissible trajectory with respect to mixed constraints, strong and weak, were considered. Strong regularity means that the constraint qualification, or the so-called Robinson condition, is satisfied for all time-points and for all admissible control values. This corresponds to the regularity condition in the classical sense. Weak regularity means that this condition is satisfied merely in some neighborhood of the optimal process. By their nature, these two concepts correspond, respectively, to global and local regularity settings. Under weak regularity, a refined maximum condition of Pontryagin’s type has been obtained, in which the maximum is taken over the closure of the set of regular points of the feasible set rather than over the entire feasible set. In this article, the results of [4] are carried over to the second-order conditions in the case of a global minimum.

The literature on optimality conditions for optimal control problems with mixed constraints is extensive. Concerning the study of mixed constraints in this context, we note the works [5,6,7,8,9,10,11,12]. Regarding the second-order conditions in mixed constrained problems, one may consider, e.g., [3,13,14] and the bibliography cited therein. These selective lists of publications are far from exhaustive.

This work is organized as follows. In the next section, the problem formulation is presented, together with main definitions and notation. In Section 3, the issue of normality is discussed. In Section 4, the main result of this work—the normal maximum principle and second-order optimality conditions—is formulated and proved. In Section 5, the abnormal situation is taken into consideration, and the result of the previous section is refined. Section 6 concludes the work with a short summary.

2. Problem Formulation

Consider the following optimal control problem on the fixed time interval

[0, 1]

:

\left\{\begin{array}{ll} \text{Minimize} & φ (p) \\ \text{subject to} & \dot{x} (t) = f (x (t), u (t), t) \ \text{for a.a. } t, \\ & e_{1} (p) \leq 0, \ e_{2} (p) = 0, \\ & u (t) \in U (x (t), t) \ \text{for a.a. } t, \end{array}\right.

(1)

where

p = (x_{0}, x_{1})

,

x_{0} = x (0)

,

x_{1} = x (1)

,

t \in [0, 1]

, and

U (x, t) : = {u \in R^{m} : r_{1} (x, u, t) \leq 0, r_{2} (x, u, t) = 0} .

The mappings

φ : R^{2 n} \to R

,

e_{i} : R^{2 n} \to R^{k_{i}}

,

f : R^{n} \times R^{m} \times R^{1} \to R^{n}

,

r_{i} : R^{n} \times R^{m} \times R^{1} \to R^{q_{i}}

,

i = 1, 2,

satisfy the following hypothesis.

Hypothesis 1 (H1).

Mappings

φ, e_{1}, e_{2}, f, r_{1}, r_{2}

are twice continuously differentiable.

The vector

p = (x_{0}, x_{1})

is termed the endpoint vector, and the endpoint constraints are those given by the mappings

e_{1}, e_{2}

. The scalar function

φ (p)

defines the cost functional to be minimized. The mappings

r_{1}, r_{2}

define the mixed constraints which are imposed on both state and control variables.

A pair of functions

(x, u)

is termed a control process if

x (\cdot)

is absolutely continuous,

u (\cdot)

is measurable and essentially bounded, and

\dot{x} (t) = f (x (t), u (t), t)

for a.a.

t \in [0, 1]

. A control process is feasible provided that the endpoint and mixed constraints are satisfied. A feasible process

(\bar{x}, \bar{u})

is termed optimal if, for any feasible process

(x, u)

,

φ (\bar{p}) \leq φ (p)

, where

\bar{p} = (\bar{x} (0), \bar{x} (1))

.

This concept of minimum is known as a global strong minimum. The purpose of this work is to derive the second-order necessary optimality conditions for this type of minimum under the normality assumptions, that is, to find a set of Lagrange multipliers that simultaneously satisfies the maximum principle and Legendre’s condition, and for which

λ^{0} > 0

. Such a set of multipliers must be unique upon normalization. The abnormal situation is also examined after the normal case.

Consider the reference control process

(\bar{x}, \bar{u})

, which can be optimal, extremal, regular, or normal in what follows. Denote by

r = (r_{1}, r_{2})

the joint mapping acting onto

R^{q}

, where

q = q_{1} + q_{2}

. Let

J (x, u, t) : = {j : r^{j} (x, u, t) = 0}

be the set of active indices, where the upper index specifies the vector component. Set

J (u, t) : = J (\bar{x} (t), u, t)

. Let

U (\cdot)

designate the closure of function

\bar{u} (\cdot)

w.r.t. the Lebesgue measure; that is, for a given

t \in [0, 1]

, the set

U (t)

consists of essential values of

\bar{u} (\cdot)

at the point t; see, e.g., [8]. Recall that a vector a is said to be an essential value of a function

u (\cdot)

at point

τ

, provided that

ℓ ({t \in [τ - ε, τ + ε] : u (t) \in B_{ε} (a)}) > 0

\forall ε > 0

, where

B_{ε} (a)

is the closed ball centered at a with the radius

ε

, and ℓ designates the Lebesgue measure on

R

.

The main regularity concept is as follows.

Definition 1.

The control process

(\bar{x}, \bar{u})

is said to be regular w.r.t. the mixed constraints, provided that, for all

t \in [0, 1]

and for all

u \in U (t)

, the active gradients

{(r^{j})}_{u}^{'} (\bar{x} (t), u, t)

,

j \in J (u, t)

are linearly independent.
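As an illustration of Definition 1, the following Python sketch checks the linear independence of the active gradients at a given control value. The two inequality constraints used here (a unit disc and a half-plane in R^2), the tolerance, and all function names are hypothetical choices made for this example; they do not come from the article.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a small dense matrix via Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in rows]
    if not m:
        return 0
    rank, n_cols = 0, len(m[0])
    for col in range(n_cols):
        if rank == len(m):
            break
        # Choose the row (at or below `rank`) with the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            for j in range(col, n_cols):
                m[i][j] -= factor * m[rank][j]
        rank += 1
    return rank


def constraints(u):
    """Hypothetical inequality constraints r^j(u) <= 0, each paired with its u-gradient."""
    u1, u2 = u
    return [
        (u1 * u1 + u2 * u2 - 1.0, [2.0 * u1, 2.0 * u2]),  # unit disc
        (u2 - 1.0, [0.0, 1.0]),                            # half-plane u2 <= 1
    ]


def is_regular_point(u, tol=1e-10):
    """True if the gradients of the constraints active at u are linearly independent."""
    active = [grad for value, grad in constraints(u) if abs(value) <= tol]
    return matrix_rank(active, tol) == len(active)
```

Under these assumptions, the point (1, 0) is regular, since only the disc constraint is active there. The point (0, 1), where the disc and the half-plane touch tangentially, is not regular because the two active gradients are parallel.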

The following proposition represents an equivalent reformulation of the introduced regularity concept. For

ε \geq 0

, define the set

J_{ε} (x, u, t) : = \{j \in {1, \dots, q_{1}} : r^{j} (x, u, t) \in [- ε, 0]\} \cup {q_{1} + 1, \dots, q},

which is subject to the same conventions as the mapping

J (x, u, t)

. It is clear that

J \subseteq J_{ε}

, and

J_{0} = J

.
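The set of ε-active indices is easy to compute once the constraint values at a point are known. The sketch below assumes that the values r^1, ..., r^q at a feasible point are supplied as a list whose first q_1 entries correspond to the inequality constraints; the function name and the sample data are illustrative only.

```python
def eps_active_indices(r_values, q1, eps=0.0):
    """Return J_eps as a set of 1-based indices.

    r_values: values r^1, ..., r^q at a feasible point (x, u, t);
    q1: number of inequality constraints (listed first);
    eps: width of the band [-eps, 0] that counts as active.
    """
    q = len(r_values)
    inequality_part = {j for j in range(1, q1 + 1) if -eps <= r_values[j - 1] <= 0.0}
    equality_part = set(range(q1 + 1, q + 1))  # equality constraints are always included
    return inequality_part | equality_part


sample = [-0.05, -0.5, 0.0, 0.0]          # three inequalities, one equality
j_0 = eps_active_indices(sample, q1=3)     # eps = 0: the active set J
j_eps = eps_active_indices(sample, q1=3, eps=0.1)
```

With this sample, the first inequality is inactive but lies within the band of width 0.1, so it belongs to `j_eps` and not to `j_0`, which is consistent with the inclusion of J in J_ε.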

Proposition 1.

Let the control process

(\bar{x}, \bar{u})

be regular w.r.t. the mixed constraints. Then, there exists a number

ε_{0} > 0

such that, for all

t \in [0, 1]

and for almost all

s \in [0, 1]

such that

| s - t | \leq ε_{0}

, the

ε_{0}

-active gradients

{(r^{j})}_{u}^{'} (\bar{x} (s), \bar{u} (s), s)

,

j \in J_{ε_{0}} (s)

are linearly independent. Moreover, the number

ε_{0}

can be chosen such that the modulus of surjectivity for this set of gradients is not lower than

ε_{0}

.

The proof is based on a simple contradiction argument.

The point

u \in U (x, t)

is termed regular provided that the gradients

{(r^{j})}_{u}^{'} (x, u, t)

,

j \in J (x, u, t)

are positively linearly independent. The subset of all regular points of

U (x, t)

is denoted as

U_{R} (x, t)

. Denote

Θ (x, t) : = clos U_{R} (x, t) .

It is clear that, for the regular process

(\bar{x}, \bar{u})

, one has

U (t) \subseteq Θ (\bar{x} (t), t) \neq ∅

\forall t \in [0, 1]

.

Consider the Hamilton–Pontryagin function

H (x, u, ψ, t) : = 〈 ψ, f (x, u, t) 〉,

and the Lagrangian

L (p, λ) : = λ^{0} φ (p) + 〈 λ_{1}, e_{1} (p) 〉 + 〈 λ_{2}, e_{2} (p) 〉 .

Here,

ψ \in {(R^{n})}^{*}

,

λ = (λ^{0}, λ_{1}, λ_{2}) \in {(R^{1 + k_{1} + k_{2}})}^{*}

are the conjugate variables.

Definition 2.

The control process

(\bar{x}, \bar{u})

is said to satisfy the maximum principle provided that there exists a vector

λ = (λ^{0}, λ_{1}, λ_{2}) \in {(R^{1 + k_{1} + k_{2}})}^{*}

, where

λ^{0} \geq 0

and

λ_{1} \geq 0

, an absolutely continuous vector-valued function

ψ \in W_{1, \infty} ([0, 1]; {(R^{n})}^{*})

, and a measurable, essentially bounded, vector-valued function

ν \in L_{\infty} ([0, 1]; {(R^{q})}^{*})

of which the j-th component is nonnegative for

j = 1, \dots, q_{1}

, such that

λ \neq 0

, and, on

[0, 1]

, it holds that:

\dot{ψ} (t) = - H_{x}^{'} (\bar{x} (t), \bar{u} (t), ψ (t), t) + ν (t) r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) \quad \text{for a.a. } t,

(2)

ψ (α) = {(- 1)}^{α} L_{x_{α}}^{'} (\bar{p}, λ) \quad \text{for } α = 0, 1,

(3)

max_{u \in Θ (\bar{x} (t), t)} H (\bar{x} (t), u, ψ (t), t) = H (\bar{x} (t), \bar{u} (t), ψ (t), t) \quad \text{for a.a. } t,

(4)

H_{u}^{'} (\bar{x} (t), \bar{u} (t), ψ (t), t) - ν (t) r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) = 0 \quad \text{for a.a. } t,

(5)

〈 λ_{1}, e_{1} (\bar{p}) 〉 = 0, \quad \text{and} \quad \int_{0}^{1} 〈 ν (t), r (\bar{x} (t), \bar{u} (t), t) 〉 d t = 0 .

(6)

Here, Condition (2) is the co-state equation, that is, the differential equation for the conjugate variable

ψ

. Equalities (3) are the transversality conditions. Equality (4) is the maximum condition. Equality (5) is the so-called Euler–Lagrange equation. Equalities (6) are known as the complementary slackness condition. Furthermore,

(λ, ψ, ν)

are known as the Lagrange multipliers.

Under the regularity condition given in Definition 1, the multipliers

ψ

, and

ν

are uniquely defined by the vector

λ

, where

(λ, ψ, ν)

is the set of Lagrange multipliers corresponding to

(\bar{x}, \bar{u})

in view of the maximum principle. This assertion simply follows from the Euler–Lagrange equation. Then, denote by

Λ = Λ (\bar{x}, \bar{u})

the set of vectors

λ \in {(R^{1 + k_{1} + k_{2}})}^{*}

for which there exist

(ψ, ν)

such that the corresponding set of Lagrange multipliers

(λ, ψ, ν)

generated by

λ

satisfies the maximum principle.

3. Normality Condition

Let us introduce the notion of normality. This notion is based on the concept of linearization of the control problem and the corresponding variational differential system. Consider the reference control process

(\bar{x}, \bar{u})

, and a pair

(δ x_{0}, δ u) \in X : = R^{n} \times L_{2} ([0, 1]; R^{m})

. Denote by

δ x (\cdot)

the solution to the variational differential equation on the time interval

[0, 1]

, which corresponds to

(δ x_{0}, δ u)

, that is,

\dot{δ x} (t) = f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t),

(7)

where

δ x (0) = δ x_{0}

. Such a solution exists on the entire time interval

[0, 1]

and, since

δ u

is an

L_{2}

-function, one finds that

δ x \in W_{1, 2} ([0, 1]; R^{n})

.
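To make the variational equation (7) concrete, here is a small numerical sketch for a scalar example. The dynamics f(x, u, t) = -x + u (so that f_x' = -1 and f_u' = 1), the explicit Euler scheme, and the step count are assumptions chosen for illustration only.

```python
import math


def solve_variational(fx, fu, delta_u, delta_x0, steps=10000):
    """Explicit Euler for d(dx)/dt = fx(t) * dx + fu(t) * delta_u(t) on [0, 1] (scalar case)."""
    h = 1.0 / steps
    dx = delta_x0
    for i in range(steps):
        t = i * h
        dx += h * (fx(t) * dx + fu(t) * delta_u(t))
    return dx  # approximation of delta x(1)


# Linearization of x' = -x + u: fx = -1, fu = 1, with delta_u = 1 and delta_x0 = 0.
dx1 = solve_variational(lambda t: -1.0, lambda t: 1.0, lambda t: 1.0, 0.0)
exact = 1.0 - math.exp(-1.0)  # closed-form value of delta x(1) for this data
```

For this data the closed-form solution is δx(t) = 1 - e^{-t}, and the Euler approximation `dx1` agrees with `exact` up to the discretization error of the scheme.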

In what follows, it is not restrictive to set

e_{1} (\bar{p}) = 0

. Thus, all the endpoint constraints of the inequality type are assumed to be active. Consider the two following subspaces in

X

:

\begin{matrix} N_{e} : = \{(δ x_{0}, δ u) \in X : e^{'} (\bar{p}) δ p = 0\}, \\ N_{r} : = \{(δ x_{0}, δ u) \in X : D (t) [r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t)] = 0\} . \end{matrix}

Here, e is the joint mapping of

e_{1}, e_{2}

;

δ p = (δ x_{0}, δ x_{1})

, where

δ x_{1} = δ x (1)

, and

D (t)

is the diagonal

q \times q

-matrix which has 1 in the position

(j, j)

iff

j \in J (t)

and 0 otherwise.

Consider the matrix

R (t) = r_{u}^{'} {(\bar{x} (t), \bar{u} (t), t)}^{*} D (t) r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) .

Set

M (t) : = R {(t)}^{+} r_{u}^{'} {(\bar{x} (t), \bar{u} (t), t)}^{*} D (t) r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) .

Here,

A^{+}

stands for the generalized inverse [15]. The generalized inverse

R {(t)}^{+}

can be computed as follows. Let

T (t)

be an orthogonal linear transform which maps the subspace

ker R {(t)}^{⊥} = im R (t)

onto the subspace of

R^{m}

with the first

m - q (t)

coordinates vanishing, where

q (t) = | J (t) |

is the number of active indices. Then,

R {(t)}^{+} = T^{- 1} (t) {[T (t) R (t) T^{- 1} (t)]}^{- 1} T (t)

, where the formal inverse of the block matrix

{(\begin{matrix} 0 & 0 \\ 0 & A \end{matrix})}^{- 1}

is understood as

(\begin{matrix} 0 & 0 \\ 0 & A^{- 1} \end{matrix})

.
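For intuition, consider the special case of a single active constraint with u-gradient g ≠ 0, so that R = g^T g has rank one. In this case the generalized inverse is R^+ = g^T g / |g|^4, and the orthogonal projector onto ker R is I - g^T g / |g|^2. The sketch below verifies these identities numerically for the arbitrarily chosen vector g = (3, 4); the helper names are illustrative and not part of the article.

```python
def outer(g):
    """Matrix g^T g for a row vector g."""
    return [[a * b for b in g] for a in g]


def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


g = [3.0, 4.0]
norm2 = sum(v * v for v in g)                              # |g|^2 = 25
R = outer(g)                                               # R = g^T g
R_plus = [[v / norm2 ** 2 for v in row] for row in R]      # R^+ = g^T g / |g|^4
identity = [[1.0, 0.0], [0.0, 1.0]]
# Orthogonal projector onto ker R.
P = [[identity[i][j] - R[i][j] / norm2 for j in range(2)] for i in range(2)]
```

One can check that R R^+ R = R, that R P = 0, and that R^+ R + P = I, which are the properties of the generalized inverse and of the projector P(t) used in the construction of M(t).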

Consider the matrix differential system

\dot{Φ} (t) = f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) Φ (t) - f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) M (t) Φ (t),

(8)

where

Φ (0) = I

. Let

Φ (t)

be the solution to (8), and

P (t)

be the matrix of orthogonal projection onto

ker R (t)

.

It is clear that by virtue of the construction, any element

(δ x_{0}, δ u) \in N_{r}

can be represented as

(δ x_{0}, δ u) = (δ x_{0}, P δ u + V [δ x_{0}, δ u]),

(9)

where

V [δ x_{0}, δ u] = - M (t) δ x (t)

, whereas

δ x (t) = Φ (t) (δ x_{0} + \int_{0}^{t} Φ^{- 1} (s) f_{u}^{'} (\bar{x} (s), \bar{u} (s), s) P (s) δ u (s) d s) .

(10)

Conversely, any

δ x_{0} \in R^{n}

and

δ v \in L_{2} ([0, 1]; R^{m})

:

δ v (t) \in ker R (t)

a.e. yields an element of

N_{r}

as

(δ x_{0}, δ v + V [δ x_{0}, δ v]) \in N_{r}

. Therefore, there is a one-to-one correspondence between

N_{r}

and the space of the above-specified elements

(δ x_{0}, δ v)

. At the same time, the formula for the solution

δ x

in

N_{r}

is given by (10).

Let us proceed with the construction of the controllability matrix. Define the

R^{k \times n}

-matrix A as

A = e_{x_{0}}^{'} (\bar{p}) + e_{x_{1}}^{'} (\bar{p}) Φ (1),

with the

R^{k \times m}

-matrix

B (t)

given as

B (t) = e_{x_{1}}^{'} (\bar{p}) Φ (1) Φ^{- 1} (t) f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) .

Now, the controllability matrix Q is introduced as the

R^{k \times k}

-matrix:

Q = A A^{*} + \int_{0}^{1} B (t) P (t) B^{*} (t) d t .

Definition 3.

The regular control process

(\bar{x}, \bar{u})

is said to be normal, provided that

Q > 0

, or equivalently,

rank Q = k

.
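The normality test of Definition 3 can be illustrated on a toy problem. Assume, purely for illustration, scalar dynamics \dot{x} = u, mixed constraints that do not depend on x (so that M ≡ 0 and Φ ≡ 1), and the two endpoint constraints e(p) = (x_0, x_1 - 1), hence k = 2. The projector P(t) is then a scalar equal to 1 where no mixed constraint is active and equal to 0 where an active constraint pins the control. The function names and the midpoint quadrature below are not from the article.

```python
def controllability_matrix(e_x0, e_x1, projector, steps=1000):
    """Q = A A^* + int_0^1 B(t) P(t) B(t)^* dt for scalar x and u, with Phi = 1 and f_u = 1."""
    k = len(e_x0)
    a = [e_x0[j] + e_x1[j] for j in range(k)]   # A = e_x0 + e_x1 * Phi(1)
    b = list(e_x1)                               # B(t) = e_x1 * Phi(1) * Phi(t)^{-1} * f_u
    h = 1.0 / steps
    # Midpoint rule for the integral of the scalar projector P(t) over [0, 1].
    weight = sum(projector((i + 0.5) * h) for i in range(steps)) * h
    return [[a[i] * a[j] + weight * b[i] * b[j] for j in range(k)] for i in range(k)]


def det2(q):
    """Determinant of a 2 x 2 matrix."""
    return q[0][0] * q[1][1] - q[0][1] * q[1][0]


e_x0, e_x1 = [1.0, 0.0], [0.0, 1.0]   # Jacobians of e(p) = (x0, x1 - 1)
q_free = controllability_matrix(e_x0, e_x1, lambda t: 1.0)    # control never pinned
q_pinned = controllability_matrix(e_x0, e_x1, lambda t: 0.0)  # control pinned on [0, 1]
```

Under these assumptions, det Q is approximately 1 in the first case, so Q > 0. In the second case Q is singular (rank Q = 1 < k = 2): with both endpoints fixed and no freedom left in the control, the full-rank condition fails, which corresponds to the abnormal situation treated in Section 5.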

4. Main Result

In this section, the second-order necessary optimality conditions are addressed. Consider the two cones

C_{e} : = \{y \in R^{k} : y^{j} \leq 0 for j = 1, \dots, k_{1}, and y^{j} = 0 for j = k_{1} + 1, \dots, k\},

C_{r} : = \{y \in R^{q} : y^{j} \leq 0 for j = 1, \dots, q_{1}, and y^{j} = 0 for j = q_{1} + 1, \dots, q\} .

Define the cone

\begin{matrix} K : = {(δ x_{0}, δ u) \in X : e^{'} (\bar{p}) δ p \in C_{e}, \\ D (t) [r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t)] \in C_{r}} . \end{matrix}

On the space

X

, consider the quadratic form

\begin{matrix} Ω_{λ} {[(δ x_{0}, δ u)]}^{2} & = & L_{p p}^{″} (\bar{p}, λ) {[δ p]}^{2} - \int_{0}^{1} H_{w w}^{″} (\bar{x} (t), \bar{u} (t), ψ (t), t) {[δ w (t)]}^{2} d t \\ + \int_{0}^{1} 〈ν (t), r_{w w}^{″} (\bar{x} (t), \bar{u} (t), t) {[δ w (t)]}^{2}〉 d t . \end{matrix}

Here and further, for convenience of notation:

w = (x, u)

,

δ w (t) = (δ x (t), δ u (t))

.

The main result of this section consists in the following theorem.

Theorem 1.

Let

(\bar{x}, \bar{u})

be an optimal control process in Problem (1). Suppose that this process is normal.

Then,

Λ \neq ∅

. Moreover,

dim span (Λ) = 1

, and, for

λ = (λ^{0}, λ_{1}, λ_{2}) \in Λ

, it holds that

λ^{0} > 0

, and

Ω_{λ} {[(δ x_{0}, δ u)]}^{2} \geq 0 \forall (δ x_{0}, δ u) \in K .

(11)

The proof is preceded by the following auxiliary assertion.

Lemma 1.

Consider linear bounded operators A and

A_{i}

,

i = 1, 2, \dots

, acting in a given Hilbert space X, such that

A_{i} \to A

pointwise. Assume that the spaces

im A_{i}

and

im A_{i}^{*}

are closed and that

im A \subseteq im A_{i}

for all i. Assume also that the sequence of norms

∥ {(A_{i} A_{i}^{*})}^{- 1} ∥_{im A_{i}}

is uniformly bounded. Let

C \subseteq X

be a closed and convex set.

Then,

A^{- 1} (C) = \underset{i \to \infty}{Limsup} A_{i}^{- 1} (C) .

(12)

Proof.

Let

ξ_{i} \in A_{i}^{- 1} (C)

and

ξ_{i} \to ξ_{0}

as

i \to \infty

. By virtue of the uniform boundedness principle,

A_{i} ξ_{i} \to A ξ_{0}

. Thus,

A ξ_{0} \in C

and the inclusion ‘⊇’ is proven.

Let us confirm the reverse inclusion. Given

ξ_{0} \in A^{- 1} (C)

, it is necessary to indicate a sequence of elements

ξ_{i} \in A_{i}^{- 1} (C)

, such that

ξ_{i} \to ξ_{0}

.

Consider the extremal problem

∥ ξ - ξ_{0} ∥^{2} \to min, A_{i} ξ = A ξ_{0} .

Denote the solution to this problem as

ξ_{i}

. The solution exists since

im A \subseteq im A_{i}

and since the quadratic functional is weakly lower semi-continuous, whereas the closed convex set

A_{i}^{- 1} (A ξ_{0})

is weakly closed. Since the image of

A_{i}

is closed, one can apply the Lagrange multiplier rule as follows. There exists a non-zero vector

λ_{i} \in im A_{i}

such that

ξ_{i} - ξ_{0} + A_{i}^{*} λ_{i} = 0 .

Applying

A_{i}

, the multiplier is expressed as follows

λ_{i} = {(A_{i} A_{i}^{*})}^{- 1} (A_{i} ξ_{0} - A_{i} ξ_{i}) .

Therefore,

ξ_{i} = (I - A_{i}^{*} {(A_{i} A_{i}^{*})}^{- 1} (A_{i} - A)) ξ_{0} .

Note that

∥ A_{i}^{*} {(A_{i} A_{i}^{*})}^{- 1} ∥ \leq ∥ A_{i} ∥ \cdot ∥ {(A_{i} A_{i}^{*})}^{- 1} ∥ \leq const

by the assumptions of the lemma (the norms of the operators A_i are bounded by the uniform boundedness principle). Moreover,

A_{i} ξ_{0} \to A ξ_{0}

, and thus,

ξ_{i} \to ξ_{0}

. □
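The construction in the proof can be checked numerically in a finite-dimensional setting. In the sketch below, X = R^2, the operators A = (1, 0) and A_i = (1, 1/i) are row vectors (so that A_i A_i^* is a scalar), and ξ_0 = (2, 3); these data are hypothetical and serve only to illustrate the formula for ξ_i.

```python
import math


def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_point(a_i, a, xi0):
    """xi_i = xi0 - A_i^* (A_i A_i^*)^{-1} (A_i - A) xi0 for row vectors a_i and a."""
    residual = dot(a_i, xi0) - dot(a, xi0)   # (A_i - A) xi0
    gram = dot(a_i, a_i)                      # A_i A_i^*
    return [x - c * residual / gram for x, c in zip(xi0, a_i)]


a, xi0 = [1.0, 0.0], [2.0, 3.0]
distances = []
for i in (1, 10, 100, 1000):
    a_i = [1.0, 1.0 / i]
    xi_i = nearest_point(a_i, a, xi0)
    # The constraint A_i xi_i = A xi0 of the extremal problem is satisfied.
    assert abs(dot(a_i, xi_i) - dot(a, xi0)) < 1e-12
    distances.append(math.hypot(xi_i[0] - xi0[0], xi_i[1] - xi0[1]))
```

In this example the distances between ξ_i and ξ_0 decrease towards zero as i grows, in line with the conclusion that ξ_i converges to ξ_0.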

Proof of Theorem 1.

By virtue of Theorem 3.5 in [4] and the regularity of the process

(\bar{x}, \bar{u})

, there exists a set of multipliers

(λ, ψ, ν)

satisfying the maximum principle, such that

λ \neq 0

.

Firstly, we prove that the given

λ

satisfies the following Lagrange multipliers rule:

λ^{0} (〈φ_{x_{0}}^{'} (\bar{p}), δ x_{0}〉 + 〈φ_{x_{1}}^{'} (\bar{p}), δ x_{1}〉) + (λ_{1}, λ_{2}) (e_{x_{0}}^{'} (\bar{p}) δ x_{0} + e_{x_{1}}^{'} (\bar{p}) δ x_{1}) = 0

(13)

for all

(δ x_{0}, δ u) \in N_{r}

.

Due to the maximum principle, one has

\begin{array}{rcl} 〈ψ (1), δ x (1)〉 & = & 〈ψ (0), δ x (0)〉 + \int_{0}^{1} (〈\dot{ψ} (t), δ x (t)〉 + 〈ψ (t), \dot{δ x} (t)〉) d t \\ & = & 〈ψ (0), δ x (0)〉 + \int_{0}^{1} (〈 - ψ (t) f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) + ν (t) r_{x}^{'} (\bar{x} (t), \bar{u} (t), t), δ x (t) 〉 \\ & & + 〈ψ (t), f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t)〉) d t \\ & = & 〈L_{x_{0}}^{'} (\bar{p}, λ), δ x_{0}〉 \\ & & + \int_{0}^{1} (ν (t) r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + ψ (t) f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t)) d t . \end{array}

Let us add and subtract the term

ν (t) r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t)

under the integral. Then, by virtue of (5), and also by taking into account that

ν (t) (r_{x}^{'} (t) δ x (t) + r_{u}^{'} (t) δ u (t)) = 0 for a . a . t

when

(δ x_{0}, δ u) \in N_{r}

, we derive

〈ψ (1), δ x (1)〉 = 〈L_{x_{0}}^{'} (\bar{p}, λ), δ x_{0}〉

. Then, from (3),

〈- L_{x_{1}}^{'} (\bar{p}, λ), δ x_{1}〉 = 〈ψ (1), δ x (1)〉 = 〈L_{x_{0}}^{'} (\bar{p}, λ), δ x_{0}〉 .

Hence,

〈L_{x_{0}}^{'} (\bar{p}, λ), δ x_{0}〉 + 〈L_{x_{1}}^{'} (\bar{p}, λ), δ x_{1}〉 = 0,

and therefore, Condition (13) is proven.

Consider the endpoint constraint operator

E (δ x_{0}, δ u) : = e^{'} (\bar{p}) δ p

acting from

X

to

R^{k}

. It is a straightforward task to derive that condition

Q > 0

implies that

E (N_{r}) = R^{k}

. Then, using

λ \neq 0

, Equation (13) yields that

λ^{0} > 0

. Moreover, the multiplier

λ

is unique, upon normalization.

Let us proceed to the proof of the second-order condition (11). Take a number

ε > 0

. Let

D_{ε} (t)

designate the diagonal

q \times q

-matrix defined as

D (t)

but, now, with the set

J (t)

replaced by

J_{ε} (t)

. Define the cone

K_{ε}

in the same way as

K

, but with the matrix

D (t)

replaced by

D_{ε} (t)

. It is clear that

K_{ε} \subseteq K

for all

ε > 0

. Firstly, we prove (11) for the reduced cone

K_{ε}

. Consider the space

X_{\infty} = R^{n} \times L_{\infty} ([0, 1]; R^{m})

as the

L_{\infty}

-analogue of

X

, and the following image-space

Y_{ε} : = \prod_{j = 1}^{q} L_{\infty} (T_{j}^{ε}; R) \times R^{k}

, where

T_{j}^{ε} = {t \in [0, 1] : j \in J_{ε} (t)}

,

j = 1, \dots, q

.

Define the mapping

F_{ε} : X_{\infty} \to Y_{ε}

as follows

F_{ε} (x_{0}, u (\cdot)) = (r^{1} (x (\cdot), u (\cdot), \cdot) |_{T_{1}^{ε}}, \dots, r^{q} (x (\cdot), u (\cdot), \cdot) |_{T_{q}^{ε}}, e^{1} (p), e^{2} (p), \dots, e^{k} (p)) .

Set

{\bar{y}}_{ε} = F_{ε} ({\bar{x}}_{0}, \bar{u} (\cdot))

. Let

C_{ε} \subset Y_{ε}

be the closed cone such that, for all pairs

(ξ (\cdot), γ) \in C_{ε}

, one has

ξ (t) \in C_{r}

for a.a.

t \in [0, 1]

, and

γ \in C_{e}

. Here,

ξ^{j} (t) = 0

when

t \notin T_{j}^{ε}

.

Consider the inclusion

F_{ε} (x_{0}, u (\cdot)) \in {\bar{y}}_{ε} + C_{ε}, (x_{0}, u (\cdot)) \in X_{\infty} .

(14)

The Fréchet derivative

F_{ε}^{'} ({\bar{x}}_{0}, \bar{u} (\cdot))

is the linear mapping

(A_{ε}, B) : X_{\infty} \to Y_{ε}

, where

A_{ε} = (A_{ε}^{1}, \dots, A_{ε}^{q})

,

A_{ε}^{j} (δ x_{0}, δ u) : = {(r^{j})}_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + {(r^{j})}_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t), t \in T_{j}^{ε},

and

B (δ x_{0}, δ u) : = e^{'} (\bar{p}) δ p

. The proof of this fact involves a standard argument.

Firstly, consider this derivative as the extended linear mapping acting from

X

to

\prod_{j = 1}^{q} L_{2} (T_{j}^{ε}; R) \times R^{k}

, that is, in Hilbert spaces. Let us prove its surjectivity. Since the linear mapping

A_{ε}

is surjective due to regularity w.r.t. the mixed constraints (this is easily verified by solving the corresponding Volterra equation and using Proposition 1), it is sufficient to show that the linear mapping

B

is surjective on

ker A_{ε}

. Let

Q_{ε}

be the matrix constructed as Q; however, with the matrix

D (t)

replaced by

D_{ε} (t)

. It is clear that

Q_{ε} \to Q

as

ε \to 0

. Therefore, one has that

Q_{ε} > 0

for all sufficiently small

ε

. At the same time, this condition implies that

E (ker A_{ε}) = R^{k}

. Therefore, it is simple to conclude that

(A_{ε}, B)

is a surjective linear mapping for all sufficiently small

ε

.

The surjectivity of

(A_{ε}, B)

as the linear mapping from

X_{\infty}

to

Y_{ε}

results from the following simple argument. Firstly, notice that, in space

X

, one has the relation

clos (ker A_{ε} \cap X_{\infty}) = ker A_{ε},

(15)

which is clear due to Formulas (9) and (10), as these still hold when

D (t)

is replaced by

D_{ε} (t)

for a sufficiently small

ε

. Then, simply,

N_{r} = N_{r} (ε) = ker A_{ε}

. At the same time, the linear operator

A_{ε}

is surjective as the mapping from

X_{\infty}

to

\prod_{j = 1}^{q} L_{\infty} (T_{j}^{ε}; R)

by virtue of the same arguments involving the solution to a Volterra equation. However, the image of

B

is finite-dimensional, whereas, as has already been confirmed,

B

is surjective on the space

ker A_{ε}

. Therefore, by virtue of (15), one finds that

B

is surjective on the subspace

ker A_{ε} \cap X_{\infty}

. Thus, the derivative

F_{ε}^{'} ({\bar{x}}_{0}, \bar{u} (\cdot))

is surjective and, thereby,

({\bar{x}}_{0}, \bar{u} (\cdot))

is a normal point for the mapping

F_{ε}

.

The Robinson theorem (see Theorem 1 in [16]) asserts the existence of a neighbourhood

O_{ε}

of point

(({\bar{x}}_{0}, \bar{u} (\cdot)), {\bar{y}}_{ε}) \in X_{\infty} \times Y_{ε}

such that

dist ((x_{0}, u (\cdot)), F_{ε}^{- 1} (y + C_{ε})) \leq c dist (F_{ε} (x_{0}, u (\cdot)), y + C_{ε}) \forall ((x_{0}, u (\cdot)), y) \in O_{ε} .

(16)

Consider an arbitrary element

h = (δ x_{0}, δ u) \in K_{ε} \cap X_{\infty}

, and a number

τ > 0

. Substituting into (16) the value

ξ (τ) = ({\bar{x}}_{0}, \bar{u} (\cdot)) + τ h

for

(x_{0}, u (\cdot))

, and

{\bar{y}}_{ε}

for y, one obtains

dist (ξ (τ), F_{ε}^{- 1} ({\bar{y}}_{ε} + C_{ε})) \leq o (τ),

and, thus, the set

K_{ε} \cap X_{\infty}

is tangent to the solution set

F_{ε}^{- 1} ({\bar{y}}_{ε} + C_{ε})

; see Corollary 2 in [16]. This means that, for every small

τ

, there exists a vector

ω (τ) = (a (τ), v (\cdot; τ)) \in X_{\infty}

such that

\frac{∥ ω (τ) ∥}{τ} \to 0

as

τ \to 0

, whereas

F_{ε} (ξ (τ) + ω (τ)) \in {\bar{y}}_{ε} + C_{ε}

.

Let us set

x_{0} (τ) : = {\bar{x}}_{0} + τ δ x_{0} + a (τ)

, and

u (t; τ) : = \bar{u} (t) + τ δ u (t) + v (t; τ)

. Then, one may verify that the control pair

(x_{0} (τ), u (\cdot; τ)) \in X_{\infty}

is admissible to Problem (1) for all sufficiently small

τ

due to the construction of the mapping

F_{ε}

. At this point, it is essential that

ε > 0

. Let

x (\cdot; τ)

be the trajectory corresponding to the pair

(x_{0} (τ), u (\cdot; τ))

, and

p (τ) = (x (0; τ), x (1; τ))

. Take the multiplier

λ = (1, λ_{1}, λ_{2}) \in Λ (\bar{x}, \bar{u})

, and the corresponding multiplier

ν

entailed by

λ

. Let

φ (\bar{p}) = 0

.

Consider the inequality

\begin{matrix} L (p (τ), λ) + \int_{0}^{1} 〈ν (t), r (x (t; τ), u (t; τ), t)〉 d t \geq 0, \end{matrix}

(17)

which results from the condition of minimum and from the fact that the control process

(x (\cdot; τ), u (\cdot; τ))

is admissible.

Consider the second-order variational system

\{\begin{matrix} {\dot{δ x}}_{(1)} (t; τ) = f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x_{(1)} (t; τ) + f_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t; τ), \\ {\dot{δ x}}_{(2)} (t; τ) = f_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x_{(2)} (t; τ) + \frac{1}{2} f_{w w}^{″} (\bar{x} (t), \bar{u} (t), t) {[(δ x_{(1)} (t; τ), δ u (t; τ))]}^{2}, \\ δ x_{(1)} (0; τ) = δ x_{0} (τ), δ x_{(2)} (0; τ) = 0 . \end{matrix}

Here,

δ x_{0} (τ) = δ x_{0} + \frac{a (τ)}{τ}

, and

δ u (t; τ) = δ u (t) + \frac{v (t; τ)}{τ}

. It is clear that

x (t; τ) = \bar{x} (t) + τ δ x_{(1)} (t; τ) + τ^{2} δ x_{(2)} (t; τ) + o (τ^{2}) .

Therefore, by expanding in the Taylor series in (17), one has

\begin{matrix} o (τ^{2}) & \leq & 〈L_{x_{0}}^{'} (\bar{p}, λ), τ δ x_{0} (τ)〉 + 〈L_{x_{1}}^{'} (\bar{p}, λ), τ δ x_{(1)} (1; τ) + τ^{2} δ x_{(2)} (1; τ)〉 \\ + \frac{τ^{2}}{2} L_{p p}^{″} (\bar{p}, λ) {[(δ x_{0} (τ), δ x_{(1)} (1; τ))]}^{2} + \int_{0}^{1} 〈 ν (t), r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) \\ [τ δ x_{(1)} (t; τ) + τ^{2} δ x_{(2)} (t; τ)] + τ r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t; τ) \\ + \frac{τ^{2}}{2} r_{w w}^{″} (\bar{x} (t), \bar{u} (t), t) {[(δ x_{(1)} (t; τ), δ u (t; τ))]}^{2} 〉 d t . \end{matrix}

Using the adjoint equation, one has

\begin{matrix} \frac{d}{d t} 〈ψ (t), δ x_{(1)} (t; τ)〉 = ν (t) r_{x}^{'} (t) δ x_{(1)} (t; τ) + ψ (t) f_{u}^{'} (t) δ u (t; τ), \\ \frac{d}{d t} 〈ψ (t), δ x_{(2)} (t; τ)〉 = ν (t) r_{x}^{'} (t) δ x_{(2)} (t; τ) + \frac{1}{2} ψ (t) f_{w w}^{″} (t) {[(δ x_{(1)} (t; τ), δ u (t; τ))]}^{2} . \end{matrix}

Here, and from now on, the dependence on the optimal process is, for simplicity, omitted. Therefore, using these relations and the transversality conditions (3), and by gathering the terms with

τ

, and

τ^{2}

in two different groups, we obtain

\begin{array}{rcl} o (τ^{2}) & \leq & - τ \int_{0}^{1} [H_{u}^{'} (t) - ν (t) r_{u}^{'} (t)] δ u (t; τ) d t + \frac{τ^{2}}{2} \Big( L_{p p}^{″} (\bar{p}, λ) {[(δ x_{0} (τ), δ x_{(1)} (1; τ))]}^{2} \\ & & - \int_{0}^{1} (ψ (t) f_{w w}^{″} (t) - ν (t) r_{w w}^{″} (t)) {[(δ x_{(1)} (t; τ), δ u (t; τ))]}^{2} d t \Big) . \end{array}

Now, as the implication of (5), we obtain (11) for the given

(δ x_{0}, δ u) \in K_{ε} \cap X_{\infty}

. Then, Estimate (11) is proven on the cone

K_{ε}

by a simple passage to the limit in

X

.

Let us pass to the limit as

ε \to 0

, and prove (11) on the entire cone

K

. Take

h \in K

. One needs to justify that, for all

ε > 0

, there exists

h_{ε} \in K_{ε}

such that

h_{ε} \to h

in

X

. Indeed, then, (11) is proven due to a simple passage to the limit. However, the existence of such

h_{ε}

is yielded by Lemma 1. Indeed, it is a straightforward task to verify that the derivative operator

F_{ε}^{'}

satisfies all the assumptions of this assertion; it obviously converges pointwise to

F_{0}^{'}

as

ε \to 0

, while its image is closed, as was confirmed above. The image of the conjugate operator is closed due to the regularity of the reference control process with respect to the mixed constraints; verifying this is merely a technical step. Another technical step is to assert that

F_{ε}^{'} {[F_{ε}^{'}]}^{*}

is positive due to normality. Moreover, the constant of covering does not depend on

ε

. It is also a straightforward task to verify that the rest of the assumptions hold if we consider

C_{0}

as C.

The proof is complete. □

5. Abnormal Case

In this section, we consider the case when

rank Q < k

. This case, in which the normality condition is not satisfied, is called abnormal. Then, as a simple example can show, Theorem 1 fails to hold. Firstly, the normalized multiplier

λ

is no longer unique, and moreover, there may not exist such a multiplier from the cone

Λ

for which Estimate (11) is still valid everywhere on

K

. Therefore, Theorem 1 requires a certain refinement in the abnormal case. Let us formulate the “abnormal” version of this statement. In this enterprise, we follow the method based on the so-called index approach.

Consider the reduced cone of Lagrange multipliers

Λ_{a} = Λ_{a} (\bar{x}, \bar{u})

, which contains multipliers

λ \in Λ

such that

\begin{matrix} {ind}_{N} Ω_{λ} = k - rank Q . \end{matrix}

Here, the notation

{ind}_{X}

stands for the index of a quadratic form over the space X, and

N = N_{e} \cap N_{r}

. Consider also the following extra hypotheses.

Hypothesis 2 (H2).

Mappings

f, r_{2}

are affine w.r.t. u, while

r_{1}

is convex w.r.t. u.

Hypothesis 3 (H3).

Mixed constraints are globally regular, that is,

U (x, t) = U_{R} (x, t)

for all x and t. Moreover, the set-valued mapping

U (x, t)

is uniformly bounded.

The main result of this section is as follows.

Theorem 2.

Let

(\bar{x}, \bar{u})

be an optimal control process in Problem (1). Suppose that this process is regular with respect to the mixed constraints.

Then,

Λ_{a} \neq ⌀

. Moreover, under (H2) and (H3), one has

max_{λ \in Λ_{a}} Ω_{λ} {[(δ x_{0}, δ u)]}^{2} \geq 0 \forall (δ x_{0}, δ u) \in K .

(18)

In the case of a local weak minimum, Estimate (18) was proven in [14]. Here, our task is to prove it in the case of a global strong minimum, or a minimum of Pontryagin’s type. In [3], the condition that the cone

Λ_{a}

is non-empty was proven in the class of generalized controls. Note that, under the normality condition of the optimal control process, Estimate (18) implies (11) since the normalized multiplier is unique. Thus, Theorem 2, in essence, represents a stronger assertion than Theorem 1, albeit under some extra assumptions such as (H2) and (H3). These two assumptions are meant to simplify the presentation. Note that it suffices to impose (H3) along the optimal trajectory only.

Proof.

The proof of the theorem is divided into two stages.

Stage 1. In this stage, we prove that

Λ_{a} \neq ⌀

. To begin with, suppose that Hypothesis (H3) is valid. Under (H1) and (H3), it is convenient to assume that

f (x, u, t)

, and

r (x, u, t)

are constant with respect to

(x, u)

, and t, outside of some sufficiently large ball. This can be achieved by a simple problem reduction. In what follows, it is also not restrictive to assume that

φ (p^{*}) = 0

, and, for the simplicity of exposition, to consider that all the constraints are scalar-valued, i.e.,

k_{1} = k_{2} = q_{1} = 1

, while

q_{2} = 0

.

Let

a, b

be non-negative numbers. Consider the mapping

Δ (a, b) : = \begin{cases} a b^{- 4} & \text{if } b > 0, \\ 1 & \text{if } a > 0, b = 0, \\ 0 & \text{if } a = b = 0 . \end{cases}

This function is lower semi-continuous. It will serve as a penalty function in the applied method below.
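The penalty function can be sketched in a few lines of code. The snippet below is only an illustration: the name `delta` and the sample values are assumptions made here, not part of the article. It evaluates Δ near the axis b = 0 to show the behaviour behind lower semi-continuity.

```python
def delta(a: float, b: float) -> float:
    """Penalty function Delta(a, b) for non-negative a and b, as defined above."""
    if a < 0 or b < 0:
        raise ValueError("delta is defined for non-negative arguments only")
    if b > 0:
        return a * b ** -4
    return 1.0 if a > 0 else 0.0


# Near a point (a, 0) with a > 0, the values a * b**-4 blow up as b -> 0+,
# so their lower limit cannot fall below delta(a, 0) = 1.
a = 0.5
values_near_axis = [delta(a, 10.0 ** -k) for k in range(1, 6)]

# At the origin the value 0 is the global minimum of delta, so lower
# semi-continuity holds there trivially.
value_at_origin = delta(0.0, 0.0)
```

The value 1 on the axis b = 0, a > 0 is what later gives the lower bound on the penalized functional whenever a constraint is violated.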

Take a pair

(x_{0}, u) \in X

, and consider the unique solution to the Cauchy problem

\dot{x} (t) = f (x (t), u (t), t)

,

x (0) = x_{0}

, which exists on the entire time interval

[0, 1]

due to the above assumptions. Set

p = (x_{0}, x_{1})

, where

x_{1} = x (1)

. Note that p depends on

(x_{0}, u)

. Let

\{ε_{i}\}

be an arbitrary sequence of positive numbers converging to zero. Consider the mapping

φ_{i}^{+} (p) = {(φ (p) + ε_{i})}^{+}

, where

a^{+} = \max \{a, 0\}

for

a \in R

. Thus, the following functional over the space

X

is well-defined:

F_{i} (x_{0}, u) : = φ_{i}^{+} (p) + Δ ({(e_{1}^{+} (p))}^{2} + {| e_{2} (p) |}^{2} + \int_{0}^{1} {(r {(x, u, t)}^{+})}^{2} d t, φ_{i}^{+} (p)) .

Functional

F_{i}

is lower semi-continuous which is a straightforward exercise to verify due to the assumptions made above regarding the mappings f and r. At the same time, this functional is positive everywhere:

F_{i} > 0

.
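To make the construction of the penalized functional concrete, here is a minimal sketch on a toy scalar problem. All data below (the dynamics, cost, constraints, and the explicit Euler discretization) are hypothetical choices made for illustration and do not come from the article. In this toy problem the reference process is x̄_0 = 0, ū ≡ 0, with φ(p̄) = 0.

```python
def positive_part(v: float) -> float:
    return max(v, 0.0)


def delta(a: float, b: float) -> float:
    """Penalty function Delta(a, b) for non-negative a and b."""
    if b > 0:
        return a * b ** -4
    return 1.0 if a > 0 else 0.0


# Hypothetical toy data (assumptions for illustration only).
def f(x, u, t): return u            # dynamics x' = u
def phi(x0, x1): return x1          # cost, equal to zero at the reference process
def e1(x0, x1): return -x1          # endpoint inequality e1(p) <= 0
def e2(x0, x1): return x0           # endpoint equality e2(p) = 0
def r(x, u, t): return -u - 1.0     # mixed inequality r(x, u, t) <= 0


def F_i(x0: float, u: list, eps_i: float) -> float:
    """Discretized penalized functional; u holds control values on a uniform grid."""
    h = 1.0 / len(u)
    x = x0
    mixed_violation = 0.0
    for k, u_k in enumerate(u):
        t = k * h
        mixed_violation += h * positive_part(r(x, u_k, t)) ** 2
        x += h * f(x, u_k, t)       # explicit Euler step
    x1 = x
    phi_plus = positive_part(phi(x0, x1) + eps_i)
    a = positive_part(e1(x0, x1)) ** 2 + abs(e2(x0, x1)) ** 2 + mixed_violation
    return phi_plus + delta(a, phi_plus)


eps_i = 0.01
reference_value = F_i(0.0, [0.0] * 10, eps_i)     # feasible reference process
infeasible_value = F_i(0.0, [-0.5] * 10, eps_i)   # lowers the cost but violates e1
```

In this sketch the reference process yields the value ε_i, while a process that pushes the cost below −ε_i must violate a constraint and is therefore penalized with the value 1.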

Consider the following problem

Minimize F_{i} (x_{0}, u), (x_{0}, u) \in X .

Note that

F_{i} ({\bar{x}}_{0}, \bar{u}) = ε_{i}

. By applying the smooth variational principle (see, e.g., [17]), for each i, there exist an element

(x_{0, i}, u_{i}) \in X

and a sequence of elements

({\tilde{x}}_{j}, {\tilde{u}}_{j}) \in X

,

j = 1, 2, \dots

, converging to

(x_{0, i}, u_{i})

such that

F_{i} (x_{0, i}, u_{i}) \leq F_{i} ({\bar{x}}_{0}, \bar{u}) = ε_{i},

(19)

| x_{0, i} - {\bar{x}}_{0} |^{2} + \int_{0}^{1} {| u_{i} (t) - \bar{u} (t) |}^{2} d t \leq \sqrt[3]{ε_{i}^{2}},

(20)

and the pair

(x_{0, i}, u_{i})

is the unique solution to the following problem:

\begin{matrix} Minimize F_{i} (x_{0}, u) + \sqrt[3]{ε_{i}} \sum_{j = 1}^{\infty} 2^{- j} (| x_{0} - {\tilde{x}}_{j} |^{2} + \int_{0}^{1} {| u - {\tilde{u}}_{j} (t) |}^{2} d t), (x_{0}, u) \in X . \end{matrix}

Suppose that

φ_{i}^{+} (p_{i}) = 0

. Then,

φ (p_{i}) < 0

and, in view of optimality, taking into account that

φ (\bar{p}) = 0

, it follows that some of the constraints in (1):

e_{1}

, or

e_{2}

, or r, are violated. Therefore, by definition of

Δ

, one has

F_{i} (x_{0, i}, u_{i}) \geq 1

. However, this contradicts (19) for

i > 1

. Thus,

φ_{i}^{+} (p_{i}) > 0

. Consider a number

δ_{i} > 0

such that

φ_{i}^{+} (p) > 0

\forall p

:

| p - p_{i} | \leq δ_{i}

. Then, by virtue of, again, the definition of

Δ

, the pair

(x_{0, i}, u_{i})

is the unique global minimum to the following control problem:

\begin{matrix} \text{Minimize} & z_{0} + z_{0}^{- 4} ({(e_{1} {(p)}^{+})}^{2} + {| e_{2} (p) |}^{2}) + \int_{0}^{1} z^{- 4} {(r {(x, u, t)}^{+})}^{2} d t \\ & + \sqrt[3]{ε_{i}} \sum_{j = 1}^{\infty} 2^{- j} (| x_{0} - {\tilde{x}}_{j} |^{2} + \int_{0}^{1} {| u - {\tilde{u}}_{j} (t) |}^{2} d t), \\ \text{subject to} & \dot{x} = f (x, u, t), \\ & \dot{z} = 0 \text{ for a.a. } t \in [0, 1], \\ & | p - p_{i} | \leq δ_{i}, z_{0} = φ_{i}^{+} (p) . \end{matrix}

(21)

Denote by

x_{i}, z_{i}

the solution to (21), that is, the trajectory corresponding to the pair

(x_{0, i}, u_{i} (\cdot))

. Note that the function

z_{i} (\cdot)

is constant, and thus it can be treated simply as a number

z_{i} \in R

.

Problem (21) is, as a matter of fact, unconstrained. Consider the first and second-order necessary optimality conditions for this problem.

The first-order conditions are stated as follows. There exist a number

λ_{i}^{0} > 0

, and absolutely continuous conjugate functions

ψ_{i}

and

σ_{i}

which correspond to

x_{i}

, and

z_{i}

, respectively, such that, for a.a.

t \in [0, 1]

,

\begin{matrix} {\dot{ψ}}_{i} (t) = - H_{x}^{'} (x_{i} (t), u_{i} (t), ψ_{i} (t), t) + 2 λ_{i}^{0} z_{i}^{- 4} r^{+} (x_{i} (t), u_{i} (t), t) r_{x}^{'} (x_{i} (t), u_{i} (t), t), \\ {\dot{σ}}_{i} (t) = - 4 λ_{i}^{0} z_{i}^{- 5} {(r {(x_{i} (t), u_{i} (t), t)}^{+})}^{2}, \end{matrix}

(22)

\begin{matrix} ψ_{i} (s) = {(- 1)}^{s} λ_{i}^{0} (2 z_{i}^{- 4} (e_{1}^{+} (p_{i}) \frac{\partial e_{1}}{\partial x_{s}} (p_{i}) + e_{2} (p_{i}) \frac{\partial e_{2}}{\partial x_{s}} (p_{i})) + (1 - s) \sqrt[3]{ε_{i}} ω_{1, i}^{'} (x_{0, i})) \\ - {(- 1)}^{s} ρ_{i} \frac{\partial φ}{\partial x_{s}} (p_{i}), s = 0, 1, \\ σ_{i} (0) = λ_{i}^{0} (1 - 4 z_{i}^{- 5} {(e_{1}^{+} (p_{i}))}^{2} - 4 z_{i}^{- 5} {| e_{2} (p_{i}) |}^{2}) + ρ_{i}, \\ σ_{i} (1) = 0, \end{matrix}

(23)

\begin{matrix} max_{u \in R^{m}} (H (x_{i} (t), u, ψ_{i} (t), t) - λ_{i}^{0} z_{i}^{- 4} {(r {(x_{i} (t), u, t)}^{+})}^{2} - λ_{i}^{0} \sqrt[3]{ε_{i}} \cdot ω_{2, i} (u, t)) \\ = H (x_{i} (t), u_{i} (t), ψ_{i} (t), t) - λ_{i}^{0} z_{i}^{- 4} {(r {(x_{i} (t), u_{i} (t), t)}^{+})}^{2} - λ_{i}^{0} \sqrt[3]{ε_{i}} \cdot ω_{2, i} (u_{i} (t), t), \end{matrix}

(24)

Here,

ρ_{i} \in R

is the multiplier corresponding to the constraint

z_{0} = φ_{i}^{+} (p)

,

ω_{1, i} (x) : = \sum_{j = 1}^{\infty} 2^{- j} | x - {\tilde{x}}_{j} |^{2}, and ω_{2, i} (u, t) : = \sum_{j = 1}^{\infty} 2^{- j} {| u - {\tilde{u}}_{j} (t) |}^{2} .
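For reference, the derivatives of these perturbation terms, which enter (23), (24) and the quadratic form below, can be written in closed form. The following is a short sketch; the symbol m_i for the weighted mean is introduced here only for convenience. Termwise differentiation is legitimate because the sequence of perturbation points converges and is therefore bounded, while the weights 2^{-j} sum to 1.

```latex
% Weighted mean of the perturbation points (auxiliary notation, not from the article)
m_i := \sum_{j=1}^{\infty} 2^{-j}\,\tilde{x}_j ,
\qquad \sum_{j=1}^{\infty} 2^{-j} = 1 .

% First and second derivatives of \omega_{1,i}
\omega_{1,i}'(x) = 2\sum_{j=1}^{\infty} 2^{-j}\,(x - \tilde{x}_j) = 2\,(x - m_i),
\qquad
\omega_{1,i}''(x)\,[\delta x]^2 = 2\,|\delta x|^2 .

% The same computation for \omega_{2,i}, pointwise in t
\frac{\partial \omega_{2,i}}{\partial u}(u,t)
  = 2\Bigl(u - \sum_{j=1}^{\infty} 2^{-j}\,\tilde{u}_j(t)\Bigr),
\qquad
\omega_{2,i}''(u,t)\,[\delta u]^2 = 2\,|\delta u|^2 .
```

In particular, both perturbations are uniformly convex quadratic functions, and their contribution to the optimality conditions is scaled by the vanishing factor \sqrt[3]{ε_{i}}.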

Conditions (22)–(24) are the first-order optimality conditions in the form of the maximum principle. Consider the second-order optimality conditions for Problem (21).

Take an element

(δ x_{0}, δ u) \in X

. Consider the variational differential equation related to (21), that is,

\begin{matrix} {\dot{δ x}}_{i} (t) & = & \frac{\partial f}{\partial x} (x_{i} (t), u_{i} (t), t) δ x_{i} (t) + \frac{\partial f}{\partial u} (x_{i} (t), u_{i} (t), t) δ u (t), \\ {\dot{δ z}}_{i} (t) & = & 0, \end{matrix}

(25)

for a.a.

t \in [0, 1]

, where

\begin{matrix} δ x_{i} (0) = δ x_{0}, \\ δ z_{i} (0) = φ^{'} (p_{i}) δ p_{i} . \end{matrix}

Here,

δ p_{i} : = (δ x_{0}, δ x_{i} (1))

.

The solution to (25) exists, and it is unique on the entire time interval

[0, 1]

due to the assumptions made above. The function

δ z_{i} (\cdot)

is obviously constant; thus, it is treated simply as a number

δ z_{i}

in what follows.
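The role of the variational equation can be checked numerically: its solution at t = 1 is the directional derivative of the endpoint x(1) with respect to the perturbation (δx_0, δu). The sketch below uses a hypothetical scalar system f(x, u) = sin x + u and an explicit Euler scheme; both are assumptions made for illustration and are not taken from the article.

```python
import math


# Hypothetical smooth dynamics and its partial derivatives.
def f(x, u): return math.sin(x) + u
def f_x(x, u): return math.cos(x)
def f_u(x, u): return 1.0


def endpoint(x0, u):
    """x(1) for x' = f(x, u), x(0) = x0, by explicit Euler on a uniform grid."""
    h = 1.0 / len(u)
    x = x0
    for u_k in u:
        x += h * f(x, u_k)
    return x


def linearized_endpoint(x0, u, dx0, du):
    """delta x(1) from the discretized variational equation along (x0, u)."""
    h = 1.0 / len(u)
    x, dx = x0, dx0
    for u_k, du_k in zip(u, du):
        # Linearize the Euler step at the current (unperturbed) state.
        dx += h * (f_x(x, u_k) * dx + f_u(x, u_k) * du_k)
        x += h * f(x, u_k)
    return dx


n = 1000
u = [k / n for k in range(n)]      # reference control u(t) = t
du = [1.0] * n                     # control variation
x0, dx0, tau = 0.5, 1.0, 1e-6

u_perturbed = [a + tau * b for a, b in zip(u, du)]
finite_difference = (endpoint(x0 + tau * dx0, u_perturbed) - endpoint(x0, u)) / tau
linearized = linearized_endpoint(x0, u, dx0, du)
```

For a small step τ, the finite-difference quotient and the linearized value agree up to terms of order τ, which is the first-order expansion underlying the definition of δp_i.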

On the space

X

, consider the quadratic form

\begin{matrix} Ω_{i} {[(δ x_{0}, δ u)]}^{2} & = & λ_{i}^{0} 2 z_{i}^{- 4} e_{1} {(p_{i})}^{+} e_{1}^{″} (p_{i}) {[δ p_{i}]}^{2} + 2 z_{i}^{- 4} e_{2} (p_{i}) e_{2}^{″} (p_{i}) {[δ p_{i}]}^{2} \\ + 20 z_{i}^{- 6} ({(e_{1} {(p_{i})}^{+})}^{2} + {| e_{2} (p_{i}) |}^{2}) δ z_{i}^{2} - (ρ_{i} φ^{″} (p_{i}) + λ_{i}^{0} \sqrt[3]{ε_{i}} ω_{1, i}^{″} (p_{i})) {[δ p_{i}]}^{2} \\ - \int_{0}^{1} H_{w w}^{″} (x_{i} (t), u_{i} (t), ψ_{i} (t), t) {[(δ x_{i} (t), δ u (t))]}^{2} d t \\ + λ_{i}^{0} \int_{0}^{1} 2 z_{i}^{- 4} r {(x_{i} (t), u_{i} (t), t)}^{+} r_{w w}^{″} (x_{i} (t), u_{i} (t), t) {[(δ x_{i} (t), δ u (t))]}^{2} d t \\ + \int_{0}^{1} 20 z_{i}^{- 6} {(r {(x_{i} (t), u_{i} (t), t)}^{+})}^{2} δ z_{i}^{2} d t + λ_{i}^{0} \sqrt[3]{ε_{i}} \int_{0}^{1} ω_{2, i}^{″} (u_{i} (t), t) {[δ u (t)]}^{2} d t . \end{matrix}

Consider the closed subspace

N_{i} \subseteq X

of pairs

(δ x_{0}, δ u)

such that

(a): $e^{'} (p_{i}) δ p_{i} = 0$ ;
(b): $r_{x}^{'} (x_{i} (t), u_{i} (t), t) δ x_{i} (t) + r_{u}^{'} (x_{i} (t), u_{i} (t), t) δ u (t) = 0$ $for a . a . t \in T_{i}$ .

Here,

T_{i} : = \{t \in [0, 1] : r (x_{i} (t), u_{i} (t), t) \geq 0\}

.

Then, the second-order necessary optimality condition is given by the inequality

Ω_{i} {[(δ x_{0}, δ u)]}^{2} \geq 0 \forall (δ x_{0}, δ u) \in N_{i} .

(26)

(Note that the functional

F (x_{0}, u)

is not twice continuously differentiable. At the same time, the scalar function

F (x_{0, i} + τ δ x_{0}, u_{i} + τ δ u)

of

τ

possesses the second derivative w.r.t.

τ

at

τ = 0

, provided that

(δ x_{0}, δ u) \in N_{i}

. Using this fact, and the fact that Problem (21) is unconstrained, it is simple to derive (26) by a direct variational argument.)

The next step is to pass to the limit as

i \to \infty

in the obtained optimality conditions. Firstly, it follows from (20) that

x_{0, i} \to {\bar{x}}_{0}

, and

u_{i} (t) \to \bar{u} (t)

strongly in

L_{2}

, and, thereby,

x_{i} (t) ⇉ \bar{x} (t)

uniformly on

[0, 1]

. Then,

z_{i} \to 0

. Define

\begin{matrix} λ_{i}^{1} : = 2 λ_{i}^{0} z_{i}^{- 4} e_{1} {(p_{i})}^{+}; \\ λ_{i}^{2} : = 2 λ_{i}^{0} z_{i}^{- 4} e_{2} (p_{i}); \\ ν_{i} (t) : = 2 λ_{i}^{0} z_{i}^{- 4} r {(x_{i} (t), u_{i} (t), t)}^{+}, \end{matrix}

and consider the following normalization for the multipliers

| λ_{i} | + | ψ_{i} (0) | + ∥ ν_{i} ∥_{L_{2}} = 1,

(27)

where

λ_{i} = (λ_{i}^{0}, λ_{i}^{1}, λ_{i}^{2})

.

Let us show that

σ_{i} (0) \to 0

. Indeed, one has

σ_{i} (0) = 4 \int_{0}^{1} λ_{i}^{0} z_{i}^{- 5} {(r {(x_{i} (t), u_{i} (t), t)}^{+})}^{2} d t = 2 \int_{0}^{1} ν_{i} (t) z_{i}^{- 1} r {(x_{i} (t), u_{i} (t), t)}^{+} d t .

However, due to (19), one has

∥ z_{i}^{- 1} r {(x_{i} (t), u_{i} (t), t)}^{+} ∥_{L_{2}} \to 0

. This, together with (27), implies that

σ_{i} (0) \to 0

. Then, the transversality condition and, again, (19) and (27) simply yield that

λ_{i}^{0} - ρ_{i} \to 0

.
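The convergence of σ_i(0) stated above can be spelled out as follows. This is a sketch of the omitted computation, using only (19), (22), (23) and (27) together with the identity z_i = φ_i^+(p_i) > 0.

```latex
% Integrate the second equation in (22) backwards from \sigma_i(1) = 0:
\sigma_i(0) = -\int_0^1 \dot{\sigma}_i(t)\,dt
  = 4\lambda_i^0 z_i^{-5}\int_0^1 \bigl(r(x_i(t),u_i(t),t)^+\bigr)^2 dt .

% Estimate (19), written out with z_i = \varphi_i^+(p_i) > 0:
z_i + z_i^{-4}\Bigl( (e_1^+(p_i))^2 + |e_2(p_i)|^2
  + \int_0^1 \bigl(r(x_i,u_i,t)^+\bigr)^2 dt \Bigr) \le \varepsilon_i ,
% hence
z_i \le \varepsilon_i ,
\qquad
\int_0^1 \bigl(r(x_i,u_i,t)^+\bigr)^2 dt \le \varepsilon_i z_i^4 .

% Consequently
\bigl\| z_i^{-1} r(x_i,u_i,\cdot)^+ \bigr\|_{L_2}^2
  = z_i^{-2}\int_0^1 \bigl(r(x_i,u_i,t)^+\bigr)^2 dt
  \le \varepsilon_i z_i^2 \le \varepsilon_i^3 \to 0 .

% Cauchy-Schwarz and the normalization (27), which gives \|\nu_i\|_{L_2} \le 1:
|\sigma_i(0)| \le 2\,\|\nu_i\|_{L_2}\,
  \bigl\| z_i^{-1} r(x_i,u_i,\cdot)^+ \bigr\|_{L_2}
  \le 2\,\varepsilon_i^{3/2} \to 0 .
```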

By passing to a subsequence, in view of the compactness argument, one may assume from (27) that

λ_{i} \to λ

,

ψ_{i} (t) ⇉ ψ (t)

and

ν_{i} \overset{w}{\to} ν

weakly in

L_{2}

as

i \to \infty

for some multipliers

λ = (λ^{0}, λ_{1}, λ_{2})

,

ψ

and

ν

. Then,

ρ_{i} \to λ^{0}

. It is also clear that by passing to a subsequence, one can assert that

λ_{i}^{0} z_{i}^{- 4} \to \infty

. Indeed, otherwise all the multipliers converge to zero, contradicting (27). By virtue of the regularity of the optimal control process with respect to mixed constraints, for each i, there exists a control function

ζ_{i}

such that

ζ_{i} (t) \in U (x_{i} (t), t)

a.e., and

ζ_{i} \to \bar{u}

in

L_{\infty}

. Thus, from the maximum condition (24), it follows that

r {(x_{i} (t), u_{i} (t), t)}^{+} \to 0

uniformly. Since the set

U (\bar{x} (t), t)

is uniformly bounded, this implies, again due to regularity, that the control function

u_{i}

is essentially bounded uniformly with respect to i, that is

∥ u_{i} ∥_{L_{\infty}} \leq const

.

From (24), one derives that

ν_{i} (t) \cdot r_{u}^{'} (x_{i} (t), u_{i} (t), t) = H_{u}^{'} (x_{i} (t), u_{i} (t), ψ_{i} (t), t) - λ_{i}^{0} \sqrt[3]{ε_{i}} \cdot \frac{\partial ω_{2, i}}{\partial u} (u_{i} (t), t) .

Whence, using regularity and the above obtained facts, one has

| ν_{i} (t) | \leq const (| ψ_{i} (t) | + λ_{i}^{0}) \forall i, t \in [0, 1] .

(28)

Using the facts and estimates obtained above, one can simply pass to the limit in (22)–(24) and prove that the set of multipliers

(λ, ψ, ν)

satisfies the maximum principle. At the same time, the fact that

λ \neq 0

follows from (28).

Now, let us pass to the limit in (26). Take numbers

ε > 0

and

σ > 0

. Restricting to a subsequence, one can state that

u_{i} (t) \to \bar{u} (t)

for a.a.

t \in [0, 1]

. Due to Egorov’s theorem, there is a subset

E_{σ}

, the measure of which equals

1 - σ

, such that

u_{i} (t) ⇉ \bar{u} (t)

uniformly on

E_{σ}

. Denote

T_{ε} : = \{t \in [0, 1] : r (\bar{x} (t), \bar{u} (t), t) \geq - ε\}

,

T_{ε} (σ) : = T_{ε} \cap E_{σ}

.

Consider the bounded linear operator

A_{i} : X (E_{σ}) \to L_{2} (T_{ε} (σ); R)

such that

A_{i} (δ x_{0}, δ u) = r_{x}^{'} (x_{i} (t), u_{i} (t), t) δ x_{i} (t) + r_{u}^{'} (x_{i} (t), u_{i} (t), t) δ u (t) |_{T_{ε} (σ)},

where

X (E_{σ})

is defined as

X

, but

δ u (t) = 0

for a.a.

t \notin E_{σ}

. It is a simple matter to show that, due to the uniform convergence, one has

A_{i} \to A

strongly, where

A (δ x_{0}, δ u) = r_{x}^{'} (\bar{x} (t), \bar{u} (t), t) δ x (t) + r_{u}^{'} (\bar{x} (t), \bar{u} (t), t) δ u (t) |_{T_{ε} (σ)} .

Then, as is known,

ker A_{i} \to X_{ε} (σ) : = ker A \subseteq X (E_{σ})

. It is clear that

X_{ε} (σ) \to X_{ε} : = X_{ε} (0)

as

σ \to 0

by virtue of its definition and the regularity condition. One needs to use the solution to a corresponding Volterra equation to prove this simple fact. Then, Lemma 1 yields that

X_{ε} \to N_{r}

as

ε \to 0

if we consider

C = \{0\}

. Here, when treating the convergence of spaces, the symbol ‘→’ stands for

Limsup

.

Let

Π_{i} \subseteq X

denote the kernel of the endpoint operator

e^{'} (p_{i}) δ p_{i}

. It is clear that

codim Π_{i} \leq k

. Then, it is a simple exercise to verify the existence of a subspace

Π \subseteq N_{e}

such that

codim Π \leq k

and

Π \cap N_{r} \subseteq Limsup Π_{i} \cap ker A_{i},

where

Limsup

is total: firstly as

i \to \infty

, then, as

σ \to 0

and finally, as

ε \to 0

. At the same time, note that

T_{i} \cap E_{σ} \subseteq T_{ε} (σ)

for all large i. Therefore, one has the embedding

ker A_{i} \cap Π_{i} \subseteq N_{i} \cap X (E_{σ})

, and then, the passage to the limit in (26) gives the condition

Λ_{a} \neq ⌀

. In the latter deduction, Proposition 1 of [14] has been used and also the fact that the terms with

δ z_{i}^{2}

in

Ω_{i}

converge to zero in view of (19) and (27).

Now, it is necessary to remove the extra assumptions imposed in (H3) regarding the boundedness and global regularity. However, this can be done following precisely the same method as presented in [4]. Take

c > 0

, and consider the additional control constraint

| u | \leq c

. For each

ε > 0

, there will be N specifically constructed regular selectors of

U (x, t)

which are surrounded with N

ε

-tubes as in the above-cited source. Then, the passage to the limit, firstly as

ε \to 0

, then, as

N \to \infty

, and, at the end, as

c \to \infty

will complete the proof for Stage 1.

The full proof for the next stage is rather lengthy. Therefore, let us present it schematically, in a sketch-form, exposing the main idea.

Stage 2. Here, under (H2) and (H3), we prove Estimate (18). For this purpose, the notion of

χ

-problem is used. Take any

ε, δ > 0

and

(δ x_{0}, δ u) \in K

. It is not restrictive to assume that the minimum in (1) is absolute.

Consider the problem

\left\{\begin{matrix} \text{Minimize} & φ (p) - χ φ ({\bar{p}}_{ε}) + {(\max \{0, χ - 1\})}^{4} + δ {| x_{0} - {\bar{x}}_{0} - ε δ x_{0} |}^{2} \\ & + δ \int_{0}^{1} {| u (t) - \bar{u} (t) - ε δ u (t) |}^{2} d t, \\ \text{subject to} & \dot{x} (t) = f (x (t), u (t), t) \text{ for a.a. } t, \\ & e (p) - χ e ({\bar{p}}_{ε}) \in C_{e}, | x_{0} - {\bar{x}}_{0} | \leq δ, \\ & r (x (t), u (t), t) - χ r ({\bar{x}}_{ε} (t), \bar{u} (t) + ε δ u (t), t) \in C_{r} \text{ for a.a. } t, \\ & χ \geq 0 . \end{matrix}\right.

(29)

Here,

{\bar{x}}_{ε} (\cdot)

is the trajectory corresponding to the perturbed pair

({\bar{x}}_{0} + ε δ x_{0}, \bar{u} (t) + ε δ u (t))

, whereas

{\bar{p}}_{ε}

is the corresponding endpoint vector.

Note that the infimum in Problem (29) is finite due to the imposed assumptions. Moreover, it is not greater than zero, since the process

({\bar{x}}_{ε} (t), \bar{u} (t) + ε δ u (t))

,

χ = 1

is feasible, whereas the value of the cost equals zero. At the same time, when

χ = 0

, the infimum over

X

is positive due to the absolute optimality in (1) and since

φ (\bar{p}) = 0

. This suggests the application of the smooth variational principle, albeit in the version from Ref. [18], in which the finite-dimensional variable

χ

is not subject to perturbation. That is, the adding to the cost due to the variational principle is within the space

X

only. Then, for any sufficiently small

α > 0

, one can assert the existence of a solution

(x_{0, α}, u_{α} (\cdot))

,

χ_{α}

to the perturbed problem such that

χ_{α} > 0

. Indeed, otherwise there is a contradiction with the range of the infimum.
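The claim that the infimum in (29) is not greater than zero can be verified directly. The following is a short sketch of this check, under the assumption that the sets C_e and C_r contain the origin (as is the case for cones encoding inequality and equality constraints) and that ε is small enough for the initial-point constraint to hold.

```latex
% Test process: (x_0, u) = (\bar{x}_0 + \varepsilon\,\delta x_0,\ \bar{u} + \varepsilon\,\delta u),
% \chi = 1, so that x = \bar{x}_\varepsilon and p = \bar{p}_\varepsilon.

% Cost: every term vanishes
\varphi(\bar{p}_\varepsilon) - 1\cdot\varphi(\bar{p}_\varepsilon)
  + (\max\{0,\,1-1\})^4
  + \delta\,|0|^2 + \delta\int_0^1 |0|^2\,dt = 0 .

% Constraints
e(\bar{p}_\varepsilon) - e(\bar{p}_\varepsilon) = 0 \in C_e ,
\qquad
r(\bar{x}_\varepsilon,\bar{u}+\varepsilon\,\delta u,t)
  - r(\bar{x}_\varepsilon,\bar{u}+\varepsilon\,\delta u,t) = 0 \in C_r ,

|x_0 - \bar{x}_0| = \varepsilon\,|\delta x_0| \le \delta
  \quad\text{whenever}\quad \varepsilon\,|\delta x_0| \le \delta .
```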

The remaining arguments are somewhat standard: the results of Stage 1 are applied to the

α

-solutions which are regular due to (H3). By taking

α = α (ε)

appropriately small according to the given

ε

, one can prove the convergence of this solution to the optimal solution

({\bar{x}}_{0}, \bar{u} (t))

as

ε \to 0

. (In this enterprise, Hypothesis (H2) is essentially used together with the weak sequential compactness of controls

u_{α (ε)}

implemented by virtue of a standard technique. One also needs to use the form of the minimizing functional in (29) and the above obtained fact that the infimum is not greater than zero, in order to prove the strong convergence of these controls.) Then, it is necessary to pass to the limit in the obtained conditions, firstly as

α \to 0

, then, as

ε \to 0

and finally, as

δ \to 0

. At the same time, the transversality condition with respect to the

χ

-variable will yield the desired Estimate (18) by virtue of the expansion in Taylor series.

The proof is complete. □

6. Conclusions

In this article, second-order necessary conditions in the form of Estimate (18) have been derived for both normal and abnormal cases. The notion of normality is defined as the condition of full rank for the corresponding controllability matrix. In the normal case, the set of Lagrange multipliers is unique, upon normalization, while the multiplier

λ^{0}

is positive. In the abnormal case, it is essential that the reduced cone of Lagrange multipliers

Λ_{a}

is considered and that it has been proven non-empty. Along with the second-order necessary conditions, a refined version of the maximum condition in the form (4) has been obtained. The principal feature of the obtained result is that the maximum is taken over the reduced feasible set

Θ (x, t)

, which is the closure of the set of regular points.

Author Contributions

Conceptualization and methodology, A.V.A.; writing—original draft preparation, D.Y.K.; writing—review and editing, D.Y.K. and F.L.P.; supervision and project administration, F.L.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Higher Education of the Russian Federation, project no 075-15-2020-799.

Data Availability Statement

Not applicable.

Acknowledgments

The support of SYSTEC UID/EEA/00147, Ref. POCI-01-0145-FEDER-006933; SNAP project, Ref. NORTE-01-0145-FEDER-000085; and MAGIC project, Ref. POCI-01-0145-FEDER-032485; all funded by ERDF | NORTE 2020, PT2020, FEDER | COMPETE2020 | POCI | PIDDAC | FCT/MCTES, are acknowledged. Theorem 1 was obtained by A.V. Arutyunov under financial support of Russian Science Foundation (Project No 22-21-00863). Theorem 2 was obtained by A.V. Arutyunov under financial support of Russian Science Foundation (Project No 20-11-20131). The work of D.Yu. Karamzin was carried out according to the State Assignment of Russian Federation (State registration number AAAA-A19-119092390082-8, Topic 0063-2019-0010). The useful remarks of the anonymous reviewers are also acknowledged.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Agrachev, A.A.; Gamkrelidze, R.V. Index of Extremality and Quasiextremality. Russ. Math. Dokl. 1985, 284, 11–14.
2. Agrachev, A.A.; Gamkrelidze, R.V. Quasi-extremality for control systems. J. Sov. Math. 1991, 55, 1849–1864.
3. Arutyunov, A.V. Perturbations of extremal problems with constraints and necessary optimality conditions. J. Sov. Math. 1991, 54, 1342–1400.
4. Arutyunov, A.V.; Karamzin, D.Y.; Pereira, F.L.; Silva, G.N. Investigation of regularity conditions in optimal control problems with geometric mixed constraints. Optimization 2016, 65, 185–206.
5. Pontryagin, L.S.; Boltyanskii, V.G.; Gamkrelidze, R.V.; Mishchenko, E.F. Mathematical Theory of Optimal Processes; Nauka: Moscow, Russia, 1983.
6. Hestenes, M.R. Calculus of Variations and Optimal Control Theory; Wiley: New York, NY, USA, 1966.
7. Neustadt, L.W. Optimization; Princeton University Press: Princeton, NJ, USA, 1976.
8. Dubovitskii, A.Y.; Milyutin, A.A. Necessary conditions for a weak extremum in optimal control problems with mixed constraints of the inequality type. USSR Comput. Math. Math. Phys. 1968, 8, 24–98.
9. Devdariani, E.N.; Ledyaev, Y.S. Maximum Principle for Implicit Control Systems. Appl. Math. Optim. 1999, 40, 79–103.
10. Milyutin, A.A. Maximum Principle in a General Optimal Control Problem; Fizmatlit: Moscow, Russia, 2001. (In Russian)
11. Pinho, M.R.; Vinter, R.B.; Zheng, H. A maximum principle for optimal control problems with mixed constraints. IMA J. Math. Control Inform. 2001, 18, 189–205.
12. Clarke, F.; Pinho, M.R. Optimal control problems with mixed constraints. SIAM J. Control Optim. 2010, 48, 4500–4524.
13. Milyutin, A.A.; Osmolovskii, N.P. Calculus of Variations and Optimal Control; American Mathematical Society: Providence, RI, USA, 1998.
14. Arutyunov, A.V.; Karamzin, D.Y. Necessary conditions for a weak minimum in an optimal control problem with mixed constraints. Differ. Equ. 2005, 41, 1532–1543.
15. Penrose, R. A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 1955, 51, 406–413.
16. Robinson, S.M. Regularity and stability for convex multivalued functions. Math. Oper. Res. 1976, 1, 130–143.
17. Ioffe, A.D.; Tikhomirov, V.M. Some remarks on variational principles. Math. Notes 1997, 61, 48–253.
18. Arutyunov, A.V.; Karamzin, D. Square-root metric regularity and related stability theorems for smooth mappings. SIAM J. Optim. 2021, 31, 1380–1409.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
