#### **2. Preliminaries**

In this section, we recall necessary results and fix notation. We assume the reader to be familiar with the standard Riemann–Liouville and Caputo fractional calculi [19,20].

Let *α* be a real number in [0, 1] and let *ψ* be a non-negative continuous function defined on [0, 1] such that 

$$\int\_0^1 \psi(\alpha)d\alpha > 0.$$

This function *ψ* will act as a distribution of the order of differentiation.

**Definition 1** (See [1])**.** *The left and right-sided Riemann–Liouville distributed-order fractional derivatives of a function x* : [*a*, *b*] → R *are defined, respectively, by*

$$\mathbb{D}\_{a^{+}}^{\psi(\cdot)}x(t) = \int\_{0}^{1} \psi(\alpha) \cdot D\_{a^{+}}^{\alpha} x(t)\, d\alpha \quad \text{and} \quad \mathbb{D}\_{b^{-}}^{\psi(\cdot)}x(t) = \int\_{0}^{1} \psi(\alpha) \cdot D\_{b^{-}}^{\alpha} x(t)\, d\alpha,$$

*where D*<sup>*α*</sup><sub>*a*+</sub> *and D*<sup>*α*</sup><sub>*b*−</sub> *are, respectively, the left and right-sided Riemann–Liouville fractional derivatives of order α.*

**Definition 2** (See [1])**.** *The left and right-sided Caputo distributed-order fractional derivatives of a function x* : [*a*, *b*] → R *are defined, respectively, by*

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)}x(t) = \int\_{0}^{1} \psi(\alpha) \cdot {}^{C}D\_{a^{+}}^{\alpha} x(t)\, d\alpha \quad \text{and} \quad {}^{C}\mathbb{D}\_{b^{-}}^{\psi(\cdot)}x(t) = \int\_{0}^{1} \psi(\alpha) \cdot {}^{C}D\_{b^{-}}^{\alpha} x(t)\, d\alpha,$$

*where* <sup>*C*</sup>*D*<sup>*α*</sup><sub>*a*+</sub> *and* <sup>*C*</sup>*D*<sup>*α*</sup><sub>*b*−</sub> *are, respectively, the left and right-sided Caputo fractional derivatives of order α.*

As noted in [16], there is a relation between the Riemann–Liouville and the Caputo distributed-order fractional derivatives:

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)}x(t) = \mathbb{D}\_{a^{+}}^{\psi(\cdot)}x(t) - x(a)\int\_{0}^{1} \frac{\psi(\alpha)}{\Gamma(1-\alpha)}(t-a)^{-\alpha}\, d\alpha$$

and

$${}^{C}\mathbb{D}\_{b^{-}}^{\psi(\cdot)}x(t) = \mathbb{D}\_{b^{-}}^{\psi(\cdot)}x(t) - x(b)\int\_{0}^{1} \frac{\psi(\alpha)}{\Gamma(1-\alpha)}(b-t)^{-\alpha}\, d\alpha.$$

Throughout the text, we use the notation

$$\mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}x(t) = \int\_{0}^{1} \psi(\alpha) \cdot I\_{b^{-}}^{1-\alpha}x(t)\, d\alpha,$$

where *I*<sup>1−*α*</sup><sub>*b*−</sub> represents the right Riemann–Liouville fractional integral of order 1 − *α*.

The next result has an essential role in the proofs of our main results; that is, in the proofs of Theorems 1 and 2.

**Lemma 1** (Integration by parts formula [16])**.** *Let x be a continuous function and y a continuously differentiable function. Then,*

$$\int\_{a}^{b} x(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} y(t)\, dt = \left[ y(t) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} x(t) \right]\_{a}^{b} + \int\_{a}^{b} y(t) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} x(t)\, dt.$$

Next, we recall the standard notion of concave function, which will be used in Section 3.3.

**Definition 3** (See [21])**.** *A function h* : R<sup>*n*</sup> → R *is concave if*

$$h(\beta \theta\_1 + (1 - \beta)\theta\_2) \ge \beta h(\theta\_1) + (1 - \beta)h(\theta\_2)$$

*for all β* ∈ [0, 1] *and for all θ*<sub>1</sub>*, θ*<sub>2</sub> *in* R<sup>*n*</sup>*.*

**Lemma 2** (See [21])**.** *Let h* : R<sup>*n*</sup> → R *be a continuously differentiable function. Then h is a concave function if and only if it satisfies the so-called gradient inequality:*

$$h(\theta\_1) - h(\theta\_2) \ge \nabla h(\theta\_1)(\theta\_1 - \theta\_2)$$

*for all θ*<sub>1</sub>, *θ*<sub>2</sub> ∈ R<sup>*n*</sup>*.*
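The gradient inequality of Lemma 2 can be checked numerically on a simple case. The following is a minimal sketch, assuming the illustrative concave function h(x, u) = −(x² + u²) on R², which is not taken from the text; the function names are hypothetical. For this particular h, the gap between the two sides of the inequality equals the squared distance between the two points, so it is never negative.

```python
import random


def h(theta):
    # Illustrative concave function on R^2 (an assumption, not from the text).
    x, u = theta
    return -(x * x + u * u)


def grad_h(theta):
    # Gradient of h at theta.
    x, u = theta
    return (-2.0 * x, -2.0 * u)


def gradient_inequality_gap(theta1, theta2):
    # Left side minus right side of the gradient inequality:
    # h(theta1) - h(theta2) - grad h(theta1) . (theta1 - theta2).
    # Lemma 2 says this is non-negative when h is concave.
    g = grad_h(theta1)
    inner = sum(gi * (a - b) for gi, a, b in zip(g, theta1, theta2))
    return h(theta1) - h(theta2) - inner


def smallest_gap(n_pairs=1000, seed=0):
    # Smallest gap observed over randomly sampled pairs of points.
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_pairs):
        theta1 = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        theta2 = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        gaps.append(gradient_inequality_gap(theta1, theta2))
    return min(gaps)
```

Up to rounding error, `smallest_gap()` should return a non-negative number. A non-concave choice of `h` would produce negative gaps for some pairs.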

Finally, we recall a fractional version of Gronwall's inequality, which will be useful to prove the continuity of solutions in Section 3.1.

**Lemma 3** (See [22])**.** *Let α be a positive real number and let a*(·)*, b*(·)*, and u*(·) *be non-negative continuous functions on* [0, *T*] *with b*(·) *monotonic increasing on* [0, *T*)*. If*

$$u(t) \le a(t) + b(t) \int\_0^t (t - s)^{\alpha - 1} u(s) ds,$$

*then*

$$u(t) \le a(t) + \int\_0^t \left[ \sum\_{n=1}^\infty \frac{\left(b(t)\Gamma(\alpha)\right)^n}{\Gamma(n\alpha)} (t-s)^{n\alpha-1} a(s) \right] ds$$

*for all t* ∈ [0, *T*)*.*

#### **3. Main Results**

The basic problem of optimal control we consider in this work, denoted by (BP), consists in finding a piecewise continuous control *u* ∈ *PC* and the corresponding piecewise smooth state trajectory *x* ∈ *PC*<sup>1</sup> solution of the distributed-order non-local variational problem

$$\begin{aligned} J\left[x(\cdot), u(\cdot)\right] &= \int\_{a}^{b} L\left(t, x(t), u(t)\right) dt \longrightarrow \max, \\ {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x(t) &= f\left(t, x(t), u(t)\right), \quad t \in [a, b], \\ x(\cdot) &\in PC^{1}, \quad u(\cdot) \in PC, \\ x(a) &= x\_{a}, \end{aligned} \tag{\text{BP}}$$

where functions *L* and *f*, both defined on [*a*, *b*] × R × R, are assumed to be continuously differentiable in all three of their arguments: *L* ∈ *C*<sup>1</sup>, *f* ∈ *C*<sup>1</sup>. Our main contribution is to prove necessary (Section 3.2) and sufficient (Section 3.3) optimality conditions.

#### *3.1. Sensitivity Analysis*

Before we can prove necessary optimality conditions for problem (BP), we need to establish continuity and differentiability results on the state solutions for any control perturbation (Lemmas 4 and 5), which are then used in Section 3.2. The proof of Lemma 4 makes use of the following mean value theorem for integration, which can be found in any calculus textbook (see Lemma 1 of [23]): if *F* : [0, 1] → R is a continuous function and *ψ* is an integrable function that does not change sign on the interval, then there exists a number *ᾱ* ∈ [0, 1] such that

$$\int\_0^1 \psi(\alpha) F(\alpha)\, d\alpha = F(\bar{\alpha}) \int\_0^1 \psi(\alpha)\, d\alpha.$$
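To make the mean value theorem concrete, the sketch below locates the intermediate order numerically for one hypothetical pair of functions. The weight ψ(α) = 1 + α and the function F(α) = t<sup>−α</sup> with t = 0.5 are assumptions chosen only for illustration, as are all function names. Because this F is strictly increasing in α, plain bisection suffices; a general continuous F would need a different root finder.

```python
def simpson(g, lo, hi, n=200):
    # Composite Simpson rule with an even number n of subintervals.
    step = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * step)
    return total * step / 3.0


def psi(alpha):
    # Hypothetical non-negative weight on [0, 1] (not from the text).
    return 1.0 + alpha


def F(alpha, t=0.5):
    # Hypothetical continuous function; strictly increasing in alpha for 0 < t < 1.
    return t ** (-alpha)


def mean_value_order(tol=1e-12):
    # Find alpha_bar in [0, 1] with F(alpha_bar) * m equal to the weighted
    # integral of F, where m is the integral of psi over [0, 1].
    m = simpson(psi, 0.0, 1.0)
    weighted = simpson(lambda a: psi(a) * F(a), 0.0, 1.0)
    target = weighted / m
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), m, weighted
```

Calling `mean_value_order()` returns the approximate order together with the two integrals, so the identity can be verified directly by comparing `F(alpha_bar) * m` with `weighted`.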

**Lemma 4** (Continuity of solutions)**.** *Let u*<sup>*ε*</sup> *be a control perturbation around the optimal control u*<sup>∗</sup>*, that is, for all t* ∈ [*a*, *b*]*, u*<sup>*ε*</sup>(*t*) = *u*<sup>∗</sup>(*t*) + *εh*(*t*)*, where h*(·) ∈ *PC is a variation and ε* ∈ R*. Denote by x*<sup>*ε*</sup> *its corresponding state trajectory, solution of*

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) = f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right), \quad x^{\varepsilon}(a) = x\_{a}.$$

*Then, x*<sup>*ε*</sup> *converges to the optimal state trajectory x*<sup>∗</sup> *when ε tends to zero.*

**Proof.** Starting from the definition, we have, for all *t* ∈ [*a*, *b*], that

$$\left| {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) - {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\*}(t) \right| = \left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|.$$

Then, by linearity,

$$\left| {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) - {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\*}(t) \right| = \left| {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} \left( x^{\varepsilon}(t) - x^{\*}(t) \right) \right| = \left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|$$

and it follows, by definition of the distributed operator, that

$$\left| \int\_0^1 \psi(\alpha)\, {}^{C}D\_{a^{+}}^{\alpha} \left( x^{\varepsilon}(t) - x^{\*}(t) \right) d\alpha \right| = \left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|.$$

Now, using the mean value theorem for integration, and denoting *m* := ∫<sub>0</sub><sup>1</sup> *ψ*(*α*) *dα*, we obtain that there exists an *ᾱ* such that

$$\left| {}^{C}D\_{a^{+}}^{\bar{\alpha}} \left( x^{\varepsilon}(t) - x^{\*}(t) \right) \right| \le \frac{\left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|}{m}.$$

Clearly, one has

$${}^{C}D\_{a^{+}}^{\bar{\alpha}} \left( x^{\varepsilon}(t) - x^{\*}(t) \right) \le \left| {}^{C}D\_{a^{+}}^{\bar{\alpha}} \left( x^{\varepsilon}(t) - x^{\*}(t) \right) \right| \le \frac{\left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|}{m},$$

which leads to

$$x^{\varepsilon}(t) - x^{\*}(t) \le I\_{a^{+}}^{\bar{\alpha}} \left[ \frac{\left| f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - f\left(t, x^{\*}(t), u^{\*}(t)\right) \right|}{m} \right].$$

Moreover, because *f* is Lipschitz-continuous, we have

$$\left| f\left(t, x^{\varepsilon}, u^{\varepsilon}\right) - f\left(t, x^{\*}, u^{\*}\right) \right| \leq K\_{1} \left| x^{\varepsilon} - x^{\*} \right| + K\_{2} \left| u^{\varepsilon} - u^{\*} \right|.$$

By setting *K* = max{*K*<sub>1</sub>, *K*<sub>2</sub>}, it follows that

$$\begin{aligned} \left| x^{\varepsilon}(t) - x^{\*}(t) \right| &\leq \frac{K}{m} I\_{a^{+}}^{\bar{\alpha}} \left( \left| x^{\varepsilon}(t) - x^{\*}(t) \right| + \left| \varepsilon h(t) \right| \right) \\ &= \frac{K}{m} \left[ |\varepsilon| I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(t) \right| \right) + I\_{a^{+}}^{\bar{\alpha}} \left( \left| x^{\varepsilon}(t) - x^{\*}(t) \right| \right) \right] \\ &= \frac{K}{m} \left[ |\varepsilon| I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(t) \right| \right) + \frac{1}{\Gamma(\bar{\alpha})} \int\_{a}^{t} (t - s)^{\bar{\alpha} - 1} \left| x^{\varepsilon}(s) - x^{\*}(s) \right| ds \right] \end{aligned}$$

for all *t* ∈ [*a*, *b*]. Now, by applying Lemma 3 (the fractional Gronwall inequality), it follows that

$$\begin{split} \left| x^{\varepsilon}(t) - x^{\*}(t) \right| &\leq \frac{K}{m} \left[ |\varepsilon| I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(t) \right| \right) + |\varepsilon| \int\_{a}^{t} \left( \sum\_{i=0}^{\infty} \frac{1}{\Gamma(i\bar{\alpha})} (t-s)^{i\bar{\alpha}-1} I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(s) \right| \right) \right) ds \right] \\ &= |\varepsilon| \frac{K}{m} \left[ I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(t) \right| \right) + \int\_{a}^{t} \left( \sum\_{i=1}^{\infty} \frac{1}{\Gamma(i\bar{\alpha}+1)} (t-s)^{i\bar{\alpha}} I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(s) \right| \right) \right) ds \right] \\ &\leq |\varepsilon| \frac{K}{m} \left[ I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(t) \right| \right) + \int\_{a}^{t} \left( \sum\_{i=1}^{\infty} \frac{\delta^{i\bar{\alpha}}}{\Gamma(i\bar{\alpha}+1)} I\_{a^{+}}^{\bar{\alpha}} \left( \left| h(s) \right| \right) \right) ds \right]. \end{split}$$

The series in the last inequality is a Mittag–Leffler function and thus convergent. Hence, by taking the limit when *ε* tends to zero, we obtain the desired result: *x*<sup>*ε*</sup> → *x*<sup>∗</sup> for all *t* ∈ [*a*, *b*].
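The convergence claim can be illustrated by summing the series directly. The sketch below evaluates a truncated Mittag–Leffler series E<sub>α</sub>(z) = Σ<sub>i≥0</sub> z<sup>i</sup>/Γ(iα + 1) for z ≥ 0, and the series of the proof as E<sub>ᾱ</sub>(δ<sup>ᾱ</sup>) − 1. The function names, the truncation at 200 terms, and the restriction to non-negative arguments are assumptions made for this illustration only.

```python
import math


def mittag_leffler(alpha, z, n_terms=200):
    # Truncated series E_alpha(z) = sum_{i>=0} z^i / Gamma(i*alpha + 1),
    # for alpha > 0 and z >= 0. Terms are computed through logarithms so that
    # large Gamma values do not overflow.
    if z == 0.0:
        return 1.0
    log_z = math.log(z)
    total = 0.0
    for i in range(n_terms):
        total += math.exp(i * log_z - math.lgamma(i * alpha + 1.0))
    return total


def proof_series(alpha_bar, delta):
    # sum_{i>=1} delta^(i*alpha_bar) / Gamma(i*alpha_bar + 1),
    # which equals E_{alpha_bar}(delta^alpha_bar) - 1.
    return mittag_leffler(alpha_bar, delta ** alpha_bar) - 1.0
```

Two classical special cases give a quick sanity check: E<sub>1</sub>(z) is the exponential function and E<sub>2</sub>(z) equals cosh(√z), so `mittag_leffler(1.0, 1.0)` and `mittag_leffler(2.0, 1.0)` can be compared with `math.e` and `math.cosh(1.0)`.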

**Lemma 5** (Differentiation of the perturbed trajectory)**.** *There exists a function η defined on* [*a*, *b*] *such that*

$$\mathbf{x}^{\epsilon}(t) = \mathbf{x}^\*(t) + \epsilon \eta(t) + o(\epsilon).$$

**Proof.** Since *f* ∈ *C*<sup>1</sup>, we have that

$$f(t, \mathbf{x}^{\varepsilon}, \mathbf{u}^{\varepsilon}) = f(t, \mathbf{x}^{\*}, \mathbf{u}^{\*}) + (\mathbf{x}^{\varepsilon} - \mathbf{x}^{\*}) \frac{\partial f(t, \mathbf{x}^{\*}, \mathbf{u}^{\*})}{\partial \mathbf{x}} + (\mathbf{u}^{\varepsilon} - \mathbf{u}^{\*}) \frac{\partial f(t, \mathbf{x}^{\*}, \mathbf{u}^{\*})}{\partial \mathbf{u}} + o(|\mathbf{x}^{\varepsilon} - \mathbf{x}^{\*}|, |\mathbf{u}^{\varepsilon} - \mathbf{u}^{\*}|).$$

Observe that *u*<sup>*ε*</sup> − *u*<sup>∗</sup> = *εh*(*t*) and *u*<sup>*ε*</sup> → *u*<sup>∗</sup> when *ε* → 0 and, by Lemma 4, we have *x*<sup>*ε*</sup> → *x*<sup>∗</sup> when *ε* → 0. Thus, the residue term can be expressed in terms of *ε* only, that is, the residue is *o*(*ε*). Therefore, we have

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) = {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\*}(t) + (x^{\varepsilon} - x^{\*}) \frac{\partial f(t, x^{\*}, u^{\*})}{\partial x} + \varepsilon h(t) \frac{\partial f(t, x^{\*}, u^{\*})}{\partial u} + o(\varepsilon),$$

which leads to

$$\lim\_{\varepsilon \to 0} \left[ \frac{{}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} (x^{\varepsilon} - x^{\*})}{\varepsilon} - \frac{(x^{\varepsilon} - x^{\*})}{\varepsilon} \frac{\partial f(t, x^{\*}, u^{\*})}{\partial x} - h(t) \frac{\partial f(t, x^{\*}, u^{\*})}{\partial u} \right] = 0,$$

meaning that

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)}\left(\lim\_{\varepsilon\to 0}\frac{x^{\varepsilon}-x^{\*}}{\varepsilon}\right) = \lim\_{\varepsilon\to 0}\frac{x^{\varepsilon}-x^{\*}}{\varepsilon}\frac{\partial f(t,x^{\*},u^{\*})}{\partial x} + h(t)\frac{\partial f(t,x^{\*},u^{\*})}{\partial u}.$$

We want to prove the existence of the limit lim<sub>*ε*→0</sub> (*x*<sup>*ε*</sup> − *x*<sup>∗</sup>)/*ε* =: *η*, that is, to prove that *x*<sup>*ε*</sup>(*t*) = *x*<sup>∗</sup>(*t*) + *εη*(*t*) + *o*(*ε*). This is indeed the case, since *η* is the solution of the distributed-order fractional differential equation

$$\begin{cases} {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)}\eta(t) = \frac{\partial f(t, x^{\*}, u^{\*})}{\partial x} \eta(t) + \frac{\partial f(t, x^{\*}, u^{\*})}{\partial u} h(t), \\\\ \eta(a) = 0. \end{cases}$$

The intended result is proven.

#### *3.2. Pontryagin's Maximum Principle of Distributed-Order*

The following result is a necessary condition of Pontryagin type [24] for the basic distributed-order non-local optimal control problem (BP).

**Theorem 1** (Pontryagin Maximum Principle for (BP))**.** *If* (*x*<sup>∗</sup>(·), *u*<sup>∗</sup>(·)) *is an optimal pair for* (BP)*, then there exists λ* ∈ *PC*<sup>1</sup>*, called the adjoint variable, such that the following conditions hold for all t in the interval* [*a*, *b*]*:*

• *The optimality condition*

$$\frac{\partial L}{\partial u}(t, \mathbf{x}^\*(t), \boldsymbol{u}^\*(t)) + \lambda(t) \frac{\partial f}{\partial u}(t, \mathbf{x}^\*(t), \boldsymbol{u}^\*(t)) = 0;\tag{1}$$

• *The adjoint equation*

$$\mathbb{D}\_{b^{-}}^{\psi(\cdot)}\lambda(t) = \frac{\partial L}{\partial x}(t, x^{\*}(t), u^{\*}(t)) + \lambda(t)\frac{\partial f}{\partial x}(t, x^{\*}(t), u^{\*}(t));\tag{2}$$

• *The transversality condition*

$$\mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda(b) = 0.\tag{3}$$

**Proof.** Let (*x*<sup>∗</sup>(·), *u*<sup>∗</sup>(·)) be the solution to problem (BP), *h*(·) ∈ *PC* be a variation, and *ε* a real constant. Define *u*<sup>*ε*</sup>(*t*) = *u*<sup>∗</sup>(*t*) + *εh*(*t*), so that *u*<sup>*ε*</sup> ∈ *PC*. Let *x*<sup>*ε*</sup> be the state corresponding to the control *u*<sup>*ε*</sup>, that is, the state solution of

$${}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) = f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right), \quad x^{\varepsilon}(a) = x\_{a}.\tag{4}$$

Note that *u*<sup>*ε*</sup>(*t*) → *u*<sup>∗</sup>(*t*) for all *t* ∈ [*a*, *b*] whenever *ε* → 0. Furthermore,

$$
\frac{\partial u^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} = h(t). \tag{5}
$$

Something similar is also true for *x*<sup>*ε*</sup>. Because *f* ∈ *C*<sup>1</sup>, it follows from Lemma 4 that, for each fixed *t*, *x*<sup>*ε*</sup>(*t*) → *x*<sup>∗</sup>(*t*) as *ε* → 0. Moreover, by Lemma 5, the derivative ∂*x*<sup>*ε*</sup>(*t*)/∂*ε* at *ε* = 0 exists for each *t*. The objective functional at (*x*<sup>*ε*</sup>, *u*<sup>*ε*</sup>) is

$$J[x^{\varepsilon}, u^{\varepsilon}] = \int\_a^b L\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) dt.$$

Next, we introduce the adjoint function *λ*. Let *λ*(·) be in *PC*<sup>1</sup>, to be determined. By the integration by parts formula (see Lemma 1),

$$\int\_{a}^{b} \lambda(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t)\, dt = \left[ x^{\varepsilon}(t) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(t) \right]\_{a}^{b} + \int\_{a}^{b} x^{\varepsilon}(t) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t)\, dt,$$

and one has

$$\int\_{a}^{b} \lambda(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t)\, dt - \int\_{a}^{b} x^{\varepsilon}(t) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t)\, dt - x^{\varepsilon}(b) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(b) + x^{\varepsilon}(a) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(a) = 0.$$

Adding this zero to the expression *J*[*x*<sup>*ε*</sup>, *u*<sup>*ε*</sup>] gives

$$\begin{split} \phi(\varepsilon) = J[x^{\varepsilon}, u^{\varepsilon}] &= \int\_{a}^{b} \left[ L\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) + \lambda(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} x^{\varepsilon}(t) - x^{\varepsilon}(t) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) \right] dt \\ &\quad - x^{\varepsilon}(b) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(b) + x^{\varepsilon}(a) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(a), \end{split}$$

which by (4) is equivalent to

$$\begin{split} \phi(\varepsilon) = J[x^{\varepsilon}, u^{\varepsilon}] &= \int\_{a}^{b} \left[ L\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) + \lambda(t) \cdot f\left(t, x^{\varepsilon}(t), u^{\varepsilon}(t)\right) - x^{\varepsilon}(t) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)}\lambda(t) \right] dt \\ &\quad - x^{\varepsilon}(b) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda(b) + x\_{a} \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda(a). \end{split}$$

Since the process (*x*<sup>∗</sup>, *u*<sup>∗</sup>) = (*x*<sup>0</sup>, *u*<sup>0</sup>) is assumed to be a maximizer of problem (BP), the derivative of *φ*(*ε*) with respect to *ε* must vanish at *ε* = 0; that is,

$$\begin{split} 0 &= \phi'(0) = \frac{d}{d\varepsilon} J[x^{\varepsilon}, u^{\varepsilon}] \Big|\_{\varepsilon=0} \\ &= \int\_{a}^{b} \Bigg[ \frac{\partial L}{\partial x} \frac{\partial x^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} + \frac{\partial L}{\partial u} \frac{\partial u^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} + \lambda(t) \left( \frac{\partial f}{\partial x} \frac{\partial x^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} + \frac{\partial f}{\partial u} \frac{\partial u^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} \right) \\ &\qquad - \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) \frac{\partial x^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} \Bigg] dt - \frac{\partial x^{\varepsilon}(b)}{\partial \varepsilon} \Big|\_{\varepsilon=0} \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(b), \end{split}$$

where the partial derivatives of *L* and *f*, with respect to *x* and *u*, are evaluated at (*t*, *x*<sup>∗</sup>(*t*), *u*<sup>∗</sup>(*t*)). Rearranging the terms and using (5), we obtain that

$$\int\_{a}^{b} \left[ \left( \frac{\partial L}{\partial x} + \lambda(t) \frac{\partial f}{\partial x} - \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) \right) \frac{\partial x^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon=0} + \left( \frac{\partial L}{\partial u} + \lambda(t) \frac{\partial f}{\partial u} \right) h(t) \right] dt - \frac{\partial x^{\varepsilon}(b)}{\partial \varepsilon} \Big|\_{\varepsilon=0} \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(b) = 0.$$

Setting *H*(*t*, *x*, *u*, *λ*) = *L*(*t*, *x*, *u*) + *λ f*(*t*, *x*, *u*), it follows that

$$\int\_{a}^{b} \left[ \left( \frac{\partial H}{\partial x} - \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) \right) \frac{\partial x^{\varepsilon}(t)}{\partial \varepsilon} \Big|\_{\varepsilon = 0} + \frac{\partial H}{\partial u} h(t) \right] dt - \frac{\partial x^{\varepsilon}(b)}{\partial \varepsilon} \Big|\_{\varepsilon = 0} \mathbb{I}\_{b^{-}}^{1 - \psi(\cdot)} \lambda(b) = 0,$$

where the partial derivatives of *H* are evaluated at (*t*, *x*<sup>∗</sup>(*t*), *u*<sup>∗</sup>(*t*), *λ*(*t*)). Now, choosing

$$\mathbb{D}\_{b^{-}}^{\psi(\cdot)}\lambda(t) = \frac{\partial H}{\partial x}\left(t, x^{\*}(t), u^{\*}(t), \lambda(t)\right), \quad \text{with } \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda(b) = 0,$$

that is, given the adjoint equation (2) and the transversality condition (3), it yields

$$\int\_{a}^{b} \frac{\partial H}{\partial u} \left( t, x^{\*}(t), u^{\*}(t), \lambda(t) \right) h(t)\, dt = 0$$

and, by the fundamental lemma of the calculus of variations [25], we have the optimality condition (1):

$$\frac{\partial H}{\partial u}\left(t, x^{\*}(t), u^{\*}(t), \lambda(t)\right) = 0.$$

This concludes the proof.

**Remark 1.** *If we change the basic optimal control problem* (BP) *by replacing the boundary condition given on the state variable at the initial time, x*(*a*) = *x*<sub>*a*</sub>*, with a terminal condition, then the optimality condition and the adjoint equation of the Pontryagin Maximum Principle (Theorem 1) remain exactly the same. Changes appear only in the transversality condition:*

• *A boundary condition at the terminal time, that is, fixing the value x*(*b*) = *x*<sub>*b*</sub> *with x*(*a*) *remaining free, leads to*

$$
\mathbb{I}\_{a^{-}}^{1-\psi(\cdot)}\lambda(a) = 0;
$$

• *In the case when no boundary condition is given (i.e., both x*(*a*) *and x*(*b*) *are free), we have*

$$\mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda(b) = 0 \quad \text{and} \quad \mathbb{I}\_{a^{-}}^{1-\psi(\cdot)}\lambda(a) = 0.$$

**Remark 2.** *If f*(*t*, *x*, *u*) = *u, that is,* <sup>*C*</sup>𝔻<sup>*ψ*(·)</sup><sub>*a*+</sub> *x*(*t*) = *u*(*t*)*, then our problem* (BP) *gives a basic problem of the calculus of variations, in the distributed-order fractional sense of [16]. In this very particular case, we obtain from our Theorem 1 the Euler–Lagrange equation of [16] (cf. Theorem 2 of [16]).*

**Remark 3.** *Our distributed-order fractional optimal control problem* (BP) *can be easily extended to the vector setting. Precisely, let x* := (*x*<sub>1</sub>, ... , *x*<sub>*n*</sub>) *and u* := (*u*<sub>1</sub>, ... , *u*<sub>*m*</sub>) *with* (*n*, *m*) ∈ N<sup>2</sup>*, such that m* ≤ *n, and let the functions f* : [*a*, *b*] × R<sup>*n*</sup> × R<sup>*m*</sup> → R<sup>*n*</sup> *and L* : [*a*, *b*] × R<sup>*n*</sup> × R<sup>*m*</sup> → R *be continuously differentiable with respect to all their components. If* (*x*<sup>∗</sup>, *u*<sup>∗</sup>) *is an optimal pair, then the following conditions hold for t* ∈ [*a*, *b*]*:*

• *The optimality conditions*

$$\frac{\partial L}{\partial u\_i}(t, \mathbf{x}^\*(t), \boldsymbol{u}^\*(t)) + \lambda(t) \cdot \frac{\partial f}{\partial u\_i}(t, \mathbf{x}^\*(t), \boldsymbol{u}^\*(t)) = 0, \quad i = 1, \dots, m;$$

• *The adjoint equations*

$$\mathbb{D}\_{b^{-}}^{\psi(\cdot)}\lambda\_{j}(t) = \frac{\partial L}{\partial x\_{j}}(t, x^{\*}(t), u^{\*}(t)) + \lambda(t) \cdot \frac{\partial f}{\partial x\_{j}}(t, x^{\*}(t), u^{\*}(t)), \quad j = 1, \dots, n;$$

• *The transversality conditions*

$$\mathbb{I}\_{b^{-}}^{1-\psi(\cdot)}\lambda\_{j}(b) = 0, \quad j = 1, \ldots, n. \tag{6}$$

**Definition 4.** *The candidates to solutions of* (BP)*, obtained by the application of our Theorem 1, will be called (Pontryagin) extremals.*

We now illustrate the usefulness of our Theorem 1 with an example.

**Example 1.** *The triple* (*x̃*, *ũ*, *λ*) *given by x̃*(*t*) = *t*<sup>2</sup>*, ũ*(*t*) = *t*(*t* − 1)/ln *t, and λ*(*t*) = 0*, for t* ∈ [0, 1]*, is an extremal of the following distributed-order fractional optimal control problem:*

$$J[x(\cdot), u(\cdot)] = \int\_0^1 \left[ -\left(x(t) - t^2\right)^2 - \left(u(t) - \frac{t(t-1)}{\ln t}\right)^2 \right] dt \longrightarrow \max,$$

$${}^{C}\mathbb{D}\_{0^{+}}^{\psi(\cdot)}x(t) = u(t), \quad t \in [0, 1],\tag{7}$$

$$x(0) = 0.$$

*Indeed, by defining the Hamiltonian function as*

$$H(t, x, u, \lambda) = -\left[ (x - t^2)^2 + \left( u - \frac{t(t-1)}{\ln t} \right)^2 \right] + \lambda u, \tag{8}$$

*it follows:*

• *From the optimality condition ∂H*/*∂u* = 0*,*

$$\lambda(t) = 2\left(u - \frac{t(t-1)}{\ln t}\right);\tag{9}$$

• *From the adjoint equation* 𝔻<sup>*ψ*(·)</sup><sub>1−</sub> *λ*(*t*) = *∂H*/*∂x,*

$$\mathbb{D}\_{1^{-}}^{\psi(\cdot)}\lambda(t) = -2(x - t^{2});\tag{10}$$

• *From the transversality condition,*

$$\mathbb{I}\_{1^{-}}^{1-\psi(\cdot)}\lambda(1) = 0.\tag{11}$$

*We easily see that* (9)*,* (10) *and* (11) *are satisfied for*

$$
x(t) = t^2, \quad u(t) = \frac{t(t-1)}{\ln t}, \quad \lambda(t) = 0.
$$
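The state equation (7) can also be checked numerically for the extremal of Example 1. The text above does not state which order distribution *ψ* is used, so the sketch below assumes ψ(α) = Γ(3 − α)/2, one choice for which the distributed-order Caputo derivative of t² equals t(t − 1)/ln t. It relies on the standard closed form of the Caputo derivative of t² with lower limit 0, namely 2t<sup>2−α</sup>/Γ(3 − α). All function names are hypothetical.

```python
import math


def psi(alpha):
    # Assumed order distribution (not given in the text): Gamma(3 - alpha) / 2.
    return math.gamma(3.0 - alpha) / 2.0


def caputo_of_t_squared(t, alpha):
    # Closed-form Caputo derivative of order alpha of x(t) = t^2, lower limit 0.
    return 2.0 * t ** (2.0 - alpha) / math.gamma(3.0 - alpha)


def simpson(g, lo, hi, n=200):
    # Composite Simpson rule with an even number n of subintervals.
    step = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * step)
    return total * step / 3.0


def distributed_caputo_of_t_squared(t):
    # Integral over alpha in [0, 1] of psi(alpha) times the Caputo derivative,
    # following Definition 2 with a = 0.
    return simpson(lambda al: psi(al) * caputo_of_t_squared(t, al), 0.0, 1.0)


def u_tilde(t):
    # Candidate control from Example 1; requires 0 < t < 1 since ln(1) = 0.
    return t * (t - 1.0) / math.log(t)
```

For any t strictly between 0 and 1, `distributed_caputo_of_t_squared(t)` and `u_tilde(t)` should agree up to quadrature error. The endpoints are excluded because the formula for the control is a 0/0 expression at t = 1 and involves ln 0 at t = 0.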

#### *3.3. Sufficient Condition for Global Optimality*

We now prove a Mangasarian type theorem for the distributed-order fractional optimal control problem (BP).

**Theorem 2.** *Consider the basic distributed-order fractional optimal control problem* (BP)*. If* (*x*, *u*) ↦ *L*(*t*, *x*, *u*) *and* (*x*, *u*) ↦ *f*(*t*, *x*, *u*) *are concave and* (*x̃*, *ũ*, *λ*) *is a Pontryagin extremal with λ*(*t*) ≥ 0*, t* ∈ [*a*, *b*]*, then*

$$J[\tilde{x}, \tilde{u}] \ge J[x, u]$$

*for any admissible pair* (*x*, *u*)*.*

**Proof.** Because *L* is concave as a function of *x* and *u*, we have from Lemma 2 that

$$L\left(t, \tilde{x}(t), \tilde{u}(t)\right) - L\left(t, x(t), u(t)\right) \geq \frac{\partial L}{\partial x}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{x}(t) - x(t)\right) + \frac{\partial L}{\partial u}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{u}(t) - u(t)\right)$$

for any control *u* and its associated trajectory *x*. This gives

$$\begin{split} J[\tilde{x}(\cdot), \tilde{u}(\cdot)] - J[x(\cdot), u(\cdot)] &= \int\_{a}^{b} \left[ L\left(t, \tilde{x}(t), \tilde{u}(t)\right) - L\left(t, x(t), u(t)\right) \right] dt \\ &\geq \int\_{a}^{b} \left[ \frac{\partial L}{\partial x}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{x}(t) - x(t)\right) + \frac{\partial L}{\partial u}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{u}(t) - u(t)\right) \right] dt. \end{split} \tag{12}$$

From the adjoint equation (2), we have

$$\frac{\partial L}{\partial x}(t, \tilde{x}(t), \tilde{u}(t)) = \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) - \lambda(t) \frac{\partial f}{\partial x}(t, \tilde{x}(t), \tilde{u}(t)).$$

From the optimality condition (1), we know that

$$\frac{\partial L}{\partial u}(t, \tilde{x}(t), \tilde{u}(t)) = -\lambda(t) \frac{\partial f}{\partial u}(t, \tilde{x}(t), \tilde{u}(t)).$$

It follows from (12) that

$$\begin{split} J[\tilde{x}(\cdot), \tilde{u}(\cdot)] - J[x(\cdot), u(\cdot)] &\geq \int\_{a}^{b} \Big[ \left( \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t) - \lambda(t) \frac{\partial f}{\partial x} \left( t, \tilde{x}(t), \tilde{u}(t) \right) \right) \cdot \left( \tilde{x}(t) - x(t) \right) \\ &\qquad - \lambda(t) \frac{\partial f}{\partial u} \left( t, \tilde{x}(t), \tilde{u}(t) \right) \cdot \left( \tilde{u}(t) - u(t) \right) \Big] dt. \end{split} \tag{13}$$

Using the integration by parts formula of Lemma 1,

$$\int\_{a}^{b} \lambda(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} \left( \tilde{x}(t) - x(t) \right) dt = \left[ \left( \tilde{x}(t) - x(t) \right) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(t) \right]\_{a}^{b} + \int\_{a}^{b} \left( \tilde{x}(t) - x(t) \right) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t)\, dt,$$

meaning that

$$\begin{split} \int\_{a}^{b} \left(\tilde{x}(t) - x(t)\right) \cdot \mathbb{D}\_{b^{-}}^{\psi(\cdot)} \lambda(t)\, dt &= \int\_{a}^{b} \lambda(t) \cdot {}^{C}\mathbb{D}\_{a^{+}}^{\psi(\cdot)} \left(\tilde{x}(t) - x(t)\right) dt \\ &\quad - \left[\left(\tilde{x}(t) - x(t)\right) \cdot \mathbb{I}\_{b^{-}}^{1-\psi(\cdot)} \lambda(t)\right]\_{a}^{b}. \end{split} \tag{14}$$

Substituting (14) into (13), and noting that the boundary term vanishes because *x̃*(*a*) = *x*(*a*) = *x*<sub>*a*</sub> and the transversality condition (3) holds, while the state equation gives <sup>*C*</sup>𝔻<sup>*ψ*(·)</sup><sub>*a*+</sub>(*x̃* − *x*) = *f*(*t*, *x̃*, *ũ*) − *f*(*t*, *x*, *u*), we get

$$\begin{split} J[\tilde{x}(\cdot), \tilde{u}(\cdot)] - J[x(\cdot), u(\cdot)] &\geq \int\_{a}^{b} \lambda(t) \Big[ f\left(t, \tilde{x}(t), \tilde{u}(t)\right) - f\left(t, x(t), u(t)\right) \\ &\qquad - \frac{\partial f}{\partial x}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{x}(t) - x(t)\right) - \frac{\partial f}{\partial u}\left(t, \tilde{x}(t), \tilde{u}(t)\right) \cdot \left(\tilde{u}(t) - u(t)\right) \Big] dt. \end{split}$$

Finally, taking into account that *λ*(*t*) ≥ 0 and that *f* is concave in both *x* and *u*, so that the bracketed term is non-negative by Lemma 2, we conclude that *J*[*x̃*(·), *ũ*(·)] − *J*[*x*(·), *u*(·)] ≥ 0.

**Example 2.** *The extremal* (*x̃*, *ũ*, *λ*) *given in Example 1 is a global maximizer for problem* (7)*. This is easily checked from Theorem 2 since the Hamiltonian defined in* (8) *is a concave function with respect to both variables x and u and, furthermore, λ*(*t*) ≡ 0*. In Figure 1, we give the plots of the optimal solution to problem* (7)*.*

**Figure 1.** The optimal control *u*<sup>∗</sup> and corresponding optimal state variable *x*<sup>∗</sup>, solution of problem (7).
