Application of Mini-Batch Metaheuristic Algorithms in Problems of Optimization of Deterministic Systems with Incomplete Information about the State Vector

Panteleev, Andrei V.; Lobanov, Aleksandr V.

doi:10.3390/a14110332

Application of Mini-Batch Metaheuristic Algorithms in Problems of Optimization of Deterministic Systems with Incomplete Information about the State Vector^†

by Andrei V. Panteleev * and Aleksandr V. Lobanov

Department of Mathematics and Cybernetics, Moscow Aviation Institute, National Research University, 4, Volokolamskoe Shosse, 125993 Moscow, Russia

* Author to whom correspondence should be addressed.

† This paper is an extended version of the conference paper "Application of the mini-batch adaptive method of random search (MAMRS) in problems of optimal in mean control of the trajectory pencils", in Proceedings of the 19th International Conference “Aviation and Cosmonautics” (AviaSpace-2020), Moscow, Russia, 23–27 November 2020.

Algorithms 2021, 14(11), 332; https://doi.org/10.3390/a14110332

Submission received: 27 September 2021 / Revised: 3 November 2021 / Accepted: 12 November 2021 / Published: 14 November 2021

(This article belongs to the Special Issue Metaheuristic Algorithms and Applications)


Abstract: In this paper, we consider the application of a zero-order mini-batch optimization method to the problem of finding the optimal control of a pencil of trajectories of nonlinear deterministic systems with incomplete information about the state vector. The pencil of trajectories originates from a given set of initial states. To solve the problem, we propose a feedback system structure that contains models of the plant, the measuring system, a nonlinear state observer and a control law of fixed structure with unknown coefficients. The proposed objective function measures the quality of control of the pencil of trajectories, estimated by the average value of the Bolza functional over the given set of initial states. The unknown control laws of the plant and the observer are sought as expansions in orthonormal systems of basis functions defined on the set of possible states of the dynamical system. The original problem is reduced to a global optimization problem, which is solved by a well-proven zero-order method that uses a modified mini-batch approach in a random search procedure with adaptation. An algorithm for solving the problem is proposed, and the satellite stabilization problem with incomplete information is solved.

Keywords: mini-batch algorithms; metaheuristic; optimal control; satellite stabilization problem

1. Introduction

A general approach to the numerical solution of the problem of finding the average optimal control of nonlinear deterministic dynamical systems under conditions of uncertainty in setting the initial conditions and incomplete information about the state vector is proposed. Since direct information about the state vector is not available, a nonlinear state observer is included in the closed-loop control system, which finds an estimate of the state vector from the output of the nonlinear model of the measuring system. The control laws of the plant and the observer are found simultaneously as functions of time and estimates of the state vector. In contrast to linear systems with a quadratic criterion, in which the synthesis of the optimal controller and the optimal filter is performed independently, in the proposed procedure, the undefined coefficients of the control laws of the plant and the observer are sought simultaneously [1].

An alternative way is to use various numerical methods for solving the Bellman equation as a sufficient condition for optimality of feedback control in the complete state information problem. In this case, arbitrary initial conditions are considered, for which the minimum of the functional should be obtained. When solving practical problems of control theory, it is usually possible to define a set of initial states, determined by the conditions of operation of the control system, and for this set to search for the corresponding law of control with feedback. To complete the solution, one should find the parameters of the nonlinear observer independently and use the estimate of the state vector in the optimal control law instead of exact information about the state vector.

In the present paper, the behavior of a nonlinear continuous deterministic plant (the model of the object) is described by a system of ODEs. Parallelepiped constraints are imposed on the coordinates of the control vector. The initial conditions are given by a compact set of initial states. The quality of control of a separate trajectory is estimated by the value of the Bolza functional. For the given set of initial conditions, a pencil of trajectories is considered. The performance index to be minimized is the average value of the Bolza functional over the set of initial states. The problem is to find the control laws for the plant and the state observer in the class of functional expansions in terms of elements of orthonormal basis systems with unknown coefficients, depending on time and on the estimates of the state vector coordinates. The components of the control laws are found using systems of basis functions that are used in problems of spectral analysis [2,3]. It is proposed to apply the mini-batch adaptive method of random search (MAMRS) [4,5,6] to the problem under consideration and to analyze the solution for various models of the measuring system. As a special case, the control problem with complete information about the state vector is considered. MAMRS can be classified as a metaheuristic method [7,8,9,10,11]. It extends the idea of stochastic gradient methods [12,13,14,15] to a method that does not require information about the gradient. The efficiency of the method is demonstrated and analyzed by solving an applied optimal control problem of satellite stabilization [16].

2. Statement of the Problem

We consider the nonlinear continuous dynamical system described by the vector differential equation:

$$\dot{x}(t) = f(t, x(t), u(t)), \tag{1}$$

where $f(t, x, u)$ is a given continuous function; $t \in T = [t_{0}; t_{1}]$ is continuous time, with the initial moment $t_{0}$ and the final moment $t_{1}$ specified; $x \in R^{n}$ is the state vector; $u \in U \subseteq R^{q}$ is the control vector; and $U = [a_{1}, b_{1}] \times \dots \times [a_{q}, b_{q}]$ is the set of allowable control values.

The initial conditions are specified as:

$$x(t_{0}) = x_{0} \in Ω \subset R^{n}, \tag{2}$$

where $Ω$ is a set with positive measure ($\mathrm{mes}\, Ω > 0$) and a piecewise smooth boundary. It characterizes the uncertainty in setting the initial conditions.

The model of the measuring system is described by the relation:

$$z(t) = h(t, x(t)), \tag{3}$$

where $z \in R^{m}$ is the output vector and $h(t, x)$ is a given continuous function. The information coming from the model of the measuring system arrives at the input of the state observer, which produces an estimate of the state vector.

We suppose that it is possible to obtain an estimate of the state vector using a nonlinear observer of the form:

$$\frac{d \hat{x}(t)}{d t} = f(t, \hat{x}(t), u(t, \hat{x}(t))) + K(t, \hat{x}(t)) [z(t) - h(t, \hat{x}(t))], \tag{4}$$

$$\hat{x}(t_{0}) = \hat{x}_{0}, \tag{5}$$

where $\hat{x}(t)$ is the state vector estimate, $\hat{x}_{0} \in Ω$ is the initial estimate and $K(t, \hat{x}) \in R^{n \times m}$ is an unknown continuous $n \times m$ matrix function. This matrix is considered as a feedback control of the observation process. The state vector estimate is also used in the plant control law $u(t, \hat{x})$.

We define the set of admissible control laws $\mathcal{U}$ as the set of pairs of functions $(u(t, \hat{x}), K(t, \hat{x}))$ such that, $\forall t \in T$, the plant control $u(t) = u(t, \hat{x}(t)) \in U$ is piecewise continuous and the observer control $K(t) = K(t, \hat{x}(t)) \in R^{n \times m}$ is continuous. It is assumed that the solution of the system of Equations (1) and (4) with the initial conditions (2) and (5), taking into account (3), exists and is unique.

The performance index for a separate trajectory is:

$$I(x_{0}, u(t, \hat{x}(t)), K(t, \hat{x}(t))) = \int_{t_{0}}^{t_{1}} f^{0}(t, x(t), u(t, \hat{x}(t)), K(t, \hat{x}(t)))\, d t + F(x(t_{1})), \tag{6}$$

where $f^{0}(t, x, u, K)$ and $F(x)$ are given continuous functions.

We associate with each admissible control law $(u(t, \hat{x}), K(t, \hat{x})) \in \mathcal{U}$ and the set $Ω$ of initial states the pencil of trajectories of the system of Equations (1) and (4):

$$X(t, u(t, \hat{x}), K(t, \hat{x})) = \bigcup \{ x(t, u(t, \hat{x}), K(t, \hat{x}), x_{0}),\ \hat{x}(t, u(t, \hat{x}), K(t, \hat{x}), \hat{x}_{0}) \mid x_{0} \in Ω \},$$

that is, the union of the solutions of the system of Equations (1) and (4) over all possible initial states from the set $Ω$.

The performance index for the control of the pencil of trajectories, which is to be minimized, is:

$$J[u(t, \hat{x}), K(t, \hat{x})] = \int_{Ω} I(x_{0}, u(t, \hat{x}(t)), K(t, \hat{x}(t)))\, d x_{0} / \mathrm{mes}\, Ω. \tag{7}$$

The optimal control problem is to choose the admissible control policy $(u^{*}(t, \hat{x}), K^{*}(t, \hat{x})) \in \mathcal{U}$ so that performance index (7) is minimized:

$$J[u^{*}(t, \hat{x}), K^{*}(t, \hat{x})] = \min_{(u(t, \hat{x}), K(t, \hat{x})) \in \mathcal{U}} J[u(t, \hat{x}), K(t, \hat{x})]. \tag{8}$$

Since the average value of performance index (6) is minimized on the set of initial states $Ω$, the required control is called optimal on average.

3. Solution Search Strategy

We consider the transition to the parametric optimization problem from the control problem (8), i.e., to the problem of finding unknown coefficients of the plant control and the observer control. The plant control constraints of parallelepiped type should be taken into account.

To implement this transition, we use the following assumptions:

1. The set of initial states $Ω$ is a parallelepiped defined by the direct product of segments $[α_{i}; β_{i}], i = \overline{1, n}$, i.e., $Ω = [α_{1}; β_{1}] \times \dots \times [α_{n}; β_{n}]$. Using a step $Δ x_{i}$, each segment $[α_{i}; β_{i}]$ is divided into $N_{i}$ subsegments, so the parallelepiped $Ω$ is divided into $N = N_{1} \cdots N_{n}$ elementary disjoint subsets $Ω_{k}, k = \overline{1, N}$. In each elementary subset $Ω_{k}$, an initial state $x_{0}^{k}$ (the center of the parallelepiped $Ω_{k}$) is specified;
2. The direct product $Q = [\underline{x_{1}}, \bar{x_{1}}] \times \dots \times [\underline{x_{n}}, \bar{x_{n}}]$ represents the set of admissible values of the state vector coordinates, where $\underline{x_{i}}, \bar{x_{i}}, i = \overline{1, n}$ are the lower and upper boundaries for each coordinate, respectively, determined by the applied problem being solved. Therefore, one can assume that the possible estimates of the state vector satisfy the conditions $\hat{x}_{1} \in [\underline{x_{1}}, \bar{x_{1}}], \dots, \hat{x}_{n} \in [\underline{x_{n}}, \bar{x_{n}}]$;
3. The plant control policy is sought in the form:

$$u_{j}(t, \hat{x}(t)) = \mathrm{sat}\, \underbrace{g_{j}(t, \hat{x}_{1}(t), \dots, \hat{x}_{n}(t))}_{v_{j}(t)}, \quad j = \overline{1, q}, \tag{9}$$

where the saturation function $\mathrm{sat}$ guarantees the fulfillment of the plant control constraints $a_{j} \leq u_{j}(t) = u_{j}(t, \hat{x}(t)) \leq b_{j}$:

$$\mathrm{sat}\, v_{j}(t) = \begin{cases} v_{j}(t), & a_{j} < v_{j}(t) < b_{j}, \\ a_{j}, & v_{j}(t) \leq a_{j}, \\ b_{j}, & v_{j}(t) \geq b_{j}, \end{cases} \tag{10}$$

$$g_{j}(t, \hat{x}_{1}, \dots, \hat{x}_{n}) = \sum_{i_{0} = 0}^{L_{0} - 1} \sum_{i_{1} = 0}^{L_{1} - 1} \dots \sum_{i_{n} = 0}^{L_{n} - 1} u_{i_{0}, i_{1}, \dots, i_{n}}^{j} \cdot q(i_{0}, t)\, p_{1}(i_{1}, \hat{x}_{1}) \dots p_{n}(i_{n}, \hat{x}_{n}), \tag{11}$$

where $u_{i_{0}, i_{1}, \dots, i_{n}}^{j}$ are unknown coefficients; $L_{0}, L_{1}, \dots, L_{n}$ are scales of truncation; $q(i_{0}, t), i_{0} = \overline{0, L_{0} - 1}$ is a system of orthonormal time functions (basis functions) defined on the segment $[t_{0}, t_{1}]$ and satisfying the condition

$$\int_{t_{0}}^{t_{1}} q(i, t)\, q(j, t)\, d t = \begin{cases} 1, & i = j, \\ 0, & i \neq j; \end{cases}$$

and $p_{j}(i_{j}, \hat{x}_{j}), i_{j} = \overline{0, L_{j} - 1}$ is a system of orthonormal functions of the variable $\hat{x}_{j}$ (basis functions) defined on the interval $[\underline{x_{j}}, \bar{x_{j}}]$, $j = 1, \dots, n$.

As the basis functions $q(i_{0}, t), p_{k}(i_{k}, \hat{x}_{k}), k = \overline{1, n}$, one can take, for example:

Legendre polynomials:

$$p(n, x) = \sum_{k = 0}^{n} (C_{n}^{k})^{2} (\tilde{x} - 1)^{n - k} \tilde{x}^{k}, \quad n = \overline{0, L - 1};$$

Cosine:

$$p(n, x) = \cos(n \cdot π \cdot (2 \cdot \tilde{x} - 1)), \quad n = \overline{0, L - 1};$$

where $\tilde{x} = (x - \underline{x}) / (\bar{x} - \underline{x})$, and other systems of basis functions.
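To make the two example bases concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard shifted Legendre form with the factor $(\tilde{x} - 1)^{n - k}$ and leaves the polynomials unnormalized; the helper names `scaled`, `legendre_shifted`, `cosine_basis` and `inner` are hypothetical. The midpoint-rule inner product lets one check orthogonality numerically on an interval such as $[13, 15]$.

```python
import math

def scaled(x, lo, hi):
    """Map x from [lo, hi] to x_tilde in [0, 1]."""
    return (x - lo) / (hi - lo)

def legendre_shifted(n, x, lo=0.0, hi=1.0):
    """Shifted Legendre polynomial of degree n on [lo, hi] (unnormalized)."""
    xt = scaled(x, lo, hi)
    return sum(math.comb(n, k) ** 2 * (xt - 1.0) ** (n - k) * xt ** k
               for k in range(n + 1))

def cosine_basis(n, x, lo=0.0, hi=1.0):
    """Cosine basis function cos(n * pi * (2 * x_tilde - 1))."""
    xt = scaled(x, lo, hi)
    return math.cos(n * math.pi * (2.0 * xt - 1.0))

def inner(f, g, lo, hi, m=2000):
    """Midpoint-rule approximation of the integral of f * g over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(f(lo + (i + 0.5) * h) * g(lo + (i + 0.5) * h)
                   for i in range(m))
```

With this form, the squared norm of the degree-$n$ polynomial on $[\underline{x}, \bar{x}]$ is $(\bar{x} - \underline{x}) / (2n + 1)$, so dividing by its square root yields an orthonormal system.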

The matrix entries $K_{i j}(t, \hat{x}), i = 1, \dots, n; j = 1, \dots, m$ of the state observer control policy $K(t, \hat{x})$ are found by a formula similar to (11), where the variable $u$ is replaced by $K$.
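The evaluation of expansion (11) followed by the saturation (10) can be sketched as below. This is an illustrative sketch under stated assumptions: coefficients are stored in a dictionary keyed by the index tuple $(i_0, i_1, \dots, i_n)$, unnormalized shifted Legendre polynomials serve as the basis for both time and state estimates, and the function names and the bounds used in any call are hypothetical rather than taken from the paper.

```python
import itertools
import math

def legendre_shifted(n, xt):
    """Unnormalized shifted Legendre polynomial of degree n at xt in [0, 1]."""
    return sum(math.comb(n, k) ** 2 * (xt - 1.0) ** (n - k) * xt ** k
               for k in range(n + 1))

def sat(v, a, b):
    """Saturation (10): clip v to the control bounds [a, b]."""
    return min(max(v, a), b)

def expansion(coeffs, t, x_hat, t_span, x_bounds, shape):
    """Evaluate g_j(t, x_hat) from (11).

    coeffs   : dict mapping index tuples (i0, i1, ..., in) to coefficients
    t_span   : (t0, t1)
    x_bounds : list of (lower, upper) bounds for each state coordinate
    shape    : truncation scales (L0, L1, ..., Ln)
    """
    t0, t1 = t_span
    # Scale time and every state estimate to [0, 1].
    args = [(t - t0) / (t1 - t0)]
    args += [(x - lo) / (hi - lo) for x, (lo, hi) in zip(x_hat, x_bounds)]
    total = 0.0
    for idx in itertools.product(*(range(L) for L in shape)):
        term = coeffs.get(idx, 0.0)
        for i, a in zip(idx, args):
            term *= legendre_shifted(i, a)
        total += term
    return total

def control_component(coeffs, t, x_hat, t_span, x_bounds, shape, a, b):
    """Plant control component (9): saturated expansion."""
    return sat(expansion(coeffs, t, x_hat, t_span, x_bounds, shape), a, b)
```

The same `expansion` routine can be reused for each entry $K_{ij}$ of the observer gain, without the saturation, since no bounds on $K$ are stated.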

The value of the pencil control cost functional (7) is approximated by the average over the $N$ initial states $x_{0}^{k}$:

$$J[u(t, \hat{x}), K(t, \hat{x})] ≅ (1 / N) \sum_{k = 1}^{N} I(x_{0}^{k}, u(t, \hat{x}(t)), K(t, \hat{x}(t))). \tag{12}$$

The optimization problem is to choose the best parameters $u_{i_{0}, i_{1}, \dots, i_{n}}^{j}$, $K_{i_{0}, i_{1}, \dots, i_{n}}^{i, j}$ minimizing performance index (12) by using the mini-batch adaptive method of random search (MAMRS) [4]. The strategy of its application is that, for the approximate calculation of functional (12), $d$ randomly selected non-coinciding trajectories emanating from the set of initial states are used; they form a mini-batch:

$$J_{d}[u(t, \hat{x}), K(t, \hat{x})] = (1 / d) \sum_{k = 1}^{d} I(x_{0}^{k}, u(t, \hat{x}(t)), K(t, \hat{x}(t))). \tag{13}$$

The mini-batch size is user-definable, $1 \leq d \leq N$, and is usually selected step by step. For simplicity of presentation, we assume that each coordinate of the control laws $u(t, \hat{x})$ and $K(t, \hat{x})$ can be associated with a column of the coefficients $u_{i_{0}, i_{1}, \dots, i_{n}}^{j}$, $K_{i_{0}, i_{1}, \dots, i_{n}}^{i, j}$. By concatenation, one can then represent the entire set of optimized parameters in the form of an extended vector. Let us denote it by $K_{d}$ and assume that it has dimension $(n \times 1)$. The objective function is denoted by $J_{d}(K_{d})$. For each mini-batch size $1 \leq d \leq N$, the optimization results are different. When $d \to N$, the accuracy of solving the optimization problem in general increases.
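The mini-batch estimate (13) amounts to averaging the single-trajectory cost over $d$ distinct, randomly chosen initial states. A minimal sketch follows; `minibatch_cost` and its arguments are hypothetical names, and `cost_of_trajectory` stands for any routine that returns the value of functional (6) for one initial state under the current parameters.

```python
import random

def minibatch_cost(cost_of_trajectory, initial_states, d, rng):
    """Average the trajectory cost over d distinct random initial states.

    cost_of_trajectory : callable mapping an initial state to its cost I
    initial_states     : list of the N initial states x_0^k
    d                  : mini-batch size, 1 <= d <= N
    rng                : random.Random instance (for reproducibility)
    """
    if not 1 <= d <= len(initial_states):
        raise ValueError("mini-batch size must satisfy 1 <= d <= N")
    # random.sample draws without replacement, so the d states are distinct.
    batch = rng.sample(initial_states, d)
    return sum(cost_of_trajectory(x0) for x0 in batch) / d
```

For $d = N$ the mini-batch contains every initial state, so the estimate coincides with the full average (12).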

4. Mini-Batch Adaptive Search Algorithm

Let us consider the optimization problem $J_{d}[K_{d}] \to \min_{K_{d}}$.

Notation: $J_{d}^{s}$ is the minimum value of the cost function after the $s$-th run; $\hat{K}_{d}^{s}$ is the best parameter column vector after the $s$-th run; $d$ is the mini-batch size.

Step 0. Set the initial mini-batch size $d = 1$ (in general, one can start with any value $1 \leq d \leq N$) and the parameters: $S_{\max}$, the maximum number of starts; $B_{\max}$, the maximum number of passes; $α = 1.618$, the expansion coefficient; $β = 0.618$, the compression coefficient; $M$, the maximum number of failed tests at the current iteration; $t_{0} = 1$, the initial step size (one can use any value $t_{0} > R$); $R$, the minimum step size; $L$, the maximum number of iterations; and $r$, the number of initial trial solutions ($1 \leq r \leq 10$).

Step 1. Set the values $b = 1$ (counter of passes) and $P_{d} = 0$ (initial value of the sum of the average values of the cost function).

Step 2. Set the values $s = 1$ (counter of starts), $J_{d}^{1}$ equal to a large number in the range from $10^{8}$ to $10^{10}$, and $S_{d} = 0$ (initial value of the sum of the objective function values).

Step 3. Define the initial values of the coefficients $u_{i_{0}, i_{1}, \dots, i_{n}}^{j}$, $K_{i_{0}, i_{1}, \dots, i_{n}}^{i, j}$: generate $r$ vectors $K_{d}^{s}$ whose coordinates are uniformly distributed on given intervals. Calculate the value of the function $J_{d}$ for each generated vector and order the vectors according to the value of the objective function [17]. The vector with the smallest value of the objective function is denoted by the column vector $K_{d}^{s, 0}$. Put $l = 0, j = 1$.

Step 4. Generate a random vector $ξ^{j} = (ξ_{1}^{j}, \dots, ξ_{q n}^{j})^{T}$, where each $ξ_{i}^{j}$ is a random variable uniformly distributed on the interval $[-1, 1]$.

Step 5. Calculate $y^{j} = K_{d}^{s, l} + t_{l} \frac{ξ^{j}}{‖ ξ^{j} ‖}$.

Step 6. Generate a mini-batch of size $d$, i.e., generate $d$ pairwise distinct sets of values $q_{1} \in N_{1}, \dots, q_{n} \in N_{n}$ defining the initial states $x_{0}^{k} \in Ω$ with numbers $k = q_{1} \cdot \dots \cdot q_{n} \in \{1, \dots, N\}$ or corresponding to the tuple $\langle q_{1}, \dots, q_{n} \rangle$.

Check the fulfillment of the conditions:

(a) If $J_{d}(y^{j}) < J_{d}(K_{d}^{s, l})$, the step is successful. Put $z^{j} = K_{d}^{s, l} + α (y^{j} - K_{d}^{s, l})$ and determine whether the current direction $y^{j} - K_{d}^{s, l}$ is successful. If $J_{d}(z^{j}) < J_{d}(K_{d}^{s, l})$, the search direction is successful: put $K_{d}^{s, l + 1} = z^{j}$, $t_{l + 1} = α t_{l}$, $l = l + 1$ and check the termination condition. If $l < L$, put $j = 1$ and go to step 4; if $l = L$, the search process is over: put $\hat{K}_{d}^{s} = K_{d}^{s, l}$ and go to step 8. If $J_{d}(z^{j}) \geq J_{d}(K_{d}^{s, l})$, the search direction is unsuccessful; go to step 7;
(b) If $J_{d}(y^{j}) \geq J_{d}(K_{d}^{s, l})$, the step is unsuccessful; go to step 7.

Step 7. Count the number of unsuccessful steps from the current solution:

(a) If $j < M$, put $j = j + 1$ and go to step 4;
(b) If $j = M$, check the termination condition: if $t_{l} \leq R$, the process is over: put $\hat{K}_{d}^{s} = K_{d}^{s, l}$ and $J_{d}^{s} = J_{d}(K_{d}^{s, l})$ and go to step 8; if $t_{l} > R$, put $t_{l} = β t_{l}$, $j = 1$ and go to step 4.

Step 8. Check the improvement of the cost function value as a result of the $s$-th run: if $J_{d}(K_{d}^{s, l}) < J_{d}^{s}$, put $J_{d}^{s} = J_{d}(K_{d}^{s, l})$ and $\hat{K}_{d}^{s} = K_{d}^{s, l}$ and go to step 9; if $J_{d}(K_{d}^{s, l}) \geq J_{d}^{s}$, go to step 9.

Step 9. Calculate $S_{d} = S_{d} + J_{d}^{s}$ and verify the stop condition (the maximum number of starts is reached): if $s < S_{\max}$, put $s = s + 1$ and go to step 3; if $s = S_{\max}$, put $\hat{K}_{d} = \hat{K}_{d}^{s}$ (the best solution during the $b$-th pass for the given $d$), calculate $m_{d} = S_{d} / S_{\max}$ and $σ_{m_{d}} = \left(\frac{1}{S_{\max} - 1} \sum_{s = 1}^{S_{\max}} [J_{d}^{s} - m_{d}]^{2}\right)^{1 / 2}$ and go to step 10.

Step 10*. Put $P_{d} = P_{d} + m_{d}$ and $m_{d}^{b} = m_{d}$ and check the condition for completing the given number of passes: if $b < B_{\max}$, put $b = b + 1$ and go to step 2; if $b = B_{\max}$, calculate $\bar{m}_{d} = \sum_{b = 1}^{B_{\max}} m_{d}^{b} / B_{\max}$ and $σ_{\bar{m}_{d}} = \left(\frac{1}{B_{\max} - 1} \sum_{b = 1}^{B_{\max}} [m_{d}^{b} - \bar{m}_{d}]^{2}\right)^{1 / 2}$.

Step 11*. Check the condition for completing the study of the effect of the mini-batch size: if $d < N$, put $d = d + 1, s = 1$ and go to step 1; if $d = N$, go to step 12.

Step 12. As a result, find the best estimate $\hat{K}_{d}^{*}$ after $B_{\max}$ passes and the indicators $\bar{m}_{d}$ and $σ_{\bar{m}_{d}}$ for each value of the mini-batch size $d$. To analyze the resulting estimation accuracy, find the value $J(\hat{K}_{d}^{*})$.

Steps 10 and 11 are performed if necessary. It is recommended to do restarts to increase the chances of finding a global extremum. The best solution is selected from the restarts made.
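The inner loop of a single start (steps 4 to 7) can be sketched in Python as follows. This is a simplified illustration, not the authors' software: the function name, the `draw_batch` callback and the default parameter values are hypothetical, the trial points $y^{j}$, $z^{j}$ and the current point are all evaluated on the same freshly drawn mini-batch (one possible reading of step 6), and restarts, passes and the statistics of steps 9 and 10 are omitted.

```python
import math
import random

def adaptive_random_search(cost, k0, draw_batch, rng, t0=1.0, R=1e-4,
                           M=15, L=1000, alpha=1.618, beta=0.618):
    """One start of an adaptive random search (steps 4-7, simplified).

    cost       : callable cost(k, batch) returning the mini-batch objective
    k0         : initial parameter vector (list of floats)
    draw_batch : callable returning a fresh mini-batch for each trial
    rng        : random.Random instance
    """
    k, t, l, j = list(k0), t0, 0, 1
    while l < L:
        # Step 4: random direction with coordinates uniform on [-1, 1].
        xi = [rng.uniform(-1.0, 1.0) for _ in k]
        norm = math.sqrt(sum(c * c for c in xi))
        if norm == 0.0:
            continue  # degenerate direction, draw again
        # Step 5: trial point at distance t from the current point.
        y = [ki + t * c / norm for ki, c in zip(k, xi)]
        # Step 6: fresh mini-batch, then compare objective values on it.
        batch = draw_batch()
        base = cost(k, batch)
        if cost(y, batch) < base:
            # Successful step: try an expanded move along the same direction.
            z = [ki + alpha * (yi - ki) for ki, yi in zip(k, y)]
            if cost(z, batch) < base:
                k, t, l, j = z, alpha * t, l + 1, 1
                continue
        # Step 7: count the failure; shrink the step after M failures.
        if j < M:
            j += 1
        elif t <= R:
            break  # step size below the minimum: this start is over
        else:
            t, j = beta * t, 1
    return k
```

In a full run, this routine would be called $S_{\max}$ times from different initial vectors, keeping the best result, as the restart recommendation above suggests.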

5. Satellite Stabilization Problem

The problem of damping the rotational motion of a satellite by the engines installed on it is considered. After the transition to dimensionless variables, the system describing the motion of a rigid body relative to the center of inertia has the form:

$$\begin{cases} \dot{p}(t) = u_{1}(t) / 6, \\ \dot{q}(t) = u_{2}(t) - 0.2\, r(t)\, p(t), \\ \dot{r}(t) = 0.2\, (u_{3}(t) + p(t)\, q(t)), \end{cases}$$

where $p, q, r$ are the projections of the angular velocity onto the main central axes of inertia, $t \in [0, 1]$, and $u_{1}, u_{2}, u_{3}$ are controls that characterize the thrust of the engines located on the satellite.

The set of initial states is given by a uniform distribution law on the set

$$Ω = [23; 25] \times [13; 15] \times [13; 15].$$

At the final moment of the system functioning, the following conditions must be fulfilled:

$$p(1) = q(1) = r(1) = 0,$$

corresponding to the meaning of the satellite stabilization problem. The fulfillment of the terminal conditions should be accompanied by minimization of the fuel used to turn the satellite.

The functional (6) takes the form:

$$I = \int_{0}^{1} [| u_{1}(t) | + | u_{2}(t) | + | u_{3}(t) |]\, d t + 10^{3} \cdot [p^{2}(1) + q^{2}(1) + r^{2}(1)].$$

Next, we will consider two examples: the joint estimations and control problem with incomplete information about the state vector and the optimal control problem with complete information about the state vector.
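For orientation, the cost of a single trajectory of the satellite model with complete state information can be computed as in the sketch below. The classical fourth-order Runge-Kutta scheme and the step count are assumptions made for illustration (the source does not state which integrator was used), and the names `rhs_aug` and `trajectory_cost` are hypothetical. The fuel integral is accumulated as a fourth state component.

```python
def rhs_aug(t, s, control):
    """Right-hand side of the satellite model augmented with the fuel integral."""
    p, q, r, _ = s
    u = control(t, (p, q, r))
    return (u[0] / 6.0,
            u[1] - 0.2 * r * p,
            0.2 * (u[2] + p * q),
            abs(u[0]) + abs(u[1]) + abs(u[2]))

def trajectory_cost(x0, control, steps=1000):
    """Integrate on [0, 1] with classical RK4 and return the functional I.

    x0      : initial state (p, q, r)
    control : callable control(t, x) returning (u1, u2, u3)
    """
    h = 1.0 / steps
    s = (x0[0], x0[1], x0[2], 0.0)
    for i in range(steps):
        t = i * h
        k1 = rhs_aug(t, s, control)
        k2 = rhs_aug(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k1)), control)
        k3 = rhs_aug(t + h / 2, tuple(a + h / 2 * b for a, b in zip(s, k2)), control)
        k4 = rhs_aug(t + h, tuple(a + h * b for a, b in zip(s, k3)), control)
        s = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(s, k1, k2, k3, k4))
    p, q, r, fuel = s
    # Fuel term plus the terminal penalty with weight 10^3.
    return fuel + 1e3 * (p * p + q * q + r * r)
```

As a sanity check, with zero control $p$ stays constant and $(q, r)$ rotate with $q^{2} + r^{2}$ conserved, so for $x_{0} = (24, 14, 14)^{T}$ the cost reduces to the terminal penalty $10^{3} (24^{2} + 14^{2} + 14^{2})$.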

5.1. Example 1. The Joint Estimation and Control Problem

The proposed observer equation is:

$$\begin{cases} \frac{d \hat{p}}{d t} = u_{1}(t, \hat{x}(t)) / 6 + K_{1}(t, \hat{x}(t)) [z(t) - h(t, \hat{x}(t))], \\ \frac{d \hat{q}}{d t} = u_{2}(t, \hat{x}(t)) - 0.2\, \hat{r} \hat{p} + K_{2}(t, \hat{x}(t)) [z(t) - h(t, \hat{x}(t))], \\ \frac{d \hat{r}}{d t} = 0.2\, (u_{3}(t, \hat{x}(t)) + \hat{p} \hat{q}) + K_{3}(t, \hat{x}(t)) [z(t) - h(t, \hat{x}(t))]. \end{cases}$$

Further, we will consider the cases of solving the problem with different models of the measuring system.
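The joint evolution of the plant and the observer can be sketched as below. This is a minimal illustration under assumptions: explicit Euler integration with a fixed step, the measurement $z = (r, p)^{T}$ as the default output map, and a gain supplied as a $3 \times 2$ nested list whose rows play the role of $K_{1}, K_{2}, K_{3}$. The function names are hypothetical and the gain values used in any call are placeholders, not results from the paper.

```python
def f(x, u):
    """Satellite dynamics for the state x = (p, q, r) and control u."""
    p, q, r = x
    return [u[0] / 6.0, u[1] - 0.2 * r * p, 0.2 * (u[2] + p * q)]

def h_case_a(x):
    """Measurement model z = (r, p)^T."""
    p, q, r = x
    return [r, p]

def simulate(x0, xhat0, gain, control, h=h_case_a, steps=1000):
    """Integrate plant and observer on [0, 1] with explicit Euler steps.

    gain    : callable gain(t, x_hat) returning a 3 x 2 matrix (list of rows)
    control : callable control(t, x_hat) returning (u1, u2, u3)
    """
    dt = 1.0 / steps
    x, xh = list(x0), list(xhat0)
    for i in range(steps):
        t = i * dt
        u = control(t, xh)  # the control law sees only the estimate
        innovation = [zi - hi for zi, hi in zip(h(x), h(xh))]
        K = gain(t, xh)
        dx = f(x, u)
        # Observer: copy of the plant model plus the gain times the innovation.
        dxh = [fi + sum(kij * e for kij, e in zip(row, innovation))
               for fi, row in zip(f(xh, u), K)]
        x = [a + dt * b for a, b in zip(x, dx)]
        xh = [a + dt * b for a, b in zip(xh, dxh)]
    return x, xh
```

If the initial estimate equals the true initial state, the innovation stays zero and the estimate follows the state for any gain; the optimization of $K$ matters precisely because $\hat{x}_{0}$ generally differs from $x_{0}$.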

In all tests, the number of initial states is $N = 27$ and the scale of truncation is $L_{0} = L_{1} = L_{2} = L_{3} = 2$. The initial state estimation vector is $\hat{x}(0) = \hat{x}_{0} = (24, 15, 13)^{T}$. The parameters of the mini-batch adaptive method of random search are $B_{\max} = 1000$, $M = 15$ and $R = 8 \cdot 10^{- 5}$. To synthesize the plant control $u(t, \hat{x})$ and the observer control $K(t, \hat{x})$, a system of orthonormal Legendre polynomials is used.
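A grid of $N = 27$ initial states consistent with the partition of $Ω$ into elementary cells can be generated as sketched below. The split into $3 \times 3 \times 3$ cells is an assumption made for illustration, since the source states only the total $N = 27$; the helper name `cell_centers` is hypothetical.

```python
import itertools

def cell_centers(bounds, counts):
    """Return the centers of the elementary cells of a parallelepiped.

    bounds : list of (lower, upper) pairs, one per coordinate
    counts : number of subsegments N_i along each coordinate
    """
    axes = []
    for (lo, hi), n in zip(bounds, counts):
        step = (hi - lo) / n
        # Center of the i-th subsegment along this coordinate.
        axes.append([lo + (i + 0.5) * step for i in range(n)])
    return list(itertools.product(*axes))

# Omega = [23; 25] x [13; 15] x [13; 15], assumed split 3 x 3 x 3.
centers = cell_centers([(23.0, 25.0), (13.0, 15.0), (13.0, 15.0)], (3, 3, 3))
```

A mini-batch of size $d$ is then a random subset of $d$ distinct entries of `centers`.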

5.1.1. Case A

The measuring system model is described by the following relationship:

$$z(t) = (r(t), p(t))^{T}.$$

The behavior of the set of trajectories for different mini-batch sizes is shown in Figure 1.

Table 1 shows the results of solving the problem depending on the mini-batch size.

5.1.2. Case B

The measuring system model is described by the following relationship:

$$z(t) = (p(t), q(t))^{T}.$$

The behavior of the set of trajectories for different mini-batch sizes is shown in Figure 2.

Table 2 shows the results of solving the problem depending on the mini-batch size.

5.1.3. Case C

The measuring system model is described by the following relationship:

$$z(t) = (r(t), q(t))^{T}.$$

The behavior of the set of trajectories for different mini-batch sizes is shown in Figure 3.

Table 3 shows the results of solving the problem depending on the mini-batch size.

Based on Table 1, Table 2 and Table 3, we can conclude that, with an increase of the mini-batch size, the accuracy of the problem solution also increases.

Figure 4 and Table 4 show the solution of the satellite stabilization problem for the different models of the measuring system with the mini-batch size $d = 27$. They indicate a similar character of convergence for the different models of the measuring system.

5.2. Example 2. The Control Problem with Complete Information about the State Vector

The measuring system model is described by the following relationship:

$$z(t) = (p(t), q(t), r(t))^{T}.$$

In this case, there is no need to use a state observer because complete information about the state vector is available at an arbitrary moment in time. In practice, this case is rarely realized, but it is of interest for analyzing the losses in the value of the cost functional caused by the incompleteness of the information received. In all tests, the number of generated random initial states is $N = 27$ and the scale of truncation is $L_{1} = L_{2} = L_{3} = 2$. The parameters of the mini-batch adaptive method of random search are $B_{\max} = 1000$, $M = 30$ and $R = 8 \cdot 10^{- 9}$. To synthesize the plant control $u(t, x)$, a system of orthonormal Legendre polynomials is used.

The behavior of the set of trajectories for different mini-batch sizes is shown in Figure 5.

Table 5 shows the results of solving the problem depending on the mini-batch size.

Based on the results of Examples 1 and 2, we can conclude that good convergence of the estimates of the state vector coordinates to the true values is already achieved with the mini-batch size $d = 10$. The total execution time of the algorithm was 30 min with the mini-batch size $d = 10$ and 90 min with the mini-batch size $d = 27$ on an Intel Core i5 2.10 GHz processor. The results obtained indicate that, when using mini-batches, the required quality of transients is achieved at reasonable computational costs.
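The loss associated with incomplete information can be read off the reported cost values for $d = 27$ (Tables 4 and 5). The short sketch below only divides those published numbers; the variable names are hypothetical. Because Example 2 was run with different algorithm parameters ($M$ and $R$) than Example 1, the ratios should be treated as indicative rather than as a pure measure of information loss.

```python
# Reported values of J for d = 27 (Table 5 and Table 4).
complete = 98.8108
incomplete = {
    "z = (r, p)": 167.1622,
    "z = (p, q)": 236.1635,
    "z = (r, q)": 351.7748,
}

# Ratio of the cost with incomplete information to the complete-information cost.
ratios = {name: value / complete for name, value in incomplete.items()}

for name, ratio in ratios.items():
    print(f"{name}: cost ratio {ratio:.2f}")
```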

6. Conclusions

The developed zero-order metaheuristic optimization algorithm, namely, a mini-batch adaptive method of random search, is tested on the satellite stabilization problem of finding the optimal control for a pencil of trajectories of nonlinear deterministic systems emanating from a given set of initial states. The software for solving the problem of satellite stabilization is developed. Three cases of solving the problem for different models of the measuring system with incomplete information are considered. The analysis of the problem solution for different models of the measuring system with incomplete information is carried out. A comparison is made with the solution of the problem with a model of the measuring system containing complete information about the state vector. The study of the influence of the mini-batch size on the accuracy of the solution in each considered problem is carried out. Recommendations on the choice of the algorithm parameters are given. The obtained numerical results confirm the idea that, for a certain mini-batch size, an acceptable quality of transient processes can be achieved with low computational costs.

Author Contributions

Conceptualization, A.V.P.; Methodology, A.V.P. and A.V.L.; Software, A.V.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Davtyan, L.G.; Panteleev, A.V. Method of Parametric Optimization of Nonlinear Continuous Systems of Joint Estimation and Control. J. Comput. Syst. Sci. Int. 2019, 58, 360–373.
2. Rybakov, K.A. Modeling and Analysis of Output Processes of Linear Continuous Stochastic Systems Based on Orthogonal Expansions of Random. J. Comput. Syst. Sci. Int. 2020, 59, 322–337.
3. Rybakov, K.A. Spectral method of analysis and optimal estimation in linear stochastic systems. Int. J. Model. Simul. Sci. Comput. 2020, 11, 2050022.
4. Panteleev, A.V.; Lobanov, A.V. The mini-batch adaptive method of random search (MAMRS) for parameters optimization in the tracking control problem. IOP Conf. Ser. Mater. Sci. Eng. 2020, 927, 012025.
5. Panteleev, A.V.; Lobanov, A.V. Mini-Batch Adaptive Random Search Method for the Parametric Identification of Dynamic Systems. Autom. Remote Control. 2020, 81, 2026–2045.
6. Panteleev, A.V.; Lobanov, A.V. Application of the mini-batch adaptive method of random search (MAMRS) in problems of optimal in mean control of the trajectory pencils. J. Phys. Conf. Ser. 2021, 1925, 012006.
7. Floudas, C.; Pardalos, P. Encyclopedia of Optimization; Springer: New York, NY, USA, 2009.
8. Gendreau, M. Handbook of Metaheuristics; Springer: New York, NY, USA, 2010.
9. Yilmaz, V. Automated ground filtering of LiDAR and UAS point clouds with metaheuristics. Optics Laser Technol. 2021, 138, 106890.
10. Seyyedabbasi, A.; Aliyev, R.; Kiani, F.; Gulle, M.U.; Basyildiz, H.; Shah, M.A. Hybrid algorithms based on combining reinforcement learning and metaheuristic methods to solve global optimization problems. Knowl. Based Syst. 2021, 223, 107044.
11. Shokouhifar, M. FH-ACO: Fuzzy heuristic-based ant colony optimization for joint virtual network function placement and routing. Appl. Soft Comput. 2021, 107, 107401.
12. Ruder, S. An Overview of Gradient Descent Optimization Algorithms. arXiv 2017, arXiv:1609.04747.
13. Yuan, H.; Ma, T. Federated Accelerated Stochastic Gradient Descent. arXiv 2020, arXiv:2006.08950.
14. Mustapha, A.; Mohamed, L.; Ali, K. An Overview of Gradient Descent Algorithm Optimization in Machine Learning: Application in the Ophthalmology Field. SADASC 2020, 1207, 349–359.
15. Qian, X.; Klabjan, D. The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent. arXiv 2020, arXiv:2004.13146.
16. Krylov, I.A. Numerical solution of the problem of the optimal stabilization of an artificial satellite. USSR Comput. Math. Math. Phys. 1968, 8, 284–291.
17. Peng, W.; Jiyun, B.; Jun, M. A Hybrid Genetic Ant Colony Optimization Algorithm with an Embedded Cloud Model for Continuous Optimization. J. Inf. Process. Syst. 2020, 16, 1169–1182.

Figure 1. The behavior of the satellite coordinates $p, q, r$ (red) and their estimates (blue) for different mini-batch sizes: (a) $d = 1$; (b) $d = 10$; (c) $d = 20$; (d) $d = 27$.

Figure 2. The behavior of the satellite coordinates $p, q, r$ (red) and their estimates (blue) for different mini-batch sizes: (a) $d = 1$; (b) $d = 10$; (c) $d = 20$; (d) $d = 27$.

Figure 3. The behavior of the satellite coordinates $p, q, r$ (red) and their estimates (blue) for different mini-batch sizes: (a) $d = 1$; (b) $d = 10$; (c) $d = 20$; (d) $d = 27$.

Figure 4. The behavior of the satellite coordinates $p, q, r$ (red) and their estimates (blue) for different models of the measuring system: (a) $z(t) = (r(t), q(t))^{T}$; (b) $z(t) = (p(t), q(t))^{T}$; (c) $z(t) = (r(t), p(t))^{T}$.

Figure 5. The behavior of the satellite coordinates $p, q, r$ for different mini-batch sizes: (a) $d = 1$; (b) $d = 10$; (c) $d = 20$; (d) $d = 27$.

Table 1. Solution results for different mini-batch sizes.

$d$	$J [K_{d}]$
1	2970.9487
10	315.6251
20	256.9266
27	167.1622

Table 2. Solution results for different mini-batch sizes.

$d$	$J [K_{d}]$
1	2722.1031
10	455.5253
20	311.6676
27	236.1635

Table 3. Solution results for different mini-batch sizes.

$d$	$J [K_{d}]$
1	2934.4927
10	860.9751
20	405.3041
27	351.7748

Table 4. Solution results for different models of the measuring system.

$d$	$z (t)$	$J [K_{d}]$
27	${(r (t), q (t))}^{T}$	351.7748
27	${(p (t), q (t))}^{T}$	236.1635
27	${(r (t), p (t))}^{T}$	167.1622

Table 5. Solution results for different mini-batch sizes.

$d$	$J [K_{d}]$
1	1129.1925
10	217.4981
20	196.9630
27	98.8108

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Panteleev, A.V.; Lobanov, A.V. Application of Mini-Batch Metaheuristic Algorithms in Problems of Optimization of Deterministic Systems with Incomplete Information about the State Vector. Algorithms 2021, 14, 332. https://doi.org/10.3390/a14110332

