Article

The Meaning and Accuracy of the Improving Functions in the Solution of the CBQR by Krotov’s Method

The Department of Civil Engineering, Ariel University, Ariel 40700, Israel
Mathematics 2024, 12(4), 611; https://doi.org/10.3390/math12040611
Submission received: 14 December 2023 / Revised: 8 February 2024 / Accepted: 14 February 2024 / Published: 19 February 2024
(This article belongs to the Special Issue Optimisation Algorithms and Their Applications)

Abstract

A new solution to the continuous-time bilinear quadratic regulator optimal control problem (CBQR) was recently developed using Krotov’s Method. This paper provides two theoretical results related to the properties of that solution. The first discusses the equivalent representation of the cost-to-go performance index. The second one breaks down this equivalence into smaller identities referencing the components of the performance index. The paper shows how these results can be used to verify the numerical accuracy of the computed solution. Additionally, the meaning of the improving function and the equivalent representation, which are the main elements in the discussed CBQR’s solution, are explained according to the derived notions. A numerical example of structural control application exemplifies the significance of these results and how they can be applied to a specific CBQR problem.

1. Introduction

The challenge of “choosing the best path” is a prevalent problem that has a significant presence in applied science and technology [1]. Not surprisingly, its solutions find their way into a wide range of applications, including automatic control of systems [2]. For the latter, the problem is addressed in the framework of optimal control theory. The fundamental objective in this field is to maximize the return from, or minimize the cost of, the operation of physical [3,4], biological [5,6], social [7,8,9], economic processes [10,11], etc.
Many studies have been conducted over the years in optimal control theory, yielding diverse results for many types of problems, and the research is still ongoing. An important tool for solving various optimal control problems is Krotov’s method [2], sometimes referred to by its original name: “a global method of successive improvements of control”. The method, which stems from the fundamental extension principle [2], is a successive algorithm aimed at the computerized solution of optimal control problems. It is a well-known instrument for constructing optimal control for quantum systems [12,13]. Additionally, its efficiency was demonstrated for a class of structural control problems [14,15,16], iron and steel manufacturing processes [17], and biological systems [18].
However, it should be noted that even though Krotov’s method furnishes a rigorous methodology for solving optimal control problems, its formulation is general and requires additional effort when addressing a specific issue. In order to apply the method to a given optimal control problem, one should solve another problem. That is, one should reshape the given performance index to a form that points out a clear way to obtain a better, improved process and the feedback generating it [2]. The reshaped form of the performance index is called an equivalent representation, and the key object for obtaining it is an object called an improving function [2].
One of the recent studies that utilized Krotov’s method deals with the solution of the continuous-time bilinear quadratic regulator problem (CBQR) [19]. Here, the aim is to support that solution by furnishing some interesting theoretical comments on the equivalent representation related to this problem. These results can be used to verify the numerical accuracy of the computed improving function and the solution obtained by it. Additionally, in the same context, they shed light on the meaning of the elements of the equivalent representation and the improving function. Section 2 provides some background needed for understanding the main results. The latter are presented in Section 3. Finally, a numerical example is given to illustrate the significance of the results and how they can be used to verify the numerical accuracy of an improving function obtained by the method suggested in [19].

2. Methods

In order to put things in context and facilitate the main derivations, it is beneficial to review several notions and theories.
$x : \mathbb{R} \to \mathbb{R}^n$ denotes a state trajectory, and $u : \mathbb{R} \to \mathbb{R}^{n_u}$ denotes a control trajectory. Here, $x(t)$ is the value of $x$ at $t$; that is, $x(t)$ is a specific vector in $\mathbb{R}^n$ that expresses the system’s state at $t$, whereas $x$ refers to the entire trajectory. $U_i$ is the set of control signals admissible to the $i$-th control device.
The CBQR is an optimal control problem that consists of a bilinear system [20] and a quadratic performance index of the form:
$\dot{x}(t) = A(t)x(t) + B(t)u(t) + \{uN(t)\}x(t) + g(t); \quad x(0),\ u \in U,\ t \in (0, t_f)$  (1)

$J(x,u) = \frac{1}{2}\int_0^{t_f} \left[ x(t)^T Q(t) x(t) + u(t)^T R(t) u(t) \right] dt + \frac{1}{2} x(t_f)^T H x(t_f)$  (2)

Here, $\{uN(t)\} \triangleq \sum_{i=1}^{n_u} u_i(t) N_i(t)$; $A, N_i : \mathbb{R} \to \mathbb{R}^{n \times n}$ and $B : \mathbb{R} \to \mathbb{R}^{n \times n_u}$. $g : \mathbb{R} \to \mathbb{R}^n$ is a trajectory of external excitations; $Q : \mathbb{R} \to \mathbb{R}^{n \times n}$ such that $Q(t) \geq 0$; $R : \mathbb{R} \to \mathbb{R}^{n_u \times n_u}$ such that $R(t) \geq 0$, and $H \geq 0$. A pair $(x, u)$ that satisfies Equation (1) is called an admissible process. The set $X \subseteq [\mathbb{R} \to \mathbb{R}^n]$ comprises the state trajectories reachable from $U$ and the specified $x(0)$. The solution to this problem is required in the form of a state feedback that synthesizes an optimal admissible process, i.e., one that minimizes $J$.
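As a purely illustrative sketch of the objects just defined, the snippet below forward-simulates a small bilinear system of the form of Equation (1) with RK4 and evaluates the quadratic cost of Equation (2) by trapezoidal quadrature. All matrices and the control signal are placeholders of my own choosing, not the paper’s example.

```python
import numpy as np

# Illustrative sketch (not the paper's example): simulate a bilinear system
#   x' = A x + B u + (sum_i u_i N_i) x + g
# with a fixed control signal and evaluate the quadratic performance index.

def simulate_bilinear(A, B, N_list, g, u_of_t, x0, t_grid):
    """RK4 integration of the bilinear state equation on a uniform grid."""
    m = len(t_grid)
    x = np.zeros((m, x0.size))
    x[0] = x0
    h = t_grid[1] - t_grid[0]
    def f(tk, xi):
        u = u_of_t(tk)
        uN = sum(u[i] * N_list[i] for i in range(len(N_list)))
        return A @ xi + B @ u + uN @ xi + g(tk)
    for k in range(m - 1):
        tk = t_grid[k]
        k1 = f(tk, x[k])
        k2 = f(tk + h/2, x[k] + h/2 * k1)
        k3 = f(tk + h/2, x[k] + h/2 * k2)
        k4 = f(tk + h, x[k] + h * k3)
        x[k+1] = x[k] + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return x

def cost_J(x, u_vals, Q, R, H, t_grid):
    """J = 1/2 int (x'Qx + u'Ru) dt + 1/2 x(tf)'H x(tf), trapezoid rule."""
    y = np.einsum('ki,ij,kj->k', x, Q, x) + np.einsum('ki,ij,kj->k', u_vals, R, u_vals)
    integral = 0.5 * np.sum((y[1:] + y[:-1]) * np.diff(t_grid))
    return 0.5 * integral + 0.5 * x[-1] @ H @ x[-1]
```

For a stable two-state system with one bilinear channel, the routine returns the state history and a finite, positive cost.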
A successive method was utilized in order to solve this problem [19], that is, to obtain an admissible process that minimizes $J$. The method, named Krotov’s method after its founder Prof. V. F. Krotov [2], furnishes the solution as a sequence called an improving sequence. Krotov defined this to be a sequence of admissible processes, $\{(x_k, u_k)\}$, such that $J(x_k, u_k) \geq J(x_{k+1}, u_{k+1})$; i.e., the performance of each element in the improving sequence is better than or equal to that of its predecessor. In addition to this monotonic improvement, the method has the advantages of not being limited to small variations in $u$ and of obtaining the solution in the form of feedback [2]. A succinct introduction to this method is given below.
Consider a class of optimal control problems defined by a state equation, set of admissible control trajectories, and performance index:
$\dot{x}(t) = f(t, x(t), u(t)); \quad x(0),\ t \in (0, t_f),\ u \in U$  (3)

$J(x,u) = l_f(x(t_f)) + \int_0^{t_f} l(t, x(t), u(t))\, dt$  (4)

Here, $f : \mathbb{R}\times\mathbb{R}^n\times\mathbb{R}^{n_u} \to \mathbb{R}^n$, $l : \mathbb{R}\times\mathbb{R}^n\times\mathbb{R}^{n_u} \to \mathbb{R}$, and $l_f : \mathbb{R}^n \to \mathbb{R}$. The goal is to find an admissible $(x, u)$ that minimizes $J$.
One of the key concepts in Krotov’s theory is the equivalent representation [2]. It was shown that the optimal control problem can be reformulated by transforming the performance index $J$ into an equivalent one, $J_{eq}$. The rationale behind this transformation lies in the potential for a thoughtfully selected $J_{eq}$ to simplify the solution process. The equivalent representation plays a crucial role in various results put forth by Krotov, notably in Krotov’s method [2]. The subsequent theorem introduces the relevant equivalent representation for our context. $\xi$ and $\nu$ denote vectors in $\mathbb{R}^n$ and $\mathbb{R}^{n_u}$, respectively.
Theorem 1
([2]). Let $q : \mathbb{R}\times\mathbb{R}^n \to \mathbb{R}$ be a piecewise-smooth function, upon which the next functions and performance index are constructed:

$s(t, \xi, \nu) \triangleq q_t(t, \xi) + q_x(t, \xi) f(t, \xi, \nu) + l(t, \xi, \nu)$  (5)

$s_f(\xi) \triangleq l_f(\xi) - q(t_f, \xi)$  (6)

$J_{eq}(x,u) \triangleq s_f(x(t_f)) + q(0, x(0)) + \int_0^{t_f} s(t, x(t), u(t))\, dt$  (7)

If $(x, u)$ is an admissible process, then $J(x,u) = J_{eq}(x,u)$.
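Theorem 1 can be checked numerically on a toy problem. The sketch below (my own example, not the paper’s) takes $\dot{x} = -x$, $x(0) = 1$, $t_f = 1$, $l = x^2$, $l_f = 0$, and the family $q(t,\xi) = c\,\xi^2$: whatever constant $c$ is chosen, $J_{eq}$ evaluates to the same number as $J$.

```python
import numpy as np

# Check of Theorem 1 on a toy problem (illustrative, not from the paper):
# x' = -x, x(0) = 1, l = x^2, l_f = 0, with q(t, xi) = c * xi^2.

def trapz(y, t):
    """Trapezoidal quadrature of samples y over the grid t."""
    return 0.5 * np.sum((y[1:] + y[:-1]) * np.diff(t))

def J_direct(t, x):
    return trapz(x**2, t)                 # l = x^2, l_f = 0

def J_equivalent(t, x, c):
    s = (1.0 - 2.0 * c) * x**2            # s = q_t + q_x * f + l, f = -xi
    s_f = -c * x[-1]**2                   # s_f = l_f - q(t_f, .)
    return s_f + c * x[0]**2 + trapz(s, t)

t = np.linspace(0.0, 1.0, 20001)
x = np.exp(-t)                            # exact trajectory of x' = -x
gaps = [abs(J_direct(t, x) - J_equivalent(t, x, c))
        for c in (-3.0, 0.0, 0.7, 10.0)]
```

Analytically, both sides equal $(1 - e^{-2})/2$; numerically the gaps stay at quadrature-error level for every $c$, illustrating that the equivalence holds for any admissible process regardless of the chosen $q$.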
Hence, the challenge is to find a $q$ that forms a beneficial $J_{eq}$. In the following theorem, Krotov points out a way of finding such a $q$. The theorem states the properties of $q$ and an improving feedback, $\hat{u}$, which allow for the improvement of a given admissible process.
Theorem 2
([2]). Let a given $(x_1, u_1)$ be admissible, and let $q : \mathbb{R}\times\mathbb{R}^n \to \mathbb{R}$. If the following statements hold:
1. 
q grants s and s f the property:
$s(t, x_1(t), u_1(t)) = \max_{\xi \in X(t)} s(t, \xi, u_1(t)) \;\; \forall t \in (0, t_f); \qquad s_f(x_1(t_f)) = \max_{\xi \in X(t_f)} s_f(\xi)$  (8)
2. 
u ^ is a feedback
$\hat{u}(t, \xi) = \arg\min_{\nu \in U(t)} s(t, \xi, \nu)$  (9)
for all $t \in [0, t_f]$, $\xi \in X(t)$.
3. 
x 2 is a state trajectory that solves:
$\dot{x}_2(t) = f(t, x_2(t), \hat{u}(t, x_2(t))); \quad x_2(0) = x(0)$  (10)
at any $t \in (0, t_f)$, and $u_2$ is a control trajectory such that $u_2(t) = \hat{u}(t, x_2(t))$,
then $(x_2, u_2)$ is an improved process.
A q that meets the requirements listed in the above theorem is called an improving function. By repeating the process improvement over and over, an improving sequence is obtained.
However, while the process of successive improvement proves highly beneficial, this alone does not guarantee optimality. Additional considerations are necessary to ensure that the obtained solution is indeed optimal. For instance, even assuming convergence of the improving sequence, the optimality of its limit process remains uncertain. In the case of an optimum, we anticipate that it will satisfy conditions of optimality, such as Pontryagin’s minimum principle. Krotov has also addressed this question.
Assume that at some step $k$ of the algorithm, the gradients $q_{k,x}$ and $q_{k-1,x}$ are equal for all $t \in (0, t_f)$, and $q_k$ grants the gradient of $s_k$ the following property

$s_{k,x}(t, x_k(t), u_k(t)) = 0; \quad t \in (0, t_f)$
at the process $(x_k, u_k)$. Krotov showed that, in this case, $(x_k, u_k)$ satisfies Pontryagin’s minimum principle with the costate $q_{k,x}(t, x_k(t))$ [2]. In this context, Krotov’s method provides a convenient instrument for finding a solution that satisfies Pontryagin’s minimum principle, rather than solving it directly. Furthermore, while solving problems directly via Pontryagin’s minimum principle typically yields an open-loop control trajectory, Krotov’s method offers a solution in a feedback form [21].
The major difficulty faced by a control designer intending to apply Krotov’s method is formulating an improving function suitable for the addressed optimal control problem. For the CBQR problem, given an admissible process, $(x, u)$, an improving function can be formulated as $q(t, \xi) = \frac{1}{2}\xi^T P(t) \xi + \xi^T p(t)$ [19]. Here, $P : \mathbb{R} \to \mathbb{R}^{n \times n}$ is the solution of the following differential Lyapunov equation [22]:
$\dot{P}(t) = -P(t)\big(A(t) + \{uN(t)\}\big) - \big(A(t) + \{uN(t)\}\big)^T P(t) - Q(t)$  (11)

subject to $P(t_f) = H$, and $p : \mathbb{R} \to \mathbb{R}^n$ is the solution of:

$\dot{p}(t) = -\big(A(t) + \{uN(t)\}\big)^T p(t) - P(t)\big(B(t)u(t) + g(t)\big)$  (12)

subject to $p(t_f) = 0$. In these equations, $u$ is the control trajectory specified by the given admissible process.
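A minimal numerical sketch of this backward integration is given below. It is my own illustration, not the code of [19]: the control trajectory is frozen, so $A + \{uN\}$ is treated as a constant matrix $A_c$, and Equations (11) and (12) are swept backward from $t_f$ with RK4.

```python
import numpy as np

# Sketch (illustrative): solve Eqs. (11)-(12) backward in time with RK4:
#   P' = -P Ac - Ac^T P - Q,        P(tf) = H
#   p' = -Ac^T p - P (B u + g),     p(tf) = 0
# Ac stands for A + {uN}, held constant here for simplicity.

def solve_P_p(Ac, Q, H, forcing, t):
    """Backward RK4 on the uniform ascending grid t.
    forcing[k] approximates B u(t_k) + g(t_k)."""
    n, m = Ac.shape[0], len(t)
    h = t[1] - t[0]
    P = np.zeros((m, n, n)); p = np.zeros((m, n))
    P[-1] = H
    fP = lambda Pk: -Pk @ Ac - Ac.T @ Pk - Q
    for k in range(m - 1, 0, -1):            # backward sweep for P
        k1 = fP(P[k]); k2 = fP(P[k] - h/2*k1)
        k3 = fP(P[k] - h/2*k2); k4 = fP(P[k] - h*k3)
        P[k-1] = P[k] - h/6*(k1 + 2*k2 + 2*k3 + k4)
    fp = lambda pk, Pk, wk: -Ac.T @ pk - Pk @ wk
    for k in range(m - 1, 0, -1):            # backward sweep for p
        Pm = 0.5*(P[k] + P[k-1])             # P at midpoint, interpolated
        wm = 0.5*(forcing[k] + forcing[k-1])
        k1 = fp(p[k], P[k], forcing[k])
        k2 = fp(p[k] - h/2*k1, Pm, wm)
        k3 = fp(p[k] - h/2*k2, Pm, wm)
        k4 = fp(p[k] - h*k3, P[k-1], forcing[k-1])
        p[k-1] = p[k] - h/6*(k1 + 2*k2 + 2*k3 + k4)
    return P, p
```

Quick sanity checks on the output are cheap: $P(t_f) = H$ must hold exactly, $P(t)$ must stay symmetric (the right-hand side of Equation (11) maps symmetric matrices to symmetric matrices), and with zero forcing $p$ must vanish identically.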
As mentioned above, the improving function $q$ is associated with some $J_{eq}$, whose importance lies in the fact that it underpins the process improvement. This relation between the improving function and the equivalent representation is discussed below in Section 3 in the context of the CBQR problem.

3. Results

The equivalent representation of the performance index refers to the entire time domain $[0, t_f]$. However, in some cases, there is interest in the performance of the system over a sub-interval $[t_1, t_f] \subseteq [0, t_f]$, also known as the cost-to-go. Following Theorem 1, one can easily obtain the equivalent representation of the cost-to-go. The next corollary provides an equivalent representation of the cost-to-go of a given admissible process, $(x, u)$.
Corollary 1.
Let $\bar{J} : X \times U \times \mathbb{R} \to \mathbb{R}$ and $\bar{J}_{eq} : X \times U \times \mathbb{R} \to \mathbb{R}$ be the functionals:

$\bar{J}(x, u, t_1) \triangleq \int_{t_1}^{t_f} l(t, x(t), u(t))\, dt + l_f(x(t_f))$  (13)

$\bar{J}_{eq}(x, u, t_1) \triangleq q(t_1, x(t_1)) + \int_{t_1}^{t_f} s(t, x(t), u(t))\, dt + s_f(x(t_f))$  (14)

where $t_1 \in [0, t_f]$ and $(x, u)$ is admissible. Then, $\bar{J}(x, u, t_1) = \bar{J}_{eq}(x, u, t_1)$ for any $0 \leq t_1 \leq t_f$.
The proof is similar to Krotov’s proof of Theorem 1, differing only by setting the time domain to $[t_1, t_f]$ rather than $[0, t_f]$. It is given below for the reader’s convenience.
Proof. 
Let the hypothesis hold. By substituting Equations (5) and (6) into $\bar{J}_{eq} - \bar{J}$, we obtain:

$\bar{J}_{eq}(x, u, t_1) - \bar{J}(x, u, t_1) = q(t_1, x(t_1)) + \big(l_f(x(t_f)) - q(t_f, x(t_f))\big) - l_f(x(t_f)) + \int_{t_1}^{t_f} \big[ q_t(t, x(t)) + q_x(t, x(t)) f(t, x(t), u(t)) \big]\, dt$

As $(x, u)$ is an admissible process, $\dot{x}(t) = f(t, x(t), u(t))$ holds, leading to:

$\bar{J}_{eq}(x, u, t_1) - \bar{J}(x, u, t_1) = q(t_1, x(t_1)) - q(t_f, x(t_f)) + \int_{t_1}^{t_f} \big[ q_t(t, x(t)) + q_x(t, x(t)) \dot{x}(t) \big]\, dt = q(t_1, x(t_1)) - q(t_f, x(t_f)) + \int_{t_1}^{t_f} dq(t, x(t)) = 0$

by virtue of the Newton–Leibniz formula. □
Remark 1.
 
$\bar{J}_{eq}$ can be used for evaluating performance over sub-trajectories by:

$\int_{t_1}^{t_2} l(t, x(t), u(t))\, dt = \bar{J}(x, u, t_1) - \bar{J}(x, u, t_2)$  (15)

$= \bar{J}_{eq}(x, u, t_1) - \bar{J}_{eq}(x, u, t_2)$  (16)

$= q(t_1, x(t_1)) - q(t_2, x(t_2)) + \int_{t_1}^{t_2} s(t, x(t), u(t))\, dt$  (17)
Assume that $(x, u)$ is an admissible process. By substituting $q$ into $s$ and then into $\bar{J}_{eq}$ (Equation (14)), we obtain

$\bar{J}_{eq}(x, u, t_1) = \frac{1}{2} x(t_1)^T P(t_1) x(t_1) + p(t_1)^T x(t_1) + \int_{t_1}^{t_f} \Big[ p(t)^T \big( B(t) u(t) + g(t) \big) + \frac{1}{2} u(t)^T R(t) u(t) \Big]\, dt$  (18)
Corollary 1 establishes the overall equivalence of $J$ and $J_{eq}$ in a general sense, without delving into their components. The subsequent theorem addresses how this equivalence is reflected through the components of $J$ and $J_{eq}$ for a linear system with an external excitation and no control input, namely:

$f(t, x(t), u(t)) = f(t, x(t)) \triangleq A(t) x(t) + g(t); \quad x(0); \qquad l(t, x(t), u(t)) = l(t, x(t)) \triangleq \frac{1}{2} x(t)^T Q(t) x(t); \qquad l_f(x(t_f)) = \frac{1}{2} x(t_f)^T H x(t_f)$
In brief, the following theorem highlights specific properties of P and p , which are the solutions to Equations (11) and (12), respectively. These properties allow us to deconstruct the overall equivalence, as described in Corollary 1, into smaller identities. The theorem also provides a tool for verifying the accuracy of P and p and offers a deeper understanding of their role in J e q .
It is worth noting that, despite the differences between the CBQR and the aforementioned system, the result remains relevant to the CBQR case, as elucidated later in Section 4.
Theorem 3.
Let $g : \mathbb{R} \to \mathbb{R}^n$; $A, Q : \mathbb{R} \to \mathbb{R}^{n \times n}$, where $Q(t) \geq 0$. If $x_h, x_p, p : \mathbb{R} \to \mathbb{R}^n$ and $P : \mathbb{R} \to \mathbb{R}^{n \times n}$ satisfy the linear ODEs

$\dot{x}_h(t) = A(t) x_h(t); \quad x_h(t_1) = x(t_1)$  (19)

$\dot{x}_p(t) = A(t) x_p(t) + g(t); \quad x_p(t_1) = 0$  (20)

$\dot{P}(t) = -P(t) A(t) - A(t)^T P(t) - Q(t); \quad P(t_f) = H$  (21)

$\dot{p}(t) = -A(t)^T p(t) - P(t) g(t); \quad p(t_f) = 0$  (22)
for t ( t 1 , t f ) , then:
(a) 
$\frac{1}{2} x(t_1)^T P(t_1) x(t_1) = \frac{1}{2} \int_{t_1}^{t_f} x_h(t)^T Q(t) x_h(t)\, dt + \frac{1}{2} x_h(t_f)^T H x_h(t_f)$
(b) 
$\int_{t_1}^{t_f} p(t)^T g(t)\, dt = \frac{1}{2} \int_{t_1}^{t_f} x_p(t)^T Q(t) x_p(t)\, dt + \frac{1}{2} x_p(t_f)^T H x_p(t_f)$
(c) 
$x(t_1)^T p(t_1) = \int_{t_1}^{t_f} x_h(t)^T Q(t) x_p(t)\, dt + x_h(t_f)^T H x_p(t_f)$
Proof. 
Consider Equations (1) and (2) over $[t_1, t_f]$, and let $u \equiv 0$. Consequently, Equation (1) becomes:

$\dot{x}(t) = A(t) x(t) + g(t); \quad t \in (t_1, t_f),\ x(t_1)$

The solution $x$ consists of homogeneous and particular solutions satisfying Equations (19) and (20), respectively. In addition, Equations (11) and (12) become Equations (21) and (22), respectively.
As $(x, 0)$ is admissible, Corollary 1 yields:

$\frac{1}{2} \int_{t_1}^{t_f} x(t)^T Q(t) x(t)\, dt + \frac{1}{2} x(t_f)^T H x(t_f) = \int_{t_1}^{t_f} p(t)^T g(t)\, dt + \frac{1}{2} x(t_1)^T P(t_1) x(t_1) + x(t_1)^T p(t_1)$

After substituting $x = x_h + x_p$:

$\frac{1}{2} \int_{t_1}^{t_f} \big( x_h(t) + x_p(t) \big)^T Q(t) \big( x_h(t) + x_p(t) \big)\, dt + \frac{1}{2} \big( x_h(t_f) + x_p(t_f) \big)^T H \big( x_h(t_f) + x_p(t_f) \big) = \int_{t_1}^{t_f} p(t)^T g(t)\, dt + \frac{1}{2} x(t_1)^T P(t_1) x(t_1) + x(t_1)^T p(t_1)$  (23)
(a)
Let $g \equiv 0$. Hence, $x_p = 0$, $p = 0$, and

$\frac{1}{2} \int_{t_1}^{t_f} x_h(t)^T Q(t) x_h(t)\, dt + \frac{1}{2} x_h(t_f)^T H x_h(t_f) = \frac{1}{2} x(t_1)^T P(t_1) x(t_1)$
(b)
Let $x(t_1) = 0$. Hence, $x_h = 0$ and

$\frac{1}{2} \int_{t_1}^{t_f} x_p(t)^T Q(t) x_p(t)\, dt + \frac{1}{2} x_p(t_f)^T H x_p(t_f) = \int_{t_1}^{t_f} p(t)^T g(t)\, dt$
(c)
By (a), (b), and cancelling terms from Equation (23):

$\int_{t_1}^{t_f} x_h(t)^T Q(t) x_p(t)\, dt + x_h(t_f)^T H x_p(t_f) = x(t_1)^T p(t_1)$ □
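The three identities can be illustrated numerically on a scalar example of my own (not the paper’s data): $A = a = -0.5$, $g = 1$, $Q = 2$, $H = 1$, $t_1 = 0$, $t_f = 2$, $x(t_1) = 1.5$. Here $x_h$ and $x_p$ are known in closed form, while $P$ and $p$ are integrated backward with RK4 on a fine grid.

```python
import numpy as np

# Numerical illustration of identities (a)-(c) on a scalar toy system.

def trapz(y, t):
    return 0.5 * np.sum((y[1:] + y[:-1]) * np.diff(t))

a, g0, Qc, Hc, x1 = -0.5, 1.0, 2.0, 1.0, 1.5
t = np.linspace(0.0, 2.0, 4001)
h = t[1] - t[0]

x_h = x1 * np.exp(a * t)                   # Eq. (19), closed form
x_p = (g0 / a) * (np.exp(a * t) - 1.0)     # Eq. (20), closed form

P = np.zeros_like(t); p = np.zeros_like(t)
P[-1] = Hc                                 # P' = -2aP - Q, P(tf) = H
fP = lambda Pk: -2.0 * a * Pk - Qc
for k in range(len(t) - 1, 0, -1):
    k1 = fP(P[k]); k2 = fP(P[k] - h/2*k1)
    k3 = fP(P[k] - h/2*k2); k4 = fP(P[k] - h*k3)
    P[k-1] = P[k] - h/6*(k1 + 2*k2 + 2*k3 + k4)
fp = lambda pk, Pk: -a * pk - Pk * g0      # p' = -a p - P g, p(tf) = 0
for k in range(len(t) - 1, 0, -1):
    Pm = 0.5 * (P[k] + P[k-1])             # P at midpoint, interpolated
    k1 = fp(p[k], P[k]); k2 = fp(p[k] - h/2*k1, Pm)
    k3 = fp(p[k] - h/2*k2, Pm); k4 = fp(p[k] - h*k3, P[k-1])
    p[k-1] = p[k] - h/6*(k1 + 2*k2 + 2*k3 + k4)

# gaps between the two sides of (a), (b), and (c)
gap_a = abs(0.5*P[0]*x1**2 - (0.5*trapz(Qc*x_h**2, t) + 0.5*Hc*x_h[-1]**2))
gap_b = abs(trapz(p*g0, t) - (0.5*trapz(Qc*x_p**2, t) + 0.5*Hc*x_p[-1]**2))
gap_c = abs(x1*p[0] - (trapz(Qc*x_h*x_p, t) + Hc*x_h[-1]*x_p[-1]))
```

With an accurate integration all three gaps sit at discretization-error level; coarsening the grid inflates them, which previews the verification use made of these identities in the next section.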

4. Discussion

The fact described in Corollary 1 proves useful in certain circumstances. As mentioned earlier, the method proposed for solving the CBQR problem [19] relies on process improvement. However, even though the process is rigorously derived, there is a practical obstacle to its application. The solutions of Equations (11) and (12) are usually obtained through numerical integration, and numerical computation errors are always an issue when differential equations are solved numerically [23,24]. Numerical observations show that a successful improvement of a given process, $(x_k, u_k)$, is quite sensitive to the accuracy of $q$. This accuracy is highly affected by that of $P$ and $p$, which solve Equations (11) and (12). Furthermore, differential Lyapunov equations tend to be quite stiff [25,26], making the numerical accuracy issue particularly acute. This is especially true in the case of Equation (11) and when $u$ is rapidly varying or discontinuous, a common situation in many practical optimal control problems [2,16,27,28,29]. Therefore, dealing with numerical errors becomes a routine part of the improvement process. When numerical errors do exist in $P$ and $p$, the structure of $q$ becomes flawed and inaccurate, and the improvement might fail. Verifying $q$’s numerical accuracy at different $t$’s can be helpful in such a case, as it may pinpoint the root of the problem and, consequently, its solution.
First, note that $P(t_f)$ and $p(t_f)$ are specified boundary values. Hence, they are known exactly, and so is $q(t_f, \xi)$; $q$’s accuracy at $t_f$ is therefore not a concern. For other $t$’s, $\bar{J}_{eq}$ can be used. Since $\bar{J}$ does not depend on $q$, one can verify $q$’s accuracy at a certain $t_1 \in [0, t_f)$ by examining the difference $|\bar{J}_{eq}(x_k, u_k, t_1) - \bar{J}(x_k, u_k, t_1)|$. Clearly, higher accuracy is reflected in a smaller difference. This tool can be utilized to identify the sections of $u$ that contribute to the numerical issues, indicating where additional computational effort should be focused.
Theorem 3 can be instrumental for a more specific verification of the accuracy of $P$ and $p$. Since the right-hand sides of identities (a)–(c) in Theorem 3 depend solely on $x$, they can be computed and compared to the left-hand sides of those equations. This allows us to verify the accuracy of the numerically computed $P$ and $p$ at any given $t$, as follows.
Throughout the improving steps, the control trajectory undergoes repeated alterations. However, at each step, the previously computed process, say ( x , u ) , serves as a fixed starting point. Consequently, at a given step, we treat an admissible control trajectory, u , as a specified input. We can now write the state equation for this step as:
$\dot{x}(t) = \hat{A}(t) x(t) + \hat{g}(t); \quad x(0),\ t \in (0, t_f)$  (24)

where $\hat{A}(t) \triangleq A(t) + \{uN(t)\}$ and $\hat{g}(t) \triangleq B(t) u(t) + g(t)$. Clearly, for the given process, Equation (1) is equivalent to Equation (24). As the system is linear, the state trajectory can be decomposed into homogeneous and particular solutions. This also applies to each sub-interval $(t_1, t_f) \subseteq (0, t_f)$. Therefore, $x = x_h + x_p$ over $(t_1, t_f)$, where the homogeneous solution $x_h$ solves
$\dot{x}_h(t) = \hat{A}(t) x_h(t); \quad t \in (t_1, t_f),\ x_h(t_1) = x(t_1)$  (25)
and the particular one, $x_p$, solves

$\dot{x}_p(t) = \hat{A}(t) x_p(t) + \hat{g}(t); \quad t \in (t_1, t_f),\ x_p(t_1) = 0$  (26)
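This homogeneous/particular decomposition is easy to confirm numerically. The sketch below (with an illustrative two-state system of my own, not the paper’s) integrates $x$, $x_h$, and $x_p$ with the same RK4 routine and checks that $x = x_h + x_p$ holds to machine precision, since each RK4 step is an affine map for a linear system.

```python
import numpy as np

# Sanity check of the superposition x = x_h + x_p for a linear system
# with input (illustrative data, not from the paper).

def rk4(f, x0, t):
    h = t[1] - t[0]
    out = np.zeros((len(t), len(x0))); out[0] = x0
    for k in range(len(t) - 1):
        k1 = f(t[k], out[k])
        k2 = f(t[k] + h/2, out[k] + h/2*k1)
        k3 = f(t[k] + h/2, out[k] + h/2*k2)
        k4 = f(t[k] + h, out[k] + h*k3)
        out[k+1] = out[k] + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return out

A_hat = np.array([[0., 1.], [-3., -0.8]])
g_hat = lambda tk: np.array([0., np.sin(2*tk)])
t = np.linspace(0., 4., 2001)
x0 = np.array([1., -0.5])

x_full = rk4(lambda tk, xi: A_hat @ xi + g_hat(tk), x0, t)
x_h = rk4(lambda tk, xi: A_hat @ xi, x0, t)                       # cf. Eq. (25)
x_p = rk4(lambda tk, xi: A_hat @ xi + g_hat(tk), np.zeros(2), t)  # cf. Eq. (26)

gap = np.max(np.abs(x_full - (x_h + x_p)))
```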
On the one hand, substituting this decomposition into the CBQR’s $l$ and $l_f$, and then into Equation (13), yields:

$\bar{J}(x_h + x_p, u, t_1) = \frac{1}{2} \int_{t_1}^{t_f} x_h(t)^T Q(t) x_h(t)\, dt + \frac{1}{2} x_h(t_f)^T H x_h(t_f) + \frac{1}{2} \int_{t_1}^{t_f} x_p(t)^T Q(t) x_p(t)\, dt + \frac{1}{2} x_p(t_f)^T H x_p(t_f) + \int_{t_1}^{t_f} x_h(t)^T Q(t) x_p(t)\, dt + x_h(t_f)^T H x_p(t_f) + \frac{1}{2} \int_{t_1}^{t_f} u(t)^T R(t) u(t)\, dt$  (27)
On the other hand, according to Equation (18), an admissible process ( x , u ) can be evaluated over the sub-interval ( t 1 , t f ) by:
$\bar{J}(x_h + x_p, u, t_1) = \bar{J}_{eq}(x, u, t_1) = \frac{1}{2} x(t_1)^T P(t_1) x(t_1) + p(t_1)^T x(t_1) + \int_{t_1}^{t_f} p(t)^T \hat{g}(t)\, dt + \frac{1}{2} \int_{t_1}^{t_f} u(t)^T R(t) u(t)\, dt$
where $P$ and $p$ are computed from Equations (11) and (12) for the given $u$ and $g$. Obviously, in this case, Equations (21) and (22), solved for $\hat{A}$ and $\hat{g}$, are equivalent to Equations (11) and (12). Theorem 3 reveals that each state-dependent term on the right-hand side of Equation (27) is equal to a corresponding component of $\bar{J}_{eq}$. Let

$I_h(\tau) \triangleq \frac{1}{2} \int_{\tau}^{t_f} x_h(t)^T Q(t) x_h(t)\, dt + \frac{1}{2} x_h(t_f)^T H x_h(t_f)$

$I_p(\tau) \triangleq \frac{1}{2} \int_{\tau}^{t_f} x_p(t)^T Q(t) x_p(t)\, dt + \frac{1}{2} x_p(t_f)^T H x_p(t_f)$

$I_{hp}(\tau) \triangleq \int_{\tau}^{t_f} x_h(t)^T Q(t) x_p(t)\, dt + x_h(t_f)^T H x_p(t_f)$

$I_{eq,p}(\tau) \triangleq \int_{\tau}^{t_f} p(t)^T \hat{g}(t)\, dt$  (28)
The above trajectories can be used to verify the accuracy of $P$ and $p$; i.e., according to Theorem 3, good accuracy should be reflected by a strong agreement within each of the pairs $(I_h, \tfrac{1}{2} x^T P x)$, $(I_{hp}, x^T p)$, and $(I_p, I_{eq,p})$.
Finally, Theorem 3 offers a more profound insight into the meaning of the components in J ¯ e q . The identities that are introduced in the theorem reveal that:
  • The term $\frac{1}{2} x(t_1)^T P(t_1) x(t_1)$ evaluates the performance of the homogeneous solution.
  • $\int_{t_1}^{t_f} p(t)^T \hat{g}(t)\, dt$ evaluates the performance of the particular solution.
  • $p(t_1)^T x(t_1)$ evaluates the cross-performance of the homogeneous and particular solutions.
Additionally,
$q(t_1, x(t_1)) = \frac{1}{2} x(t_1)^T P(t_1) x(t_1) + p(t_1)^T x(t_1) = I_h(t_1) + I_{hp}(t_1)$
This implies that the improving function, $q$, is the sum of the homogeneous solution’s performance and the cross-performance of the homogeneous and particular solutions.

5. Numerical Example

A CBQR problem that was previously introduced in [19] is used here in order to exemplify an application of the above results. Unlike the original paper, where it was used to demonstrate a practical application of the CBQR problem and its solution, here the focus is placed on the solution’s numerical accuracy. It illustrates how the above results can be used to verify the numerical accuracy of an improving function that was obtained by the method suggested in [19]. The problem concerns the structural control of a two-story building subjected to external excitation and configured with two semi-active variable stiffness (SAVS) devices [30,31]. Devices of this type consist of a frame that connects adjacent floors through a hydraulic element. This element has two operation modes controlled by a servo-valve having two states: open and closed. When the valve is open, it allows almost free flow of the hydraulic fluid, and the device’s resistance to the floors’ relative motion is minimal. In a closed state, the valve prevents the fluid’s flow, locks the device, and turns it into a lateral bracing element. Springs and variable dashpots were used for modeling these SAVS devices. Each variable dashpot is capable of providing either finite or infinite damping, representing the device’s unlocked or locked state, respectively. Figure 1 depicts the dynamic scheme used for modeling the controlled structural system.
The system dynamics amount to the following bilinear state equation:
$\dot{x}(t) = A x(t) + u_1(t) N_1 x(t) + u_2(t) N_2 x(t) + g(t)$

$A = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ -2000 & 1000 & -2.01 & 1 & -1 \cdot 10^{-5} & 1 \cdot 10^{-5} \\ 500 & -500 & 0.5 & -0.51 & 0 & -5 \cdot 10^{-6} \\ 0 & 0 & 2.5 \cdot 10^{7} & 0 & 0 & 0 \\ 0 & 0 & -2.5 \cdot 10^{7} & 2.5 \cdot 10^{7} & 0 & 0 \end{bmatrix}$

$N_1 = \mathrm{diag}(0, 0, 0, 0, 0, -5000); \quad N_2 = \mathrm{diag}(0, 0, 0, 0, -5000, -5000)$

$g(t) = \begin{bmatrix} 0 & 0 & -3 & -3 & 0 & 0 \end{bmatrix}^T \sin(16.55\, t)$
where the state vector $x(t) = (z_1(t), z_2(t), \dot{z}_1(t), \dot{z}_2(t), w_1(t), w_2(t))$ describes the horizontal displacements and velocities in the dynamic degrees of freedom and the forces applied through the SAVS devices to the floors. $u \triangleq (u_1, u_2)$ is the control input trajectory, representing the SAVS devices’ locking patterns. Here, the control policy is restricted to one of three unlocking patterns [30]: (1) both devices are unlocked, (2) only the second device is unlocked, or (3) both devices are locked. These settings lead to an admissible set of control inputs:
$u(t) \in U = \left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right\}$
The elements of this set correspond to unlocking patterns 1, 2, and 3, respectively. Note that the finiteness of $U$ prevents variational methods from being applied to $u$. A horizontal ground acceleration signal of $\ddot{z}_g(t) = 3 \sin(16.55\, t)$ was used to simulate a seismic excitation.
The performance evaluation accounts for inter-story drifts and control forces. It is:
$J(x, u) = \frac{1}{2} \int_0^5 \Big[ x_1(t)^2 \cdot 10^5 + \big( x_2(t) - x_1(t) \big)^2 \cdot 10^5 + 5 \big( x_5(t)^2 + x_6(t)^2 \big) \Big]\, dt + \frac{1}{2} \Big[ x_1(5)^2 \cdot 10^4 + \big( x_2(5) - x_1(5) \big)^2 \cdot 10^4 + 50 \big( x_5(5)^2 + x_6(5)^2 \big) \Big]$
$Q$ and $H$ were constructed accordingly. It follows that $R = 0$, as $u$ has no weight in $J$. Additional details of the system are available in the original paper [19].
Here, that problem is revisited and discussed in the context of Corollary 1 and Theorem 3. That is, in this section, the results are exemplified by demonstrating their application in diagnosing numerical issues emerging during the solution of the above CBQR problem.
Two characteristics of this problem suggest that numerical issues are likely to arise in solving it, especially in solving the differential equations related to the process-improvement stage and the feedback synthesis. First, due to the operating principle of the SAVS devices, the control signals are binary. Second, the control signals alternate rapidly. Such issues indeed came up during the CBQR iterations, as described below.
Following [19], Krotov’s method was applied to the CBQR problem in MATLAB. A numerical integration algorithm based on the fourth-order Runge–Kutta method was utilized to solve the necessary differential equations. However, here, computations were carried out twice. In the first case, the integration step was set to 0.01 s, whereas in the second, it was set to 0.001 s. Notably, even though the fourth-order Runge–Kutta method is not recommended for stiff equations [25], it is sufficient for the point discussed in this paper, as the discussion revolves around validating the numerical solution’s accuracy rather than its actual accurate computation.
Fifteen iterations were executed, each consisting of a single CBQR improvement step. As explained above, these iterations generate a sequence of processes, where each is expected to be better than the previous in terms of $J$. Table 1 provides $J$’s values, evaluated for each computed process in each case. In this table, $i$ stands for the iteration number, starting with $i = 0$ for the initial process, $i = 1$ for the process obtained after one improvement, $i = 2$ for that obtained after two improvements, and so forth. The table also reports the change in $J$ between consecutive iterations, denoted $\Delta J_i \triangleq J_i - J_{i-1}$.
It can be seen that $J$’s values in case 2 differ from those of case 1. This implies that the processes obtained in each case differ too. Additionally, for an improving sequence, $\Delta J$ is expected to be non-positive. However, starting at the fifth iteration of case 1, $\Delta J$ takes some positive values, which means that a deterioration was obtained rather than an improvement. Although this issue also exists in case 2, it is milder: only one such incident was recorded there, after iteration 13. The explanation for this non-monotonic behavior is the numerical errors involved in the computation of $P$ and $p$. Obviously, case 2, which benefits from higher accuracy due to its smaller integration step, is more reliable than case 1.
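The monotonicity test just described is trivial to automate. A small helper (a sketch of mine, not taken from [19]) that flags deteriorating iterations might look as follows:

```python
# Flag iterations of an improving sequence where Delta J_i = J_i - J_{i-1} > 0,
# i.e. where the sequence deteriorates instead of improving -- the symptom
# used above to detect numerical inaccuracy in P and p.

def deteriorating_iterations(J_values, tol=0.0):
    """Return the iteration indices i with J_i - J_{i-1} > tol."""
    return [i for i in range(1, len(J_values))
            if J_values[i] - J_values[i - 1] > tol]
```

For example, `deteriorating_iterations([10, 8, 7, 7.5, 6])` returns `[3]`, flagging the single step at which the cost rose; a small positive `tol` can absorb harmless rounding noise.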
Hence, inspection of the monotonicity of the sequence $\{J_i\}$ can serve as a simple measure of the accuracy of the obtained process. Nevertheless, a deeper inspection can be performed through a comparison of the original cost-to-go, $\bar{J}$, with its equivalent counterpart defined in Corollary 1, $\bar{J}_{eq}$. Such a comparison is presented in Figure 2. In this figure, case 1’s costs are $\bar{J}^{c1}$, $\bar{J}_{eq}^{c1}$, and case 2’s are $\bar{J}^{c2}$, $\bar{J}_{eq}^{c2}$. Corollary 1 states that, theoretically, $\bar{J}$ and $\bar{J}_{eq}$ should be identical for any $0 \leq t_1 \leq t_f$. However, this property is not observed in case 1: $\bar{J}_{eq}^{c1}$ substantially deviates from $\bar{J}^{c1}$. In contrast, in case 2, there is relatively good agreement between $\bar{J}^{c2}$ and $\bar{J}_{eq}^{c2}$. Additionally, Figure 2 points out another interesting fact: $\bar{J}^{c1}$ and $\bar{J}^{c2}$ are not equal. Although they have similar initial and terminal costs, they differ over the majority of time instances. This is another sign of the difference between the solutions obtained in cases 1 and 2. However, the good agreement of $\bar{J}^{c2}$ and $\bar{J}_{eq}^{c2}$ implies that case 2 is the accurate one.
Can we better identify the reason for the poor performance of case 1? Based on Theorem 3, the answer is yes. Consider the identities defined in Theorem 3 and the terms defined in Equation (28). A better identification can be obtained by examining the equivalence of the terms within each of the pairs $(I_h, \tfrac{1}{2} x^T P x)$, $(I_{hp}, x^T p)$, and $(I_p, I_{eq,p})$.
Figure 3 inspects this equivalence in each case. As before, a ‘c1’ superscript indicates case 1, and ‘c2’ indicates case 2. The correspondence within each pair is conspicuous in case 2. However, even though case 1 shows reasonable similarity in the first two pairs, in the third pair $I_p^{c1}$ is dissimilar to $I_{eq,p}^{c1}$ for $t \in [0, 3.2]$. It follows that, in the given problem, this element is the reason for the difference between $\bar{J}^{c1}$ and $\bar{J}_{eq}^{c1}$. Because the pair $(I_p, I_{eq,p})$ involves $\hat{g}$, which is absent from the other pairs, the numerical obstacle in the given problem is most likely related to the components of $\hat{g}$, i.e., either the control input $u$, the external excitation $\ddot{z}_g$, or both.

6. Conclusions

This paper furnishes novel theoretical results related to a recently published solution of the CBQR problem. Specifically, the results refer to a key element in that solution: the improving function. First, Corollary 1 defines an equivalent cost-to-go performance index. Notably, its formulation is general and not limited to the CBQR problem. The equivalence over sub-intervals is an immediate consequence of this corollary and is discussed as well. Next, Theorem 3 continues the idea presented in Corollary 1 but concentrates on the CBQR case. The theorem breaks the equivalence down into smaller identities relating to the components of the CBQR improving function.
These results allow one to verify the accuracy of the obtained numerical solution, both for the overall accuracy of the computed feedback and for each of its components. This is an important tool, as numerical issues frequently arise in the solution of ODEs, especially when discontinuous control signals and excitations are involved. Additionally, these theoretical results shed light on the meaning of the equivalent representation and the improving function, and on the way in which they are related to the given problem.
A numerical example from structural control illustrates the above identities and shows how they can be used to verify the accuracy of the computed solution and to pinpoint causes of inaccuracy, if any exist.
Based on the above results, several future research directions can be considered; three are outlined here. First, the results were demonstrated to be applicable to validating the numerical accuracy of improving functions; however, work remains on systematically incorporating such validation into the iteration process and on determining what steps should be taken when errors are encountered. Second, can the identities of Theorem 3 be extended to more complex systems? Third, the derived identities provide novel interpretations of the elements of the improving function related to the CBQR problem, and it would be beneficial to examine their use in reducing the computation time and effort required to solve that problem.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CBQR	Continuous-Time Bilinear Quadratic Regulator
ODE	Ordinary Differential Equation

References

  1. Gerdts, M. Optimal Control of ODEs and DAEs; De Gruyter Oldenbourg: Berlin, Germany; Boston, MA, USA, 2024. [Google Scholar] [CrossRef]
  2. Krotov, V.F. Global Methods in Optimal Control Theory; Chapman & Hall/CRC Pure and Applied Mathematics; Taylor & Francis: Boca Raton, FL, USA, 1995. [Google Scholar]
  3. Wu, K.; Ren, C.; Chen, Y. An optimal control method for time-delay feedback control of 1/4 vehicle active suspension under random excitation. J. Low Freq. Noise Vib. Act. Control 2022, 41, 732–747. [Google Scholar] [CrossRef]
  4. Safiullah, S.; Rahman, A.; Ahmad Lone, S. Optimal control of electrical vehicle incorporated hybrid power system with second order fractional-active disturbance rejection controller. Optim. Control Appl. Methods 2021, 44, 905–934. [Google Scholar] [CrossRef]
  5. Lunz, D.; Bonnans, J.F.; Ruess, J. Optimal control of bioproduction in the presence of population heterogeneity. J. Math. Biol. 2023, 86, 43. [Google Scholar] [CrossRef]
  6. Lamwong, J.; Wongvanich, N.; Tang, I.M.; Pongsumpun, P. Optimal Control Strategy of a Mathematical Model for the Fifth Wave of COVID-19 Outbreak (Omicron) in Thailand. Mathematics 2024, 12, 14. [Google Scholar] [CrossRef]
  7. Ibrahim, O.; Okuonghae, D.; Ikhile, M. Optimal control model for criminal gang population in a limited-resource setting. Int. J. Dyn. Control 2023, 11, 835–850. [Google Scholar] [CrossRef] [PubMed]
  8. Sahoo, B.; Das, R. Crime population modelling: Impacts of financial support. Int. J. Dyn. Control 2023, 11, 504–519. [Google Scholar] [CrossRef]
  9. Jain, A.; Dhar, J.; Gupta, V.K. Optimal control of rumor spreading model on homogeneous social network with consideration of influence delay of thinkers. Differ. Equ. Dyn. Syst. 2023, 31, 113–134. [Google Scholar] [CrossRef]
  10. Kirk, D.E. Optimal Control Theory: An Introduction; Dover Publications: Mineola, NY, USA, 2004; p. 479. [Google Scholar]
  11. Cadenillas, A.; Huamán-Aguilar, R. The Optimal Control of Government Stabilization Funds. Mathematics 2020, 8, 1975. [Google Scholar] [CrossRef]
  12. Morzhin, O.; Pechen, A. Krotov method for optimal control of closed quantum systems. Russ. Math. Surv. 2019, 74, 851–908. [Google Scholar] [CrossRef]
  13. Fernandes, M.E.F.; Fanchini, F.F.; de Lima, E.F.; Castelano, L.K. Effectiveness of the Krotov method in finding controls for open quantum systems. J. Phys. A Math. Theor. 2023, 56, 495303. [Google Scholar] [CrossRef]
  14. Halperin, I.; Agranovich, G.; Ribakov, Y. Using Constrained Bilinear Quadratic Regulator for the Optimal Semi-Active Control Problem. J. Dyn. Syst. Meas. Control 2017, 139, 111011. [Google Scholar] [CrossRef]
  15. Halperin, I.; Agranovich, G.; Ribakov, Y. Multi-input control design for a constrained bilinear biquadratic regulator with external excitation. Optim. Control Appl. Methods 2019, 40, 1045–1053. [Google Scholar] [CrossRef]
  16. Halperin, I.; Agranovich, G.; Ribakov, Y. Design of Optimal Feedback for Structural Control; CRC Press: Boca Raton, FL, USA, 2021. [Google Scholar]
  17. Liu, Y.; Zhou, P. Optimal tracking control for blast furnace molten iron quality based on subspace identification and Krotov’s method. Optim. Control Appl. Methods 2023, 44, 2532–2550. [Google Scholar] [CrossRef]
  18. Halperin, I.; Agranovich, G.; Lew, B. The Discrete Bilinear Biquadratic Regulator. IEEE Trans. Autom. Control 2021, 66, 5006–5012. [Google Scholar] [CrossRef]
  19. Halperin, I. Solution of the Continuous Time Bilinear Quadratic Regulator Problem by Krotov’s Method. IEEE Trans. Autom. Control 2023, 68, 2415–2421. [Google Scholar] [CrossRef]
  20. Bruni, C.; DiPillo, G.; Koch, G. Bilinear systems: An appealing class of “nearly linear” systems in theory and applications. IEEE Trans. Autom. Control 1974, 19, 334–348. [Google Scholar] [CrossRef]
  21. Krotov, V.F. A technique of global bounds in optimal control theory. Control Cybern. 1988, 17, 115–144. [Google Scholar]
  22. Gajic, Z.; Qureshi, M.T.J. The Lyapunov Matrix Equation in System Stability and Control; Academic Press, Inc.: Cambridge, MA, USA, 1995. [Google Scholar]
  23. Gimeno, J.; Jorba, À.; Jorba-Cuscó, M.; Miguel, N.; Zou, M. Numerical integration of high-order variational equations of ODEs. Appl. Math. Comput. 2023, 442, 127743. [Google Scholar] [CrossRef]
  24. Ji, Y.; Xing, Y. Highly Accurate and Efficient Time Integration Methods with Unconditional Stability and Flexible Numerical Dissipation. Mathematics 2023, 11, 593. [Google Scholar] [CrossRef]
  25. Shampine, L.F.; Gear, C.W. A User’s View of Solving Stiff Ordinary Differential Equations. SIAM Rev. 1979, 21, 1–17. [Google Scholar] [CrossRef]
  26. Choi, C.H. Solving stiff Lyapunov differential equations. In Proceedings of the 2000 American Control Conference, ACC (IEEE Cat. No. 00CH36334), IEEE, Chicago, IL, USA, 28–30 June 2000; Volume 5, pp. 3370–3372. [Google Scholar]
  27. Adhyaru, D.M.; Kar, I.N.; Gopal, M. Constrained Optimal Control of Bilinear Systems Using Neural Network Based HJB Solution. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–8 June 2008; pp. 4137–4142. [Google Scholar] [CrossRef]
  28. Hassan, M.; Boukas, E. Constrained linear quadratic regulator: Continuous-time case. Nonlinear Dyn. Syst. Theory 2008, 8, 35–42. [Google Scholar]
  29. Liu, C.; Loxton, R.; Teo, K.L. A Computational Method for Solving Time-Delay Optimal Control Problems with Free Terminal Time. Numer. Algebra Control Optim. 2014, 72, 53–60. [Google Scholar] [CrossRef]
  30. Kobori, T.; Takahashi, M.; Nasu, T.; Niwa, N.; Ogasawara, K. Seismic response controlled structure with Active Variable Stiffness system. Earthq. Eng. Struct. Dyn. 1993, 22, 925–941. [Google Scholar] [CrossRef]
  31. Zoccolini, L.; Bruschi, E.; Cattaneo, S.; Quaglini, V. Current Trends in Fluid Viscous Dampers with Semi-Active and Adaptive Behavior. Appl. Sci. 2023, 13, 10358. [Google Scholar] [CrossRef]
Figure 1. Dynamic scheme for a two-floor structure, equipped with two SAVS devices [19].
Figure 2. Cost-to-go trajectories in case 1 ( J ¯ c 1 , J ¯ e q c 1 ) and case 2 ( J ¯ c 2 , J ¯ e q c 2 ).
Figure 3. Trajectories of the cost-to-go elements for cases 1 and 2: (a) I h against x P x , (b) I h p against x T p , (c) I p against I e q , p . Case 1 is indicated by a c 1 superscript and case 2 is indicated by a c 2 superscript.
Table 1. Performance index values at each process. Here, Δ J i = J i − J i−1.

  i     Case 1                         Case 2
        J              Δ J            J              Δ J
  0     4.59 × 10^14   -              4.59 × 10^14   -
  1     4.34 × 10^13   4.16 × 10^14   1.54 × 10^14   3.05 × 10^14
  2     5.84 × 10^12   3.76 × 10^13   1.44 × 10^13   1.39 × 10^14
  3     4.88 × 10^12   9.61 × 10^11   5.54 × 10^12   8.9 × 10^12
  4     4.78 × 10^12   9.94 × 10^10   4.66 × 10^12   8.81 × 10^11
  5     4.82 × 10^12   3.51 × 10^10   4.65 × 10^12   4.5 × 10^9
  6     4.59 × 10^12   2.23 × 10^11   4.6 × 10^12    4.9 × 10^10
  7     4.57 × 10^12   2.59 × 10^10   4.53 × 10^12   6.9 × 10^10
  8     4.45 × 10^12   1.23 × 10^11   4.51 × 10^12   2.87 × 10^10
  9     4.38 × 10^12   6.33 × 10^10   4.49 × 10^12   1.38 × 10^10
 10     4.43 × 10^12   5.05 × 10^10   4.49 × 10^12   5.34 × 10^9
 11     4.36 × 10^12   7.09 × 10^10   4.48 × 10^12   8.91 × 10^9
 12     4.42 × 10^12   5.52 × 10^10   4.47 × 10^12   8.72 × 10^9
 13     4.38 × 10^12   3.75 × 10^10   4.49 × 10^12   1.92 × 10^10
 14     4.5 × 10^12    1.23 × 10^11   4.47 × 10^12   1.41 × 10^10
 15     4.41 × 10^12   9.66 × 10^10   4.47 × 10^12   7.55 × 10^9

Share and Cite

MDPI and ACS Style

Halperin, I. The Meaning and Accuracy of the Improving Functions in the Solution of the CBQR by Krotov’s Method. Mathematics 2024, 12, 611. https://doi.org/10.3390/math12040611
