Article

A Class of Distributed Online Aggregative Optimization in Unknown Dynamic Environment

1 School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
2 College of Computer Engineering, Jimei University, Xiamen 361021, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(16), 2460; https://doi.org/10.3390/math12162460
Submission received: 9 July 2024 / Revised: 7 August 2024 / Accepted: 7 August 2024 / Published: 8 August 2024
(This article belongs to the Topic Distributed Optimization for Control)

Abstract

This paper considers a class of distributed online aggregative optimization problems over an undirected and connected network. Unlike existing problem formulations, it accounts for an unknown dynamic environment together with aggregative terms, which makes the aggregative optimization problem more challenging. A distributed online optimization algorithm is designed for the considered problem via the mirror descent algorithm and the distributed average tracking method. In particular, the dynamic environment and the gradient are estimated by average tracking methods, and the online update is then built on a dynamic mirror descent step. It is shown that the dynamic regret is bounded in the order of $\mathcal{O}(\sqrt{T})$. Finally, the effectiveness of the designed algorithm is verified by simulations of the cooperative control of a multi-robot system.

1. Introduction

The distributed online optimization problem for multi-agent systems has received considerable attention in the past few decades [1,2]. The objective is to minimize a global time-varying cost function, written as the sum of local convex functions, under the constraint that each agent only has knowledge of its own local convex function.
Recently, some researchers have focused on distributed online aggregative optimization, a special class of online optimization problems in which the local cost functions include aggregative terms motivated by many real applications. These aggregative terms make the design and analysis of online optimization algorithms more challenging. Several tracking methods have been presented to solve distributed online aggregative optimization problems. For example, a distributed aggregative gradient tracking algorithm is proposed and analyzed in [3], where convergence to the optimal variable is shown to be linear. Different from [3], a set constraint is considered in [4], where an online distributed gradient tracking algorithm is proposed for distributed online aggregative optimization with exact gradient information and with stochastic/noisy gradients, respectively. An upper bound on the dynamic regret is derived, and simulations on a target-surrounding problem verify the effectiveness of the designed algorithm. The authors of [5] considered distributed online aggregative optimization without assuming boundedness of the gradients and the feasible sets. In particular, a projected aggregative tracking algorithm was presented in [5], and the authors showed that the dynamic regret is bounded by a constant term plus a term related to time variations.
Dynamic environments are common in real applications of online optimization [6,7,8,9,10,11]. For example, the authors of [6] showed that many online learning problems, including dynamic texture analysis, solar flare detection, sequential compressed sensing of a dynamic scene, traffic surveillance, tracking of self-exciting point processes, and network behavior in the Enron email corpus, can benefit from the incorporation of a dynamic environment. In [7], the tracking problem of a time-varying parameter with unknown dynamics was studied as an online optimization in a dynamic environment. The distributed tracking problem and the tracking of dynamic point processes over networks were solved in [8,9], respectively, using mirror descent methods with dynamic environments. The sensor network localization problem was solved in [10] by a distributed online bandit learning algorithm over a multi-agent network with a dynamic environment. Note that the dynamic environments were assumed to be known in [6,7,8,9,10]. In contrast with [6,7,8,9,10], an unknown dynamic environment was considered in [11], where the dynamic regret was shown to be bounded.
Based on the above discussions, distributed online aggregative optimization with a dynamic environment is motivated by many real applications, especially when the dynamic environment is unknown. Therefore, this paper considers a class of distributed online aggregative optimization problems with an unknown dynamic environment. The main contributions are as follows. Aggregative terms and an unknown dynamic environment are simultaneously considered for an online optimization problem. The problem formulation comes from real applications, e.g., the cooperative control problem for a multi-robot system in the simulation section. In comparison, the dynamic environment was not considered in [3,4,5], and the aggregative terms were not considered in [6,7,8,9,10,11]. Therefore, the design and analysis of the optimization algorithm in this paper are more challenging than in [3,4,5,6,7,8,9,10,11]. In particular, average tracking methods estimate the dynamic environment and the gradient. Based on these estimates, an online optimization algorithm is designed via a dynamic mirror descent method. It is shown that the dynamic regret is bounded in the order of $\mathcal{O}(\sqrt{T})$, and simulations are provided to verify the effectiveness of the designed algorithm.
The remaining parts of this paper are organized as follows. In Section 2 and Section 3, some preliminaries and problem formulation are presented. The main results are proposed in Section 4. In Section 5, the performance of the designed algorithm is verified by some simulations. Section 6 concludes the paper.

2. Preliminaries

2.1. Notations

$\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space. $\|x\|$ denotes the 2-norm of a vector $x$. $[T]$ with a positive integer $T$ denotes the set $\{1, 2, \ldots, T\}$. $\langle x, y \rangle$ denotes the inner product of vectors $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$, i.e., $\langle x, y \rangle = x^T y$, where $x^T$ denotes the transpose of $x$. For a function $f(x,t): \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$, $\nabla f(x,t)$ denotes the gradient of $f(x,t)$ with respect to the vector $x$. $\sigma_2(W)$ denotes the second-largest singular value of a matrix $W$.

2.2. Graph Theory

For a multi-agent system, we use a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ to describe the information exchange within it, where $\mathcal{V} = \{1, 2, \ldots, N\}$ is the node set and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the edge set. $(j, i) \in \mathcal{E}$ represents that agent $i$ can obtain information from agent $j$. Self-loops are not considered, i.e., $(i, i) \notin \mathcal{E}$. $\mathcal{N}_i = \{j : (j, i) \in \mathcal{E}\}$ denotes the in-neighbor set of node $i$. If $(i, j) \in \mathcal{E}$ for every $(j, i) \in \mathcal{E}$, then the graph is undirected. A path between nodes $i$ and $j$ is a sequence of edges $(i, v_1), (v_1, v_2), \ldots, (v_k, j)$ in the graph $\mathcal{G}$, where $v_l$, $l = 1, 2, \ldots, k$, are distinct nodes. An undirected graph is connected if there is a path between each pair of nodes.
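Connectivity of an undirected graph, as defined above, can be verified with a breadth-first search. A minimal sketch (the dictionary-of-neighbor-sets representation and the example graphs are illustrative):

```python
from collections import deque

def is_connected(neighbors):
    """Check whether an undirected graph, given as {node: set_of_neighbors}, is connected."""
    nodes = list(neighbors)
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    # Connected iff the BFS from one node reaches every node.
    return len(seen) == len(nodes)

# A 4-node ring is undirected and connected; an isolated node breaks connectivity.
ring = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
```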

2.3. Bregman Divergence

The mirror descent algorithm based on the Bregman divergence is frequently used and effective in online optimization [8]. The Bregman divergence is defined as follows. Let $R: \mathbb{R}^n \to \mathbb{R}$ be a strongly convex function satisfying

$R(x) \geq R(y) + \langle x - y, \nabla R(y) \rangle + \frac{1}{2} \|x - y\|^2, \quad \forall x, y \in \mathbb{R}^n, \qquad (1)$

and define the Bregman divergence $D_R(x, y)$ as

$D_R(x, y) = R(x) - R(y) - \langle x - y, \nabla R(y) \rangle. \qquad (2)$
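As a concrete instance, taking $R(x) = \frac{1}{2}\|x\|^2$ (which satisfies the strong-convexity condition above with equality) gives $\nabla R(y) = y$ and $D_R(x, y) = \frac{1}{2}\|x - y\|^2$, in which case the mirror descent step reduces to an ordinary projected gradient step. A quick numeric check of the definition:

```python
import numpy as np

def bregman(R, grad_R, x, y):
    """Bregman divergence D_R(x, y) = R(x) - R(y) - <x - y, grad R(y)>."""
    return R(x) - R(y) - np.dot(x - y, grad_R(y))

# For R(x) = 0.5 * ||x||^2 the divergence reduces to 0.5 * ||x - y||^2.
R = lambda x: 0.5 * np.dot(x, x)
grad_R = lambda x: x
x, y = np.array([1.0, 2.0]), np.array([0.0, 1.0])
d = bregman(R, grad_R, x, y)  # equals 0.5 * ||x - y||^2 = 1.0
```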

3. Problem Formulation

Consider a network system composed of $N$ agents, over which there exists a sequence of time-varying convex functions $f_t(x): \mathbb{R}^n \to \mathbb{R}$ composed of local functions $f_{i,t}(x, v(x)): \mathbb{R}^n \times \mathbb{R}^d \to \mathbb{R}$. Consider the following optimization problem:

$\min_{x \in X} f_t(x) = \sum_{i=1}^{N} f_{i,t}(x, v(x)), \qquad (3)$

where $i \in [N]$, $t \in [T]$, $x \in X \subseteq \mathbb{R}^n$, $\psi_i: \mathbb{R}^n \to \mathbb{R}^d$, and $v: \mathbb{R}^n \to \mathbb{R}^d$ is the aggregative variable defined by $v(x) = \frac{1}{N}\sum_{i=1}^{N} \psi_i(x)$. The function $f_{i,t}(x, v(x))$ is convex and assigned to agent $i$. Suppose there exists a sequence of unknown stable non-expansive mappings $A_t \in \mathbb{R}^{n \times n}$, i.e., $\|A_t\| \leq 1$, such that

$x_{t+1}^{\ast} = A_t x_t^{\ast} + \vartheta_t, \qquad (4)$

where $x_t^{\ast} = \arg\min_{x \in X} f_t(x)$ and $\vartheta_t$ is an unstructured and unknown noise. Assume that each agent $i$ independently observes the mapping $A_t$, with $A_{i,t}$ denoting the value observed at time $t$. Moreover, assume that $A_t^{\ast}$ is the optimal observed value for all agents, i.e., $A_t^{\ast}$ is the optimal solution to the following optimization problem:

$A_t^{\ast} = \arg\min_{\tilde{A}_t} \sum_{i=1}^{N} \|\tilde{A}_t - A_{i,t}\|^2. \qquad (5)$

It follows from (4) and (5) that

$x_{t+1}^{\ast} = A_t^{\ast} x_t^{\ast} + \theta_t, \qquad (6)$

where $\theta_t = (A_t - A_t^{\ast}) x_t^{\ast} + \vartheta_t$.
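Since the objective in (5) is separable and quadratic, its minimizer is simply the entrywise average of the local observations, which is a useful sanity check when implementing the observation model. A minimal numeric illustration (the observation matrices are made up):

```python
import numpy as np

# Two agents' observations A_{i,t} of the unknown mapping (illustrative values).
A1 = np.array([[1.0, 0.1], [0.0, 1.0]])
A2 = np.array([[1.0, -0.1], [0.0, 1.0]])

def objective(A, obs):
    """The objective of (5): sum_i ||A - A_i||_F^2."""
    return sum(np.sum((A - Ai) ** 2) for Ai in obs)

# The minimizer is the entrywise mean of the observations.
A_star = (A1 + A2) / 2
```

Perturbing `A_star` in any direction can only increase the objective, confirming it is the minimizer.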
Distributed Online Aggregative Optimization Problem:
The objective is to generate a sequence $x_{i,t}$, $i \in [N]$, $t \in [T]$, to minimize the following dynamic regret:

$\mathrm{Reg}(T) := \sum_{t=1}^{T} \sum_{i=1}^{N} f_{i,t}(x_{i,t}) - \sum_{t=1}^{T} \sum_{i=1}^{N} f_{i,t}(x_t^{\ast}), \qquad (7)$

where $x_t^{\ast}$ is the optimal solution to (3) subject to constraint (6), i.e., $x_t^{\ast} = \arg\min_{x \in X} f_t(x)$ while $x_t^{\ast}$ satisfies constraint (6).
Remark 1.
Online aggregative optimization problems have been studied in [4,5]. However, this paper additionally considers an unknown dynamic environment, i.e., the unknown mapping $A_t$, which makes the problem more challenging than those in [4,5].
Let $\nabla \psi_i(x_i)$, $\nabla_1 f_{i,t}(x_i, v)$ and $\nabla_2 f_{i,t}(x_i, v)$ denote $\nabla_{x_i} \psi_i(x_i)$, $\nabla_{x_i} f_{i,t}(x_i, v)$ and $\nabla_v f_{i,t}(x_i, v)$, respectively. Let $W \in \mathbb{R}^{N \times N}$ be the adjacency matrix of the network $\mathcal{G}$, and let $\eta_t \in \mathbb{R}$, $t \in [T]$, be the global step size used to design the online optimization algorithm in the next section. Let $\bar{X} \subseteq \mathbb{R}^n$ and $\bar{Y} \subseteq \mathbb{R}^d$ be some convex sets defined in the next section. Some necessary assumptions are as follows.
Assumption 1.
Graph $\mathcal{G}$ is undirected and connected, and $W$ is doubly stochastic, i.e.,

$\sum_{i=1}^{N} W_{ij} = \sum_{j=1}^{N} W_{ij} = 1,$

and there exists a constant $\alpha \in (0, 1)$ such that $W_{ij} \geq \alpha$ when $W_{ij} \neq 0$, and $W_{ii} \geq \alpha$ for all $i \in [N]$.
Assumption 2.
For $x, m \in \bar{X}$, $y, n \in \bar{Y}$, $i \in [N]$ and $t \in [T]$, the functions $f_{i,t}(x)$, $\nabla_1 f_{i,t}(x, y)$, $\nabla_2 f_{i,t}(x, y)$, $\psi_i(x)$, and $\nabla \psi_i(x)$ are Lipschitz continuous, and the functions $\nabla \psi_i(x)$, $\nabla_1 f_{i,t}(x, y)$ and $\nabla_2 f_{i,t}(x, y)$ are bounded, i.e.,

$\iota \|x - m\| \leq \|f_{i,t}(x) - f_{i,t}(m)\| \leq L \|x - m\|,$
$\|\nabla_1 f_{i,t}(x, y) - \nabla_1 f_{i,t}(m, n)\| \leq L (\|x - m\| + \|y - n\|),$
$\|\nabla_2 f_{i,t}(x, y) - \nabla_2 f_{i,t}(m, n)\| \leq L (\|x - m\| + \|y - n\|),$
$\|\psi_i(x) - \psi_i(m)\| \leq L \|x - m\|,$
$\|\nabla \psi_i(x) - \nabla \psi_i(m)\| \leq L \|x - m\|,$
$\|\nabla \psi_i(x)\| \leq H, \quad \|\nabla_1 f_{i,t}(x, y)\| \leq H, \quad \|\nabla_2 f_{i,t}(x, y)\| \leq H,$

where $L$ is the Lipschitz constant, $\iota > 0$ and $H > 0$.
Assumption 3.
There hold

$|D_R(x, z) - D_R(y, z)| \leq K \|x - y\|,$
$D_R\Big(x, \sum_{i=1}^{N} a(i) y_i\Big) \leq \sum_{i=1}^{N} a(i) D_R(x, y_i),$
$|R(x) - R(y)| \leq L_R \|x - y\|,$

where $x, y, y_i, z \in \bar{X}$, $a(i)$ lies on the $N$-dimensional simplex, and $K$ and $L_R$ are some Lipschitz constants.
Assumption 4.
There exist positive constants $M_A$ and $M$ such that $\|A_t - I_n\| \leq M_A \eta_t$ and $\sum_{i=1}^{N} \|A_{i,t} - A_{i,t-1}\| \leq M \eta_t$, where $\eta_t$ is the global step sequence. Moreover, $D_R(A_t x, A_t y) \leq D_R(x, y)$ holds for $x, y \in \bar{X}$ and $t \in [T]$.
Remark 2.
Assumptions 1, 3, and 4 come from [6,7,8,11]. They are common in mirror descent algorithms for distributed online optimization. Assumption 2 is also common in online aggregative optimization [4].

4. Main Result

Inspired by [11], consider the following online algorithm:

$\hat{A}_{i,t} = \sum_{j=1}^{N} W_{ij} \hat{A}_{j,t-1} + A_{i,t} - A_{i,t-1}, \qquad (8a)$
$\hat{x}_{i,t+1} = \arg\min_{x \in X} \{ \eta_t \langle \ell_{i,t}, x \rangle + D_R(x, y_{i,t}) \}, \qquad (8b)$
$x_{i,t+1} = \sum_{j=1}^{N} W_{ij} \hat{A}_{j,t} \hat{x}_{i,t+1}, \qquad (8c)$
$y_{i,t+1} = \sum_{j=1}^{N} W_{ij} x_{j,t+1}, \qquad (8d)$
$v_{i,t+1} = \sum_{j=1}^{N} W_{ij} v_{j,t} + \psi_i(x_{i,t+1}) - \psi_i(x_{i,t}), \qquad (8e)$
$z_{i,t+1} = \sum_{j=1}^{N} W_{ij} z_{j,t} + \nabla \psi_i(x_{i,t+1}) - \nabla \psi_i(x_{i,t}), \qquad (8f)$

where $i \in [N]$, $t \in [T]$, $\ell_{i,t} = \nabla_1 f_{i,t}(x_{i,t}, v_{i,t}) + \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) z_{i,t}$, $\eta_t$ is the global step sequence, $\hat{x}_{i,t+1}$, $y_{i,t+1}$, $\hat{A}_{i,t+1}$, $v_{i,t+1}$ and $z_{i,t+1}$ are auxiliary variables, and $x_{i,t+1}$ is the designed sequence. In (8), the initial states are chosen such that $\hat{A}_{i,0} = A_{i,0} = 0_{n \times n}$, $y_{i,0} = x_{i,0} = \hat{x}_{i,0} = 0_n$, $v_{i,0} = \psi_i(x_{i,0})$ and $z_{i,0} = \nabla \psi_i(x_{i,0})$.
It follows from (8) that the states in (8) are bounded. Let the sets $\bar{X}$ and $\bar{Y}$ satisfy $X \subseteq \bar{X}$, $x_{i,t} \in \bar{X}$, $y_{i,t} \in \bar{X}$ and $v_{i,t} \in \bar{Y}$. Let $P = \max_i \{\|v_{i,1}\|, \|z_{i,1}\|\}$, $Q = \sup_{x \in X} \|x\|$, $R^2 = \sup_{x, y \in X} D_R(x, y)$, and

$\tilde{A}_{i,t} = \sum_{j=1}^{N} W_{ij} \hat{A}_{j,t} - A_t, \quad \bar{z}_t = \frac{1}{N} \sum_{i=1}^{N} z_{i,t}, \quad \bar{x}_{t+1} = \frac{1}{N} \sum_{i=1}^{N} x_{i,t+1},$
$\nabla v(x) = \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x), \quad \tilde{\ell}_{i,t} = \nabla_1 f_{i,t}(x_{i,t}, v(x_{i,t})) + \nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) \nabla v(x_{i,t}).$
Remark 3.
We have provided a comparative analysis of problem formulations for online optimization in Table 1, where DAT denotes distributed average tracking. It follows that the problem considered in this paper is more challenging than those in [3,4,5,6,7,8,9,10,11]. Updates (8a)–(8d) come from [11], where they were used to solve a class of distributed online optimization problems with an unknown mapping. However, the aggregative variable $v(x)$ is not considered in [11], which makes the design and analysis of the optimization algorithm here more challenging. In particular, it follows from (3) that, for agent $i$, the gradient of the function $f_{i,t}(x, v(x))$ with respect to $x$ is $\nabla_1 f_{i,t}(x_{i,t}, v(x_{i,t})) + \nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) \nabla v(x_{i,t})$. But $v(x_{i,t})$ and $\nabla v(x_{i,t})$ are unknown to agent $i$ due to the topology constraint. Therefore, updates (8e) and (8f) are added, using the distributed average tracking method to estimate $v(x_{i,t})$ and $\nabla v(x_{i,t})$. Based on (8e) and (8f), the new gradient $\ell_{i,t} = \nabla_1 f_{i,t}(x_{i,t}, v_{i,t}) + \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) z_{i,t}$ is introduced in update (8b). Hence, updates (8e) and (8f) achieve aggregative gradient tracking via DAT. A rigorous analysis of the dynamic regret of algorithm (8) is presented below.
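Under the simplifying choice $R(x) = \frac{1}{2}\|x\|^2$ over a Euclidean ball $X$, step (8b) becomes a projected gradient step, and one round of (8a)–(8f) can be sketched as below. This is an illustrative sketch only, not the authors' implementation: the dictionary-based state and all function handles (`grads`, `psi`, `jac_psi`) are assumptions of this example.

```python
import numpy as np

def project_ball(p, r):
    """Euclidean projection onto X = {x : ||x|| <= r}."""
    n = np.linalg.norm(p)
    return p if n <= r else p * (r / n)

def round_update(W, state, A_obs, A_obs_prev, grads, psi, jac_psi, eta, r):
    """One round of (8a)-(8f) with R(x) = 0.5*||x||^2, so (8b) is a projected gradient step.

    state: per-agent lists A_hat, x, y, v, z; grads(i, x, v) -> (grad_1 f, grad_2 f);
    psi(i, x) -> psi_i(x); jac_psi(i, x) -> Jacobian of psi_i at x.
    """
    N = W.shape[0]
    mix = lambda vals: [sum(W[i, j] * vals[j] for j in range(N)) for i in range(N)]
    # (8a): average tracking of the time-varying observed mappings
    A_hat = [m + A_obs[i] - A_obs_prev[i] for i, m in enumerate(mix(state["A_hat"]))]
    x_hat = []
    for i in range(N):
        g1, g2 = grads(i, state["x"][i], state["v"][i])
        ell = g1 + state["z"][i].T @ g2                      # aggregative gradient estimate
        x_hat.append(project_ball(state["y"][i] - eta * ell, r))   # (8b)
    # (8c): propagate through the mixed estimated mapping
    x_new = [sum(W[i, j] * A_hat[j] for j in range(N)) @ x_hat[i] for i in range(N)]
    y_new = mix(x_new)                                              # (8d)
    # (8e)/(8f): average tracking of psi_i and of its Jacobian
    v_new = [m + psi(i, x_new[i]) - psi(i, state["x"][i]) for i, m in enumerate(mix(state["v"]))]
    z_new = [m + jac_psi(i, x_new[i]) - jac_psi(i, state["x"][i]) for i, m in enumerate(mix(state["z"]))]
    return {"A_hat": A_hat, "x": x_new, "y": y_new, "v": v_new, "z": z_new}
```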
Some lemmas and the main result are presented as follows, and the proofs are given in the Appendix A, Appendix B, Appendix C, Appendix D, Appendix E, Appendix F and Appendix G.
Lemma 1.
Consider the mapping (5) and update (8a), and suppose that Assumptions 1 and 4 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\|\tilde{A}_{i,t}\| \leq \frac{8 M \eta_t}{1 - \lambda},$

where $\lambda \in (0, 1)$.
Lemma 2.
Consider update (8f), and suppose that Assumptions 1 and 2 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\|z_{i,t} - \bar{z}_t\| \leq \Delta_1, \quad \|\ell_{i,t}\| \leq \Delta_2,$

where $\Delta_1 = N P \gamma + \frac{2 N L \gamma \beta}{1 - \beta} + 4L$, $\Delta_2 = H \Delta_1 + H L + H$, $\gamma = \big(1 - \frac{\alpha}{2 N^2}\big)^{-2}$, and $\beta = 1 - \frac{\alpha}{2 N^2}$.
Lemma 3.
Consider updates (8a)–(8d), and suppose that Assumptions 1–4 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\|x_{i,t+1} - \bar{x}_{t+1}\| \leq \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau, \qquad (9)$

where $\Delta_3 = N \Delta_2 + M_A Q + \frac{8 M Q}{1 - \lambda}$.
Lemma 4.
Consider algorithm (8), and suppose that Assumptions 1–3 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\|x_{i,t+1} - x_{i,t}\| \leq \Delta_4 \eta_t + 2 \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau,$
$\|v_{i,t} - v(x_{i,t})\| \leq N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau, \qquad (10)$
$\|z_{i,t} - \nabla v(x_{i,t})\| \leq N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau,$

where $\Delta_4 = \Delta_2 + M_A Q + \frac{8 M Q}{1 - \lambda}$, $\Delta_5 = L \Delta_4 \big(\frac{N \gamma \beta}{1 - \beta} + 2\big)$, and $\Delta_6 = 2 L \Delta_3 \big(\frac{N \gamma \beta}{1 - \beta} + 3\big)$.
Lemma 5.
Consider algorithm (8), and suppose that Assumptions 1, 3 and 4 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\sum_{t=1}^{T} \sum_{i=1}^{N} \Big( \frac{1}{\eta_t} D_R(x_t^{\ast}, y_{i,t}) - \frac{1}{\eta_t} D_R(x_t^{\ast}, \hat{x}_{i,t+1}) \Big) \leq \frac{2 N R^2}{\eta_{T+1}} + \sum_{t=1}^{T} \frac{N K}{\eta_{t+1}} \|x_{t+1}^{\ast} - A_t^{\ast} x_t^{\ast}\| + \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{\eta_t}{\eta_{t+1}} \Delta_7 \|A_t^{\ast} x_t^{\ast} - x_{i,t+1}\|,$

where $\Delta_7 = \frac{8 L_R M Q}{1 - \lambda}$.
Lemma 6.
Consider algorithm (8), and suppose that Assumptions 1–3 hold. Then, for all $i \in [N]$ and $t \in [T]$,

$\|\tilde{\ell}_{i,t} - \ell_{i,t}\| \leq (L + L^2 + H) \Big\{ N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau \Big\}.$
The main result of this paper is as follows.
Theorem 1.
Consider the optimization problem (3), constraint (6), and algorithm (8), and suppose that Assumptions 1–4 hold. Let $\eta_t = \eta > 0$ and $\iota > \Delta_7$. Then,

$\mathrm{Reg}(T) \leq \frac{L}{\iota - \Delta_7} \Big( \Delta_8 \eta T + \frac{2 N R^2 + N K C_T}{\eta} + \Delta_9 \Big),$

where $C_T = \sum_{t=1}^{T} \|\theta_t\|$, $\Delta_8 = 2 Q (L + L^2 + H) \big( \Delta_5 + \frac{\Delta_6}{1 - \sigma_2(W)} \big) + \frac{N (H + H L)^2}{2} + \frac{2 N (H + H L) \Delta_3}{1 - \sigma_2(W)}$, and $\Delta_9 = 2 N Q \Delta_7 + \Delta_7 C_T + \frac{2 Q N^2 \gamma P (L + L^2 + H) \beta}{1 - \beta}$.
Remark 4.
The condition $\iota > \Delta_7$ is reasonable since a sufficiently small $\Delta_7$ can be obtained by decreasing $L_R$. It follows from the bound on $\mathrm{Reg}(T)$ in Theorem 1 that a proper step size is $\eta_t = \eta_0 / \sqrt{T}$ with $\eta_0 > 0$. In that case, the regret $\mathrm{Reg}(T)$ is bounded in the order of $\mathcal{O}(\sqrt{T})$ for any $T > 1$.
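To see the rate numerically: substituting a step size proportional to $1/\sqrt{T}$ into the bound of Theorem 1 (with $C_T$ bounded) makes both the $\eta T$ term and the $1/\eta$ term grow like $\sqrt{T}$. A quick check with purely illustrative constants standing in for the symbols of Theorem 1:

```python
import math

# Illustrative constants only; they play the roles of the symbols in Theorem 1.
L, iota, D7, D8, D9 = 1.0, 2.0, 0.5, 1.0, 1.0
N, R2, K, CT, eta0 = 4, 1.0, 1.0, 1.0, 1.0

def regret_bound(T):
    eta = eta0 / math.sqrt(T)  # step size eta_t = eta_0 / sqrt(T)
    return (L / (iota - D7)) * (D8 * eta * T + (2 * N * R2 + N * K * CT) / eta + D9)

# regret_bound(T) / sqrt(T) flattens to a constant as T grows,
# i.e., the bound is of order sqrt(T).
ratios = [regret_bound(T) / math.sqrt(T) for T in (10**4, 10**6, 10**8)]
```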

5. Simulations

Inspired by [5,8], consider a cooperative control problem for a multi-robot system in a plane $X \subseteq \mathbb{R}^2$. In particular, a moving target with state $T_t \in X$ is modeled as

$T_{t+1} = A_t T_t + \vartheta_t, \qquad (12)$

where $A_t \in \mathbb{R}^{2 \times 2}$ and $\vartheta_t \in \mathbb{R}^2$ is the noise. Moreover, there is a multi-robot network composed of $N$ robots aiming to protect the target, while there are $N$ intruders aiming to capture it. Let $I_{i,t} \in X$ denote the state of the $i$th intruder at time $t$, which is assigned to robot $i$. Supposing that the formation of the multi-robot network is defined by a virtual leader–follower formation method, let $x_t^{\ast}$ denote the leader's state, which is unknown to all robots. In particular, let $x_{i,t} \in X$ denote the estimate of $x_t^{\ast}$ by robot $i$ at time $t$; the state of robot $i$ is then given by $\psi_i(x_{i,t}) = \beta_i x_{i,t} + \gamma_i$, where $\beta_i$ and $\gamma_i$ represent an offset of robot $i$ relative to the virtual leader estimate $x_{i,t}$.
The multi-robot network protects the moving target by choosing an optimal virtual leader such that the average state of all the robots is close to the target, while each robot simultaneously stays close to its associated intruder; see Figure 1. Let $v(\mathbf{x}_t) = \frac{1}{N} \sum_{i=1}^{N} \psi_i(x_{i,t})$ denote the averaged state of the robots, where $\mathbf{x}_t = \mathrm{col}(x_{1,t}, x_{2,t}, \ldots, x_{N,t})$. Then, the cooperative control problem can be modeled as the following optimization problem:
$\min_{x_{i,t} \in X} f_t(\mathbf{x}_t) = \sum_{i=1}^{N} f_{i,t}(x_{i,t}, v(\mathbf{x}_t)), \qquad (13)$

where

$f_{i,t}(x_{i,t}, v(\mathbf{x}_t)) = \alpha_1 \|v(\mathbf{x}_t) - T_t\|^2 + \alpha_2 \|\psi_i(x_{i,t}) - I_{i,t}\|^2,$

with $\alpha_1 > 0$ and $\alpha_2 > 0$. In fact, the objective is to cooperatively seek an optimal virtual leader, which is the optimal solution to problem (13).
It is worth pointing out that the estimated virtual leader should reach consensus. Therefore, let $x_t^{\ast} \in X$ be the consensus solution to problem (13). Then $x_t^{\ast}$ should be the solution to the following online optimization problem:

$x_t^{\ast} = \arg\min_{x \in X} f_t(x) = \arg\min_{x \in X} \sum_{i=1}^{N} f_{i,t}(x, v(x)), \qquad (14)$

where $\psi_i(x) = \beta_i x + \gamma_i$ and $v(x) = \frac{1}{N} \sum_{i=1}^{N} \psi_i(x)$. In addition, it follows from (12) that $x_t^{\ast}$ should also obey the mapping $A_t$, i.e., $x_t^{\ast}$ satisfies the following constraint:

$x_{t+1}^{\ast} = A_t x_t^{\ast} + \vartheta_t, \qquad (15)$

where $\vartheta_t \in \mathbb{R}^2$ is an unknown and unstructured noise. Note that $A_t$ is unknown to all robots; hence, $A_{i,t}$ denotes the mapping observed by robot $i$ at time $t$. Problem (14) and constraint (15) take the form of problem (3) and constraint (6), so Algorithm 1 can be used to solve problem (14) with constraint (15).
Let $N = 4$, $X = \{x \in \mathbb{R}^2 \,|\, \|x\| \leq 3\}$, and

$A_t = \begin{pmatrix} 1 & (1 + \cos t) \times 10^{-3} \\ 0 & 1 \end{pmatrix},$

where $\vartheta_t$ is a random vector drawn from a normal distribution with zero mean and standard deviation $[1; 2] \times 10^{-3}$, $I_{i,t} = T_t + \mathrm{col}(\cos(2 i \pi / N), \sin(2 i \pi / N))$, $\beta_i = 1$, $\gamma_i = \mathrm{col}(\cos(2 i \pi / N), \sin(2 i \pi / N))$, $\alpha_1 = 0.1$, $\alpha_2 = 0.2$, the unknown noise $\vartheta_t$ satisfies $\|\vartheta_t\| \leq 0.2$, and $A_{i,t} = A_t + \mu_t I_2$, where $\mu_t$ is a random number drawn from a normal distribution with zero mean and standard deviation $10^{-3}$.
Choose $T = 1000$, $\eta_t = 0.001$ and $R(x) = 0.01 \|x\|^2$, and suppose that an arbitrary doubly stochastic $W$ is given; it can then be verified that Assumptions 1–4 and the condition $\iota > \Delta_7$ hold. The simulations are performed on a computer equipped with an AMD Ryzen 9 5950X 16-core CPU, 64 GB RAM, and an Nvidia RTX 3080Ti GPU. Figure 2 shows the trajectory $x_{1,t}$ of robot 1 under Algorithm 1. The dynamic regret of Algorithm 1 is shown in Figure 3, where the regret $\mathrm{Reg}(T)$ is bounded. Therefore, Theorem 1 is verified by Figure 2 and Figure 3.
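The problem data above can be reproduced in a few lines. The sketch below is a simplified centralized stand-in (a single leader estimate, exact gradients, and the true $A_t$) rather than the distributed Algorithm 1; it only illustrates the target dynamics (12), the intruder placement, and the cost (13), and its step size and horizon are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha1, alpha2, radius, eta = 4, 0.1, 0.2, 3.0, 0.05

# Offsets gamma_i = col(cos(2*i*pi/N), sin(2*i*pi/N)), i = 1..N (these sum to zero).
gamma = np.array([[np.cos(2 * np.pi * i / N), np.sin(2 * np.pi * i / N)]
                  for i in range(1, N + 1)])

def A(t):
    """The slowly varying target mapping used in the simulation section."""
    return np.array([[1.0, (1.0 + np.cos(t)) * 1e-3], [0.0, 1.0]])

def project(p, r=radius):
    """Projection onto X = {x : ||x|| <= r}."""
    n = np.linalg.norm(p)
    return p if n <= r else p * (r / n)

target = np.array([0.5, -0.5])
x = np.zeros(2)                          # centralized estimate of the virtual leader
for t in range(1, 501):
    intruders = target + gamma           # I_{i,t} = T_t + offsets
    # Exact gradient of f_t for psi_i(x) = x + gamma_i:
    grad = sum(2 * alpha1 * (x + gamma.mean(axis=0) - target)
               + 2 * alpha2 * (x + gamma[i] - intruders[i]) for i in range(N))
    x = project(x - eta * grad)          # projected gradient step on (13)
    target = A(t) @ target + 1e-3 * rng.standard_normal(2)   # target dynamics (12)
```

With these symmetric offsets the optimum coincides with the target itself, so the estimate ends up tracking $T_t$.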

6. Conclusions

This paper introduces an online optimization algorithm for distributed online aggregative optimization problems in dynamic environments. With this algorithm, the dynamic regret remains bounded without relying on the condition that the dynamic environment is known. Future research includes distributed online optimization problems over time-varying directed networks.

Author Contributions

Conceptualization, C.Y., S.W. and B.H.; methodology, C.Y. and S.W.; writing—original draft preparation, C.Y. and S.W.; writing—review and editing, S.Z. and S.L.; funding acquisition, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key R&D Program of China under Grant 2022ZD0119601.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Proof of Lemma 1.
The proof follows from Lemma 2.1 in [11]. □

Appendix B

Proof of Lemma 2.
Let $\epsilon_{z_{i,t+1}} = \nabla \psi_i(x_{i,t+1}) - \nabla \psi_i(x_{i,t})$. Then update (8f) can be rewritten as

$z_{i,t+1} = \sum_{j=1}^{N} W_{ij} z_{j,t} + \epsilon_{z_{i,t+1}}.$

According to Lemma 2 in [12],

$\|z_{i,t+1} - \bar{z}_{t+1}\| \leq N \gamma \beta^t \max_j \|z_{j,1}\| + \gamma \sum_{l=1}^{t-1} \beta^{t-l} \sum_{j=1}^{N} \|\epsilon_{z_{j,l+1}}\| + \frac{1}{N} \sum_{j=1}^{N} \|\epsilon_{z_{j,t+1}}\| + \|\epsilon_{z_{i,t+1}}\|. \qquad (A1)$

It follows from Assumption 2 that $\|\epsilon_{z_{i,t+1}}\| \leq 2L$. Then,

$\|z_{i,t+1} - \bar{z}_{t+1}\| \leq \Delta_1.$

Moreover, it follows from (8f) that

$\frac{1}{N} \sum_{i=1}^{N} z_{i,t+1} = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{N} W_{ij} z_{j,t} + \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x_{i,t+1}) - \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x_{i,t}).$

By using Assumption 1,

$\bar{z}_{t+1} - \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x_{i,t+1}) = \bar{z}_t - \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x_{i,t}).$

It follows from $z_{i,0} = \nabla \psi_i(x_{i,0})$ that $\bar{z}_t = \frac{1}{N} \sum_{i=1}^{N} \nabla \psi_i(x_{i,t})$. Assumption 2 implies that $\|\bar{z}_t\| \leq L$. Therefore,

$\|\ell_{i,t}\| \leq H + H (\|z_{i,t} - \bar{z}_t\| + \|\bar{z}_t\|) \leq H + H \Delta_1 + H L = \Delta_2. \qquad \square$

Appendix C

Proof of Lemma 3.
Let $e_{i,t} = \hat{x}_{i,t+1} - y_{i,t}$ and $p_{i,t} = (A_t - I_n + \tilde{A}_{i,t}) \hat{x}_{i,t+1}$. It follows from Lemma 2.2 in [11] that

$\|e_{i,t}\| \leq \Delta_2 \eta_t, \quad \|p_{i,t}\| \leq \Big( M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t, \quad \bar{x}_{t+1} = \frac{1}{N} \sum_{\tau=0}^{t} \sum_{j=1}^{N} (e_{j,\tau} + p_{j,\tau}), \qquad (A2)$

and

$\|x_{i,t+1} - \bar{x}_{t+1}\| \leq \Big( N \|\ell_{i,t}\| + M_A Q + \frac{8 M Q}{1 - \lambda} \Big) \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A3)$

By using the result $\|\ell_{i,t}\| \leq \Delta_2$ in Lemma 2, (9) holds. $\square$

Appendix D

Proof of Lemma 4.
Consider the first inequality in (10). It follows from (A2) that

$\|\bar{x}_{t+1} - \bar{x}_t\| = \Big\| \frac{1}{N} \sum_{j=1}^{N} (e_{j,t} + p_{j,t}) \Big\| \leq \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t, \qquad (A4)$

which implies that

$\|x_{i,t+1} - x_{i,t}\| \leq \|x_{i,t+1} - \bar{x}_{t+1}\| + \|\bar{x}_{t+1} - \bar{x}_t\| + \|\bar{x}_t - x_{i,t}\| \leq \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t + 2 \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A5)$

Thus, the first inequality in (10) holds.
Consider the second inequality in (10). Note that

$\|v(x_{i,t}) - v_{i,t}\| \leq \Big\| v(x_{i,t}) - \frac{1}{N} \sum_{j=1}^{N} \psi_j(x_{j,t}) \Big\| + \Big\| \frac{1}{N} \sum_{j=1}^{N} \psi_j(x_{j,t}) - v_{i,t} \Big\| \leq \frac{L}{N} \sum_{j=1}^{N} \|x_{i,t} - x_{j,t}\| + \Big\| \frac{1}{N} \sum_{j=1}^{N} \psi_j(x_{j,t}) - v_{i,t} \Big\|, \qquad (A6)$

where the last inequality holds since the function $\psi_j$ is Lipschitz (see Assumption 2). By using (9),

$\frac{L}{N} \sum_{j=1}^{N} \|x_{i,t} - x_{j,t}\| \leq \frac{L}{N} \sum_{j=1}^{N} \big( \|x_{i,t} - \bar{x}_t\| + \|\bar{x}_t - x_{j,t}\| \big) \leq 2 L \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A7)$

Let $\epsilon_{v_{i,t+1}} = \psi_i(x_{i,t+1}) - \psi_i(x_{i,t})$. By using Assumption 2 and (A5),

$\|\epsilon_{v_{i,t+1}}\| \leq L \|x_{i,t+1} - x_{i,t}\| \leq L \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t + 2 L \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A8)$

It follows that update (8e) can be rewritten as

$v_{i,t+1} = \sum_{j=1}^{N} W_{ij} v_{j,t} + \epsilon_{v_{i,t+1}}.$

According to Lemma 2 in [12], it follows that

$\Big\| v_{i,t+1} - \frac{1}{N} \sum_{j=1}^{N} v_{j,t+1} \Big\| \leq N \gamma \beta^t \max_j \|v_{j,1}\| + \gamma \sum_{l=1}^{t-1} \beta^{t-l} \sum_{j=1}^{N} \|\epsilon_{v_{j,l+1}}\| + \frac{1}{N} \sum_{j=1}^{N} \|\epsilon_{v_{j,t+1}}\| + \|\epsilon_{v_{i,t+1}}\| \leq N \gamma P \beta^t + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) L \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) 2 L \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau.$

It follows from (8e) and Assumption 1 that

$\frac{1}{N} \sum_{i=1}^{N} v_{i,t+1} = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{N} W_{ij} v_{j,t} + \frac{1}{N} \sum_{i=1}^{N} \psi_i(x_{i,t+1}) - \frac{1}{N} \sum_{i=1}^{N} \psi_i(x_{i,t}).$

According to the initial state $v_{i,0} = \psi_i(x_{i,0})$, it follows that

$\frac{1}{N} \sum_{j=1}^{N} v_{j,t} = \frac{1}{N} \sum_{i=1}^{N} \psi_i(x_{i,t}). \qquad (A9)$

It follows from (A8) and (A9) that

$\Big\| \frac{1}{N} \sum_{i=1}^{N} \psi_i(x_{i,t}) - v_{i,t} \Big\| \leq N \gamma P \beta^t + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) L \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t-1} \lambda^{t-k-1} \Big) \eta_{t-1} + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) 2 L \Delta_3 \sum_{\tau=0}^{t-1} \sigma_2^{t-1-\tau}(W) \eta_\tau. \qquad (A10)$

Combining (A6), (A7), and (A10), the second inequality in (10) holds.
Consider the third inequality in (10), and note that

$\|z_{i,t} - \nabla v(x_{i,t})\| \leq \Big\| z_{i,t} - \frac{1}{N} \sum_{j=1}^{N} \nabla \psi_j(x_{j,t}) \Big\| + \Big\| \frac{1}{N} \sum_{j=1}^{N} \nabla \psi_j(x_{j,t}) - \nabla v(x_{i,t}) \Big\| \leq \|z_{i,t} - \bar{z}_t\| + \frac{L}{N} \sum_{j=1}^{N} \|x_{i,t} - x_{j,t}\|. \qquad (A11)$

Similar to (A8), there holds

$\|\epsilon_{z_{i,t+1}}\| \leq L \|x_{i,t+1} - x_{i,t}\| \leq L \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t + 2 L \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau.$

It follows from (A1) that

$\|z_{i,t} - \bar{z}_t\| \leq N \gamma P \beta^t + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) L \Big( \Delta_2 + M_A Q + 8 M Q \sum_{k=0}^{t} \lambda^{t-k} \Big) \eta_t + \Big( \frac{N \gamma \beta}{1 - \beta} + 2 \Big) 2 L \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A12)$

Therefore, according to (A5), (A11), and (A12), the third inequality in (10) holds. $\square$

Appendix E

Proof of Lemma 5.
The proof follows from Lemma 2.3 in [11]. □

Appendix F

Proof of Lemma 6.
It follows from Assumption 2 and Lemma 2 that

$\|\tilde{\ell}_{i,t} - \ell_{i,t}\| \leq \|\nabla_1 f_{i,t}(x_{i,t}, v(x_{i,t})) - \nabla_1 f_{i,t}(x_{i,t}, v_{i,t})\| + \|\nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) \nabla v(x_{i,t}) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) z_{i,t}\| \leq L \|v(x_{i,t}) - v_{i,t}\| + \|\nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) \nabla v(x_{i,t}) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) \nabla v(x_{i,t})\| + \|\nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) \nabla v(x_{i,t}) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) z_{i,t}\|. \qquad (A13)$

According to Assumption 2 and Lemma 4, we have

$\|\nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) \nabla v(x_{i,t}) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) \nabla v(x_{i,t})\| \leq \|\nabla_2 f_{i,t}(x_{i,t}, v(x_{i,t})) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t})\| \|\nabla v(x_{i,t})\| \leq L \|v(x_{i,t}) - v_{i,t}\| \|\nabla v(x_{i,t})\| \leq L^2 \Big\{ N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau \Big\}, \qquad (A14)$

and

$\|\nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) \nabla v(x_{i,t}) - \nabla_2 f_{i,t}(x_{i,t}, v_{i,t}) z_{i,t}\| \leq \|\nabla_2 f_{i,t}(x_{i,t}, v_{i,t})\| \|z_{i,t} - \nabla v(x_{i,t})\| \leq H \Big\{ N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau \Big\}. \qquad (A15)$

The proof is completed by combining (A13)–(A15) and (10). $\square$

Appendix G

Proof of Theorem 1.
Note that the function $f_{i,t}$ is convex. Thus,

$f_{i,t}(x_{i,t}) - f_{i,t}(x_t^{\ast}) \leq \langle \tilde{\ell}_{i,t}, x_{i,t} - x_t^{\ast} \rangle = \langle \ell_{i,t}, \hat{x}_{i,t+1} - x_t^{\ast} \rangle + \langle \tilde{\ell}_{i,t}, x_{i,t} - y_{i,t} \rangle + \langle \tilde{\ell}_{i,t}, y_{i,t} - \hat{x}_{i,t+1} \rangle + \langle \tilde{\ell}_{i,t} - \ell_{i,t}, \hat{x}_{i,t+1} - x_t^{\ast} \rangle. \qquad (A16)$

According to update (8b), Lemma 4.1 in [13], and the fact that $D_R(x, y) \geq \frac{1}{2} \|x - y\|^2$ for $x, y \in X$, it follows that

$\langle \ell_{i,t}, \hat{x}_{i,t+1} - x_t^{\ast} \rangle \leq \frac{1}{\eta_t} D_R(x_t^{\ast}, y_{i,t}) - \frac{1}{\eta_t} D_R(x_t^{\ast}, \hat{x}_{i,t+1}) - \frac{1}{\eta_t} D_R(\hat{x}_{i,t+1}, y_{i,t}) \leq \frac{1}{\eta_t} D_R(x_t^{\ast}, y_{i,t}) - \frac{1}{\eta_t} D_R(x_t^{\ast}, \hat{x}_{i,t+1}) - \frac{1}{2 \eta_t} \|\hat{x}_{i,t+1} - y_{i,t}\|^2. \qquad (A17)$

By using Lemmas 2 and 3, Assumption 1, and update (8d),

$\langle \tilde{\ell}_{i,t}, x_{i,t} - y_{i,t} \rangle = \langle \tilde{\ell}_{i,t}, x_{i,t} - \bar{x}_t \rangle + \sum_{j=1}^{N} W_{ij} \langle \tilde{\ell}_{i,t}, \bar{x}_t - x_{j,t} \rangle \leq \|\tilde{\ell}_{i,t}\| \Big( \|x_{i,t} - \bar{x}_t\| + \sum_{j=1}^{N} W_{ij} \|x_{j,t} - \bar{x}_t\| \Big) \leq 2 (H + H L) \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A18)$

It follows from Lemma 2 that

$\langle \tilde{\ell}_{i,t}, y_{i,t} - \hat{x}_{i,t+1} \rangle \leq \|\tilde{\ell}_{i,t}\| \|\hat{x}_{i,t+1} - y_{i,t}\| \leq \frac{1}{2 \eta_t} \|\hat{x}_{i,t+1} - y_{i,t}\|^2 + \frac{(H + H L)^2}{2} \eta_t. \qquad (A19)$

According to (A16)–(A19), we have

$f_{i,t}(x_{i,t}) - f_{i,t}(x_t^{\ast}) \leq \langle \tilde{\ell}_{i,t} - \ell_{i,t}, \hat{x}_{i,t+1} - x_t^{\ast} \rangle + \frac{1}{\eta_t} D_R(x_t^{\ast}, y_{i,t}) - \frac{1}{\eta_t} D_R(x_t^{\ast}, \hat{x}_{i,t+1}) + \frac{(H + H L)^2}{2} \eta_t + 2 (H + H L) \Delta_3 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau. \qquad (A20)$

Then it follows from Lemmas 5 and 6 that

$\sum_{t=1}^{T} \sum_{i=1}^{N} \big( f_{i,t}(x_{i,t}) - f_{i,t}(x_t^{\ast}) \big) \leq \sum_{t=1}^{T} \sum_{i=1}^{N} \|\hat{x}_{i,t+1} - x_t^{\ast}\| (L + L^2 + H) \Big\{ N \gamma P \beta^t + \Delta_5 \eta_{t-1} + \Delta_6 \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau \Big\} + \frac{2 N R^2}{\eta_{T+1}} + \sum_{t=1}^{T} \frac{N K}{\eta_{t+1}} \|x_{t+1}^{\ast} - A_t^{\ast} x_t^{\ast}\| + \sum_{t=1}^{T} \sum_{i=1}^{N} \frac{\eta_t}{\eta_{t+1}} \Delta_7 \|A_t^{\ast} x_t^{\ast} - x_{i,t+1}\| + \frac{N (H + H L)^2}{2} \sum_{t=1}^{T} \eta_t + 2 N (H + H L) \Delta_3 \sum_{t=1}^{T} \sum_{\tau=0}^{t} \sigma_2^{t-\tau}(W) \eta_\tau.$

Note that $\eta_t$ is fixed and satisfies $\eta_t = \eta$. It follows from $\|\hat{x}_{i,t+1} - x_t^{\ast}\| \leq 2Q$, (A20) and Assumption 2 that

$\sum_{t=1}^{T} \sum_{i=1}^{N} \iota \|x_{i,t} - x_t^{\ast}\| \leq \Delta_7 \sum_{t=1}^{T} \sum_{i=1}^{N} \|x_{i,t} - x_t^{\ast}\| + \Delta_8 \eta T + \frac{N K}{\eta} \sum_{t=1}^{T} \|\theta_t\| + \frac{2 N R^2}{\eta} + \sum_{i=1}^{N} \Delta_7 \|x_{T+1}^{\ast} - x_{i,T+1}\| + \Delta_7 \sum_{t=1}^{T} \|\theta_t\| + 2 Q N \gamma P (L + L^2 + H) \sum_{t=1}^{T} \sum_{i=1}^{N} \beta^t. \qquad (A21)$

Then it follows from (A21) that

$\sum_{t=1}^{T} \sum_{i=1}^{N} (\iota - \Delta_7) \|x_{i,t} - x_t^{\ast}\| \leq \Delta_8 \eta T + \frac{2 N R^2 + N K C_T}{\eta} + \Delta_9. \qquad (A22)$

Therefore, the proof is completed by combining (A22) with

$\mathrm{Reg}(T) \leq \sum_{t=1}^{T} \sum_{i=1}^{N} L \|x_{i,t} - x_t^{\ast}\|. \qquad \square$

References

1. Shi, Y.; Ran, L.; Tang, J.; Wu, X. Distributed optimization algorithm for composite optimization problems with non-smooth function. Mathematics 2022, 10, 3135.
2. Li, X.X.; Xie, L.H.; Li, N. A survey on distributed online optimization and online games. Annu. Rev. Control 2023, 56, 24.
3. Li, X.X.; Xie, L.H.; Hong, Y.G. Distributed aggregative optimization over multi-agent networks. IEEE Trans. Autom. Control 2022, 67, 3165–3171.
4. Li, X.X.; Yi, X.L.; Xie, L.H. Distributed online convex optimization with an aggregative variable. IEEE Trans. Control Netw. Syst. 2022, 9, 438–449.
5. Carnevale, G.; Camisa, A.; Notarstefano, G. Distributed online aggregative optimization for dynamic multirobot coordination. IEEE Trans. Autom. Control 2023, 68, 3736–3743.
6. Hall, E.C.; Willett, R.M. Online convex optimization in dynamic environments. IEEE J. Sel. Top. Signal Process. 2015, 9, 647–662.
7. Mokhtari, A.; Shahrampour, S.; Jadbabaie, A.; Ribeiro, A. Online optimization in dynamic environments: Improved regret rates for strongly convex problems. In Proceedings of the 55th IEEE Conference on Decision and Control (CDC), Las Vegas, NV, USA, 12–14 December 2016; pp. 7195–7201.
8. Shahrampour, S.; Jadbabaie, A. Distributed online optimization in dynamic environments using mirror descent. IEEE Trans. Autom. Control 2018, 63, 714–725.
9. Nazari, P.; Khorram, E.; Tarzanagh, D.A. Adaptive online distributed optimization in dynamic environments. Optim. Methods Softw. 2021, 36, 973–997.
10. Li, J.Y.; Li, C.J.; Yu, W.W.; Zhu, X.M.; Yu, X.H. Distributed online bandit learning in dynamic environments over unbalanced digraphs. IEEE Trans. Netw. Sci. Eng. 2021, 8, 3034–3047.
11. Wang, S.; Huang, B.M. Distributed online optimisation in unknown dynamic environment. Int. J. Syst. Sci. 2024, 55, 1167–1176.
12. Lee, S.; Zavlanos, M.M. On the sublinear regret of distributed primal-dual algorithms for online constrained optimization. arXiv 2017, arXiv:1705.11128.
13. Beck, A.; Teboulle, M. Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 2003, 31, 167–175.
Figure 1. The concept of the cooperative control problem for a multi-robot system.
Figure 2. Trajectories of $x_t^{\ast}$ and $x_{1,t}$ over $T = 1000$.
Figure 3. The dynamic regret of Algorithm 1 for problem (14) with constraint (15).
Table 1. Comparison of problem formulation and algorithm of online optimization.

Reference | Aggregative Terms | Dynamic Environment | Algorithm/Technique
[3] | Yes | No | Distributed aggregative gradient tracking
[4] | Yes | No | Online distributed gradient tracking
[5] | Yes | No | Projected aggregative tracking
[6] | No | Known | Dynamic mirror descent
[7] | No | Known | Online gradient descent
[8] | No | Known | Decentralized mirror descent
[9] | No | Known | Adaptive gradient method
[10] | No | Known | Distributed online bandit learning
[11] | No | Unknown | Gradient tracking via DAT
This paper | Yes | Unknown | Aggregative gradient tracking via DAT

Share and Cite

Yang, C.; Wang, S.; Zhang, S.; Lin, S.; Huang, B. A Class of Distributed Online Aggregative Optimization in Unknown Dynamic Environment. Mathematics 2024, 12, 2460. https://doi.org/10.3390/math12162460
