1. Introduction
Consider the following nonlinear semidefinite optimization problem
$$\min_{x\in\mathbb{R}^n} f(x)\quad \text{s.t.}\quad h(x)=0,\ \ g(x)\in S^m_+,\tag{1}$$
where the function $f:\mathbb{R}^n\to\mathbb{R}$, the mapping $h:\mathbb{R}^n\to\mathbb{R}^l$ and the mapping $g:\mathbb{R}^n\to S^m$ are assumed to be twice continuously differentiable in a neighborhood of a given feasible point $\bar x$, where $S^m$ is the space of $m\times m$ real symmetric matrices and $S^m_+\subset S^m$ is the cone of positive semidefinite matrices.
The augmented Lagrange method (ALM) was initiated by Powell [1] and Hestenes [2] for solving equality constrained nonlinear programming problems and was extended by Rockafellar [3] to inequality constrained optimization problems. For convex programming, Rockafellar [3,4] adopted the augmented Lagrange function to establish a saddle point theorem and demonstrated the global convergence of the augmented Lagrange method when the penalty parameter is chosen as an arbitrary positive number. Rockafellar [5] provided an in-depth study of the augmented Lagrange method for convex optimization.
The study of local convergence properties of the augmented Lagrange method is fairly comprehensive. For optimization problems with equality constraints, Powell [1] proved local linear convergence of the augmented Lagrange method to a local minimum point when the second-order sufficient condition and the linear independence constraint qualification are satisfied. This result was extended by Bertsekas ([6], Chapter 3) to optimization problems with inequality constraints under the strict complementarity condition, in which the linear rate constant is proportional to $1/c$. If the strict complementarity condition is not satisfied, Ito and Kunisch [7], Conn et al. [8] and Contesse-Becker [9] proved that the augmented Lagrange method has a linear convergence rate.
The Lagrange function for Problem (1) can be written as
$$L(x,\mu,\Gamma)=f(x)+\langle \mu,h(x)\rangle-\langle \Gamma,g(x)\rangle,\qquad (x,\mu,\Gamma)\in\mathbb{R}^n\times\mathbb{R}^l\times S^m.$$
The augmented Lagrange function for (1) is defined by
$$L_c(x,\mu,\Gamma)=f(x)+\langle \mu,h(x)\rangle+\frac{c}{2}\|h(x)\|^2+\frac{1}{2c}\Bigl(\bigl\|\Pi_{S^m_+}(\Gamma-c\,g(x))\bigr\|^2-\|\Gamma\|^2\Bigr),$$
where $\Pi_{S^m_+}$ denotes the projection operator onto $S^m_+$. The augmented Lagrange method for Problem (1) can be expressed in the following form:
- Step 0
Given parameter $c_0>0$, initial point $x^0$, initial multiplier $(\mu^0,\Gamma^0)\in\mathbb{R}^l\times S^m_+$ and $k:=0$.
- Step 1
If $(x^k,\mu^k,\Gamma^k)$ satisfies the Karush–Kuhn–Tucker conditions of Problem (1),
then stop and
$(x^k,(\mu^k,\Gamma^k))$ is a Karush–Kuhn–Tucker (KKT) pair.
- Step 2
Solve the following problem
$$x^{k+1}\in\operatorname*{arg\,min}_{x\in\mathbb{R}^n} L_{c_k}(x,\mu^k,\Gamma^k)$$
and calculate
$$\mu^{k+1}=\mu^k+c_k h(x^{k+1}),\qquad \Gamma^{k+1}=\Pi_{S^m_+}\bigl(\Gamma^k-c_k\,g(x^{k+1})\bigr).$$
- Step 3
Update $c_{k+1}\ge c_k$, set $k:=k+1$, and go to Step 1.
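To make Steps 0–3 concrete, the following is a minimal numerical sketch in Python. It assumes the problem form (1) and the Rockafellar-type augmented Lagrangian written above; the inner BFGS solver, the geometric penalty schedule `rho` and the tolerances are illustrative placeholder choices, not prescriptions of the text.

```python
import numpy as np
from scipy.optimize import minimize


def proj_psd(A):
    """Project a symmetric matrix onto the PSD cone S^m_+ (clip negative eigenvalues)."""
    w, P = np.linalg.eigh((A + A.T) / 2.0)
    return (P * np.maximum(w, 0.0)) @ P.T


def aug_lagrangian(x, mu, Gamma, c, f, h, g):
    """Rockafellar-type augmented Lagrangian L_c(x, mu, Gamma) written above."""
    hx, gx = h(x), g(x)
    proj = proj_psd(Gamma - c * gx)
    return (f(x) + mu @ hx + 0.5 * c * hx @ hx
            + (np.sum(proj**2) - np.sum(Gamma**2)) / (2.0 * c))


def alm(f, h, g, x, mu, Gamma, c=10.0, rho=2.0, tol=1e-8, max_iter=100):
    """Steps 0-3 of the method; an inner BFGS solve stands in for Step 2."""
    for _ in range(max_iter):
        # Step 2: minimize L_c in x, then update the multipliers.
        x = minimize(aug_lagrangian, x, args=(mu, Gamma, c, f, h, g),
                     method="BFGS").x
        mu = mu + c * h(x)
        Gamma = proj_psd(Gamma - c * g(x))
        # Step 1: stop on a standard KKT residual (feasibility + complementarity;
        # stationarity is enforced approximately by the inner minimization).
        if (np.linalg.norm(h(x)) <= tol
                and np.linalg.norm(g(x) - proj_psd(g(x) - Gamma)) <= tol):
            break
        # Step 3: (optionally) increase the penalty; c_k -> infinity yields the
        # superlinear multiplier rate discussed in Section 4.
        c *= rho
    return x, mu, Gamma
```

For instance, with f, h and g supplied as NumPy callables, `alm(f, h, g, x0, np.zeros(l), np.zeros((m, m)))` runs the iteration from zero initial multipliers.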
For the nonlinear semidefinite optimization problem, Sun et al. [10] used, in their appendix, a direct approach to derive the linear rate of convergence when the strict complementarity condition holds. However, the result on the rate of convergence of the augmented Lagrange method obtained in [10] leaves room for improvement. For example, can we obtain a result similar to that of ([6], Chapter 3) for equality constrained optimization problems when $1/c$ is very small? How can we characterize the rate constant of the local linear convergence of the augmented Lagrangian method? In this paper, we give positive answers to these two questions.
It should be noted that there are many important applications of augmented Lagrangian methods to different types of optimization problems; see, for example, [11,12,13].
The paper is organized as follows. In the next section, we develop properties of the augmented Lagrange function under the Jacobian uniqueness conditions for the semidefinite optimization problem (1), which will be required to prove results on the convergence rate of the augmented Lagrange method. In Section 3, we demonstrate the linear rate of convergence of the augmented Lagrangian method for the semidefinite optimization problem when the Jacobian uniqueness conditions are satisfied. In Section 4, we establish the asymptotic convergence rate of the Lagrange multipliers, showing that the sequence of Lagrange multiplier vectors produced by the augmented Lagrange method converges to the optimal Lagrange multiplier superlinearly when the sequence $\{c_k\}$ is increasing to ∞. Finally, we draw conclusions in Section 5.
We list two technical results at the end of this section, which will be used in developing properties of the augmented Lagrange function for proving the main theorem about the convergence rate of the ALM. The first technical result is a variant of [14] and the second is an implicit function theorem from page 12 of Bertsekas [6].
Lemma 1. Let X and Y be two finite dimensional Hilbert spaces and let $\theta:X\to\mathbb{R}$ be continuous and positive homogeneous of degree 2, namely, $\theta(tw)=t^2\theta(w)$ for all $t\ge 0$ and $w\in X$. Suppose that there exists a real number $\eta>0$ such that $\theta(w)\ge\eta\|w\|^2$ for any w satisfying $\mathcal{A}w=0$, where $\mathcal{A}:X\to Y$ is a linear mapping. Then, there exist positive real numbers $c_0$ and $\eta_0$ such that
$$\theta(w)+c\|\mathcal{A}w\|^2\ge\eta_0\|w\|^2\qquad\text{for all } w\in X \text{ and all } c\ge c_0.$$
Lemma 2. Assume that  is an open subset of , Σ is a nonempty compact subset of , and  is a mapping with  on  for some . Assume that  exists and is continuous on . Assume that  is a vector such that  for , and that the Jacobian  is nonsingular for all . Then, there exist scalars ,  and a mapping  such that  on ,  for all , and  for all . The mapping ψ is unique in the sense that if ,  and , then . Furthermore, if , then we have
2. Properties of the Augmented Lagrangian
Assume that $\bar x$ is a given feasible point of Problem (1) and f, h and g are twice differentiable in a neighborhood of $\bar x$. The following conditions, which are called the Jacobian uniqueness conditions, are needed in our analysis.
Definition 1. Jacobian uniqueness conditions at are the following conditions:
- (i)
The point satisfies the Karush–Kuhn–Tucker condtions: - (ii)
The constraint nondegeneracy condition is satisfied at : where denotes the linearity space of a closed convex cone.
- (iii)
The strict complementarity condition at holds, namely .
- (iv)
At , the second-order sufficiency optimality conditions holds, namely for any satisfying , where is the Moore–Penrose pseudoinverse of and is the critical cone at defined by
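Since the displays in (ii) and (iv) are only partially reproduced above, we recall, as an assumption on notation, the forms these objects standardly take in nonlinear semidefinite optimization (cf. [10]); the constraint nondegeneracy condition and the critical cone read

$$\begin{pmatrix}\mathcal{J}h(\bar x)\\ \mathcal{J}g(\bar x)\end{pmatrix}\mathbb{R}^n+\begin{pmatrix}\{0\}\\ \operatorname{lin}\mathcal{T}_{S^m_+}(g(\bar x))\end{pmatrix}=\begin{pmatrix}\mathbb{R}^l\\ S^m\end{pmatrix},
\qquad
C(\bar x)=\bigl\{d\in\mathbb{R}^n:\ \mathcal{J}h(\bar x)d=0,\ \mathcal{J}g(\bar x)d\in\mathcal{T}_{S^m_+}(g(\bar x)),\ \nabla f(\bar x)^{\mathsf T}d=0\bigr\}.$$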
In this section, we give some properties of the Jacobian uniqueness conditions for Problem (1) and properties of the augmented Lagrange function under this set of conditions. These properties are crucial for studying the convergence rate of the augmented Lagrange method.
Let
be a KKT pair. Assume that (iii) holds; then,
is nonsingular. Let the eigenvalues of
be
and
Then, an orthogonal matrix
exists such that
where
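For orientation (an assumption about the unreproduced display, following the convention of [10]): the nonsingular matrix in question is $\bar\Gamma-c\,g(\bar x)$. Since $\langle\bar\Gamma,g(\bar x)\rangle=0$ with both matrices positive semidefinite forces $\bar\Gamma\,g(\bar x)=0$, the two matrices are simultaneously diagonalizable, and strict complementarity ($\bar\Gamma+g(\bar x)\succ 0$) splits the spectrum into a positive and a negative block:

$$P^{\mathsf T}\bigl(\bar\Gamma-c\,g(\bar x)\bigr)P=\begin{pmatrix}\Lambda_{+}&0\\ 0&-c\,\Lambda_{-}\end{pmatrix},\qquad \Lambda_{+}\succ 0,\ \Lambda_{-}\succ 0,$$

where $\Lambda_+$ carries the positive eigenvalues of $\bar\Gamma$ and $\Lambda_-$ those of $g(\bar x)$.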
If Jacobian uniqueness conditions (i)–(iii) hold, then the cone
is reduced to the following subspace
If Jacobian uniqueness condition (iv) holds, then there exists
such that
In fact, if this is not true, then a sequence
with
exists such that
There exists a subsequence
and
with
such that
. The closedness of
implies
. Taking the limit of (
5) along the subsequence
, we obtain
which contradicts Jacobian uniqueness condition (iv).
Then, the Jacobian of
, denoted by
, is expressed as
Lemma 3. Let  be a given point and f, h and g be twice continuously differentiable in a neighborhood of . Suppose that the Jacobian uniqueness conditions are satisfied at . Then,  is a nonsingular linear operator.
Proof. Consider the equation
where
. This equation is equivalent to
From the third equality of (
8) we have for
that
where
This implies the following relations
From
and
, we have
Multiplying the first equality of (8) by
, we obtain
which implies
from Jacobian uniqueness condition (iv). This comes from the fact that
implies
from Jacobian uniqueness condition (iv). Then, from the first equality of (
8) we obtain
which is equivalent to
This, from
, implies
From Jacobian uniqueness condition (ii) we obtain
and this implies
and
. Combining
, we obtain that
is a nonsingular linear operator. □
Proposition 1. Let  be a Karush–Kuhn–Tucker point of Problem (1) and suppose the Jacobian uniqueness conditions are satisfied at . Then, there exist positive numbers  and  such that  is positive definite when  and .
Proof. If
is nonsingular, then
is differentiable at
and
Then, from (
4), we obtain for any vector
,
This implies for any vector
that
It follows from (ii) that the linear mapping
is onto. Then, we have from Lemma 1 that there exists
such that
is positive definite if
. Therefore, there exists a positive real number
such that
is positive definite if
and
. □
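The mechanism behind Proposition 1 is easiest to see in the equality-constrained analogue of [6], recalled here for intuition only (the semidefinite case adds the curvature of the projection term): since $L_c(x,\mu)=L(x,\mu)+\tfrac{c}{2}\|h(x)\|^2$ and $h(\bar x)=0$,

$$\nabla^2_{xx}L_c(\bar x,\bar\mu)=\nabla^2_{xx}L(\bar x,\bar\mu)+c\,\mathcal{J}h(\bar x)^{\mathsf T}\mathcal{J}h(\bar x),$$

and Lemma 1, applied with $\theta(w)=\langle w,\nabla^2_{xx}L(\bar x,\bar\mu)w\rangle$ and $\mathcal{A}=\mathcal{J}h(\bar x)$, yields positive definiteness for all $c\ge c_0$.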
Suppose that
is nonsingular such that
is differentiable at
. In this case, we define a linear operator:
Proposition 2. Let  be a Karush–Kuhn–Tucker point of Problem (1) and suppose the Jacobian uniqueness conditions are satisfied at . Then, there exists a positive real number  large enough such that  is nonsingular and  for some positive constant  if , where .
Proof. We divide the proof into two steps.
- Step 1:
We prove that, for sufficiently small ,  is nonsingular.
Since
we have from Lemma 3 that
is nonsingular. Now, we consider the case where
. Consider the equation
where
. This equation is equivalent to
From the second equality of (
11), we have
From the third equality of (
11), we have for
that
where
This implies the following relations
Then, multiplying
to the first equality of (
11), we obtain
which implies
from Proposition 1 when
. Therefore, we obtain
,
, and
so that
is nonsingular when
.
- Step 2:
We prove that
for some positive constant
if
is small enough.
Noting that, for
we have
and
. Therefore, we get
For any
we have that
For any matrix
, we have
where
We have from (
14) and (
15) that
Thus, we have, for
, that
Therefore, there exists a sufficiently large positive number  such that, for , if , then  is nonsingular and  for some positive constant . The proof is complete. □
Proposition 3. The corresponding Löwner operator F is twice (continuously) differentiable at X if and only if f is twice (continuously) differentiable at $\lambda_i(X)$, $i=1,\dots,m$.
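For reference, the Löwner operator associated with a scalar function $f$ is defined spectrally:

$$X=\sum_{i=1}^{m}\lambda_i(X)\,p_i p_i^{\mathsf T}\quad\Longrightarrow\quad F(X):=\sum_{i=1}^{m}f\bigl(\lambda_i(X)\bigr)\,p_i p_i^{\mathsf T},$$

so Proposition 3 transfers (twice, continuous) differentiability between $f$ on the spectrum of $X$ and $F$ on $S^m$.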
Proposition 4. Let  be a Karush–Kuhn–Tucker point of Problem (1) and suppose the Jacobian uniqueness conditions are satisfied at . Then, there exist , , and  such that, for ,  is nonsingular and  if  and .
Proof. We have from Proposition 2 that the operator
is nonsingular. Since the norm of
is less than 1 and
we have
Since
is twice continuously differentiable at
we obtain
for
. For
we obtain
where
Note for
that
we have
Therefore, from
we get
where
As
can be expressed as
we have from (
16) and (
17) that there exist
and
and for sufficiently small
,
is nonsingular if
, and
if
. □
3. The Convergence Rate of the Augmented Lagrange Method
In this section, we focus on the local convergence of the augmented Lagrange method for nonlinear semidefinite optimization problems under the Jacobian uniqueness conditions. We now estimate the solution error of the augmented Lagrange subproblem  and the error of the updated multiplier when  is around . The local convergence and the linear rate of multipliers can be obtained by using these estimates.
For a real number
define
Theorem 1. Let the Jacobian uniqueness conditions be satisfied at . Then, there exist ,  and  such that, for any , the problem
has a unique solution , which is smooth on . Moreover, for ,
where .
Proof. If
x is a local minimum point of Problem
then, from the definition of
, we obtain
Define
,
and
, note
then, the system (
20) is equivalent to
, where
Obviously, from the definition
in (
10), we have
. Then, from Proposition 4, we have that
is nonsingular when
.
Define
and
and
From the implicit function theorem, we have that there exists
with
,
and mapping
which is smooth on
and satisfies
From Propositions 1 and 4, we may choose  and  small enough such that the constraint nondegeneracy condition holds at , ,  is positive definite and  for all .
Differentiating the three equations in (
21) with respect to
, we obtain
where
. Define
and
. Then, we have from (
22), for
that
Noting that
for
and
, we obtain from (
23) and
that
Noting that
is twice continuously differentiable at
, we have
It is easy to check the equality
. Then, when
is chosen small enough, there exists a positive constant
such that
when
and
.
Combining this estimate with (
24), we obtain
Substituting
by
in (
25) yields
Since the choice of
in (
26) is arbitrary, we obtain
which implies
or
From the definitions of
and
K, we have that
It follows from (
21) that
and
Noting that
and
and
we have from Proposition 1 that
Thus,
is the unique solution of Problem (
18) and differentiable on
. Without loss of generality, suppose
and define
. Then, for any
, we obtain from (
27) that
which implies the estimates (
19). □
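Although the displays of (19) are not fully reproduced above, the shape of the estimates used in the sequel is (schematically, with $\kappa$ the constant of the theorem, $x(\mu,\Gamma,c)$ the subproblem solution, and $(\mu_c,\Gamma_c)$ the multiplier updated at it):

$$\|x(\mu,\Gamma,c)-\bar x\|\le\frac{\kappa}{c}\,\|(\mu,\Gamma)-(\bar\mu,\bar\Gamma)\|,\qquad
\|(\mu_c,\Gamma_c)-(\bar\mu,\bar\Gamma)\|\le\frac{\kappa}{c}\,\|(\mu,\Gamma)-(\bar\mu,\bar\Gamma)\|.$$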
According to the above theorem, it is easy to prove the local convergence properties of the augmented Lagrange method for the nonlinear semidefinite optimization problem.
Proposition 5. Let  satisfy the Jacobian uniqueness conditions. Let  and  be given in Theorem 1. Suppose that ,  and  satisfy
Then, the sequence  generated by the ALM converges to  with
if . The sequence  converges superlinearly to  when .
Proof. For the sequence
generated by the ALM, we obtain from Theorem 1 that
which implies
and
Suppose that
satisfies
and
, then for
, from Theorem 1 we have that
which implies
and
Therefore, by induction, we obtain that for any
and
. Then for
, we obtain
which implies
Noting that
and
is increasing, we obtain from the above inequality that
. The estimate in (
29) comes from (
30) and the rate of convergence is superlinear when
. □
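The superlinear claim can be read off from the shape of the bound: a contraction factor that scales like $1/c_k$ is driven to zero by an increasing penalty sequence. Schematically, with $\theta$ standing for the constant in (29),

$$\frac{\|(\mu^{k+1},\Gamma^{k+1})-(\bar\mu,\bar\Gamma)\|}{\|(\mu^{k},\Gamma^{k})-(\bar\mu,\bar\Gamma)\|}\le\frac{\theta}{c_k}\longrightarrow 0\qquad\text{as }c_k\uparrow\infty,$$

which is precisely superlinear convergence of the multiplier sequence.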
4. Asymptotic Superlinear Convergence of Multipliers
In Theorem 1, the convergence rate of the augmented Lagrange method is characterized by (19), which involves a constant . How to estimate  is an important question. In this section, we estimate  using the eigenvalues of the second-order derivative of the perturbation function of Problem (1).
Let
be a Karush–Kuhn–Tucker point of Problem (
1), consider the following system of equations in
,
then,
is a solution of (
32) for any , where
. According to the implicit function theorem, there exist a constant
and mappings
such that
and for
, where
,
Moreover, there exists
such that
for
. Define the function
as follows
In view of the Jacobian uniqueness conditions,
and
can be taken small enough so that
is actually a local minimum point in
of the following perturbed problem
Thus, the function
p is actually the following perturbation function:
Lemma 4. Suppose that the Jacobian uniqueness conditions hold and ε and δ are taken sufficiently small such that  is a local minimum point of Problem (35). Then,
Proof. We use
to denote the Lagrange function of Problem (
35), namely
Then,
is expressed as follows
Noting
and
, from the above expression of
we obtain
The proof is complete. □
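It may help to recall the equality-constrained analogue of ([6], Chapter 2), which this section parallels; it is stated here as background under standard NLP assumptions, not as the semidefinite result itself. With $p(u)=\min\{f(x):h(x)=u\}$ locally, one has $\nabla p(u)=-\mu(u)$, and the asymptotic contraction factor of the multiplier iteration is governed by the eigenvalues $e_1,\dots,e_l$ of $\nabla^2 p(0)$, roughly

$$\limsup_{k\to\infty}\frac{\|\mu^{k+1}-\bar\mu\|}{\|\mu^{k}-\bar\mu\|}\le\max_{1\le i\le l}\frac{|e_i|}{e_i+c}\qquad(\text{for } c \text{ with } e_i+c>0),$$

so the factor tends to 0 as $c\to\infty$. Lemmas 4 and 5 and Theorem 2 establish the semidefinite counterparts of these two facts.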
Lemma 5. Suppose that the Jacobian uniqueness conditions hold and ε and δ are taken sufficiently small so that  is a local minimum point of the perturbed problem (35). Then,
Proof. Differentiating (
33), we obtain
and
Denote
. Then, the Equations (
39) and (
40) can be written as
or
where
Thus, Equation (
41) is equivalent to
Therefore, we get that
which implies
It follows from page 20 of [
15] that the inverse of
can be expressed as
It is easy to check
which implies
where
Therefore, we have from (
42) and (
43) that
namely, the equality (
38) holds. □
Corollary 1. Let the Jacobian uniqueness conditions be satisfied at . Then,
where .
Proof. The equality (
38) is valid for all
u with
and all
large enough. For
, we have
which implies (
44) from (
37). □
By using the above properties, we are able to analyze the rate of convergence of multipliers generated by the augmented Lagrange method. For this purpose, we first give an equivalent expression of
, which is a key property for analyzing the superlinear rate of the sequence of multipliers.
Theorem 2. Let the Jacobian uniqueness conditions be satisfied at . Let , δ and ε be given by Theorem 1. Then, for all ,
where  is defined by
and .
Proof. Noting that
is equivalent to
= 0, we have
Differentiating the last three equations in (
47) with respect to
, we obtain
Denoting
and
, we have from (
48) that
We can easily obtain the following expression of
:
From the equality
, we have that
with
Thus, we have from (
49) that
which implies
Then, we get
Substituting , , , , and , we obtain the desired result. □
Theorem 3. Assume that  satisfies the Jacobian uniqueness conditions and that , δ and ε are the constants given by Theorem 1. Suppose that
Then, there exists a scalar  such that if  and  satisfy
then the sequence  generated by
is well-defined, and  and . Furthermore, if  and  for all k, then
while if  and  for all k, then
Proof. In view of
of (
46), we have that
Using (
44), we obtain
and thus for an eigenvalue
, one has
where
denotes the corresponding eigenvalue of
. It is obvious that for any
, there exists
such that for all
with
and
we have
where
denotes the spectral norm of the operator. Using (
45) for all
chosen as in the above, we have
From (
52) and (
53), we have
Thus, by choosing a sufficiently small
, we can determine that there exists
such that
for
with
and
. Combining this with (
19) and (
53), we obtain that
and
. The estimates (
55) and (
56) for the convergence rate can be easily obtained directly from (
57). □
5. Conclusions
In this paper, we have studied the convergence rate of the augmented Lagrangian method for the nonlinear semidefinite optimization problem. We have proven the local linear rate of convergence of the sequence of multipliers and shown that the ratio constant is proportional to $1/c$ when $c$ exceeds a threshold and the ratio  is sufficiently small. Importantly, based on the second-order derivative of the perturbation function of the nonlinear semidefinite optimization problem, we have obtained an accurate estimate of the rate constant of the linear convergence of the multiplier vectors generated by the augmented Lagrange method, which shows that the sequence of multipliers is superlinearly convergent if $\{c_k\}$ is increasing to ∞.
There are many unsolved problems left in the augmented Lagrange method for nonlinear semidefinite optimization problems. First, in Theorem 1, the result on the convergence rate of the augmented Lagrange method is obtained when the subproblems are solved exactly. A natural question is how to analyze the convergence rate of the ALM when the subproblems are solved inexactly. Second, all results in this paper concern local convergence of the augmented Lagrange method; globally convergent augmented Lagrangian methods are worth studying. Third, for estimating the rate constant of linear convergence, we need the strict complementarity condition, which is a critical assumption. What are the convergence properties of the augmented Lagrange method when this condition does not hold?