Abstract
Optimization problems with PDE constraints are widely used in engineering and technical fields. In some practical applications, it is necessary to smooth the control variables and suppress their large fluctuations, especially at the boundary. Therefore, we propose an elliptic PDE-constrained optimization model with a control gradient penalty term. However, introducing this penalty term increases the complexity and difficulty of the problems. To solve the problems numerically, we adopt the strategy of “First discretize, then optimize”. First, the finite element method is employed to discretize the optimization problems. Then, a heterogeneous strategy is introduced to formulate the augmented Lagrangian function for the subproblems. Subsequently, we propose a three-block inexact heterogeneous alternating direction method of multipliers (three-block ihADMM). Theoretically, we provide a global convergence analysis of the three-block ihADMM algorithm and discuss the iteration complexity results. Numerical results are provided to demonstrate the efficiency of the proposed algorithm.
MSC:
49M41; 65K10
1. Introduction
Elliptic PDE-constrained optimization problems are of significant importance in many fields, including engineering, physics, biology and finance. In engineering, they are essential for optimizing designs in structural mechanics, fluid dynamics, and electromagnetics. In physics, they are crucial to modeling heat transfer, quantum mechanics, and elasticity. In biology, these problems are integral to modeling diffusion processes, population dynamics, and biochemical reactions. Furthermore, in finance, they are vital for option pricing, risk management, and portfolio optimization. The versatility and applicability of elliptic PDE-constrained optimization problems make them a cornerstone for solving complex real-world challenges in many disciplines. For more details, see [1,2,3,4]. The general model is as follows:
where the domain Ω is convex, open, and bounded, and its boundary Γ is either smooth or polygonal. The desired state and the source term are denoted by y_d and f, respectively, and the regularization parameters are positive. The operator L denotes the uniformly elliptic differential operator:
where the coefficients are sufficiently regular and, for a specific positive constant, the following uniform ellipticity inequality holds:
With the development of science and engineering, PDE-constrained optimization problems in practical applications have become increasingly complex. In particular, the gradient penalty term plays a crucial role in PDE-constrained optimization problems. Referring to the total variation model in image restoration [5], Gao and Ding introduced a gradient penalty term to the objective function. This modification improves the model by ensuring the smoothness of the control forces while still minimizing the total control forces [6]. Moreover, Clever and Lang proposed the idea of adding a gradient penalty term to the state temperature, augmenting the objective function with a function that depends on the state gradient. This approach aims to minimize thermal stress within a glass [7]. The physical significance of the gradient penalty term is to constrain the gradient changes of the solution and promote its smoothness and continuity. This penalty term helps prevent abrupt changes or oscillations in the solution, particularly in problems requiring stability and continuity, such as fluid dynamics or structural optimization. By penalizing large gradient changes, the gradient penalty term guides the optimization process towards solutions exhibiting desired characteristics like smoothness and stability.
Inspired by the efficiency of the gradient penalty term and in order to smooth and suppress large changes in the control variable u, we propose an elliptic PDE-constrained optimization model with a control gradient penalty term and pointwise box constraints on the control:
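For orientation, a representative instance of such a model, written here in standard form (the symbols y_d, f, α, β, and the bounds a ≤ b stand in for the elided data; the precise formulation is the one denoted by (1)), reads:

$$
\begin{aligned}
\min_{y,\,u}\quad & \frac{1}{2}\|y-y_d\|_{L^2(\Omega)}^2+\frac{\alpha}{2}\|u\|_{L^2(\Omega)}^2+\frac{\beta}{2}\|\nabla u\|_{L^2(\Omega)}^2\\
\text{s.t.}\quad & Ly=u+f\ \text{in }\Omega,\qquad y=0\ \text{on }\Gamma,\\
& a\le u\le b\quad \text{a.e. in }\Omega.
\end{aligned}
$$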
To numerically solve (1), we apply the “First discretize, then optimize” strategy in [2]. The first step is to discretize the continuous elliptic PDE-constrained optimization problem. Specifically, we employ the finite element method due to its high efficiency. The second step is to select an optimization algorithm. Existing optimization algorithms for solving PDE-constrained optimization problems can be divided into two categories. One comprises the higher-order methods, also known as Hessian-type methods, which mainly include the semi-smooth Newton (SSN) method [8,9], the sequential quadratic programming (SQP) method [6] and so on. The other category comprises the first-order methods, also known as gradient-type methods, which mainly include the fast iterative shrinkage-thresholding algorithm (FISTA) [10], the alternating direction method of multipliers (ADMM) [11,12,13] and the accelerated block coordinate descent (ABCD) method [14]. Among them, the higher-order algorithms usually have locally super-linear convergence rates and can obtain high-accuracy solutions, but they are also computationally expensive. On the other hand, first-order algorithms have gained significant attention due to their simple iterative techniques, generally linear or sublinear convergence rates, and very low computational costs. In particular, the advantages of first-order optimization algorithms are more obvious when solving large-scale non-smooth optimization problems. This motivates us to further investigate the application of first-order methods in solving PDE-constrained optimization problems.
Inspired by the effectiveness of ADMMs in solving large-scale finite-dimensional optimization problems, as well as recent work on applying ADMMs to PDE-constrained optimization problems [11,12,13], we focus on an ADMM approach. In certain situations, computing the exact solution of each subproblem may be either impossible or extremely costly, even when it is theoretically possible. To overcome this difficulty, inexact ADMM algorithms have been extensively studied. In [15], Eckstein and Bertsekas first proposed an inexact version of an ADMM. After that, a proximal ADMM (PADMM) was proposed in [16] to make the subproblems easier to solve. For inexact versions of the PADMM algorithm, see [17,18]. Moreover, in [19], Chen et al. presented the sGS-imsPADMM method, which combines the inexact two-block semi-proximal ADMM with the inexact symmetric Gauss–Seidel (sGS) method. Then, Song et al. introduced an inexact heterogeneous ADMM (ihADMM) algorithm for sparse PDE-constrained optimization problems with L1-control cost in [11]. Unlike a standard ADMM, the ihADMM defines the augmented Lagrangian functions for the two subproblems using different weighted inner products. This algorithm was later applied to control-constrained elliptic optimal control problems, as described in [20]. Later, in [12], Chen et al. developed a multilevel ADMM (mADMM) algorithm by incorporating a multilevel approach with inexact subproblem solutions, which substantially lowered the computational overhead. Following the success of this multilevel approach, a multilevel heterogeneous ADMM (mhADMM) was further proposed in [13] to tackle optimal control problems with L1-control cost.
Motivated by the efficiency of ADMM-type algorithms, extending the classical two-block ADMM to the multiblock case is a natural idea. However, as shown in [21], multiblock extensions do not converge in general, and counterexamples illustrating nonconvergence exist. Despite this, there are specific problem structures for which multiblock ADMMs can still achieve convergence. In this paper, we propose a three-block inexact heterogeneous ADMM (three-block ihADMM) algorithm for solving a new elliptic PDE-constrained optimization model whose objective function contains a control gradient penalty term and whose control is subject to pointwise box constraints. Rather than being a simple extension of existing methods, our approach is specifically designed to address the complex structure of this problem. The three-block ihADMM employs a heterogeneous strategy to formulate the augmented Lagrangian functions and uses Krylov-based methods to inexactly solve the saddle point problem associated with the u-subproblem, while the other two subproblems have closed-form solutions, enhancing both the flexibility and applicability of the algorithm in handling complex optimization challenges. Theoretically, we rigorously prove the convergence of the three-block ihADMM algorithm and establish iteration complexity results, so that our approach offers a solution with solid convergence guarantees. This contribution provides both computational efficiency and theoretical rigor for solving multiblock optimization problems, particularly in cases where convergence has traditionally been challenging.
This paper is organized as follows. In Section 2, a new three-block ihADMM algorithm is proposed. Specifically, we first employ the standard piecewise linear finite element to discretize the original continuous problem. Then, we combine the heterogeneous strategy and the inexact strategy to propose the use of the three-block ihADMM algorithm for the discretized problem. In Section 3, we provide the theoretical convergence analysis and discuss the iteration complexity. Numerical experiments are presented in Section 4. In Section 5, we summarize the paper.
2. A Three-Block Inexact Heterogeneous Alternating Direction Method of Multipliers (Three-Block ihADMM)
2.1. An Inexact ADMM Algorithm in Continuous Space
To apply ADMM-type methods, we introduce artificial variables v and z to rewrite (1) in the following equivalent form:
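For concreteness, a representative form of this splitting, consistent with how the artificial variables are used below (v carries the gradient penalty and z carries the box constraint; the precise statement is the one denoted by (2)), reads:

$$
\min_{y,\,u,\,v,\,z}\ \frac{1}{2}\|y-y_d\|_{L^2(\Omega)}^2+\frac{\alpha}{2}\|u\|_{L^2(\Omega)}^2+\frac{\beta}{2}\|\nabla v\|_{L^2(\Omega)}^2+I_{[a,b]}(z)
\quad\text{s.t. } Ly=u+f,\ \ y|_{\Gamma}=0,\ \ v=u,\ \ z=u.
$$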
From Proposition 1 in [13], we know that the elliptic PDE involved in (1) has a unique weak solution. Let S denote the continuous linear solution operator; then, the state can be expressed as y = Su. The adjoint operator S* is also continuous and linear. It is evident that (1) exhibits strong convexity. Hence, from the equivalence between (1) and (2), the existence and uniqueness of the solution to (2) are guaranteed. The optimal solution can be characterized by the first-order optimality condition as follows.
Theorem 1.
A point is the optimal solution of (2) if and only if there exist an adjoint state and a Lagrange multiplier such that the following conditions are satisfied in the weak sense:
By employing the solution operator S, we can reformulate (2) into the following reduced form:
where the component functions are defined accordingly. The indicator function of the admissible set is defined as
Consequently, the augmented Lagrangian function for (3) is as follows:
where Lagrange multipliers and positive penalty parameters are associated with the two artificial constraints. The classical two-block ADMM in Hilbert space can be directly extended to the three-block version. Specifically, given an initial point, the ADMM iterative scheme is as follows:
In general, computing an exact solution for each subproblem may be expensive and not always necessary. Thus, employing Krylov subspace methods, such as the conjugate gradient (CG) method [22] and the generalized minimal residual (GMRES) method [23], to obtain approximate solutions of these subproblems is an effective strategy. These methods are particularly effective for large, sparse linear systems, as they avoid the high computational costs associated with direct methods like Gaussian elimination, and they yield approximate solutions within a specified tolerance, which is beneficial when exact solutions are either unnecessary or too costly to compute. Furthermore, Krylov subspace methods can be enhanced through preconditioning techniques that improve the system's conditioning, making them highly flexible and effective for various problems. Building on this concept, we propose the three-block inexact ADMM algorithm in continuous Hilbert space for (3), as detailed in Algorithm 1.
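To make the notion of an inexact subproblem solve concrete, the following MATLAB fragment is a minimal sketch: a Krylov method returns an approximate solution whose relative residual is controlled by a prescribed tolerance. The matrix A is a generic symmetric positive definite stand-in, not one of the operators of (3).

```matlab
% Minimal sketch: inexactly solve A*x = b with CG so that
% norm(b - A*x) <= tol_k * norm(b).  A 1D Laplacian serves as a
% generic SPD stand-in for a subproblem operator.
n     = 200;
A     = gallery('tridiag', n, -1, 2, -1);   % sparse SPD test matrix
b     = ones(n, 1);
tol_k = 1e-6;                               % tolerance at iteration k
[x, flag, relres, iter] = pcg(A, b, tol_k, 1000);
fprintf('pcg: flag = %d, relative residual = %.2e, iterations = %d\n', ...
        flag, relres, iter);
```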
Notice that the u-subproblem of Algorithm 1 is a convex differentiable optimization problem. By omitting the error vector , it can be equivalently reformulated as the following system:
Noticing the structure of this system and introducing the adjoint variable p, we can obtain
We point out that (4) can be rewritten as
Furthermore, we can eliminate p to obtain the reduced system:
where I denotes the identity operator. It is evident that the linear system (5) can be viewed as a particular instance of the generalized saddle point problem. Given the structure of this linear system, various Krylov-based methods can be utilized to inexactly solve it.
Algorithm 1 Three-block inexact ADMM algorithm for (3)
Input: Choose the initial point and the penalty parameters. Let the error-tolerance sequence satisfy the summability condition. Set k = 0.
Output: The final iterate.
Step 1: Compute an approximate solution of the u-subproblem such that its residual satisfies the prescribed tolerance.
Step 2: Compute the v-update.
Step 3: Compute the z-update.
Step 4: Update the first Lagrange multiplier.
Step 5: Update the second Lagrange multiplier.
Step 6: If the stopping criteria are met, stop. Otherwise, set k := k + 1 and return to Step 1.
Moreover, it is obvious that there exists a closed-form solution to the z-subproblem:
Although we have presented Algorithm 1 in function space, we need to discretize (1) to implement the numerical computation. The fact that both the u-subproblem and the z-subproblem of the algorithm in function space are well structured provides insight into the numerical discretization. The v-subproblem also has a special structure, which we discuss in the following subsection. In other words, the structure of (1) in function space should be exploited when performing the numerical discretization. Carrying well-formed structures such as (5) and (6) over to the discretized problem, so that the discretized problem can be solved by an efficient ADMM-type algorithm, is the starting point of the next subsection.
2.2. Finite Element Discretization
To numerically solve (1), we discretize both y and u using the standard piecewise linear finite element method [24]. We begin by considering a regular and quasi-uniform triangulation family of the domain. For each element T, we denote its diameter and define the diameter of the largest ball entirely contained in T. Let h denote the mesh size. We assume that the triangulation satisfies the standard regularity conditions typically used in finite element error estimates.
Assumption 1
([20], Assumption 2). There exist two positive constants such that the shape-regularity and quasi-uniformity bounds hold for every element of the triangulation. In addition, let the computational domain formed by the triangulation be given, with its interior and its boundary denoted accordingly. In the case that Ω is a convex polyhedral domain, the computational domain coincides with Ω. In the case that Ω is a domain with a smooth boundary Γ, we assume that the computational domain is convex, that all of its boundary vertices are contained in Γ, and that the measure of the boundary strip between Ω and the computational domain is bounded by a constant multiple of the square of the mesh size.
Given the homogeneous boundary condition of the state equation, we define the discretized state space
and the discretized control space
where represents the space of polynomials of degree less than or equal to 1.
For a given triangulation with its nodes, denote the nodal basis functions, which span the discretized state and control spaces and satisfy
Then, we have
Let
represent the corresponding coefficient vectors. Define
as the nodal projections of the control bounds onto the discretized control space. Let
denote the corresponding coefficient vectors. In addition, let the discretized feasible set be defined as
and define the bilinear form:
Then, we can formulate the discretized version of (1):
From the error estimation results in [25,26], we arrive at the following result.
Theorem 2.
Let u and u_h denote the optimal controls of (1) and (7), respectively. Then, we have
Proof.
The proof is similar to Proposition 4.5 in [26], so we omit it here. □
To rewrite (7) in its matrix–vector form, we define the following stiffness, mass, and lumped mass matrices:
Moreover, we have
where
Based on the matrix form mentioned above, it is evident that and are symmetric positive definite matrices, and is a symmetric matrix.
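As an illustration of these matrices, the following MATLAB sketch assembles the stiffness, consistent mass, and lumped mass matrices for piecewise linear elements on a uniform 1D mesh with homogeneous Dirichlet boundary nodes removed; the 2D assembly used in our experiments is handled analogously by iFEM [28]. The names Kh, Mh, and Wh are illustrative.

```matlab
% Sketch: P1 stiffness (Kh), consistent mass (Mh), and lumped mass (Wh)
% matrices on a uniform 1D mesh of [0,1].  Local element matrices:
% stiffness (1/h)*[1 -1; -1 1], mass (h/6)*[2 1; 1 2]; lumping sums
% each row of Mh onto the diagonal.
n  = 64;                     % number of elements
h  = 1 / n;
e  = ones(n - 1, 1);         % n-1 interior nodes (Dirichlet BCs removed)
Kh = (1 / h) * spdiags([-e, 2 * e, -e], -1:1, n - 1, n - 1);
Mh = (h / 6) * spdiags([ e, 4 * e,  e], -1:1, n - 1, n - 1);
Wh = spdiags(full(sum(Mh, 2)), 0, n - 1, n - 1);  % diagonal lumped mass
```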
Consequently, based on the above representation, we can rewrite (7) in the following matrix–vector form:
Similar to (1), problem (8) is uniquely solvable. Moreover, we can derive the important discrete first-order optimality conditions.
Theorem 3.
A point is the optimal solution of (8) if and only if there exist Lagrangian multipliers such that the following conditions hold:
For the convenience of representation, let
Then, the problem (8) can be rewritten as
The augmented Lagrangian function is defined as
Then, the iteration format of the ADMM is as follows:
Notice that the -subproblem is equivalent to the following linear system:
Let
represent the discretized state and
represent the discretized adjoint state, respectively. Consequently, (12) can be reformulated as
By eliminating the variable , (13) can be reformulated as follows:
Notice that (14) contains the inverse of the matrix , which inevitably leads to additional computations. To reduce the computational cost, we introduce a heterogeneous strategy. Specifically, we employ various weighted inner products to formulate the augmented Lagrangian functions. On one hand, for the -subproblem and -subproblem, given the penalty parameters , we define the corresponding augmented Lagrangian function based on an -weighted inner product as follows:
On the other hand, for the -subproblem, we define the corresponding augmented Lagrangian function based on the -weighted inner product as follows:
Based on the above augmented Lagrangian functions, we write the iterative format of the ADMM:
where denotes the step size.
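For clarity, the weighted inner products used here act as follows: for a symmetric positive definite matrix Q, the inner product and induced norm are defined below, so that a representative M_h-weighted augmentation term takes the displayed form (a representative form under the splitting above; the precise terms are those displayed in the augmented Lagrangian functions):

$$
\langle x, y\rangle_Q = x^\top Q\, y,\qquad \|x\|_Q^2 = x^\top Q\, x,\qquad
\frac{\sigma_1}{2}\Big\|u - v + \frac{\lambda}{\sigma_1}\Big\|_{M_h}^2 .
$$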
Benefiting from such a heterogeneous strategy, each subproblem of the three-block ihADMM algorithm can be efficiently implemented. Next, we provide the specific iteration process.
For the -subproblem, let
then, we have
Consequently, the solution of the -subproblem can be given as follows:
Moreover, it is easy to see that the -subproblem has a closed-form solution:
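In code, this projection is a single componentwise clamping operation. The following MATLAB sketch uses placeholder data; w stands for the intermediate quantity entering the z-subproblem, and ua, ub for the nodal bound vectors (all names illustrative).

```matlab
% Sketch: closed-form z-update as a componentwise projection onto the
% box [ua, ub]; w is the intermediate iterate entering the z-subproblem
% (placeholder data, illustrative names).
m     = 10;
w     = randn(m, 1);          % intermediate iterate
ua    = -0.5 * ones(m, 1);    % lower bound vector
ub    =  0.5 * ones(m, 1);    % upper bound vector
z_new = min(max(w, ua), ub);  % projection onto [ua, ub]
```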
In addition, notice that the -subproblem at the kth iteration is equivalent to the linear system:
Similar to the steps in (12) and (13), (16) can be rewritten as
By eliminating the variable , (17) can be reformulated in a reduced form without incurring any additional computational cost:
We point out that (18) is a special case of the generalized saddle point problem. Therefore, Krylov-based methods can be employed to inexactly solve it. The GMRES method [23] is particularly well suited to this task: it efficiently handles large, sparse, nonsymmetric, and indefinite linear systems, has well-understood convergence properties, admits flexible preconditioning, and can be parallelized to improve computational efficiency. Let the residual error vector be defined as the quantity satisfying
In the numerical experiments, we require that the residual vector satisfies
to guarantee that the error vector satisfies
Moreover, to improve the convergence rate of the GMRES method, we employ the PMHSS preconditioner in [27]. Specifically, the PMHSS preconditioner is defined as
where .
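To illustrate the mechanics, though not the exact blocks of (18), the sketch below assembles a generic two-by-two block saddle point system, applies a block-diagonal preconditioner through a function handle, and solves the system with MATLAB's gmres. The preconditioner here is a simple stand-in; the PMHSS preconditioner of [27] would be supplied through the same interface.

```matlab
% Sketch: preconditioned GMRES on a generic 2x2 block saddle point
% system [Mh, Kh; Kh, -(1/beta)*Mh] * [y; p] = rhs.  The 1D P1 blocks
% stand in for those of (18); the block-diagonal preconditioner is a
% stand-in for the PMHSS preconditioner [27].
n  = 64;  h = 1 / n;  beta = 1e-2;
e  = ones(n - 1, 1);
Kh = (1 / h) * spdiags([-e, 2 * e, -e], -1:1, n - 1, n - 1);
Mh = (h / 6) * spdiags([ e, 4 * e,  e], -1:1, n - 1, n - 1);
A   = [Mh, Kh; Kh, -(1 / beta) * Mh];
rhs = [Mh * ones(n - 1, 1); zeros(n - 1, 1)];
% Block-diagonal preconditioner applied as a function handle: gmres
% expects the handle to return P \ r.
P = @(r) [ (Mh + sqrt(beta) * Kh) \ r(1:n-1); ...
           ((1 / beta) * Mh + (1 / sqrt(beta)) * Kh) \ r(n:end) ];
[x, flag, relres] = gmres(A, rhs, [], 1e-8, 200, P);
fprintf('gmres: flag = %d, relative residual = %.2e\n', flag, relres);
```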
We point out that the u-subproblem involves solving a large-scale linear system, which constitutes the primary computational expense of the algorithm. The lumped-mass weighting technique reduces this block system to one with fewer blocks without any extra computational cost. The weighted inner product also provides a closed-form solution for the v-subproblem. Additionally, it gives the z-subproblem a separable structure, allowing for a closed-form solution that can be computed with a single projection operation.
In Algorithm 2, we present the matrix–vector form of the three-block inexact heterogeneous ADMM (three-block ihADMM) for (11).
Algorithm 2 Three-block inexact heterogeneous ADMM (three-block ihADMM) for (11)
Input: Choose the initial point and the penalty parameters. Let the error-tolerance sequence satisfy the summability condition. Set k = 0.
Output: The final iterate.
Step 1: Inexactly solve (18) by the GMRES method to obtain the u-update.
Step 2: Compute the closed-form v-update.
Step 3: Compute the z-update by projection.
Step 4: Update the first Lagrange multiplier.
Step 5: Update the second Lagrange multiplier.
Step 6: If the stopping criteria are met, stop. Otherwise, set k := k + 1 and return to Step 1.
Furthermore, we notice that each subproblem of Algorithm 2 is a discretized form of each subproblem of Algorithm 1, which illustrates the benefit of using different weighted inner products for discretization.
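To make the overall flow explicit, the following MATLAB skeleton sketches the main loop of Algorithm 2 under the splitting u = v, u = z, with multipliers lam and mu, penalty parameters sigma1 and sigma2, and step size tau; solve_u, solve_v, and solve_z are illustrative function-handle names abstracting the three subproblem solvers, and the residual-based stopping test is a placeholder for the KKT-based criterion of Section 4.

```matlab
% Structural sketch of the three-block ihADMM main loop (Algorithm 2).
% solve_u inexactly solves (18) by GMRES with tolerance tol(k); solve_v
% applies the closed-form v-update; solve_z projects onto the box.
% Names and the stopping test are illustrative placeholders.
function [u, v, z] = ihadmm_sketch(solve_u, solve_v, solve_z, ...
                                   u, v, z, lam, mu, ...
                                   sigma1, sigma2, tau, tol, maxit, stop_tol)
for k = 1:maxit
    u   = solve_u(v, z, lam, mu, tol(k)); % Step 1: inexact u-update
    v   = solve_v(u, lam);                % Step 2: closed-form v-update
    z   = solve_z(u, mu);                 % Step 3: z-update by projection
    lam = lam + tau * sigma1 * (u - v);   % Step 4: multiplier update
    mu  = mu  + tau * sigma2 * (u - z);   % Step 5: multiplier update
    if max(norm(u - v), norm(u - z)) < stop_tol  % Step 6: stopping test
        break;
    end
end
end
```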
3. Convergence Analysis
In this section, we present the convergence analysis and discuss the iteration complexity of Algorithm 2. For computational convenience, we first introduce the following two lemmas.
Lemma 1
([20], Appendix). There are two basic identities:
where and are from the same Euclidean space, and Q denotes a self-adjoint positive semidefinite linear operator.
Lemma 2
([19], Lemma 6.1). If the sequence satisfies
Then, we have
To facilitate the following iteration complexity analysis, we define the function:
We know from the definitions in (10) that , , and are proper, convex, and closed functions. Since the matrices and are both symmetric and positive definite, the gradient operators and are strongly monotone. Therefore, we have
where matrices and are also symmetric and positive definite. In addition, we know that is a maximal monotone operator, i.e.,
Let
denote the exact solutions at the kth iteration in (15). Then, the gap between the exact and inexact solutions is estimated in the following lemma.
Lemma 3.
Let be the sequence generated by Algorithm 2 and be the sequence defined by (23). Then, for any , we have
where , , and the error tolerance .
Proof.
According to the optimality conditions, we have
respectively.
Therefore,
Then, (24a) holds. The projection is nonexpansive; thus, we have
Furthermore,
Therefore, we complete the proof. □
Subsequently, we define
Furthermore, we introduce the following two key propositions that are crucial for analyzing the global convergence and discussing the iterative complexity of the three-block ihADMM algorithm.
Proposition 1.
Let denote the sequence generated by Algorithm 2 and denote the Karush–Kuhn–Tucker (KKT) point of problem (11). Then, for any , we have
Proof.
We know from the optimality conditions that
respectively. Thus, using Theorem 3, (26a)–(26c) and taking , , , , and in (21a), (21b), and (22), we obtain
Adding the above three inequalities together, we obtain
where we have used the fact that
Then, we will reformulate the last four terms on the left side of (27). First, using (19a), we obtain
Second, we can obtain the following result by employing (19b) and :
Then, substituting (28), (29), and (30) into (27), we can obtain (25). □
Proposition 2.
Let denote the sequence generated by Algorithm 2 and denote the KKT point of (11). are defined by (23). Then, for any , we have
Proof.
Similar to Proposition 1, we can finish the proof by replacing , , and with , , and in the proof of Proposition 1. □
Theorem 4.
Let the KKT point of (8) and the sequence generated by Algorithm 2 be given, together with the associated state and adjoint state. Then, it follows that
Furthermore, there is a constant C that depends only on and , such that
where is defined in (20).
Proof.
We first demonstrate the global convergence of the iteration sequences. To begin, we need to prove that the generated sequence is bounded. To achieve this, we define the following two sequences:
For any and , we know from Proposition 2 that . Consequently, we obtain
Using Lemma 3, we can obtain
Then we have
Hence, for any , we have
Given the summability of the error tolerances, the bound above holds for all k. Then, the two auxiliary sequences are both bounded. Given their definitions, it follows that the corresponding iterate sequences are also bounded. Additionally, the boundedness of the remaining sequence is ensured by the updating rules for the multipliers. Therefore, the whole generated sequence is bounded. This implies that there exists a subsequence converging to an accumulation point. Next, we demonstrate that this accumulation point satisfies the KKT conditions and hence coincides with the KKT point.
By applying Proposition 2, we derive that
which means
We know from Lemma 3 that
By taking the limits of both sides of (37a)–(37c), we can derive that
Obviously, according to (38), the three sequences above are convergent, and their limits are the corresponding components of the accumulation point. Thus, all that remains is to verify the final optimality relation. To achieve this, we take the limits in (26a) to obtain
This leads to
Consequently, we know from (9a) that
Similarly, we have
which results in
Then, from (9b), we know that ; thus,
Moreover, to finalize the proof, we have to demonstrate that the sequence converges to . According to (36), it follows that
Because , we have
Then, we obtain
Therefore, we have established the convergence of the whole sequence, thus completing the proof of (32). The proof of (33) follows straightforwardly from the definition, so we omit the detailed steps here.
Finally, we give the proofs of (34) and (35), which provide the iteration complexity results for the sequence generated by the three-block ihADMM algorithm. First, we know from the optimality conditions (26a)–(26c) that
Then, according to the definitions above, we have
where
Next, we search for an upper bound of this quantity. First, based on its definition and (36), we can easily obtain that
Then, we can obtain an upper bound of :
Then, we know from (31) in Proposition 2 that
Hence, we have
Then, by substituting (40) into (39), we obtain
Thus, by applying Lemma 2, (34)–(35) hold. As a result, by integrating the global convergence results, we have finalized the proof for Theorem 4. □
4. Numerical Experiments
In this section, we demonstrate the numerical results of incorporating a control gradient penalty term into the cost function of elliptic PDE-constrained optimization problems. We utilize the proposed three-block ihADMM algorithm to solve the problems. For comparison, the numerical results derived from the classical ADMM algorithm are also provided.
4.1. Algorithmic Details
First, we outline the algorithmic details that are common to both examples.
Discretization: The standard piecewise linear finite element method was employed for the discretization process.
Initialization: We chose zero as the initial value.
Parameter setting: The step length was chosen as for both algorithms. For comparison, in each example, the two algorithms used the same parameters, and . The parameter was selected as the optimal parameter based on extensive numerical experiments. When using the three-block ihADMM algorithm to solve elliptic PDE-constrained optimization problems with a control gradient penalty term, the choice of the control gradient penalty coefficient is crucial for several reasons:
1. Numerical stability: Too large a value for the parameter can lead to numerical instability. This may manifest as slow convergence or even failure to converge, as well as erratic behavior in the numerical solution.
2. Convergence speed: If is too large, it may slow down the convergence rate of the algorithm, requiring more iterations to meet the convergence criteria.
3. Impact on solution quality: The choice of the penalty term coefficient affects the quality of the final optimal solution. An excessively large coefficient may overly penalize the solution, causing it to deviate from the true optimal solution of the problem. Conversely, a coefficient that is too small may result in suboptimal precision in the optimization outcome.
Therefore, it is essential to adjust and balance the coefficient of the control gradient penalty term appropriately to ensure stable, fast convergence to an accurate optimal solution.
Terminal condition: Let h denote the mesh size. In our numerical experiments, cases with different values for h were considered. Let #dofs represent the number of degrees of freedom associated with the control variables in each grid layer. To evaluate the accuracy of an approximate optimal solution, we assessed the corresponding KKT residual error. The algorithms were terminated when , where
and
The maximum iteration number was set to 600.
Computational environment: The results presented in this paper were generated using MATLAB R2023b with the FEM package iFEM [28] running on a desktop computer with Intel Core i7 CPU (2.10 GHz) with 32 GB of RAM.
4.2. Examples
Example 1.
(Example 3.3 in [2] with an additional control gradient penalty term)
Consider
where . The desired state and the parameters are set to and
In this example, the penalty parameters , , and were chosen as , , and . The exact solutions of this problem cannot be determined in advance; therefore, we utilized numerical solutions obtained on a sufficiently fine grid as the benchmark reference.
The numerical optimal controls obtained by the three-block ihADMM algorithm on the corresponding grid are displayed in Figure 1.
Figure 1.
The numerical optimal control, , with different values of parameter for Example 1. (a) with . (b) with . (c) with . (d) with .
From Figure 1, we can see that for the smallest value of the penalty parameter, the numerical optimal control is not smooth in the cross-section, while for larger values, the cross-section becomes smooth. Moreover, as the parameter of the gradient penalty term increases, the numerical optimal control becomes smoother and smoother. This shows that adding a control gradient penalty term to the PDE-constrained optimization problem yields a solution with smoothing properties. We also point out that a larger penalty term strongly discourages large control gradients, leading to a smoother control solution, while a smaller penalty term may provide insufficient smoothing. The magnitude of the control gradient penalty term should therefore be chosen carefully to balance the desired smoothness of the control strategy with the specific requirements and constraints of the problem.
As an example, we show the numerical results of in Table 1. For the proposed three-block ihADMM algorithm and the classical ADMM algorithm, the mesh size, h; the number of degrees of freedom, #dofs; the KKT residual, ; the CPU time; and the number of iterations, #iter, are reported. We would like to point out that each case in Table 1 represents a different size of mesh. In each case, the three-block ihADMM algorithm and the classical ADMM algorithm were terminated with the same stopping criteria. Therefore, #iter and CPU time were compared when the KKT residuals for each mesh were of the same order of magnitude.
Table 1.
The convergence behavior of the three-block ihADMM algorithm and the classical ADMM algorithm.
From each row of Table 1, it is evident that the three-block ihADMM algorithm requires fewer iterations and is significantly faster than the classical ADMM algorithm in obtaining approximate solutions across different final mesh sizes. Specifically, the fifth and sixth columns of Table 1 show that the three-block ihADMM algorithm outperforms the classical ADMM algorithm in terms of both CPU time and the number of iterations as the final mesh size decreases. Furthermore, the results show that as the discrete mesh size becomes finer, the number of iterations and computational time required by the classical ADMM algorithm increase faster than for the three-block ihADMM algorithm. This highlights the efficiency of the proposed three-block ihADMM algorithm in achieving moderate accuracy while reducing computational cost.
Example 2.
(Example 4.1 in [9] with an additional control gradient penalty term)
Consider
where . The parameters are set as and . The desired state was chosen as , where . S represents the solution operator associated with .
The penalty parameters , , and were chosen as , , and . The numerical optimal control on the corresponding grid is displayed in Figure 2.
Figure 2.
The numerical optimal control, , with different values of parameter for Example 2. (a) with . (b) with . (c) with . (d) with .
We can see from Figure 2 that for the smallest value of the penalty parameter, the numerical optimal control is not smooth in the cross-section at points a and b. For larger values, the upper and lower boundaries become smooth. Specifically, as the parameter of the gradient penalty term increases, the numerical optimal control becomes smoother and smoother at the upper and lower boundaries. This shows that introducing a control gradient penalty term into the objective function of PDE-constrained optimization problems can yield a smoother control curve.
As an example, we present the numerical results for for both the three-block ihADMM algorithm and the classical ADMM algorithm in Table 2.
Table 2.
The convergence behavior of the three-block ihADMM algorithm and the classical ADMM algorithm.
The results presented in Table 2 indicate that the proposed three-block ihADMM algorithm demonstrates superior numerical efficiency regarding both the number of iterations and computational time when compared to the classical ADMM algorithm. Furthermore, the fifth and sixth columns of Table 2 show that as the mesh size decreases, the three-block ihADMM algorithm exhibits a more significant advantage in terms of the number of iterations and the computational time compared to the classical ADMM algorithm.
5. Conclusions
In this paper, a new elliptic PDE-constrained optimization model with a control gradient penalty term and pointwise box constraints on the control is proposed to address the application requirements. The main purpose of adding the control gradient penalty term is to smooth the control variable, particularly at the boundary. Nevertheless, adding this penalty term makes the system more complex to solve. To tackle this problem, we introduce two artificial variables and propose a convergent three-block ihADMM algorithm. Unlike the classical ADMM algorithm, the proposed algorithm introduces the inexactness strategy. Moreover, different weighted inner products are adopted to define the augmented Lagrangian function in three subproblems. Theoretically, we present the global convergence analysis and provide theoretical results on the iteration complexity for the proposed algorithm. The numerical experiments demonstrate that the three-block ihADMM algorithm significantly outperforms the classical ADMM algorithm in terms of both the number of iterations and computational time across various final mesh sizes. Moreover, as the final mesh size decreases, the efficiency of the three-block ihADMM algorithm becomes even more obvious.
Author Contributions
Conceptualization, X.C. and X.S.; methodology, X.C. and X.S.; software, T.W.; validation, X.C.; formal analysis, T.W. and X.C.; writing—original draft preparation, X.C. and T.W.; writing—review and editing, X.S.; visualization, T.W.; supervision, X.S.; funding acquisition, X.C. and X.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Key R&D Program of China (No. 2023YFA1011303), the National Natural Science Foundation of China (No. 12301477, No. 12301479, No. 42274166), the Fundamental Scientific Research Projects of Higher Education Institutions of Liaoning Provincial Department of Education (No. JYTMS20230165), and the Fundamental Research Funds for the Central Universities (No. 3132024202).
Data Availability Statement
The data presented in this study are available upon request from the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Huang, F.; Chen, Y.; Chen, Y.; Sun, H. Stochastic collocation for optimal control problems with stochastic PDE constraints by meshless techniques. J. Math. Anal. Appl. 2024, 530, 127634.
- Hinze, M.; Pinnau, R.; Ulbrich, M.; Ulbrich, S. Optimization with PDE Constraints; Springer: Berlin, Germany, 2009; pp. 171–172.
- Zheng, Q. A reordering-based preconditioner for elliptic PDE-constrained optimization problems with small Tikhonov parameters. Comput. Appl. Math. 2023, 42, 169.
- Sirignano, J.; MacArt, J.; Spiliopoulos, K. PDE-constrained models with neural network terms: Optimization and global convergence. J. Comput. Phys. 2023, 481, 112016.
- Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D 1992, 60, 259–268.
- Gao, L.; Ding, J. An improved DMOC method with gradient penalty term. Adv. Mater. Res. 2014, 945, 2784–2787.
- Clever, D.; Lang, J. Optimal control of radiative heat transfer in glass cooling with restrictions on the temperature gradient. Optim. Control Appl. Methods 2012, 33, 157–175.
- Hinze, M.; Vierling, M. The semi-smooth Newton method for variationally discretized control constrained elliptic optimal control problems; implementation, convergence and globalization. Optim. Methods Softw. 2012, 27, 933–950.
- Liu, Y.; Wen, Z.; Yin, W. A multiscale semi-smooth Newton method for optimal transport. J. Sci. Comput. 2022, 91, 39.
- Liang, J.; Luo, T.; Schönlieb, C.-B. Improving “fast iterative shrinkage-thresholding algorithm”: Faster, smarter, and greedier. SIAM J. Sci. Comput. 2022, 44, A1069–A1091.
- Song, X.; Yu, B.; Wang, Y.; Zhang, X. An FE-inexact heterogeneous ADMM for elliptic optimal control problems with L1-control cost. J. Syst. Sci. Complex. 2018, 31, 1659–1697.
- Chen, X.; Song, X.; Chen, Z.; Yu, B. A multi-level ADMM algorithm for elliptic PDE-constrained optimization problems. Comput. Appl. Math. 2020, 39, 1–31.
- Chen, X.; Song, X.; Chen, Z.; Xu, L. A multilevel heterogeneous ADMM algorithm for elliptic optimal control problems with L1-control cost. Mathematics 2023, 11, 570.
- Chen, Z.; Song, X.; Chen, X.; Yu, B. A warm-start FE-dABCD algorithm for elliptic optimal control problems with constraints on the control and the gradient of the state. Comput. Math. Appl. 2024, 161, 1–12.
- Eckstein, J.; Bertsekas, D. On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 1992, 55, 293–318.
- Eckstein, J. Some saddle-function splitting methods for convex programming. Optim. Methods Softw. 1994, 4, 75–83.
- He, B.; Liao, L.; Han, D.; Yang, H. A new inexact alternating directions method for monotone variational inequalities. Math. Program. 2002, 92, 103–118.
- Ng, M.K.; Wang, F.; Yuan, X. Inexact alternating direction methods for image recovery. SIAM J. Sci. Comput. 2011, 33, 1643–1668.
- Chen, L.; Sun, D.; Toh, K.-C. An efficient inexact symmetric Gauss–Seidel based majorized ADMM for high-dimensional convex composite conic programming. Math. Program. 2017, 161, 237–270.
- Song, X.; Yu, B. A two-phase strategy for control constrained elliptic optimal control problems. Numer. Linear Algebra Appl. 2018, 25, 21–38.
- Chen, C.; He, B.; Ye, Y.; Yuan, X. The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Math. Program. 2016, 155, 57–79.
- Lasdon, L.; Mitter, S.; Waren, A. The conjugate gradient method for optimal control problems. IEEE Trans. Autom. Control 1967, 12, 132–138.
- Saad, Y.; Schultz, M.H. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1986, 7, 856–869.
- Mora, D.; Rodríguez, R. A piecewise linear finite element method for the buckling and the vibration problems of thin plates. Math. Comput. 2009, 78, 1891–1917.
- Casas, E. Using piecewise linear functions in the numerical approximation of semilinear elliptic control problems. Adv. Comput. Math. 2007, 26, 137–153.
- Wachsmuth, G.; Wachsmuth, D. Convergence and regularization results for optimal control problems with sparsity functional. ESAIM Control Optim. Calc. Var. 2011, 17, 858–886.
- Cao, S.; Wang, Z. PMHSS iteration method and preconditioners for Stokes control PDE-constrained optimization problems. Numer. Algorithms 2021, 87, 365–380.
- Chen, L. iFEM: An Integrated Finite Element Methods Package in MATLAB; Technical Report; University of California, Irvine: Irvine, CA, USA, 2008.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

