### *2.2. Asymmetric Penalty Regularization Model*

Compared with the optimization algorithms of Equations (8) and (9), to estimate the fault transient impulses *x* precisely, this work introduces a novel penalty regularization method, i.e., an asymmetric and symmetric nonconvex penalty regularization model,

$$\overset{\triangle}{\mathbf{x}} = \operatorname\*{argmin}\_{\mathbf{x}} \left\{ F(\mathbf{x}) = \frac{1}{2} \| H(\mathbf{y} - \mathbf{x}) \|\_{2}^{2} + \lambda\_{0} \sum\_{n=0}^{N-1} \theta\_{t}(\mathbf{x}\_{n}; r) + \sum\_{i=1}^{M} \lambda\_{i} \sum\_{n=0}^{N-1} \phi([\mathbf{D}\_{i} \mathbf{x}]\_{n}) \right\} \tag{13}$$

where *F*(*x*) is the proposed objective cost function (OCF), the penalty function *θε*(*xn*;*r*) is an asymmetric and differentiable function, and *φ*([*Di x*]*n*) is a symmetric and differentiable function.


If *M* = 2, the matrix *D*1 is defined as the first-order difference matrix

$$\mathbf{D}\_1 = \begin{bmatrix} -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{bmatrix}$$

and the matrix *D*2 is defined as the second-order difference matrix

$$\mathbf{D}\_2 = \begin{bmatrix} -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \end{bmatrix}$$

The innovations of the novel compound regularizer model are as follows:


Based on this, the core issues of the proposed algorithm are (1) how to construct a symmetric and differentiable penalty function; (2) how to construct an asymmetric and differentiable penalty function; and (3) how to solve the proposed method based on the MM algorithm and make the diagnosis results more accurate than those of the traditional LFLO and nonconvex penalty regularization approaches.
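As a concrete illustration of the difference matrices *D*1 and *D*2 defined above, the following NumPy sketch builds them for a short signal (a dense construction for clarity only; banded or sparse storage would be preferred in practice):

```python
import numpy as np

def diff_matrix(N, order):
    """Difference matrix for a length-N signal.

    order=1 gives rows [-1, 1] (shape (N-1, N)), i.e. D1 in the text;
    order=2 gives rows [-1, 2, -1] (shape (N-2, N)), i.e. D2.
    """
    D = np.diff(np.eye(N), n=order, axis=0)
    # np.diff yields rows [1, -2, 1] for order 2; negate to match [-1, 2, -1]
    return -D if order == 2 else D

D1 = diff_matrix(8, 1)
D2 = diff_matrix(8, 2)
```

Applying *D*1 to a constant signal or *D*2 to a linear ramp gives zero, the usual sanity check for difference operators.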

**For the first issue**, the traditional LFLO regularization approach uses the absolute value function *φA*(*x*) = |*x*| as the penalty function; however, a common drawback of *φA*(*x*) is that it is non-differentiable at zero, which can cause numerical problems. To address this problem, a non-linear approximation function *φB*(*x*) or *φC*(*x*) is proposed, i.e.,

$$
\phi\_B(\mathbf{x}) = \sqrt{|\mathbf{x}|^2 + \varepsilon} \tag{14}
$$

$$\phi\_{C}(\mathbf{x}) = |\mathbf{x}| - \varepsilon \log(|\mathbf{x}| + \varepsilon) \tag{15}$$

Note that when *ε* = 0, *φB*(*x*) and *φC*(*x*) degrade into the absolute value function *φA*(*x*), while for *ε* > 0, *φB*(*x*) and *φC*(*x*) are differentiable at zero. The functions *φA*(*x*), *φB*(*x*), *φC*(*x*) and their first-order derivatives are listed in Table 1.


**Table 1.** Symmetric penalty functions and their derivatives.

| Penalty function | First-order derivative |
| --- | --- |
| $\phi\_A(x) = \lvert x \rvert$ | $\phi'\_A(x) = \operatorname{sign}(x)$ |
| $\phi\_B(x) = \sqrt{x^2 + \varepsilon}$ | $\phi'\_B(x) = x / \sqrt{x^2 + \varepsilon}$ |
| $\phi\_C(x) = \lvert x \rvert - \varepsilon \log(\lvert x \rvert + \varepsilon)$ | $\phi'\_C(x) = x / (\lvert x \rvert + \varepsilon)$ |

In order for the non-linear approximation functions to maintain the reliable sparsity-inducing behavior of the original LFLO algorithm, the parameter *ε* should be set to an adequately small positive value. For example, *ε* = 10<sup>−5</sup> or *ε* = 10<sup>−6</sup> is small enough that the numerical issues can be avoided.
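The behavior described above can be verified with a short NumPy sketch (*ε* = 10⁻⁶ as suggested in the text; the evaluation grid is illustrative):

```python
import numpy as np

eps = 1e-6

def phi_A(x):            # absolute value penalty, non-differentiable at zero
    return np.abs(x)

def phi_B(x):            # Eq. (14): smooth approximation sqrt(x^2 + eps)
    return np.sqrt(x**2 + eps)

def phi_C(x):            # Eq. (15): |x| - eps*log(|x| + eps)
    return np.abs(x) - eps * np.log(np.abs(x) + eps)

x = np.linspace(-1.0, 1.0, 101)
# The smooth penalty stays uniformly close to |x| for small eps ...
err_B = np.max(np.abs(phi_B(x) - phi_A(x)))
# ... and its derivative vanishes at zero, removing the kink of |x|
h = 1e-8
slope_B_at_0 = (phi_B(h) - phi_B(-h)) / (2 * h)
```

The maximum gap between *φB* and |*x*| is √*ε* (attained at *x* = 0), so the approximation error is directly controlled by *ε*.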

**For the second issue**, inspired by the absolute value function *φA*(*x*) = |*x*|, and in contrast to the symmetric and differentiable penalty functions *φB*(*x*) and *φC*(*x*), here a segmented function is proposed as follows,

$$\theta\_{\varepsilon}(\mathbf{x}) = \begin{cases} \mathbf{x}, & \mathbf{x} > \varepsilon \\ f(\mathbf{x}), & |\mathbf{x}| \le \varepsilon \\ -r\mathbf{x}, & \mathbf{x} < -\varepsilon \end{cases} \tag{16}$$

where *r* > 0 is a positive constant. It should be noted that if we exclude the intermediate function *f*(*x*), then *θε*(*x*) also degrades into the absolute value function *φA*(*x*) when *r* = 1. Therefore, the main problem of Equation (16) becomes the task of constructing the intermediate function *f*(*x*) on −*ε* ≤ *x* ≤ *ε*. To address this issue, we seek a majorizer (here the majorization-minimization algorithm is utilized) as the approximation function of *f*(*x*) on −*ε* ≤ *x* ≤ *ε*. In order to eliminate the issue that the penalty function is non-differentiable at zero, a simple quadratic equation (QE) is introduced accordingly,

$$g(\mathbf{x}, \upsilon) = a\mathbf{x}^2 + b\mathbf{x} + c \tag{17}$$

According to the theory of majorization-minimization [39,40], we have,

$$\begin{cases} g(\upsilon, \upsilon) = \theta(\upsilon, r), g'(\upsilon, \upsilon) = \theta'(\upsilon, r) \\ g(s, \upsilon) = \theta(s, r), g'(s, \upsilon) = \theta'(s, r) \end{cases} \tag{18}$$

The parameters *a*, *b*, *c*, and *s* are all functions of *v*; we have,

$$a = \frac{1+r}{4|v|}, \; b = \frac{1-r}{2}, \; c = \frac{(1+r)|v|}{4}, \; s = -v \tag{19}$$

Substituting Equation (19) into *g*(*x*, *v*) = *ax*<sup>2</sup> + *bx* + *c*, we have,

$$g(\mathbf{x}, \upsilon) = \frac{1+r}{4|\upsilon|}\mathbf{x}^2 + \frac{1-r}{2}\mathbf{x} + \frac{(1+r)|\upsilon|}{4} \tag{20}$$
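The majorizer of Equation (20) can be checked numerically against the tangency conditions of Equation (18) (a sketch with illustrative values *r* = 5 and *v* = 0.3; *θ* here is the piecewise-linear penalty away from the origin):

```python
import numpy as np

r, v = 5.0, 0.3          # illustrative asymmetry ratio and tangent point

def theta(x):            # asymmetric penalty: x for x > 0, -r*x for x < 0
    return np.where(x > 0, x, -r * x)

def g(x):                # Eq. (20): quadratic tangent to theta at v and s = -v
    return (1 + r) / (4 * abs(v)) * x**2 + (1 - r) / 2 * x + (1 + r) * abs(v) / 4

xs = np.linspace(-1.0, 1.0, 2001)
gap = g(xs) - theta(xs)  # nonnegative everywhere, zero at x = v and x = -v
```

The quadratic touches *θ* at exactly the two tangent points *v* and *s* = −*v* and lies above it elsewhere, which is the defining property of an MM majorizer.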

Similarly, the numerical issue of Equation (20) will appear if the parameter *v* approaches zero. To address this problem, the sufficiently small positive value *ε* is used instead of |*v*|, thus, the segmented function Equation (16) can be rewritten as,

$$\theta\_{\varepsilon}(\mathbf{x}) = \begin{cases} \mathbf{x}, & \mathbf{x} > \varepsilon \\ \frac{1+r}{4\varepsilon} \mathbf{x}^{2} + \frac{1-r}{2} \mathbf{x} + \frac{(1+r)\varepsilon}{4}, & |\mathbf{x}| \le \varepsilon \\ -r\mathbf{x}, & \mathbf{x} < -\varepsilon \end{cases} \tag{21}$$

Hence, the new function *θε*(*x*) is a continuously differentiable function. The plot of the continuously differentiable asymmetric penalty function *θε*(*x*;*r*) is shown in Figure 2; the function *θε*(*x*;*r*) is a second-order polynomial on [−*ε*, *ε*].

**Figure 2.** The plot of the asymmetric penalty function *θε*(*x*;*r*).
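A direct implementation of Equation (21) confirms the continuity of *θε* and of its first derivative at the break points *x* = ±*ε* (a sketch; *r* = 5 and *ε* = 10⁻³ are illustrative):

```python
import numpy as np

r, eps = 5.0, 1e-3       # illustrative asymmetry ratio and smoothing width

def theta_eps(x):
    """Asymmetric penalty of Eq. (21): linear tails, quadratic on [-eps, eps]."""
    quad = (1 + r) / (4 * eps) * x**2 + (1 - r) / 2 * x + (1 + r) * eps / 4
    return np.where(x > eps, x, np.where(x < -eps, -r * x, quad))

def quad_slope(x):
    """Slope of the quadratic piece, used to check C1 continuity at +/- eps."""
    return (1 + r) / (2 * eps) * x + (1 - r) / 2
```

At *x* = *ε* the quadratic piece has value *ε* and slope 1 (matching the right tail), and at *x* = −*ε* it has value *rε* and slope −*r* (matching the left tail), so *θε* is C¹.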

**The third issue** will be solved and derived in Section 2.3 using the majorization-minimization algorithm.

### *2.3. The Solution of Proposed Model Based on Majorization-Minimization Algorithm*

In this paper, the majorization-minimization (MM) algorithm is implemented to derive an iterative solution procedure for the proposed approach [38]. The function *G*(*x*, *v*) is chosen as the majorizer of *F*(*x*). Specifically, the iterative solution procedure can be divided into three phases: (a) constructing a majorizer for the symmetric penalty terms *φ*([*Di x*]*n*); (b) constructing a majorizer for the asymmetric penalty term *θε*(*xn*;*r*); and (c) minimizing the resulting majorizer *G*(*x*, *v*) of *F*(*x*).


**For problem (a)**, we first seek a majorizer *g*(*x*, *v*) for *φ*(*x*), i.e.,

$$\begin{cases} g(\upsilon, \upsilon) = \phi(\upsilon) \\ g(\mathbf{x}, \upsilon) \ge \phi(\mathbf{x}) \end{cases} \text{for all } \mathbf{x}, \upsilon \tag{22}$$

Since *φ*(*x*) is a symmetric function, we set *g*(*x*, *v*) to be an even second-order polynomial, i.e.,

$$g(\mathbf{x}, \upsilon) = m\mathbf{x}^2 + b \tag{23}$$

Thus, according to Equation (22), *g*(*v*, *v*) = *φ*(*v*) and *g*′(*v*, *v*) = *φ*′(*v*), we have,

$$m\upsilon^2 + b = \phi(\upsilon) \text{ and } 2m\upsilon = \phi'(\upsilon) \tag{24}$$

The parameters *m* and *b* can be computed as,

$$m = \frac{\phi'(v)}{2v} \text{ and } b = \phi(v) - \frac{v}{2}\phi'(v) \tag{25}$$

Substituting Equation (25) into Equation (23), we have

$$g(\mathbf{x}, \upsilon) = \frac{\phi'(\upsilon)}{2\upsilon} \mathbf{x}^2 + \phi(\upsilon) - \frac{\upsilon}{2} \phi'(\upsilon) \tag{26}$$

Summing, we obtain,

$$\begin{aligned} \sum\_{n} g(\mathbf{x}\_{n}, \upsilon\_{n}) &= \sum\_{n} \left[ \frac{\phi'(\upsilon\_{n})}{2\upsilon\_{n}} \mathbf{x}\_{n}^{2} + \phi(\upsilon\_{n}) - \frac{\upsilon\_{n}}{2} \phi'(\upsilon\_{n}) \right] \\ &= \frac{1}{2} \mathbf{x}^{T} [\Lambda(\upsilon)] \mathbf{x} + c(\upsilon) \\ &\geq \sum\_{n=0}^{N-1} \phi(\mathbf{x}\_{n}) \end{aligned} \tag{27}$$

where **Λ**(*v*) is a diagonal matrix with $[\Lambda(\upsilon)]\_n = \phi'(\upsilon\_n)/\upsilon\_n$ and $c(\upsilon) = \sum\_n \left[ \phi(\upsilon\_n) - \frac{\upsilon\_n}{2}\phi'(\upsilon\_n) \right]$. Therefore, based on Equation (27), we obtain,

$$\begin{aligned} &\sum\_{i=1}^{M} \lambda\_{i} \sum\_{n=0}^{N-1} g\left([\mathbf{D}\_{i}\mathbf{x}]\_{n}, [\mathbf{D}\_{i}\upsilon]\_{n}\right) \\ &= \sum\_{i=1}^{M} \lambda\_{i} \left[ \frac{1}{2} (\mathbf{D}\_{i}\mathbf{x})^{T} [\Lambda(\mathbf{D}\_{i}\upsilon)] (\mathbf{D}\_{i}\mathbf{x}) + c\_{i}(\mathbf{D}\_{i}\upsilon) \right] \\ &\geq \sum\_{i=1}^{M} \lambda\_{i} \sum\_{n=0}^{N-1} \phi([\mathbf{D}\_{i}\mathbf{x}]\_{n}) \end{aligned} \tag{28}$$

where **Λ**(*Di v*) is a diagonal matrix with $[\Lambda(\mathbf{D}\_i\upsilon)]\_n = \phi'([\mathbf{D}\_i\upsilon]\_n)/[\mathbf{D}\_i\upsilon]\_n$ and $c\_i(\mathbf{D}\_i\upsilon) = \sum\_n \left[ \phi([\mathbf{D}\_i\upsilon]\_n) - \frac{[\mathbf{D}\_i\upsilon]\_n}{2}\phi'([\mathbf{D}\_i\upsilon]\_n) \right]$.
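Analogously to the asymmetric case, the even quadratic majorizer of Equation (26) can be verified numerically for, e.g., *φB* from Equation (14) (a sketch; *v* = 0.5 and *ε* = 10⁻⁶ are illustrative):

```python
import numpy as np

eps = 1e-6

def phi(x):              # phi_B of Eq. (14)
    return np.sqrt(x**2 + eps)

def dphi(x):             # first-order derivative of phi_B
    return x / np.sqrt(x**2 + eps)

def g(x, v):             # Eq. (26): even quadratic, tangent to phi at x = v
    return dphi(v) / (2 * v) * x**2 + phi(v) - v / 2 * dphi(v)

v = 0.5
xs = np.linspace(-2.0, 2.0, 1001)
gap = g(xs, v) - phi(xs)   # nonnegative, zero at x = +/- v
```

Because *φB*(√*t*) is concave in *t*, the quadratic upper bound holds globally, so the MM surrogate touches *φB* at ±*v* and lies above it everywhere else.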

**For problem (b)**, assume that *g*0(*x*, *v*) is the majorizer of the asymmetric and differentiable function *θε*(*xn*;*r*). Since $f(\mathbf{x}) = \frac{1+r}{4\varepsilon}\mathbf{x}^2 + \frac{1-r}{2}\mathbf{x} + \frac{(1+r)\varepsilon}{4}$ for $|\mathbf{x}| \le \varepsilon$, we have,

$$\begin{cases} g\_0(\mathbf{x}, \upsilon) = \frac{1+r}{4\upsilon} \mathbf{x}^2 + \frac{1-r}{2} \mathbf{x} + \frac{(1+r)\upsilon}{4} \ge f(\mathbf{x}), & \upsilon > \varepsilon \\ g\_0(\mathbf{x}, \upsilon) = -\frac{1+r}{4\upsilon} \mathbf{x}^2 + \frac{1-r}{2} \mathbf{x} - \frac{(1+r)\upsilon}{4} \ge f(\mathbf{x}), & \upsilon < -\varepsilon \end{cases} \tag{29}$$

when *v* > *ε*, then,

$$\begin{aligned} g\_0(\mathbf{x}, \upsilon) - f(\mathbf{x}) &= \left( \frac{1+r}{4\upsilon} - \frac{1+r}{4\varepsilon} \right) \mathbf{x}^2 + (\upsilon - \varepsilon) \frac{1+r}{4} \\ &= \frac{(1+r)(\upsilon - \varepsilon)(\upsilon\varepsilon - \mathbf{x}^2)}{4\upsilon\varepsilon} > 0 \end{aligned} \tag{30}$$

when *v* < −*ε*, then,

$$\begin{aligned} g\_0(\mathbf{x}, \upsilon) - f(\mathbf{x}) &= \left( -\frac{1+r}{4\upsilon} - \frac{1+r}{4\varepsilon} \right) \mathbf{x}^2 - (\upsilon + \varepsilon) \frac{1+r}{4} \\ &= -\frac{(1+r)(\upsilon + \varepsilon)(\mathbf{x}^2 + \upsilon\varepsilon)}{4\upsilon\varepsilon} > 0 \end{aligned} \tag{31}$$

Therefore, the majorizer of the asymmetric and differentiable function *θε*(*xn*;*r*) is obtained,

$$\begin{cases} g\_0(\mathbf{x}, \upsilon) = \frac{1+r}{4|\upsilon|} \mathbf{x}^2 + \frac{1-r}{2} \mathbf{x} + \frac{(1+r)|\upsilon|}{4}, & |\upsilon| > \varepsilon \\ g\_0(\mathbf{x}, \upsilon) = \frac{1+r}{4\varepsilon} \mathbf{x}^2 + \frac{1-r}{2} \mathbf{x} + \frac{(1+r)\varepsilon}{4}, & |\upsilon| \le \varepsilon \end{cases} \tag{32}$$

Summing, we obtain,

$$\sum\_{n=0}^{N-1} g\_0(\mathbf{x}\_{n}, \upsilon\_{n}) = \mathbf{x}^{T}[\Gamma(\upsilon)]\mathbf{x} + \mathbf{b}^{T}\mathbf{x} + c(\upsilon) \geq \sum\_{n=0}^{N-1} \theta\_{\varepsilon}(\mathbf{x}\_{n}; r) \tag{33}$$

where **Γ**(*v*) is a diagonal matrix, i.e., [**Γ**(*v*)]*n* = (1 + *r*)/4|*vn*| for |*vn*| > *ε* and [**Γ**(*v*)]*n* = (1 + *r*)/4*ε* for |*vn*| ≤ *ε*, and [*b*]*n* = (1 − *r*)/2.

**For problem (c)**, based on Equations (28) and (33), the majorizer of *F*(*x*) based on MM algorithm is given by,

$$\begin{aligned} G(\mathbf{x}, \mathbf{v}) &= \frac{1}{2} \| H(\mathbf{y} - \mathbf{x}) \|\_{2}^{2} + \lambda\_{0} \mathbf{x}^{T} [\Gamma(\mathbf{v})] \mathbf{x} + \lambda\_{0} b^{T} \mathbf{x} \\ &+ \sum\_{i=1}^{M} \left[ \frac{\lambda\_{i}}{2} (\mathbf{D}\_{i} \mathbf{x})^{T} [\Lambda(\mathbf{D}\_{i} \mathbf{v})](\mathbf{D}\_{i} \mathbf{x}) \right] + c(\mathbf{v}) \end{aligned} \tag{34}$$

Minimizing *G*(*x*, *v*) with respect to *x* yields,

$$\mathbf{x} = \left[ H^T H + 2\lambda\_0 \Gamma(\upsilon) + \sum\_{i=1}^{M} \lambda\_i \mathbf{D}\_i^T [\Lambda(\mathbf{D}\_i \upsilon)] \mathbf{D}\_i \right]^{-1} \left( H^T H \mathbf{y} - \lambda\_0 \mathbf{b} \right) \tag{35}$$

Substituting *H* = *BA*<sup>−1</sup> into Equation (35), we have,

$$\begin{aligned} \mathbf{x} &= \mathbf{A} \left\{ \mathbf{B}^T \mathbf{B} + \mathbf{A}^T \left( 2\lambda\_0 \Gamma(\upsilon) + \sum\_{i=1}^M \lambda\_i \mathbf{D}\_i^T [\Lambda(\mathbf{D}\_i\upsilon)] \mathbf{D}\_i \right) \mathbf{A} \right\}^{-1} \left( \mathbf{B}^T \mathbf{B} \mathbf{A}^{-1} \mathbf{y} - \lambda\_0 \mathbf{A}^T \mathbf{b} \right) \\ &= \mathbf{A} \left( \mathbf{B}^T \mathbf{B} + \mathbf{A}^T \mathbf{M} \mathbf{A} \right)^{-1} \left( \mathbf{B}^T \mathbf{B} \mathbf{A}^{-1} \mathbf{y} - \lambda\_0 \mathbf{A}^T \mathbf{b} \right) \\ &= \mathbf{A} \mathbf{Q}^{-1} \left( \mathbf{B}^T \mathbf{B} \mathbf{A}^{-1} \mathbf{y} - \lambda\_0 \mathbf{A}^T \mathbf{b} \right) \end{aligned} \tag{36}$$

where the matrix $\mathbf{M} = 2\lambda\_0\Gamma(\upsilon) + \sum\_{i=1}^{M}\lambda\_i\mathbf{D}\_i^T[\Lambda(\mathbf{D}\_i\upsilon)]\mathbf{D}\_i$ and the matrix $\mathbf{Q} = \mathbf{B}^T\mathbf{B} + \mathbf{A}^T\mathbf{M}\mathbf{A}$.

Finally, by using the above formulas, the fault transient impulses *x* can be obtained by the following iterations,

$$\mathbf{M}^{(k)} = 2\lambda\_0 \Gamma \left( \mathbf{x}^{(k)} \right) + \sum\_{i=1}^{M} \lambda\_i \mathbf{D}\_i^T \left[ \Lambda \left( \mathbf{D}\_i \mathbf{x}^{(k)} \right) \right] \mathbf{D}\_i \tag{37}$$

$$\mathbf{Q}^{(k)} = \mathbf{B}^T \mathbf{B} + \mathbf{A}^T \mathbf{M}^{(k)} \mathbf{A} \tag{38}$$

$$\mathbf{x}^{(k+1)} = \mathbf{A} \left[ \mathbf{Q}^{(k)} \right]^{-1} \left( \mathbf{B}^T \mathbf{B} \mathbf{A}^{-1} \mathbf{y} - \lambda\_0 \mathbf{A}^T \mathbf{b} \right) \tag{39}$$
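Under the simplifying assumption *A* = *B* = *I* (so *H* = *I*, i.e., a pure denoising setting) and with illustrative parameter values, the iterations (37)–(39) can be sketched as follows; *φB* is used as the symmetric penalty:

```python
import numpy as np

def asymmetric_mm_denoise(y, lam0=0.5, lam=(0.1, 0.1), r=5.0, eps=1e-6, n_iter=30):
    """MM iterations (37)-(39) with A = B = I; parameter values are illustrative."""
    N = len(y)
    I = np.eye(N)
    # D1 (rows [-1, 1]) and D2 (rows [-1, 2, -1])
    D = [np.diff(I, n=1, axis=0), -np.diff(I, n=2, axis=0)]
    b = (1 - r) / 2 * np.ones(N)                 # [b]_n = (1 - r)/2

    def lam_weights(u):                          # phi_B'(u)/u = 1/sqrt(u^2 + eps)
        return 1.0 / np.sqrt(u**2 + eps)

    x = y.copy()
    for _ in range(n_iter):
        gamma = (1 + r) / (4 * np.maximum(np.abs(x), eps))   # [Gamma(x)]_n
        M = 2 * lam0 * np.diag(gamma)                        # Eq. (37)
        for li, Di in zip(lam, D):
            M += li * Di.T @ np.diag(lam_weights(Di @ x)) @ Di
        Q = I + M                                            # Eq. (38), B^T B = I
        x = np.linalg.solve(Q, y - lam0 * b)                 # Eq. (39), A = I
    return x
```

Because *r* > 1 penalizes negative amplitudes more heavily than positive ones, the recovered impulses are biased toward the positive side, matching the asymmetric design of *θε*.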

In conclusion, the complete steps of the proposed algorithm are summarized as follows,

(1) **Input**: the observed signal *y*, the matrices *A*, *B*, and *Di*, the regularization parameters *λi*, the asymmetry ratio *r*, and the parameter *ε*;

(2) **Initialize**: set *k* = 0 and *x*<sup>(0)</sup> = *y*;

(3) Compute the diagonal weight matrices,

$$[\Gamma(\upsilon)]\_{n} = (1+r)/4|\upsilon\_{n}|, \; |\upsilon\_{n}| > \varepsilon;$$

$$[\Gamma(\upsilon)]\_{n} = (1+r)/4\varepsilon, \; |\upsilon\_{n}| \le \varepsilon;$$

$$[\Lambda(\mathbf{D}\_{i}\upsilon)]\_{n} = \frac{\phi'([\mathbf{D}\_{i}\upsilon]\_{n})}{[\mathbf{D}\_{i}\upsilon]\_{n}}, \; i = 1, 2, \dots, M;$$

(4) Update,

$$\mathbf{M}^{(k)} = 2\lambda\_{0}\Gamma\Big(\mathbf{x}^{(k)}\Big) + \sum\_{i=1}^{M} \lambda\_{i}\mathbf{D}\_{i}^{T}\Big[\Lambda\Big(\mathbf{D}\_{i}\mathbf{x}^{(k)}\Big)\Big]\mathbf{D}\_{i};$$

$$\mathbf{Q}^{(k)} = \mathbf{B}^T \mathbf{B} + \mathbf{A}^T \mathbf{M}^{(k)} \mathbf{A};$$

$$\mathbf{x}^{(k+1)} = \mathbf{A} \left[ \mathbf{Q}^{(k)} \right]^{-1} \left( \mathbf{B}^T \mathbf{B} \mathbf{A}^{-1} \mathbf{y} - \lambda\_0 \mathbf{A}^T \mathbf{b} \right);$$

(5) If the stopping criterion is satisfied, then output the signal *x*; otherwise, set *k* = *k* + 1 and go to step (4).

(6) **Output**: signal *x*.
