Article

Complex Correntropy with Variable Center: Definition, Properties, and Application to Adaptive Filtering

College of Electronic and Information Engineering, Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, Chongqing 400715, China
* Author to whom correspondence should be addressed.
Entropy 2020, 22(1), 70; https://doi.org/10.3390/e22010070
Submission received: 14 November 2019 / Revised: 23 December 2019 / Accepted: 4 January 2020 / Published: 6 January 2020
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

Complex correntropy has been successfully applied to complex-domain adaptive filtering, and the corresponding maximum complex correntropy criterion (MCCC) algorithm has proved robust to non-Gaussian noise. However, the kernel function of the complex correntropy is usually limited to a Gaussian function centered at zero. To improve the performance of MCCC in non-zero-mean noise environments, we first define a complex correntropy with variable center and provide its probabilistic interpretation. We then propose the maximum complex correntropy criterion with variable center (MCCC-VC), apply it to complex-domain adaptive filtering, and use the gradient descent approach to search for the minimum of the cost function. We also propose a feasible method to optimize the center and the kernel width of MCCC-VC. Importantly, we further provide a bound for the learning rate and derive the theoretical value of the steady-state excess mean square error (EMSE). Finally, simulations demonstrate the validity of the theoretical steady-state EMSE and the improved performance of MCCC-VC.

1. Introduction

Choosing an appropriate cost function (usually a statistical measure of the error signal) is the key problem in adaptive filtering theory and application [1,2,3]. In the presence of Gaussian noise, it is best to use the minimum mean square error (MMSE) criterion, and a series of MMSE-based algorithms [4,5,6,7] have accordingly emerged over the past decades. The MMSE-based algorithms use the mean square value of the error between the desired signal and the output signal as the cost function, which has many attractive features, such as convexity and smoothness. In addition, MMSE has low computational complexity, since it only needs the second-order statistics of the signals. However, in many non-Gaussian cases, the MMSE-based algorithms are not robust. To address this shortcoming, many algorithms based on non-MMSE criteria have been developed [8,9,10,11,12,13,14,15,16]. Since signals are often expressed in complex form in many practical scenarios [17,18], adaptive filtering in the complex domain is of great significance. Over the past few years, several information-criterion-based algorithms have been proposed for complex-domain adaptive filtering [19,20,21,22]. In particular, Guimarães et al. recently defined a new similarity measure between two complex variables, the complex correntropy [19,20], and proposed the maximum complex correntropy criterion (MCCC) algorithm. MCCC uses a complex Gaussian function as the kernel function and derives the weight update based on Wirtinger calculus. The complex Gaussian kernel is desirable due to its smoothness and strict positive-definiteness. The MCCC algorithm outperforms classic MMSE-based algorithms and is robust to non-Gaussian noise. Moreover, MCCC has been widely applied in machine learning and signal processing [23,24].
In the MCCC framework, given two complex variables $C_1 = A_1 + jB_1$ and $C_2 = A_2 + jB_2$, the complex correntropy is defined by [19,20]
$$V_\sigma^C(C_1, C_2) = E\left[\kappa(C_1 - C_2)\right] \tag{1}$$
where $A_1$, $B_1$, $A_2$, $B_2$ are real variables, $E[\cdot]$ denotes the expectation, and $\kappa(C_1 - C_2)$ denotes the kernel function with
$$\kappa(C_1 - C_2) = G_\sigma^C(C_1 - C_2) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{(C_1 - C_2)(C_1 - C_2)^*}{2\sigma^2}\right) \tag{2}$$
and $\sigma > 0$ is the kernel width.
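For concreteness, the definition above can be estimated from samples in a few lines. The following Python/NumPy sketch is our own illustration, not code from the paper; the function names and test data are assumptions. It estimates $V_\sigma^C(C_1, C_2)$ by replacing the expectation with a sample mean.

```python
import numpy as np

def complex_gaussian_kernel(z, sigma):
    """G_sigma^C(z) = exp(-|z|^2 / (2 sigma^2)) / (2 pi sigma^2), Equation (2)."""
    return np.exp(-np.abs(z) ** 2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

def complex_correntropy(c1, c2, sigma):
    """Sample estimate of V_sigma^C(C1, C2) = E[G_sigma^C(C1 - C2)], Equation (1)."""
    return np.mean(complex_gaussian_kernel(c1 - c2, sigma))

# Toy check: correntropy between a complex signal and a lightly perturbed copy.
rng = np.random.default_rng(0)
c1 = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
c2 = c1 + 0.1 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
print(complex_correntropy(c1, c2, sigma=1.0))
```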
The purpose of adaptive filtering is to estimate the target variable $T$ in some sense by designing a model $M$ that constructs an output $Y$ from the input $X$. Under MCCC, we find this model by maximizing the complex correntropy between $T$ and $Y$:
$$M^* = \arg\max_{M \in \mathcal{M}} V_\sigma^C(T, Y) = \arg\max_{M \in \mathcal{M}} E\left[G_\sigma^C(T - Y)\right] \tag{3}$$
where $\mathcal{M}$ is the model assumption space containing the possible models that construct the output $Y$ from the input $X$, and $M^*$ is the optimal model.
However, the center of the complex correntropy is always at zero, which is not the best option in the case of non-zero-mean noise. Although the maximum correntropy criterion with variable center in [25] and [26] accommodates a variable center, it cannot be used for complex-domain adaptive filtering. To overcome this limitation, this paper proposes the maximum complex correntropy criterion with variable center (MCCC-VC).
The main contributions of this research are as follows: (1) we define the complex correntropy with variable center and give its probabilistic interpretation; (2) based on MCCC-VC, we propose a novel adaptive filtering algorithm in the complex domain by utilizing the gradient descent approach; (3) we give effective and feasible methods to estimate the kernel center and update the kernel width adaptively; (4) we derive the bound for the learning rate and the theoretical steady-state excess mean square error (EMSE) of the MCCC-VC algorithm, and verify the theoretical analysis by simulations.
The organization of this paper is as follows: Section 2 defines the complex correntropy with variable center and studies its properties. Section 3 proposes the MCCC-VC algorithm, provides a method for optimizing its parameters, studies the convergence of the algorithm, and derives the theoretical steady-state EMSE. Section 4 verifies the correctness of the theoretical conclusions and the superior performance of the MCCC-VC algorithm. Finally, Section 5 summarizes the conclusions of this paper.

2. Complex Correntropy with Variable Center

For two complex variables, the target variable $T$ and the output $Y$, the complex correntropy with variable center is defined as:
$$V_{\sigma,c}^C(T, Y) = E\left[G_\sigma^C(T - Y - c)\right] = E\left[\frac{1}{2\pi\sigma^2}\exp\left(-\frac{(T - Y - c)(T - Y - c)^*}{2\sigma^2}\right)\right] \tag{4}$$
where $c$ represents the center of the kernel function. When $c = 0$, (4) reduces to the original complex correntropy.
The complex correntropy with variable center $c$ contains all the even-order moments of $T - Y$ about the center $c$:
$$V_{\sigma,c}^C(T, Y) = \frac{1}{2\pi\sigma^2}\sum_{n=0}^{\infty}\frac{(-1)^n}{2^n n!} E\left[\frac{|e - c|^{2n}}{\sigma^{2n}}\right] \tag{5}$$
where $e = T - Y$ is the complex-valued error variable. As $\sigma$ increases, the higher-order moments around the center $c$ attenuate quickly, so the second-order moment becomes the dominant term. In particular, when $c = E[e]$ and $\sigma \to \infty$, maximizing the complex correntropy with center $c$ is equivalent to minimizing the variance of the error.
Moreover, when $\sigma \to 0$, we obtain
$$\lim_{\sigma \to 0} V_{\sigma,c}^C(T, Y) = \lim_{\sigma \to 0}\iiiint G_\sigma^C(t_R - y_R - c_R,\, t_I - y_I - c_I)\, p_{TY}(t_R, t_I, y_R, y_I)\, dt_R\, dt_I\, dy_R\, dy_I = \iiiint \delta(t_R - y_R - c_R,\, t_I - y_I - c_I)\, p_{TY}(t_R, t_I, y_R, y_I)\, dt_R\, dt_I\, dy_R\, dy_I = \iint p_{TY}(t_R, t_I, t_R - c_R, t_I - c_I)\, dt_R\, dt_I \tag{6}$$
where $\delta(x, y)$ is the two-dimensional Dirac function satisfying $\iint \delta(x, y)\, dx\, dy = 1$ and $\delta(x, y) = 0$ for $x^2 + y^2 \neq 0$. The second equality follows from the fact that $\lim_{\sigma \to 0} G_\sigma^C(x, y) = \lim_{\sigma \to 0}\frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$ has the same property as $\delta(x, y)$; $t_R$, $y_R$, and $c_R$ are the real parts of $t$, $y$, and $c$; $t_I$, $y_I$, and $c_I$ are the imaginary parts of $t$, $y$, and $c$; and $p_{TY}(t_R, t_I, y_R, y_I)$ denotes the joint probability density function (PDF) of $(T, Y)$. Furthermore, we derive the following result:
$$\lim_{\sigma \to 0} V_{\sigma,c}^C(T, Y) = \lim_{\sigma \to 0}\iint G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I)\, p_e(\varepsilon_R, \varepsilon_I)\, d\varepsilon_R\, d\varepsilon_I = \iint \delta(\varepsilon_R - c_R, \varepsilon_I - c_I)\, p_e(\varepsilon_R, \varepsilon_I)\, d\varepsilon_R\, d\varepsilon_I = p_e(c_R, c_I) \tag{7}$$
where $p_e(\varepsilon_R, \varepsilon_I)$ is the joint PDF of the error. Thus, when $\sigma \to 0$, the complex correntropy with variable center $c$ approaches the error PDF evaluated at $(c_R, c_I)$.
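This limiting behavior is easy to check numerically. Below is a minimal Python/NumPy sketch (our own illustration, not code from the paper) that compares a sample estimate of $E[G_\sigma^C(e - c)]$ for a small $\sigma$ against the analytic error PDF at the center; the sample size, noise model, and probe point are assumptions.

```python
import numpy as np

# Numerical check of Equation (7): for small sigma, E[G_sigma^C(e - c)]
# approaches the joint error PDF evaluated at the center.
rng = np.random.default_rng(1)
n = 2_000_000
e = np.sqrt(0.5) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))  # circular Gaussian error

c = 0.5 + 0.5j           # probe point (c_R, c_I)
sigma = 0.05             # small kernel width

estimate = np.mean(np.exp(-np.abs(e - c) ** 2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2))
analytic = np.exp(-(c.real ** 2 + c.imag ** 2)) / np.pi   # joint PDF of (e_R, e_I) at (c_R, c_I)
print(estimate, analytic)  # the two values should agree closely
```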

3. MCCC-VC Algorithm

In this part, we derive a novel adaptive filtering algorithm based on the maximum complex correntropy criterion with variable center (MCCC-VC), i.e., minimization of the complex correntropy loss.

3.1. Cost Function

We apply MCCC-VC to adaptive filtering and derive the cost function as follows:
$$J_{VCloss}^C = G_\sigma^C(0) - E\left[G_\sigma^C(e(k) - c(k))\right] = \frac{1}{2\pi\sigma^2}\left\{1 - E\left[\exp\left(-\frac{(e(k) - c(k))(e(k) - c(k))^*}{2\sigma^2}\right)\right]\right\} \tag{8}$$
where
$$e(k) = d(k) - \mathbf{w}^H\mathbf{x}(k) \tag{9}$$
is the error at time instant $k$, $\mathbf{w} = [w_1\ w_2\ \cdots\ w_m]^T$ is the filter weight vector, $d(k)$ is the desired signal at time instant $k$, $\mathbf{x}(k) = [x(k)\ x(k-1)\ \cdots\ x(k-m+1)]^T$ is the input vector at time instant $k$, and $c(k)$ is the center of the kernel at time instant $k$.
The essential idea behind the cost function (8) is that, even when the error distribution is non-zero-mean, the proposed MCCC-VC can perform well, because its center can be matched to the error distribution.
Figure 1 compares the cost surfaces of the proposed MCCC-VC and MCCC, where the noise is non-zero-mean complex Gaussian noise with unit variance. For visualization, we chose $m = 1$ and set the system parameter and the mean of the noise to $w_0 = 5 + 5i$ and $c = 6 + 6i$, respectively. One can see that the cost function of MCCC-VC is minimized at $w_0$, whereas the cost function of MCCC is minimized elsewhere.
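This comparison is straightforward to reproduce. The following Python/NumPy sketch (our own illustration; the sample size, grid range, and kernel width are assumptions) evaluates the MCCC-VC cost (8) on a grid of scalar weights for the setting above and confirms that the minimizer lies near $w_0$ when the kernel center equals the noise mean.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
w0 = 5 + 5j                  # true system parameter (m = 1), as in Figure 1
c = 6 + 6j                   # noise mean, also used as the kernel center
sigma = 1.0                  # kernel width (assumed)

x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
v = c + np.sqrt(0.5) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))  # unit-variance noise
d = np.conj(w0) * x + v      # desired signal d(k) = w0^H x(k) + v(k)

def mccc_vc_cost(w):
    """Sample version of the MCCC-VC cost (8) for a scalar weight w."""
    e = d - np.conj(w) * x
    return (1 - np.mean(np.exp(-np.abs(e - c) ** 2 / (2 * sigma ** 2)))) / (2 * np.pi * sigma ** 2)

# Evaluate the cost on a grid around w0; the minimizer should land near w0.
grid = np.linspace(3, 7, 81)
costs = [[mccc_vc_cost(wr + 1j * wi) for wr in grid] for wi in grid]
i, j = divmod(int(np.argmin(costs)), len(grid))
print("estimated minimizer:", grid[j] + 1j * grid[i])   # approximately 5 + 5j
```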

3.2. Gradient Descent Algorithm Based on MCCC-VC

Since the stochastic gradient descent approach has low computational complexity, we adopt it to search for the minimum of the cost function. Utilizing Wirtinger calculus [27,28], we obtain the weight update as follows:
$$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\frac{\partial}{\partial\mathbf{w}^*(k)}\left\{1 - \exp\left[-\frac{(e(k) - c(k))(e(k) - c(k))^*}{2\sigma^2}\right]\right\} = \mathbf{w}(k) + \frac{\mu}{2\sigma^2}\exp\left[-\frac{|e(k) - c(k)|^2}{2\sigma^2}\right](e(k) - c(k))^*\mathbf{x}(k) = \mathbf{w}(k) + \eta_w\exp\left[-\frac{|e(k) - c(k)|^2}{2\sigma^2}\right](e(k) - c(k))^*\mathbf{x}(k) \tag{10}$$
where $\eta_w = \frac{\mu}{2\sigma^2}$ is the learning rate for the weight.
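As a concrete reference, here is a minimal Python/NumPy sketch of the update (10); the function name and interface are our own, and the kernel center is passed in as a given argument rather than estimated (its online estimation is discussed in Section 3.3).

```python
import numpy as np

def mccc_vc_step(w, x_k, d_k, c_k, sigma, eta_w):
    """One stochastic-gradient MCCC-VC weight update, Equation (10)."""
    e_k = d_k - np.vdot(w, x_k)       # e(k) = d(k) - w^H x(k); vdot conjugates its first argument
    u = e_k - c_k                     # center-shifted error e(k) - c(k)
    g = np.exp(-np.abs(u) ** 2 / (2 * sigma ** 2))   # Gaussian factor that suppresses outliers
    return w + eta_w * g * np.conj(u) * x_k, e_k
```

Note that the exponential factor shrinks toward zero for large $|e(k) - c(k)|$, which is what suppresses the influence of outliers on the update.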

3.3. Optimization of the Parameters in MCCC-VC

3.3.1. Optimization Problem in MCCC-VC

The center location $c$ and the kernel width $\sigma$ play a pivotal role in the performance of MCCC-VC. Thus, it is extremely important to optimize them to further improve the robustness and convergence performance in non-zero-mean noise.
The optimal model according to MCCC-VC is as follows:
$$M^* = \arg\max_{M \in \mathcal{M},\, \sigma \in \Omega,\, c \in \mathcal{C}} V_{\sigma,c}^C(T, Y) = \arg\max_{M \in \mathcal{M},\, \sigma \in \Omega,\, c \in \mathcal{C}} E\left[G_\sigma^C(e - c)\right] \tag{11}$$
In addition, the complex correntropy with variable center can be decomposed into three parts:
$$V_{\sigma,c}^C(T, Y) = \iint G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I)\, p_e(\varepsilon_R, \varepsilon_I)\, d\varepsilon_R\, d\varepsilon_I = \frac{1}{2}\iint \left[G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I)\right]^2 d\varepsilon_R\, d\varepsilon_I + \frac{1}{2}\iint \left[p_e(\varepsilon_R, \varepsilon_I)\right]^2 d\varepsilon_R\, d\varepsilon_I - \frac{1}{2}\iint \left[G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I) - p_e(\varepsilon_R, \varepsilon_I)\right]^2 d\varepsilon_R\, d\varepsilon_I \tag{12}$$
Since the first term is independent of the model, we can derive
$$M^* = \arg\max_{M \in \mathcal{M},\, \sigma \in \Omega,\, c \in \mathcal{C}} V_{\sigma,c}^C(T, Y) = \arg\max_{M \in \mathcal{M},\, \sigma \in \Omega,\, c \in \mathcal{C}} U_{\sigma,c}^C(T, Y) \tag{13}$$
where
$$U_{\sigma,c}^C(T, Y) = \iint \left[p_e(\varepsilon_R, \varepsilon_I)\right]^2 d\varepsilon_R\, d\varepsilon_I - \iint \left[G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I) - p_e(\varepsilon_R, \varepsilon_I)\right]^2 d\varepsilon_R\, d\varepsilon_I \tag{14}$$
and
$$\iint G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I)\, p_e(\varepsilon_R, \varepsilon_I)\, d\varepsilon_R\, d\varepsilon_I = E\left[G_\sigma^C(e_R - c_R, e_I - c_I)\right] \tag{15}$$
The parameters can be optimized by
$$(M^*, \sigma^*, c^*) = \arg\max_{M \in \mathcal{M},\, \sigma \in \Omega,\, c \in \mathcal{C}} U_{\sigma,c}^C(T, Y) \tag{16}$$
where $\Omega$ and $\mathcal{C}$ represent the allowed sets of the parameters $\sigma$ and $c$.
Remark 1.
It can be seen that as long as the function $U_{\sigma,c}^C(T, Y)$ is maximized, $M$, $\sigma$, and $c$ can be optimized simultaneously. However, it is computationally demanding to compute and compare the values of $U_{\sigma,c}^C(T, Y)$ under all possible parameters in the allowed sets. Moreover, it may be difficult to obtain the allowed sets of parameters.

3.3.2. Stochastic Gradient Descent Approach

To further simplify the optimization problem, we propose a stochastic gradient descent based online approach.
(1) When the model $M$ is fixed, $\iint [p_e(\varepsilon_R, \varepsilon_I)]^2\, d\varepsilon_R\, d\varepsilon_I$ is independent of the kernel width $\sigma$ and the center position $c$. In this case, $\sigma$ and $c$ can be optimized according to the following formula:
$$(\sigma^*, c^*) = \arg\min_{\sigma \in \Omega,\, c \in \mathcal{C}} \iint \left[G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I) - p_e(\varepsilon_R, \varepsilon_I)\right]^2 d\varepsilon_R\, d\varepsilon_I = \arg\min_{\sigma \in \Omega,\, c \in \mathcal{C}} \left\{\iint \left[G_\sigma^C(\varepsilon_R - c_R, \varepsilon_I - c_I)\right]^2 d\varepsilon_R\, d\varepsilon_I - 2E\left[G_\sigma^C(e - c)\right]\right\} = \arg\min_{\sigma \in \Omega,\, c \in \mathcal{C}} \left\{-2E\left[G_\sigma^C(e - c)\right] + \frac{1}{4\pi\sigma^2}\right\} \tag{17}$$
Given $N$ error samples $\{e(k)\}_{k=1}^N$, we can use $E[G_\sigma^C(e - c)] \approx \frac{1}{N}\sum_{k=1}^N G_\sigma^C(e(k) - c(k))$. Therefore, we have the following formula:
$$(\sigma^*, c^*) = \arg\min_{\sigma \in \Omega,\, c \in \mathcal{C}} \left\{-\frac{2}{N}\sum_{k=1}^N G_\sigma^C(e(k) - c(k)) + \frac{1}{4\pi\sigma^2}\right\} \tag{18}$$
Furthermore, to simplify the optimization problem, we can set $c(k)$ to the median or mean of the error samples, so that only $\sigma$ needs to be optimized. We take $1/\sigma^2$ as a new variable $\tilde{\sigma}$ and update $\tilde{\sigma}$ and $\sigma^2$ using the stochastic gradient descent approach as follows:
$$\tilde{\sigma}(k+1) = \tilde{\sigma}(k) - \eta_\sigma\left.\frac{\partial}{\partial\tilde{\sigma}}\left[-\frac{2}{N}\sum_{l=k-T+1}^{k} G_\sigma^C(e(l) - c(k)) + \frac{1}{4\pi\sigma^2}\right]\right|_{\tilde{\sigma} = \tilde{\sigma}(k),\, c = c(k)} = \tilde{\sigma}(k) - \eta_\sigma\left\{-\frac{1}{\pi N}\sum_{l=k-T+1}^{k}\exp\left(-\frac{|e(l) - c(k)|^2\tilde{\sigma}(k)}{2}\right)\left(1 - \frac{|e(l) - c(k)|^2\tilde{\sigma}(k)}{2}\right) + \frac{1}{4\pi}\right\} \tag{19}$$
and
$$\sigma^2(k+1) = \frac{1}{\tilde{\sigma}(k+1)} \tag{20}$$
where $c(k)$ is estimated online as $c(k) = \frac{1}{T}\sum_{l=k-T+1}^{k} e(l)$, $T$ is the smoothing length (so that $N = T$ samples enter the sums above), and $\eta_\sigma$ is the learning rate for $\tilde{\sigma}$.
(2) When the kernel width $\sigma(k)$ and the center position $c(k)$ are fixed, the model $M$ is optimized by MCCC-VC using (10).
Remark 2.
For the proposed MCCC-VC algorithm, the weight and the parameters are updated alternately at each time instant k using (10), (19) and (20), respectively.
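Putting the pieces together, the following Python/NumPy sketch runs the full online MCCC-VC loop, alternating the weight update (10) with the center estimate and the kernel-width updates (19) and (20). It is our own illustration: the function name, interface, window length, learning rates, and the lower clip on $1/\sigma^2$ are assumptions, not values from the paper.

```python
import numpy as np
from collections import deque

def mccc_vc_filter(x_stream, d_stream, m, eta_w, eta_sigma, sigma0, T=50):
    """Online MCCC-VC sketch: alternate the weight update (10) with the center
    estimate and the kernel-width updates (19)-(20)."""
    w = np.zeros(m, dtype=complex)
    sig2 = sigma0 ** 2
    recent = deque(maxlen=T)                    # sliding window of recent errors
    for k in range(m - 1, len(x_stream)):
        x_k = x_stream[k - m + 1:k + 1][::-1]   # tap vector [x(k), ..., x(k-m+1)]
        e_k = d_stream[k] - np.vdot(w, x_k)     # e(k) = d(k) - w^H x(k)
        recent.append(e_k)
        c_k = np.mean(recent)                   # center c(k): sliding mean of errors
        u = e_k - c_k
        w = w + eta_w * np.exp(-np.abs(u) ** 2 / (2 * sig2)) * np.conj(u) * x_k
        # Width update (19)-(20) on the variable s = 1 / sigma^2.
        s = 1.0 / sig2
        r2 = np.abs(np.asarray(recent) - c_k) ** 2
        grad = -np.mean(np.exp(-r2 * s / 2) * (1 - r2 * s / 2)) / np.pi + 1 / (4 * np.pi)
        s = max(s - eta_sigma * grad, 1e-6)     # clip to keep sigma^2 positive and finite
        sig2 = 1.0 / s
    return w, np.sqrt(sig2)
```

The lower clip on $1/\sigma^2$ simply guards against a non-positive kernel width during the first few noisy updates; it is a practical safeguard rather than part of the derivation.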

3.4. Performance Analysis

3.4.1. Convergence Analysis

The MCCC-VC algorithm can be written in the form of a nonlinear function of the error:
$$\mathbf{w}(k+1) = \mathbf{w}(k) + \eta_w f(e(k))\mathbf{x}(k) \tag{21}$$
with $f(e(k)) = \exp\left[-\frac{|e(k) - c(k)|^2}{2\sigma^2}\right](e(k) - c(k))^*$ being a scalar function of the error $e(k)$.
Taking into consideration that
$$d(k) = \mathbf{w}_0^H\mathbf{x}(k) + v(k) \tag{22}$$
the error can be written as
$$e(k) = \tilde{\mathbf{w}}^H(k)\mathbf{x}(k) + v(k) = e_a(k) + v(k) \tag{23}$$
where $\tilde{\mathbf{w}}(k) = \mathbf{w}_0 - \mathbf{w}(k)$ is the weight error vector at time instant $k$, $\mathbf{w}_0$ is the system parameter vector, $e_a(k) = \tilde{\mathbf{w}}^H(k)\mathbf{x}(k)$ is the a priori error, and $v(k)$ is the additive noise at time instant $k$.
Therefore, we get the following formula:
$$\tilde{\mathbf{w}}(k+1) = \tilde{\mathbf{w}}(k) - \eta_w f(e(k))\mathbf{x}(k) \tag{24}$$
Taking the squared 2-norm of both sides and then the expectation, we further get:
$$E\left\{\|\tilde{\mathbf{w}}(k+1)\|^2\right\} = E\left\{\|\tilde{\mathbf{w}}(k)\|^2\right\} - 2\eta_w E\left\{\mathrm{Re}\left[e_a(k)f(e(k))\right]\right\} + \eta_w^2 E\left\{\|\mathbf{x}(k)\|^2|f(e(k))|^2\right\} \tag{25}$$
To guarantee the convergence of MCCC-VC, the weight error power should decrease gradually. Thus, we obtain the bound for the learning rate as follows:
$$0 < \eta_w \leq \frac{2E\left\{\mathrm{Re}\left[e_a(k)f(e(k))\right]\right\}}{E\left\{\|\mathbf{x}(k)\|^2|f(e(k))|^2\right\}} \tag{26}$$
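The bound (26) depends on the unknown weight error, but in a simulation where $\mathbf{w}_0$ is known it can be estimated by Monte Carlo. The sketch below is our own illustration; the weight-error level, noise model, kernel parameters, and sample size are all assumptions.

```python
import numpy as np

# Monte Carlo estimate of the step-size bound (26) at a fixed weight-error level.
rng = np.random.default_rng(3)
n, m = 100_000, 10
x = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
w_tilde = 0.1 * (rng.standard_normal(m) + 1j * rng.standard_normal(m))  # fixed weight error
e_a = x @ np.conj(w_tilde)                     # a priori error e_a = w~^H x
v = (3 + 3j) + np.sqrt(0.5) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
e = e_a + v
c, sigma = 3 + 3j, 2.0                         # kernel center and width (assumed)
f = np.exp(-np.abs(e - c) ** 2 / (2 * sigma ** 2)) * np.conj(e - c)
num = 2 * np.mean(np.real(e_a * f))            # numerator of (26)
den = np.mean(np.sum(np.abs(x) ** 2, axis=1) * np.abs(f) ** 2)   # denominator of (26)
print("upper bound on eta_w:", num / den)
```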

3.4.2. Steady-State Mean Square Performance

If MCCC-VC reaches the steady state, we have
$$\lim_{k \to \infty} E\left\{\|\tilde{\mathbf{w}}(k+1)\|^2\right\} = \lim_{k \to \infty} E\left\{\|\tilde{\mathbf{w}}(k)\|^2\right\} \tag{27}$$
Then, when $k \to \infty$, we can get
$$2E\left\{\mathrm{Re}\left[e_a(k)f(e(k))\right]\right\} = \eta_w E\left\{\|\mathbf{x}(k)\|^2|f(e(k))|^2\right\} \tag{28}$$
According to the definition of the steady-state excess mean square error (EMSE), we have
$$S = \lim_{k \to \infty} E\left[|e_a(k)|^2\right] = E\left[|e_a|^2\right] \tag{29}$$
To obtain the theoretical steady-state EMSE, we present the following two assumptions [21,22,29]:
(1)
$v(k)$ is zero-mean and independent of $\mathbf{x}(k)$, and $\mathbf{x}(k)$ is circular.
(2)
e a ( k ) is zero-mean and independent of v ( k ) .
Since the distributions of $\mathbf{x}(k)$, $v(k)$, $e_a(k)$, and $e(k)$ do not depend on the time index $k$ at steady state, the time index is omitted in the following derivation.
The left side of (28) can be written as
$$L = E\left\{e_a\exp\left[-\frac{|e - c|^2}{2\sigma^2}\right](e - c)^* + e_a^*\exp\left[-\frac{|e - c|^2}{2\sigma^2}\right](e - c)\right\} = E\left\{\exp\left[-\frac{|e - c|^2}{2\sigma^2}\right]\left(e_a(e - c)^* + e_a^*(e - c)\right)\right\} = E\left\{g_1(e)\left(2|e_a|^2 + e_a(v - c)^* + e_a^*(v - c)\right)\right\} \tag{30}$$
where
$$g_1(e) = \exp\left[-\frac{|e - c|^2}{2\sigma^2}\right] \tag{31}$$
We use a Taylor expansion to approximate $g_1(e)$ around $e = v$:
$$g_1(e) \approx g_1(v) + 2\mathrm{Re}\left\{\left.\frac{\partial g_1}{\partial e}\right|_{e=v} e_a\right\} + \mathrm{Re}\left\{\left.\frac{\partial^2 g_1}{\partial e^*\partial e^*}\right|_{e=v}(e_a^*)^2 + \left.\frac{\partial^2 g_1}{\partial e^*\partial e}\right|_{e=v}|e_a|^2\right\} \tag{32}$$
where
$$\frac{\partial g_1}{\partial e} = -\exp\left[-\frac{|e - c|^2}{2\sigma^2}\right]\frac{(e - c)^*}{2\sigma^2} \tag{33}$$
$$\frac{\partial g_1}{\partial e^*} = -\exp\left[-\frac{|e - c|^2}{2\sigma^2}\right]\frac{(e - c)}{2\sigma^2} \tag{34}$$
$$\frac{\partial^2 g_1}{\partial e^*\partial e} = \exp\left[-\frac{|e - c|^2}{2\sigma^2}\right]\frac{1}{2\sigma^2}\left(\frac{|e - c|^2}{2\sigma^2} - 1\right) \tag{35}$$
$$\frac{\partial^2 g_1}{\partial e^*\partial e^*} = \exp\left[-\frac{|e - c|^2}{2\sigma^2}\right]\frac{(e - c)^2}{(2\sigma^2)^2} \tag{36}$$
Since $\mathbf{x}$ is circular, we can get the values of the following two terms:
$$E\left[(e_a^*)^2\right] = 0 \tag{37}$$
$$E\left[e_a^2\right] = \tilde{\mathbf{w}}^H E\left[\mathbf{x}\mathbf{x}^T\right]\tilde{\mathbf{w}}^* = 0 \tag{38}$$
Based on the above derivation, if the higher-order terms are small enough, we can rewrite the left side of (28) as follows:
$$L \approx 2S\, E\left\{\exp\left[-\frac{|v - c|^2}{2\sigma^2}\right]\left(1 - \frac{|v - c|^2}{2\sigma^2}\right)\right\} \tag{39}$$
The right side of (28) can be written as
$$R = \eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{|f(e(k))|^2\right\} = \eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{\exp\left[-\frac{|e - c|^2}{\sigma^2}\right]|e - c|^2\right\} = \eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{g_2(e)\right\} \tag{40}$$
where $\mathbf{R}_x = E[\mathbf{x}(k)\mathbf{x}^H(k)]$ is the input autocorrelation matrix and
$$g_2(e) = \exp\left[-\frac{|e - c|^2}{\sigma^2}\right]|e - c|^2 \tag{41}$$
In a similar way, we use a Taylor expansion to approximate $g_2(e)$ around $e = v$:
$$g_2(e) \approx g_2(v) + \mathrm{Re}\left\{\left.\frac{\partial^2 g_2}{\partial e^*\partial e}\right|_{e=v}|e_a|^2 + \left.\frac{\partial^2 g_2}{\partial e^*\partial e^*}\right|_{e=v}(e_a^*)^2\right\} + 2\mathrm{Re}\left\{\left.\frac{\partial g_2}{\partial e}\right|_{e=v} e_a\right\} \tag{42}$$
where
$$\frac{\partial g_2}{\partial e} = \exp\left[-\frac{|e - c|^2}{\sigma^2}\right](e - c)^*\left(1 - \frac{|e - c|^2}{\sigma^2}\right) \tag{43}$$
$$\frac{\partial g_2}{\partial e^*} = \exp\left[-\frac{|e - c|^2}{\sigma^2}\right](e - c)\left(1 - \frac{|e - c|^2}{\sigma^2}\right) \tag{44}$$
$$\frac{\partial^2 g_2}{\partial e^*\partial e} = \exp\left[-\frac{|e - c|^2}{\sigma^2}\right]\left(\frac{|e - c|^4}{\sigma^4} - \frac{3|e - c|^2}{\sigma^2} + 1\right) \tag{45}$$
$$\frac{\partial^2 g_2}{\partial e^*\partial e^*} = \exp\left[-\frac{|e - c|^2}{\sigma^2}\right](e - c)^2\left(\frac{|e - c|^2}{\sigma^4} - \frac{2}{\sigma^2}\right) \tag{46}$$
If the higher-order terms are small enough, we can rewrite the right side of (28) as follows:
$$R \approx \eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{\exp\left[-\frac{|v - c|^2}{\sigma^2}\right]|v - c|^2\right\} + \eta_w\mathrm{Tr}(\mathbf{R}_x)\, S R_1 \tag{47}$$
where
$$R_1 = E\left\{\exp\left[-\frac{|v - c|^2}{\sigma^2}\right]\left(\frac{|v - c|^4}{\sigma^4} - \frac{3|v - c|^2}{\sigma^2} + 1\right)\right\} \tag{48}$$
Finally, we get the theoretical steady-state EMSE as follows:
$$S = \frac{\eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{\exp\left[-\frac{|v - c|^2}{\sigma^2}\right]|v - c|^2\right\}}{2E\left\{\exp\left[-\frac{|v - c|^2}{2\sigma^2}\right]\left(1 - \frac{|v - c|^2}{2\sigma^2}\right)\right\} - \eta_w\mathrm{Tr}(\mathbf{R}_x) R_1} \tag{49}$$
Furthermore, when $\eta_w$ is small enough, (49) is further simplified as
$$S = \frac{\eta_w\mathrm{Tr}(\mathbf{R}_x) E\left\{\exp\left[-\frac{|v - c|^2}{\sigma^2}\right]|v - c|^2\right\}}{2E\left\{\exp\left[-\frac{|v - c|^2}{2\sigma^2}\right]\left(1 - \frac{|v - c|^2}{2\sigma^2}\right)\right\}} \tag{50}$$
Moreover, we derive the theoretical value of $\sigma^2$ by setting $\frac{\partial}{\partial\sigma^2}\left\{-2E\left[G_\sigma^C(e - c)\right] + \frac{1}{4\pi\sigma^2}\right\} = 0$. In this way, we have $\frac{1}{\pi\sigma^4}E\left\{\exp\left(-\frac{|e - c|^2}{2\sigma^2}\right)\right\} - \frac{1}{\pi\sigma^6}E\left\{\exp\left(-\frac{|e - c|^2}{2\sigma^2}\right)\frac{|e - c|^2}{2}\right\} - \frac{1}{4\pi\sigma^4} = 0$. Since $e \approx v$ at the steady state, we can further obtain the theoretical value of $\sigma^2$ from:
$$\sigma^2 = \frac{E\left\{\frac{|v - c|^2}{2}\exp\left[-\frac{|v - c|^2}{2\sigma^2}\right]\right\}}{E\left\{\exp\left[-\frac{|v - c|^2}{2\sigma^2}\right]\right\} - \frac{1}{4}} \tag{51}$$
Since the right side of (51) depends on $\sigma^2$, this is a fixed-point equation whose solution gives the theoretical $\sigma^2$.
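Both (51) and (50) are easy to evaluate numerically. A minimal Python/NumPy sketch follows (our own illustration): it solves (51) by fixed-point iteration over noise samples and then plugs the result into (50); the noise model, step size, and the value used for $\mathrm{Tr}(\mathbf{R}_x)$ are all assumptions.

```python
import numpy as np

# Solve the fixed-point equation (51) for the theoretical sigma^2, then
# evaluate the simplified steady-state EMSE (50).
rng = np.random.default_rng(4)
n = 10**6
v = (3 + 3j) + np.sqrt(0.5) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
c = np.mean(v)                   # center matched to the noise mean
r2 = np.abs(v - c) ** 2          # samples of |v - c|^2

sig2 = 1.0                       # initial guess for sigma^2
for _ in range(100):             # fixed-point iteration of (51)
    g = np.exp(-r2 / (2 * sig2))
    sig2 = np.mean(r2 / 2 * g) / (np.mean(g) - 0.25)   # denominator assumed positive

eta_w, tr_Rx = 3.8e-4, 10.0      # step size and Tr(R_x) (assumed values)
num = eta_w * tr_Rx * np.mean(np.exp(-r2 / sig2) * r2)
den = 2 * np.mean(np.exp(-r2 / (2 * sig2)) * (1 - r2 / (2 * sig2)))
print("theoretical sigma^2:", sig2, "steady-state EMSE:", num / den)
```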
Remark 3.
The theoretical steady-state EMSE in (50) is accurate only when $e_a$ is small enough, since only then are the higher-order terms negligible. If the noise power or the step size is too large, or the center of the kernel function deviates from the mean of the noise, there will be a large deviation between the theoretical and simulated values of the steady-state EMSE.

4. Simulation

In this section, we present some simulations to show the validity of the theoretical results and the superiority of MCCC-VC. All simulation results are obtained by averaging over 300 Monte Carlo trials.

4.1. Steady-State Performance

In this part, the filter weight vector $\mathbf{w}_0 = [w_1\ w_2\ \cdots\ w_{10}]^T$ is randomly generated, where $w_k = w_{Rk} + jw_{Ik}$ with $w_{Rk}, w_{Ik} \sim N(0, 0.1)$; here $w_{Rk}$ and $w_{Ik}$ represent the real and imaginary components of $w_k$, and $N(\mu, \hat{\sigma}^2)$ denotes the Gaussian distribution with mean $\mu$ and variance $\hat{\sigma}^2$. We randomly generate the input signal $x = x_R + jx_I$. To show the robustness of MCCC-VC, additive complex noise $v = v_R + jv_I$ is added in the simulation, whose real and imaginary parts are denoted by $v_R$ and $v_I$, respectively. All algorithms initialize $\mathbf{w}$ with a zero vector.
First, we illustrate the correctness of the theoretical steady-state EMSEs. For each simulation, 30,000 iterations are carried out to make sure MCCC-VC reaches the steady state, and the last 1000 iterations are used to obtain the simulated steady-state EMSEs. The theoretical kernel width and steady-state EMSEs are calculated according to (51) and (50), respectively. Figure 2 and Figure 3 show the simulated and theoretical steady-state EMSEs of MCCC-VC under various noise variances and learning rates, where $v$ is Gaussian distributed with mean $3 + 3j$. It can be seen from both figures that the theoretical results closely match the simulated results.
Then, we change the noise to binary noise, also with mean $3 + 3j$; the simulated and theoretical steady-state EMSEs are obtained in the same way as before. Figure 4 and Figure 5 show the simulated and theoretical steady-state results of MCCC-VC under various noise variances and learning rates. Again, there is good agreement between the theoretical and simulated results.

4.2. Performance Comparison

In this part, we compare the performance of the proposed MCCC-VC algorithm with MCCC and the minimum complex kernel risk-sensitive loss (MCKRSL) [22]. For a fair comparison, all three algorithms use the gradient descent method to search for the optimal solution. We measure the performance of all the algorithms by the weight error power.
In this simulation, the noise $v(k)$ is composed of two independent components [16], i.e., $v(k) = (1 - a(k))A(k) + a(k)B(k)$, where $P(a(k) = 0) = 1 - c$ and $P(a(k) = 1) = c$ ($0 \leq c \leq 1$). $A(k)$ is the ordinary noise with small variance $\sigma_v^2 = 1$, whose real and imaginary parts are denoted by $A_R(k)$ and $A_I(k)$, and $B(k)$ represents the outliers with large variance, whose real and imaginary parts are denoted by $B_R(k)$ and $B_I(k)$.
We set $c = 0.05$ and $B_R, B_I \sim N(0, 100)$. In addition, we consider the following four cases for $A(k)$ (a sketch of generating such noise follows the list):
(1)
$A_R(k), A_I(k) \sim N(3, \sigma_v^2/2)$;
(2)
$P(A_R(k) = 3 + \sigma_v/2) = P(A_R(k) = 3 - \sigma_v/2) = P(A_I(k) = 3 + \sigma_v/2) = P(A_I(k) = 3 - \sigma_v/2) = 0.5$;
(3)
$A_R(k), A_I(k) \sim U(3 - \sigma_v/2,\, 3 + \sigma_v/2)$, with $U(\alpha, \beta)$ denoting the uniform distribution over $[\alpha, \beta]$;
(4)
$A_R(k) = 3 + \sigma_v\sin(\theta_{1k})/2$, $A_I(k) = 3 + \sigma_v\sin(\theta_{2k})/2$, where $\theta_{1k}, \theta_{2k} \sim U[0, 2\pi]$.
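The following Python/NumPy sketch generates this impulsive mixture noise for the four cases; it is our own reading of the setup, and the function name, interface, and exact placement of the scale factors are assumptions.

```python
import numpy as np

def impulsive_noise(n, case, c_prob=0.05, sigma_v=1.0, rng=None):
    """Generate v(k) = (1 - a(k)) A(k) + a(k) B(k) for the four cases above."""
    rng = rng or np.random.default_rng()
    a = rng.random(n) < c_prob                   # outlier indicator, P(a = 1) = c
    B = 10 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))  # B_R, B_I ~ N(0, 100)
    if case == 1:    # Gaussian components, N(3, sigma_v^2 / 2) each
        A = (3 + 3j) + np.sqrt(sigma_v**2 / 2) * (rng.standard_normal(n)
                                                  + 1j * rng.standard_normal(n))
    elif case == 2:  # binary components at 3 +- sigma_v / 2
        A = (3 + rng.choice([-1, 1], n) * sigma_v / 2) \
            + 1j * (3 + rng.choice([-1, 1], n) * sigma_v / 2)
    elif case == 3:  # uniform components on [3 - sigma_v/2, 3 + sigma_v/2]
        A = rng.uniform(3 - sigma_v / 2, 3 + sigma_v / 2, n) \
            + 1j * rng.uniform(3 - sigma_v / 2, 3 + sigma_v / 2, n)
    else:            # sinusoidal components with independent random phases
        A = (3 + sigma_v * np.sin(rng.uniform(0, 2 * np.pi, n)) / 2) \
            + 1j * (3 + sigma_v * np.sin(rng.uniform(0, 2 * np.pi, n)) / 2)
    return np.where(a, B, A)
```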
Figure 6, Figure 7, Figure 8 and Figure 9 show the convergence behavior of the various algorithms in terms of the weight error power $\|\mathbf{w}(k) - \mathbf{w}_0\|^2$ under the different noises, where the parameter settings of the algorithms are summarized in Table 1. It can be seen clearly that the convergence performance of MCCC-VC is better than that of the other two algorithms in all cases.

5. Conclusions

The complex correntropy usually employs a Gaussian kernel whose center is at zero, which is not the best choice in many situations. To overcome this defect, this paper proposes the maximum complex correntropy criterion with variable center (MCCC-VC), extending the complex correntropy to the case where the center can be located anywhere. Furthermore, this paper proposes an effective method to optimize the center position and the kernel width. More significantly, we analyze the convergence and steady-state performance of MCCC-VC theoretically. The simulation results in Section 4 support the reliability of the theoretical analysis and show the excellent performance of MCCC-VC.

Author Contributions

Conceptualization, F.D., G.Q., and S.W.; methodology, F.D., G.Q., and S.W.; software, F.D., G.Q., and S.W.; validation, G.Q.; formal analysis, G.Q.; investigation, F.D., G.Q., and S.W.; resources, G.Q.; data curation, F.D. and G.Q.; writing—original draft preparation, F.D.; writing—review and editing, G.Q. and S.W.; visualization, F.D. and G.Q.; supervision, F.D., G.Q., and S.W.; project administration, G.Q.; funding acquisition, S.W. and G.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China under grants 61671389 and 61701419, and Fundamental Research Funds for the Central Universities under grant XDJK2019B011.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Haykin, S. Adaptive Filter Theory, 3rd ed.; Prentice Hall: New York, NY, USA, 1996.
2. Sayed, A.H. Fundamentals of adaptive filtering. IEEE Control Syst. 2003, 25, 77–79.
3. Chen, B.; Zhu, Y.; Hu, J.; Principe, J.C. System Parameter Identification: Information Criteria and Algorithms; Newnes: Oxford, UK, 2013.
4. Widrow, B.; McCool, J.M.; Larimore, M.G.; Johnson, C.R. Stationary and nonstationary learning characteristics of the LMS adaptive filter. Proc. IEEE 1976, 64, 1151–1162.
5. Kwong, R.H.; Johnston, E.W. A variable step size LMS algorithm. IEEE Trans. Signal Process. 1992, 40, 1633–1642.
6. Benesty, J.; Duhamel, P. A fast exact least mean square adaptive algorithm. IEEE Trans. Signal Process. 1992, 40, 2904–2920.
7. Diniz, P.S.R. Adaptive Filtering: Algorithms and Practical Implementation, 4th ed.; Springer: New York, NY, USA, 2013.
8. Pei, S.C.; Tseng, C.C. Least mean p-power error criterion for adaptive FIR filter. IEEE J. Sel. Areas Commun. 1994, 12, 1540–1547.
9. Al-Naffouri, T.Y.; Sayed, A.H. Adaptive filters with error nonlinearities: Mean-square analysis and optimum design. EURASIP J. Appl. Signal Process. 2001, 1, 192–205.
10. Erdogmus, D.; Principe, J.C. Generalized information potential criterion for adaptive system training. IEEE Trans. Neural Netw. 2002, 13, 1035–1044.
11. Principe, J.C. Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives; Springer: New York, NY, USA, 2010.
12. Sayin, M.O.; Vanli, N.D.; Kozat, S.S. A novel family of adaptive filtering algorithms based on the logarithmic cost. IEEE Trans. Signal Process. 2014, 62, 4411–4424.
13. Liu, W.; Pokharel, P.P.; Príncipe, J.C. Correntropy: Properties and applications in non-Gaussian signal processing. IEEE Trans. Signal Process. 2007, 55, 5286–5298.
14. Chen, B.; Xing, L.; Zhao, H.; Zheng, N.; Príncipe, J.C. Generalized correntropy for robust adaptive filtering. IEEE Trans. Signal Process. 2016, 64, 3376–3387.
15. Ma, W.; Qu, H.; Gui, G.; Xu, L.; Zhao, J.; Chen, B. Maximum correntropy criterion based sparse adaptive filtering algorithms for robust channel estimation under non-Gaussian environments. J. Franklin Inst. 2015, 352, 2708–2727.
16. Chen, B.; Xing, L.; Xu, B.; Zhao, H.; Zheng, N.; Príncipe, J.C. Kernel risk-sensitive loss: Definition, properties and application to robust adaptive filtering. IEEE Trans. Signal Process. 2017, 65, 2888–2901.
17. Mandic, D.; Goh, V. Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models (ser. Adaptive and Cognitive Dynamic Systems: Signal Processing, Learning, Communications and Control); Wiley: New York, NY, USA, 2009.
18. Shi, L.; Zhao, H.; Zakharov, Y. Performance analysis of shrinkage linear complex-valued LMS algorithm. IEEE Signal Process. Lett. 2019, 26, 1202–1206.
19. Guimarães, J.P.F.; Fontes, A.I.R.; Rego, J.B.A.; Martins, A.M.; Principe, J.C. Complex correntropy: Probabilistic interpretation and application to complex-valued data. IEEE Signal Process. Lett. 2017, 24, 42–45.
20. Guimarães, J.P.F.; Fontes, A.I.R.; Rego, J.B.A.; Martins, A.M.; Principe, J.C. Complex correntropy function: Properties, and application to a channel equalization problem. Expert Syst. Appl. 2018, 107, 173–181.
21. Qian, G.; Wang, S. Generalized complex correntropy: Application to adaptive filtering of complex data. IEEE Access 2018, 6, 19113–19120.
22. Qian, G.; Wang, S. Complex kernel risk-sensitive loss: Application to robust adaptive filtering in complex domain. IEEE Access 2018, 6, 60329–60338.
23. Guimarães, J.P.F.; Fontes, A.I.R.; da Silva, F.B. Complex correntropy induced metric applied to compressive sensing with complex-valued data. In Proceedings of the 2018 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI), Las Vegas, NV, USA, 8–10 April 2018; pp. 21–24.
24. Qian, G.; Luo, D.; Wang, S. A robust adaptive filter for a complex Hammerstein system. Entropy 2019, 21, 162.
25. Chen, B.; Wang, X.; Li, Y.; Principe, J.C. Maximum correntropy criterion with variable center. IEEE Signal Process. Lett. 2019, 26, 1212–1216.
26. Zhu, L.; Song, C.; Pan, L.; Li, J. Adaptive filtering under the maximum correntropy criterion with variable center. IEEE Access 2019, 7, 105902–105908.
27. Wirtinger, W. Zur formalen Theorie der Funktionen von mehr komplexen Veränderlichen. Math. Ann. 1927, 97, 357–375.
28. Bouboulis, P.; Theodoridis, S. Extension of Wirtinger's calculus to reproducing kernel Hilbert spaces and the complex kernel LMS. IEEE Trans. Signal Process. 2011, 59, 964–978.
29. Picinbono, B. On circularity. IEEE Trans. Signal Process. 1994, 42, 3473–3482.
Figure 1. Surfaces of maximum complex correntropy criterion with variable center (MCCC-VC) and MCCC.
Figure 2. Steady-state excess mean square errors (EMSEs) under various $\sigma_v^2$ (Gaussian distributed noise, $\eta_w = 3.8 \times 10^{-4}$, $\eta_\sigma = 4 \times 10^{-3}$).
Figure 3. Steady-state EMSEs under various $\eta_w$ (Gaussian distributed noise, $\sigma_v^2 = 1$, $\eta_\sigma = 4 \times 10^{-3}$).
Figure 4. Steady-state EMSEs under various $\sigma_v^2$ (Binary distributed noise, $\eta_w = 3.8 \times 10^{-4}$, $\eta_\sigma = 4 \times 10^{-3}$).
Figure 5. Steady-state EMSEs under various $\eta_w$ (Binary distributed noise, $\sigma_v^2 = 1$, $\eta_\sigma = 4 \times 10^{-3}$).
Figure 6. Convergence behavior of various algorithms (case 1).
Figure 7. Convergence behavior of various algorithms (case 2).
Figure 8. Convergence behavior of various algorithms (case 3).
Figure 9. Convergence behavior of various algorithms (case 4).
Table 1. Parameter settings of the different algorithms.

Algorithm    Parameters
MCCC         $\eta = 1 \times 10^{-3}$, $\sigma = 5$
MCKRSL       $\eta = 1.8 \times 10^{-4}$, $\sigma = 5$, $\lambda = 3$
MCCC-VC      $\eta_w = 4.8 \times 10^{-4}$, $\eta_\sigma = 4 \times 10^{-4}$, $\sigma(0) = 5$

Notes: $\eta$ and $\sigma$ denote the learning rate and kernel width for MCCC and MCKRSL, and $\lambda$ denotes the risk-sensitive parameter for MCKRSL. Moreover, $\eta_w$ and $\eta_\sigma$ denote the learning rates for the weight and kernel width of MCCC-VC, and $\sigma(0)$ denotes the initial kernel width of MCCC-VC.
