1. Introduction
Demand is growing for robust, adaptive, and accurate multi-sensor information filters (MSIF), which are widely applied in fields such as navigation systems, modern industry, military threat detection, target tracking, and remote sensing [1,2]. In recent years especially, many state estimation problems in tracking and navigation have involved nonlinear and uncertain processes [3]. Hence, the theory has been studied extensively and applied to many practical systems. A method with both adaptability and robustness cannot be realized in real time for a single sensor/observer system [4]. By using a multi-sensor structure, an information fusion algorithm can obtain much more accurate estimates than a single sensor [5,6]. However, to the best of our knowledge, few methods focus on both adaptability and robustness in MSIF. From a system point of view, there are mainly two ways to process the data from multiple sensors [7]. The first is the centralized filter, where all raw sensor data are fed to a central site for processing [1]. The second is the decentralized or distributed filter, where the processing is divided between several local filters, which run concurrently to obtain individual estimates from the raw data, and one master/center filter, which fuses those local estimates to provide a more accurate globally optimal estimate [6,8].
The centralized filter is also called measurement fusion, because all observations are fused directly to obtain a final estimate. Its main advantage is high accuracy, since no information is lost. In practice, however, the computational complexity of this architecture grows quickly as the number of sensors increases. Another drawback is the lack of robustness when one or more sensors fail.
In the decentralized structure, the computational load of the center filter is reduced because part of the computation takes place at the local filters; furthermore, fault detection and isolation are easier to implement. For these reasons, the parallel structure, which improves the reliability and fault tolerance of the sensors, has received more attention and has been implemented in many applications [1,8].
Frank et al. proposed estimators for multi-sensor systems with different failure rates [9]. They took the stochastic perturbations and the probability of sensor failure into account, so the robustness could be improved. However, the matrices that describe the direction of the perturbations are assumed to be known, which may not be the case in practice for many systems. Another study addressed the same problem by using Bernoulli random variables to describe the phenomenon of missing observations [10]. Unfortunately, the assumption that any failed sensor recovers after a specific $m$ time instants does not hold in many harsh environments.
Qiu et al. proposed a diagonal weighting matrix method for the fusion of local estimates [7]. However, this algorithm gains computational efficiency at the expense of accuracy [11]. Zhang et al. put forward a method to fuse the multi-sensor measurements sequentially [12]. In [13], a robust fusion filter for a multi-sensor system under energy harvesting constraints is proposed; by using the covariance intersection fusion strategy, a theoretical framework for discrete time-varying stochastic multi-sensor systems is established. Because many researchers have successfully used the decentralized structure to improve the performance of multi-sensor information fusion, this paper also adopts the decentralized structure [8].
For a local filter in the MSIF, there are several options. The Kalman filter is a well-established method for linear problems [14]. When problems become nonlinear, a set of improved forms can be adopted, of which the extended Kalman filter (EKF) and the unscented Kalman filter (UKF) are widely used. Using Taylor series expansions, the EKF linearizes the nonlinear models so that the standard Kalman filter procedure can be applied. The core drawback of the EKF is that it cannot achieve sufficient estimation accuracy for a maneuvering target, because the first-order approximation error can be large under strongly nonlinear conditions. Although much effort has been devoted to adaptive algorithms based on covariance estimation, as argued by Ge et al. [15], the EKF and its adaptive forms are still not optimal options for this reason. The generic particle filter (PF) and the cubature Kalman filter (CKF) are also well-known methods for nonlinear problems but, as mentioned in [16], their high computational cost might not be affordable in many applications.
The UKF, which is based on the unscented transformation (UT), can approximate the posterior mean and covariance of any Gaussian random variable with third-order accuracy [17]. To apply the UKF, the process error and the measurement noise should be taken into consideration, so many adaptive UKF methods have been proposed. Soken et al. corrected mismatches of the process noise covariance ($Q$-adaptation) with an adaptive UKF algorithm, and they applied the method to picosatellite attitude estimation under the condition that the process noise covariance may vary [18]. However, because only one scalar parameter is used to correct $Q$, the accuracy is limited to a certain extent. Similar to [18], Chang proposed a method with both adaptivity and robustness [4]. The method can deal with the condition that both the process and measurement noise covariances change at the same time, but not in real time. Meanwhile, the heuristic method has the same drawback as [4]: only one scalar is calculated for adapting $Q$ or $R$, where $R$ stands for the measurement noise covariance.
As argued in [10], if measurements contain only noise, they can be seen as outliers [6]. To estimate the noise covariance more accurately, many researchers have turned to adaptive UKF methods. Mohamed et al. developed an adaptive Kalman filter based on the maximum likelihood criterion for the proper choice of the filter weight. They argued that the method is efficient by adapting the matrices $Q$ and/or $R$ [19]. The basic idea of [19] is to adapt one of these matrices by innovation sequences and the other by residual sequences. However, the innovation and residual sequences obtained from the filter are not independent [15].
In [20], Yuan et al. proposed a PDR/UWB (Pedestrian Dead Reckoning and Ultra-Wide Band) integrated navigation algorithm based on an adaptively robust EKF. To obtain adaptability, the algorithm takes the positioning scene and the heading as constraints. The robustness of the algorithm is achieved by adopting an idea similar to [6]. However, in many other applications such as tracking and remote sensing, such constraints are not always available or are difficult to implement because of their great complexity and high computational cost [21]. In [22], to handle measurement outliers, the robust estimation reduces the weight of the observation; to handle the model error, an adaptive factor is introduced to balance the adverse effect. This algorithm has inspired many researchers. In [23], two novel quantitative nonlinear observability measures are proposed to obtain an optimal filter design. However, both [22] and [23] inevitably use state-dependent calculations to adjust the measured values or the weight matrix, so the same problem as in [20] exists. In [24], an adaptive filtering method was proposed. The authors used the residual error to construct a low-pass filter together with the process noise covariance, and they argued that high process noise could be effectively suppressed. However, if the measurement noise covariance is not estimated accurately and in a timely manner, the residual error will be inaccurate, and in the next iteration the algorithm alone may not be able to correct it. So, it is necessary to estimate the measurement noise covariance as accurately as possible. Moreover, it is preferable that the estimation of the measurement noise covariance be independent of the process noise and of the state estimate [15].
If the measurement noise covariance matrix $R$ could be estimated accurately, it would provide a solid foundation for a relatively accurate estimate of the matrix $Q$. Zhang et al. developed a measurement-based adaptive Kalman filtering algorithm (MAKF) that overcame the instability of the improved Sage–Husa adaptive filter for the integrated navigation system [25,26]. MAKF relies on the assumption that one of the measurement noise covariances is relatively smaller than the other, which does not always hold. Realizing this limitation, Zhang's group later developed an improved method named redundant measurement noise covariance estimation (RMNCE) [27], which can estimate the noise variance of the measurement and is not affected by the state estimation error. So, in this work, we use the RMNCE, which can deal with the unknown noise covariance in real time, to calculate the measurement noise covariance $R$ for each sensor.
In our proposed method, the measurement noise covariance matrix of each sensor is also calculated from innovation sequences, as described in [28]. We use the ratio between this innovation-based estimate and the RMNCE estimate as an indicator of whether there is non-negligible process error. Furthermore, if either the indicator or the chi-square hypothesis test suggests the existence of process error, the adaptation is triggered. To the best of our knowledge, the combination of the two criteria is first introduced in this paper.
For a given MSIF problem, without loss of generality, the statistical properties of the measurement noise are not reliable, even though they could be obtained in advance. So, we adopt the RMNCE method to estimate the variance of the measurement noise in the multi-sensor system. Additionally, taking all the above adaptive $Q$ estimation algorithms into consideration, a new $Q$ estimation algorithm based on both innovation and residual sequences is given, which is inspired by [16]. Finally, the decentralized architecture is used to fuse the estimates from the local sensors.
The contributions of the paper are twofold.
Firstly, an efficient algorithm is proposed for the MSIF to detect process error. It is based on an indicator derived from the RMNCE, combined with a hypothesis test on the Mahalanobis distance of the innovation sequences. Then, an innovative estimation algorithm is proposed that uses both innovation and residual sequences.
Secondly, to the best of our knowledge, the RMNCE algorithm together with a weighted factor is introduced into MSIF for the first time. To begin with, the measurement noise covariance obtained by RMNCE is used not only as the measurement noise covariance estimate of each sensor but also as the element from which the weighted factor is calculated. Then, a novel method is proposed to simplify the calculation of the weighted factor as an alternative to the optimal matrix weights in [1]. Moreover, an indicator based on the RMNCE is proposed to detect whether process error exists. Finally, the simulation results demonstrate that the proposed scheme can increase the tracking precision with both adaptivity and robustness.
The remainder of this paper is organized as follows. Section 2 describes the standard UKF, the RMNCE, the adaptive UKF (AUKF) proposed in this paper, and the decentralized MSIF. Section 3 introduces the adaptive and robust UKF algorithm for multi-sensor information fusion (ARUKF-MSIF). Section 4 provides the simulations and discussion. Section 5 draws the conclusions.
2. The Decentralized MSIF and the Proposed Adaptive UKF
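Consider a discrete-time nonlinear stochastic system with $l$ sensors. The process and measurement models can be described as
$$x_k = f_{k-1}(x_{k-1}) + \Gamma_{k-1} w_{k-1}, \qquad z_k^i = h_k^i(x_k) + v_k^i, \quad i = 1, \dots, l \tag{1}$$
where $x_k$ denotes the state vector, $z_k^i$ is the measurement collected by sensor $i$ at sampling time instant $k$, $w_k$ and $v_k^i$ are uncorrelated zero-mean Gaussian white noise with compatible dimensions [4,16,25,26,28,29], $f_k(\cdot)$ and $h_k^i(\cdot)$ are the known time-varying nonlinear state transition and measurement functions, respectively, and $\Gamma_k$ is the system noise-driven matrix with compatible dimensions.
The statistical properties assumed for the noise processes can be summarized as [4]
$$E[w_k] = 0, \quad E[w_k w_j^T] = Q_k \delta_{kj}, \quad E[v_k^i] = 0, \quad E[v_k^i (v_j^i)^T] = R_k^i \delta_{kj}, \quad E[w_k (v_j^i)^T] = 0 \tag{2}$$
where $Q_k$ and $R_k^i$ are the process and measurement noise covariance matrices, and $\delta_{kj}$ denotes the Kronecker delta function.
The following assumptions are also made on the initial values:
$$\hat{x}_0 = E[x_0], \qquad P_0 = E[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T] \tag{3}$$
where the initial state $x_0$ is independent of $w_k$ and $v_k^i$, and $P_0$ is the initial estimation error covariance matrix.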
In the following Section 2.1, the procedure of the standard UKF is briefly reviewed. In Section 2.2, the RMNCE is introduced. To improve the adaptivity of the estimation, the process error should be detected in a timely manner, so an innovative adaptive UKF is proposed in Section 2.3. In Section 2.4, the structure of the MSIF adopted in our work is given.
2.1. The Standard UKF
The UKF builds on the idea that it is easier to approximate a probability distribution than to approximate an arbitrary nonlinear function [29]. In the standard UKF, the UT, a deterministic sampling technique, is used to generate sigma points, pass them through the nonlinear transformation, and calculate the first two moments of the transformed set. For the sake of simplicity, only one sensor of (1) is taken into consideration, and the general procedure is as follows:
Step 1: Initialization.
where
is the initial state,
is the initial estimation error covariance.
Step 2: Sigma points generation.
where
denotes the dimension of the state;
is the composite scaling parameter that is used for fine tuning,
is set to
and a good default setting on
is
[
15].
Step 3: State prediction.
where
is the predicted state mean; and
is the predicted state covariance.
and
are weights, which are defined as
where
is used to incorporate the higher order information of the distribution, according to [
15], for Gaussian distribution
[
29].
Step 4: Observation prediction.
Step 5: Kalman gain calculation.
Step 6: State estimation and error covariance matrix update.
Step 7: Iterate from steps 2 to 6 until all samples are completed.
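The sigma-point mechanics of Steps 2 and 3 can be illustrated with a minimal Python sketch for a scalar state. The function name `unscented_transform_1d` and the parameter values `alpha=1.0`, `beta=2.0`, and `kappa=2.0` are illustrative assumptions, not the settings used in this paper; the weights follow the commonly used scaled UT form. For a linear map the UT reproduces the mean and variance exactly, which makes the result easy to check.

```python
import math


def unscented_transform_1d(mean, var, f, alpha=1.0, beta=2.0, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using the UT.

    alpha, beta, and kappa are illustrative defaults (assumptions).
    """
    n = 1  # state dimension of this toy example
    lam = alpha ** 2 * (n + kappa) - n  # composite scaling parameter
    spread = math.sqrt((n + lam) * var)
    # 2n + 1 = 3 sigma points: the mean and one point on each side of it.
    sigma_points = [mean, mean + spread, mean - spread]
    # Weights for the mean and for the covariance.
    w_mean = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * 2
    w_cov = [w_mean[0] + (1.0 - alpha ** 2 + beta)] + w_mean[1:]
    # Transform the sigma points and recover the first two moments.
    y = [f(x) for x in sigma_points]
    y_mean = sum(w * yi for w, yi in zip(w_mean, y))
    y_var = sum(w * (yi - y_mean) ** 2 for w, yi in zip(w_cov, y))
    return y_mean, y_var


# Linear test map: the exact result is mean 2*1+1 = 3 and variance 4*4 = 16.
m, v = unscented_transform_1d(1.0, 4.0, lambda x: 2.0 * x + 1.0)
```

In the filter itself, the process noise covariance would still be added to the propagated covariance in the prediction step.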
Under the condition of time-varying process error and measurement noise covariance, we can infer that if the process noise covariance in (6) and the measurement noise covariance in (9) cannot be estimated in a timely manner, inaccurate estimates will result, because the state estimate in (10) depends on the Kalman gain, which in turn depends on these covariances.
2.2. Adaptive Estimation
As proved by Li et al. in [
27], for simplicity, assuming
and
are independent redundant measurements of a signal
from two sensors, they can be modeled as
where
denotes the true value of
,
and
are steady items of the measurement errors,
and
are uncorrelated zero-mean random white noise.
The first-order-self-difference (FOSD)
,
and the second-order-mutual-difference (SOMD)
are defined as
Under the condition that the sampling interval is short enough,
. The covariance of the random noise for measurement
and
can be estimated as
The mathematical expectations in (13) are calculated as follows, because the statistical characteristics are stable over a relatively short period.
where
and
can be calculated by
where
is the sliding window width which can be empirically set to 30~60 [
25].
In practice, in order to capture the transient behavior of the variation of
in time while considering the smoothness, a fading memory calculation is implemented as [
30]
where
is the fading factor and in our work is set to 0.980,
represents the direct output in (13) at time
,
is the final covariance matrix.
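The RMNCE idea can be sketched in a few lines of Python for scalar measurements. The closed-form expressions in `rmnce_window` are an assumption derived from the definitions above: for white, mutually uncorrelated noise, the mean square of the SOMD equals twice the sum of the two variances, and the difference of the two FOSD mean squares equals twice their difference. They are meant to mirror (12) and (13), not to quote them. The blending rule in `fading_update`, the function names, and the simulated signal are likewise illustrative.

```python
import math
import random


def _mean_square(values):
    return sum(v * v for v in values) / len(values)


def rmnce_window(z1, z2):
    """Estimate the noise variances of two redundant scalar measurement
    sequences from their differences over one window."""
    # First-order self-differences (FOSD): constant biases cancel.
    d1 = [z1[k] - z1[k - 1] for k in range(1, len(z1))]
    d2 = [z2[k] - z2[k - 1] for k in range(1, len(z2))]
    # Second-order mutual difference (SOMD): the true signal cancels too.
    d12 = [a - b for a, b in zip(d1, d2)]
    e1, e2, e12 = _mean_square(d1), _mean_square(d2), _mean_square(d12)
    # Assumed closed form: E[d12^2] = 2*R1 + 2*R2, E[d1^2] - E[d2^2] = 2*R1 - 2*R2.
    r1 = (e12 + e1 - e2) / 4.0
    r2 = (e12 - e1 + e2) / 4.0
    return r1, r2


def fading_update(r_prev, r_direct, fading=0.98):
    """Assumed fading-memory form: blend the previous and direct estimates."""
    return fading * r_prev + (1.0 - fading) * r_direct


# Hypothetical demo: a slowly varying signal seen by two biased, noisy sensors.
random.seed(0)
truth = [math.sin(0.001 * k) for k in range(20000)]
z1 = [t + 0.5 + random.gauss(0.0, 2.0) for t in truth]  # true variance 4.0
z2 = [t - 0.3 + random.gauss(0.0, 1.0) for t in truth]  # true variance 1.0
r1_hat, r2_hat = rmnce_window(z1, z2)
r1_smoothed = fading_update(r_prev=4.2, r_direct=r1_hat)
```

The demo uses one long window only to keep the estimates stable; with a sliding window of 30 to 60 samples, the direct output of each window would be noisier and would be smoothed with `fading_update`.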
2.3. The Proposed Adaptive UKF Based on the Mahalanobis Distance of the Innovation Vector
As emphasized by many researchers [15,17,31], the parameters of the system model and the distributions of the measurement noise and the process noise cannot be assumed constant at all times in practice. To prevent the estimate from deteriorating or even diverging because of model error, it is vital to detect and correct the mismatch between the real process error and the preset parameters. In our proposed method, based on hypothesis testing theory [14], the Mahalanobis distance of the innovation vector is employed as a criterion to identify whether system modeling error exists [17].
For the sake of simplicity, only one sensor of (1) is taken into consideration; thus, the sensor superscript is dropped. This paper develops a new method, based on the Mahalanobis distance, to improve the adaptability of the classical UKF against process model error. However, if the measurement noise covariance matrix is also contaminated, a single sensor cannot cope with this case. Thanks to the RMNCE, in MSIF the measurement noise covariance can be estimated, and the estimate is immune to state estimation errors.
Define the innovation sequence
according to [
15] as
Then
should be zero-mean Gaussian-distributed with covariance
[
19],
; the square of the Mahalanobis distance of the innovation should be
distributed [
4],
where
is the degree of freedom.
Based on the hypothesis testing theory, let
be the given significance level, and then we have
where
stands for the probability of a random event,
denotes the
-quantile of the distribution
.
If (20) does not hold, it can be deduced with high probability that there exists process-error in the system (1), assuming that the observations are within reasonable bounds. Because there is more than one sensor in the system, (20) should be calculated for each sensor.
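The test in (20) can be sketched for a two-dimensional innovation as follows. The function names are hypothetical, and the innovation covariance is taken as given. The sketch relies on one fact specific to two degrees of freedom: the chi-square distribution function is $1 - e^{-x/2}$, so the threshold exceeded with probability $\alpha$ is $-2\ln\alpha$. For other dimensions, a table or a statistics library would be needed.

```python
import math


def mahalanobis_sq_2d(innovation, cov):
    """Squared Mahalanobis distance e^T S^-1 e for a 2-D innovation e."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    e0, e1 = innovation
    # Closed-form inverse of the 2 x 2 covariance applied to the innovation.
    y0 = (d * e0 - b * e1) / det
    y1 = (-c * e0 + a * e1) / det
    return e0 * y0 + e1 * y1


def chi2_threshold_2dof(alpha):
    """Value exceeded with probability alpha by a chi-square variable
    with two degrees of freedom."""
    return -2.0 * math.log(alpha)


def process_error_suspected(innovation, cov, alpha=0.05):
    """True when the squared distance exceeds the chi-square threshold."""
    return mahalanobis_sq_2d(innovation, cov) > chi2_threshold_2dof(alpha)


identity = [[1.0, 0.0], [0.0, 1.0]]
small = process_error_suspected([1.0, 1.0], identity)  # distance^2 = 2
large = process_error_suspected([3.0, 3.0], identity)  # distance^2 = 18
```

In a multi-sensor setting, the same check would simply be repeated with each sensor's own innovation and innovation covariance.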
Unlike [4,17], which rely on a single scalar parameter, we present a new algorithm that uses both innovation and residual sequences to correct the process noise covariance matrix $Q$ directly with a diagonal matrix. The procedure is as follows.
According to [
15], the residual
can be defined as
According to [
18], based on the principle of orthogonality [
12], the residual sequences are uncorrelated from the measurements. So, the following equations should hold:
If
is replaced by its definition in (10), then
The traces of both sides should be equal:
where
, and
is the predicted covariance without the additive process noise. If (23) does not hold, there will be some change in matrix
. So, one scalar called the adaptive fading factor is generated based on the Equation (23) [
16]:
Instead of calculating a single scalar to tune the matrix $Q$, much higher accuracy can be obtained by using a diagonal matrix with one tuning factor per dimension of $Q$.
So, an innovative adaptive method is proposed as
To obtain (28), firstly, the relationship of , and are given; then the matrix is solved but not in the form of the trace of a matrix.
The covariance of innovation is:
The covariance of difference sequences between innovation and residual is
A deep look should be taken at
and
before calculating (30)
where
and
represent the prediction error and the estimation error, respectively.
The first two items in the right side of (30) can be derived based on (31) as
Because the sample frequency is high, the Jacobian matrices , and
, Ge et al. gave the proof in [
15]
Then, based on (32) and (33), namely,
According to (29), replacing
by
in (25) can be rewritten without trace calculation based on (32), (33), and
Hence, (28) is obtained by rearranging (35). Normally, the covariance equations stacked above are calculated as
in which
is the windows size,
, and
.
Remark 1. In Equation (28), the inverse matrices ofandare required to be calculated. In practice,is normally a positive definite diagonal matrix, butmay be not an invertible matrix, or not even a square matrix. Considering thatis a diagonal matrix, we can obtain the elements by solving the following equation:
Remark 2. In Equation (37),andare the Jacobian matrices. They normally can be considered as the slope at. During the course of numerical calculation,andmay be very small, so one or more elements ofwill be large enough to causeto be a negative definite in the next step. To solve this problem, a threshold value is used to limit the element of. Meanwhile, ifis not positive in numerical applications, it is always reset to the absolute value of its estimate[28].
Remark 3. In (28) or (37), the sum of the first three terms on the right-hand side may be much greater than the last term, because the assumption that the innovation and residual sequences are orthogonal holds only in a statistically ideal sense. In other words, for a finite sample, (36) is commonly biased [28]. However, in many real-time or online systems, the sliding window cannot be made large enough to satisfy the assumption. So, a tradeoff must be made between numerical stability and real-time performance.
2.4. The Decentralized MSIF
Because of the high computational cost of high-dimensional matrices and the low stability when some measurements are abnormal [10], i.e., when one or more sensors provide information with large noise or only noise, the centralized architecture is not widely adopted. This work adopts a decentralized MSIF structure similar to [1], which is derived from [32]. For simplicity, the time instant
in
is dropped in this section [
1].
Let
be unbiased estimators of
in (1) and the estimation errors be
. Assume that
and
are correlated, the covariance and cross covariance are
and
, respectively. The optimal fusion estimator
with matrix weights can be described as
where the optimal matrix weights
are to be determined.
The globally optimal information fusion Kalman filter
of the state
, based on the principle of linear minimum variance, will satisfy the following conditions [
1,
17]:
Condition 1. must be the unbiased estimation of,namely, .
Condition 2. makesminimum, in whichis the error covariance matrix of.
For simplicity, we denote
. Our aim is to find
to construct the unbiased estimator
where
are arbitrary matrices.
If
is an unbiased estimator for
, the following condition should be fulfilled,
Taking the expectation of both sides of (38) yields,
Based on (38) and (42), we get the fusion estimation error
The error variance matrix of the fusion estimator is
where
is a symmetric positive definite matrix.
Now the problem is converted to a classic one: under the constraint of (42), to solve the minimum of
by applying the Lagrange multiplier
where
is the Lagrange multiplier,
; and
represents an
-dimensional identity matrix.
By setting
, and noting that
, we have
By substituting (42) into (46)
Because
is a symmetric positive definite matrix,
is nonsingular. Based on the matrix theory, (47) can be solved
From (48), the matrix
should be calculated to obtain the global optimal state estimation. The diagonal elements
in the matrix
can be directly calculated by the error covariance matrix of the state estimation in the
local filter. However, the cross-covariance matrix is difficult to obtain. In our work, we use the result given by Gao et al. based on the UT [17].
where
is the sigma point transformed by the nonlinear function
in (6) for the
sensor;
represents the sigma points transformed by the nonlinear function
in (8); and
denotes the order of the transformed sigma points.
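For intuition, the linear-minimum-variance weights can be worked out for the simplest case of a scalar state and two local estimators. The sketch below assumes the usual solution of the constrained minimization, in which the stacked weights are the inverse of the joint error covariance applied to a stack of identity matrices and then normalized so that the weights sum to the identity. For two scalar estimates this reduces to the closed-form expressions in the code. The function name and the numbers are illustrative.

```python
def fuse_two_scalar_estimates(x1, x2, p11, p22, p12=0.0):
    """Fuse two unbiased scalar estimates with error variances p11 and p22
    and error cross-covariance p12 (linear minimum variance weights)."""
    denom = p11 + p22 - 2.0 * p12
    if denom <= 0.0:
        raise ValueError("joint error covariance must be positive definite")
    # The weights sum to one, which keeps the fused estimate unbiased.
    w1 = (p22 - p12) / denom
    w2 = (p11 - p12) / denom
    fused = w1 * x1 + w2 * x2
    fused_var = (p11 * p22 - p12 ** 2) / denom
    return fused, fused_var, (w1, w2)


# Uncorrelated case: the weights are inversely proportional to the variances.
fused, fused_var, weights = fuse_two_scalar_estimates(10.0, 20.0, 1.0, 4.0)
# Correlated case: the cross-covariance shifts weight to the better sensor.
_, corr_var, corr_weights = fuse_two_scalar_estimates(10.0, 20.0, 1.0, 4.0, p12=0.5)
```

In the uncorrelated case the fused variance (0.8) is smaller than either local variance, which is the benefit that the fusion step is designed to deliver.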
Remark 4. In order to maintainas a positive definite, some exceptions should be taken into consideration. Firstly, Equation (42) should be examined immediately whenis fulfilled by. This step is necessary because there may be truncation errors during calculation. If the constraint defined by (42) is not satisfied, set.
Secondly, in MSIF, the optimal estimation
theoretically lies in the closed interval:
where
denotes the 2-norm. If (50) is not maintained, the following degenerative methods (52) or (53) can be used:
3. The Proposed Method with Both Adaptivity and Robustness
In Section 2, we proposed an algorithm to adapt $Q$ when process error exists. In this section, the criterion adopted for process error detection, which is based on hypothesis testing theory, is given [4,17]. However, the drawback of the methods proposed in [4] and [17] is that $Q$ and $R$ cannot be estimated at the same time. This creates a dilemma about which one should be adapted: $Q$, $R$, or both? To overcome this challenge, the RMNCE is employed to estimate the noise covariance of each sensor, and then a decision is made as to whether the matrix $Q$ suffers from gross errors. With RMNCE, this decision becomes easier in MSIF because the estimates of $R$ are obtained with relatively high accuracy.
Briefly, in our proposed MSIF architecture, the measurement noise covariance matrix $R$ of each sensor is estimated by RMNCE; meanwhile, the process error can be corrected by adapting $Q$ if necessary. For the sake of simplicity, only one sensor of (1) is taken into consideration, so the sensor superscript is dropped.
3.1. Robust Estimation Based on RMNCE
As described in Section 2.2, the main advantage of RMNCE is that the variance estimate is based only on measurements and hence is immune to the state estimation error [27]. So, the measurement noise covariance calculated by (17) can be considered as the benchmark to test whether process error exists. Let
be calculated by the following equation [
28]
The difference between
and
could be large if process error exists. The quotient is used as an indicator to detect the process error as
where
and
denote the trace of
and
, respectively.
When the quotient is around 1.0, it could be deduced with great probability that there is no process error. In this paper, a closed interval from 0.90 to 1.10 is used as the normal range. Otherwise, if the indicator is larger than 1.10 or smaller than 0.90, there is process error with great probability.
Remark 5. If the diagonal elements differ by more than one order of magnitude, it is better to calculate the indicator separately. For example, the matrices and are as
Although the indicator calculated by (54) belongs to the closed interval, , it is obvious that the second element in the diagonal differs 10 times in than .
Remark 6. The sliding window width for calculatingandshould be the same. In this paper, it is set to 50.
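The indicator logic can be sketched as follows. Which estimate forms the numerator is an assumption here (the innovation-based estimate is divided by the RMNCE estimate), and the function names and numbers are hypothetical. The example also shows why a quotient of traces can hide a mismatch that is confined to one small diagonal element.

```python
def trace_indicator(r_innovation_diag, r_rmnce_diag):
    """Quotient of the traces of two diagonal covariance estimates."""
    return sum(r_innovation_diag) / sum(r_rmnce_diag)


def elementwise_indicators(r_innovation_diag, r_rmnce_diag):
    """One quotient per diagonal element."""
    return [a / b for a, b in zip(r_innovation_diag, r_rmnce_diag)]


def in_normal_range(value, low=0.90, high=1.10):
    """Closed interval [0.90, 1.10] treated as 'no process error'."""
    return low <= value <= high


# Hypothetical diagonals whose elements differ by two orders of magnitude.
r_innov = [100.0, 1.0]
r_rmnce = [100.0, 10.0]
overall = trace_indicator(r_innov, r_rmnce)             # 101 / 110
per_element = elementwise_indicators(r_innov, r_rmnce)  # [1.0, 0.1]
flags = [in_normal_range(q) for q in per_element]
```

Here the overall quotient stays inside the normal range although the second diagonal element is off by a factor of ten, so a per-element check is the safer choice when the diagonal elements have very different magnitudes.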
In our MSIF architecture, we also proposed the following matrix weights as an alternative to (48).
Consider a MSIF system with three sensors, the noise covariance matrices
could be obtained by (13). Because the noise covariance matrices are usually diagonal, the following equation is employed to calculate the
where
is the dimension of
.
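Equation (56) is not reproduced in the sketch below. Instead, it shows a generic per-dimension inverse-variance weighting, which is an assumption chosen only because it shares the key property of such weights: the smaller a sensor's noise variance, the larger its weight. The function names and the three hypothetical sensors are illustrative.

```python
def inverse_variance_weights(r_diags):
    """Per-dimension weights for each sensor from diagonal noise variances.

    r_diags[i][j] is the j-th diagonal element of sensor i's noise
    covariance. For every dimension j the weights sum to one over sensors.
    """
    n = len(r_diags[0])
    totals = [sum(1.0 / diag[j] for diag in r_diags) for j in range(n)]
    return [[(1.0 / diag[j]) / totals[j] for j in range(n)] for diag in r_diags]


def fuse_estimates(estimates, weights):
    """Weighted sum of the local estimates, dimension by dimension."""
    n = len(estimates[0])
    return [sum(w[j] * x[j] for w, x in zip(weights, estimates)) for j in range(n)]


# Three hypothetical sensors with the same measurement model.
r_diags = [[1.0, 1.0], [2.0, 2.0], [4.0, 4.0]]
weights = inverse_variance_weights(r_diags)  # 4/7, 2/7, 1/7 per dimension
fused = fuse_estimates([[7.0, 0.0], [14.0, 0.0], [28.0, 0.0]], weights)
```

Because only diagonal elements are used, no matrix inversion is needed, which is the kind of simplification that motivates replacing the optimal matrix weights.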
Remark 7. In our proposed MSIF structure, all the sensors have the same measurement model, so all thehave the same dimension. If a MSIF system has different sensors, the variation of (56) should be derived specifically. However, the core idea is that the smalleris, the greater weightis. A degenerated case is that if one or more elements ofare observed only by one sensor, assuming sensor, thenwheredenotes the number of elements inthat are only observed by sensor. Hence, this is not a MSIF anymore, and this paper would not discuss this case any further.
Remark 8. When there aresensors in a MSIF system, there are basically two options to utilize (13):
Option 1. To calculate two matricesone time, so (13) and the relative equations will be runtimes at least.
Option 2. To calculate all matricesone time, then the classical least square method is used to solve the overdetermined equations, which are derived by the variation of (13). In this paper, Option 1 is adopted.
3.2. The Adaptive and Robust UKF Algorithm for MSIF (ARUKF-MSIF)
In this subsection, the complete scheme is given. The proposed ARUKF-MSIF algorithm (Algorithm 1), aimed at target tracking, can be implemented as follows.
Algorithm 1. ARUKF-MSIF algorithm
Step 1: State prediction through (5)–(7) for each sensor.
Step 2: Observation prediction through (7) and (8) for each sensor.
Step 3: Estimate the measurement noise covariance matrix $R$ for each sensor through (12) to (17).
Step 4: Process-error judgment through (20) and (54).
Step 5: Abnormal innovation distinguishing.
5.1 If (20) and (54) hold:
5.1.1 Go to step 6.
5.2 Else:
5.2.1 Adapt $Q$ through (28).
Step 6: Calculate Kalman gain and filtering through (9) and (10).
Step 7: MSIF implementation.
7.1 Calculate the cross-covariance matrices through (49).
7.2 Calculate matrix weights through (48) or (56), and generate the optimal estimate.
Step 8: For the next iteration, repeat steps 1 to 7.
The framework of the proposed ARUKF-MSIF methodology is shown in Figure 1. It has a two-layer fusion structure: the local layer and the globally optimal layer.
Based on RMNCE, the measurement noise covariance of each sensor is estimated. Every sensor estimates the states independently in the local layer. If any sensor subsystem detects process error through the chi-square test, or if the indicator proposed in this paper suggests the existence of process error, the proposed $Q$-adaptation algorithm is employed to correct this mismatch. The globally optimal layer is the final fusion center, where the optimal matrix weights are determined.