1. Introduction
With the rapid development of aerospace technology, space has become increasingly important in the civil and military fields and is now a strategic fulcrum for competition among countries. Space technology is an essential basis for developing and utilizing space resources and safeguarding national security. Satellites are among the most important tools for human exploration and space applications. Because satellites are expensive to build, launch, and maintain in orbit, a failure causes considerable losses. In recent years, on-orbit fault diagnosis technology has made remarkable progress, which has effectively improved the on-orbit operating status and service life of satellites. Fault diagnosis of spacecraft has attracted extensive attention from scholars, and many valuable research results have emerged in both the theoretical and engineering fields [1,2,3,4,5,6,7,8,9,10,11]. Research on fault detection and diagnosis [1,2,3,4,5,6], reconfigurability analysis [7,8,9], life or fault prediction [10,11,12,13], and fault tolerance [14,15,16,17,18,19] improves the security and reliability of satellite on-orbit operation from different perspectives.
Fault diagnosis is a significant means of achieving high-precision, long-life, and high-reliability on-orbit operation of satellite systems. However, existing research has mainly focused on the design of diagnostic algorithms, and there have been few studies on fault diagnosability. According to the definition of fault diagnosability [20], diagnosability includes detectability and isolability. Diagnosability is the key to fast and accurate fault detection and isolation in dynamic systems, and it is also the basis of the design of fault diagnosis methods. Designing fault diagnosis algorithms without first analyzing the diagnosability of faults consumes personnel and material resources and may not achieve satisfactory results. Therefore, research on fault diagnosability is of great significance.
Most of the existing diagnosability research on general control systems only considers linear system faults [21,22,23]; research on nonlinear systems and nonlinear faults can usually only achieve qualitative evaluation [24,25]. There have been relatively few studies on the diagnosability of satellite systems. For example, Liu [26] studied the fault diagnosability of a satellite momentum wheel by checking whether the transfer function from fault to output was zero and whether the transfer functions of different faults differed. In [27], Li and Wang used the distribution probability of fault vectors and the cosine similarity between different fault vectors to design a quantitative evaluation method for diagnosability based on directional similarity, which was verified on a satellite attitude determination system. In [28], the authors described a satellite control system as a class of affine nonlinear models and used the invariant minimum dual distribution to give a diagnosability evaluation, which included qualitative and quantitative evaluation indexes of detectability and isolability.
This paper proposes a method of system diagnosability evaluation based on information geometry theory. The diagnosability problem of the system is described as a distance judgment problem for multivariate distributions in statistics, and the Fisher information distance is used to quantify the diagnosability of system faults. A fault detectability index and a fault isolability index are designed. The designed indexes have explicit and intuitive geometric significance, and they are true distance measures without the problem of index asymmetry. The principle and algorithmic procedure of the method for diagnosability judgment are given. Taking the satellite attitude determination system as an example, the soundness and effectiveness of the proposed method, as well as its superiority in analyzing nonlinear faults, are verified. The fault information and properties contained in the fault manifold geodesics are examined in depth, and information geometry theory is used to analyze some essential problems in the process of fault development and evolution and to lay the foundation for future research on efficient fault detection, diagnosis, design, and optimization methods.
2. Mathematical Description of the Problem
A satellite attitude determination system is the basis for the attitude control of a satellite. Its task is to process information measured by the attitude sensors to obtain the attitude of the satellite body coordinate system relative to the orbit coordinate system [29,30]. The most commonly used attitude sensors include star sensors, sun sensors, Earth sensors, and gyroscopes [30,31].
A model of satellite dynamics can be equivalently transformed into three independent axes: the roll, pitch, and yaw. The pitch angular velocity of a satellite is a fast variable in comparison with the other two axes. In this paper, a satellite pitch-axis attitude determination system based on an “infrared Earth sensor + gyroscope” is taken as an example for analysis. The infrared Earth sensor and the gyroscope are used to measure the attitude angle and angular rate of the satellite, respectively, and their combined use can achieve high-precision attitude determination. Because the attitude angle/angular velocity of the satellite’s pitch axis (Y-axis) is decoupled from the roll and yaw axes (X-axis and Z-axis), and in order to reduce the model’s dimensions and simplify the problem, the discrete-form state-space model of the satellite attitude determination system on the pitch axis is given in Equation (1).
The mathematical model of the satellite attitude determination system of the infrared Earth sensor + gyroscope is expressed as above, where and represent the attitude angle and orbital angular velocity of the satellite, respectively; represents the pitch-axis angular rate of the satellite, which is the output of the gyroscope; and are the exponentially correlated drift and constant drift of the gyroscope that are caused by the small amount of torque interference on it; is the time constant of the gyroscope. stands for the fault vector, and // are the noises of the gyroscope’s output and two drifts, respectively, which are all in the form of Gaussian white noise. and represent the fault vector and Gaussian white noise of the measured attitude angle output by the infrared Earth sensor, respectively. represents the sampling time interval. The values of the related parameters are: s, and the time constant (dimensionless); the orbital angular velocity , and the Gaussian white noises , , , . Here, ∼N is a Gaussian distribution symbol, and the distribution’s parameters are shown in the parentheses.
Equation (1) can be simplified as follows:
In fact, state x, input u, and output y may be influenced by the coupling of interferences w and v, as well as by fault f. In order to separate the influence of state x and decouple the fault from the interference, the following equation was constructed in [32].
where
We pre-multiply both sides of the equation by the matrix
:
where
, and
on the left side of the equation is the dynamic behavior of the system. Since the equivalent space transformation does not affect the solution of the system, Equation (4) can be used to describe the dynamic behavior of the attitude determination system shown in Equation (1).
on the right side of the equation is the fault vector, which is composed of the direction matrix
and the fault vector
f.
is the interference vector and has a dynamic behavior.
is a multivariate distribution composed of the fault vector and the interference vector.
Therefore, the detectability and isolability of the dynamic system faults shown in Equation (1) can be quantified by measuring the similarity or difference in the multivariate distribution when no fault occurs or when different faults occur according to the corresponding criterion (such as the distance similarity or directional similarity).
3. Quantitative Diagnosability Evaluation Based on the Fisher Information Distance
Parameter vectors usually exist in an abstract manifold, and the manifolds corresponding to a real system usually have a complex topology. Consider the probability distribution parameter family , where x is a random variable, is an n-dimensional parameter vector with a particular distribution, and S is a statistical manifold with the (local) coordinate system . For a particular that belongs to the parameter space , the measured and observed value of x belongs to the sampling space ; then, each corresponds to an actual probability distribution, that is, each probability distribution corresponds to a point on the statistical manifold S.
Take the vector
containing the fault information as the mean value
of the fault manifold and take the interference vector
as the variance
of the fault manifold, namely:
The values of coefficient matrices
in Equation (
1) are known; now, only
needs to be determined, and
is the left orthogonal basis for the null space of
H, which means that
. The matrix H is determined by the system matrix A and the output matrix C of Equation (1) [32]. Extending the system to a full-dimensional observable system, let
; then,
Different values of the parameter
represent different types of faults. Considering the nonlinearity caused by the sensor measurement conversion, the settings in this paper are as follows:
The manifold of a fault system is obtained, and the fault probability distribution and parameterized probability distribution family constitute an n-dimensional statistical manifold. Here, is the distribution parameter (and the fault parameter vector), is the mean value, and is the variance. Different faults have different and in this manifold. In this statistical manifold, is the coordinate system and is global.
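The left null-space basis used in the pretreatment step can be computed numerically. The following is a minimal sketch, not the authors' implementation: the function name `left_null_space`, the label `N_H`, and the example matrix `H` are all assumptions made for illustration, and the numbers in `H` are not taken from the satellite model.

```python
import numpy as np

def left_null_space(H, tol=1e-10):
    """Return N with orthonormal rows such that N @ H is numerically zero."""
    # Full SVD: H = U @ diag(s) @ Vt, where U has shape (m, m).
    U, s, _ = np.linalg.svd(H, full_matrices=True)
    # Numerical rank: count singular values above a relative tolerance.
    rank = int(np.sum(s > tol * s.max()))
    # Columns of U beyond the rank span the left null space of H.
    return U[:, rank:].T

# Hypothetical 4x2 matrix standing in for H (illustrative values only).
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, -1.0]])

N_H = left_null_space(H)
residual = N_H @ H  # should vanish up to rounding error
```

Pre-multiplying a stacked equation by such a matrix removes every term that enters through `H`, which is how the state is separated from the fault and interference terms in this kind of construction.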
In statistical manifolds, the Fisher information matrix (FIM) is the unique Riemannian geometric metric tensor for the parameterized probability distribution family [33], and it is expressed as
. The FIM is given by the following:
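For reference, the standard textbook form of this definition is sketched here with assumed symbols: p(x | θ) for the probability density, θ_i for the i-th coordinate of the parameter vector, and g_ij(θ) for the (i, j) entry of the FIM.

```latex
g_{ij}(\theta) \;=\; E\!\left[
  \frac{\partial \ln p(x \mid \theta)}{\partial \theta_i}\,
  \frac{\partial \ln p(x \mid \theta)}{\partial \theta_j}
\right], \qquad i, j = 1, \dots, n
```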
where
E denotes the mathematical expectation. As the parameter
approaches
, the FIM measures the ability to distinguish between two adjacent parameters
and
from the data
x. This equation can be rewritten in parametric form as:
According to Equation (12), the information metric
of the attitude determination system studied under the fault set is given by Equation (13):
On a manifold, since space is curved, to define the distance between two points on the manifold, the length of the curve connecting the two points on the manifold should first be defined.
The differential distance between two points (or two distributions)
and
on a manifold can be expressed by the metric:
Considering that
is the curve connecting
and
, this curve can be described as a parametric equation with a single free parameter
t. The distance between the distributions
and
on the statistical manifold is defined by the curve
[34]:
where ≜ means “is defined as”. This integral distance depends on the choice of the curve
. Generally, the minimum length over all curves connecting the two distributions is defined as the distance between them, and it is called the Fisher information distance (FID). The Fisher information distance between the distributions
and
is expressed as [35]:
The curve with the smallest length defined above is the geodesic that connects the two points on the manifold, and the Fisher information distance is the length of the shortest geodesic connecting the two points. Geodesics generalize the straight lines of Euclidean space to manifolds, and the Fisher information distance generalizes the Euclidean distance to manifolds.
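The dependence of the curve length on the chosen path can be illustrated numerically. The sketch below is an assumption-laden stand-in rather than the paper's fault manifold: it uses the univariate Gaussian family N(μ, σ²) with coordinates (μ, σ), whose Fisher metric is known to be diag(1/σ², 2/σ²), and the helper names `metric`, `curve_length`, `vertical`, and `detour` are hypothetical.

```python
import numpy as np

def metric(theta):
    """Fisher information matrix of N(mu, sigma^2) in coordinates (mu, sigma)."""
    _, sigma = theta
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def curve_length(curve, n=2001):
    """Approximate the integral of sqrt(theta'^T G(theta) theta') for t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    pts = np.array([curve(ti) for ti in t])   # sampled points, shape (n, 2)
    mids = 0.5 * (pts[1:] + pts[:-1])         # midpoint of each segment
    steps = pts[1:] - pts[:-1]                # displacement of each segment
    seg = [np.sqrt(d @ metric(m) @ d) for m, d in zip(mids, steps)]
    return float(np.sum(seg))

def vertical(t):
    """Path with constant mean from sigma = 1 to sigma = e (a geodesic of this family)."""
    return np.array([0.0, np.exp(t)])

def detour(t):
    """Path with the same endpoints that wanders in the mean direction."""
    return np.array([np.sin(np.pi * t), np.exp(t)])

len_vertical = curve_length(vertical)  # close to sqrt(2) * ln(e) = sqrt(2)
len_detour = curve_length(detour)      # strictly longer than the geodesic
```

Both paths join the same two distributions, yet only the shorter one attains the distance; this is the minimization over curves that the definition of the FID expresses.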
The FID satisfies the properties required of a distance. It is symmetric for all
:
For
, the FID satisfies the triangle inequality:
Based on the above studies, the following detectability evaluation index is designed:
where
represents the parameters of the fault manifold in the normal state or fault-free state, and
, represents the parameters of the fault manifold at time
k in the case of a fault
.
is an all-zero matrix in the case of the fault-free state.
The following isolability evaluation index is designed:
where
and
are the parameters of different fault manifolds
and
, respectively. By measuring the FIDs between different faults, the faults can be separated. Note that
; that is, the isolability index is symmetric.
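The two indexes can be sketched in code once a distance function is available. The example below is a hedged illustration under stated assumptions: it replaces the paper's fault manifold with the univariate Gaussian family N(μ, σ²), for which the Fisher information distance has a known closed form, and the function names, the nominal parameters, and the three fault parameter pairs are all hypothetical.

```python
import math

def fisher_rao_gaussian(p, q):
    """Closed-form Fisher information distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    (mu1, s1), (mu2, s2) = p, q
    arg = 1.0 + (0.5 * (mu1 - mu2) ** 2 + (s1 - s2) ** 2) / (2.0 * s1 * s2)
    return math.sqrt(2.0) * math.acosh(arg)

def detectability(theta_fault, theta_nominal):
    """Detectability index: distance from the fault-free distribution to the faulty one."""
    return fisher_rao_gaussian(theta_nominal, theta_fault)

def isolability_matrix(thetas):
    """Isolability index for every pair of faults, returned as a nested list."""
    n = len(thetas)
    return [[fisher_rao_gaussian(thetas[i], thetas[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical (mean, standard deviation) pairs, not the paper's fault settings.
nominal = (0.0, 1.0)
faults = [(1.0, 1.0), (0.0, 2.0), (1.0, 2.0)]

FD = [detectability(f, nominal) for f in faults]
FI = isolability_matrix(faults)
```

Because the underlying quantity is a true distance, `FI` is symmetric with a zero diagonal, and any three of its entries obey the triangle inequality.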
The definition of the Fisher information distance given by Equation (17) requires minimizing the integral, which is a variational problem whose solution is given by the geodesic equation.
Einstein’s summation convention is used in the above formula, where
is the coordinate of the geodesic
, and
is the Christoffel symbol of the second kind [36] in this coordinate system, which represents the Riemannian connection coefficients and can be obtained from the Fisher information matrix according to the following formula:
where
is the coordinate component of the inverse matrix of the Fisher information matrix
, which can be obtained directly through inversion.
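For reference, the standard forms of the geodesic equation and of the Christoffel symbols of the second kind are sketched below. The symbols are assumptions chosen to follow common usage: θ^k(t) for the coordinates of the geodesic, g_ij for the entries of the Fisher information matrix, g^{kl} for the entries of its inverse, and ∂_i for the partial derivative with respect to θ^i. Repeated indices are summed.

```latex
\frac{d^{2}\theta^{k}}{dt^{2}}
  + \Gamma^{k}_{ij}\,\frac{d\theta^{i}}{dt}\,\frac{d\theta^{j}}{dt} = 0,
\qquad
\Gamma^{k}_{ij} = \frac{1}{2}\, g^{kl}
  \left( \partial_{i} g_{jl} + \partial_{j} g_{il} - \partial_{l} g_{ij} \right)
```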
The above equations are called geodesic equations, and there is a metric tensor field on the manifold S; the geodesic of the manifold is the geodesic of , where fits with ().
is the parametric form of the geodesic
; then, the above equation can be rewritten into the coordinate component form, as shown in Equation (22). The geodesic Equation (24) is an ordinary differential equation of the coordinate
. For the given initial conditions
and
, the solution of the geodesic equation is unique.
The Fisher information distance is the length of the shortest geodesic connecting two points (i.e., two distributions) on a manifold. Geodesics connected to different endpoints also express the detectability and isolability of different faults.
Therefore, according to Equation (17), the FID can be obtained after integration, and it can be used to measure the distance on a manifold between different distributions that represent a fault-free state, a fault, and different faults so as to achieve a quantitative evaluation of diagnosability.
The corresponding non-zero Christoffel symbol of the second kind is:
The corresponding geodesic equation of the attitude determination system is:
4. Simulation Experiment
Summarizing the content in Section 2, the algorithm for the quantitative evaluation method for diagnosability is provided in Algorithm 1:
Algorithm 1: The quantitative evaluation method for diagnosability on an information manifold
Require: State–space model of a dynamic system:
1. Pretreatment:
2. Pretreatment:
3. Manifold parameter: mean value , variance
4. Fisher information metric
5. Fisher information distance (FID)
6. Detectability index
7. Isolability index
8. Import specific fault/faults to obtain the detectability/isolability index
To verify the effectiveness of the proposed algorithm, this section carries out a simulation experiment on diagnosability evaluation for the satellite pitch-axis attitude determination system shown in Equation (1) in a joint mathematical and Matlab simulation platform.
As the derived manifold parameters
already contain the information of the satellite pitch-axis attitude determination system (Equation (1)), as shown in Equation (10),
is the fault vector of the gyroscope, and
is the fault vector of the infrared Earth sensor. Each fault vector contains a nonlinearity related to its sensor output characteristics.
We set the initial value to
. The time-varying geodesics for the two fault components are obtained, as shown in Figure 1.
The FID of
is 1.55741, and the FID of
is 1.25, which means that
The geodesic plots of the
fault component and the
fault component also reflect this difference; the derivatives of the geodesics of the two fault components are different. Note that the two geodesics in Figure 1 only reflect the development and variation of a single fault itself on the manifold.
For the two-dimensional fault information manifold (including two fault components) studied in this paper, there are three possible fault types: two single-fault cases and one compound fault case.
A fault information manifold with curvature is a surface that is hard to display. To demonstrate its shape intuitively and study its properties, this paper “transforms the surface into a plane” through geodesics and then studies the fault diagnosability problem on it. Taking the two fault components as the X-axis and Y-axis, respectively, the two-dimensional fault manifold surface can be mapped in the Cartesian coordinate system. In this coordinate system, the X-axis () and Y-axis () represent two single-fault cases (a gyroscope fault and an infrared sensor fault), while the first quadrant represents a compound fault case. Within the same type of fault case, different points in the coordinate system indicate faults with different parameters.
Figure 2 shows the paths taken over the same time period by a group of compound-fault geodesics that depart from the fault manifold coordinate (4,4) with a departure velocity of 1. The compound fault parameters are set as follows: . We set the number of geodesic lines to 128.
This set of geodesics describes the unit circle on the fault information manifold with the same FID centered at the location of the compound fault, the point (4,4). The FID unit circle of a statistical manifold does not usually correspond to a circle in a Euclidean space, and vice versa. In fact, on the fault information manifold, the endpoints of these 128 geodesics form a “unit circle”, but when mapped in Euclidean space, the shape of this unit circle is distorted, forming a shape similar to that of a “comet”.
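The construction of such a set of geodesics can be sketched by integrating the geodesic equation numerically from a common starting point. The example below is only an illustration under explicit assumptions: it uses the univariate Gaussian family N(μ, σ²) in coordinates (μ, σ) instead of the paper's fault manifold, so it is not expected to reproduce the shape reported here, and the names `geodesic_rhs` and `shoot_fan` are hypothetical. The integrator is a classical fourth-order Runge–Kutta scheme.

```python
import numpy as np

def geodesic_rhs(state):
    """Geodesic ODE for the metric diag(1/sigma^2, 2/sigma^2).

    state has rows (mu, sigma, dmu/dt, dsigma/dt), one column per geodesic.
    """
    mu, sigma, dmu, dsigma = state
    ddmu = 2.0 * dmu * dsigma / sigma
    ddsigma = dsigma**2 / sigma - dmu**2 / (2.0 * sigma)
    return np.array([dmu, dsigma, ddmu, ddsigma])

def shoot_fan(center, n_dirs=128, total_time=1.0, n_steps=2000):
    """Integrate n_dirs unit-speed geodesics leaving `center` in evenly spread directions."""
    mu0, sigma0 = center
    angles = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    # Scale the initial velocities so that (dmu^2 + 2 dsigma^2) / sigma^2 = 1.
    dmu = sigma0 * np.cos(angles)
    dsigma = sigma0 * np.sin(angles) / np.sqrt(2.0)
    state = np.array([np.full(n_dirs, mu0), np.full(n_dirs, sigma0), dmu, dsigma])
    h = total_time / n_steps
    for _ in range(n_steps):  # classical RK4 step applied to all geodesics at once
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(state + 0.5 * h * k1)
        k3 = geodesic_rhs(state + 0.5 * h * k2)
        k4 = geodesic_rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state[0], state[1]

# Endpoints of 128 unit-speed geodesics leaving the point (4, 4) after unit time.
mu_end, sigma_end = shoot_fan((4.0, 4.0))
euclid_radius = np.hypot(mu_end - 4.0, sigma_end - 4.0)
```

Every endpoint lies at the same information distance from the starting point, while `euclid_radius` varies with direction, which is the sense in which a unit circle on the manifold is distorted when drawn in a Euclidean plane.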
Figure 2 also expresses the response of the FID to the fault development process in this fault form.
Because the two single faults are set to the x and y rectangular coordinate axes, the manifold, which was originally a curved surface, is “leveled”, and the FID, which was originally a unit circle on the manifold, is distorted in this process. In this coordinate system, the evaluation of the diagnosis of faults based on the FID can be succinctly and intuitively described and studied.
As mentioned above, the X-axis () and Y-axis () represent two single-fault scenarios, and the first quadrant represents a compound fault scenario. Now, three types of faults are displayed in the coordinate system: a single gyro fault , a single infrared sensor fault , and a compound fault :(4,4).
According to the definition of detectability in this paper, the detectability indexes of single faults
and
and of
are shown in Figure 3, which indicates the FID of the line between the coordinates where the fault is located and the origin. It should be noted that the “line” here is actually the geodesic on the manifold, not a line mapped in Euclidean space.
The isolability between the single faults
and
and the compound fault
is shown in Figure 4, which indicates the FIDs of the geodesics between the two faults. The line shown in Figure 4a is clearly not the shortest line between two points in Euclidean space, but it is the shortest path between them on the manifold.
Table 1 shows the numerical results of the detectability and isolability indexes for the three types of faults.
Firstly, the coordinates of the compound fault under the fault information manifold can be expressed as (9.351245, 141.061290), indicating that the isolability between the compound fault and (single infrared fault) is stronger than that of (single gyro fault). It can also be stated that the coupling between and the compound fault is deep and difficult to isolate.
Secondly, it can be found that on the right is a symmetric matrix. This is because the FID is a true distance measure with symmetry, and it is fundamentally more scientific and accurate than diagnosability evaluation methods based on the KLD. The principle of the diagnosability evaluation method based on the KLD is stated briefly as follows.
The distance between two distributions (in this paper, two faults)
and
, can be approximated in various ways. A common alternative to the information distance is the Kullback–Leibler divergence (KLD).
The relationship between the KLD and differential Fisher information distance is:
The KLD provides a measure of the separation between two points on a manifold, but it cannot give the shortest path between them, which means that the KLD does not contain information about the structure of the manifold. This is one of the most significant differences between the KLD and the Fisher information distance. Moreover, the KLD is not a true distance measure because it satisfies neither the symmetry nor the triangle inequality required of a distance.
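Both points can be checked on a simple case. The sketch below assumes the univariate Gaussian family N(μ, σ²) as a stand-in for the fault distributions, uses the known closed-form KLD between two such Gaussians, and compares it with the squared differential Fisher distance for a small parameter step; the function names and numerical values are hypothetical.

```python
import math

def kl_gaussian(p, q):
    """KL divergence D(p || q) between N(mu_p, s_p^2) and N(mu_q, s_q^2)."""
    (mu_p, s_p), (mu_q, s_q) = p, q
    return (math.log(s_q / s_p)
            + (s_p**2 + (mu_p - mu_q) ** 2) / (2.0 * s_q**2)
            - 0.5)

def fisher_ds2(theta, dtheta):
    """Squared differential Fisher distance at theta = (mu, sigma) for a step dtheta."""
    _, s = theta
    dmu, ds = dtheta
    return (dmu**2 + 2.0 * ds**2) / s**2

# Asymmetry: swapping the arguments changes the value.
p, q = (0.0, 1.0), (1.0, 2.0)
kl_pq = kl_gaussian(p, q)
kl_qp = kl_gaussian(q, p)

# Local relation: for a small step, the KLD is close to half the squared distance.
theta = (0.0, 1.0)
step = (1e-3, 1e-3)
near = (theta[0] + step[0], theta[1] + step[1])
ratio = kl_gaussian(theta, near) / (0.5 * fisher_ds2(theta, step))
```

Here `kl_pq` and `kl_qp` differ, whereas `ratio` stays close to 1, so the KLD agrees with the Fisher metric only locally and loses the symmetry for distributions that are far apart.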
Since the KLD of two distributions is the expected log-likelihood ratio between them, it can be used to evaluate the detectability of faults
and the isolability between faults
and
, which are calculated as follows:
The numerical results of the detectability and isolability indexes of the three types of faults based on the KLD with the same settings are shown in Table 2.
It can be seen from Table 2 that although the diagnosability indexes obtained with the KLD can evaluate fault detectability and isolability, the index values appear irregular. Because the KLD is asymmetric, the value of the isolability evaluation between
and
is unequal to the value of the isolability evaluation between
and
. The problem of asymmetry in the fault isolability index is inevitable for a diagnosability evaluation based on the KLD [27]. Similarly, the diagnosability evaluation method based on the Bhattacharyya coefficient (BC) has the same problem [37]. The FID is a significant concept in information and statistical theory, and it has been applied in theoretical and applied research on signal processing, target tracking, path planning, and other fields. In the field of fault diagnosis, the results obtained with the diagnosability method proposed in this paper show a tendency similar to that of the results obtained with the traditional KLD method, which corroborates their correctness from another angle. At the same time, the method presented in this paper contains accurate fault information, and there is no problem of asymmetry in the isolability index.
However, for the method proposed in this paper, the diagnosability problem is transferred to a fault information manifold for research, and the real distance measurement of the FID is used to design detectability and isolability indexes, so there is no such problem of asymmetry. The design of the indexes with the FID is relatively more scientific and is conducive to the development of subsequent research on diagnosability (such as fault diagnosis, diagnosability optimization, etc.).
In addition, Table 1 reveals a relationship among the FID-based diagnosability evaluation values of the three types of faults: the isolability evaluation value of the single faults
and
is equal to the sum of the detectability evaluation values of
and
.
The detectability evaluation value of the compound fault
is equal to the modulus of the difference between the two detectability evaluation values of the single faults
and
.
For verification, we set multiple compound faults with different parameters.
From Table 3, it can be seen that this relationship between the FID-based diagnosability indexes also holds in the new cases, apart from the numerical error caused by the limited number of decimal digits. However, the diagnosability indexes obtained with the KLD do not show this relationship.
The four representative geodesic lines in Figure 2 are selected and displayed separately.
It can be seen in Figure 5 that among these geodesic lines (there are theoretically infinitely many), there are two special geodesic lines, which are the shortest geodesic lines in the “unit circle”; in addition, the FIDs of the two special geodesic lines are equal, and the other geodesic lines are symmetrical about these two special geodesic lines. According to the research in [38], they are named the “symmetry lines”, and “symmetry lines” have extremely important properties in the information space.
In the figure, the two geodesic “symmetry lines” appear to be “straight lines” and have unequal lengths because they can only be displayed through the Euclidean plane. However, with the Riemannian metric, these two special geodesics are both typical curves and have the same length. According to [38], a “symmetry line” embodies a certain symmetry and conservation principle in the information space.
With the Riemannian metric, free particles travel unequal distances along different geodesics to accumulate the same amount of energy. The geodesics between two points in a Riemannian manifold are not unique, and the FID between two points corresponds to the geodesics of the shortest length. Since on different geodesics, free particles have the same energy that is accumulated in a certain period of time, according to the “principle of lowest energy”, free particles must choose to travel the shortest path. In the information space, the geodesics corresponding to the FID also correspond to the shortest path for accumulating energy from the “initial state” endpoint to the “final state” endpoint, which can also be regarded as the inevitable path of information. For the fault information space, after a certain fault occurs in the system, this state can also be understood as an “initial state”. In the absence of the injection of new faults, the “initial state” will follow the geodesic line and move to reach a certain “final state”. The geodesic corresponding to the FID reflects the developmental trajectory of the fault.
Remark 1.
1. There are special geodesic “symmetry lines” on a fault information manifold, and they can represent the diagnosability properties of a fault. A “symmetry line” is an inevitable path for the development and evolution of a fault after setting the “initial state”.
2. There is a special geodesic “symmetry line” of a fault component whose length reflects how detectable the fault is. Since the geodesic is the path that the fault travels on the manifold with the same departure speed and the same time interval, a longer path means a richer amount of information, which is more beneficial for researchers’ measurement work.
3. The compound fault studied in this paper couples two faults, and the coupling between the faults deforms the FID unit circle into a “comet-like” shape. The closer the fault information is to the “comet tail”, the easier it is to decouple, and the longer the “comet tail” is, the easier it is for the fault to be decoupled. This effect is called here the “comet tail effect” on the fault manifold.