*4.2. Improved Adaptive Particle Swarm Optimization (APSO)*

Kennedy and Eberhart first proposed particle swarm optimization (PSO) in 1995 [34]. PSO is an algorithm for finding the optimal solution inspired by the foraging behavior of bird groups. In the PSO algorithm, each particle represents a feasible solution of a function to be optimized, and the movement of the particle is restricted by two aspects: speed and position. The speed constrains the distance of particle movement, while the position constrains the direction of particle movement. Each particle's movement is given a fitness function to evaluate the particle's location. Under the control of constraint conditions and evaluation function, the particles search for a better area in the process of moving. After many iterations, they gather near the optimal solution. The particle velocity and position update formula are as follows:

$$w\_{id}^{k+1} = \omega v\_{id}^k + c\_1 r\_1 (p\_{id} - \mathbf{x}\_{id}^k) + c\_2 r\_2 (p\_{\mathcal{G}^d} - \mathbf{x}\_{id}^k) \tag{22}$$

$$\mathbf{x}\_{id}^{k+1} = \mathbf{x}\_{id}^{k} + \boldsymbol{\sigma}\_{id}^{k+1} \tag{23}$$

where *v<sup>k</sup> id* is the current velocity of the *<sup>d</sup>*-th component in the *<sup>i</sup>*-th particle, *<sup>v</sup>k*+<sup>1</sup> *id* is the next velocity of the *d*-th component in the *i*-th particle, *ω* is the inertia weight, *ω* ≥ 0, *c*<sup>1</sup> and *c*<sup>2</sup> are the acceleration constant of the particle, *r*<sup>1</sup> and *r*<sup>2</sup> are random numbers between 0 and 1, *r*1,*r*<sup>2</sup> = *random*(0, 1), *pid* represents the best position of the *d*-th component of the *i*-th particle, *pgd* represents the best position of the *d*-th component of all particles, *x<sup>k</sup> id* is the current position of the *d*-th component in the *i*-th particle, and *xk*+<sup>1</sup> *id* is the next position of the *d*-th component in the *i*-th particle.

PSO has the advantages of fewer parameters and fast convergence, but it also has shortcomings such as premature convergence and falling into local optimum. It can be seen from Equations (22) and (23) that the inertia weight *ω* determines the relationship between the next flight distance and the current flight distance, which further affects the position after the flight. The larger the *ω*, the stronger the particle's flying ability in the solution space, which is conducive to searching in the global scope. The smaller the *ω*, the smaller the flight length, and the stronger the search ability of the particles in a local area, which is conducive to the convergence of the algorithm. However, if the value of *ω* is too large, it will easily cause the algorithm to skip the optimal solution or oscillate near the optimal solution, which will lead to the premature convergence; if *ω* is too small, the algorithm will easily fall into a local optimum.

The inertia weight *ω* should be a larger value at the beginning of the iteration to ensure a strong global search ability and the ability to jump out of the local optimum. However, in the later stage of the algorithm iteration, smaller *ω* should be used to ensure strong local search capabilities, which is conducive to the convergence of the algorithm.

In response to the above problem, this paper proposes a strategy for adaptively changing the ω according to the number of iterations and the current fitness value. The formula is as follows:

$$\omega\_{id}^{k+1} = \begin{cases} \omega\_0 - \varepsilon^{-k/K} \* \frac{f\_{\text{max}}^k - f\_{id}^k}{f\_{\text{max}}^k - f\_{\text{avg}}^k}, & f\_{id}^k \le f\_{\text{avg}}^k \\\ \omega\_0 + \varepsilon^{-k/K} \* \frac{f\_{id}^k - f\_{\text{min}}^k}{f\_{\text{avg}}^k - f\_{\text{min}}^k}, & f\_{id}^k > f\_{\text{avg}}^k \end{cases} \tag{24}$$

where *k* is the current number of iterations, *k* + 1 is the next number of iterations, *K* is the maximum number of iterations, *ωk*+<sup>1</sup> *id* is the inertia weight for the next iteration for the *d*-th component of the *i*-th particle, *ω*<sup>0</sup> is the initial value of *ω*, *ω*<sup>0</sup> = 0.5 in this paper, *f <sup>k</sup> id* is the fitness value of the *d*-th component of the *i*-th particle obtained in the *k*-th iteration, *f <sup>k</sup>* max is the maximum fitness value in the *k*-th iteration, *f <sup>k</sup>* min is the minimum fitness value in the *k*-th iteration, *f <sup>k</sup> avg* is the average fitness value in the *k*-th iteration.

At the beginning of the iteration, the weight of the particle changes greatly, and as the number of iterations of the particle increases, the weight change decreases. At the same time, the weight change is determined by the fitness function. When the particle fitness is less than or equal to the average fitness, that is, when the accuracy of the classification model is greater than or equal to the average accuracy, the inertia weight decreases; when the particle fitness is greater than the average fitness, the accuracy of the classification model is lower than the average accuracy, and the inertia weight increases.

The increase or decrease of the inertia weight is determined by the number of iterations. At the beginning of the iteration, the increase or decrease of the weight is large, which is convenient for searching in the global and optimal solution neighborhood. At the later stage of the iteration, the increase or decrease of the weight is small. The increase of the weight can avoid falling into the local optimal solution for random search, and the decrease of the weight facilitates the local fine search.

#### *4.3. The Algorithm of the Proposed APSO-SVM*

In this paper, the improved Adaptive Particle Swarm Optimization (APSO) is used to optimize the penalty factor *λ* and the kernel function parameter *σ* in the SVM. The specific steps of APSO-SVM are as follows.

Step 1: Input the dataset with labels.

Step 2: Divide the dataset into training set and test set, then normalize both two sets.

Step 3: Population initialization. Set the number of particles in the initial population as *n*. Set the range of penalty factor *λ* to [*λ*min, *λ*max]. Set the range of kernel function parameter *σ* to [*σ*min, *σ*max]. Initialize the parameter set *θ* = {*ω*0, *c*1, *c*2, *K*}. Initialize the position *x*<sup>0</sup> *<sup>i</sup>* , the speed *<sup>v</sup>*<sup>0</sup> *<sup>i</sup>* , the optimal position *pid* of *i*-th particle and the global optimal position *pgd*. Set fitness error *ε*.

Step 4: Calculate the corresponding inertia weight *ωk*+<sup>1</sup> *id* according to Equation (24) in the adaptive adjustment strategy. Update the velocity *vk*+<sup>1</sup> *id* and position *<sup>x</sup>k*+<sup>1</sup> *id* of the particles according to Equations (22) and (23). Determine whether *λ* and *σ* are in [*λ*min, *λ*max] and [*σ*min, *σ*max] respectively. If *λ* < *λ*min, set *λ* = *λ*min. If *λ* > *λ*max, set *λ* = *λ*max. If *σ* < *σ*min, set *σ* = *σ*min. If *σ* > *σ*max, set *σ* = *σ*max.

Step 5: If *fi* <sup>&</sup>gt; *<sup>f</sup>*(*pi*) or <sup>|</sup> *fi* <sup>−</sup> *<sup>f</sup>*(*pi*)|≤ *<sup>ε</sup>*, *<sup>λ</sup>*(*xi*) <sup>&</sup>lt; *<sup>λ</sup>*(*pi*), update *pi*. If *fi* <sup>&</sup>gt; *<sup>f</sup>*(*pg*) or - *fi* − *f*(*pg*) - -≤ *ε*, *λ*(*xi*) < *λ*(*pg*), update *pg*. The expression of *f*(∗) is shown in Equation (21).

Step 6: If *k* < *K*, repeat the step 4 to step 6. If *k* ≥ *K*, end the APSO.

Step 7: Use the optimal solution (*λ*∗, *σ*∗) to create the SVMs model and use this model for classification.

The flow chart of APSO-SVM is shown in Figure 8.

**Figure 8.** Flow chart of APSO-SVM.

#### **5. Case Study of Anomaly Detection**

#### *5.1. Data Description*

The data set used in this article is from a satellite's telemetry voltage value of its momentum wheel. In this data set, five types of sample with different health status are screened out. Stable Change (large) indicates that the momentum wheel voltage value continuously and steadily changes with a large change amplitude. Stable Change (small) indicates that the momentum wheel voltage value continuously and steadily changes with a small change amplitude. Large to Small indicates that the amplitude of the momentum wheel voltage change smoothly transitions from large to small. The above three types of sample all represent that the momentum wheel is in a normal state. Irregular Change indicates that the voltage of the momentum wheel changes irregularly. Sudden Change indicates that the voltage of the momentum wheel has a sudden change, such as the voltage suddenly jumping to 0. Irregular Change and Sudden Change represent that the momentum wheel is in an abnormal state. The time-domain waveforms of different types of data are shown in the Figure 9.

**Figure 9.** The waveform of five different types of voltage telemetry data for the momentum wheel: (**a**) comparison of 5 types of data in time-domain, (**b**) the waveform of Stable Change (large), (**c**) the waveform of Stable Change (small), (**d**) the waveform of Large to Small, (**e**) the waveform of Irreg-ular Change, (**f**) the waveform of Sudden Change.

To verify the effectiveness of the method proposed in this article, the training set and test set used in this article are shown in the Table 3.


**Table 3.** Label description of the momentum wheel voltage telemetry dataset.

### *5.2. Feature Extraction and Selection*

5.2.1. Time/Frequency Domain Feature Extraction and Selection

According to the time-domain feature and frequency-domain feature calculation methods shown in Table 1, the time-frequency feature values of the five types of momentum wheel voltage telemetry data are calculated. The time-domain features are shown in Figure 10, and the frequency-domain features are shown in Figure 11.

According to the time-frequency feature statistical feature distribution diagrams of different types of data in Figures 10 and 11, the time-domain feature distribution of SC is very scattered, but the frequency-domain feature distribution is relatively more concentrated. Intuitively, peak and peak-to-peak in the time domain feature can distinguish five types of data to a certain extent, and F3 and F4 in the frequency domain feature can also distinguish five types of sample to a certain extent.

In order to quantify the ability of different feature values to distinguish samples, we use the feature evaluation method (Laplacian Score and Relief-F Score) in Section 3.1.2 to score the above 25 types of time-domain features and frequency-domain features. The evaluation results are shown in Figure 12. Comprehensively considering the evaluation results of LS and RFS, this paper chooses nine features (peak, peak-to-peak, skewness, kurtosis, F3, F4, F8, F10 and F11), which have higher scores in two evaluation methods, as part of the feature sequence. These high-scoring features describe the amplitude characteristics, fluctuation characteristics and spectral density characteristics of the voltage telemetry signals.

#### 5.2.2. Complexity Feature Analysis

To verify the effectiveness of the proposed complexity feature extraction method of Huffman-multi-scale entropy, this paper analyzes the sample entropy under different sample lengths and different scales.

Taking a normal type data Stable Change (large) as an example to study the impact of sample length on complexity characteristics, the sample length is taken from 5000 to 25,000 at intervals of 2000. Figure 13 shows the results of calculating the multi-scale entropy and Huffman-multi-scale entropy with each sample length respectively. The scale is from 10 to 300 at intervals of 10. It can be found that when the sample length is 9000 to 15,000, both methods can achieve higher sample entropy. Therefore, this paper selects the sample length as 10,000.

**Figure 10.** The time-domain features of the five types of momentum wheel voltage telemetry data: (**a**) peak, (**b**) peak-to-peak, (**c**) mean, (**d**) absolute mean, (**e**) root amplitude, (**f**) standard deviation, (**g**) root mean square, (**h**) skewness, (**i**) kur-tosis, (**j**) peak index, (**k**) impulse factor, (**l**) margin index, (**m**) waveform index.

**Figure 11.** The frequency-domain features of the five types of momentum wheel voltage telemetry data: (**a**) F1, (**b**) F2, (**c**) F3, (**d**) F4, (**e**) F5, (**f**) F6, (**g**) F7, (**h**) F8, (**i**) F9, (**j**) F10, (**k**) F11, (**l**) F12.

**Figure 12.** The time-domain (**a**) and frequency-domain (**b**) features evaluation results.

**Figure 13.** The results of the multi-scale entropy (**a**) and Huffman-multi-scale entropy (**b**) with different sample length.

At the same time, this paper also calculates the multi-scale entropy and Huffmanmulti-scale entropy of different types of momentum wheel voltage telemetry signals when the sample length is 10,000. The scale ranges from 10 to 300, with an interval of 10. The calculation result is shown in Figure 14. It can be seen from Figure 14 that, for normal data, the results of multi-scale entropy and Huffman-multi-scale entropy are close. For Irregular Change data, the value of Huffman-multi-scale entropy is significantly lower than that of multi-scale entropy. This shows that Huffman-multi-scale entropy has better distinguishing ability for data with high complexity. It is worth noting that, for the abnormal data of Sudden Change type, the data itself has pulse characteristics, which causes the fluctuation characteristics of the data before and after the sudden change to be concealed to a certain extent. Therefore, both multi-scale entropy and Huffman-multi-scale entropy can well describe the characteristics of increased complexity caused by sudden and large changes in data.

**Figure 14.** Results of the multi-scale entropy (**a**) and Huffman-multi-scale entropy (**b**) for different data type.

Based on the above analysis, this paper selects the sample length as 10,000. The feature sequence (HMSE + T/F) is composed of 30 complexity features (scale from 10 to 300 at intervals of 10) and nine time/frequency-domain features.

#### *5.3. Anomaly Detection Results and Discussion*

This paper uses a five-class SVM model based on the DAG method, and uses the proposed APSO to train the classification model. To verify the effectiveness of the proposed method on the spacecraft anomaly detection problem, this paper not only calculates the recognition accuracy (RA) of the classification model for each category, but also calculates the detection rate (DR), false alarm rate (FAR) and the missed alarm rate (MAR). The calculation method of RA, DR, FAR and MAR are shown in Equations (25)–(28).

$$RA = \frac{Num(predicted = true)}{Num(true)} \ast 100\% \tag{25}$$

$$DR = \frac{Num(NN + FF)}{Num(true)} \ast 100\% \tag{26}$$

$$FAR = \frac{Num(NF)}{Num(N)} \* 100\% \tag{27}$$

$$MAR = \frac{Num(FN)}{Num(F)} \ast 100\% \tag{28}$$

where *Num*(*predicted* = *true*) is the total number of category predictions that are exactly the same as the true value, *Num*(*true*) is the total number of the test samples, *Num*(*NN* + *FF*) is the total number of the real normal data predicted as normal data and the real abnormal data predicted as abnormal data, *Num*(*NF*) is the total number of real normal data predicted as abnormal data, *Num*(*N*) is the total number of real normal data, *Num*(*FN*) is the total number of real abnormal data predicted as normal data, and *Num*(*F*) is the total number of real abnormal data.

Figure 15 shows the corresponding part of the false alarm rate and the missed alarm rate in the confusion matrix. C(large) is the Stable Change (large), C(small) is the Stable Change (small), L-S is the Large to Small, IC is the Irregular Change, and SC is the Sudden Change.

**Figure 15.** Correct, false alarms and missed alarms in the confusion matrix.

This paper calculates the confusion matrix of the abnormal detection of the momentum wheel voltage telemetry signal calculated by HMSE-T/F-APSO-SVM, MSE-T/F-APSO-SVM and MSE-T/F-PSO-SVM. The results are shown in Figure 16. It can be seen from the Figure 16, the identification accuracy of Sudden Change by the above three methods can all reach 100%. This result is consistent with the conclusion of the qualitative analysis of eigenvalues in the previous article. The probability of MSE-T/F-PSO-SVM identifying Stable Change (large) and Stable Change (small) as Irregular Change reaches 10.33% and 11.33%, respectively. At the same time, the probability of MSE-T/F-PSO-SVM identifying Irregular Change as Stable Change (large) and Stable Change (small) reaches 16.67% and 7.67%, respectively. The probability of HMSE-T/F-APSO-SVM and MSE-T/F-PSO-SVM identifying Stable Change (large) and Stable Change (small) as Irregular Change are 0%. At the same time, the probability of MSE-T/F-PSO-SVM identifying Irregular Change as Stable Change (large) and Stable Change (small) reaches 16.67% and 7.67%, respectively. The distinction between Stable Change (large) Stable Change (small) and Irregular Change can be effectively improved by calculating Huffman-multi-scale entropy. This conclusion is also consistent with the result in Figure 14.

In addition, the recognition accuracy of Large to Small and Sudden Change of the three methods has reached 100%, which shows that the feature sequence and anomaly detection model selected in this paper have strong sensitivity to signals with a definite change rule.

To further verify that the method proposed in this paper can effectively improve the accuracy of spacecraft anomaly detection and reduce the rate of false alarms and missed alarms, this paper compares the proposed method with other methods, and calculates the anomaly detection under different processing methods. Principal Component Analysis (PCA), Random forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN) and Multilayer perceptron (MLP) are used in this paper. The results of the recognition accuracy, false alarm rate and missed alarm rate of different methods are shown in Table 4.

**Figure 16.** The confusion matrix of the abnormal detection for the momentum wheel voltage telemetry signal calculated by different processing methods (%): (**a**) HMSE-T/F-APSO-SVM, (**b**) HMSE-T/F-PSO-SVM, (**c**) MSE-T/F-PSO-SVM.

**Table 4.** Results of the recognition accuracy, detection rate, false alarm rate and missed alarm rate of different methods (%).


From the results in Table 4, it can be seen that: First, the recognition accuracy and detection rate of the method proposed in this paper can reach 99.60% and 99.87%, which are higher than other methods listed in the table, and the false alarm rate is reduced to 0, while the false alarm rate is reduced to 0.34%, which are lower than other methods. Second, the detection method based on the feature sequence (HMSE + T/F) has a higher recognition accuracy and detection rate as well as lower false alarm rate and missed detection rate than the detection method based on the original data. Third, by comparing the standard deviation of various indicators, it can be found that the feature sequence based on Huffman-multi-scale entropy and time-frequency domain features proposed in this paper can effectively improve the stability of the detection method.

#### **6. Conclusions**

In this research, we propose a new detection framework for anomaly detection based on spacecraft telemetry data. Due to the very low frequency characteristics of telemetry data, most frequency analysis methods are not suitable for spacecraft anomaly detection. Therefore, this paper first proposes a feature sequence construction method based on time-domain and frequency-domain feature screening and complexity feature fusion. On this basis, a new method of Huffman-multi-scale entropy (HMSE) based on the Huffman coding principle is proposed. To improve the classification accuracy of SVM, this paper adopts a multi-class SVM model based on the DAG principle, and proposes an improved adaptive particle swarm optimization (APSO) to train the SVM model. Then we apply the proposed method to the voltage telemetry data set of the satellite momentum wheel. Compared with other methods, the results show that the proposed method has a good performance in improving the recognition accuracy and detection rate, and it can also effectively reduce the false alarm rate and the missed alarm rate. Therefore, the method proposed in this paper has a good development prospect in the field of anomaly detection of spacecraft.

In the future work, more real-world datasets will be applied to verify the effectiveness of the detection ability of the proposed method. In addition, more methods based on artificial neural networks will be studied to further improve the versatility of anomaly detection methods.

**Author Contributions:** Conceptualization, methodology and validation, Y.L. and M.L.; investigation, P.L. and R.W.; data curation and project administration, M.X.; writing—original draft preparation, M.L.; writing—review and editing, Y.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (No.52075117), the Science Research Project (JSZL2020203B004) and the Key Laboratory Opening Funding of Harbin Institute of Technology (HIT.KLOF.2016.077, HIT.KLOF.2017.076, HIT.KLOF. 2018.074 and HIT.KLOF. 2018.076).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data included in this study are all owned by the research group and will not be transmitted.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

