Application of Knowledge Graph Technology with Integrated Feature Data in Spacecraft Anomaly Detection

Yi, Xiaojian; Huang, Peizheng; Che, Shangjie

doi:10.3390/app131910905

by Xiaojian Yi ^1,2,3, Peizheng Huang ^1,2,* and Shangjie Che

¹ School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China

² Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing 314003, China

³ Tangshan Research Institute, Beijing Institute of Technology, Tangshan 063099, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(19), 10905; https://doi.org/10.3390/app131910905

Submission received: 31 July 2023 / Revised: 25 September 2023 / Accepted: 28 September 2023 / Published: 30 September 2023

(This article belongs to the Special Issue Intelligent Fault Diagnosis and Health Detection of Machinery)

Featured Application

This method applies knowledge graph technology for spacecraft anomaly detection, improving reliability and safety in space missions. It enables real-time monitoring, timely diagnosis, and maintenance, preventing mission failures. Potential applications include predictive maintenance strategies, resource optimization, and proactive planning. The approach advances space system engineering and enhances the robustness of future missions.

Abstract

Given the complexity of spacecraft system structures and functions, existing data-driven methods for anomaly detection suffer from insufficient interpretability and excessive dependence on historical data. To address these problems, this paper proposes a method for applying knowledge graph technology with integrated feature data in spacecraft anomaly detection. First, the ontology concepts of the spacecraft equipment knowledge graph are designed according to expert knowledge, and then feature data are extracted from the historical operation data of the spacecraft in various states to build a rich spacecraft equipment knowledge graph. Next, spacecraft anomaly event knowledge graphs are constructed based on various types of anomaly features. During spacecraft operation, telemetry data are matched with the feature data in the knowledge graph, enabling anomaly device location and anomaly cause judgment. Experimental results show that this method, which utilizes spacecraft anomaly prior knowledge for anomaly detection and cause interpretation, has high practicality and efficiency. This research demonstrates the promising application prospects of knowledge graph technology in the field of spacecraft anomaly detection.

Keywords: spacecraft; anomaly detection; knowledge graph; feature data integration

1. Introduction

In recent years, with the continuous growth of the number of on-orbit spacecraft and the increasing complexity of onboard equipment, the frequency of anomalies occurring in spacecraft operations has gradually increased. Because spacecraft are high-value assets that are difficult to repair, any anomaly may lead to serious consequences. Monitoring spacecraft status in real time, detecting anomalies, and dealing with them promptly to prevent failures is therefore a central task in on-orbit spacecraft health management research.

In the field of spacecraft anomaly detection, telemetry data anomaly analysis has received widespread attention, as it is closely related to equipment operating status. Data-driven methods, which have a higher degree of intelligence compared to methods based on thresholds, expert systems, and expert experience, have become a current research hotspot. In recent years, a large number of studies have focused on this field, resulting in various methods, which are reported below.

(1): Anomaly detection methods for telemetry data correlation: Various techniques have been proposed to address this issue, including the ARMA prediction model [1], LSSVM [2], RVM [3], KPCA [3], and dynamic Bayesian network [3];
(2): Anomaly detection methods for telemetry data clustering: These approaches incorporate methods such as LSSVM [2], hierarchical clustering combined with KNN classification [4], extraction of significant time patterns [5], DTW matching-based techniques [6], and LOF based on statistical eigenvalues [7];
(3): Multivariate anomaly detection methods for telemetry data: These techniques can be broadly divided into four categories. First, methods based on subspace computation, including Orca [8], GritBot [9], IMS [10,11], K-means [12], expectation maximization [12], OCSVM [13], and PLSDA [14]. Second, generative estimation-based methods, which consist of the dynamic grouped mixture model [15]. Third, basic pattern reconstruction error-based methods, which encompass sparse representation and latent semantic analysis techniques [16]. Finally, graph construction-based methods, involving box modeling algorithms [17]. Each of these approaches provides unique benefits and applications for tackling the challenges associated with telemetry data anomaly detection.

In general, data-driven methods analyze large amounts of historical data to build anomaly detection models for spacecraft systems, allowing for the monitoring of spacecraft status parameters and the detection of some anomalies and failures. However, due to the complexity of spacecraft systems, the interpretability of anomaly detection and fault diagnosis results is limited, and further analysis of anomaly causes still relies on expert knowledge and experience. Additionally, in the spacecraft on-orbit operation management process, operators need to repeatedly memorize and consult maintenance procedures, emergency plans, and troubleshooting manuals, a practice that is inefficient and prone to errors or omissions. A large amount of prior knowledge about anomalies has not been fully utilized, so existing data resources are wasted. Therefore, there is an urgent need for explainable artificial intelligence technology and intelligent knowledge application technology to improve the efficiency of spacecraft anomaly detection and knowledge management.

In recent years, knowledge graphs have become an emerging technique for anomaly detection across various domains. For example, Akoglu et al. [18] surveyed various graph-based anomaly detection methods, highlighting the benefits of modeling anomalies using connectivity patterns.

Since Google launched the knowledge graph-based search engine in 2012, the interest in constructing and applying domain-specific knowledge graphs across industries has grown significantly. Numerous studies have confirmed the substantial application value of knowledge graph technology in various industries like electric power equipment [19,20,21,22,23,24,25,26], network communication [27,28,29,30,31], and aerospace [32,33,34,35,36,37,38,39,40,41].

In the field of space exploration, several recent studies have emphasized the increasing role of knowledge graphs. Kou Chao et al. [39] pioneered the construction of a knowledge graph for spacecraft launch, addressing issues of sparse and incomplete knowledge and paving the way for semantic AI and complex data analytics. Concurrently, Hui-Bin Shi and colleagues [40] introduced an approach to spacecraft fault diagnosis that uses information fusion technologies to construct a comprehensive lifecycle knowledge graph. This approach enhances fault response capabilities and ensures spacecraft integrity. Meanwhile, Lu Zhang and his team [41] proposed a knowledge graph specific to embedded aerospace software defects, improving the efficiency of third-party software testing and evaluation and enhancing the quality and credibility of aerospace software products. The contributions of these studies underscore the increasing use of knowledge graphs as an effective tool for managing complex issues in the spacecraft domain.

Despite these advancements, the application of knowledge graph technology in spacecraft anomaly detection is still in the preliminary exploration stage. To address this gap, this paper proposes a knowledge graph-based on-orbit spacecraft anomaly detection method. The main contributions of this paper are as follows:

(1): A knowledge graph-based architecture and framework for spacecraft anomaly detection is proposed, consisting of a data layer, a graph construction layer, an algorithm integration layer, and an application layer;
(2): A method for constructing a spacecraft equipment knowledge graph is introduced, integrating expert knowledge and historical data to enhance the interpretability and reliability of anomaly detection;
(3): Spacecraft anomaly event knowledge graphs are created based on various anomaly features, facilitating anomaly device location and cause judgment by matching telemetry data with feature data in the knowledge graph;
(4): Experimental results demonstrate the effectiveness of the proposed method and its potential applications in spacecraft anomaly detection.

The rest of this paper is arranged as follows. Section 2 develops the architecture and framework of the spacecraft anomaly detection algorithm based on a knowledge graph. Section 3 presents the workflow of the spacecraft anomaly detection method based on a knowledge graph. Section 4 discusses typical cases and provides a result analysis. Finally, Section 5 concludes the paper.

This research builds on existing efforts by developing a tailored knowledge graph framework for spacecraft anomaly detection. The focus on integrating expert knowledge and historical data into the graphs aims to enhance the interpretability and context awareness of the resulting anomaly detections. By matching real-time telemetry data against learned patterns and relationships in the knowledge graph, the goal is to enable more rapid diagnosis of spacecraft anomalies compared to purely data-driven techniques. The experiments on real spacecraft data are intended to verify the benefits of this knowledge graph approach in supporting spacecraft operations.

2. Knowledge Graph-Based Spacecraft Anomaly Detection Algorithm Architecture and Framework

2.1. Knowledge Graph-Based Spacecraft Anomaly Detection Algorithm Architecture

Spacecraft anomaly detection is a complex and challenging task, given that it involves a myriad of interrelated factors. For instance, the temperature of spacecraft equipment could be influenced by the electrical load of various devices and the space environment’s ambient conditions. As such, merely relying on preset alert thresholds for identifying abnormal states does not yield reliable results. This study aims to bridge this gap by integrating spacecraft anomaly detection prior knowledge with spacecraft telemetry data using knowledge graphs. These knowledge graphs, consisting of nodes representing different data points and edges representing relationships between them, can effectively map out intricate dependencies among various factors influencing spacecraft equipment. This integration provides a more holistic and reliable approach to spacecraft anomaly detection compared to traditional methods.

The knowledge graphs group and extract pertinent features from the telemetry data and then compare these real-time telemetry data with reference data that are generated by the knowledge graph itself. This approach is based on the hypothesis that data deviations in abnormal situations are significantly larger than those in historical normal data. If the deviation is found to be too large, the situation is marked as an abnormal state. The overall architecture of the algorithm is depicted in Figure 1.

The first step in the process involves the collection of spacecraft equipment data and expert knowledge. These resources are used to construct a comprehensive spacecraft equipment knowledge graph through the knowledge extraction process described in this article. After this, feature data are extracted from historical telemetry data in accordance with the spacecraft mission log. These feature data are then integrated with the knowledge graph, creating a multidimensional and interconnected dataset.

The second step involves generating a reference dataset that accurately describes the current state characteristics of the spacecraft. This is accomplished through the knowledge graph. The reference dataset’s primary function is to detect anomalies in orbiting spacecraft by comparing the telemetry data with the reference dataset. This comparison allows for an objective, data-driven assessment of the spacecraft’s current state and any potential deviations.

The third step is the construction of a spacecraft anomaly event graph. This graph is based on spacecraft anomaly reports and disposal plans. When abnormal data are detected, this graph is queried using abnormal keywords to determine the cause of the anomaly and the corresponding disposal plan. This proactive approach helps in mitigating risks and ensuring the spacecraft’s safe operation.

The final step is undertaken if the cause of the anomaly is still unknown after querying the event graph. In such cases, further knowledge extraction and manual completion are carried out. Once the anomalies are resolved, they are recorded in the event graph. This serves as a valuable future reference, ensuring an ongoing accumulation of knowledge and facilitating more efficient anomaly detection and resolution in subsequent operations.

2.2. Knowledge Graph-Driven Spacecraft Anomaly Detection and Processing System Framework

Following the establishment of the anomaly detection algorithm architecture based on a knowledge graph, as outlined in Section 2.1, we now propose a more comprehensive system framework for knowledge graph-driven spacecraft anomaly detection and processing. Note that this research is still at a relatively early stage.

The proposed system framework, illustrated in Figure 2, comprises four primary layers: the basic data layer, the graph construction layer, the algorithm integration layer, and the graph application layer. It is designed to harness the potential of knowledge graphs for diagnosing and resolving spacecraft anomalies.

The basic data layer serves as the foundation, containing data sources for various spacecraft system anomalies. It includes both structured and unstructured anomaly knowledge, which form the basic corpus for entity and relationship extraction in the spacecraft anomaly detection knowledge graph.

The graph construction layer employs deep learning techniques and manual verification to extract knowledge from spacecraft anomaly text materials. Knowledge from multiple sources and various forms is synthesized and stored in a structured format of triples in the knowledge graph database.

The algorithm integration layer builds on the constructed knowledge graph. It combines spacecraft anomaly detection task scenarios, processing experiences, and operation rules for knowledge logic association mining. The layer carries out analysis and computations through telemetry data feature extraction, similarity matching, and data statistical mining methods.

Finally, the graph application layer translates these functional modules into intelligent applications, which are tailored according to spacecraft anomaly detection application requirements and scenarios. This layer represents the practical application of spacecraft anomaly knowledge in specific situations.

By building on the architecture of the anomaly detection algorithm based on a knowledge graph, this system framework presents an innovative approach to spacecraft anomaly detection and resolution. As our research progresses, we anticipate that this framework will play a pivotal role in enhancing the reliability and efficiency of future space missions.

3. Methods

Based on the above research, the implementation process of spacecraft anomaly detection based on a knowledge graph is shown in Figure 3.

According to the technical route outlined above, the research process in this paper can be roughly divided into three parts.

The first part involves the construction of a knowledge graph. By integrating materials from various sources, experts design and summarize concepts to create an ontology model for the knowledge graph. Next, knowledge extraction is conducted to initially construct the knowledge graph.

The second part is focused on extracting spacecraft feature data. This process includes data cleaning, feature selection, and data classification. The resulting feature data are then integrated with the knowledge graph to form a spacecraft anomaly detection knowledge graph that incorporates feature data.

The third part deals with the anomaly detection of spacecraft telemetry data. First, static attribute data matching is performed based on sensors for spacecraft telemetry data. After generating reference data from the knowledge graph, dynamic attribute data matching is carried out. Anomaly detection is then conducted based on the data-matching results. Upon discovering anomalous telemetry data, the anomaly detection algorithm generates detection results. These results are analyzed to assess the quality of the spacecraft knowledge graph.

3.1. Constructing the Knowledge Graph

3.1.1. Ontology Construction for the Knowledge Graph

After obtaining data from various sources, it is necessary to design the ontology concepts for the knowledge graph. A well-designed ontology schema can better describe the relationships between knowledge, reduce data redundancy, and improve efficiency. Ontology construction methods can be divided into two categories. The first is the top-down approach, which starts by defining the data schema and constructs the ontology from the top-level concepts to the lower-level concepts. The second is the bottom-up approach, which is based on the underlying domain data and concepts and gradually abstracts upward to form higher-level concepts.

Several renowned ontologies in the general domain that can be utilized for constructing knowledge graphs include DBpedia Ontology [42] and WordNet [43]. These ontologies provide a solid foundation and proven frameworks for representing and structuring knowledge in a meaningful and machine-readable way.

When aiming for high efficiency and accuracy in the spacecraft anomaly detection knowledge graph, the top-down construction method is often employed. However, due to the broad range of data sources and complex data types, a “combined approach” is preferred in this case. Here, domain experts design the upper-level concepts of the knowledge graph, while the lower-level concepts are designed based on the data sources and their generalization.

This combined approach leverages the strengths of both top-down and bottom-up methods, providing a structured framework for organizing knowledge while also being flexible enough to incorporate diverse and complex data sources.

3.1.2. Knowledge Extraction

The spacecraft equipment knowledge graph is built based on telemetry data, equipment data, and expert knowledge databases. Equipment data come from relational databases, telemetry data include in-orbit historical data, and expert knowledge databases are used for creating task states and dividing feature data. The anomaly event graph is constructed based on historical anomaly data and analysis data, identifying abnormal states and obtaining feature data. For structured data, the knowledge graph can be created through rule mapping or manual collation. For semi-structured and unstructured data, NLP (Natural Language Processing) methods are required for entity recognition and relationship extraction, followed by expert verification to improve quality.

This paper employs an entity recognition method based on Bidirectional Long Short-Term Memory [44] (BiLSTM) networks and Conditional Random Fields [45] (CRFs), as shown in Figure 4.

The embedding layer transforms text characters into vectors, which are then input into a BiLSTM network to extract semantic features from sentences, enabling the combination of forward and backward hidden states. This helps to address long-distance dependency issues in the text. The CRF inference layer is a conditional probability distribution model for handling sequence labeling in the text, receiving the output from the BiLSTM and selecting the most likely tag sequence while constraining the predicted labels.

Before the model can be trained, a certain number of texts need to be annotated. The BIOES sequence annotation method is used, and the annotated corpus is divided into training and test sets at a ratio of 4:1. The entity annotation results of a sentence in the corpus are shown in Figure 5. During training, model parameters are continuously adjusted to find the best-performing configuration.
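The BIOES annotation and the 4:1 split described above can be illustrated with a short sketch. The entity labels (`EQUIP`, `PARAM`), the example sentence, and the helper names are hypothetical and are not taken from the paper's corpus; the split here is a plain ordered cut, whereas a real pipeline would likely shuffle first.

```python
from typing import List, Tuple


def bioes_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans (start, end_exclusive, label) into BIOES tags."""
    tags = ["O"] * len(tokens)              # O: outside any entity
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"      # S: single-token entity
        else:
            tags[start] = f"B-{label}"      # B: first token of the entity
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"      # I: inside the entity
            tags[end - 1] = f"E-{label}"    # E: last token of the entity
    return tags


def split_corpus(samples: list, train_ratio: float = 0.8):
    """Split annotated samples into training and test sets (4:1 by default)."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Hypothetical sentence and entity spans, for illustration only.
tokens = ["CMG", "rotor", "speed", "dropped", "abnormally"]
spans = [(0, 1, "EQUIP"), (1, 3, "PARAM")]
tags = bioes_tags(tokens, spans)
train_set, test_set = split_corpus(list(range(10)))
```

Tag sequences of this form are what a BiLSTM-CRF tagger would be trained to predict, one tag per token.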

For text relationship extraction, this paper adopts a Bidirectional Long Short-Term Memory (BiLSTM) model based on the self-attention mechanism, as shown in Figure 6.

The embedding layer vectorizes the text, forming input features. The BiLSTM learns the context and shallow semantics, obtaining high-level word vectors. The self-attention layer calculates weights and learns deep global semantics, obtaining sentence global features. The output layer concatenates global and local features, calculating the relationship vector between entities.

3.2. Extracting Feature Data

To detect whether the target telemetry data of the spacecraft is abnormal, it is necessary to integrate the reference data of the spacecraft under various states into the knowledge graph, describing the data features of the spacecraft under normal or abnormal conditions. Specifically, the following steps are involved in feature data extraction.

3.2.1. Data Cleaning

To fully present the data features of the reference data sequence under normal conditions, it is necessary to perform outlier removal and missing value imputation on the reference data. First, read the threshold values for each parameter from the expert knowledge database and use these thresholds to remove outliers from the reference data. Then, traverse the dataset and select either direct imputation or regression imputation based on the length of the missing sequence.
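A minimal sketch of this cleaning step follows. The paper does not state which imputation is used for which gap length, so the rule here is an assumption: short gaps are filled by carrying the previous value forward ("direct"), and longer gaps are filled from a least-squares line fitted to the valid samples ("regression"). The threshold values and the `max_direct_gap` parameter are hypothetical.

```python
from typing import List, Optional


def remove_outliers(values: List[Optional[float]], low: float, high: float):
    """Mark readings outside the expert threshold [low, high] as missing (None)."""
    return [v if (v is not None and low <= v <= high) else None for v in values]


def fit_line(xs: List[float], ys: List[float]):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def impute(values: List[Optional[float]], max_direct_gap: int = 1):
    """Fill short gaps with the previous value, longer gaps with the fitted line."""
    out = list(values)
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    if not known:
        return out
    a, b = fit_line([i for i, _ in known], [v for _, v in known])
    i = 0
    while i < len(out):
        if out[i] is not None:
            i += 1
            continue
        j = i
        while j < len(out) and out[j] is None:   # find the end of this gap
            j += 1
        prev = out[i - 1] if i > 0 else None
        for k in range(i, j):
            if (j - i) <= max_direct_gap and prev is not None:
                out[k] = prev                    # direct imputation (assumed rule)
            else:
                out[k] = a * k + b               # regression imputation
        i = j
    return out


raw = [1.0, 2.0, 99.0, 4.0, None, None, None, 8.0]
cleaned = remove_outliers(raw, low=0.0, high=10.0)   # 99.0 becomes None
filled = impute(cleaned, max_direct_gap=1)
```

In this example the single-sample gap left by the outlier is filled directly, while the three-sample gap is filled from the fitted line.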

3.2.2. Feature Selection

Due to the numerous and high-dimensional engineering parameters of a spacecraft, directly processing all data cannot quickly obtain spacecraft state information. Therefore, it is necessary to screen the parameter set and identify the parameters that best reflect the spacecraft’s state. Considering that there is related information among spacecraft state parameters, we can determine the correlation between parameters by mining the associated knowledge among them and selecting features based on the degree of correlation. In this paper, the Kendall rank correlation coefficient is used as a tool for parameter correlation analysis.

Suppose there are two random variables $X$ and $Y$ with $n$ data points, forming $n$ element pairs $(X_{i}, Y_{i})$. Concordant pairs represent that $X$ and $Y$ have the same trend, while discordant pairs indicate opposite trends. Kendall’s tau statistic calculates the number of concordant and discordant pairs in pairwise comparisons. The correlation coefficient ranges from −1 to 1, with negative numbers indicating negative correlation, positive numbers indicating positive correlation, and larger absolute values indicating a closer correlation. When the correlation coefficient is approaching 0, the correlation is less significant. The calculation formula is as follows:

$$\tau = \frac{2}{n (n - 1)} \sum_{i < j} \operatorname{sgn}(X_{i} - X_{j}) \, \operatorname{sgn}(Y_{i} - Y_{j}), \tag{1}$$

where $\operatorname{sgn}$ is the sign function.
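Equation (1) can be evaluated directly by looping over all pairs with $i < j$. The sketch below is a plain transcription of that formula; the function names are illustrative, and the quadratic loop is intended for clarity rather than for long telemetry sequences.

```python
from typing import Sequence


def sgn(v: float) -> int:
    """Sign function: -1, 0, or 1."""
    return (v > 0) - (v < 0)


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall rank correlation as in Equation (1); tied pairs contribute 0."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    total = sum(
        sgn(x[i] - x[j]) * sgn(y[i] - y[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 2.0 * total / (n * (n - 1))


tau_same = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])      # identical ordering
tau_opposite = kendall_tau([1, 2, 3, 4], [40, 30, 20, 10])  # reversed ordering
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])               # one discordant pair
```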

Before performing feature selection, we discretize the continuous attribute parameters into $K$ finite intervals using equidistant binning, resulting in $K$ states for each parameter. The Kendall correlation coefficient is employed to measure the correlation among various parameters. Parameters with lower correlation with other variables are removed first. We set a threshold, denoted as $\tau$ (a cut-off value, not to be confused with the coefficient in Equation (1)), and if the absolute sum of the correlation coefficients between a parameter and other variables is lower than $\tau$, the parameter is removed. Please note that this $\tau$ threshold is not solely used to keep the highest correlated features. Instead, it serves as a cut-off value to filter out features that have little correlation with the target variable. This approach helps to reduce the dimensionality of the data and prevent overfitting.
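The selection procedure can be sketched as follows. Two points are assumptions rather than statements from the paper: "absolute sum" is read here as the sum of absolute coefficients, and the parameter names, bin count, and threshold value are hypothetical.

```python
from typing import Dict, List, Sequence


def equal_width_bins(values: Sequence[float], k: int) -> List[int]:
    """Discretize values into k equal-width intervals (states 0 .. k-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall rank correlation as in Equation (1)."""
    n = len(x)
    sign = lambda v: (v > 0) - (v < 0)
    total = sum(sign(x[i] - x[j]) * sign(y[i] - y[j])
                for i in range(n) for j in range(i + 1, n))
    return 2.0 * total / (n * (n - 1))


def select_features(params: Dict[str, Sequence[float]], k: int,
                    threshold: float) -> List[str]:
    """Keep parameters whose summed |tau| against all others reaches the threshold."""
    binned = {name: equal_width_bins(vals, k) for name, vals in params.items()}
    kept = []
    for name, states in binned.items():
        score = sum(abs(kendall_tau(states, other))
                    for other_name, other in binned.items() if other_name != name)
        if score >= threshold:
            kept.append(name)
    return kept


# Hypothetical telemetry parameters, for illustration only.
params = {
    "rotor_speed": [1, 2, 3, 4, 5, 6],
    "motor_current": [2, 4, 6, 8, 10, 12],
    "noise": [3, 1, 3, 1, 3, 1],
}
kept = select_features(params, k=3, threshold=0.5)
```

Because binning introduces ties, two perfectly monotone parameters score below 1 under Equation (1); the threshold should be chosen with that in mind.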

3.2.3. Data Classification

Based on the spacecraft mission log, we can categorize the obtained historical telemetry data of the spacecraft into several datasets under different working conditions. For each category of telemetry data under these conditions, we employ the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm to cluster them, outlining patterns or trends in the data. The distribution of data points identified by the clustering algorithm represents the “features” of specific operating conditions.

As shown in Figure 7, we first partition the data based on task time labels and then cluster the partitioned data. The data of a particular partition are divided into six clusters, and the data points from each cluster serve as a reference data group. The black data points are outliers and are removed during the feature extraction process.
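To make the clustering step concrete, the following is a compact pure-Python DBSCAN written for illustration; it is not the authors' implementation, and a production system would more likely rely on an established library. The sample points, `eps`, and `min_pts` are hypothetical. Each resulting cluster plays the role of a reference data group, and points labeled as noise are discarded.

```python
from collections import defaultdict
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:    # j is a core point: keep expanding
                queue.extend(reach)
        cluster += 1
    return labels


def reference_groups(points, labels):
    """Group points by cluster label and drop the noise points."""
    groups = defaultdict(list)
    for p, label in zip(points, labels):
        if label != NOISE:
            groups[label].append(p)
    return dict(groups)


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (10.0, 0.0)]
labels = dbscan(points, eps=0.3, min_pts=3)
groups = reference_groups(points, labels)
```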

3.3. Detecting Anomalies in Real-Time Telemetry Data

3.3.1. Data Matching Based on a Knowledge Graph

This study is based on a constructed knowledge graph with feature data information, which allows the generation of feature data subsets representing different spacecraft working states. When real-time telemetry data are received, a reference dataset with a similar working state can be selected from the knowledge graph for comparison to determine whether the spacecraft is in an abnormal condition.

First, perform matching based on the static attribute data of sensors. According to the spacecraft’s mission state and working mode, select the sensor feature datasets with the same mission state and working mode from the spacecraft equipment knowledge graph for matching and assign the corresponding sensor feature data to each sensor’s real-time data.

Next, perform matching based on the dynamic attribute data of sensors. At each anomaly detection interval, group the real-time sensor data according to the sampling time to form a detection sequence and compute the similarity between this detection sequence and the sensor feature data corresponding to the static attribute data matching process. If the similarity obtained from the matching exceeds a predefined threshold, the detection sequence is considered normal. If the similarity is below the threshold, the detection sequence is deemed abnormal, and the abnormal sensor information is used to retrieve the corresponding spacecraft equipment from the spacecraft equipment knowledge graph, outputting the abnormal equipment information.
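The two matching stages can be sketched as a lookup followed by a similarity test. The paper does not specify the similarity measure, so the one below (the inverse of one plus the mean absolute deviation, scaled by the reference spread) is purely an assumption, as are the dictionary contents, the sensor and equipment names, and the threshold value.

```python
from statistics import mean

# Hypothetical stand-ins for feature data and sensor links stored in the
# spacecraft equipment knowledge graph.
REFERENCE_FEATURES = {
    ("attitude_hold", "nominal", "rotor_speed"): [100.0, 101.0, 99.0, 100.0],
}
SENSOR_TO_EQUIPMENT = {"rotor_speed": "CMG-1"}


def similarity(detected, reference):
    """Assumed measure in (0, 1]; 1.0 means the sequences coincide."""
    spread = (max(reference) - min(reference)) or 1.0
    deviation = mean(abs(d - r) for d, r in zip(detected, reference)) / spread
    return 1.0 / (1.0 + deviation)


def detect(mission_state, mode, sensor, detected, threshold=0.8):
    # Static matching: pick the reference with the same mission state and mode.
    reference = REFERENCE_FEATURES[(mission_state, mode, sensor)]
    # Dynamic matching: compare the detection sequence with that reference.
    score = similarity(detected, reference)
    abnormal = score < threshold
    return {
        "similarity": score,
        "abnormal": abnormal,
        # Abnormal sensors are traced back to their equipment.
        "equipment": SENSOR_TO_EQUIPMENT[sensor] if abnormal else None,
    }


normal_result = detect("attitude_hold", "nominal", "rotor_speed",
                       [100.0, 101.0, 99.5, 100.0])
abnormal_result = detect("attitude_hold", "nominal", "rotor_speed",
                         [90.0, 91.0, 89.0, 90.0])
```

A sequence whose similarity equals the threshold is treated as normal here; the source leaves that boundary case open.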

3.3.2. Anomaly Analysis Based on a Knowledge Graph

Data matching algorithms can determine whether the data from spacecraft equipment sensors display a state different from their stable or normal status. When real-time telemetry data are received, anomaly detection algorithms identify potential anomalous telemetry parameters. Once an anomaly is detected, the abnormal parameters and phenomena output by the algorithm can be used as keywords to locate the anomalous event in the knowledge graph, thereby aiding staff in developing anomaly handling plans and conducting anomaly cause analysis.

Anomaly analysis is a significant application of knowledge graphs in intelligent knowledge processing. In this application scenario, we aim to construct a classifier capable of matching corresponding events based on user-inputted keywords. This task can be seen as a specialized form of text classification. However, unlike standard text classification, user input is often incomplete, with some keywords lost during the input process. Therefore, the classifier must still accurately match the correct event when some features (keywords) are missing.

To resolve this issue, we selected the Naive Bayes model as our classification algorithm, primarily because it can handle inputs of indefinite length, and its training and prediction speeds are fast. These traits are crucial for our scenario, where we need to frequently update the model. Furthermore, the Naive Bayes model has a higher tolerance for missing input features compared to other models, which is also essential for our task.

Specifically, let $X = (x_{1}, x_{2}, \dots, x_{n})$ represent the $n$-dimensional keyword vector of abnormal parameters or abnormal phenomena output by the anomaly detection algorithm and $E = (e_{1}, e_{2}, \dots, e_{m})$ represent the $m$ abnormal event categories. Under the assumption that keyword probabilities are mutually independent within each abnormal event category, the Naive Bayes classifier selects the category with the maximum posterior probability as the classification label for the keyword vector $X$:

$$E(X) = \underset{e_{j} \in E}{\arg\max} \; P(e_{j}) \prod_{i = 1}^{n} P(x_{i} \mid e_{j}) \tag{2}$$

Here, $P(e_{j})$ is the prior probability of the abnormal event category $e_{j}$, which can be calculated by the frequency of the $e_{j}$ class in the sample, and $P(x_{i} \mid e_{j})$ is the probability that $x_{i}$ occurs in the $e_{j}$ class, $j = 1, 2, \dots, m$.

The calculation of $P(x_{i} \mid e_{j})$ generally uses keyword frequency:

$$P(x_{i} \mid e_{j}) = \begin{cases} \dfrac{F(x_{i} \mid e_{j}) \times k}{\sum_{i = 1}^{n} F(x_{i} \mid e_{j})}, & x_{i} \in C \\[2ex] \dfrac{F(x_{i} \mid e_{j})}{\sum_{i = 1}^{n} F(x_{i} \mid e_{j})}, & x_{i} \notin C \end{cases} \tag{3}$$

Here, $F(x_{i} \mid e_{j})$ is the term frequency of $x_{i}$ in the $e_{j}$ class and $\sum_{i = 1}^{n} F(x_{i} \mid e_{j})$ is the term frequency count of all words in the $e_{j}$ class; $C$ is the set of core words corresponding to the $e_{j}$ class, and these core words are some special keywords that are often representative for certain classifications; and $k$ represents the importance of core words in impacting classification, with $k > 1$.

By adopting the above method, the abnormal event type can be determined based on the abnormal keywords, identifying the cause of the anomaly and the handling plan. In addition, when the abnormal information is incomplete, the priority order of abnormal event investigation can be determined according to the probability values output by the classification model.
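A small sketch of the classifier in Equations (2) and (3) is given below. The training samples, event names, and core-word sets are invented for illustration. One deviation from the formulas is an assumption added here: a small floor `eps` replaces zero probabilities so that a single unseen keyword does not eliminate a category outright.

```python
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (keywords, event). Returns class counts and term frequencies."""
    event_counts = Counter(event for _, event in samples)
    term_freq = defaultdict(Counter)
    for keywords, event in samples:
        term_freq[event].update(keywords)
    return event_counts, term_freq


def rank_events(keywords, event_counts, term_freq, core_words, k=2.0, eps=1e-6):
    """Score each event with P(e_j) * prod_i P(x_i | e_j), highest first."""
    total_samples = sum(event_counts.values())
    scores = {}
    for event, count in event_counts.items():
        score = count / total_samples                 # prior P(e_j)
        total_terms = sum(term_freq[event].values())  # all term counts in e_j
        for x in keywords:
            weight = k if x in core_words.get(event, set()) else 1.0
            p = term_freq[event][x] * weight / total_terms   # Equation (3)
            score *= max(p, eps)                      # eps floor: an assumption
        scores[event] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical anomaly reports, for illustration only.
samples = [
    (["rotor_speed", "low", "current", "high"], "bearing_wear"),
    (["rotor_speed", "low", "temperature", "high"], "bearing_wear"),
    (["voltage", "low", "current", "low"], "power_fault"),
]
core_words = {"bearing_wear": {"rotor_speed"}, "power_fault": {"voltage"}}
event_counts, term_freq = train(samples)
ranking = rank_events(["rotor_speed", "low"], event_counts, term_freq, core_words)
```

The sorted list doubles as the investigation priority order mentioned above when the input keywords are incomplete. Note that the core-word weight means the per-class values no longer sum to one, so the scores are useful for ranking rather than as calibrated probabilities.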

4. Example

In Section 3, we introduced the construction and implementation process of the spacecraft knowledge graph and anomaly detection. In the following section, we will demonstrate the effectiveness and feasibility of knowledge graph construction and anomaly detection through a specific spacecraft anomaly detection example, following the process described in the previous section. This section’s example will be divided into three parts: knowledge graph construction, feature data extraction, and telemetry data anomaly detection experiment.

4.1. Constructing the Knowledge Graph

Based on the telemetry engineering parameter table, anomaly event reports, anomaly handling plans, and manual monitoring experience of a certain type of domestic spacecraft Control Moment Gyroscope (CMG) system, we constructed a knowledge graph. The CMG, as a critical spacecraft electromechanical device, plays a crucial role in the stable control of the spacecraft. Using 42 months of historical telemetry data of the CMG system in orbit, this study carried out feature data extraction and effectively integrated it with the knowledge graph. To ensure data security, all data have been encrypted.

4.1.1. Knowledge Graph Ontology Construction

The construction of the knowledge graph ontology is the core element of knowledge graph construction. As a domain knowledge abstraction model, the ontology needs to be iteratively optimized in combination with expert experience and domain knowledge characteristics to provide a foundation for applications. In this paper, we constructed two knowledge graphs for spacecraft anomaly detection: the spacecraft equipment knowledge graph and the spacecraft abnormal event graph. The ontology models are shown in Figure 8, Figure 9 and Figure 10.

In the tree-like ontology structure of the spacecraft equipment knowledge graph, different hierarchical relationships, system equipment connection relationships, task state switching conditions, and parameter correlation relationships are defined. The reference feature data of telemetry parameters are integrated as anomaly detection-matching templates.

The ontology design of the spacecraft abnormal event graph includes entity types such as abnormal phenomena, causes, equipment, and parameters, and uses reference feature data for precise matching. In the event chain ontology, subgraphs describing the transition from normal to abnormal states are incorporated, and the abnormal development process is intuitively described using the graph model. The event chain subgraph contains normal states (white nodes) and abnormal states (gray nodes), with state-switching conditions as relationships. The subgraph of normal states is strongly connected, and each recoverable abnormal state node is also strongly connected with the normal-state subgraph, since the system can return from it to normal operation. Irrecoverable abnormal state nodes have only unidirectional relationships with other nodes.
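The connectivity properties of the event chain subgraph can be checked with a short sketch. The state names and edges below are hypothetical and only mirror the structure described above: two normal states, one recoverable abnormal state, and one irrecoverable abnormal state.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start by following directed edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def strongly_connected(graph, nodes):
    """True if every node in `nodes` can reach every other one using only edges inside `nodes`."""
    nodes = set(nodes)
    sub = {u: [v for v in graph.get(u, ()) if v in nodes] for u in nodes}
    return all(nodes <= reachable(sub, u) for u in nodes)

# Hypothetical event chain: edges stand for state-switching conditions.
event_chain = {
    "normal_1": ["normal_2"],
    "normal_2": ["normal_1", "recoverable", "irrecoverable"],
    "recoverable": ["normal_1"],   # the system can return to normal
    "irrecoverable": [],           # no way back
}

normal_ok = strongly_connected(event_chain, ["normal_1", "normal_2"])
recoverable_ok = strongly_connected(event_chain, ["normal_1", "normal_2", "recoverable"])
irrecoverable_ok = strongly_connected(event_chain, ["normal_1", "normal_2", "irrecoverable"])
```

Under these assumed edges, the first two checks hold and the third does not, which matches the distinction between recoverable and irrecoverable states.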

4.1.2. Knowledge Extraction

In our process of building the knowledge graph, structured data constitutes a significant proportion. The remaining semi-structured and unstructured data require further processing. The semi-structured corpus can be converted into structured data in batches by defining rules manually, and structured data can be directly mapped to triples to construct the knowledge graph. Unstructured texts, on the other hand, need to undergo manual text annotation and the training of knowledge extraction models to achieve automated knowledge extraction from the text corpus.
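As an illustration of the direct mapping from structured data to triples, the following sketch converts table rows into (subject, relation, object) tuples. The column names, relation names, and row values are hypothetical and do not come from the actual telemetry engineering parameter table.

```python
def rows_to_triples(rows, subject_key, relation_map):
    """Map each row to triples; relation_map links a column name to a relation name."""
    triples = []
    for row in rows:
        subject = row[subject_key]
        for column, relation in relation_map.items():
            value = row.get(column)
            if value not in (None, ""):  # skip empty cells
                triples.append((subject, relation, value))
    return triples

# Hypothetical rows of a parameter table.
rows = [
    {"parameter": "CMG005", "equipment": "CMG", "unit": ""},
    {"parameter": "CMG013", "equipment": "CMG", "unit": "rpm"},
]
relation_map = {"equipment": "belongs_to", "unit": "has_unit"}
triples = rows_to_triples(rows, "parameter", relation_map)
```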

A typical example of an unstructured corpus is shown in Figure 11. This corpus comes from the anomaly event report, which contains the anomaly event name, anomaly time, spacecraft name, anomaly phenomenon, anomaly equipment, anomaly parameter, anomaly cause, and handling plan.

Before employing the model to extract knowledge, it is necessary to annotate a sizeable amount of data to train the model. The interface of the text annotation software, doccano, used for this purpose is shown in Figure 12.

In this study, a total of 50 anomaly records were selected for manual annotation. The annotated corpus was divided into training and test sets at a ratio of 4:1 for both the entity recognition and relationship extraction models. It is important to emphasize that this division is specifically intended to measure the performance of the model in the Named Entity Recognition (NER) phase, not to represent final performance. To evaluate the accuracy of the extraction results, the F1 score is used, and its calculation formula is:

F_{1} = \frac{2 \times P \times R}{P + R}

(4)

In the formula, precision $P$ is the ratio of the number of correctly recognized entities to the number of recognized entities, and recall $R$ is the ratio of the number of correctly recognized entities to the total number of entities. Table 1 presents the evaluation results of the entity recognition model on the test set, demonstrating that the constructed model can basically satisfy the knowledge extraction task of anomaly events.
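Equation (4) can be evaluated with a few lines of code. In this sketch, an entity is represented as a hypothetical (start, end, type) tuple, and an entity counts as correctly recognized only when all three fields match the annotation. The sample spans and type names are invented for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision P, recall R, and F1 for sets of recognized and annotated entities."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

# Hypothetical annotations: 4 gold entities, 3 predictions, 2 of them correct.
gold = {(0, 5, "EQUIPMENT"), (10, 14, "PARAMETER"), (20, 25, "CAUSE"), (30, 33, "TIME")}
predicted = {(0, 5, "EQUIPMENT"), (10, 14, "PARAMETER"), (40, 45, "CAUSE")}
p, r, f1 = precision_recall_f1(predicted, gold)
```

Here P = 2/3 and R = 1/2, so F1 = 4/7.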

Using the entity recognition and relationship extraction models built in this study, we conducted knowledge extraction on all unstructured text corpora in the library. A total of 1195 entities, including anomaly events, anomaly times, spacecraft, and anomaly equipment, were extracted, along with 1311 relationships corresponding to anomaly times, anomaly phenomena, and anomaly equipment. The extraction results are shown in Table 2.

4.2. Extracting Feature Data

The selected experimental data consist of telemetry data from a single-frame Control Moment Gyroscope (CMG) subsystem of a spacecraft during its 42-month in-orbit operation. Approximately one day of data is extracted every 10 days, totaling 155 days and involving 2.4 million records and 17 telemetry parameters. First, we perform data cleaning.

4.2.1. Data Cleaning

The data cleaning process includes outlier removal and missing value imputation.

(1): Outlier Removal

Outlier removal is an indispensable step in data preprocessing, as the collected data series may contain abnormal values that affect the overall feature extraction of the data series. When removing outliers, the upper and lower threshold limits (U and L) of each parameter are read first. Each data point in the original parameter data series is then compared against these limits, and points that fall outside the range from L to U are deleted. Figure 13 provides a visualization example of outlier removal.
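The threshold-based outlier removal amounts to a simple filter. The sketch below assumes that a series is a list of (timestamp, value) pairs and that the limits are inclusive; the sample values and limits are hypothetical.

```python
def remove_outliers(series, lower, upper):
    """Keep only the (timestamp, value) samples whose value lies within [lower, upper]."""
    return [(t, v) for t, v in series if lower <= v <= upper]

# Hypothetical samples of one parameter with limits L = 0 and U = 10.
series = [(0, 1.0), (1, 99.0), (2, 2.0), (3, -50.0)]
cleaned = remove_outliers(series, lower=0.0, upper=10.0)
```

Deleting points leaves gaps in the time grid, which is why missing value imputation follows this step.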

(2): Missing Value Imputation

Missing data are a common problem in data collection, transmission, and storage processes. Missing data occur randomly, and current methods for handling missing values are generally divided into three categories: deletion, imputation, and no processing. For spacecraft time series data, direct deletion would result in the loss of data at certain time points, while not processing would affect subsequent feature data extraction. Therefore, we use imputation to handle missing values. As spacecraft parameters have a certain degree of stability and generally do not experience large fluctuations in a short period of time, missing values over short gaps can be filled directly with the value of the previous time point. For long gaps, filling the missing values with the last non-missing value before the gap would lose the information contained in that part of the time series and may even affect subsequent modeling. Therefore, for long gaps we use regression imputation, establishing a regression equation and using its predicted values for missing value imputation. Figure 14 provides a visualization example of missing value imputation using regression imputation.
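The two imputation strategies can be combined as in the following sketch. It assumes that missing samples are marked with None, that a gap of at most three samples counts as short, and that the regression equation is a least-squares line of value against sample index. None of these choices is specified in the text, so they should be read as placeholders.

```python
def impute(values, max_short_gap=3):
    """Fill None gaps: short gaps with the previous value, long gaps with a fitted line."""
    observed = [(i, v) for i, v in enumerate(values) if v is not None]
    # Ordinary least-squares fit of value against sample index (assumed regression model).
    mean_x = sum(i for i, _ in observed) / len(observed)
    mean_y = sum(v for _, v in observed) / len(observed)
    sxx = sum((i - mean_x) ** 2 for i, _ in observed)
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in observed)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x

    filled = list(values)
    i, n = 0, len(values)
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        j = i
        while j < n and values[j] is None:  # find the end of the gap
            j += 1
        short = (j - i) <= max_short_gap and i > 0
        for idx in range(i, j):
            filled[idx] = filled[i - 1] if short else intercept + slope * idx
        i = j
    return filled

# Hypothetical series following value = index + 1, with one short and one long gap.
values = [1.0, 2.0, None, 4.0, 5.0, None, None, None, None, None, 11.0]
filled = impute(values)
```

In this example the single missing sample is filled with 2.0 (previous value), while the five-sample gap follows the fitted line and yields values close to 6, 7, 8, 9, and 10.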

4.2.2. Feature Selection

After removing redundancy, excluding outliers, and imputing missing values, we use the Kendall correlation coefficient to measure the correlation between various parameters and plot the correlation coefficient measurement results among various parameters. Figure 15 displays a heatmap of the correlation among various parameters in the CMG system.
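For reference, the Kendall correlation coefficient can be computed as sketched below. The sketch uses the tau-b variant in its direct pairwise form, which is quadratic in the series length and therefore only suitable for small samples. The choice of tau-b and the toy series are assumptions, since the text does not state which variant was used.

```python
from itertools import combinations
import math

def kendall_tau(x, y):
    """Kendall tau-b between two equal-length numeric sequences."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = math.sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0

def correlation_matrix(params):
    """params: dict name -> sequence. Returns dict (name_a, name_b) -> tau."""
    names = sorted(params)
    return {(a, b): kendall_tau(params[a], params[b]) for a in names for b in names}

# Hypothetical short series for three parameters.
params = {
    "CMG003": [1, 2, 3, 4, 5],
    "CMG005": [3, 4, 1, 2, 5],
    "CMG011": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(params)
```

A matrix of this kind is what a correlation heatmap displays.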

To improve the performance of feature extraction, we excluded parameters with lower correlations with other variables and selected seven parameters from the system parameters for feature extraction. Referring to the heatmap above, the seven selected parameter codes are CMG003, CMG005, CMG011, CMG012, CMG013, CMG015, and CMG017.

4.2.3. Data Classification

For data classification, we extracted the time labels of various work states according to the task log and divided the dataset accordingly. Then, we performed clustering on the telemetry data under each work state using the DBSCAN algorithm to determine the abnormal boundaries. Figure 16 shows the visualization after data partitioning and clustering, with the seven-dimensional telemetry data under each state reduced in dimensionality for display.

To perform anomaly detection for the CMG system, we selected clusters with more than 50 data points as a group of feature data and extracted a total of 329 groups of feature data. Each group of data is classified according to task status and time labels as a type of feature node and then fused with the spacecraft equipment knowledge graph. This improves the accuracy and reliability of anomaly detection.
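The selection of feature groups from clustering output can be sketched as follows. The sketch assumes that cluster labels are integers and that the label -1 marks noise points, which is the convention used by scikit-learn's DBSCAN; the function name and the synthetic labels are illustrative.

```python
from collections import defaultdict

def extract_feature_groups(points, labels, min_size=50):
    """Group points by cluster label and keep clusters with more than min_size points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label != -1:  # skip noise points
            groups[label].append(point)
    return {label: pts for label, pts in groups.items() if len(pts) > min_size}

# Synthetic example: cluster 0 has 60 points, cluster 1 has 50, and 5 points are noise.
labels = [0] * 60 + [1] * 50 + [-1] * 5
points = [(float(i), 0.0) for i in range(len(labels))]
feature_groups = extract_feature_groups(points, labels)
```

Only cluster 0 is retained here, because cluster 1 has exactly 50 points and the criterion requires more than 50.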

The extracted feature data are stored in the Neo4j database. Figure 17 provides a visualization of a portion of the spacecraft equipment knowledge graph.

4.3. Telemetry Data Anomaly Detection Experiment

4.3.1. Data Matching Based on a Knowledge Graph

When selecting CMG system anomaly data, we chose several real anomaly occurrences in the spacecraft and, for each, selected a one-hour window of telemetry data for testing (30 min before and 30 min after the anomaly occurrence). According to the anomaly report, we selected 60 data samples for testing. These data samples were not used for feature data extraction, so they can be used to evaluate the generalization performance of the anomaly detection algorithm. After preparing the experimental data, we conducted the telemetry data anomaly detection experiment.

In the first step of the anomaly detection algorithm, the slicing window method is applied to the experimental data for static information matching, so that the corresponding reference data can be matched quickly according to the static attributes of the experimental data and the search speed is improved. For example, the slicing window can be set to a length of 100 data points as one detection cycle; static information matching is performed for the data within each slicing window to obtain the reference dataset, and anomaly detection is then performed on the telemetry data sequence.
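The windowing and static matching step might look like the sketch below. It assumes non-overlapping windows, that each record carries a work-state field named "mode", and that a reference group matches when its mode equals the most common mode in the window. The field names and matching rule are assumptions; the actual static attributes are not detailed in the text.

```python
from collections import Counter

def slicing_windows(records, window=100):
    """Yield consecutive, non-overlapping windows of at most `window` records."""
    for start in range(0, len(records), window):
        yield records[start:start + window]

def match_reference(window_records, reference_groups):
    """Return the reference groups whose mode equals the window's most common mode."""
    dominant_mode = Counter(r["mode"] for r in window_records).most_common(1)[0][0]
    return [g for g in reference_groups if g["mode"] == dominant_mode]

# Hypothetical telemetry records and reference feature groups.
records = [{"mode": "yaw_maneuver", "values": (0.0,)} for _ in range(250)]
reference_groups = [
    {"id": 1, "mode": "yaw_maneuver"},
    {"id": 2, "mode": "orbit_control"},
]
window_sizes = [len(w) for w in slicing_windows(records)]
matched = [match_reference(w, reference_groups) for w in slicing_windows(records)]
```

With 250 records and a window length of 100, three detection cycles are produced, the last one containing 50 records.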

In each detection cycle, we selected the one or two most similar feature datasets as reference data and used the DBSCAN clustering algorithm for analysis. Figure 18 shows the clustering results obtained using the DBSCAN method, with normal reference data represented by dots and abnormal data represented by plus signs. By comparison with the reference dataset, abnormal data points can be easily identified.
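One way to realize the comparison with reference data is sketched below. It assumes that the test window is clustered together with the reference data and that a test point is flagged when it ends up as noise or in a cluster containing no reference point. This decision rule, the minimal DBSCAN implementation, and the parameter values are illustrative assumptions rather than the exact procedure used in the experiment.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:  # core point: keep expanding
                queue.extend(nb_j)
    return labels

def detect_anomalies(reference, test, eps, min_pts):
    """Flag test points that do not share a cluster with any reference point."""
    labels = dbscan(reference + test, eps, min_pts)
    ref_clusters = {lab for lab in labels[:len(reference)] if lab != -1}
    return [lab not in ref_clusters for lab in labels[len(reference):]]

# Hypothetical two-dimensional data: a dense reference grid and two test points.
reference = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
test = [(0.25, 0.25), (5.0, 5.0)]
flags = detect_anomalies(reference, test, eps=0.15, min_pts=4)
```

The first test point lies inside the reference cluster and is not flagged, while the distant second point is flagged as abnormal.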

According to the tests, the method proposed in this paper can identify 95% of the anomalies in the test set. Even though the current implementation is only a small-scale test demonstration, it still shows high accuracy compared to the traditional threshold-based methods. Threshold-based methods are widely adopted in spacecraft ground control systems, where technicians monitor telemetry data in near real time during the spacecraft’s in-orbit operation, aided by signal thresholds to check whether values exceed preset ranges. Typically, this type of monitoring targets only a portion of the telemetry sequences. Although the threshold-based approach is in common use in the industry, our proposed method demonstrates significant advantages over it. The accuracy comparison results between our proposed method and the traditional threshold-based method are shown in Table 3.

4.3.2. Discussion of Experimental Results

Before we delve into the discussion, we would first like to introduce some specifics of the threshold-based method. The method, currently widely adopted for anomaly detection in spacecraft ground management, is based on setting thresholds for each telemetry parameter. It is only when a parameter surpasses its threshold that the anomaly detection system can identify an anomaly. Presently, a substantial amount of telemetry data are stored in the database, with many anomalies still unknown to the management personnel. Spacecraft management staff regularly clean and check the data, typically discovering anomalies only after they have occurred for a certain duration.

The currently adopted threshold-based method considers that the system has a total of six modes, namely three-axis stable earth flight mode, yaw maneuver flight mode, earth-to-yaw mode, yaw-to-earth mode, orbit control mode, and anomaly. The threshold-based method defines the maximum and minimum values for each telemetry parameter in the five normal modes, excluding the anomaly. It conducts anomaly detection based on the upper and lower limits of telemetry parameters in each mode. Although this method is simple and efficient, many anomalies do not cause telemetry signals to exceed their limits when they occur.
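The per-mode threshold check can be sketched as follows. The mode names follow the text, but the parameter limits and sample values are hypothetical, since the real limits are not given.

```python
def threshold_alarms(sample, limits):
    """Return the names of parameters whose value lies outside its (low, high) limits."""
    alarms = []
    for name, value in sample.items():
        low, high = limits[name]
        if not (low <= value <= high):
            alarms.append(name)
    return alarms

# Hypothetical limits for two of the five normal modes.
MODE_LIMITS = {
    "three_axis_stable": {"CMG005": (0.0, 10.0), "CMG013": (-5.0, 5.0)},
    "orbit_control": {"CMG005": (0.0, 20.0), "CMG013": (-8.0, 8.0)},
}

def detect(mode, sample):
    """Check one telemetry sample against the limits of the given mode."""
    return threshold_alarms(sample, MODE_LIMITS[mode])

sample = {"CMG005": 12.0, "CMG013": 1.0}
alarms_stable = detect("three_axis_stable", sample)
alarms_orbit = detect("orbit_control", sample)
```

The same sample raises an alarm in one mode and none in the other. Any abnormal pattern that stays inside the limits of its mode produces no alarm at all, which is the weakness noted above.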

We applied a clustering algorithm to further subdivide the data of these five normal modes, extracting over 300 sets of feature data. This approach allows us to further refine the data distribution under each mode compared to the few data modes divided by the original threshold method. Consequently, the accuracy of our anomaly detection algorithm has been further improved.

Of note, 16 false alarms were generated using the threshold method. This occurred because management staff attempted to increase the sensitivity of the old threshold-based detection method by narrowing the threshold range based on their manual monitoring experience. However, they overlooked the sudden changes in some telemetry parameters during spacecraft equipment mode switches or under certain special modes, as shown in Figure 19. After summarizing these anomalies, the managers realized that they exhibited similar patterns and classified these situations as false alarms.

The proposed method was able to resolve these 16 situations. This is because the process of constructing the feature data of the normal state using this method has already included the sudden changes in parameters brought about by the spacecraft mode transitions. Therefore, these false alarms have been resolved.

The one false alarm that this method did not resolve was due to a shift in the distribution of telemetry data under a certain operating mode as the working time of the spacecraft equipment became longer. There was a significant time gap between the feature extraction and the subsequent collection of test data. This also indicates that the feature data in the knowledge graph are subject to temporal limitations. The knowledge graph needs to be updated with feature data in a timely manner to adapt to the operating conditions at different stages of the spacecraft’s lifecycle.

While our approach was successful in detecting anomalies in most of the test cases, it failed to identify exceptions in some specific situations. In particular, two anomalies in our test set were not detected.

These two missed cases share a common characteristic: the range of distribution of the anomalous data points was relatively small compared to the reference data. This could have made it challenging for the DBSCAN clustering algorithm to distinguish these anomalous data points from the normal ones. This indicates that while our approach performs well in most situations, it may need improvements when dealing with subtler anomalies.

Additionally, our method did not account for the time dependency inherent in telemetry data sequences. In time-series data, the value of a data point may be influenced by previous data points. If the occurrence of an anomaly is related to previous data points, our method may fail to detect it.

To improve our approach, we are considering combining it with other algorithms to better address these issues. In particular, we are exploring the use of algorithms specifically designed for time-series data, such as anomaly detection algorithms based on autoregressive models. We expect that these improvements will enhance our method’s performance in handling a variety of situations, especially in dealing with subtle anomalies and accounting for time dependency.

4.3.3. Anomaly Analysis Based on a Knowledge Graph

During the operation of a spacecraft, the anomaly detection algorithm can identify abnormal telemetry data and output the abnormal parameters and devices. Upon detecting an anomaly, further data analysis can reveal the characteristics of the abnormal phenomenon. The abnormal parameters and phenomena can be organized into keywords and input into a Naive Bayes classifier to calculate the probability distribution of different anomaly events. In this way, the cause of the anomaly event can be quickly located based on the spacecraft anomaly event graph, and corresponding measures can be taken to maintain the normal operation of the spacecraft. Figure 20 shows an experimental case of anomaly analysis for a spacecraft CMG system.

In this case, further data analysis revealed abnormal fluctuations in telemetry parameters CMG005, CMG013, and CMG017. CMG005 experienced a sudden increase, CMG013 had large fluctuations, and CMG017 showed periodic decreases. By inputting these abnormal parameters and phenomena into the Naive Bayes classifier, the probability distribution of three different anomaly event categories was calculated. According to the classifier’s results, event 4 had the highest probability. Therefore, this set of abnormal parameters and phenomena was classified as event 4. Further, using the spacecraft anomaly event graph, the cause of event 4 was quickly identified: a failure in a motor component in the CMG system led to unstable attitude control of the spacecraft. Based on the anomaly handling plan, engineers can troubleshoot the faulty control moment gyroscope component and reset the drive circuit to restore its normal working status, thereby resolving the spacecraft’s anomaly issue.

In practical applications, incomplete anomaly information may exist. Engineers can determine the priority of anomaly event investigation based on the probability values output by the classification model, improving investigation efficiency. For example, in this case, if some telemetry parameter data were missing, engineers could first investigate event 4 with the highest probability, followed by events 2 and 9. We have conducted further experiments to address the potential issue of the Naive Bayes model’s feature independence assumption not being fully valid in practice, which might lead to inaccurate probability estimates. We selected 20 sample events and constructed 20 incomplete inputs to evaluate the classifier’s performance. When a prediction is counted as correct only if the event with the highest predicted probability matches the correct label, the accuracy is 65%. When a prediction is counted as correct if the correct label is among the three events with the highest predicted probabilities, the accuracy reaches 100%. This indicates that even if the feature independence assumption causes inaccurate event probability estimates, the correct event still ranks near the top, and the Naive Bayes classifier can still achieve satisfactory results for prioritizing investigation.
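The two accuracy figures correspond to top-1 and top-3 accuracy over ranked classifier outputs. The sketch below shows how such figures are computed; the rankings and labels are synthetic and do not reproduce the 20 experimental cases.

```python
def top_k_accuracy(rankings, labels, k):
    """Fraction of cases whose correct label appears among the k highest-ranked events."""
    hits = sum(1 for ranked, label in zip(rankings, labels) if label in ranked[:k])
    return hits / len(labels)

# Synthetic rankings (most probable event first) and correct labels.
rankings = [
    ["event_4", "event_2", "event_9"],
    ["event_2", "event_4", "event_9"],
    ["event_9", "event_4", "event_2"],
    ["event_1", "event_2", "event_4"],
]
labels = ["event_4", "event_4", "event_9", "event_4"]
top1 = top_k_accuracy(rankings, labels, k=1)
top3 = top_k_accuracy(rankings, labels, k=3)
```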

Thus, the anomaly analysis method based on the knowledge graph provides effective support for anomaly handling and maintenance during spacecraft operation. However, it is important to acknowledge its limitations. The knowledge graph is confined to the information it contains and may not account for unknown or novel situations. Additionally, its effectiveness is tied to the quality and completeness of data input. Therefore, continuous maintenance and updates by experts are necessary to keep the system current and accurate. The method is not a standalone solution but a tool that, with expert knowledge and ongoing upkeep, can strongly support anomaly analysis and maintenance across various fields.

5. Conclusions

In addressing the complexity of spacecraft system structures and functions and the challenges posed by existing data-driven methods for anomaly detection, this paper proposed an innovative method for applying knowledge graph technology with integrated feature data in spacecraft anomaly detection. Our primary work involved designing ontology concepts of the spacecraft equipment knowledge graph based on expert knowledge, and extracting feature data from the historical operation data of the spacecraft to build a rich spacecraft equipment knowledge graph. Furthermore, we constructed spacecraft anomaly event knowledge graphs based on various types of anomaly features. This method enabled anomaly device location and anomaly cause judgement during spacecraft operation by matching telemetry data with the feature data in the knowledge graph.

Our experimental analysis has shown the potential of the proposed method not only to enhance anomaly detection accuracy and reduce false alarm rates in spacecraft compared to traditional threshold-based strategies, but also to provide a practical and effective means for analyzing anomaly causes and formulating treatment plans. However, its efficacy is contingent on the quality of data, the comprehensiveness of the knowledge graph, and the accuracy of the anomaly detection algorithm. Despite these limitations, the method offers scalability and flexibility, allowing for the incorporation of more entities and anomaly records, multidisciplinary knowledge integration, efficient data management, and quick abnormal state identification. Future work will explore the use of knowledge graph embedding models [46] to advance the processing of the knowledge graph, predicting missing relations, generating new knowledge, and supporting intelligent simulation testing and anomaly detection. While our method shows promise, it should be noted that our research is still in its nascent stages. To realize a truly efficient and effective system for spacecraft anomaly detection and diagnosis, there is much that needs to be explored and improved.

Author Contributions

Conceptualization, X.Y. and P.H.; methodology, X.Y. and P.H.; software, P.H. and S.C.; validation, P.H. and S.C.; formal analysis, S.C.; investigation, X.Y. and P.H.; resources, X.Y.; data curation, P.H. and S.C.; writing—original draft preparation, P.H.; writing—review and editing, X.Y.; visualization, P.H. and S.C.; supervision, X.Y.; project administration, X.Y.; funding acquisition, X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Pre-Research Project 50902060403 from the Equipment Development Department.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Due to confidentiality agreements, the data from this study cannot be publicly shared.

Conflicts of Interest

The authors declare no conflict of interest.

References

Li, W.; Meng, Q. Fault Detection for in-orbit Satellites Using an Adaptive Prediction Model. Chin. J. Space Sci. 2014, 34, 201–207. [Google Scholar] [CrossRef]
Chen, B.; Lu, G.; Fang, H.Z. Method of satellite anomaly detection based on least squares support vector machine. Comput. Meas. Control 2014, 22, 690–692. [Google Scholar]
Yairi, T.; Kawahara, Y.; Fujimaki, R.; Sato, Y.; Machida, K. Telemetry-mining: A machine learning approach to anomaly detection and fault diagnosis for space systems. In Proceedings of the 2006 2nd IEEE International Conference on Space Mission Challenges for Information Technology, Pasadena, CA, USA, 17–20 July 2006. [Google Scholar]
Chen, J. Similarity Measure of Time Series for Satellite Telemetry Data. Master’s Thesis, Harbin Institute of Technology, Harbin, China, 1 June 2015. [Google Scholar]
Yairi, T.; Ogasawara, S.; Hori, K.; Nakasuka, S.; Ishihama, N. Summarization of Spacecraft Telemetry Data by Extracting Significant Temporal Patterns. In Proceedings of the 8th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2004), Sydney, Australia, 26–28 May 2004. [Google Scholar]
Martínez-Heras, J.A.; Donati, A.; Fischer, A. DrMUST-a Data Mining Approach for Anomaly Investigation. In Proceedings of the SpaceOps 2012 Conference, Stockholm, Sweden, 11–15 June 2012. [Google Scholar]
Martínez-Heras, J.A.; Donati, A.; Kirsch, M.G.F. New Telemetry Monitoring Paradigm with Novelty Detection. In Proceedings of the SpaceOps 2012 Conference, Stockholm, Sweden, 11–15 June 2012. [Google Scholar]
Bay, S.D.; Schwabacher, M. Mining Distance-based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003), Washington, DC, USA, 24–27 August 2003. [Google Scholar]
GritBot. RuleQuest Research. Available online: http://www.rulequest.com (accessed on 10 July 2023).
Iverson, D.L. Inductive System Health Monitoring. In Proceedings of the International Conference on Artificial Intelligence, IC-AI ‘04, Volume 2 & Proceedings of the International Conference on Machine Learning; Models, Technologies & Applications, MLMTA ‘04, Las Vegas, NV, USA, 21–24 June 2004. [Google Scholar]
Iverson, D.L.; Martin, R.; Schwabacher, M.; Spirkovska, L.; Taylor, W.; Mackey, R.; Baskaran, V. General Purpose Data-Driven Monitoring for Space Operations. J. Aerosp. Comput. Inf. Commun. 2012, 9, 26–44. [Google Scholar] [CrossRef]
Azevedo, D.R.; Ambrósio, A.M.; Vieira, M. Applying Data Mining for Detecting Anomalies in Satellites. In Proceedings of the Ninth European Dependable Computing Conference, Sibiu, Romania, 8–11 May 2012. [Google Scholar]
Schwabacher, M.; Oza, N.; Matthews, B. Unsupervised Anomaly Detection for Liquid-Fueled Rocket Propulsion Health Monitoring. J. Aerosp. Comput. Inf. Commun. 2009, 6, 464–482. [Google Scholar] [CrossRef]
Mohammad, B.R.; Hussein, W.M. A Novel Approach of Health Monitoring and Anomaly Detection Applied to Spacecraft Telemetry Based on PLSDA Multivariate Latent Technique. In Proceedings of the 15th International Workshop on Research and Education in Mechatronics (REM), El Gouna, Egypt, 9–11 September 2014. [Google Scholar]
Takeishi, N.; Yairi, T.; Nishimura, N.; Nakajima, Y.; Takata, N. Dynamic Grouped Mixture Models for Intermittent Multivariate Sensor Data. In Proceedings of the 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2016), Auckland, New Zealand, 19–22 April 2016. [Google Scholar]
Takeishi, N.; Yairi, T. Anomaly Detection from Multivariate Time-Series with Sparse Representation. In Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), San Diego, CA, USA, 5–8 October 2014. [Google Scholar]
Chan, P.K.; Mahoney, M.V. Modeling Multiple Time Series for Anomaly Detection. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM’05), Houston, TX, USA, 27–30 November 2005. [Google Scholar]
Akoglu, L.; Tong, H.; Koutra, D. Graph based anomaly detection and description: A survey. Data Min. Knowl. Discov. 2015, 29, 626–688. [Google Scholar] [CrossRef]
Huang, H.; Hong, Z.; Zhou, H.; Wu, J.; Jin, N. Knowledge graph construction and application of power grid equipment. Math. Probl. Eng. 2020, 2020, 8269082. [Google Scholar] [CrossRef]
Xiaoping, G.; Mengyu, R.; Hong, Z.; Ping, W.; Ruijun, R.; Feng, G. Construction technology of knowledge graph and its application in power grid. In Proceedings of the International Conference on Power System and Energy Internet (PoSEI2021), Chengdu, China, 16–18 April 2021. [Google Scholar]
Wu, J.; Li, Q.; Chen, Q.; Peng, G.; Wang, J.; Fu, Q.; Yang, B. Evaluation, analysis and diagnosis for HVDC transmission system faults via knowledge graph under new energy systems construction: A critical review. Energies 2022, 15, 8031. [Google Scholar] [CrossRef]
Hu, J.; Zhao, S.; Nie, Q. Research on modeling of power grid information system based on knowledge graph. In Proceedings of the 2021 IEEE International Conference on Power Electronics, Computer Applications (ICPECA 2021), Shenyang, China, 22–24 January 2021. [Google Scholar]
Liu, W.; Zhu, Z.; Cai, K.; Pu, D.; Du, Y. Application of knowledge graph in smart grid fault diagnosis. Appl. Math. Nonlinear Sci. 2022. [Google Scholar] [CrossRef]
Tang, Y.; Liu, T.; Liu, G.; Li, J.; Dai, R.; Yuan, C. Enhancement of power equipment management using knowledge graph. In Proceedings of the 2019 IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Chengdu, China, 21–24 May 2019. [Google Scholar]
Huang, H.; Chen, Y.; Lou, B.; Hongzhou, Z.; Wu, J.; Yan, K. Constructing knowledge graph from big data of smart grids. In Proceedings of the 2019 10th International Conference on Information Technology in Medicine and Education (ITME), Qingdao, China, 23–25 August 2019. [Google Scholar]
Feng, Y.; Zhai, F.; Li, B.; Cao, Y. Research on intelligent fault diagnosis of power acquisition based on knowledge graph. In Proceedings of the 2019 IEEE 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), Xiamen, China, 18–20 October 2019. [Google Scholar]
Deng, L.Q.; Wu, Y.F.; Wang, J.Y.; Wang, B. Network Situation Features Extraction Method of Computer Network Based on Knowledge Graph. In Proceedings of the 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), Changchun, China, 20–22 May 2022. [Google Scholar]
Xie, C.; Yu, B.; Zeng, Z.; Yang, Y.; Liu, Q. Multilayer internet-of-things middleware based on knowledge graph. IEEE Internet Things J. 2020, 8, 2635–2648. [Google Scholar] [CrossRef]
Djenouri, Y.; Srivastava, G.; Belhadi, A.; Lin, J.C. Intelligent blockchain management for distributed knowledge graphs in IoT 5G environments. Trans. Emerg. Telecommun. Tech. 2021, e4332. [Google Scholar] [CrossRef]
Sun, Y.; Tian, Z.; Li, M.; Zhu, C.; Guizani, N. Automated attack and defense framework toward 5G security. IEEE Netw. 2020, 34, 247–253. [Google Scholar] [CrossRef]
Tian, Z.; Sun, Y.; Su, S.; Li, M.; Du, X.; Guizani, M. Automated attack and defense framework for 5G security on physical and logical layers. arXiv 2019, arXiv:1902.04009. [Google Scholar]
Tang, X.; Hu, B.; Wang, J.; Wu, C.; Noman, S.M. Intelligent Auxiliary Fault Diagnosis for Aircraft Using Knowledge Graph. In Proceedings of the 2nd International Conference on Advanced Intelligent Technologies (ICAIT 2021): Advanced Intelligent Technologies for Industry, Online, 23–24 November 2021. [Google Scholar]
Tang, X.; Wang, J.; Wu, C.; Hu, B.; Noman, S.M. Constructing Aircraft Fault Knowledge Graph for Intelligent Aided Diagnosis. In Proceedings of the 4th International Conference on Information Technologies and Electrical Engineering (ICITEE 2021), Changde, China, 29–31 October 2021. [Google Scholar]
Tang, X.; Chi, G.; Cui, L.; Ip, A.W.H.; Yung, K.L.; Xie, X. Exploring Research on the Construction and Application of Knowledge Graphs for Aircraft Fault Diagnosis. Sensors 2023, 23, 5295. [Google Scholar] [CrossRef] [PubMed]
Yue, S.; Xiao, L.; Li, J.; Wang, N. Research on application of knowledge graph for aircraft maintenance. Adv. Mech. Eng. 2022, 14, 16878132221107429. [Google Scholar] [CrossRef]
Cheng, Y.; Jiao, Y.; Wei, W.; Wu, Z. Research on construction method of knowledge graph in the civil aviation security field. In Proceedings of the 2019 IEEE 1st International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Kunming, China, 17–19 October 2019. [Google Scholar]
Agarwal, A.; Gite, R.; Laddha, S.; Bhattacharyya, P.; Kar, S.; Ekbal, A.; Thind, P.; Zele, R.; Shankar, R. Knowledge Graph—Deep Learning: A Case Study in Question Answering in Aviation Safety Domain. arXiv 2022, arXiv:2205.15952. [Google Scholar]
Keller, M.; Building, R. Building a knowledge graph for the air traffic management community. In Proceedings of the WWW ‘19: The Web Conference, San Francisco, CA, USA, 13–17 May 2019. [Google Scholar]
Chao, K.; Tao, L.; Li, M.; Guoyu, J.; Yuchao, W.; Yu, Z. Construction and application research of knowledge graph in spacecraft launch. In Journal of Physics: Conference Series; IOP Publishing: Chongqing, China, 2021; Volume 1754, p. 012180. [Google Scholar]
Shi, H.B.; Huang, D.; Wang, L.; Wu, M.Y.; Xu, Y.C.; Zeng, B.E.; Pang, C. An information integration approach to spacecraft fault diagnosis. Enterp. Inf. Syst. 2021, 15, 1128–1161. [Google Scholar] [CrossRef]
Zhang, L.; Gao, M.; Li, P.; Jiang, Y. Construction and applications of embedded aerospace software defect knowledge graph. In Proceedings of the 6th International Conference on Signal and Information Processing, Networking and Computers (ICSINC), Guiyang, China, 13–16 August 2019. [Google Scholar]
Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. DBpedia – A large-scale, multilingual knowledge base extracted from Wikipedia. Semant. Web 2015, 6, 167–195.
Miller, G.A. WordNet: A lexical database for English. Commun. ACM 1995, 38, 39–41.
Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955.
Lafferty, J.; McCallum, A.; Pereira, F.C.N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning, San Francisco, CA, USA, 28 June–1 July 2001.
Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge graph embedding: A survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743.

Figure 1. Knowledge graph-based spacecraft anomaly detection algorithm architecture.

Figure 2. Knowledge graph-driven spacecraft anomaly detection and processing system framework.

Figure 3. Technical route of the spacecraft anomaly detection method based on a knowledge graph.

Figure 4. Entity recognition model.

Figure 5. Entity annotation example.

Figure 6. Relationship extraction model.

Figure 7. Visualization of the data processing workflow.

Figure 8. Ontology design of a spacecraft equipment knowledge graph.

Figure 9. Ontology design of a spacecraft abnormal event graph.

Figure 10. Ontology design of an event chain subgraph.

Figure 11. Unstructured corpus example (sensitive information removed).

Figure 12. The interface of doccano for anomaly report text annotation (partial).

Figure 13. Outlier removal visualization example.

Figure 14. Missing value imputation visualization example (regression imputation).

Figure 15. CMG system parameter correlation heatmap.

Figure 16. Visualization of feature data for each group.

Figure 17. Knowledge graph visualization.

Figure 18. Clustering result visualization example.

Figure 19. The abrupt changes of telemetry parameters (green part).

Figure 20. Spacecraft CMG system anomaly handling experimental case.

Table 1. Evaluation results of the entity annotation model.

Number of Entities in the Test Set	173
Identified Entities	139
Correctly Identified Entities	131
Precision (P)	94.24%
Recall (R)	75.72%
F1 Score	83.97%
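
The percentages in Table 1 can be reproduced from the three entity counts using the standard definitions of precision, recall, and F1. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and do not come from the paper.

```python
# Counts taken from Table 1.
total_entities = 173  # entities in the test set
identified = 139      # entities the model identified
correct = 131         # identified entities that are correct


def prf1(correct, identified, total):
    """Return (precision, recall, F1) from raw entity counts."""
    precision = correct / identified  # share of identified entities that are right
    recall = correct / total          # share of test-set entities that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R
    return precision, recall, f1


precision, recall, f1 = prf1(correct, identified, total_entities)
print(f"P = {precision:.2%}, R = {recall:.2%}, F1 = {f1:.2%}")
# prints: P = 94.24%, R = 75.72%, F1 = 83.97%
```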

Table 2. Statistics of entity and relationship counts in the spacecraft abnormal event graph.

Entity Type	Count	Relationship Type	Count
Anomaly Event	112	Time	358
Anomaly Time	358	Phenomenon	284
Anomaly Phenomenon	243	Device	116
Anomaly Device	93	Parameter	145
Anomaly Parameter	111	Cause	142
Anomaly Cause	134	Plan	157
Handling Plan	144	Belong	109

Table 3. Detection result comparison.

Metric	Threshold-Based Method	Method in This Paper
Precision	65%	95%
Correctly Identified	39	57
False Alarms	17	1
Missed Alarms	4	2
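
The percentages in the first row of Table 3 are consistent with dividing the number of correctly identified cases by the sum of the three counts in each column (39 + 17 + 4 = 57 + 1 + 2 = 60). The sketch below checks this reading. Treating the three counts as a partition of all test cases is an assumption inferred from the table, not a formula stated here, and the function name is illustrative.

```python
# Counts taken from Table 3: (correctly identified, false alarms, missed alarms).
results = {
    "Threshold-Based Method": (39, 17, 4),
    "Method in This Paper": (57, 1, 2),
}


def correct_ratio(correct, false_alarms, missed):
    """Share of correctly identified cases among all cases.

    Assumption: the three counts together cover every test case.
    """
    total = correct + false_alarms + missed
    return correct / total


ratios = {name: correct_ratio(*counts) for name, counts in results.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0%}")
# prints:
# Threshold-Based Method: 65%
# Method in This Paper: 95%
```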

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

