1. Introduction
Wireless sensor networks (WSNs) constitute a nascent discipline that synthesizes insights from sensor technology, network communication, and information perception. The networks are self-organizing, multi-hop systems comprising stationary or mobile sensors that facilitate real-time monitoring, processing, and data transmission pertaining to sensed objects within their operational ranges to end users [
1,
2]. Recognized as one of the most pivotal technologies of the 21st century, WSNs act as a conduit between the logical information domain and the physical environment, thus revolutionizing information sensing, acquisition, and processing methodologies. In recent years, the notions of electronic sensors and sensor networks have attracted considerable attention from scholars and organizations within the power system [
3,
4].
The power grid is a network of transmission lines, substations, and other facilities that transport electricity from power plants to end users. Integrating WSN technology with power grids seeks to establish a smart, clean low-carbon electronic sensor network that aligns with the global consensus and strategic direction [
5,
6,
7]. Distinct from traditional power grids, electronic sensors and sensor networks demonstrate a spectrum of capabilities in monitoring power equipment and grids, energy metering, and supplying foundational data for energy digitalization. The creation of a flexible, robust, and secure foundational network is imperative for the realization of electronic sensors and sensor networks. With the increasing integration of distributed energy sources and power electronic devices, future power grids are poised for substantial transformations. These changes necessitate comprehensive information support through innovative sensing and measurement mechanisms to ensure stable power grid operations amid intricate network interconnections [
8,
9,
10].
Presently, electronic sensors and sensor networks are extensively employed in diverse power sectors to enhance monitoring, control, and data acquisition. Serving as fundamental instruments for interfacing with external environments and capturing perceptual information, electronic sensors are critical for the advancement of electronic sensor networks. The networks are designed to facilitate swift, extensive, and precise information gathering, and they empower energy and optimization control across regional and energy production and consumption types through information exchange [
11,
12,
13,
14]. As pivotal infrastructure in the future power grid, electronic sensor networks are poised to exhibit significant application value within the energy internet domain.
On the power supply side, an array of sensors facilitates fault diagnosis and health monitoring for renewable energy generation equipment, such as wind and photovoltaic systems, thereby mitigating accidents, enhancing efficiency, and prolonging equipment lifespan [
15,
16,
17,
18,
19]. On the power grid side, the widespread implementation of diverse sensors ensures comprehensive information sensing and intelligent applications, supporting grid production and operational activities [
20,
21,
22,
23,
24,
25]. On the consumer side, sensors and measurement devices drive intelligent power consumption and smart-home technologies, improving overall energy utilization [
26,
27].
The electronic sensor network comprises a wireless, self-organizing system that amalgamates data acquisition, processing, and communication functionalities and features densely deployed sensors across monitored zones. These sensor nodes relay collected data through multi-hop routing to aggregation nodes, subsequently transmitting the data to edge computing nodes for processing and analysis. Operated in hostile and hazardous environments, these networks encounter substantial security risks and challenges owing to resource limitations and unattended deployment. Recently, anomaly detection methods for mixed-attribute data in electronic sensors and sensor networks have been developed to effectively mitigate these challenges.
1.1. Related Works
The data security of electronic sensor networks is paramount. Various security mechanisms, such as encryption, authentication, and intrusion detection, are necessary to safeguard perception data against hardware damage and environmental impacts. Compared with traditional networks, electronic sensor networks differ significantly in their hardware and network environments, requiring distinct data security research methods.
Wireless sensor nodes are typically protected only by symmetric encryption and lightweight authentication, which leaves the network vulnerable to attacks such as Sybil attacks, node capture, sinkhole attacks, and denial of service (DoS). Such attacks can compromise the confidentiality and consistency of perception data and severely affect network operations. Evaluating security performance and implementing measures that enhance overall network security and robustness are therefore vital. At present, however, efficient security assessment models for comprehensively evaluating system security performance are lacking.
Anomaly detection methods in electronic sensors and sensor networks also face major challenges, such as the large data volume of sensor nodes, the highly dynamic and heterogeneous distribution of data, and limited computational and energy resources. Researchers have proposed many anomaly detection methods to address these problems, which can be roughly classified into probabilistic-statistics-based, clustering-based, nearest-neighbor-based, density-based, tree-based, and machine-learning-based methods. Breunig et al. proposed an anomaly detection algorithm that determines outliers by computing density: the density of a normal sample region is basically the same as that of its neighbors, while the density of an abnormal sample region differs significantly from that of its neighbors [28]. Hawkins et al. proposed an anomaly detection method based on the replicator neural network (RNN) [29]. The RNN was trained on a sampled dataset to build a model that reconstructs the given data. The trained RNN was then applied to the entire dataset, and the reconstruction error provided a quantitative anomaly score for each data object. Shyu et al. investigated the application of robust principal component analysis in an anomaly detection method [30]. Two sets of principal components played a decisive role in this method: the major components, which together explained about 50% of the total variance, and the minor components, whose eigenvalues were less than 0.20. Mia et al. proposed a location and distribution estimation algorithm known as the minimum covariance determinant (MCD) algorithm, which is highly robust and can be computed efficiently with the FAST-MCD method [31]. The MCD algorithm yields more robust mean and covariance estimates, from which the Mahalanobis distance can be computed, allowing for more accurate detection of outliers. Goldstein et al. proposed a histogram-based anomaly detection algorithm with low time complexity, which makes it suitable for large datasets [32]. It requires the features to be independent of each other, and it is faster on multivariate datasets than other algorithms. However, this method is only suitable for global anomaly detection and is less effective for local anomaly detection. Malhotra et al. introduced long short-term memory (LSTM) networks into the anomaly detection task for multidimensional time-series data and fitted the prediction error with a multivariate normal distribution [33].
Giatrakos et al. proposed a composite solution to the anomaly detection problem by having a data similarity feature modeling and a variable sliding window to guarantee the data rate and to improve the accuracy of anomaly prediction [
34]. In this scheme, the bandwidth occupancy was predicted through the proposed framework, and the amount of communication data was reduced, thus prolonging the survival cycle of the WSNs. Yu et al. proposed an unsupervised contextual anomaly detection method for WSNs [
35]. The scheme used grid segmentation and grid cell merging for data classification and detected anomalous data using spatial- and temporal-based correlation between contextual anomalies. Zhang et al. proposed a local anomaly identification scheme using unsupervised data, which is effective for univariate data anomaly detection [
36]. The scheme used the mean-shift algorithm to cluster the datasets, and it set a shift window for each data point in the dataset, which was calculated based on the average shift vector of the data points.
1.2. Contributions and Outcomes
This paper focuses on enhancing the security of electronic sensors and sensor networks. A hierarchical electrical sensor network model is analyzed by employing the analytic hierarchy process, and we construct a hierarchical perception security architecture tailored for the anomaly detection method in electronic sensors and sensor networks. Central to our approach is the introduction of a weighted neighborhood information network (WNIN) for anomaly detection in mixed-attribute data, and we propose a WNIN-enabled anomaly detection method. Initially, we present a neighborhood information system to identify the relationships among data objects with mixed attributes. Subsequently, we develop a weighted neighborhood information network for electronic sensors and sensor networks, capturing these relationships. The network features a state-transferring probability matrix derived from data object similarity. Finally, a random wandering process is executed within the WNIN, and the proposed method determines the importance of data objects based on the steady-state distribution vector, thereby determining the anomaly data and fortifying the security framework of the electronic sensors and sensor network.
The rest of the paper is structured as follows. Section 2 presents the AHP-based security analysis for electronic sensors and sensor networks. Section 3 presents the WNIN-enabled anomaly detection method, including the neighborhood information system for mixed-attribute data, the WNIN for electronic sensors and sensor networks, and the anomaly detection method based on the WNIN. Section 4 presents the simulation results, and Section 5 concludes the paper.
2. AHP-Based Security Analysis for Electronic Sensors and Sensor Networks
In this section, the security of electronic sensors and the sensor network will be analyzed through the analytic hierarchy process (AHP) according to the hierarchical model of electrical sensor networks.
The sensors in electronic sensor networks include four main types:
- (1).
RFID sensor
The core technology of an RFID sensor is radio frequency identification (RFID), an Internet of Things identification technology that identifies tagged objects through a simple wireless system and thereby strengthens the collection and processing of key information.
- (2).
Electrical sensor
Current, voltage, and harmonics are not only the key parameters in the operation of power equipment but also the main quantities sensed at network nodes and lines.
- (3).
Non-electrical sensor
Electric field, force, heat, humidity, and other physical fields produce a combined effect during the operation of power equipment, which calls for non-electrical sensing mechanisms.
- (4).
Environmental sensor
The power grid is infrastructure deployed in complex environments, and environmental parameters are among the main data in new energy operation.
The hierarchical electrical sensor network model is shown in Figure 1.
The electrical sensor network system is divided into three parts: the perception layer, the network layer, and the edge computing layer. The perception layer includes the bushing monitoring sensor, voltage/current monitoring sensor, watering sensor, partial discharge sensor, on-load tap-changer monitoring sensor, etc. The network layer encompasses the mobile network, wireless local area network (WLAN), and cellular network, providing the requisite network technology for the electrical sensor network system. The communication process must comply with the protocols of wireless LAN authentication and privacy infrastructure (WAPI), as well as the micro-power wireless network communication protocol for the internet for power transmission and transformation equipment (Q/GDW). The edge computing layer covers the applications of the electrical sensor network, including substation fault analysis, electronic line monitoring, and energy consumption management.
Based on the hierarchical electrical sensor network model, the AHP method is employed to analyze the security of the electronic sensors and sensor network, and a hierarchical perception security architecture is constructed. The AHP provides a structured and systematic approach to decision making by decomposing the complex security assessment problem of the electrical sensors and sensor network into hierarchy sub-problems. The AHP uses pairwise comparisons to determine the relative importance of any security attack, has the ability to handle mixed attributes, is robust, and can perform comprehensive evaluation. The establishment of the hierarchical perception security architecture considers factors from the perception layer, the network layer, and the edge computing layer, as illustrated in
Figure 2.
According to the hierarchical perception security architecture, perception security is divided into perception layer security, network layer security, edge computing layer security, and other security.
(1) Perception layer security includes electromagnetic tampering, laser tampering, acoustic tampering, and sensor life, etc. Electromagnetic tampering changes the data perceived by the electronic sensor by sending electromagnetic waves or by adding an external magnetic field. Laser tampering changes the data with a laser of a specific frequency. Acoustic tampering changes the data by ultrasonic attacks or by sending control commands. The lifetime of an electrical sensor under long working conditions is one of the main indicators that directly reflects its reliability.
(2) Network layer security encompasses data theft and node phishing. Data theft is when the data transmitted through the network are obtained by means of monitoring and the data confidentiality is destroyed. Node phishing means that malicious nodes impersonate electronic sensors to send false information to the network, or that malicious nodes impersonate sink nodes to trick electronic sensor data.
(3) Edge computing layer security includes data leakage, memory tampering, and denial of service. Data leakage means that the service data stored on edge terminals are obtained through network attacks or device intrusion. Memory tampering is when the storage of edge terminal devices is attacked by means of magnetism or lasers to change the stored service and management data. Denial of service causes the edge terminal device to stop providing service to the electronic sensor by continuously sending data to the edge terminal device.
(4) Other security covers environmental threats, natural disasters, management risks, etc. Environmental threats refer to the complexity and variability of the environment where sensors, cameras, and other equipment are located, as well as their susceptibility to man-made damage. Natural disasters pertain to losses caused by events such as floods, fires, earthquakes, and other natural calamities. Management risks relate to the quality and completeness of the management system, supervision mechanisms, national policies, laws, and regulations.
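As a hedged illustration of the pairwise-comparison step that AHP relies on, the following minimal Python sketch derives priority weights for the four security categories above and checks the consistency of the judgments. The comparison values in `PAIRWISE`, the function names, and the use of the row geometric-mean approximation are assumptions made for illustration; the source does not give the judgment matrix or the exact weighting procedure.

```python
import math

# Saaty's random consistency index for matrix orders 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

# HYPOTHETICAL judgments: perception, network, edge computing, other.
# PAIRWISE[i][j] states how much more important category i is than j.
PAIRWISE = [
    [1.0, 2.0, 3.0, 5.0],
    [1 / 2, 1.0, 2.0, 3.0],
    [1 / 3, 1 / 2, 1.0, 2.0],
    [1 / 5, 1 / 3, 1 / 2, 1.0],
]


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Estimate Saaty's consistency ratio; values below 0.1 are usually accepted."""
    n = len(pairwise)
    if n <= 2:
        return 0.0  # matrices of order 1 or 2 are always consistent
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


weights = ahp_weights(PAIRWISE)
ratio = consistency_ratio(PAIRWISE, weights)
```

With these illustrative judgments, the perception layer receives the largest weight. A different judgment matrix would change the ranking, so the numbers carry no meaning beyond the example.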
3. WNIN-Enabled Anomaly Detection Method
In this section, we delineate the weighted neighborhood information network (WNIN)-enabled anomaly detection method tailored for mixed-attribute data within electrical sensors and sensor networks. Traditional network-model-based anomaly detection methods typically represent data objects as nodes within a network, with the presence of edges and their associated weights determined by the relationships and similarities among these data objects. Consequently, the efficacy of these methods hinges significantly on the construction of the network model. Recognizing that current methods predominantly address datasets with numerical attributes, we observe a scarcity of consideration for datasets encompassing diverse attribute types, particularly those featuring nonnumerical attributes.
To bridge this gap, we integrate the theory of neighborhood rough sets with network modeling to introduce a WNIN-enabled anomaly detection method for mixed-attribute data. Utilizing the neighborhood information system (NIS), we ascertain the neighbor relationships between data objects within the mixed-attribute dataset. Building upon this, we construct a WNIN model. A Markovian stochastic process is then executed on the WNIN model to derive a node importance value index, which quantifies the significance of each data object. This metric facilitates the identification of anomalies within the mixed-attribute dataset, thereby enhancing the anomaly detection capabilities of the sensor network.
3.1. Neighborhood Information System for Mixed-Attribute Data
The specific establishment process of the NIS for mixed-attribute data is shown in Figure 3. The NIS is the basic expression of the neighborhood rough set. For electronic sensors and sensor networks, an NIS can be expressed as $NIS = (U, A, V, f)$, where $U = \{x_1, x_2, \ldots, x_n\}$ denotes a non-empty finite set of electronic perception data objects, $A = \{a_1, a_2, \ldots, a_m\}$ is a non-empty finite set of mixed attributes based on the hierarchical perception security architecture, $V$ is the value domain of the attributes in $A$, and $f: U \times A \to V$ serves as an information function reflecting the correspondence between the data objects and the mixed attributes [37]. According to the hierarchical perception security architecture shown in Figure 2, the data object $x_i \in U$ is assumed to be described by $p$ numerical attributes and $q$ nonnumerical attributes, with $m = p + q$. For convenient analysis, the first $p$ attributes of $A$ are numerical attributes, and the last $q$ attributes are defined as nonnumerical attributes.
Numerical and nonnumerical attributes are processed differently. Numerical attribute values are first normalized; the distance between two data objects in a numerical attribute is then calculated with the Euclidean distance formula, and the neighborhood radius of the numerical attribute is defined as a number that depends on the standard deviation and mean of that attribute. In a nonnumerical attribute, by contrast, the distance is equal to 0 when the two data objects take the same value and 1 otherwise. Accordingly, the neighborhood radius of a nonnumerical attribute is defined as 0.
In electronic sensors and sensor networks, perception data usually differ greatly in order of magnitude or dimension, and computing directly on raw data of different magnitudes greatly reduces the accuracy of the detection results. Therefore, maximum–minimum normalization is applied to the numerical attributes of the initial dataset, and the normalized value of the data object $x_i$ in the numerical attribute $a_k$ is written as
$$f'(x_i, a_k) = \frac{f(x_i, a_k) - \min_{j} f(x_j, a_k)}{\max_{j} f(x_j, a_k) - \min_{j} f(x_j, a_k)} \quad (1)$$
where $f(x_i, a_k)$ is the value of $x_i$ in the $k$-th attribute ($1 \le k \le p$, with $p$ the number of numerical attributes), and the minimum and maximum are taken over all data objects $x_j$.
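The normalization step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `min_max_normalize` and the choice to map a constant attribute to all zeros are assumptions, since the source does not say how a zero range is handled.

```python
def min_max_normalize(column):
    """Scale one numerical attribute column to the interval [0, 1]."""
    lowest, highest = min(column), max(column)
    if highest == lowest:
        # ASSUMPTION: a constant attribute carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(value - lowest) / (highest - lowest) for value in column]


# Hypothetical voltage readings on one attribute.
normalized = min_max_normalize([10.0, 20.0, 30.0])
```

Each numerical attribute is normalized independently, so attributes measured in different units become comparable before distances are computed.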
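To measure the distance between data objects with mixed attributes efficiently and accurately, the mixed Euclidean overlap metric is used, where the distance between the data objects $x_i$ and $x_j$ is defined as
$$d(x_i, x_j) = \sqrt{\sum_{k=1}^{m} d_{a_k}(x_i, x_j)^2} \quad (2)$$
where $m$ is the total number of attributes and the expression of the per-attribute distance $d_{a_k}(x_i, x_j)$ is
$$d_{a_k}(x_i, x_j) = \begin{cases} \left| f'(x_i, a_k) - f'(x_j, a_k) \right|, & a_k \text{ is numerical} \\ 0, & a_k \text{ is nonnumerical and } f(x_i, a_k) = f(x_j, a_k) \\ 1, & a_k \text{ is nonnumerical and } f(x_i, a_k) \neq f(x_j, a_k) \end{cases} \quad (3)$$
Here, $f$ denotes the raw attribute value and $f'$ the maximum–minimum normalized attribute value.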
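A minimal Python sketch of a Euclidean-overlap distance for mixed attributes follows. It assumes that numerical values have already been normalized to [0, 1] and that nonnumerical values are compared by simple equality; the function names and the `numeric_mask` parameter are hypothetical and not taken from the source.

```python
import math


def attribute_distance(u, v, is_numeric):
    """Per-attribute distance: absolute difference or 0/1 overlap."""
    if is_numeric:
        return abs(u - v)
    return 0.0 if u == v else 1.0


def mixed_distance(x, y, numeric_mask):
    """Combine per-attribute distances in a Euclidean fashion."""
    return math.sqrt(sum(
        attribute_distance(u, v, is_numeric) ** 2
        for u, v, is_numeric in zip(x, y, numeric_mask)
    ))


# Hypothetical objects: (normalized voltage, sensor type).
mask = [True, False]
same_type = mixed_distance([0.2, "rfid"], [0.5, "rfid"], mask)
other_type = mixed_distance([0.2, "rfid"], [0.5, "thermal"], mask)
```

Two objects with different nonnumerical values are at least distance 1 apart regardless of their numerical values, which is why the nonnumerical neighborhood radius can be set to 0.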
The size of the neighborhood radius directly determines the neighbor relationship between different data objects, thus affecting the construction of the NIS. The traditional neighborhood radius usually assigns a fixed value to all the attributes based on experts’ experience without considering the data distribution characteristics. Data objects in different attributes show unique distribution characteristics, so the neighborhood radius should be different in different attributes, and it is more reasonable to set the corresponding neighborhood radius for each attribute according to the data distribution characteristics in different attributes. To this end, the variation coefficient and the neighborhood radius adjustment factor are used to construct the expression of the neighborhood radius to make it more adaptable and objective for different electronic sensor application scenarios, and the neighborhood radius in the attribute
is defined as follows.
where
and
denote the standard deviation and mean value of the numerical attribute
(
) of the data objects, respectively, and
is the neighborhood radius adjustment factor, which ensures that the neighborhood radius of the proposed method is more adaptive to the numerical attributes of different data distribution characteristics. The neighbor set of the data object
in the attribute
is denoted as
, which is given as follows:
Only when the distance in the attribute
between the data objects
and
is less than the neighborhood radius
is the data object
one of the neighbors of
. When
and
, the data objects
and
are neighbors of each other. The neighborhood relationship between
and
in the attribute
is denoted as
and is shown as follows:
Therefore, the neighborhood relation contained in
can be expressed as follows:
where
denotes the neighbor number of the data object
in the attribute
.
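The neighbor-set construction described above can be illustrated with the following Python sketch. The exact radius formula is not recoverable from the text, so `adaptive_radius` ASSUMES the form "coefficient of variation divided by the adjustment factor", which only matches the verbal description (standard deviation, mean, and adjustment factor). The sketch also assumes that a distance equal to the radius counts as a neighbor, so that a radius of 0 still groups identical nonnumerical values. All names are hypothetical.

```python
import statistics


def adaptive_radius(column, adjustment_factor):
    """ASSUMED radius: (standard deviation / mean) / adjustment factor."""
    mean = statistics.mean(column)
    if mean == 0:
        return 0.0  # avoid division by zero for an all-zero attribute
    return statistics.pstdev(column) / mean / adjustment_factor


def neighbor_sets(column, radius, is_numeric):
    """For each object i, collect the indices j != i within the radius."""
    result = []
    for i, u in enumerate(column):
        neighbors = set()
        for j, v in enumerate(column):
            if i == j:
                continue
            distance = abs(u - v) if is_numeric else (0.0 if u == v else 1.0)
            if distance <= radius:
                neighbors.add(j)
        result.append(neighbors)
    return result


# Hypothetical normalized attribute with one isolated reading (index 3).
numeric_neighbors = neighbor_sets([0.0, 0.1, 0.15, 1.0], 0.2, True)
# Nonnumerical attribute with radius 0: only identical values are neighbors.
nominal_neighbors = neighbor_sets(["a", "b", "a"], 0.0, False)
```

In the numerical example, the isolated reading has an empty neighbor set, which is the property the later importance ranking relies on.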
3.2. WNIN for Electronic Sensors and Sensor Networks
A WNIN for electronic sensors and sensor networks is constructed by modeling the data object
as a node in the network. According to the NIS for mixed-attribute data of the electronic sensors and sensor network, there are
nodes in the network. There exists an undirected edge between
and
only when
or
, and the total similarity between
and
in the attribute
is the weight of the corresponding edge. The total similarity between the data objects
and
is denoted as
and defined as follows:
where
is the weight of the attribute
, and
is the similarity between
and
in the attribute
.
The weight of
is
and can be calculated by using the entropy weighting method, which is shown as follows:
where
is the value of the entropy information in the attribute
, which is shown as follows:
In Formula (10), a term of the form $p \ln p$ is taken to be 0 when $p = 0$.
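For readers unfamiliar with entropy weighting, the sketch below shows the generic form of the method in Python. It is an assumption-laden illustration: each attribute is represented by a column of nonnegative scores, the entropy is normalized by the logarithm of the number of objects, and the weight is proportional to one minus the entropy. How the paper derives the per-attribute probabilities for mixed attributes is not recoverable from the text, and the function names are hypothetical.

```python
import math


def normalized_entropy(column):
    """Shannon entropy of a nonnegative column, scaled to [0, 1]."""
    n = len(column)
    total = sum(column)
    if n < 2 or total == 0:
        return 1.0  # ASSUMPTION: treat a degenerate column as uninformative
    entropy = 0.0
    for value in column:
        p = value / total
        if p > 0:  # convention: a zero probability contributes nothing
            entropy -= p * math.log(p)
    return entropy / math.log(n)


def entropy_weights(columns):
    """Weight each attribute by its divergence (1 - entropy), normalized to sum to 1."""
    divergences = [max(0.0, 1.0 - normalized_entropy(c)) for c in columns]
    total = sum(divergences)
    if total == 0:
        # All attributes equally uninformative: fall back to equal weights.
        return [1.0 / len(columns) for _ in columns]
    return [d / total for d in divergences]


# Hypothetical scores: the first attribute is uniform, the second is concentrated.
weights = entropy_weights([[1, 1, 1, 1], [4, 0, 0, 0]])
```

An attribute whose values are spread evenly over the objects has entropy close to 1 and therefore receives a weight close to 0, while a concentrated attribute dominates.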
The expression for
is shown as follows:
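In this case, the WNIN can be represented by an adjacency matrix $W$, which is defined as the product of the similarity matrix $S$ and the edge existence matrix $E$:
$$W = S \circ E \quad (12)$$
where "$\circ$" denotes the operation of the Hadamard product. The edge existence matrix $E$ contains the neighborhood information of each node, and its element $e_{ij}$ is defined as follows:
$$e_{ij} = \begin{cases} 1, & \text{an edge exists between } x_i \text{ and } x_j \\ 0, & \text{otherwise} \end{cases} \quad (13)$$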
3.3. Anomaly Detection Method Based on the WNIN
The specific process of the anomaly detection method based on the WNIN through a random walk process is shown in Figure 4. To perform a random walk in the WNIN-based electronic sensor data network, the state-transferring probability matrix $P$ is obtained by normalizing the adjacency matrix $W$:
$$P = D^{-1} W \quad (14)$$
where $D$ is a diagonal matrix with respect to the adjacency matrix $W$, and each element in the diagonal of $D$ is equal to the sum of the corresponding row elements in $W$, i.e., $d_{ii} = \sum_{j} w_{ij}$. The state-transferring probability matrix $P$ is also a Markov random walk matrix that satisfies the condition that the sum of the transferring probabilities from each node to all other nodes is equal to 1, i.e., $\sum_{j} p_{ij} = 1$.
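The two matrix steps, an element-wise product followed by row normalization, can be sketched in plain Python as follows. The matrices are small hypothetical examples, the function names are illustrative, and the handling of an isolated node (a uniform row) is an assumption, because dividing by a zero row sum is undefined and the source does not discuss this case.

```python
def hadamard(similarity, edge_existence):
    """Element-wise product of two equally sized matrices."""
    return [
        [s * e for s, e in zip(s_row, e_row)]
        for s_row, e_row in zip(similarity, edge_existence)
    ]


def transition_matrix(adjacency):
    """Divide each row by its sum so that every row sums to 1."""
    n = len(adjacency)
    matrix = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            # ASSUMPTION: an isolated node jumps to any node with equal probability.
            matrix.append([1.0 / n] * n)
        else:
            matrix.append([value / total for value in row])
    return matrix


# Hypothetical similarities and neighbor relations for three data objects.
S = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.5], [0.1, 0.5, 1.0]]
E = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W = hadamard(S, E)
P = transition_matrix(W)
```

Masking the similarity matrix with the edge existence matrix removes the similarity between objects that are not neighbors, so a walker can only move along edges of the network.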
The random walk begins at a randomly selected node within the WNIN. From the starting node, the process takes random steps to adjacent nodes, as determined by the state-transferring probability matrix $P$. The walk continues to move from node to node, exploring the network in a stochastic manner. After a certain number of iterations, the random walk tends toward a stable state, that is, the probability of each node in the WNIN-based electronic sensor data network being visited no longer changes, which can be represented by the steady-state distribution vector. The steady-state distribution vector is defined as follows:
$$\boldsymbol{\pi} = \lim_{t \to \infty} \boldsymbol{\pi}^{(t)}, \quad \boldsymbol{\pi}^{(t+1)} = \boldsymbol{\pi}^{(t)} P, \quad \boldsymbol{\pi}^{(t)} = \left( \pi_1^{(t)}, \pi_2^{(t)}, \ldots, \pi_n^{(t)} \right) \quad (15)$$
where $\pi_i^{(t)}$ denotes the probability that a random walker stays at node $x_i$ at timeslot $t$.
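A steady-state distribution of this kind is commonly approximated by power iteration, as in the hedged Python sketch below. The `damping` parameter (a small probability of jumping to a uniformly chosen node) is NOT taken from the source: it is an assumption borrowed from PageRank-style walks so that the iteration also converges on periodic or disconnected networks. Setting `damping=1.0` gives the plain update in which the distribution is repeatedly multiplied by the transition matrix. The function name, tolerance, and example network are illustrative.

```python
def steady_state(transition, damping=0.85, tolerance=1e-12, max_iterations=10000):
    """Approximate the stationary distribution of a row-stochastic matrix."""
    n = len(transition)
    distribution = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iterations):
        updated = [
            (1.0 - damping) / n
            + damping * sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(updated, distribution))
        distribution = updated
        if change < tolerance:
            break
    return distribution


# Hypothetical network: nodes 0-2 are strongly connected, node 3 is weakly attached.
adjacency = [
    [0.0, 1.0, 1.0, 0.1],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
transition = [[w / sum(row) for w in row] for row in adjacency]
importance = steady_state(transition)
least_important = importance.index(min(importance))
```

In this example the weakly attached node receives the smallest steady-state probability, which is consistent with the principle that nodes tending to be outliers are visited less often.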
The similarity between two nodes is used to customize the state-transferring probability matrix
of the Markov random wandering process, which can effectively and accurately complete the transferring process between nodes. According to this mechanism, nodes that tend to be outliers have a smaller probability of being visited. When the Markov random wandering process reaches a steady state after a certain number of iterations, each element in the steady-state distribution vector can represent the probability of the random wanderer staying at each node, i.e., the importance of each node. The larger the value of the steady-state distribution vector is, the smaller the probability of converging to an outlier. In order to distinguish the importance of each node, we standardize the importance values in the steady-state distribution vector. The standardized importance value of
is denoted as
, which is defined as follows:
where
.
In addition, the importance value of a node is closely related to the number of its neighbors, and the value of the neighborhood radius adjustment factor
has a large impact on the dispersion degree of each node in the WNIN-based electronic sensor data network. Thus, the importance degree is defined as follows:
Considering the principle that potential anomaly data objects have less importance in the WNIN for electronic sensors and sensor networks, the smaller the importance degree of a data object is, the greater the probability that it will tend to be an anomaly object.
The complexity of the anomaly detection method based on the WNIN consists of two parts. One is the complexity of constructing the WNIN for electronic sensors and sensor networks and its , where is the number of data objects in the WNIN and is the number of the attributes. The other is the complexity of the random wandering process and its , where is the number of steps in the random walk, is the number of data objects in the WNIN, and is the number of edges. Therefore, the construction of the WNIN can be prohibitive for large datasets with high dimensionality, which leads to long computation times and the requirement for significant computational resources. The random wandering process may require a large number of steps to converge and achieve stable results, especially in complex WNINs, leading to variability in anomaly detection.
4. Performance Evaluation
In this section, the experimental comparison and analysis is performed with the public dataset and the constructed dataset. The WNIN-enabled anomaly detection method is compared and analyzed with two other related anomaly detection methods to verify the effectiveness of the proposed method. One is the VOS method, which is based on the KNN and the random wandering process [
38]. Integrating local information with implicit connections within the graph representation of the original dataset, the VOS method constructed a similarity graph using the top-k similar neighbors for each object. It introduced a virtual node coupled with a collection of virtual edges to generate a k-virtual graph. Subsequently, a Markov random walk process was conducted on the similarity graph, with the principle that potential anomalies should receive more weight to be visited. The alternative NIEHDOD method used is based on neighborhood information entropy [
37]. The neighborhood information system was defined by a heterogeneous distance and a self-adapting radius, with neighborhood information entropy subsequently formulated to quantify overall uncertainty. Three incremental information measures were constructed to characterize individual objects, culminating in the establishment of a neighborhood entropy-based anomaly factor for anomaly detection.
The performance of the proposed method was evaluated in terms of its accuracy rate, recall rate, false-alarm rate, and F1 score. In data anomaly detection, the data sample space can generally be divided into two categories: normal data and anomaly data. Comparing the detection results with the actual categories of the samples yields four outcomes: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). A True Positive (TP) is an outlier that is correctly detected by the algorithm, while a False Positive (FP) is a normal data point that is misjudged as an outlier. A True Negative (TN) is a normal data point that is correctly identified as normal, and a False Negative (FN) is an outlier that the algorithm fails to detect.
The accuracy rate is the ability to effectively detect outliers and distinguish them from normal data and is expressed as follows:
The recall rate reflects the detection ability of the anomaly detection algorithm and is the proportion of detected outliers to the overall total number of outliers.
The false-alarm rate reflects the failure rate of the algorithm, i.e., the lower bound of the algorithm’s accuracy, and it can be expressed as follows:
Taking the accuracy rate and recall rate into consideration, the F1 score is defined as follows:
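The four measures can be computed from the confusion counts as in the Python sketch below. One reading is assumed here and should be checked against the original formulas: because the F1 score combines the "accuracy rate" with the recall rate, the accuracy rate is interpreted as precision, TP / (TP + FP). The false-alarm rate is taken as FP / (FP + TN). The function name and the example counts are hypothetical and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return precision, recall, false-alarm rate, and F1 from confusion counts."""

    def ratio(numerator, denominator):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)    # ASSUMED meaning of "accuracy rate"
    recall = ratio(tp, tp + fn)       # detected outliers / all outliers
    false_alarm = ratio(fp, fp + tn)  # normal points flagged as outliers
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "false_alarm_rate": false_alarm,
        "f1": f1,
    }


# Hypothetical outcome on a dataset with 148 objects and 6 true outliers.
example = detection_metrics(tp=6, fp=2, tn=140, fn=0)
```

The F1 score is the harmonic mean of the two rates, so it stays low unless both the assumed precision and the recall are high.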
The Lymphography public dataset was selected from the UCI database. It contains 148 data objects categorized into four classes (normal find, metastases, fibrosis, and malign lymph) with 2, 81, 61, and 4 data objects per class, respectively. The normal find and malign lymph classes are considered rare and together contain six real outliers. The proposed method was examined using the Lymphography dataset, and the results are shown in Figure 5. The x axis represents the index of each data object in the dataset. The anomaly objects have small values of importance degree, which verifies the principle that potential anomaly data objects have less importance in the WNIN for the electronic sensors and sensor network.
The performance of the proposed method is analyzed against two other anomaly detection methods using the Lymphography dataset, and the comparison results are shown in
Table 1. The proposed method has a high value of accuracy rate among the three anomaly detection methods. The recall rate of the proposed method is equal to that of the VOS method, but its false-alarm rate is much lower than that of the VOS method. The proposed method is optimal in the F1 score.
According to the hierarchical perception security architecture in
Figure 2, a dataset was constructed that contained 450 data objects, including 430 normal objects and 20 anomaly objects. The results of the proposed method examined by the constructed dataset are shown in
Figure 6. In the detection results, there are two outliers that were not detected, but the detection results also verify that the anomaly objects have small values of importance degree.
The performance of the proposed method against two other anomaly detection methods in the self-constructed dataset is shown in
Table 2. In the constructed dataset, the proposed method has high values of accuracy rate and recall rate among the three anomaly detection methods. The false-alarm rate of the proposed method is equal to 0, and it is the best in terms of false-alarm performance. Obviously, the proposed method is also optimal in the F1 score.
The experimental results above show that the proposed method has better comprehensive performance than the VOS and NIEHDOD methods. For both datasets, the value range of the neighborhood radius in each numerical attribute can be determined from the standard deviation and mean value of that attribute, and the radius can be adaptively adjusted through the neighborhood radius adjustment factor, thereby improving the anomaly detection rate of the proposed method.
5. Conclusions
This paper proposed a WNIN-enabled anomaly detection method for mixed-attribute data in electronic sensors and sensor networks. The method employed the AHP to assess the security of the electronic sensors and sensor network, utilizing a hierarchical electrical sensor network model. A hierarchical perception security architecture was subsequently constructed. The NIS was then established to identify the neighborhood relationships among data objects with mixed attributes. A WNIN was developed for the electronic sensors and sensor network to represent these relationships, incorporating a state-transition matrix derived from data object similarity. A random wandering process within the network was executed, and the anomaly degree of data objects was quantified based on the steady-state distribution vector. Simulation outcomes indicate that the proposed method outperforms other comparative methods in terms of anomaly detection rate.