Weighted-Neighborhood-Information-Network-Enabled Anomaly Detection Method for Electronic Sensors and Sensor Networks

An, Chunyan; Liu, Yingyi; Li, Qi; Si, Pengbo

doi:10.3390/electronics13173482


¹ China Electric Power Research Institute Co., Ltd., Beijing 100192, China

² Electric Power Intelligent Sensing Technology Laboratory of State Grid Corporation, Beijing 102209, China

³ Beihang University, Beijing 100191, China

⁴ Information and Communication Engineering, Beijing University of Technology, Beijing 100124, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(17), 3482; https://doi.org/10.3390/electronics13173482

Submission received: 25 July 2024 / Revised: 20 August 2024 / Accepted: 25 August 2024 / Published: 2 September 2024


Abstract:

As electronic sensors and sensor networks advance, perception data are increasingly characterized by mixed attributes. Traditional anomaly detection methods predominantly focus on numerical attributes. In this paper, we introduce a weighted neighborhood information network (WNIN)-enabled anomaly detection method tailored for mixed-attribute data from electronic sensors and sensor networks. Firstly, we employ the analytic hierarchy process (AHP) to analyze the security of sensor networks, leveraging a hierarchical electronic sensor network model to construct a hierarchical perception security architecture for anomaly detection. Subsequently, a neighborhood information system is established to ascertain the relationships between data objects with mixed attributes. We then develop the WNIN to encapsulate the relationships, and a state-transferring probability matrix based on data object similarity is derived. Ultimately, a random wandering process within the WNIN is executed, and the importance of data objects is evaluated using the steady-state distribution vector, thereby determining the anomaly data. Simulation outcomes reveal that our proposed method attains superior anomaly detection rates compared with existing methods.

Keywords: electronic sensors and sensor networks; anomaly detection; weighted neighborhood information network

1. Introduction

Wireless sensor networks (WSNs) constitute a nascent discipline that synthesizes insights from sensor technology, network communication, and information perception. The networks are self-organizing, multi-hop systems comprising stationary or mobile sensors that facilitate real-time monitoring, processing, and data transmission pertaining to sensed objects within their operational ranges to end users [1,2]. Recognized as one of the most pivotal technologies of the 21st century, WSNs act as a conduit between the logical information domain and the physical environment, thus revolutionizing information sensing, acquisition, and processing methodologies. In recent years, the notions of electronic sensors and sensor networks have attracted considerable attention from scholars and organizations within the power system [3,4].

The power grid is a network of transmission lines, substations, and other facilities that transport electricity from power plants to end users. Integrating WSN technology with power grids seeks to establish a smart, clean low-carbon electronic sensor network that aligns with the global consensus and strategic direction [5,6,7]. Distinct from traditional power grids, electronic sensors and sensor networks demonstrate a spectrum of capabilities in monitoring power equipment and grids, energy metering, and supplying foundational data for energy digitalization. The creation of a flexible, robust, and secure foundational network is imperative for the realization of electronic sensors and sensor networks. With the increasing integration of distributed energy sources and power electronic devices, future power grids are poised for substantial transformations. These changes necessitate comprehensive information support through innovative sensing and measurement mechanisms to ensure stable power grid operations amid intricate network interconnections [8,9,10].

Presently, electronic sensors and sensor networks are extensively employed in diverse power sectors to enhance monitoring, control, and data acquisition. Serving as fundamental instruments for interfacing with external environments and capturing perceptual information, electronic sensors are critical for the advancement of electronic sensor networks. These networks are designed to gather information quickly, extensively, and precisely, and, through information exchange, they enable energy optimization and control across regions and across types of energy production and consumption [11,12,13,14]. As pivotal infrastructure in the future power grid, electronic sensor networks are poised to exhibit significant application value within the energy internet domain.

On the power supply side, an array of sensors facilitates fault diagnosis and health monitoring for renewable energy generation equipment, such as wind and photovoltaic systems, thereby mitigating accidents, enhancing efficiency, and prolonging equipment lifespan [15,16,17,18,19]. On the power grid side, the widespread implementation of diverse sensors ensures comprehensive information sensing and intelligent applications, supporting grid production and operational activities [20,21,22,23,24,25]. On the consumer side, sensors and measurement devices drive intelligent power consumption and smart-home technologies, improving overall energy utilization [26,27].

The electronic sensor network is a wireless, self-organizing system that combines data acquisition, processing, and communication functions and features densely deployed sensors across monitored zones. The sensor nodes relay collected data through multi-hop routing to aggregation nodes, which then transmit the data to edge computing nodes for processing and analysis. Operating in hostile and hazardous environments, these networks encounter substantial security risks and challenges owing to resource limitations and unattended deployment. Recently, anomaly detection methods for mixed-attribute data in electronic sensors and sensor networks have been developed to mitigate these challenges.

1.1. Related Works

The data security of electronic sensor networks is paramount. Various security mechanisms, such as encryption, authentication, and intrusion detection, are necessary to safeguard perception data against hardware damage and environmental impacts. Compared with traditional networks, electronic sensor networks differ significantly in their hardware and network environments, requiring distinct data security research methods.

Because wireless sensor nodes rely on symmetric encryption and lightweight authentication for protection, the network remains vulnerable to attacks such as Sybil, node capture, sinkhole, and denial-of-service (DoS) attacks. Such attacks can compromise the confidentiality and consistency of perception data, severely affecting network operations. Therefore, it is vital to evaluate security performance and to implement measures that enhance overall network security and robustness. Presently, efficient security assessment models for the comprehensive evaluation of system security performance are lacking.

Anomaly detection in electronic sensors and sensor networks also faces major challenges, such as the large data volume of sensor nodes, the highly dynamic and heterogeneous data distribution, and limited computational and energy resources. Researchers have proposed many anomaly detection methods to address these problems, which can be roughly classified into probabilistic-statistics-based, clustering-based, nearest-neighbor-based, density-based, tree-based, and machine-learning-based methods. Breunig et al. proposed an anomaly detection algorithm that identifies outliers by computing density: the density of a normal sample region is essentially the same as that of its neighbors, whereas the density of an abnormal sample region differs significantly from that of its neighbors [28]. Hawkins et al. proposed an anomaly detection method based on the replicator neural network (RNN) [29]. The RNN was trained on a sampled dataset to build a model that reproduces the given data, and the trained RNN was then applied to the entire dataset to provide a quantitative outlier score based on the reconstruction error. Shyu et al. investigated the application of robust principal component analysis to anomaly detection [30]. Two principal components played a decisive role in this method: the primary principal component explained about 50% of the total variance, while the secondary principal component had eigenvalues of less than 0.20.

Mia et al. proposed a location and distribution estimation algorithm known as the minimum covariance determinant (MCD) algorithm, which is highly robust and can be computed efficiently with the FAST-MCD method [31]. The MCD algorithm yields more robust mean and covariance estimates, from which the Mahalanobis distance can be computed, allowing for more accurate detection of outliers. Goldstein et al. proposed an algorithm for anomaly detection by constructing histograms, which had a low time complexity and was therefore suitable for anomaly detection in large datasets [32]. It required features to be independent of each other, and its detection was faster for multivariate datasets compared with other algorithms. However, this method was only suitable for global anomaly detection and was less effective for local anomaly detection. Malhotra et al. introduced long short-term memory (LSTM) networks into the anomaly detection task; these networks are suited to multidimensional time series, and the error was then fitted with a multivariate normal distribution [33].

Giatrakos et al. proposed a composite solution to the anomaly detection problem that combines data-similarity feature modeling with a variable sliding window to guarantee the data rate and improve the accuracy of anomaly prediction [34]. In this scheme, the bandwidth occupancy was predicted through the proposed framework, and the amount of communication data was reduced, thus prolonging the lifetime of the WSNs. Yu et al. proposed an unsupervised contextual anomaly detection method for WSNs [35]. The scheme used grid segmentation and grid cell merging for data classification and detected anomalous data using the spatial and temporal correlation between contextual anomalies. Zhang et al. proposed a local anomaly identification scheme using unsupervised data, which is effective for univariate data anomaly detection [36]. The scheme used the mean-shift algorithm to cluster the datasets, and it set a shift window for each data point in the dataset, which was calculated based on the average shift vector of the data points.

1.2. Contributions and Outcomes

This paper focuses on enhancing the security of electronic sensors and sensor networks. A hierarchical electrical sensor network model is analyzed by employing the analytic hierarchy process, and we construct a hierarchical perception security architecture tailored for the anomaly detection method in electronic sensors and sensor networks. Central to our approach is the introduction of a weighted neighborhood information network (WNIN) for anomaly detection in mixed-attribute data, and we propose a WNIN-enabled anomaly detection method. Initially, we present a neighborhood information system to identify the relationships among data objects with mixed attributes. Subsequently, we develop a weighted neighborhood information network for electronic sensors and sensor networks, capturing these relationships. The network features a state-transferring probability matrix derived from data object similarity. Finally, a random wandering process is executed within the WNIN, and the proposed method determines the importance of data objects based on the steady-state distribution vector, thereby determining the anomaly data and fortifying the security framework of the electronic sensors and sensor network.

The rest of the paper is structured as follows. The AHP-based security analysis for electronic sensors and sensor networks is presented in Section 2. In Section 3, the WNIN-enabled anomaly detection method is presented, including the neighborhood information system for mixed-attribute data, the WNIN for electronic sensors and sensor networks, as well as the anomaly detection method based on the WNIN. The simulation results are presented in Section 4, and Section 5 is the conclusion.

2. AHP-Based Security Analysis for Electronic Sensors and Sensor Networks

In this section, the security of electronic sensors and the sensor network will be analyzed through the analytic hierarchy process (AHP) according to the hierarchical model of electrical sensor networks.

The sensors in electronic sensor networks include four main types:

(1) RFID sensor

The core technology of an RFID sensor is radio frequency identification (RFID), an Internet of Things identification technology that identifies tagged objects through a simple wireless system and thereby strengthens the collection and processing of key information.

(2) Electrical sensor

Current, voltage, and harmonics are the key parameters in the operation of power equipment and are also the main quantities perceived at network nodes and lines.

(3) Non-electrical sensor

Electric fields, force, heat, humidity, and other physical fields act together during the operation of power equipment, which calls for non-electrical sensing mechanisms.

(4) Environmental sensor

The power grid is infrastructure, and its deployment environment is complex. Environmental parameters are among the main data in new energy operation.

The hierarchical electrical sensor network model is shown in Figure 1.

The electrical sensor network system is divided into three parts: the perception layer, the network layer, and the edge computing layer. The perception layer includes the bushing monitoring sensor, voltage/current monitoring sensor, watering sensor, partial discharge sensor, on-load tap-changer monitoring sensor, etc. The network layer encompasses the mobile network, wireless local area network (WLAN), and cellular network, providing the requisite network technology for the electrical sensor network system. The communication process must comply with the protocols of wireless LAN authentication and privacy infrastructure (WAPI), as well as the micro-power wireless network communication protocol for the internet for power transmission and transformation equipment (Q/GDW). The edge computing layer pertains to the applications of the electrical sensor network, including substation fault analysis, electronic line monitoring, and energy consumption management.

Based on the hierarchical electrical sensor network model, the AHP method is employed to analyze the security of the electronic sensors and sensor network, and a hierarchical perception security architecture is constructed. The AHP provides a structured and systematic approach to decision making by decomposing the complex security assessment problem of the electrical sensors and sensor network into hierarchical sub-problems. It uses pairwise comparisons to determine the relative importance of each security attack, can handle mixed attributes, is robust, and supports comprehensive evaluation. The establishment of the hierarchical perception security architecture considers factors from the perception layer, the network layer, and the edge computing layer, as illustrated in Figure 2.

According to the hierarchical perception security architecture, perception security is divided into perception layer security, network layer security, edge computing layer security, and other security.

(1) Perception layer security includes electromagnetic tampering, laser tampering, acoustic tampering, and sensor lifetime. Electromagnetic tampering changes the data perceived by the electronic sensor by sending electromagnetic waves or by adding an external magnetic field. Laser tampering changes the data with a laser of a specific frequency. Acoustic tampering changes the data by ultrasonic attacks or by sending control commands. The lifetime of an electrical sensor under prolonged operation is one of the main indicators that directly reflects its reliability.

(2) Network layer security encompasses data theft and node phishing. Data theft occurs when data transmitted through the network are obtained by monitoring, which destroys data confidentiality. Node phishing means that malicious nodes impersonate electronic sensors to send false information to the network, or impersonate sink nodes to trick electronic sensors into handing over their data.

(3) Edge computing layer security includes data leakage, memory tampering, and denial of service. Data leakage means that the service data stored on edge terminals are obtained through network attacks or device intrusion. Memory tampering is when the storage of edge terminal devices is attacked by means of magnetism or lasers to change the stored service and management data. Denial of service causes the edge terminal device to stop providing service to the electronic sensor by continuously sending data to the edge terminal device.

(4) Other security covers environmental threats, natural disasters, management risks, etc. Environmental threats refer to the complexity and variability of the environment where sensors, cameras, and other equipment are located, as well as their susceptibility to man-made damage. Natural disasters pertain to losses caused by events such as floods, fires, earthquakes, and other natural calamities. Management risks relate to the quality and completeness of the management system, supervision mechanisms, national policies, laws, and regulations.
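The pairwise-comparison step of the AHP described in this section can be illustrated with a minimal sketch. The judgment values below are hypothetical and are not taken from the paper, and the function names are illustrative only. The sketch approximates the principal eigenvector of the comparison matrix by power iteration and then computes Saaty's consistency ratio, assuming the commonly tabulated random index of 0.90 for a 4 × 4 matrix.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison matrix
    by power iteration; returns (weights, estimated lambda_max)."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lam_max


def consistency_ratio(lam_max, n, random_index):
    """Saaty's CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    return ((lam_max - n) / (n - 1)) / random_index


# Hypothetical judgments comparing the four security categories:
# perception layer, network layer, edge computing layer, other.
comparisons = [
    [1.0, 2.0, 3.0, 5.0],
    [1 / 2, 1.0, 2.0, 4.0],
    [1 / 3, 1 / 2, 1.0, 2.0],
    [1 / 5, 1 / 4, 1 / 2, 1.0],
]
weights, lam_max = ahp_weights(comparisons)
cr = consistency_ratio(lam_max, n=4, random_index=0.90)
print([round(w, 3) for w in weights], round(cr, 3))
```

A consistency ratio below 0.1 is the usual rule of thumb for accepting a set of judgments; larger values suggest that the pairwise comparisons should be revisited.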

3. WNIN-Enabled Anomaly Detection Method

In this section, we delineate the weighted neighborhood information network (WNIN)-enabled anomaly detection method tailored for mixed-attribute data within electrical sensors and sensor networks. Traditional network-model-based anomaly detection methods typically represent data objects as nodes within a network, with the presence of edges and their associated weights determined by the relationships and similarities among these data objects. Consequently, the efficacy of these methods hinges significantly on the construction of the network model. Recognizing that current methods predominantly address datasets with numerical attributes, we observe a scarcity of consideration for datasets encompassing diverse attribute types, particularly those featuring nonnumerical attributes.

To bridge this gap, we integrate the theory of neighborhood rough sets with network modeling to introduce a WNIN-enabled anomaly detection method for mixed-attribute data. Utilizing the neighborhood information system (NIS), we ascertain the neighbor relationships between data objects within the mixed-attribute dataset. Building upon this, we construct a WNIN model. A Markovian stochastic process is then executed on the WNIN model to derive a node importance value index, which quantifies the significance of each data object. This metric facilitates the identification of anomalies within the mixed-attribute dataset, thereby enhancing the anomaly detection capabilities of the sensor network.

3.1. Neighborhood Information System for Mixed-Attribute Data

The specific establishment process of the NIS for mixed-attribute data is shown in Figure 3. The NIS is the basic expression of the neighborhood rough set. For electronic sensors and sensor networks, an NIS can be expressed as $\mathrm{NIS} = (A, C, E, f)$, where $A = \{a_{1}, a_{2}, \dots, a_{N}\}$ denotes a non-empty finite set of electronic perception data objects, $C = \{c_{1}, c_{2}, \dots, c_{M}\}$ is a non-empty finite set of mixed attributes based on the hierarchical perception security architecture, $E = \bigcup_{c \in C} e_{c}$ is the value domain of the attributes in $C$, and $f : A \times C \to E$ serves as an information function reflecting the correspondence between the data objects and the mixed attributes [37]. According to the hierarchical perception security architecture shown in Figure 2, the data object $a_{n}$ is assumed to be described by $K$ numerical attributes and $M - K$ nonnumerical attributes. For convenient analysis, the first $K$ attributes of $a_{n}$ are numerical attributes, and the last $M - K$ attributes are defined as nonnumerical attributes.

The subsequent processing of numerical and nonnumerical attributes is different. Numerical attribute values are first normalized; the distance between data objects in a numerical attribute is then calculated with the Euclidean distance formula, and the neighborhood radius of the attribute is defined as a number that depends on its standard deviation and mean. In a nonnumerical attribute, by contrast, the distance is equal to 0 when the two data objects take the same value and 1 otherwise. Accordingly, the neighborhood radius of a nonnumerical attribute is defined as 0.

In electronic sensors and sensor networks, the perception data usually differ widely in order of magnitude or dimension, and computing directly on initial data of different magnitudes greatly reduces the accuracy of the detection results. Therefore, the maximum–minimum normalization method is adopted for the numerical attributes of the initial dataset, and the value of the processed data object $a_{n}$ in the numerical attribute $c^{m}$ ($m \leq K$) is written as

$$b_{n}^{m} = \frac{a_{n}^{m} - \min_{i = 1, 2, \dots, N} \{a_{i}^{m}\}}{\max_{i = 1, 2, \dots, N} \{a_{i}^{m}\} - \min_{i = 1, 2, \dots, N} \{a_{i}^{m}\}}, \tag{1}$$

where $a_{n}^{m}$ is the value of the $m$-th attribute ($c^{m}$) of $a_{n}$.
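As a minimal sketch of Formula (1), the following function rescales one numerical attribute to the interval [0, 1]. The sample readings and the function name are hypothetical. Formula (1) is undefined when all values of an attribute are equal, so mapping such a column to 0.0 is an assumption of this sketch.

```python
def min_max_normalize(column):
    """Rescale the values of one numerical attribute to [0, 1] (Formula (1))."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant attribute carries no information, map it to 0.0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical voltage readings from four sensors.
voltages = [218.0, 220.0, 221.0, 260.0]
normalized = min_max_normalize(voltages)
print(normalized)
```

After normalization, attributes measured in different units contribute on a comparable scale to the distances computed later.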

In order to efficiently and accurately measure the distance between different data objects with mixed attributes, the mixed Euclidean overlap metric is used, where the distance between the data objects $a_{n}$ and $a_{i}$ is defined as

$$r_{n, i} = \sqrt{\sum_{m = 1}^{M} d_{n, i}^{m}}, \tag{2}$$

where the expression of $d_{n, i}^{m}$ is

$$d_{n, i}^{m} = \begin{cases} |b_{n}^{m} - b_{i}^{m}|, & \text{when } m = 1, 2, \dots, K \\ 1, & \text{when } b_{n}^{m} \neq b_{i}^{m} \ (m = K + 1, K + 2, \dots, M) \\ 0, & \text{when } b_{n}^{m} = b_{i}^{m} \ (m = K + 1, K + 2, \dots, M) \end{cases}. \tag{3}$$
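Formulas (2) and (3) can be sketched as follows. The sketch follows Formula (2) as printed, summing the per-attribute distances under the square root, and it assumes that the first `k` entries of each object are numerical values already normalized by Formula (1). The sample objects and function names are hypothetical.

```python
import math


def attribute_distance(x, y, numerical):
    """Per-attribute distance of Formula (3)."""
    if numerical:
        return abs(x - y)
    # Overlap distance for nonnumerical values: 0 if equal, 1 otherwise.
    return 0.0 if x == y else 1.0


def mixed_distance(obj_a, obj_b, k):
    """Distance of Formula (2); the first k attributes are numerical."""
    total = 0.0
    for m, (x, y) in enumerate(zip(obj_a, obj_b)):
        total += attribute_distance(x, y, numerical=(m < k))
    return math.sqrt(total)


# Hypothetical objects: two normalized numerical attributes, one nonnumerical.
a = [0.2, 0.5, "WLAN"]
b = [0.6, 0.5, "cellular"]
print(mixed_distance(a, b, k=2))
```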

The size of the neighborhood radius directly determines the neighbor relationship between different data objects, thus affecting the construction of the NIS. The traditional neighborhood radius usually assigns a fixed value to all the attributes based on experts’ experience without considering the data distribution characteristics. Data objects in different attributes show unique distribution characteristics, so the neighborhood radius should be different in different attributes, and it is more reasonable to set the corresponding neighborhood radius for each attribute according to the data distribution characteristics in different attributes. To this end, the variation coefficient and the neighborhood radius adjustment factor are used to construct the expression of the neighborhood radius to make it more adaptable and objective for different electronic sensor application scenarios, and the neighborhood radius in the attribute $c^{m}$ is defined as follows:

$$\bar{r}^{m} = \begin{cases} \dfrac{\sigma^{m}}{\theta \mu^{m}}, & \text{when } m = 1, 2, \dots, K \\ 0, & \text{when } m = K + 1, K + 2, \dots, M \end{cases}, \tag{4}$$

where $\sigma^{m}$ and $\mu^{m}$ denote the standard deviation and mean value of the numerical attribute $c^{m}$ ($m = 1, 2, \dots, K$) of the data objects, respectively, and $\theta$ ($\theta > 0$) is the neighborhood radius adjustment factor, which ensures that the neighborhood radius of the proposed method is more adaptive to the numerical attributes of different data distribution characteristics. The neighbor set of the data object $a_{n}$ in the attribute $c^{m}$ is denoted as $\mathrm{Neg}_{n}^{m}$, which is given as follows:

$$\mathrm{Neg}_{n}^{m} = \{a_{i} \ (i \neq n) \mid a_{i} \in A, \ d_{n, i}^{m} \leq \bar{r}^{m}\}. \tag{5}$$
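A minimal sketch of Formulas (4) and (5) for a single attribute is given below. Two choices are assumptions of the sketch rather than statements of the text: the standard deviation is taken as the population standard deviation, and the statistics are computed on the normalized values. The sample data and function names are hypothetical, and neighbor sets are returned as sets of object indices.

```python
import statistics


def neighborhood_radius(column, theta):
    """Formula (4) for one numerical attribute: sigma / (theta * mu)."""
    mu = statistics.mean(column)
    if mu == 0:
        raise ValueError("mean is zero: radius of Formula (4) is undefined")
    sigma = statistics.pstdev(column)  # assumption: population standard deviation
    return sigma / (theta * mu)


def neighbor_sets(column, radius, numerical=True):
    """Formula (5): indices i != n whose per-attribute distance is <= radius."""
    result = []
    for n, x in enumerate(column):
        neighbors = set()
        for i, y in enumerate(column):
            if i == n:
                continue
            d = abs(x - y) if numerical else (0.0 if x == y else 1.0)
            if d <= radius:
                neighbors.add(i)
        result.append(neighbors)
    return result


# Hypothetical normalized readings; the last object is far from the others.
readings = [0.0, 0.05, 0.1, 1.0]
radius = neighborhood_radius(readings, theta=2.0)
numeric_neighbors = neighbor_sets(readings, radius)

# Nonnumerical attribute: the radius is 0, so only identical values are neighbors.
protocols = ["WAPI", "WAPI", "Q/GDW"]
protocol_neighbors = neighbor_sets(protocols, 0.0, numerical=False)
print(radius, numeric_neighbors, protocol_neighbors)
```

A larger adjustment factor shrinks the radius, so fewer objects qualify as neighbors and the resulting network becomes sparser.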

Only when the distance in the attribute $c^{m}$ between the data objects $a_{n}$ and $a_{i}$ does not exceed the neighborhood radius $\bar{r}^{m}$ is the data object $a_{i}$ one of the neighbors of $a_{n}$. When $a_{i} \in \mathrm{Neg}_{n}^{m}$ and $a_{n} \in \mathrm{Neg}_{i}^{m}$, the data objects $a_{n}$ and $a_{i}$ are neighbors of each other. The neighborhood relationship between $a_{n}$ and $a_{i}$ in the attribute $c^{m}$ is denoted as $R^{m}$ and is shown as follows:

$$R^{m} = \{(a_{n}, a_{i}) \in A \times A \mid a_{i} \in \mathrm{Neg}_{n}^{m}, \ a_{n} \in \mathrm{Neg}_{i}^{m}\}. \tag{6}$$

Therefore, the neighborhood relation contained in $A$ can be expressed as follows:

$$A | R^{m} = \{\rho_{1}^{m}, \rho_{2}^{m}, \dots, \rho_{N}^{m}\}, \tag{7}$$

where $\rho_{n}^{m}$ denotes the number of neighbors of the data object $a_{n}$ in the attribute $c^{m}$.

3.2. WNIN for Electronic Sensors and Sensor Networks

A WNIN for electronic sensors and sensor networks is constructed by modeling each data object $a_{n}$ as a node in the network. According to the NIS for mixed-attribute data of the electronic sensors and sensor network, there are $N$ nodes in the network. There exists an undirected edge between $a_{n}$ and $a_{i}$ only when $a_{n} \in \mathrm{Neg}_{i}^{m}$ or $a_{i} \in \mathrm{Neg}_{n}^{m}$, and the total similarity between $a_{n}$ and $a_{i}$ is the weight of the corresponding edge. The total similarity between the data objects $a_{n}$ and $a_{i}$ is denoted as $S_{n, i}$ and defined as follows:

$$S_{n, i} = \begin{cases} 0, & \text{when } n = i \\ \sum_{m = 1}^{M} \varpi^{m} s_{n, i}^{m}, & \text{when } n \neq i \end{cases}, \tag{8}$$

where $\varpi^{m}$ is the weight of the attribute $c^{m}$, and $s_{n, i}^{m}$ is the similarity between $a_{n}$ and $a_{i}$ in the attribute $c^{m}$.

The weight $\varpi^{m}$ of the attribute $c^{m}$ can be calculated by using the entropy weighting method, which is shown as follows:

$$\varpi^{m} = \frac{1 - H^{m}}{M - \sum_{m' = 1}^{M} H^{m'}}, \tag{9}$$

where $H^{m}$ is the value of the entropy information in the attribute $c^{m}$, which is shown as follows:

$$H^{m} = \ln M \sum_{n = 1}^{N} \frac{|\rho_{n}^{m}|}{\sum_{n' = 1}^{N} |\rho_{n'}^{m}|} \ln \frac{|\rho_{n}^{m}|}{\sum_{n' = 1}^{N} |\rho_{n'}^{m}|}. \tag{10}$$

In Formula (10), the limit

$$\lim_{|\rho_{n}^{m}| \to 0} \frac{|\rho_{n}^{m}|}{\sum_{n' = 1}^{N} |\rho_{n'}^{m}|} \ln \frac{|\rho_{n}^{m}|}{\sum_{n' = 1}^{N} |\rho_{n'}^{m}|} = 0$$

is used when $|\rho_{n}^{m}| = 0$.
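The entropy weighting of Formulas (9) and (10) can be sketched as follows. The leading constant of the entropy is passed in as the parameter `scale` rather than hard-coded: the example call uses the conventional entropy-weight normalization of -1/ln N, which is an assumption of this sketch and differs from the factor printed in Formula (10). The neighbor counts and function names are hypothetical.

```python
import math


def entropy_term(counts):
    """Sum of p * ln(p) over the objects, using the convention 0 * ln(0) = 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    result = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            result += p * math.log(p)
    return result


def entropy_weights(neighbor_counts, scale):
    """Formula (9); neighbor_counts[m][n] is the neighbor count of object n
    in attribute m, and scale is the leading constant of the entropy."""
    entropies = [scale * entropy_term(counts) for counts in neighbor_counts]
    denominator = len(entropies) - sum(entropies)
    return [(1.0 - h) / denominator for h in entropies]


# Hypothetical neighbor counts for two attributes and four objects.
counts = [
    [2, 2, 1, 1],  # fairly even spread of neighbors
    [3, 1, 0, 0],  # uneven spread of neighbors
]
attribute_weights = entropy_weights(counts, scale=-1.0 / math.log(4))
print(attribute_weights)
```

Whatever constant is used, the weights of Formula (9) sum to 1 by construction; with the conventional normalization, an attribute whose neighbor counts are less evenly spread has lower entropy and therefore receives a larger weight.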

The expression for $s_{n, i}^{m}$ is shown as follows:

$$s_{n, i}^{m} = \begin{cases} 1 - |b_{n}^{m} - b_{i}^{m}|, & \text{when } m = 1, 2, \dots, K \\ 1, & \text{when } b_{n}^{m} = b_{i}^{m} \ (m = K + 1, K + 2, \dots, M) \\ 0, & \text{when } b_{n}^{m} \neq b_{i}^{m} \ (m = K + 1, K + 2, \dots, M) \end{cases}. \tag{11}$$

In this case, the WNIN can be represented by an adjacency matrix $[G_{n, i}]^{N \times N}$, which is defined as the product of the similarity matrix $[S_{n, i}]^{N \times N}$ and the edge existence matrix $[U_{n, i}]^{N \times N}$:

$$[G_{n, i}]^{N \times N} = [S_{n, i}]^{N \times N} \circ [U_{n, i}]^{N \times N}, \tag{12}$$

where “$\circ$” denotes the operation of the Hadamard product. The edge existence matrix $[U_{n, i}]^{N \times N}$ contains the neighborhood information of each node, and $U_{n, i}$ is defined as follows:

$$U_{n, i} = \begin{cases} 1, & \text{when } a_{n} \in \mathrm{Neg}_{i}^{m} \\ 0, & \text{when } a_{n} \notin \mathrm{Neg}_{i}^{m} \end{cases}. \tag{13}$$
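Formulas (8), (11), and (12) can be sketched together as follows. As a simplification of this sketch, the edge existence matrix is supplied directly instead of being derived from the neighbor sets. The attribute weights, sample objects, and function names are hypothetical, and the first `k` entries of each object are assumed to be normalized numerical values.

```python
def attribute_similarity(x, y, numerical):
    """Per-attribute similarity of Formula (11)."""
    if numerical:
        return 1.0 - abs(x - y)
    return 1.0 if x == y else 0.0


def total_similarity(objects, attr_weights, k):
    """Similarity matrix of Formula (8); the diagonal stays 0."""
    n = len(objects)
    S = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for i in range(n):
            if a != i:
                S[a][i] = sum(
                    w * attribute_similarity(x, y, numerical=(m < k))
                    for m, (w, x, y) in enumerate(
                        zip(attr_weights, objects[a], objects[i]))
                )
    return S


def hadamard(S, U):
    """Element-wise product of Formula (12)."""
    return [[s * u for s, u in zip(row_s, row_u)] for row_s, row_u in zip(S, U)]


# Hypothetical data: one numerical and one nonnumerical attribute.
objects = [[0.0, "A"], [0.1, "A"], [1.0, "B"]]
S = total_similarity(objects, attr_weights=[0.5, 0.5], k=1)
# Hypothetical edge existence matrix (1 where two objects are neighbors).
U = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
G = hadamard(S, U)
print(G)
```

The Hadamard product keeps a similarity value only where an edge exists, so pairs of objects that are not neighbors receive a zero weight regardless of their similarity.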

3.3. Anomaly Detection Method Based on the WNIN

The specific process of the anomaly detection method based on the WNIN through a random wandering process is shown in Figure 4. To perform a random wandering process in the WNIN-based electronic sensor data network, the state-transferring probability matrix $[P_{n, i}]^{N \times N}$ is obtained by normalizing the adjacency matrix $[G_{n, i}]^{N \times N}$:

$$[P_{n, i}]^{N \times N} = [G_{n, i}]^{N \times N} \times ([Q_{n, i}]^{N \times N})^{-1}, \tag{14}$$

where $[Q_{n, i}]^{N \times N}$ is a diagonal matrix with respect to the adjacency matrix $[G_{n, i}]^{N \times N}$, and each element in the diagonal of $[Q_{n, i}]^{N \times N}$ is equal to the sum of the corresponding row elements in $[G_{n, i}]^{N \times N}$. The state-transferring probability matrix $[P_{n, i}]^{N \times N}$ is also a Markov random wandering matrix that satisfies the condition that the sum of the transferring probabilities from each node to all other nodes is equal to 1, i.e., $\sum_{i} P_{n, i} = 1$ ($\forall n$).

The random wandering process begins at a randomly selected node within the WNIN. From the starting node, the process takes random steps to adjacent nodes. These steps are determined by the state-transferring probability matrix $[P_{n, i}]^{N \times N}$. The random wandering process continues to move from node to node, exploring the network in a stochastic manner. After a certain number of iterations, the random wandering process tends toward a stable state, that is, all the nodes in the WNIN-based electronic sensor data network are no longer changing the probability of being accessed, which can be represented by the steady-state distribution vector. The steady-state distribution vector is defined as follows:

$$[\pi_{1}^{t + 1}, \pi_{2}^{t + 1}, \dots, \pi_{N}^{t + 1}] = [\pi_{1}^{t}, \pi_{2}^{t}, \dots, \pi_{N}^{t}] \times [P_{n, i}]^{N \times N}, \tag{15}$$

where $\pi_{n}^{t}$ denotes the probability that a random wanderer stays at node $a_{n}$ at timeslot $t$.
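The normalization and the iteration of Formula (15) can be sketched as follows. The sketch divides each row of the adjacency matrix by its row sum, which is the operation that yields the stated condition that every row of the transition matrix sums to 1. Three further points are assumptions of the sketch: the iteration starts from a uniform distribution rather than a single random node, a node without edges raises an error because its handling is not specified in the text, and the network is taken to be connected and not bipartite so that plain iteration settles. The adjacency values and function names are hypothetical.

```python
def transition_matrix(G):
    """Row-normalize the adjacency matrix so that each row of P sums to 1."""
    P = []
    for row in G:
        total = sum(row)
        if total == 0:
            # Assumption: isolated nodes are rejected rather than patched.
            raise ValueError("node without edges: row cannot be normalized")
        P.append([g / total for g in row])
    return P


def steady_state(P, tol=1e-12, max_iter=10000):
    """Iterate pi <- pi P (Formula (15)) until the vector stops changing."""
    n = len(P)
    pi = [1.0 / n] * n  # assumption: uniform starting distribution
    for _ in range(max_iter):
        nxt = [sum(pi[a] * P[a][i] for a in range(n)) for i in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical network: nodes 0-2 are strongly similar to each other,
# node 3 is weakly attached to node 2 only.
G = [
    [0.0, 0.9, 0.9, 0.0],
    [0.9, 0.0, 0.9, 0.0],
    [0.9, 0.9, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
P = transition_matrix(G)
pi = steady_state(P)
print(pi)
```

In this example the weakly attached node ends up with the smallest steady-state probability, which matches the intuition that objects dissimilar to the rest of the data are visited rarely.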

The similarity between two nodes is used to construct the state-transferring probability matrix $[P_{n, i}]^{N \times N}$ of the Markov random wandering process, so that transitions between similar nodes are more likely. Under this mechanism, nodes that tend to be outliers have a smaller probability of being visited. When the Markov random wandering process reaches a steady state after a certain number of iterations, each element of the steady-state distribution vector represents the probability of the random wanderer staying at the corresponding node, i.e., the importance of that node. The larger a node's value in the steady-state distribution vector, the smaller the probability that the node is an outlier. In order to distinguish the importance of each node, we standardize the importance values in the steady-state distribution vector. The standardized importance value of $a_{n}$ is denoted as $\varsigma_{n}$, which is defined as follows:

$$\varsigma_{n} = \frac{(\lambda - \gamma) \left(\pi_{n}^{t} - \min([\pi_{1}^{t}, \pi_{2}^{t}, \dots, \pi_{N}^{t}])\right)}{\max([\pi_{1}^{t}, \pi_{2}^{t}, \dots, \pi_{N}^{t}]) - \min([\pi_{1}^{t}, \pi_{2}^{t}, \dots, \pi_{N}^{t}])} + \gamma, \tag{16}$$

where $\lambda > \gamma$ are the upper and lower bounds of the standardized range, so that $\varsigma_{n} \in [\gamma, \lambda]$.

In addition, the importance value of a node is closely related to the number of its neighbors, and the value of the neighborhood radius adjustment factor $\theta$ has a large impact on the dispersion degree of each node in the WNIN-based electronic sensor data network. Thus, the importance degree is defined as follows:

$$\xi_{n} = \frac{\varsigma_{n}}{\sum_{i = 1}^{N} U_{n, i}}. \tag{17}$$

Considering the principle that potential anomaly data objects have less importance in the WNIN for electronic sensors and sensor networks, the smaller the importance degree of a data object is, the greater the probability that it will tend to be an anomaly object.
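The scoring of Formulas (16) and (17) and the resulting ranking can be sketched as follows. The steady-state values, the bounds of the standardized range, the edge existence matrix, and the function names are all hypothetical. The sketch assumes that every node has at least one neighbor, since Formula (17) divides by the neighbor count, and it maps a constant steady-state vector to the lower bound, which is also an assumption.

```python
def standardized_importance(pi, gamma, lam):
    """Formula (16): rescale the steady-state values to [gamma, lam]."""
    lo, hi = min(pi), max(pi)
    if hi == lo:
        # Assumption: identical steady-state values all map to gamma.
        return [gamma for _ in pi]
    return [(lam - gamma) * (p - lo) / (hi - lo) + gamma for p in pi]


def importance_degree(varsigma, U):
    """Formula (17): divide each standardized value by the node's neighbor count.
    Assumes every row of U contains at least one 1."""
    return [v / sum(row) for v, row in zip(varsigma, U)]


# Hypothetical steady-state distribution and edge existence matrix.
pi = [0.32, 0.32, 0.34, 0.02]
U = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
varsigma = standardized_importance(pi, gamma=1.0, lam=10.0)
xi = importance_degree(varsigma, U)
# Objects sorted from most to least likely anomalous (smallest degree first).
ranking = sorted(range(len(xi)), key=lambda n: xi[n])
print(xi, ranking)
```

How many of the lowest-ranked objects are reported as anomalies is a separate decision, for example a fixed count or a threshold chosen for the application.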

The complexity of the anomaly detection method based on the WNIN consists of two parts. One is the complexity of constructing the WNIN for electronic sensors and sensor networks and its

O (M N^{2})

, where

N

is the number of data objects in the WNIN and

M

is the number of attributes. The other is the complexity of the random wandering process, which is

O (T (N + L))

, where

T

is the number of steps in the random walk,

N

is the number of data objects in the WNIN, and

L

is the number of edges. Consequently, constructing the WNIN can become prohibitive for large, high-dimensional datasets, since its cost grows quadratically with the number of data objects, leading to long computation times and significant resource requirements. The random wandering process may also require a large number of steps to converge to a stable result, especially in complex WNINs, which can introduce variability in the anomaly detection results.
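To illustrate where the per-step cost of the random wandering process comes from, the following sketch runs a power iteration over an edge list, so that each step touches every node and every edge once. It is a simplified illustration under assumptions: the edge-list representation, the uniform starting vector, the stopping tolerance `tol`, the cap `max_steps`, and the three-node example chain are all hypothetical and are not specified in the text.

```python
import numpy as np

def steady_state(n_nodes, src, dst, prob, max_steps=1000, tol=1e-10):
    """Power iteration pi <- pi P over an edge list; one step costs O(N + L)."""
    src = np.asarray(src)
    dst = np.asarray(dst)
    prob = np.asarray(prob, dtype=float)
    pi = np.full(n_nodes, 1.0 / n_nodes)          # uniform start (assumption)
    for step in range(1, max_steps + 1):
        new_pi = np.zeros(n_nodes)                # O(N) initialization
        # Push probability mass along every edge: L multiply-adds.
        np.add.at(new_pi, dst, pi[src] * prob)
        if np.abs(new_pi - pi).sum() < tol:       # L1 change below tolerance
            return new_pi, step
        pi = new_pi
    return pi, max_steps

# Hypothetical 3-node chain; the outgoing probabilities of each node sum to 1.
src  = [0,   0,   1,    1,   1,    2,   2]
dst  = [0,   1,   0,    1,   2,    1,   2]
prob = [0.5, 0.5, 0.25, 0.5, 0.25, 0.5, 0.5]

pi, steps = steady_state(3, src, dst, prob)
```

The number of iterations actually needed plays the role of T in the complexity expression, which is why slowly mixing networks make the overall cost harder to predict.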

4. Performance Evaluation

In this section, the proposed method is evaluated experimentally on a public dataset and a constructed dataset. The WNIN-enabled anomaly detection method is compared with two related anomaly detection methods to verify its effectiveness. The first is the VOS method, which is based on the KNN and the random wandering process [38]. Integrating local information with implicit connections within the graph representation of the original dataset, the VOS method constructs a similarity graph using the top-k similar neighbors of each object. It introduces a virtual node coupled with a collection of virtual edges to generate a k-virtual graph. Subsequently, a Markov random walk process is conducted on the similarity graph, with the principle that potential anomalies should receive more weight to be visited. The second is the NIEHDOD method, which is based on neighborhood information entropy [37]. Its neighborhood information system is defined by a heterogeneous distance and a self-adapting radius, and neighborhood information entropy is then formulated to quantify the overall uncertainty. Three incremental information measures are constructed to characterize individual objects, culminating in a neighborhood entropy-based anomaly factor for anomaly detection.

The performance of the proposed method was evaluated in terms of its accuracy rate, recall rate, false-alarm rate, and F1 score. In data anomaly detection, the data sample space can generally be divided into two categories: normal data and anomaly data. Comparing the detection results with the actual categories of the samples yields four outcomes: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). A True Positive (TP) is an outlier that is correctly detected by the algorithm, while a False Positive (FP) is a normal data point that is misjudged as an outlier. A True Negative (TN) is a normal data point that is correctly identified as normal, and a False Negative (FN) is an outlier that is not detected by the algorithm.

The accuracy rate is the proportion of all data points, outliers and normal data alike, that are classified correctly, and it is expressed as follows:

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} .

(18)

The recall rate reflects the detection ability of the anomaly detection algorithm and is the proportion of detected outliers to the total number of outliers. It is expressed as follows:

\mathrm{Recall} = \frac{TP}{TP + FN} .

(19)

The false-alarm rate reflects the failure rate of the algorithm, i.e., the proportion of normal data points that are misjudged as outliers, and it can be expressed as follows:

\mathrm{FalseAlarm} = \frac{FP}{FP + TN} .

(20)

Taking the accuracy rate and recall rate into consideration, the F1 score is defined as follows:

\mathrm{F1\,Score} = \frac{2 \times \mathrm{Accuracy} \times \mathrm{Recall}}{\mathrm{Accuracy} + \mathrm{Recall}} .

(21)
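The four measures in Equations (18)–(21) can be computed directly from the confusion counts, as in the following sketch. The label convention (1 for an outlier, 0 for a normal data point), the function names, and the toy label vectors are assumptions made for illustration. Note that the F1 score here follows Equation (21) as written, combining the accuracy rate with the recall rate, and the sketch assumes that both classes are present so that no denominator is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN; label 1 marks an outlier, 0 a normal data point."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def detection_metrics(tp, fp, tn, fn):
    """Accuracy, recall, false-alarm rate and F1 score per Eqs. (18)-(21)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)
    false_alarm = fp / (fp + tn)
    # Eq. (21) as written uses the accuracy rate rather than precision.
    f1 = 2 * accuracy * recall / (accuracy + recall)
    return {"accuracy": accuracy, "recall": recall,
            "false_alarm": false_alarm, "f1": f1}

# Hypothetical ground truth and detection result for eight data objects.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = detection_metrics(*counts)
```

For these toy labels, two outliers are detected, one is missed, and one normal point raises a false alarm.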

The Lymphography public dataset was selected from the UCI database. It contains 148 data objects categorized into four classes, namely normal find, metastases, fibrosis, and malign lymph, with 2, 81, 61, and 4 data objects per class, respectively. The normal find and malign lymph classes are considered rare and together contain six real outliers. The proposed method was examined using the Lymphography dataset, and the results are shown in Figure 5. The x axis represents the index of each data object in the dataset. The anomaly objects clearly have small values of importance degree, which supports the principle that potential anomaly data objects have less importance in the WNIN for electronic sensors and sensor networks.

The performance of the proposed method is compared with that of the two other anomaly detection methods on the Lymphography dataset, and the results are shown in Table 1. The proposed method has the highest accuracy rate among the three methods (98.65% vs. 96.62% for both the VOS and NIEHDOD methods). Its recall rate is equal to that of the VOS method (83.33%), but its false-alarm rate is much lower (0.70% vs. 2.11%). The proposed method also achieves the best F1 score (90.35%).

According to the hierarchical perception security architecture in Figure 2, a dataset was constructed that contained 450 data objects, including 430 normal objects and 20 anomaly objects. The results of the proposed method on the constructed dataset are shown in Figure 6. Two outliers were not detected, but the results again confirm that the anomaly objects have small values of importance degree.

The performance of the proposed method and the two other anomaly detection methods on the constructed dataset is shown in Table 2. On this dataset, the proposed method has the highest accuracy rate (99.56%) and recall rate (90.00%) among the three methods. Its false-alarm rate is 0, the best of the three, and it also achieves the best F1 score (94.54%).
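As a sanity check, the rows reported for the proposed method in Tables 1 and 2 can be reproduced from confusion counts with Equations (18)–(21). The counts for the constructed dataset follow from the text (20 anomaly objects, two undetected, false-alarm rate of 0). The counts for the Lymphography dataset are not reported in the article; they are inferred here as integers consistent with 148 objects, six outliers, and the percentages in Table 1, and should be read as an assumption. The helper name `rates_percent` is hypothetical.

```python
def rates_percent(tp, fp, tn, fn):
    """Eqs. (18)-(21) as percentages rounded to two decimals."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    rec = tp / (tp + fn)
    far = fp / (fp + tn)
    f1 = 2 * acc * rec / (acc + rec)   # accuracy-based F1, as in Eq. (21)
    return tuple(round(100 * v, 2) for v in (acc, rec, far, f1))

# Constructed dataset: 430 normal and 20 anomaly objects, two anomalies
# missed, no false alarms (counts taken from the text).
constructed = rates_percent(tp=18, fp=0, tn=430, fn=2)

# Lymphography: 148 objects with 6 outliers. These counts are an inference,
# not reported values: 5 of 6 outliers found and 1 of 142 normal objects flagged.
lymphography = rates_percent(tp=5, fp=1, tn=141, fn=1)
```

Both tuples are ordered as accuracy rate, recall rate, false-alarm rate, and F1 score, and can be compared with the corresponding table rows.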

The above experimental results show that the proposed method has better overall performance than the VOS method and the NIEHDOD method. For the two datasets, the value range of the neighborhood radius for each numerical attribute can be determined from the standard deviation and mean value of that attribute. The neighborhood radius can then be adaptively adjusted through the neighborhood radius adjustment factor, thereby improving the anomaly detection rate of the proposed method.

5. Conclusions

This paper proposed a WNIN-enabled anomaly detection method for mixed-attribute data in electronic sensors and sensor networks. The method employed the AHP to assess the security of the electronic sensors and sensor network, utilizing a hierarchical electrical sensor network model. A hierarchical perception security architecture was subsequently constructed. The NIS was then established to identify the neighborhood relationships among data objects with mixed attributes. A WNIN was developed for the electronic sensors and sensor network to represent these relationships, incorporating a state-transition matrix derived from data object similarity. A random wandering process within the network was executed, and the anomaly degree of data objects was quantified based on the steady-state distribution vector. Simulation results indicate that the proposed method outperforms the compared methods in terms of anomaly detection rate.

Author Contributions

Writing—original draft preparation, C.A.; writing—review and editing, Y.L.; validation, Q.L. and P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Research and Development Project of the State Grid Corporation of China: Research on Anti-tampering Technology for Power IoT Sensor Data based on Cyber-physical Cross-domain Collaboration (5700-202336282A-1-1-ZN).

Data Availability Statement

Data are not publicly available due to privacy concerns.

Conflicts of Interest

Author Chunyan An was employed by the company China Electric Power Research Institute Co., Ltd. and Electric Power Intelligent Sensing Technology Laboratory of State Grid Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Akyildiz, I.F.; Su, W.; Sankarasubramaniam, Y.; Cayirci, E. Wireless sensor networks: A survey. Comput. Netw. 2002, 38, 393–422.
2. Kopetz, H. Internet of things. In Real-Time Systems; Springer: New York, NY, USA, 2011; pp. 307–323.
3. Lu, Z.; Huang, H.; Shan, B.; Wang, Y.; Du, S.H.; Li, J.H. Morphological evolution model and power forecasting prospect of future electric power systems with high proportion of renewable energy. Autom. Electr. Power Syst. 2017, 41, 12–18.
4. Zhou, X.; Zeng, R.; Gao, F.; Zhou, X. Development status and prospects of the energy Internet. Sci. Sin. Inf. 2017, 47, 149–170.
5. Li, H.; Hu, Y.; Li, Y.; Yang, D.; Liang, Y.; Ouyang, H.; Lan, Y. Overview of condition monitoring and fault diagnosis for grid-connected high-power wind turbine unit. Electr. Power Autom. Equip. 2016, 36, 6–16.
6. Lewis, F.L. Smart environments: Technologies, protocols, and applications. In Wireless Sensor Networks; John Wiley: Hoboken, NJ, USA, 2004; pp. 11–46.
7. Chen, W.P.; Hou, J.; Sha, L. Dynamic clustering for acoustic target tracking in wireless sensor networks. IEEE Trans. Mob. Comput. 2004, 3, 258–271.
8. Kang, J.J.; Yang, W.; Dermody, G.; Ghasemian, M.; Adibi, S.; Haskell-Dowland, P. No soldiers left behind: An iot-based low-power military mobile health system design. IEEE Access 2020, 8, 201498–201515.
9. Liao, Y.; Mollineaux, M.; Hsu, R.; Bartlett, R.; Singla, A.; Raja, A.; Bajwa, R.; Rajagopal, R. Snowfort: An open source wireless sensor network for data analytics in infrastructure and environmental monitoring. IEEE Sens. J. 2014, 14, 4253–4263.
10. Tang, Q.; Zhu, Y.; Hao, J. Shadow diagnosis and localization of PV array based on optimal sensor collocation. Acta Energy Solaris Sin. 2018, 39, 513–519.
11. Harris, N.; Cranny, A.; Rivers, M.; Smettem, K.; Barrett-Lennard, E.G. Application of distributed wireless chloride sensors to environmental monitoring: Initial results. IEEE Trans. Instrum. Meas. 2016, 65, 736–743.
12. Byun, J.; Jeon, B.; Noh, J.; Kim, Y.; Park, S. An intelligent self-adjusting sensor for smart home services based on zigbee communications. IEEE Trans. Consum. Electron. 2012, 58, 794–802.
13. Li, M.; Lin, H.J. Design and implementation of smart home control systems based on wireless sensor networks and power line communications. IEEE Trans. Ind. Electron. 2015, 62, 4430–4442.
14. Jiang, X.; Liu, Y.; Fu, X.; Xu, P.; Wang, S.J.; Sheng, G. Construction ideas and development trends of transmission and distribution equipment of the ubiquitous power Internet of things. High Volt. Eng. 2019, 45, 1345–1351.
15. Villadangos, J.; Falcone, F.; Lopez, A.; Astrain, J.J.; Sanchis, P.; Matias, I.R. Distributed opportunistic wireless mapping system towards smart city service provision. In Proceedings of the 2021 IEEE Sensors, Sydney, Australia, 31 October–3 November 2021; pp. 1–4.
16. Balid, W.; Refai, H.H. On the development of self-powered iot sensor for real-time traffic monitoring in smart cities. In Proceedings of the 2017 IEEE Sensors, Glasgow, UK, 29 October–1 November 2017; pp. 1–3.
17. Liu, L.; Hua, S.; Lai, Q. Automatic control system of balancing agricultural stereo cultivation based on wireless sensors. IEEE Sens. J. 2021, 21, 17517–17524.
18. Rao, A.; Shao, H.; Yang, X. The design and implementation of smart agricultural management platform based on uav and wireless sensor network. In Proceedings of the 2019 IEEE 2nd International Conference on Electronics Technology (ICET), Chengdu, China, 10–13 May 2019; pp. 248–252.
19. Lu, J.; Sheng, W.; Liu, R. Design and application of power distribution Internet of things. High Volt. Eng. 2019, 45, 1681–1688.
20. Guembe, I.P.; Lopez-Iturri, P.; Astrain, J.J.; Aguirre, E.; Azpilicueta, L.; Celaya-Echarri, M.; Villadangos, J.; Falcone, F. Basketball player on-body biophysical and environmental parameter monitoring based on wireless sensor network integration. IEEE Access 2021, 9, 27051–27066.
21. Shi, J.; Sha, M.; Yang, Z. Distributed graph routing and scheduling for industrial wireless sensor-actuator networks. IEEE/ACM Trans. Netw. 2019, 27, 1669–1682.
22. Joris, L.; Dupont, F.; Laurent, P.; Bellier, P.; Stoukatch, S.; Redoute, J.-M. An autonomous sigfox wireless sensor node for environmental monitoring. IEEE Sens. Lett. 2019, 3, 1–4.
23. Xue, F.; Lei, X.; Zhang, Y.; Liu, H.; Gao, C. Battery management of smart charging and swapping service network for electric vehicle based on Internet of things. Autom. Electr. Power Syst. 2012, 36, 41–46.
24. Zhang, Z.; Glaser, S.; Watteyne, T.; Malek, S. Long-term monitoring of the sierra nevada snowpack using wireless sensor networks. IEEE Internet Things J. 2022, 9, 17185–17193.
25. Chen, L.W.; Cheng, J.H.; Tseng, Y.C. Distributed emergency guiding with evacuation time optimization based on wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 419–427.
26. Yu, T.; Tan, Z.; Cheng, L.; Jiang, H.; Zhang, Z.; Wang, K. Cyber-physical energy USB System for multi-user Interaction. Autom. Electr. Power Syst. 2019, 43, 97–106.
27. Ma, Y.; Liu, K.; Chen, M.; Ma, J.; Zeng, X.; Wang, K.; Liu, C. Ant: Deadline-aware adaptive emergency navigation strategy for dynamic hazardous ship evacuation with wireless sensor networks. IEEE Access 2020, 8, 135758–135769.
28. Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104.
29. Hawkins, S.; He, H.; Williams, G.; Baxter, R. Outlier detection using replicator neural networks. In Data Warehousing and Knowledge Discovery, Proceedings of the 4th International Conference, DaWaK 2002, Aix-en-Provence, France, 4–6 September 2002; Springer: Berlin/Heidelberg, Germany, 2002; pp. 170–180.
30. Shyu, M.L.; Chen, S.C.; Sarinnapakorn, K.; Chang, L. A novel anomaly detection scheme based on principal component classifier. In Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, Piscataway, NJ, USA, 19 November 2003.
31. Hubert, M.; Debruyne, M. Minimum covariance determinant. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 36–43.
32. Goldstein, M.; Dengel, A. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. KI-2012: Poster and demo track. In Proceedings of the 35th German Conference on Artificial Intelligence, Saarbrucken, Germany, 24–27 September 2012; Volume 1, pp. 59–63.
33. Malhotra, P.; Vig, L.; Shroff, G.; Agarwal, P. Long Short Term Memory Networks for Anomaly Detection in Time Series. ESANN 2015, 2015, 89.
34. Giatrakos, N.; Deligiannakis, A.; Garofalakis, M.; Kotidis, Y. Omnibus outlier detection in sensor networks using windowed locality sensitive hashing. Future Gener. Comput. Syst. 2020, 110, 587–609.
35. Yu, X.; Lu, H.; Yang, X.; Chen, Y.; Song, H.; Li, J.; Shi, W. An adaptive method based on contextual anomaly detection in Internet of things through wireless sensor networks. Int. J. Distrib. Sens. Netw. 2020, 16, 1550147720920478.
36. Zhang, T.; Zhao, Q.; Shin, Y.; Nakamoto, Y. An unsupervised local outlier detection method for wireless sensor networks. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 386–393.
37. Yuan, Z.; Zhang, X.; Feng, S. Hybrid data-driven outlier detection based on neighborhood information entropy and its developmental measures. Expert Syst. Appl. 2018, 112, 243–257.
38. Wang, C.; Liu, Z.; Gao, H.; Fu, Y. VOS: A new outlier detection model using virtual graph. Knowl.-Based Syst. 2019, 185, 104907.

Figure 1. The hierarchical electrical sensor network model.

Figure 2. The hierarchical perception security architecture.

Figure 3. The establishment process of the neighborhood information system for mixed-attribute data.

Figure 4. The process of the anomaly detection method based on the WNIN through a random wandering process.

Figure 5. Experimental results of the proposed method using the Lymphography dataset.

Figure 6. Experimental results of the proposed method using the constructed dataset.

Table 1. Experimental results of the proposed method vs. other anomaly detection methods using the Lymphography dataset.

Method	Accuracy Rate	Recall Rate	False-Alarm Rate	F1 Score
The proposed method	98.65%	83.33%	0.70%	90.35%
VOS method	96.62%	83.33%	2.11%	89.48%
NIEHDOD method	96.62%	66.67%	2.82%	78.89%

Table 2. Experimental results of the proposed method vs. other anomaly detection methods in the constructed dataset.

Method	Accuracy Rate	Recall Rate	False-Alarm Rate	F1 Score
The proposed method	99.56%	90.00%	0.00%	94.54%
VOS method	97.56%	85.00%	1.86%	90.85%
NIEHDOD method	98.00%	75.00%	0.93%	84.97%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
