Online Adaptive Kalman Filtering for Real-Time Anomaly Detection in Wireless Sensor Networks

Ahmad, Rami; Alkhammash, Eman H.

doi:10.3390/s24155046


by Rami Ahmad ¹,* and Eman H. Alkhammash ²

¹ College of Computer Information Technology, American University in the Emirates, Dubai 503000, United Arab Emirates

² Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia

* Author to whom correspondence should be addressed.

Sensors 2024, 24(15), 5046; https://doi.org/10.3390/s24155046

Submission received: 3 July 2024 / Revised: 1 August 2024 / Accepted: 2 August 2024 / Published: 4 August 2024

(This article belongs to the Section Fault Diagnosis & Sensors)


Abstract:

Wireless sensor networks (WSNs) are essential for a wide range of applications, including environmental monitoring and smart city developments, thanks to their ability to collect and transmit diverse physical and environmental data. The nature of WSNs, coupled with the variability and noise sensitivity of cost-effective sensors, presents significant challenges in achieving accurate data analysis and anomaly detection. To address these issues, this paper presents a new framework, called Online Adaptive Kalman Filtering (OAKF), specifically designed for real-time anomaly detection within WSNs. This framework stands out by dynamically adjusting the filtering parameters and anomaly detection threshold in response to live data, ensuring accurate and reliable anomaly identification amidst sensor noise and environmental changes. By highlighting computational efficiency and scalability, the OAKF framework is optimized for use in resource-constrained sensor nodes. Validation on different WSN dataset sizes confirmed its effectiveness, showing 95.4% accuracy in reducing false positives and negatives as well as achieving a processing time of 0.008 s per sample.

Keywords:

WSNs; anomaly detection; sensors; unsupervised learning; Kalman filter; adaptive Kalman filtering

1. Introduction

Wireless sensor networks (WSNs) have become an integral part of various applications ranging from environmental monitoring and precision agriculture to industrial automation and smart cities [1]. These networks consist of independent, spatially distributed sensors that monitor physical or environmental conditions, such as temperature, sound, pressure, and motion, and cooperatively pass their data across the network to a key location [2]. However, the dynamic nature of these environments and the inherent noise in sensor measurements pose significant challenges for accurate data analysis and anomaly detection [3]. Moreover, integrating low-cost sensors into WSNs poses other significant challenges, especially for anomaly detection [4]. These economically attractive sensors often lack accuracy and reliability, which is critical to meeting the precise anomaly detection needs in WSN applications [5].

The inherent variability and potential for drift of low-cost sensors requires the development of sophisticated data processing techniques [6]. These techniques are essential for effectively filtering out noise and accurately identifying anomalies, but they escalate the computational and operational costs [7]. Furthermore, seeking to implement effective anomaly detection systems has significant financial implications [8]. Formulating adaptive algorithms capable of adeptly dealing with noisy, incomplete, or imbalanced data requires a significant investment in research and development [9]. These algorithms must strike a careful balance between sophistication and computational efficiency to facilitate real-time data analysis without imposing prohibitive costs. The challenge is compounded by the large computational resources required, which adds layers of complexity to the deployment and maintenance of WSNs [10,11,12]. Given these challenges, there is an urgent need to focus on finding tailored solutions to the problem of anomaly detection within WSNs that are low cost and high accuracy [13]. This requires a concerted effort to innovate in the areas of sensor technology, data processing, and algorithm development [14]. The goal is to devise anomaly detection systems that can efficiently process data from low-cost sensors, distinguishing between real anomalies and noise with high accuracy, all while maintaining a cost-effective operational model.

This includes exploring new methodologies in machine learning and data analytics that are specifically optimized to address the limitations of low-cost sensors and the dynamic environments of WSNs [15]. However, the Central Processing Unit (CPU) cost remains the main challenge with these technologies [16]. Traditional methods, such as z-score analysis and control charts, are effective in stable environments but falter in dynamic environments because of their reliance on fixed thresholds and statistical parameters [17], often resulting in a high rate of false positives or negatives. In contrast, machine learning-based anomaly detection, using algorithms such as clustering, Support Vector Machines (SVMs), and neural networks, provides a powerful alternative by learning the underlying patterns of the data, thus improving the identification of outliers without pre-defined assumptions. The authors in [18] combined Continuous Wavelet Transform (CWT) with a convolutional neural network (CNN) to detect ECG abnormalities and combined Discrete Wavelet Transform (DWT) with long short-term memory (LSTM) to improve the handling of false-positive sequence patterns in the training phase. Moreover, CWT and CNNs have been used to detect cyber–physical security threats [19,20]. Given these considerations, there is an urgent need to focus on developing techniques that are not only lightweight but also leverage the strengths of unsupervised learning to accommodate the dynamic nature of streaming datasets common in sensor networks, because sensor nodes rely on small CPUs and limited hardware [21].

Adaptive Kalman Filtering (AKF) technology is emerging as a powerful tool for detecting anomalies in real-time WSNs [22], providing an advanced way to filter out noise and detect outliers in sensor data at a very low cost. The use of a Kalman filter, although traditionally applied to dynamical systems, is justified in this context by the non-stationary but slowly changing nature of the environmental conditions monitored by WSNs. Even seemingly static datasets contain underlying dynamics affected by diurnal cycles, weather changes, and other environmental factors, necessitating a dynamic approach to modeling.

Moreover, the Kalman filter is a recursive algorithm designed to estimate the state of linear dynamical systems from a series of noisy measurements [23]. It works by predicting the state of the system and then updating that prediction based on new measurements. However, traditional Kalman Filtering assumes that the noise characteristics of the system and measurements are known and constant, which may not be true in the complex and evolving environments in which WSNs operate [24]. This limitation has led to the development of AKF techniques, which adjust the filter parameters, namely the process noise covariance (Q) and the measurement noise covariance (R), in real time based on the observed data. By adapting these parameters, Online Adaptive Kalman Filtering (OAKF) can maintain a high filtering performance even when system dynamics or noise characteristics change, making it particularly suitable for anomaly detection in WSNs. This adaptability is achieved by constantly evaluating the innovation sequence (the difference between the actual measurements and the predicted states) and adjusting Q, R, and the anomaly detection threshold to reduce the estimation error, thus enhancing the sensitivity of the filter to true anomalies while suppressing false alarms due to normal fluctuations or transient noise, as shown in Figure 1.

The objectives of the paper are as follows:

Optimize an Online Adaptive Kalman Filtering (OAKF) framework for real-time anomaly detection within WSNs: This work focuses on dynamically adjusting the filtering parameters to better adapt to the evolving data dynamics and noise characteristics inherent in WSN environments, ensuring enhanced precision and reliability in the detection of anomalies.
Enhance the accuracy and computational efficiency of anomaly detection in WSNs while keeping the sensor’s cost low: This goal is centered on reducing false positives and negatives, enabling the precise identification of anomalies while ensuring the OAKF algorithm remains lightweight and viable for deployment on resource-constrained sensor nodes.
Validate the OAKF framework’s performance across a variety of WSN datasets, demonstrating its versatility and effectiveness in accurately detecting anomalies under diverse environmental conditions and in different application scenarios ranging from industrial monitoring to environmental sensing.

Following this introduction, Section 2 reviews the related work to identify gaps and set the stage for our contributions. Section 3 presents details of our proposed model and demonstrates our innovative approach to detecting anomalies in sensor networks. Section 4 presents the results and discussion and analyzes the experimental performance of the model. The paper concludes with Section 5, which summarizes the main findings and suggests directions for future research.

2. Related Work

Various methodologies, including traditional statistical approaches [22,25], machine learning algorithms [8,16], and hybrid techniques [19], have been investigated to improve the accuracy and efficiency of anomaly detection in WSNs. These methods are often employed for monitoring changes and addressing physical security concerns [26]. However, a significant challenge arises from the misalignment between the configuration requirements of sensors and the financial implications of deploying such advanced technologies. Consequently, there is a pressing need to develop a cost-effective, lightweight solution capable of delivering high-precision outcomes in anomaly detection [27].

The Kalman filter model is considered a fast and lightweight model for anomaly detection, but its efficiency decreases when natural sensor readings fluctuate. For example, in the dataset we used, the Kalman filter model showed an average false positive rate of 15% under stable conditions, which rose to 30% when exposed to dynamic environmental changes, demonstrating its limitations in fluctuating conditions. Therefore, many studies have worked to improve the performance of this algorithm, as discussed below, but the problem remains the increased cost: the cost of the improved Kalman filter becomes similar to that of other techniques.

The Kalman filter model [22] is a fast and lightweight anomaly detection model. It works by predicting the state of the system and updating this prediction based on new measurements, assuming known and constant noise characteristics. However, its efficiency decreases with natural fluctuations in sensor readings. For example, when the noise characteristics of the sensor change, the fixed parameters in a traditional Kalman filter may not accurately represent the system, resulting in an increase in false positives and negatives. To improve it, ref. [28] combined Kalman filters with autoencoders (AE) to enhance the detection accuracy through machine learning-based optimization of system state estimation and feature representation. Although the Kalman AE demonstrates notable effectiveness in experiments, outperforming existing techniques in identifying anomalies across various datasets, no cost analysis is provided. Moreover, ref. [5] proposed a novel Time-Variant Local Autocorrelated Polynomial (TVLAP) model with Kalman Filtering (TVLAP-KF) for non-stationary time series analysis, addressing challenges like noise, outliers, and anomalies in sensing systems. This model enhances the signal processing capabilities, including denoising, outlier correction, and anomaly detection, by effectively modeling and predicting system states. Despite its advancements, practical implementation considerations such as model complexity and computational demands underscore the balance between accuracy and efficiency in real-world applications. In addition, ref. [29] developed an AKF-based condition-monitoring technique for induction motors, focusing on real-time signal processing capabilities. This innovative approach uses multiple AKFs for outlier and anomaly detection, leveraging vibration signal analysis to assess the motor’s condition. Despite its effectiveness in real-world applications, challenges include estimating random vibration signals and quantifying health status. In the same context, ref. [30] presented an Adaptive Kalman Filtering approach integrated with an Autoregressive (AR) model for improving Air Quality Index (AQI) prediction accuracy. This model efficiently processes and predicts AQI values by leveraging historical data collected via a WSN in Nanjing. The study also demonstrates that the hybrid KF-AR model surpasses traditional AR models in terms of forecasting performance, particularly for monthly AQI data, showcasing its potential for effective air quality monitoring and prediction.

Ref. [31] introduced a Bayesian filtering method for dynamic anomaly detection and tracking in maritime surveillance, leveraging a Bernoulli Random Finite Set for modeling unknown control inputs as binary switches. This advanced approach enables precise tracking and anomaly detection amid false alarms and missed detections. However, a significant limitation is the method’s complexity and computational demand, challenging its scalability and real-time application in extensive surveillance systems. Furthermore, ref. [32] proposed a framework for event detection in Wireless Body Area Networks (WBANs) using Kalman Filtering and Power Divergence. This approach aims to automatically detect physiological changes or faulty measurements from sensor data, distinguishing between genuine health emergencies and erroneous readings to reduce false alarms. Despite its high detection accuracy and low false alarm rate, a limitation is the computational complexity involved in real-time data analysis, which could challenge deployment on devices with limited processing capabilities. In contrast, KFPSO [33] combines the Kalman filter and particle swarm optimization (PSO) to dynamically adjust the filter parameters. This hybrid approach aims to improve the accuracy of state estimation and anomaly detection by tracking real-time noise variations. Despite its effectiveness, KFPSO introduces significant computational complexity due to the optimization process, which makes it less suitable for WSNs with limited resources.

The discussion emphasizes exploring different anomaly detection methodologies in WSNs, highlighting the balance between cost and efficiency [34]. It showcases the potential of the Kalman filter as a fast and lightweight model despite performance limitations amid sensor fluctuations. Therefore, our proposed OAKF framework offers several innovative contributions to address the existing challenges in WSN anomaly detection:

Unlike traditional Kalman filters with fixed parameters, OAKF dynamically adjusts for noise variations in the process and measurement in real time. This ensures a high filtering performance even under different environmental conditions and sensor noise characteristics.
The OAKF framework is specifically designed for real-time applications. It can quickly adapt to changing data streams, ensuring accurate and timely anomaly detection without significant computational costs.
The OAKF algorithm is optimized for use in resource-constrained sensor nodes, making it highly scalable and computationally efficient. It can be deployed in large-scale wireless sensor networks without compromising the detection accuracy or response time.

3. Proposed Model

The goal of the proposed Online AKF (OAKF) framework is to detect anomalies in real-time data streams from spatially distributed sensors. The streams are divided into equal-sized intervals containing $N$ continuous samples (sequence of sensor readings) for each sensor, as illustrated in Figure 1. For this task, we relied on a dataset released by the Intel Berkeley Research Laboratory, which included temperature measurements from 54 device sensors at one-minute time intervals [35].

Based on Figure 1, the OAKF framework operates by adapting two variables, R (measurement noise covariance) and Q (process noise covariance), to achieve high-performance anomaly detection without a significant increase in the computational cost. In this work, we define an anomaly as a large deviation from the expected state [36], where the innovation exceeds the maximum threshold, indicating a possible sensor anomaly. Moreover, the noise in the sensor measurements is modeled as Gaussian noise [37], assuming that both the measurement noise (with covariance R) and the process noise (with covariance Q) are zero-mean Gaussian white noise.

In OAKF, higher innovation values result in gradual increases in the Q and R, which helps the filter adapt to changing noise levels. Higher noise power increases uncertainty, which can lead to more false positives and negatives. Conversely, lower noise power results in more reliable measurements and stable operation, which enhances accuracy by reducing false alarms. OAKF dynamically adjusts Q and R based on innovations in real time, maintaining a high filtering performance and robust anomaly detection despite changing noise characteristics. This adaptability ensures the effective management of noise levels, improving the overall detection accuracy, as we discuss below. For further clarification, Table 1 summarizes the notation list used in this proposal.

The KF operates in two main steps: prediction and update. In the prediction step, the Predicted State Estimate is given by Equation (1):

$$\hat{x}_{k|k-1} = \hat{x}_{k-1|k-1} \tag{1}$$

Here, $x$ represents the sensor measurement, and $k$ represents the time step. This equation assumes a simple model where the next state is equal to the current estimate. The Predicted State Covariance is updated using:

$$P_{k|k-1} = P_{k-1|k-1} + Q \tag{2}$$

In this equation, $P$ is the estimate covariance, and $Q$ is the process noise covariance. This step updates the estimate’s uncertainty by adding the process noise, reflecting the system’s inherent unpredictability.

During the update step, the Kalman Gain ($K$) is calculated to determine the weight given to the new measurement versus the prediction, as illustrated in Equation (3).

$$K_{k} = \frac{P_{k|k-1}}{P_{k|k-1} + R} \tag{3}$$

Here, $R$ is the measurement noise covariance. The Updated State Estimate is computed as follows:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_{k}\,(z_{k} - \hat{x}_{k|k-1}) \tag{4}$$

where $z_{k}$ is the actual measurement. This equation corrects the predicted state using the measurement and the Kalman Gain. Finally, the Updated Estimate Covariance is calculated to reduce uncertainty, as shown in Equation (5).

$$P_{k|k} = (1 - K_{k})\,P_{k|k-1} \tag{5}$$
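Equations (1) to (5) can be sketched for the scalar case as follows. This is a minimal illustration and not the authors' code: the names `ScalarKalmanState`, `predict`, and `update` are hypothetical, and the numbers in the usage note are chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanState:
    """Hypothetical container for the scalar filter state."""
    x: float  # state estimate (x-hat)
    P: float  # estimate covariance


def predict(state: ScalarKalmanState, Q: float) -> ScalarKalmanState:
    """Prediction step: Equations (1) and (2)."""
    # Equation (1): the next state equals the current estimate.
    # Equation (2): uncertainty grows by the process noise covariance Q.
    return ScalarKalmanState(x=state.x, P=state.P + Q)


def update(pred: ScalarKalmanState, z: float, R: float):
    """Update step: Equations (3), (4), and (5). Returns (new_state, K)."""
    K = pred.P / (pred.P + R)        # Equation (3): Kalman Gain
    x = pred.x + K * (z - pred.x)    # Equation (4): correct with the innovation
    P = (1.0 - K) * pred.P           # Equation (5): reduced uncertainty
    return ScalarKalmanState(x=x, P=P), K
```

For instance, starting from an estimate of 20.0 with covariance 1.0, `predict(..., Q=1.0)` gives a predicted covariance of 2.0, and `update(..., z=23.0, R=2.0)` gives a gain of 0.5, so the estimate moves halfway toward the measurement (21.5) and the covariance drops to 1.0.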

To adaptively adjust the parameters ($Q$ and $R$), the OAKF framework uses the innovation sequence, which is the difference between the actual measurement and the predicted state $(z_{k} - \hat{x}_{k|k-1})$. The adjustments are made based on the following conditions:

$$\text{if } (z_{k} - \hat{x}_{k|k-1}) > \mathit{Threshold}: \quad \begin{cases} Q = \min(Q + \Delta Q,\ Q_{max}) \\ R = \min(R + \Delta R,\ R_{max}) \end{cases} \tag{6}$$

This adaptive mechanism ensures that the filtering parameters are dynamically adjusted in response to live data, enhancing the filter’s performance in detecting anomalies amidst sensor noise and environmental changes. The $\Delta Q$ and $\Delta R$ are small increments, and $Q_{max}$ and $R_{max}$ are the maximum allowed values for $Q$ and $R$, respectively.
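The adjustment rule in Equation (6) can be sketched as a small function. The function name `adapt_noise` and its argument names are hypothetical. The comparison follows Equation (6) as written and uses the signed innovation. Comparing the absolute value instead, as the anomaly test in Algorithm 1 does, would be a variant and is not what this sketch implements.

```python
def adapt_noise(Q, R, innovation, threshold, dQ, dR, Q_max, R_max):
    """Apply Equation (6): raise Q and R by small capped increments.

    Uses the signed innovation, exactly as Equation (6) is written.
    All parameter names here are illustrative assumptions.
    """
    if innovation > threshold:
        Q = min(Q + dQ, Q_max)  # capped process noise covariance
        R = min(R + dR, R_max)  # capped measurement noise covariance
    return Q, R
```

Because both updates are clipped with `min`, repeated large innovations cannot push Q or R beyond their maximum allowed values.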

The OAKF technique for real-time anomaly detection in sensor measurements is outlined in Algorithm 1.

Algorithm 1: OAKF

1. Start
2. Set $\hat{x}_{0|0}$ to the first sensor measurement or a known initial condition
3. Initialize $P_{0|0}$, $Q$, and $R$ based on prior knowledge or estimation
4. Define thresholds for anomaly detection and adaptive adjustment $(\Delta Q, \Delta R, Q_{max}, R_{max})$
5. Read sensor data periodically ($z_{k}$)
6. While $z_{k}$ do
7. Prediction:
7.1 Compute the Predicted State Estimate $\hat{x}_{k|k-1}$ using Equation (1)
7.2 Compute the Predicted State Covariance $P_{k|k-1}$ using Equation (2)
8. Update:
8.1 Find $K_{k}$
8.2 Update the state estimate with the new measurement $(\hat{x}_{k|k})$
8.3 Update the estimate covariance $(P_{k|k})$
9. Adaptive Adjustment:
9.1 Calculate the innovation $(\mathit{innovation} = z_{k} - \hat{x}_{k|k-1})$
9.2 Adjust $Q$ and $R$ based on Equation (6)
10. Anomaly Detection:
10.1 for each $k$
10.2 if $|\mathit{innovation}| > \mathit{predefined\_anomaly\_threshold}$
10.3 Flag$(z_{k})$ = −1
10.4 else
10.5 Flag$(z_{k})$ = 1
10.6 end if
10.7 end for
11. end while
12. threshold = std(window_innovations) * 2 // set threshold as twice the moving standard deviation
13. End
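The steps of Algorithm 1 can be put together in a short Python sketch. It is an illustration under stated assumptions and not the authors' implementation. The function name `oakf` is hypothetical, and so are the default values of `dQ`, `dR`, `Q_max`, and `R_max`, which are not given in the text. The defaults for `P0`, `Q`, `R`, `threshold`, and `window` are the initial values reported in Section 4.3. Refreshing the threshold inside the loop once per full window of innovations is one possible reading of step 12 and should be treated as an assumption.

```python
import statistics


def oakf(readings, P0=1e-4, Q=1e-5, R=1e-5, threshold=2.0, window=500,
         dQ=1e-6, dR=1e-6, Q_max=1e-3, R_max=1e-3):
    """Return one flag per reading: -1 for an anomaly, 1 otherwise.

    dQ, dR, Q_max, and R_max are hypothetical values for illustration.
    """
    readings = list(readings)
    if not readings:
        return []
    x, P = readings[0], P0           # step 2 and step 3: initialization
    flags, innovations = [], []
    for z in readings:
        # Step 7, prediction: Equations (1) and (2)
        x_pred = x
        P_pred = P + Q
        # Step 8, update: Equations (3), (4), and (5)
        K = P_pred / (P_pred + R)
        innovation = z - x_pred
        x = x_pred + K * innovation
        P = (1.0 - K) * P_pred
        # Step 9, adaptive adjustment: Equation (6), signed innovation
        if innovation > threshold:
            Q = min(Q + dQ, Q_max)
            R = min(R + dR, R_max)
        # Step 10, anomaly detection on the absolute innovation
        flags.append(-1 if abs(innovation) > threshold else 1)
        # Step 12 (assumed placement): refresh the threshold as twice the
        # standard deviation of the last full window of innovations
        innovations.append(innovation)
        if len(innovations) == window:
            new_threshold = 2.0 * statistics.pstdev(innovations)
            if new_threshold > 0.0:
                threshold = new_threshold
            innovations.clear()
    return flags
```

Calling `oakf([20.0] * 50 + [25.0] + [20.0] * 10)` flags the spike at index 50 with −1. Because the update step runs before detection, the estimate partly follows the spike, so in this sketch the reading immediately after a large spike may also be flagged.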

4. Results and Discussion

In this section, we will delve into the aspects of dataset collection, the incidence of deviations, and the corresponding remediation procedures.

4.1. Dataset Collection

Our study rigorously tested the new methodology using real-time data from WSNs deployed in the Intel Berkeley Research Laboratory (IBRL) [35]. We curated a dataset from 54 sensors, focusing on temperature measurements at one-minute intervals and a bandwidth of 12.4 kbps. The initial dataset comprises approximately 2.5 million records and includes attributes such as the date, time, node ID, epochs, temperature, humidity, light, voltage, and location of the nodes. Data cleansing was vital to remove inconsistencies, narrowing down from eight attributes to only temperature. This refinement involved filtering out anomalies and irregularities, such as instances where the temperature value reached 120 degrees. Moreover, we conducted a visual analysis by plotting the first 2500 samples of test data from sensor 4 for in-depth analysis, as presented in Figure 2.
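The cleansing step described above can be sketched as a simple filter. The record layout (dictionaries with `node_id` and `temperature` keys), the function name `select_temperature`, and the cutoff `max_valid=120.0` are assumptions made for illustration. The text only states that readings reaching 120 degrees were treated as irregular, and it does not describe the actual file format or cleaning code.

```python
def select_temperature(records, node_id, max_valid=120.0):
    """Keep only plausible temperature readings for one node.

    `records` is a hypothetical list of dicts; missing temperatures and
    values at or above `max_valid` are dropped.
    """
    return [
        r["temperature"]
        for r in records
        if r["node_id"] == node_id
        and r["temperature"] is not None
        and r["temperature"] < max_valid
    ]
```

With synthetic records for node 4 holding temperatures 21.3, 122.1, and a missing value, the function returns only `[21.3]`.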

4.2. Incidence of Deviations

To assess our method’s impact on detecting anomalies, we introduced controlled offsets ($\Upsilon$) to the data from a low-cost sensor (node no. 4), mimicking imprecision. These offsets, ranging from −3 to +3, were uniformly added to the node’s readings. The offsets were applied as described by Equation (7):

$$\mathit{Un} = \begin{cases} v_{i} + \Upsilon, & 0 < \Upsilon < 3 \\ v_{i} + \Upsilon, & -3 \leq \Upsilon < 0 \end{cases} \tag{7}$$

Figure 3 illustrates the results after applying Equation (7) to sequences of 40 to 60 readings from node 4.
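The offset injection of Equation (7) can be sketched as follows. The function name `inject_offsets`, the fixed seed, and the choice to draw an independent offset for every reading in the sequence are assumptions, since the text does not say whether one offset is shared across a sequence. Labels follow the convention used in the evaluation, where 1 marks an original reading and −1 a distorted one.

```python
import random


def inject_offsets(readings, start, length, low=-3.0, high=3.0, seed=0):
    """Add a uniformly drawn offset to `length` readings from `start`.

    Returns (distorted_readings, labels). Drawing one offset per reading
    is an assumption; an exact zero offset, which Equation (7) excludes,
    is possible in principle but extremely unlikely.
    """
    rng = random.Random(seed)
    distorted = list(readings)
    labels = [1] * len(distorted)
    for i in range(start, min(start + length, len(distorted))):
        distorted[i] += rng.uniform(low, high)
        labels[i] = -1
    return distorted, labels
```

For example, `inject_offsets(clean, start=30, length=50)` distorts a run of 50 readings, which falls within the sequence lengths of 40 to 60 readings mentioned above.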

4.3. Detection Procedure

In this section, we explore the impact of drift detection on the accuracy of WSN node readings. We assess the efficacy of our methodology through two key metrics: accuracy and time efficiency. Our strategy leverages unsupervised learning techniques. For accuracy measurement, labels are assigned to original readings devoid of any offsets, while readings with errors are marked to facilitate accuracy evaluation using a confusion matrix approach. The formula for accuracy is:

$$\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{True Positives} + \text{True Negatives} + \text{False Positives} + \text{False Negatives}} \tag{8}$$

The time consumption is computed as:

$$\text{Total Processing Time} = \mathit{total\_end\_time} - \mathit{total\_start\_time} \tag{9}$$
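Equations (8) and (9) can be sketched in Python as below. The helper names are hypothetical. An anomaly (label −1) is treated as the positive class, and the false positive rate is computed with its standard definition, FP / (FP + TN), which the text uses but does not spell out.

```python
import time


def confusion_counts(true_labels, predicted_flags, anomaly=-1):
    """Count TP, TN, FP, FN with `anomaly` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_flags):
        if pred == anomaly:
            if truth == anomaly:
                tp += 1
            else:
                fp += 1
        elif truth == anomaly:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def accuracy(true_labels, predicted_flags):
    """Equation (8)."""
    tp, tn, fp, fn = confusion_counts(true_labels, predicted_flags)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(true_labels, predicted_flags):
    """Standard FPR definition: FP / (FP + TN)."""
    _, tn, fp, _ = confusion_counts(true_labels, predicted_flags)
    return fp / (fp + tn) if (fp + tn) else 0.0


def timed(fn, *args, **kwargs):
    """Equation (9): return (result, end_time - start_time) in seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With true labels `[1, 1, 1, -1, -1]` and predicted flags `[1, 1, -1, -1, 1]`, the counts are TP = 1, TN = 2, FP = 1, and FN = 1, giving an accuracy of 0.6 and a false positive rate of 1/3.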

Our analysis was conducted through Python simulations on a machine with a 1.8 GHz Core i5 processor, 8 GB cache, and 12 GB RAM [11]. The initial parameter values are detailed in Table 2.

In the context of our analysis, it is critical to acknowledge the deliberate calibration of key parameters to refine our outcomes. This meticulous process involved adjusting variables such as the initial estimate covariance, the process noise covariance (Q), the measurement noise covariance (R), the threshold for detecting anomalies, and the window size for analyzing innovations. The chosen values are as follows: an initial estimate covariance of 1 × 10⁻⁴, a process noise covariance (Q) set at 1 × 10⁻⁵, a measurement noise covariance (R) also at 1 × 10⁻⁵, a threshold for detecting anomalies set at 2.0, and a window size for innovations set to 500. These values were chosen based on experimental tests, ensuring a balance between the sensitivity to distortions and the stability of the filter. As a result, these adjustments were useful in improving the performance of the system, ensuring more accurate results for anomaly detection and state estimation. Moreover, it is important to note that the values chosen for the optimization parameters are specific to the dataset used in this study.
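As a worked illustration, the first filter iteration can be computed from Equations (2), (3), and (5) with the initial values stated above ($P_{0|0} = 1 \times 10^{-4}$, $Q = R = 1 \times 10^{-5}$). The calculation assumes that $Q$ and $R$ have not yet been changed by the adaptive adjustment.

```latex
\begin{aligned}
P_{1|0} &= P_{0|0} + Q = 1\times10^{-4} + 1\times10^{-5} = 1.1\times10^{-4} \\
K_{1}   &= \frac{P_{1|0}}{P_{1|0} + R} = \frac{1.1\times10^{-4}}{1.2\times10^{-4}} \approx 0.917 \\
P_{1|1} &= (1 - K_{1})\,P_{1|0} \approx 0.0833 \times 1.1\times10^{-4} \approx 9.17\times10^{-6}
\end{aligned}
```

A gain close to 1 means that the first update relies mostly on the measurement, and the estimate covariance falls by roughly an order of magnitude after a single step.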

Additionally, the false positive rate (FPR) was employed to assess the accuracy calculation, as depicted in Figure 4. Real sensor readings were assigned a label of 1, while distorted readings were labeled as −1. Despite the inherent fluctuations and abrupt changes in the actual sensor readings (e.g., at 500 and 1940), the proposed OAKF model achieved an impressive accuracy (based on FPR) of 99.6% in aligning sensor readings (labeled) with predictions generated during the training process. The color red represents the FPR, while the color blue signifies the matching values between OAKF predictions and the actual sensor readings.

This accuracy is due to the algorithm’s ability to calibrate R and Q. The filter adapts its process noise covariance (Q) and measurement noise covariance (R) based on the magnitude of the innovations (the difference between the measured temperature and the estimated state from the previous time step). Moreover, surprisingly, the computational cost of these processes did not exceed 0.008 s, since the STD calculation is performed only after every 500 readings. On the other hand, there was no adjustment to the threshold because the level of data volatility was not outside the range of the initial value.

After generating the displacement and applying OAKF to the displaced readings, as shown in Figure 5, OAKF gave excellent outputs, with an accuracy of approximately 94.27% and an execution time of approximately 0.0080 s. The accuracy of the results is due to the ability to adapt Q and R.

The figure also shows the ability to adapt the threshold to changing readings. Moreover, the reason it costs so little is that the threshold adaptation is not applied continuously.

In analyzing the effect of the data size on the OAKF model’s accuracy and execution time, we determined the relationship, as shown in Figure 6.

The figure shows a decline in the accuracy from 95.4% for the dataset of 1250 readings to 94.3% for the dataset of 2500 readings, while the accuracy is similar across the remaining datasets. The reason is that most of the readings in the first dataset do not contain a drift; there is no drift until reading 700. After 2500 readings, the accuracy variance starts to decrease as the dataset grows. The initial decline and the subsequent stability can be explained by the fact that the first part of the data, up to 4000 readings, represents the stage of adaptation of the process noise covariance (Q) and measurement noise covariance (R). After this adaptation phase, the stability of the accuracy is due to the continuous adjustment of Q and R, which allows the OAKF framework to maintain a high filtering performance even as the dataset size increases. Moreover, the time cost increases steadily with the size of the dataset, from 0.001 s for the smallest dataset (1250) to 0.0197 s for the fourth dataset (8000). This time is very short compared to other methods, as discussed in the comparison that follows.

To compare the proposed model (OAKF) with other methods, we selected different techniques and recent studies. KFPSO [33], standard AKF [22], the Kalman AE [28], and DWT-K-means [6] were used in this analysis, as illustrated in Figure 7.

The OAKF model showcases a high performance in the domain of anomaly detection across datasets of varying sizes, indicating its robust adaptability and efficiency. Unlike standard Kalman filters, OAKF utilizes a dynamic adjustment mechanism for its process and measurement noise covariances (denoted as Q and R in the code), which are fine-tuned based on the discrepancies between predicted and actual measurements, termed innovations. The OAKF framework’s enhanced ability to adjust its parameters in real time to the evolving statistical properties of the dataset likely contributes to its consistently high accuracy, as it can better handle non-linear patterns and subtle anomalies that may be present in larger datasets. This adaptability is crucial for maintaining precision in state estimation and anomaly detection in complex systems, where data variability is common. Furthermore, the ability to adapt the anomaly detection threshold helped achieve this accuracy.

In addition, the Kalman AE also gave a high accuracy; however, OAKF’s approach may lead to faster and more responsive adjustments to the observed data, which could explain its higher accuracy in anomaly detection. The deep embedding optimization in the Kalman AE, while powerful, might be more suited to capturing complex patterns rather than quick adaptation, which can be crucial depending on the nature of the dataset and the anomalies present. Furthermore, OAKF might outperform KFPSO if its adaptation mechanism more closely aligns with the actual changes in the data, providing more accurate estimates and anomaly detection. Conversely, if the PSO in KFPSO is able to find a set of parameters that are near optimal and the data does not change too dramatically over time, KFPSO could also show a strong performance. The observed higher accuracy of OAKF suggests that for the given datasets, its adaptive mechanism may be more aligned with the data’s characteristics, leading to better performance compared to the potentially static optimization of KFPSO. Also, the higher accuracy of OAKF over DWT-K-means for anomaly detection in time series data can be attributed to its real-time adaptability, dynamic parameter optimization, and potentially lower computational complexity. While DWT-K-means provides a robust method for feature extraction and the identification of anomalies at different scales, its static nature and computational demands may limit its effectiveness and efficiency compared to the more adaptive and streamlined approach of OAKF.

OAKF exhibits a notable improvement in accuracy over other methods, with gains of 2.325% over AKF, 1.245% over the Kalman AE, a significant 8.278% over DWT-K-means, and 1.483% over KFPSO, underscoring its effectiveness in anomaly detection across varying dataset sizes.

The process cost is very important in wireless sensor networks because the sensors work with a microprocessor, as we explained previously. Figure 8 presents a clear trade-off between the computational time and complexity of the methods.

Based on the provided time consumption comparison figure, AKF appears to have the lowest time consumption, suggesting efficient computation particularly with smaller datasets. As the dataset size increases, OAKF maintains a competitive time efficiency, outperforming the more computationally intensive methods such as DWT-K-means and KFPSO. This indicates that while OAKF achieves a high accuracy, it does not do so at the cost of excessive computational demands, striking a balance between performance and efficiency. However, the Kalman AE is not included in Figure 8 because it required about 39.16 s during the training process.

The average time consumption data reveal that OAKF is more time consuming than the KF, taking approximately 700% more time on average. Despite this, OAKF is significantly more time efficient than both DWT-K-means and KFPSO, which take approximately 7737.9% and 24,583.5% more time than OAKF, respectively. Thus, although OAKF does not have the lowest time consumption, it outperforms the more complex methods that take considerably longer to execute, offering a middle ground in terms of computational efficiency.
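The "percent more time" figures follow the usual relative-difference formula, (t_other − t_ref) / t_ref × 100. As a rough check, the sketch below applies it to the rounded time costs listed in Table 3; the function name `percent_more` is only illustrative. With these rounded values, the OAKF-versus-KF overhead comes out at about 700%, while DWT-K-means and KFPSO come out at about 7400% and 24,900% relative to OAKF. These are close to, but not identical to, the averages reported above, presumably because the reported figures were computed from unrounded measurements.

```python
def percent_more(t_other: float, t_ref: float) -> float:
    """Return how much longer t_other is than t_ref, as a percentage of t_ref."""
    return (t_other - t_ref) / t_ref * 100.0


# Rounded time costs in seconds, copied from Table 3.
time_cost = {"KF": 0.001, "OAKF": 0.008, "DWT-K-means": 0.6, "KFPSO": 2.0}

oakf_vs_kf = percent_more(time_cost["OAKF"], time_cost["KF"])
dwt_vs_oakf = percent_more(time_cost["DWT-K-means"], time_cost["OAKF"])
kfpso_vs_oakf = percent_more(time_cost["KFPSO"], time_cost["OAKF"])

for label, value in [("OAKF vs KF", oakf_vs_kf),
                     ("DWT-K-means vs OAKF", dwt_vs_oakf),
                     ("KFPSO vs OAKF", kfpso_vs_oakf)]:
    print(f"{label}: {value:.1f}% more time")
```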

Table 3 summarizes the comparative analysis of the anomaly detection techniques.

The proposed OAKF framework demonstrated a balance between high accuracy (95.4%) and a low false positive rate (3.0%), with a minimum processing time of 0.008 s per sample. This efficiency is achieved by dynamically adjusting the filtering parameters based on the innovation sequence, ensuring robustness against varying noise characteristics and environmental changes. Unlike traditional methods that rely on fixed parameters or complex optimization processes, OAKF continuously adapts to the evolving statistical properties of the data.
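To make the idea of innovation-driven adaptation concrete, the following is a minimal sketch of a scalar adaptive Kalman filter, not the authors' implementation. It assumes a random-walk state model, seeds the filter with the initial values from Table 2, and names `delta_q`, `delta_r`, and `q_max` after the symbols in Table 1. The step sizes, the caps (including `r_max`), the smoothing factor `alpha`, the multiplier `k_sigma`, and the specific adaptation rules are all hypothetical choices made for illustration.

```python
import math


def detect_anomalies(readings, p0=1e-4, q0=1e-5, r0=1e-5, threshold0=2.0,
                     delta_q=1e-6, delta_r=1e-6, q_max=1e-2, r_max=1e-2,
                     alpha=0.05, k_sigma=3.0):
    """Scalar adaptive Kalman filter sketch; returns one anomaly flag per reading."""
    x = readings[0]      # state estimate, seeded with the first reading
    p, q, r = p0, q0, r0
    threshold = threshold0
    innov_var = 0.0      # exponentially weighted mean of squared innovations
    flags = [False]      # the seed reading is never flagged

    for z in readings[1:]:
        # Predict: random-walk model (assumption), so the prediction is the last estimate.
        p_pred = p + q
        innovation = z - x
        s = p_pred + r   # innovation variance expected by the filter

        if abs(innovation) > threshold:
            # Flag the reading and skip the update so the outlier does not
            # pull the estimate; the covariance keeps its predicted value.
            flags.append(True)
            p = p_pred
            continue
        flags.append(False)

        # Update with the standard scalar Kalman gain.
        k = p_pred / s
        x = x + k * innovation
        p = (1.0 - k) * p_pred

        # Adapt Q and R for the next step (hypothetical rule): grow them when
        # the innovation is larger than expected, otherwise shrink them back
        # toward their initial values.
        if innovation ** 2 > s:
            q = min(q + delta_q, q_max)
            r = min(r + delta_r, r_max)
        else:
            q = max(q - delta_q, q0)
            r = max(r - delta_r, r0)

        # Adapt the threshold (hypothetical rule): never below the initial
        # value, but widened when recent innovations are consistently large.
        innov_var = (1.0 - alpha) * innov_var + alpha * innovation ** 2
        threshold = max(threshold0, k_sigma * math.sqrt(innov_var))

    return flags


# Synthetic readings with a single spike at index 5.
readings = [20.0, 20.1, 19.9, 20.0, 20.1, 30.0, 20.0, 19.9]
flags = detect_anomalies(readings)
```

In this synthetic example only the spike is flagged. Because the update is skipped for a flagged reading, the estimate stays near the normal level and the readings that follow the spike are not flagged. Each step costs a handful of scalar operations, which is in the spirit of the low per-sample cost reported above.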

5. Conclusions and Future Work

This work addresses the pivotal challenge of real-time anomaly detection in WSNs by proposing a new framework, called Online Adaptive Kalman Filtering (OAKF). OAKF stands out for its dynamic parameter adjustment, which enables it to handle the intrinsic noise and variability of low-cost sensor data, together with an adaptive anomaly detection threshold. This not only improves the accuracy of anomaly detection but does so while preserving computational efficiency, a critical requirement for deployment in resource-constrained sensor nodes. The OAKF framework demonstrated a balance between high accuracy (95.4%) and a low FPR (3.0%), with a minimum processing time of 0.008 s per sample.

The comparative analysis demonstrates OAKF’s better accuracy against established methods such as AKF, the Kalman AE, DWT-K-means, and KFPSO. Notably, it offers significant improvements, especially over DWT-K-means, while maintaining competitive time efficiency even as dataset sizes increase. This is particularly important given the limited computational capabilities of typical WSN nodes. OAKF improved accuracy by 2.325% over AKF, 1.245% over the Kalman AE, 8.278% over DWT-K-means, and 1.483% over KFPSO. Moreover, OAKF is significantly more time efficient than both DWT-K-means and KFPSO, which take about 7737.9% and 24,583.5% longer than OAKF, respectively.

In future work, we plan to conduct an analytical study of the impact of OAKF on different types and datasets.

Author Contributions

Conceptualization, R.A. and E.H.A.; methodology, R.A.; software, R.A.; validation, R.A. and E.H.A.; formal analysis, R.A.; investigation, E.H.A.; resources, E.H.A.; data curation, R.A.; writing—original draft preparation, E.H.A.; writing—review and editing, E.H.A.; visualization, E.H.A.; supervision, E.H.A.; project administration, R.A.; funding acquisition, E.H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Taif University, Taif, Saudi Arabia (TU-DSPP-2024-113).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable. Existing data from the Intel Berkeley Research Lab (IBRL) were used.

Acknowledgments

The authors extend their appreciation to Taif University, Saudi Arabia, for supporting this work through project number (TU-DSPP-2024-113).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ahmad, R.; Hämäläinen, M.; Wazirali, R.; Abu-Ain, T. Digital-Care in next Generation Networks: Requirements and Future Directions. Comput. Netw. 2023, 224, 109599.
2. Alhasan, W.; Ahmad, R.; Wazirali, R.; Aleisa, N.; Abo Shdeed, W. Adaptive Mean Center of Mass Particle Swarm Optimizer for Auto-Localization in 3D Wireless Sensor Networks. J. King Saud. Univ. Comput. Inf. Sci. 2023, 35, 101782.
3. Ahmad, R.; Rinner, B.; Wazirali, R.; Abujayyab, S.K.M.; Almajalid, R. Two-Level Sensor Self-Calibration Based on Interpolation and Autoregression for Low-Cost Wireless Sensor Networks. IEEE Sens. J. 2023, 23, 25242–25253.
4. Jhin, S.Y.; Lee, J.; Park, N. Precursor-of-Anomaly Detection for Irregular Time Series. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; ACM: New York, NY, USA, 2023; pp. 917–929.
5. Wang, S.; Li, C.; Lim, A. A Model for Non-Stationary Time Series and Its Applications in Filtering and Anomaly Detection. IEEE Trans. Instrum. Meas. 2021, 70, 6502911.
6. Le, K.N.T.; Dang, T.B.; Le, D.T.; Raza, S.M.; Kim, M.; Choo, H. VEAD: Variance Profile Exploitation for Anomaly Detection in Real-Time IoT Data Streaming. Internet Things 2024, 25, 100994.
7. Gu, J.; Peng, Y.; Lu, H.; Chang, X.; Chen, G. A Novel Fault Diagnosis Method of Rotating Machinery via VMD, CWT and Improved CNN. Measurement 2022, 200, 111635.
8. Blázquez-García, A.; Conde, A.; Mori, U.; Lozano, J.A. A Review on Outlier/Anomaly Detection in Time Series Data. ACM Comput. Surv. 2021, 54, 1–33.
9. Tama, B.A.; Comuzzi, M.; Rhee, K.H. TSE-IDS: A Two-Stage Classifier Ensemble for Intelligent Anomaly-Based Intrusion Detection System. IEEE Access 2019, 7, 94497–94507.
10. Ahmad, R.; Sundararajan, E.A.; Abu-Ain, T. Analysis the Effect of Clustering and Lightweight Encryption Approaches on WSNs Lifetime. In Proceedings of the 2021 International Conference on Electrical Engineering and Informatics (ICEEI), Kuala Terengganu, Malaysia, 12–13 October 2021; IEEE: Selangor, Malaysia, 2021; pp. 1–6.
11. Ahmad, R.; Wazirali, R.; Abu-Ain, T.; Almohamad, T.A. Adaptive Trust-Based Framework for Securing and Reducing Cost in Low-Cost 6LoWPAN Wireless Sensor Networks. Appl. Sci. 2022, 12, 8605.
12. Ashrif, F.F.; Sundararajana, A.E.; Hasan, M.K.; Ahmad, R.; Hashim, A.-H.A.; Abu Talib, A. Provably Secured and Lightweight Authenticated Encryption Protocol in Machine-to-Machine Communication in Industry 4.0. Comput. Commun. 2024, 218, 263–275.
13. Cauteruccio, F.; Cinelli, L.; Corradini, E.; Terracina, G.; Ursino, D.; Virgili, L.; Savaglio, C.; Liotta, A.; Fortino, G. A Framework for Anomaly Detection and Classification in Multiple IoT Scenarios. Future Gener. Comput. Syst. 2021, 114, 322–335.
14. Ashrif, F.F.; Sundararajan, E.A.; Ahmad, R.; Hasan, M.K.; Yadegaridehkordi, E. Survey on the Authentication and Key Agreement of 6LoWPAN: Open Issues and Future Direction. J. Netw. Comput. Appl. 2024, 221, 103759.
15. Zakrzewski, R.; Martin, T.; Oikonomou, G. Anomaly Detection in Logical Sub-Views of WSNs. In Proceedings of the IEEE Symposium on Computers and Communications, Rhodes, Greece, 30 June–3 July 2022; IEEE: New York, NY, USA, 2022.
16. Aboah Boateng, E.; Bruce, J.W.; Talbert, D.A. Anomaly Detection for a Water Treatment System Based on One-Class Neural Network. IEEE Access 2022, 10, 115179–115191.
17. Mare, D.S.; Moreira, F.; Rossi, R. Nonstationary Z-Score Measures. Eur. J. Oper. Res. 2017, 260, 348–358.
18. Golgowski, M.; Osowski, S. Anomaly Detection in ECG Using Wavelet Transformation. In Proceedings of the 2020 IEEE 21st International Conference on Computational Problems of Electrical Engineering (CPEE), Online, 16–19 September 2020; IEEE: New York, NY, USA; pp. 1–4.
19. Wang, L.; Zhang, X. Anomaly Detection for Automated Vehicles Integrating Continuous Wavelet Transform and Convolutional Neural Network. Appl. Sci. 2023, 13, 5525.
20. Gou, L.; Li, H.; Zheng, H.; Li, H.; Pei, X. Aeroengine Control System Sensor Fault Diagnosis Based on CWT and CNN. Math. Probl. Eng. 2020, 2020, 5357146.
21. Ping, L.; Chun-Guang, Z.; Xu, Z. Improved Support Vector Clustering. Eng. Appl. Artif. Intell. 2010, 23, 552–559.
22. Knorn, F.; Leith, D.J. Adaptive Kalman Filtering for Anomaly Detection in Software Appliances. In Proceedings of the IEEE INFOCOM 2008—IEEE Conference on Computer Communications Workshops, Phoenix, AZ, USA, 13–18 April 2008; IEEE: New York, NY, USA, 2008; pp. 1–6.
23. Singh, R.; Mehra, R.; Sharma, L. Design of Kalman Filter for Wireless Sensor Network. In Proceedings of the 2016 International Conference on Internet of Things and Applications (IOTA), Pune, India, 22–24 January 2016; IEEE: New York, NY, USA, 2016; pp. 63–67.
24. Kumar, D.; Rajasegarar, S.; Palaniswami, M. Automatic Sensor Drift Detection and Correction Using Spatial Kriging and Kalman Filtering. In Proceedings of the 2013 IEEE International Conference on Distributed Computing in Sensor Systems, Cambridge, MA, USA, 20–23 May 2013; pp. 183–190.
25. Moustafa, N.; Slay, J. The Evaluation of Network Anomaly Detection Systems: Statistical Analysis of the UNSW-NB15 Data Set and the Comparison with the KDD99 Data Set. Inf. Secur. J. Glob. Perspect. 2016, 25, 18–31.
26. Beg, O.A.; Nguyen, L.V.; Johnson, T.T.; Davoudi, A. Cyber-Physical Anomaly Detection in Microgrids Using Time-Frequency Logic Formalism. IEEE Access 2021, 9, 20012–20021.
27. Oreilly, C.; Gluhak, A.; Imran, M.A.; Rajasegarar, S. Anomaly Detection in Wireless Sensor Networks in a Non-Stationary Environment. IEEE Commun. Surv. Tutor. 2014, 16, 1413–1432.
28. Huang, X.; Zhang, F.; Wang, R.; Lin, X.; Liu, H.; Fan, H. KalmanAE: Deep Embedding Optimized Kalman Filter for Time Series Anomaly Detection. IEEE Trans. Instrum. Meas. 2023, 72, 3537211.
29. Kim, J.; Song, M.; Kim, D.; Lee, D. An Adaptive Kalman Filter-Based Condition-Monitoring Technique for Induction Motors. IEEE Access 2023, 11, 46373–46381.
30. Chen, J.; Chen, K.; Ding, C.; Wang, G.; Liu, Q.; Liu, X. An Adaptive Kalman Filtering Approach to Sensing and Predicting Air Quality Index Values. IEEE Access 2020, 8, 4265–4272.
31. Forti, N.; Millefiori, L.M.; Braca, P.; Willett, P. Bayesian Filtering for Dynamic Anomaly Detection and Tracking. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 1528–1544.
32. Salem, O.; Serhrouchni, A.; Mehaoua, A.; Boutaba, R. Event Detection in Wireless Body Area Networks Using Kalman Filter and Power Divergence. IEEE Trans. Netw. Serv. Manag. 2018, 15, 1018–1034.
33. Maghfiroh, H.; Nizam, M.; Anwar, M.; Ma’Arif, A. Improved LQR Control Using PSO Optimization and Kalman Filter Estimator. IEEE Access 2022, 10, 18330–18337.
34. Martí, L.; Sanchez-Pi, N.; Molina, J.M.; Garcia, A.C.B. Anomaly Detection Based on Sensor Data in Petroleum Industry Applications. Sensors 2015, 15, 2774–2797.
35. Intel Berkeley Research Lab. Intel Lab Data. Available online: http://db.csail.mit.edu/labdata/labdata.html (accessed on 12 April 2022).
36. Yu, X.; Yang, X.; Tan, Q.; Shan, C.; Lv, Z. An Edge Computing Based Anomaly Detection Method in IoT Industrial Sustainability. Appl. Soft Comput. 2022, 128, 109486.
37. Wu, D.; Jiang, Z.; Xie, X.; Wei, X.; Yu, W.; Li, R. LSTM Learning with Bayesian and Gaussian Processing for Anomaly Detection in Industrial IoT. IEEE Trans. Ind. Inf. 2020, 16, 5244–5253.

Figure 1. The difference between the traditional AKF and the proposed strengthening of the AKF.

Figure 2. Sample of sensor 4 reading measurements.

Figure 3. Sample of sensor 4 drifted measurements.

Figure 4. FPR (anomalies) post-OAKF.

Figure 5. OAKF anomaly for 4000 drifted readings.

Figure 6. OAKF’s accuracy and execution time vs. dataset size.

Figure 7. Accuracies of different methods.

Figure 8. Time costs for different methods.

Table 1. Notation list.

Symbol	Description
$x$	Sensor measurement
$k$	Time step
$P$	Estimate covariance
$Q$	Process noise covariance
$K$	Kalman gain
$z$	Actual measurement
$\Delta Q$	Increment for process noise covariance
$\Delta R$	Increment for measurement noise covariance
$Q_{\max}$	Maximum value for process noise covariance
$\mathit{innovation}$	Difference between the actual measurement and the predicted estimate
$\mathit{threshold}$	Predefined anomaly detection threshold

Table 2. Initial parameter values of OAKF.

Variable	Value
initial_estimate_covariance	1 × 10⁻⁴
initial_Q	1 × 10⁻⁵
initial_R	1 × 10⁻⁵
initial_threshold_detect_anomalies	2.0

Table 3. Comparative analysis of anomaly detection techniques.

Method	Accuracy (%)	FPR (%)	Time Cost (s)
Traditional KF [22]	91.0	22.0	0.001
KFPSO [33]	92.0	15.0	2.0
DWT-K-means [6]	85.0	30.0	0.6
Kalman AE [28]	95.2	4.0	39.1 (training)
OAKF (proposed)	95.4	3.0	0.008

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
