Identifying Tampered Radio-Frequency Transmissions in LoRa Networks Using Machine Learning
Abstract
1. Introduction
- We have generated a comprehensive dataset consisting of benign samples and tampered signals on LoRa transmissions. The dataset captures both normal and manipulated frequency signals, offering a diverse collection of real-world data for thorough analysis.
- We provide a machine-learning-based anomaly detection system for LoRa networks. To identify tampered radio frequencies in LoRa-based IoT systems, our method combines several algorithms: Autoencoder, Local Outlier Factor (LOF), Principal Component Analysis (PCA), Variational Autoencoder, and Isolation Forest. Several challenges have been addressed in applying these algorithms to identify tampered signals on LoRa transmissions.
- Extensive experiments were conducted using real datasets. The Local Outlier Factor (LOF) algorithm achieved the highest detection accuracy at 97.78%, closely followed by the Variational Autoencoder, Principal Component Analysis, and Autoencoder, each at 97.27%, while the Isolation Forest performed worse at 84.49%.
2. Related Works
3. System Architecture
Hardware Components
- The MKRWAN1310 devices serve as both the transmitter and receiver units in the LoRa network, playing a central role in data collection and anomaly detection. Figure 2 demonstrates the setup that we use. Each MKRWAN1310 device operates at a frequency of 915 MHz with a bandwidth of 500 kHz and a spreading factor of 5, ensuring effective and reliable communication within the LoRa network. The devices are equipped with an Atmel ATSAMD21 microcontroller featuring a 32-bit ARM Cortex-M0+ core, 256 KB of flash memory, and 32 KB of SRAM. They offer various interfaces, including UART, SPI, and I2C, and operate within a voltage range of 3.3 V to 5 V. Integrated with the Semtech SX1276 LoRa transceiver, the MKRWAN1310 supports long-range communication with an adaptive data rate and low power consumption.
- The HackRF devices are employed to capture and analyze the radio frequency spectrum of the LoRa transmissions. Each HackRF device covers a wide frequency range from 1 MHz to 6 GHz and supports a maximum sample rate of 20 MS/s with an 8-bit ADC resolution. The devices feature a maximum instantaneous bandwidth of 10 MHz, are connected via USB 2.0, and are powered by a 5 V supply. The direct conversion architecture of the HackRF, combined with its broad frequency range, enables detailed signal analysis and effective detection of anomalies. Figure 3 shows the HackRF device used in this work.
- Shielding and Isolation: HackRF devices are enclosed in conductive materials to block external RF signals and minimize ambient noise from nearby electronics.
- Calibration: A meticulous calibration process adjusts parameters like gain and frequency offsets to align the device with the expected signal characteristics, ensuring accurate reception and transmission.
- Frequency Optimization: The system selects optimal frequency ranges to reduce interference from other wireless systems, focusing on the LoRa signals of interest.
- High-Quality Filters: Bandpass filters are used to allow only the desired frequency range to pass, effectively isolating LoRa signals from unwanted background noise.
4. Proposed Methodology
- Gathering of Dataset and Preparation: The work starts by gathering a large dataset of frequency images produced from radio-frequency transmissions in a LoRa-based IoT network. The dataset includes both normal and anomalous transmissions, providing a diverse representation of signal behaviors for training and evaluating the models. After collection, several data preparation steps were applied to ensure the dataset’s integrity and suitability for machine learning. First, each image was resized to a standardized dimension so that all inputs conform to the dimensions the models expect and can be processed efficiently. Next, pixel values were normalized to the range [0, 1], which helps the training algorithms converge. The dataset was then labeled, distinguishing normal from anomalous signals based on their characteristics. This labeling supported supervised learning techniques and gave a structured basis for comparing the performance of the different anomaly detection algorithms. Finally, data augmentation was applied, including rotation, flipping, and the addition of random noise, to increase the dataset’s diversity and robustness. By simulating variations in the signal data, these augmentations help the models generalize and detect anomalies in real-world conditions.
- A variety of machine learning models are utilized in the anomaly detection process. Each model is trained independently on the dataset to learn the features of typical data and to identify deviations from the norm (anomalies). Figure 4 shows a flowchart of the image-based anomaly detection methodology. The models used in this research are:
- Local Outlier Factor (LOF): A density-based technique that analyzes a data point’s local density with respect to its neighbors to find anomalies.
- Autoencoder: A neural network model trained to reconstruct its input data. A large reconstruction error indicates an anomaly.
- Isolation Forest: A tree-based method that isolates data points by randomly selecting features and split values; outliers are identified because they are separated in fewer splits.
- Principal Component Analysis (PCA): A dimensionality reduction method that transforms large, multidimensional data into a smaller set of uncorrelated variables, known as principal components, that together account for most of the variance in the data. By retaining the key patterns and traits, it simplifies datasets and makes them easier to analyze.
- Variational Autoencoder: A Variational Autoencoder is a generative model that combines neural networks with probabilistic graphical models, allowing it to learn a latent representation of input data while generating new data samples from that representation.
- Evaluation: After training, the models are applied to the dataset. The output of each model indicates whether a transmission is anomalous or not. The models’ performance is then assessed using metrics including precision, recall, and overall accuracy. This evaluation helps determine the best model for identifying anomalies in LoRa transmissions.
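The preparation steps described in the first item above (resizing, normalization to [0, 1], and augmentation by flipping, rotation, and noise) can be sketched as follows. This is a minimal illustration under assumptions, not the pipeline used in this work: the 64 × 64 target size, the nearest-neighbour resizing, the noise level, and the helper names `resize_nearest`, `normalize`, and `augment` are all hypothetical, and a random array stands in for a real frequency image.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize a 2-D grayscale image with nearest-neighbour sampling."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols[None, :]]

def normalize(img):
    """Scale 8-bit pixel values into the range [0, 1]."""
    return img.astype(np.float32) / 255.0

def augment(img, rng, noise_std=0.02):
    """Return a randomly flipped, rotated, and noise-perturbed copy."""
    out = img
    if rng.random() < 0.5:
        out = np.fliplr(out)                       # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # rotate by a multiple of 90 degrees
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0).astype(np.float32)

rng = np.random.default_rng(0)
# Stand-in for one captured frequency image (assumed 8-bit grayscale).
raw = rng.integers(0, 256, size=(120, 200), dtype=np.uint8)

img = normalize(resize_nearest(raw, 64, 64))  # assumed target size
aug = augment(img, rng)
print(img.shape, aug.shape)
```

Rotations are restricted to multiples of 90 degrees here so that the image shape is preserved without interpolation; arbitrary angles would need an image library.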
Algorithm 1. Pseudocode for Multi-Model Anomaly Detection
4.1. Dataset
4.2. System of Tampered Frequency Anomalies Using Machine Learning Techniques
4.2.1. Autoencoder-Based Tampered RF Transmission Detection
4.2.2. LOF-Based Tampered RF Transmission Detection
4.2.3. LOF for Image Anomaly Detection
- Convolutional Neural Networks (CNNs): Pre-trained models such as VGG16 and ResNet, or custom-designed CNNs, can be used to extract deep features from images.
- Traditional Methods: Techniques like Scale-Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG), or even raw pixel values can be used as features.
- Distance Metric: Choose an appropriate distance metric (e.g., Euclidean distance) for computing the distances between feature vectors.
- Selecting k: The choice of k (number of neighbors) is critical and can be determined through cross-validation or domain knowledge.
- Anomaly Scoring: The LOF scores for each image are calculated. Images with LOF scores significantly higher than 1 are considered anomalies.
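The scoring procedure outlined above can be illustrated with a small NumPy sketch that computes LOF scores from Euclidean distances between feature vectors. It is written out step by step so that the k-distance, reachability distance, and local reachability density are visible; it is not the implementation used in this work. The function name `lof_scores`, the choice k = 10, and the synthetic 8 × 8 "images" (raw pixel values used as features) are assumptions made for illustration.

```python
import numpy as np

def lof_scores(X, k):
    """Local Outlier Factor of every row of X, computed against the other rows."""
    # Pairwise Euclidean distances; a point is never its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)

    nn_idx = np.argsort(dist, axis=1)[:, :k]             # indices of the k nearest neighbours
    nn_dist = np.take_along_axis(dist, nn_idx, axis=1)   # distances to those neighbours
    k_dist = nn_dist[:, -1]                              # k-distance of each point

    # reach_dist(p, o) = max(k_dist(o), d(p, o)) for each neighbour o of p
    reach = np.maximum(k_dist[nn_idx], nn_dist)
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)             # local reachability density

    # LOF(p) = mean lrd of p's neighbours divided by lrd of p
    return lrd[nn_idx].mean(axis=1) / lrd

rng = np.random.default_rng(0)
# 40 synthetic "normal" images plus one with an added bright band (hypothetical tampering).
normal = 0.5 + 0.05 * rng.standard_normal((40, 8, 8))
tampered = 0.5 + 0.05 * rng.standard_normal((1, 8, 8))
tampered[:, 2:4, :] += 0.5

features = np.vstack([normal, tampered]).reshape(41, -1)  # raw pixels as features
scores = lof_scores(features, k=10)
print("most anomalous index:", int(np.argmax(scores)))
```

Scores close to 1 indicate a point whose local density matches that of its neighbours; the threshold above which an image is declared anomalous remains a tuning choice.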
4.2.4. Isolation Forest-Based Tampered RF Transmission
- Data Preparation: Images are preprocessed to normalize pixel values and standardize their size. This step is required to transform high-dimensional image data into a format suitable for anomaly detection.
- Feature Extraction: Images are converted into one-dimensional vectors to enable the Isolation Forest algorithm’s application. This conversion allows the algorithm to process the image data as a high-dimensional feature matrix.
- Model Training and Evaluation: The Isolation Forest model is trained on a dataset of normal images. After training, it is used to find anomalies in the validation and anomalous image datasets.
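The three steps above can be sketched with scikit-learn's `IsolationForest`. This is an illustrative example under assumptions rather than the configuration used in this work: the 16 × 16 image size, the number of trees, the synthetic data, and the helper `fake_images` are hypothetical, and the tampering is simulated as a band of added intensity.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

def fake_images(n, tampered=False):
    """Synthetic stand-ins for normalized frequency images (values in [0, 1])."""
    imgs = 0.5 + 0.05 * rng.standard_normal((n, 16, 16))
    if tampered:
        imgs[:, 4:12, :] += 0.4  # hypothetical band of injected energy
    return np.clip(imgs, 0.0, 1.0)

# Feature extraction: flatten each image into a one-dimensional vector.
train = fake_images(200).reshape(200, -1)
val_normal = fake_images(50).reshape(50, -1)
val_tampered = fake_images(50, tampered=True).reshape(50, -1)

# Train on normal images only, then score the held-out sets.
forest = IsolationForest(n_estimators=100, random_state=0).fit(train)
score_normal = forest.decision_function(val_normal)      # lower = more anomalous
score_tampered = forest.decision_function(val_tampered)
pred = forest.predict(np.vstack([val_normal, val_tampered]))  # +1 normal, -1 anomaly

print("mean score, normal:  ", score_normal.mean())
print("mean score, tampered:", score_tampered.mean())
```

Because `predict` applies a fixed offset to the scores, it may be preferable in practice to choose a threshold on `decision_function` from validation data.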
4.2.5. Principal Component Analysis (PCA)
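Section 4 describes PCA as reducing the data to a small set of principal components that capture most of the variance. One common way to turn that projection into an anomaly detector, assumed here for illustration, is to score each sample by its reconstruction error after projecting onto the retained components. In the sketch below the number of components, the 99th-percentile threshold, and the synthetic data are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
basis = rng.standard_normal((3, 64))  # synthetic low-dimensional structure of "normal" data

def normal_vectors(n):
    """Synthetic normal feature vectors lying close to a 3-D subspace."""
    return rng.standard_normal((n, 3)) @ basis + 0.05 * rng.standard_normal((n, 64))

# Fit PCA on normal data only: center, then take the top right-singular vectors.
train = normal_vectors(200)
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:3]  # assumed number of retained components

def reconstruction_error(X):
    """Distance between each sample and its projection onto the PCA subspace."""
    centered = X - mean
    projected = centered @ components.T @ components
    return np.linalg.norm(centered - projected, axis=1)

# Threshold taken from the training errors (assumed 99th percentile).
threshold = np.percentile(reconstruction_error(train), 99)

test_normal = normal_vectors(20)
test_tampered = normal_vectors(20) + rng.standard_normal((20, 64))  # energy off the subspace
flags = reconstruction_error(np.vstack([test_normal, test_tampered])) > threshold
print("flagged normal:", int(flags[:20].sum()), "flagged tampered:", int(flags[20:].sum()))
```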
4.2.6. Variational Autoencoder (VAE)
5. Experiment and Results
5.1. Experiment Setup
- Signal Acquisition: We configured the MKRWAN1310 transmitter to operate at 915 MHz with a bandwidth of 125 kHz. The transmitter sent a series of packets, including both typical and manipulated signals, to simulate anomalies.
- Data Collection: One HackRF SDR was used to record the transmitted signals. The recordings were then split into individual frames, yielding a dataset of normal and anomalous frames.
- Dataset Composition: We created two separate datasets: one for normal transmission signals and one for manipulated or anomalous signals. The normal dataset comprised typical transmission signals reflecting standard operating conditions. The anomalous dataset was designed to include intentional disturbances and alterations that simulate various potential anomalies. These malicious signals were generated by a HackRF.
- Data Preparation: The collected frames were processed and sorted into folders labeled as normal or anomalous. These folders form the comprehensive dataset used for detailed analysis and testing of the anomaly detection methods.
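The step of splitting a recording into frames can be sketched as below. The exact framing and image-generation procedure is not specified above, so this example makes several assumptions: the recording is a raw file of interleaved signed 8-bit I/Q samples, each frame is a fixed 4096 samples long, and each frame is rendered as a 64 × 64 magnitude spectrogram. The helper names are hypothetical, and random data stands in for a real capture.

```python
import numpy as np

def iq_from_int8(raw):
    """Convert interleaved signed 8-bit I/Q values to complex samples."""
    pairs = raw.astype(np.float32).reshape(-1, 2)
    return (pairs[:, 0] + 1j * pairs[:, 1]) / 128.0

def split_frames(samples, frame_len):
    """Cut a recording into equal-length frames, dropping any remainder."""
    n = len(samples) // frame_len
    return samples[: n * frame_len].reshape(n, frame_len)

def spectrogram_image(frame, fft_size):
    """Magnitude spectrogram in dB scaled to [0, 1]; rows are time slices."""
    usable = len(frame) // fft_size * fft_size
    rows = frame[:usable].reshape(-1, fft_size) * np.hanning(fft_size)
    spec = np.fft.fftshift(np.fft.fft(rows, axis=1), axes=1)
    db = 20.0 * np.log10(np.abs(spec) + 1e-9)
    return (db - db.min()) / (db.max() - db.min() + 1e-12)

rng = np.random.default_rng(0)
# Stand-in for a capture; a real file could be loaded with
# np.fromfile("capture.iq", dtype=np.int8)  (hypothetical file name).
raw = rng.integers(-128, 128, size=2 * 10_000, dtype=np.int8)

frames = split_frames(iq_from_int8(raw), frame_len=4096)
image = spectrogram_image(frames[0], fft_size=64)
print(frames.shape, image.shape)
```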
5.2. Evaluation Metrics
- Accuracy: This metric is the proportion of all cases, both normal and anomalous, that are correctly classified. It serves as a broad indicator of the model’s effectiveness. With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, accuracy is given by:
  $$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
- Precision: Of all cases predicted as positive (true positives and false positives together), precision is the fraction that are true positives, that is, correctly recognized anomalous signals. It shows how reliable the model’s positive predictions are and is calculated as:
  $$\text{Precision} = \frac{TP}{TP + FP}$$
- Recall (Sensitivity): Recall is the fraction of actual anomalous signals that the model identifies. It indicates the model’s capacity to find all relevant anomalies and is calculated as:
  $$\text{Recall} = \frac{TP}{TP + FN}$$
- F1-Score: The F1-score combines precision and recall into a single statistic by taking their harmonic mean. Because it balances the two, it is especially useful when class distributions are imbalanced. It is calculated as:
  $$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
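As a worked illustration of the four metrics, the sketch below computes them from binary labels using the standard confusion-matrix definitions, with 1 denoting an anomalous signal and 0 a normal one. The function name `binary_metrics` and the toy labels are assumptions for illustration and are not data from the experiments.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 with 1 = anomalous, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal signals correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal signals wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy example: 4 anomalous and 6 normal signals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case TP = 3, TN = 5, FP = 1, and FN = 1, so accuracy is 8/10 and precision and recall are both 3/4.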
5.3. Results and Analysis
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
IoT | Internet of Things |
LOF | Local Outlier Factor |
LoRa | Long-Range Modulation |
LoRaWAN | Long-Range Wide Area Network |
LOFAR | Low-Frequency Array |
ROAD | Radio Observatory Anomaly Detector |
HPC | High-Performance Computing |
PCA | Principal Component Analysis |
IDS | Intrusion Detection System |
MitM | Man in the Middle |
CR-AE | Convolutional Recurrent AutoEncoder |
ConvLSTM | Convolutional Long Short-Term Memory |
KL | Kullback–Leibler |
VAE | Variational Autoencoder |
Algorithm | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
LOF | 97.78% | 0.98 | 0.98 | 0.98 |
Autoencoder | 97.27% | 0.97 | 0.97 | 0.97 |
Isolation Forest | 84.49% | 0.88 | 0.84 | 0.84 |
Principal Component Analysis | 97.27% | 0.97 | 0.97 | 0.97 |
VAE | 97.27% | 0.97 | 0.97 | 0.97 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Senol, N.S.; Rasheed, A.; Baza, M.; Alsabaan, M. Identifying Tampered Radio-Frequency Transmissions in LoRa Networks Using Machine Learning. Sensors 2024, 24, 6611. https://doi.org/10.3390/s24206611