An Adaptive Framework for Intrusion Detection in IoT Security Using MAML (Model-Agnostic Meta-Learning)
Abstract
1. Introduction
- Adaptive Intrusion Detection System (IDS): This paper presents an IDS that adapts to the changing IoT threat landscape and, as a result, can respond to intrusions in real time.
- Meta-Learning Approach: The research demonstrates a new approach based on Model-Agnostic Meta-Learning (MAML). As a result, the system becomes more resilient to new threats and IoT architecture settings.
- Real-Time Threat Recognition: The work presents a system capable of detecting known and unknown threats in real time, which is a pressing challenge in today’s IoT security landscape.
- Cross-Domain Applicability: This proposed framework’s model can be applied to many IoT situations, ensuring flexibility for many IoT use cases and conditions.
- Enhanced IoT Security: With MAML, overall IoT security is enhanced by increasing the IoT systems’ ability to adapt to trending and evolving threats.
Problem Statement
2. Literature Review
3. Data Collection
3.1. UNSW-NB15 Dataset
3.2. Procedure for Pre-Processing Stage
- Step 1: Data Analysis
- The data must be loaded and analyzed, including its structure, distributions, and missing values, before applying the cleaning technique.
- Conduct a preliminary analysis of the class distribution, feature types (categorical, numerical, and redundant features), and outliers before applying any transformations [31].
- Determine whether the dataset contains uninformative metadata features, such as ID or Timestamp, that should be removed.
- Step 2: Handling Missing Values
- Search for missing values, which typically appear as nulls (e.g., NaN) in the data.
- Choose a strategy to handle missing data based on its proportion and importance:
- It is recommended to drop the feature if its missing value percentage exceeds 50%.
- If the percentage of missing values for numerical features is less than 10%, then it is better to perform mean, median, or mode imputation.
- For categorical features, replace missing values with the most frequent category (mode), as in the sketch below.
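A minimal pandas sketch of the missing-value policy above (drop features with more than 50% missing values, then impute the rest); the CSV path is a placeholder, not the exact file used by the authors.

```python
# Hedged sketch of Step 2; the file path is a placeholder.
import pandas as pd

df = pd.read_csv("UNSW_NB15_training-set.csv")  # placeholder path

missing_ratio = df.isnull().mean()                                # fraction missing per column
df = df.drop(columns=missing_ratio[missing_ratio > 0.5].index)    # drop features >50% missing

for col in df.columns:
    if df[col].isnull().any():
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())            # numeric: median imputation
        else:
            df[col] = df[col].fillna(df[col].mode()[0])           # categorical: most frequent value
```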
- Step 3: Encoding Categorical Features
- Categorical/qualitative features take values from a fixed set of categories; in this dataset they include protocol types, service types, and flags.
- Transform categorical variables into numerical form to make them compatible with deep learning models. Use one of the following:
- Label encoding: Every category is assigned a distinct integer number (e.g., Hypertext Transfer Protocol (HTTP) is equal to 0, and File Transfer Protocol (FTP) is equal to 1) [32].
- One-hot encoding: Generate dummy variables for each category so that the model does not infer an ordinal relationship between category codes (see the sketch below).
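A hedged pandas/scikit-learn sketch of the two encoding options; `proto`, `service`, and `state` are the categorical feature columns in the UNSW-NB15 CSV files, and `df` is the DataFrame from the previous step.

```python
# Hedged sketch of Step 3: two alternative encodings of the categorical columns.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

cat_cols = ["proto", "service", "state"]          # categorical features in UNSW-NB15

# Option 1: label encoding - each category maps to a distinct integer.
label_encoded = df[cat_cols].apply(lambda s: LabelEncoder().fit_transform(s))

# Option 2 (used from here on): one-hot encoding - dummy columns avoid
# implying an order between categories.
df = pd.get_dummies(df, columns=cat_cols)
```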
- Step 4: Normalizing and Scaling Features
- Rescale all numerical features to a common range, for example between 0 and 1, so that features with large magnitudes do not dominate the training process.
- Apply, for instance, Min–Max scaling to bring all features onto the same scale [33].
- For heavily skewed features such as packet size or duration, apply a logarithmic transformation to reduce the skewness, as sketched below.
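A hedged sketch of Step 4: a log transform for skewed byte/duration features followed by Min–Max scaling. The choice of skewed columns is illustrative, and in practice the scaler should be fitted on the training split only to avoid leakage.

```python
# Hedged sketch of Step 4; column names follow the UNSW-NB15 convention.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

skewed = ["dur", "sbytes", "dbytes"]                   # duration and byte counts (illustrative)
df[skewed] = np.log1p(df[skewed])                      # log(1 + x) reduces right skew

num_cols = df.select_dtypes(include=np.number).columns
scaler = MinMaxScaler()                                # fit on the training split only in practice
df[num_cols] = scaler.fit_transform(df[num_cols])      # rescale numeric features to [0, 1]
```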
- Step 5: Feature Selection and Feature Reduction
- Compute pairwise feature correlations and identify highly correlated pairs. Drop one feature from each pair with a high correlation or Maximal Information Coefficient (MIC), since redundant features add little independent information.
- Remove the remaining redundant columns that no longer contribute to the discriminative power of the prediction models.
- Use principal component analysis (PCA) to transform the high-dimensional feature space into lower-dimensional space while maintaining variance [34].
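A hedged scikit-learn sketch of Step 5: dropping one feature from each highly correlated pair and then applying PCA. The 0.95 correlation cutoff and the 95% retained-variance setting are illustrative choices, not the paper’s exact settings.

```python
# Hedged sketch of Step 5 (exclude the target columns before this step in practice).
import numpy as np
from sklearn.decomposition import PCA

X_num = df.select_dtypes(include=np.number)
corr = X_num.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))   # upper triangle only
to_drop = [c for c in upper.columns if (upper[c] > 0.95).any()]     # one of each correlated pair
X_num = X_num.drop(columns=to_drop)

pca = PCA(n_components=0.95)             # keep enough components for 95% of the variance
X_pca = pca.fit_transform(X_num)
```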
- Step 6: Handling Class Imbalance
- Since intrusion detection datasets often have an imbalance between normal and attack traffic, apply the following resampling techniques:
- Oversampling: Methods such as SMOTE (Synthetic Minority Oversampling Technique) can be employed to create synthetic samples for the minority classes [35].
- Undersampling: Alternatively, reduce the number of majority-class samples to bring them into proportion with the minority classes.
- Class weighting: Adjust the class weights in the loss function so that misclassifying minority classes is more costly, reducing the model’s bias toward the majority class (see the sketch below).
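A hedged sketch of the three options in Step 6. SMOTE and random undersampling come from the imbalanced-learn package, and class weights are computed with scikit-learn’s utility; `X` and `y` stand for the feature matrix and labels and are assumed variable names.

```python
# Hedged sketch of Step 6: oversampling, undersampling, or class weighting.
import numpy as np
import torch
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.utils.class_weight import compute_class_weight

# X: feature matrix, y: labels (assumed names).
X_over, y_over = SMOTE(random_state=42).fit_resample(X, y)                  # oversample minorities
X_under, y_under = RandomUnderSampler(random_state=42).fit_resample(X, y)   # shrink the majority

# Alternative: weight the loss so minority-class mistakes cost more.
weights = compute_class_weight("balanced", classes=np.unique(y), y=y)
criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))
```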
- Step 7: Splitting Features and Labels
- Identify the target column: label for binary classification and attack_cat for multi-class classification.
- Split the data into features (X) and targets or labels (y).
- Step 8: Train–Test Split
- Split the dataset into training, validation, and test sets:
- Training set: Used to train the model.
- Validation set: Used to tune hyperparameters across training iterations and to monitor overfitting without leaking information into the final evaluation.
- Test set: Used to evaluate how well the final model performs on unseen data.
- Typically, use an 80:10:10 ratio; a 70:15:15 train–validation–test split is also acceptable (see the sketch below).
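A hedged scikit-learn sketch of Steps 7 and 8: the `label`/`attack_cat` target columns follow the UNSW-NB15 naming, and the stratified 80:10:10 split is one of the ratios mentioned above.

```python
# Hedged sketch of Steps 7-8: feature/label separation and a stratified split.
from sklearn.model_selection import train_test_split

X = df.drop(columns=["label", "attack_cat"])       # features
y = df["label"]                                    # binary target; use "attack_cat" for multi-class

X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)                 # 80% train
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)     # 10% val / 10% test
```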
- Step 9: Outlier Detection and Removal
- Detect outliers in the numerical features using techniques like the following:
- Statistical criteria (e.g., values more than three standard deviations from the mean) [36].
- Moving ranges or control charts, box plots, or interquartile ranges (IQRs).
- If outliers negatively influence model performance, cap (winsorize) or remove them, as sketched below.
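A hedged pandas/NumPy sketch of the two outlier rules above, using the flow-duration column `dur` as an illustrative example.

```python
# Hedged sketch of Step 9: z-score filtering and IQR-based capping.
import numpy as np

col = df["dur"]                                         # example numeric feature
z_scores = (col - col.mean()) / col.std()
df_no_outliers = df[np.abs(z_scores) <= 3]              # drop rows beyond 3 standard deviations

q1, q3 = col.quantile(0.25), col.quantile(0.75)
iqr = q3 - q1
df["dur"] = col.clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)    # or cap values at the IQR fences
```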
- Step 10: Data Balancing Across Tasks for MAML
- Organize the training data into tasks for MAML. The example set of each task should include an equal number of instances from both classes, normal and attack; a minimal task sampler is sketched below.
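A minimal NumPy sketch of a balanced N-way K-shot task builder for Step 10; the helper name, the shot counts, and the assumption that `X` and `y` are NumPy arrays are all illustrative, not the authors’ exact sampler.

```python
# Hedged sketch of Step 10: build one balanced support/query task for MAML.
import numpy as np

def sample_task(X, y, k_shot=5, k_query=15, rng=np.random.default_rng()):
    """Return (support_X, support_y, query_X, query_y) with equal samples per class."""
    support_idx, query_idx = [], []
    for cls in np.unique(y):
        cls_idx = rng.permutation(np.where(y == cls)[0])
        support_idx.extend(cls_idx[:k_shot])                 # K shots per class
        query_idx.extend(cls_idx[k_shot:k_shot + k_query])   # held-out query examples
    return X[support_idx], y[support_idx], X[query_idx], y[query_idx]
```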
4. Methodology
4.1. MAML
Algorithm 1: Model-Agnostic Meta-Learning.
Require: p(T): distribution over tasks
Require: α, β: step size hyperparameters
1: Randomly initialize θ
2: while not done do
3: Sample a batch of tasks Ti ∼ p(T)
4: for all Ti do
5: Evaluate ∇θ LTi(fθ) with respect to K examples
6: Compute adapted parameters with gradient descent: θ′i = θ − α ∇θ LTi(fθ)
7: end for
8: Update θ ← θ − β ∇θ Σ Ti∼p(T) LTi(fθ′i)
9: end while
Algorithm 2: Few-shot supervised learning with MAML.
Require: p(T): distribution over tasks
Require: α, β: step size hyperparameters
1: Randomly initialize θ
2: while not done do
3: Sample a batch of tasks Ti ∼ p(T)
4: for all Ti do
5: Sample K datapoints D = {x(j), y(j)} from Ti
6: Evaluate ∇θ LTi(fθ) using D and LTi in Equation (2) or Equation (3)
7: Compute adapted parameters with gradient descent: θ′i = θ − α ∇θ LTi(fθ)
8: Sample datapoints D′i = {x(j), y(j)} from Ti for the meta-update
9: end for
10: Update θ ← θ − β ∇θ Σ Ti∼p(T) LTi(fθ′i) using each D′i and LTi in Equation (2) or Equation (3)
11: end while
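For illustration, the following is a minimal PyTorch sketch of the inner/outer loop in Algorithms 1 and 2 (second-order MAML, since the inner gradients are kept in the graph). The two-layer classifier, the default feature dimension, and the step sizes α and β are assumptions for the sketch, not the authors’ implementation; `task_batch` is assumed to yield (support_x, support_y, query_x, query_y) tensors, e.g., from a sampler like the one in Step 10 of Section 3.2.

```python
# Minimal second-order MAML sketch in PyTorch (illustrative only).
import torch
import torch.nn.functional as F

def make_params(in_dim=42, hidden=64, n_classes=2):
    """Initialize a small two-layer classifier as a list of leaf tensors (theta)."""
    def layer(fan_in, fan_out):
        w = (torch.randn(fan_out, fan_in) * (2.0 / fan_in) ** 0.5).requires_grad_()
        b = torch.zeros(fan_out, requires_grad=True)
        return w, b
    w1, b1 = layer(in_dim, hidden)
    w2, b2 = layer(hidden, n_classes)
    return [w1, b1, w2, b2]

def forward(params, x):
    w1, b1, w2, b2 = params
    h = F.relu(F.linear(x, w1, b1))
    return F.linear(h, w2, b2)            # class logits

def maml_step(params, task_batch, alpha=0.01, beta=0.001):
    """One outer-loop update over a batch of tasks (Algorithms 1 and 2)."""
    meta_loss = 0.0
    for sx, sy, qx, qy in task_batch:
        # Inner loop: one gradient step on the support (K-shot) set of task Ti.
        support_loss = F.cross_entropy(forward(params, sx), sy)
        grads = torch.autograd.grad(support_loss, params, create_graph=True)
        adapted = [p - alpha * g for p, g in zip(params, grads)]    # theta'_i
        # Outer objective: loss of the adapted parameters on the query set.
        meta_loss = meta_loss + F.cross_entropy(forward(adapted, qx), qy)
    # Meta-update of the shared initialization theta (line 8 / line 10).
    meta_grads = torch.autograd.grad(meta_loss, params)
    with torch.no_grad():
        for p, g in zip(params, meta_grads):
            p -= beta * g
    return float(meta_loss)
```

Calling `maml_step` repeatedly over sampled task batches corresponds to the outer while-loop; at deployment time, a few inner steps on the support set of a new task adapt θ to previously unseen attack types.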
4.2. Proposed Model Architecture
- i. Input Layer
- ii. First Fully Connected Layer (FC1)
- Activation: The linear transformation is followed by a ReLU activation. ReLU is used because it mitigates the vanishing gradient problem during training and speeds up convergence.
- Batch Normalization (BN1): A batch normalization layer rescales the output of the preceding activation. This keeps the learning process stable by reducing internal covariate shift, enabling faster learning.
- Dropout (Drop1): Dropout with a probability of 40% (p = 0.4) randomly zeroes a fraction of activations during training. This regularization counteracts overfitting, making the model more robust and forcing it to generalize rather than memorize the data.
- iii. Second Fully Connected Layer (FC2)
- Batch Normalization (BN2): Once more, batch normalization stabilizes learning and helps convergence toward a good optimum.
- Dropout (Drop2): A 40% dropout rate reduces the model’s overreliance on particular features or paths through the data.
- iv. Third Fully Connected Layer (FC3)
- Activation: A ReLU activation function is applied at this layer as well.
- Batch Normalization (BN3): Once again, similar to all previous layers, batch normalization is applied here to ensure that all data going through the network remain normalized, helping the network converge.
- Dropout (Drop3): The last use of dropout is 40% to retain the model’s regularization characteristic and its ability to generalize to further new input data.
- v. Output Layer (FC4)
- vi. Optimization and Training
- vii. Meta-Learning Approach (MAML)
- Inner Loop (Task-Specific Updates): The model is adapted to a specific task (a subset of the data) using a few gradient steps (inner iterations). This yields task-specific parameters derived from a shared initialization that can be improved quickly.
- Outer Loop (Meta-Training): After the inner loop, the meta-optimizer updates the shared initialization according to the adapted models’ performance across all sampled tasks. This meta-training ensures the model can learn rapidly, which is required for classifying network traffic from different attack types.
- viii. Learning Rate Scheduler
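For concreteness, a hedged PyTorch sketch of the described architecture follows: four fully connected layers (FC1–FC4) with ReLU, batch normalization (BN1–BN3), and 40% dropout (Drop1–Drop3) after each hidden layer, plus an optimizer with L2 weight decay and a StepLR schedule. The layer widths, the 42-feature input, the choice of Adam, and the scheduler settings are illustrative assumptions rather than the paper’s exact configuration.

```python
# Hedged sketch of the proposed model architecture (assumed dimensions).
import torch.nn as nn
import torch.optim as optim

class IDSNet(nn.Module):
    def __init__(self, in_features=42, n_classes=2):
        super().__init__()
        self.fc1, self.bn1 = nn.Linear(in_features, 256), nn.BatchNorm1d(256)
        self.fc2, self.bn2 = nn.Linear(256, 128), nn.BatchNorm1d(128)
        self.fc3, self.bn3 = nn.Linear(128, 64), nn.BatchNorm1d(64)
        self.fc4 = nn.Linear(64, n_classes)      # output layer (FC4)
        self.act = nn.ReLU()
        self.drop = nn.Dropout(p=0.4)            # 40% dropout

    def forward(self, x):
        x = self.drop(self.bn1(self.act(self.fc1(x))))   # FC1 -> ReLU -> BN1 -> Drop1
        x = self.drop(self.bn2(self.act(self.fc2(x))))   # FC2 -> ReLU -> BN2 -> Drop2
        x = self.drop(self.bn3(self.act(self.fc3(x))))   # FC3 -> ReLU -> BN3 -> Drop3
        return self.fc4(x)                               # logits for cross-entropy loss

model = IDSNet()
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)     # L2 regularization
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)  # StepLR decay
```

In the full framework, these parameters play the role of θ in Section 4.1 and are meta-trained with the inner/outer MAML loops rather than with plain supervised training.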
5. Results
5.1. Model Overfitting and Underfitting
- i. Computational Cost
- ii. Overfitting
- iii. Underfitting
- iv. Avoiding Overfitting and Underfitting
- v. Role of Model-Agnostic Meta-Learning (MAML) in Addressing Overfitting and Underfitting
5.2. Model Evaluation on NSL-KDD Dataset
5.3. Core Contributions and Novel Model Points
- Tailored Model Architecture
- Deep Hierarchical Structure: The proposed model stacks several densely connected, fully connected layers of increasing abstraction, supporting the learning of both shallow and deep features specific to the dataset.
- Batch Normalization: Batch normalization is applied after each hidden layer, which reduces internal covariate shift, stabilizes the training process, and speeds it up.
- Dropout Layers: Dropout layers are placed at key points in the architecture to prevent overfitting by adding randomness to training, thereby improving generalization.
- Learning Rate Scheduling: The StepLR scheduler lowers the learning rate progressively so that the model keeps improving without severe fluctuations.
- Novelty in Integration
- Multi-Step Regularization: Batch normalization, dropout, and L2 regularization were combined and tuned for this study to cope with the high dimensionality of the feature space.
- Custom Loss Optimization: A cross-entropy loss function was used to maximize the likelihood of correct classification; meta-optimization helped the model focus on misclassified samples while maintaining a high overall accuracy.
- Impact of Key Decisions
- High Training and Validation Accuracy: The small gap between training accuracy (99.98%) and validation accuracy (99.78%) demonstrates the effectiveness of the regularization techniques and the model’s ability to generalize to unseen data.
- Consistently High Evaluation Metrics: Evaluated by precision, recall, F1 score, and ROC–AUC, the model proved robust to changes in the data distribution and classified all classes well.
6. Discussion
6.1. Comparative Analysis
6.2. Limitations
- Limited Dataset Diversity: The model was trained and evaluated on a limited set of IoT-related datasets, so it may not transfer to different IoT contexts; its applicability to many real-world IoT use cases remains untested.
- Scalability Challenges: Although the proposed model performs well at smaller scales, scalability challenges could arise in large deployments, particularly when the IoT network comprises millions of connected devices.
- Vulnerability to Adversarial Attacks: The model’s robustness to adversarial attacks is not examined in this paper. Deep learning models, including those trained with MAML, are susceptible to adversarial perturbations, which could undermine the reliability and precision of a real-time IoT security system.
- Real-Time Responsiveness: The study does not fully assess the model’s behavior in a real-time IDS setting, where responses must be swift. Long processing times could impair the system’s ability to contain threats promptly.
- Limited Interpretability: Deep learning models trained with MAML act largely as black boxes, making their decision-making difficult to interpret. This raises trust and transparency concerns that can hinder the effective operation of IoT security.
- Limited Exploration of Extensibility: Although MAML enables meta-learning, this work does not explore how readily the model extends to new problems, devices, or attack behaviors in rapidly changing IoT domains.
- Computational Complexity: Meta-learning techniques such as MAML can demand substantial computation and memory in large IoT applications, which conflicts with the constraints of most resource-scarce IoT devices.
6.3. Future Directions
- Exploring Diverse Datasets: Future work would involve applying the model to other publicly available datasets related to different kinds of network traffic.
- Integration with Real-Time Systems: The current study is limited to offline classification. A logical next step is to integrate the model into existing real-time network monitoring systems, which calls for techniques that raise inference speed and reduce computational cost under high traffic without sacrificing accuracy.
- Adversarial Robustness: Cybersecurity classifiers are themselves targets of adversarial attacks designed to deceive them. Adversarial training or other defense mechanisms could equip the model to resist such attacks so that it can be deployed in security-critical settings.
- Explainability and Interpretability: The model achieves high performance but acts as a black box. Future work could apply explainable AI (XAI) techniques to reveal how predictions are made, which matters because network administrators and security analysts must be able to justify accepting the model’s decisions.
- Semi-Supervised and Unsupervised Learning: Annotating large datasets for training is expensive and time-consuming. Incorporating semi-supervised or unsupervised learning could show whether large amounts of unlabeled traffic combined with a small labeled set make the model more scalable.
- Multi-Task Learning: Extending the model to handle several related tasks, e.g., attack classification and anomaly detection, could reduce the time and effort spent on network analysis and allow shared representations to be learned, improving overall performance.
- Cross-Domain Transfer Learning: Network traffic structure, intensity, and many other characteristics can vary dramatically across environments and domains. Investigating transfer learning to let the model adapt to other domains without full retraining would increase its usability.
- Energy-Efficient Models: Green AI has recently attracted research interest, and improving the model’s power efficiency during training and inference remains essential. Techniques for producing lightweight variants better suited to edge devices are therefore worth exploring.
- Enhancing Feature Selection Techniques: Although Random Forest-based feature selection was sufficient, experimenting with Lasso regularization, mutual information, or deep learning-based feature selection could provide additional insight into the input space.
- Collaboration with Domain Experts: Integrating domain knowledge into feature engineering and final validation would help improve the model’s performance, particularly with respect to real-life network traffic characteristics.
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
MAML | Model-Agnostic Meta-Learning |
IDS | Intrusion Detection System |
DL | Deep Learning |
AE | Autoencoder |
RNN | Recurrent Neural Network |
AUC | Area Under the Curve |
MSE | Mean Squared Error |
BN | Batch Normalization |
IoT | Internet of Things |
ML | Machine Learning |
CNN | Convolutional Neural Network |
LSTM | Long Short-Term Memory |
ADFA | Australian Defence Force Academy |
ROC | Receiver Operating Characteristic Curve |
DNN | Deep Neural Network |
References
1. Ekinci, G.; Broggi, A.; Fiondella, L.; Bastian, N.D.; Kul, G. Adaptive Network Intrusion Detection Systems Against Performance Degradation via Model Agnostic Meta-Learning. In Proceedings of the 11th ACM Workshop on Adaptive and Autonomous Cyber Defense, New York, NY, USA, 14–18 October 2024; pp. 23–26.
2. Lu, C.; Wang, X.; Yang, A.; Liu, Y.; Dong, Z. A Few-Shot-Based Model-Agnostic Meta-Learning for Intrusion Detection in Security of Internet of Things. IEEE Internet Things J. 2023, 10, 21309–21321.
3. Ji, Y.; Zou, K.; Zou, B. Mi-maml: Classifying Few-Shot Advanced Malware Using Multi-Improved Model-Agnostic Meta-Learning. Cybersecurity 2024, 7, 72.
4. Shi, Z.; Xing, M.; Zhang, J.; Wu, B.H. Few-Shot Network Intrusion Detection Based on Model-Agnostic Meta-Learning with L2F Method. In Proceedings of the 2023 IEEE Wireless Communications and Networking Conference (WCNC), Glasgow, UK, 26–29 March 2023; pp. 1–6.
5. Duan, Y.; Wang, X.; Li, J.; Zhang, Q.; Zhao, F. Learning to Diagnose: Meta-Learning for Efficient Adaptation in Few-Shot AIOps Scenarios. Electronics 2024, 13, 2102.
6. He, J.; Zhang, H.; Ma, Q.; Xu, W.; Zuo, Y. Model-Agnostic Generation-Enhanced Technology for Few-Shot Intrusion Detection. Appl. Intell. 2024, 54, 3181–3204.
7. Sakthidevi, I.; Subhashini, S.J.; Rubel Angelina, J.J.; Yegnanarayanan, V.; Kumar, K.S. Leveraging Meta-Learning for Dynamic Anomaly Detection in Zero Trust Clouds. In Proceedings of the International Conference on Cloud Computing and Security, Shanghai, China, 28–30 June 2024; pp. 121–137.
8. Hu, Y.; Wu, J.; Li, G.; Li, J.; Cheng, J. Privacy-Preserving Few-Shot Traffic Detection Against Advanced Persistent Threats via Federated Meta Learning. IEEE Trans. Netw. Sci. Eng. 2024, 11, 2549–2560.
9. Yang, A.; Wang, Y.; Zhang, T.; Li, R.; Liu, X. Application of Meta-Learning in Cyberspace Security: A Survey. Digit. Commun. Netw. 2023, 9, 67–78.
10. Liu, F.; Li, M.; Liu, X.; Xue, T.; Ren, J.; Zhang, C. A Review of Federated Meta-Learning and Its Application in Cyberspace Security. Electronics 2023, 12, 3295.
11. Chuang, H.-M.; Ye, L.-J. Applying Transfer Learning Approaches for Intrusion Detection in Software-Defined Networking. Sustainability 2023, 15, 9395.
12. Zhao, J.; Li, Q.; Hong, Y.; Shen, M. MetaRockETC: Adaptive Encrypted Traffic Classification in Complex Network Environments via Time Series Analysis and Meta-Learning. IEEE Trans. Netw. Serv. Manag. 2024, 21, 2460–2476.
13. Di Monda, D.; Montieri, A.; Persico, V.; Voria, P.; De Ieso, M.; Pescapè, A. Few-Shot Class-Incremental Learning for Network Intrusion Detection Systems. IEEE Open J. Commun. Soc. 2024, 5, 6736–6757.
14. Hosseini, M.; Shi, W. Intrusion Detection in IoT Network Using Few-Shot Class Incremental Learning. In Advances in Intelligent Systems and Computing; Springer Nature: Cham, Switzerland, 2024; pp. 617–636.
15. He, B.; Wang, F. Specific Emitter Identification via Sparse Bayesian Learning Versus Model-Agnostic Meta-Learning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 3677–3691.
16. Bo, J.; Chen, K.; Li, S.; Gao, P. Boosting Few-Shot Network Intrusion Detection with Adaptive Feature Fusion Mechanism. Electronics 2024, 13, 4560.
17. Saba, T.; Rehman, A.; Sadad, T.; Kolivand, H.; Bahaj, S.A. Anomaly-Based Intrusion Detection System for IoT Networks Through Deep Learning Model. Comput. Electr. Eng. 2022, 99, 107810.
18. Mushtaq, E.; Zameer, A.; Khan, A. A Two-Stage Stacked Ensemble Intrusion Detection System Using Five Base Classifiers and MLP with Optimal Feature Selection. Microprocess. Microsyst. 2022, 94, 104660.
19. Rashid, M.; Kamruzzaman, J.; Imam, T.; Wibowo, S.; Gordon, S. A Tree-Based Stacking Ensemble Technique with Feature Selection for Network Intrusion Detection. Appl. Intell. 2022, 52, 9768–9781.
20. Sarhan, M.; Layeghy, S.; Moustafa, N.; Gallagher, M.; Portmann, M. Feature Extraction for Machine Learning-Based Intrusion Detection in IoT Networks. Digit. Commun. Netw. 2024, 10, 205–216.
21. Ullah, F.; Ullah, S.; Srivastava, G.; Lin, J.C.-W. IDS-INT: Intrusion Detection System Using Transformer-Based Transfer Learning for Imbalanced Network Traffic. Digit. Commun. Netw. 2024, 10, 190–204.
22. More, S.; Idrissi, M.; Mahmoud, H.; Asyhari, A.T. Enhanced Intrusion Detection Systems Performance with UNSW-NB15 Data Analysis. Algorithms 2024, 17, 64.
23. Javeed, D.; Gao, T.; Saeed, M.S.; Kumar, P. An Intrusion Detection System for Edge-Envisioned Smart Agriculture in Extreme Environment. IEEE Internet Things J. 2024, 11, 26866–26876.
24. Cui, B.; Chai, Y.; Yang, Z.; Li, K. Intrusion Detection in IoT Using Deep Residual Networks with Attention Mechanisms. Future Internet 2024, 16, 255.
25. Shafieian, S.; Zulkernine, M. Multi-Layer Stacking Ensemble Learners for Low Footprint Network Intrusion Detection. Complex Intell. Syst. 2023, 9, 3787–3799.
26. Holubenko, V.; Gaspar, D.; Leal, R.; Silva, P. Autonomous Intrusion Detection for IoT: A Decentralized and Privacy-Preserving Approach. Int. J. Inf. Secur. 2025, 24, 7.
27. Musthafa, M.B.; Huda, S.; Kodera, Y.; Ali, M.A.; Araki, S.; Mwaura, J.; Nogami, Y. Optimizing IoT Intrusion Detection Using Balanced Class Distribution, Feature Selection, and Ensemble Machine Learning Techniques. Sensors 2024, 24, 4293.
28. Ayad, A.G.; Sakr, N.A.; Hikal, N.A. A Hybrid Approach for Efficient Feature Selection in Anomaly Intrusion Detection for IoT Networks. J. Supercomput. 2024, 80, 26942–26984.
29. Azar, A.T.; Shehab, E.; Mattar, A.M.; Hameed, I.A.; Elsaid, S.A. Deep Learning Based Hybrid Intrusion Detection Systems to Protect Satellite Networks. J. Netw. Syst. Manag. 2023, 31, 82.
30. Qaddoura, R.; Al-Zoubi, A.M.; Faris, H.; Almomani, I. A Multi-Layer Classification Approach for Intrusion Detection in IoT Networks Based on Deep Learning. Sensors 2021, 21, 2987.
31. Feng, C.; Shao, L.; Wang, J.; Zhang, Y.; Wen, F. Short-Term Load Forecasting of Distribution Transformer Supply Zones Based on Federated Model-Agnostic Meta Learning. IEEE Trans. Power Syst. 2024, 40, 31–45.
32. Lu, L.; Cui, X.; Tan, Z.; Wu, Y. MedOptNet: Meta-Learning Framework for Few-Shot Medical Image Classification. IEEE/ACM Trans. Comput. Biol. Bioinform. 2024, 21, 725–736.
33. Bovenzi, G.; Di Monda, D.; Montieri, A.; Persico, V.; Pescapè, A. Meta Mimetic: Few-Shot Classification of Mobile-App Encrypted Traffic via Multimodal Meta-Learning. In Proceedings of the 2023 35th International Teletraffic Congress (ITC-35), Politecnico di Torino, Italy, 3–5 October 2023; pp. 1–9.
34. Islam, M.D.S.; Yusuf, A.; Gambo, M.D.; Barnawi, A.Y. A Novel Few-Shot ML Approach for Intrusion Detection in IoT. Arab. J. Sci. Eng. 2024, 1–15.
35. Di Monda, D.; Bovenzi, G.; Montieri, A.; Persico, V.; Pescapè, A. IoT Botnet-Traffic Classification Using Few-Shot Learning. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 3284–3293.
36. Chen, J.; Wang, C.; Hong, Y.; Mi, R.; Zhang, L.J.; Wu, Y.; Wang, H.; Zhou, Y. A Survey on Anomaly Detection with Few-Shot Learning; Springer Nature: Basel, Switzerland, 2025; pp. 34–50.
37. Zerdoumi, S. Meta-Prediction: Uncovering Patterns and Enhancing Predictive Accuracy Through Self-Learning Algorithms. Multimed. Tools Appl. 2024, 1–71.
38. Finn, C.; Abbeel, P.; Levine, S. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks; PMLR: New York, NY, USA, 2017; pp. 1126–1135.
References | Datasets | Methodology | Limitations | Results |
---|---|---|---|---|
[16] | Limited network traffic samples for intrusion detection | Metric-based meta-learning and Adaptive Feature Fusion | Challenges in model information acquisition with FSL. | 97.78% accuracy in multi-class few-shot tasks |
[17] | Benign network traffic used for intrusion detection model training | Host-based anomaly detection using packet representations for security. | It relies solely on benign traffic and is limited to anomaly detection. | Accuracy 0.9874, precision 0.9384, recall 0.9971, F1 score 0.9529. |
[18] | NSL-KDD benchmark dataset for intrusion detection evaluation. | Stacked ensemble with multiple classifiers and MLP. | A single classifier is insufficient for metamorphic malware detection. | 88.10% accuracy, 0.87 detection rate, 0.17 false alarm. |
[19] | NSL-KDD and UNSW-NB15 intrusion detection datasets | Tree-based stacking ensemble with feature selection techniques. | Ensemble methods require additional computation and complexity. | UNSW-NB15 accuracy of 0.9400 with XGBoost. |
[20] | CSE-CIC-IDS2018, UNSW-NB15, and ToN-IoT for evaluation and testing. | ML models with PCA, AE, and LDA for feature extraction. | Performance varies, no universal feature set exists, and LDA is ineffective. | AE with 10 dimensions: DR 98.28%, FAR 3.21% |
[21] | UNSW-NB15, CIC-IDS2017, NSL-KDD used for performance evaluation. | Transformer-based transfer learning with CNN-LSTM for attack detection. | Imbalanced data, complex features, and minority attack identification challenges. | Accuracy 99.21%. |
[22] | UNSW-NB15 network traffic dataset used for cyber-attack detection. | LR, SVM, DT, and Random Forest algorithms were applied. | False positives and accuracy improvements are needed for IDS systems. | Random Forest: F1 97.80%, Accuracy 98.63%, FAR 1.36%. |
[23] | CICIDS2018, ToN-IoT, Edge-IIoTset used for performance evaluation. | Hybrid approach: Bi-GRU, LSTM with softmax, TBPTT for learning. | Handling large data volumes, extreme environments, and attack exploitation. | Accuracy: 98.32%; FPR: 0.0426%. |
[24] | UNSW-NB15 was used for model evaluation and testing. | Temporal convolutional residual modules with attention mechanism for detection. | Inadequate feature extraction and insufficient model generalization in prior methods. | Accuracy: UNSW-NB15 89.23% |
[25] | Malicious data exfiltration and benign network traffic examples. | Comparison of bagging, boosting, and multi-layer stacking. | Few models meet strict acceptance criteria for detection. | 0.978 accuracy, 0.001 false positive rate with MLP. |
[26] | ADFA-LD dataset used. | Proposes lightweight Host-based Intrusion Detection System (HIDS) using system call traces and ML. | Limited focus on real-world implementation and HIDS research gap. | Achieved 98% accuracy and explained results using eXplainable AI. |
[27] | NSL-KDD and UNSW-NB15 datasets were used for performance evaluation and testing. | Class balancing, feature selection, SVM-bagging, LSTM-stacking with ANOVA. | Overfitting and feature selection challenges in complex model optimization. | LSTM-stacking achieved 96.92–99.77% accuracy, low overfitting, high AUC. |
Metrics | Value |
---|---|
Training Time | 40 min |
Model Employed | MAML |
Hardware Employed | Google Colab GPU |
Batch Size | 32 |
Total Parameters | 1.6 million |
Epochs | 100 |
Evaluation Metric | Value (%) |
---|---|
Accuracy | 99.98 |
Recall | 99.0 |
Precision | 99.5 |
F1 score | 99.4 |
Evaluation Metric | Value (%) |
---|---|
Accuracy | 99.1 |
Recall | 98.2 |
Precision | 97.3 |
F1 score | 98.5 |
References | Methodology | Accuracy | Dataset |
---|---|---|---|
[18] | Stacked ensemble | 88.10% | UNSW-NB15 |
[19] | Tree-based stacking ensemble | 94% | UNSW-NB15 |
[20] | ML models with PCA, AE, LDA | 98.28% | UNSW-NB15 |
[21] | Transformer-based transfer learning | 99.21% | UNSW-NB15 |
[22] | LR, SVM, DT, RF | 98.63% | UNSW-NB15 |
[23] | Hybrid approach: Bi-GRU, LSTM | 98.32% | UNSW-NB15 |
[24] | Temporal convolutional residual modules | 89.23% | UNSW-NB15 |
[25] | Bagging, boosting, multi-layer stacking | 97.8% | UNSW-NB15 |
[26] | Lightweight HIDS, ML | 98% | UNSW-NB15 |
[27] | Class balancing, SVM-bagging, LSTM-stacking | 96.92% | UNSW-NB15 |
Our Proposed Work | MAML | 99.98% (UNSW-NB15), 99.1% (NSL-KDD) | UNSW-NB15, NSL-KDD |