Survey on Intrusion Detection Systems Based on Machine Learning Techniques for the Protection of Critical Infrastructure
Abstract
1. Introduction
2. Research Objectives
- Synthesize and analyze the most representative research works that have been conducted to develop IDSs for industrial systems through ML techniques;
- Discuss and critically evaluate the existing body of knowledge on the development of IDSs using ML techniques for the protection of CI.
3. Methodology
4. Fundamental Concepts
4.1. Critical Infrastructure Concept
4.2. ML and IDSs to Protect CI
4.3. Cybersecurity Datasets to Test IDSs
5. Machine Learning in Intrusion Detection Systems (IDSs) to Protect CI
6. Conclusions and Future Direction
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Markopoulou, D.; Papakonstantinou, V. The regulatory framework for the protection of critical infrastructures against cyberthreats: Identifying shortcomings and addressing future challenges: The case of the health sector in particular. Comput. Law Secur. Rev. Int. J. Technol. Law Pract. 2021, 41, 105502.
2. Selim, G.E.I.; Hemdan, E.E.-D.; Shehata, A.M.; El-Fishawy, N.A. Anomaly events classification and detection system in critical industrial internet of things infrastructure using machine learning algorithms. Multimedia Tools Appl. 2021, 80, 12619–12640.
3. Ahmed, I.; Anisetti, M.; Ahmad, A.; Jeon, G. A Multilayer Deep Learning Approach for Malware Classification in 5G-Enabled IIoT. IEEE Trans. Ind. Inform. 2022, 19, 1495–1503.
4. Ridwan, M.A.; Radzi, N.A.M.; Abdullah, F.; Jalil, Y.E. Applications of Machine Learning in Networking: A Survey of Current Issues and Future Challenges. IEEE Access 2021, 9, 52523–52556.
5. Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Xu, M. A Survey on Machine Learning Techniques for Cyber Security in the Last Decade. IEEE Access 2020, 8, 222310–222354.
6. Kruszka, L.; Klósak, M.; Muzolf, P. Critical Infrastructure Protection Best Practices and Innovative Methods of Protection; NATO Science for Peace and Security, Sub-Series D, Information and Communication Security; IOS Press: Amsterdam, The Netherlands, 2019; Volume 52.
7. Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20.
8. Nguyen, T.T.; Reddi, V.J. Deep Reinforcement Learning for Cyber Security. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–17.
9. Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S.; Alimi, K.O.A. A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification. Sustainability 2021, 13, 9597.
10. Almalawi, A.; Fahad, A.; Tari, Z.; Khan, A.I.; Alzahrani, N.; Bakhsh, S.T.; Alassafi, M.O.; Alshdadi, A.; Qaiyum, S. Add-On Anomaly Threshold Technique for Improving Unsupervised Intrusion Detection on SCADA Data. Electronics 2020, 9, 1017.
11. Conti, M.; Donadel, D.; Turrin, F. A Survey on Industrial Control System Testbeds and Datasets for Security Research. IEEE Commun. Surv. Tutor. 2021, 23, 2248–2294.
12. Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A. A survey of network-based intrusion detection data sets. Comput. Secur. 2019, 86, 147–167.
13. Bhamare, D.; Zolanvari, M.; Erbad, A.; Jain, R.; Khan, K.; Meskin, N. Cybersecurity for industrial control systems: A survey. Comput. Secur. 2020, 89, 101677.
14. Ghosh, S.; Sampalli, S. A Survey of Security in SCADA Networks: Current Issues and Future Challenges. IEEE Access 2019, 7, 135812–135831.
15. Ramotsoela, D.; Abu-Mahfouz, A.; Hancke, G. A Survey of Anomaly Detection in Industrial Wireless Sensor Networks with Critical Water System Infrastructure as a Case Study. Sensors 2018, 18, 2491.
16. Thomé, A.M.T.; Scavarda, L.F.; Scavarda, A.J. Conducting systematic literature review in operations management. Prod. Plan. Control 2016, 27, 408–420.
17. Gallais, C.; Filiol, E. Critical Infrastructure: Where Do We Stand Today? A Comprehensive and Comparative Study of the Definitions of a Critical Infrastructure. J. Inf. Warf. 2017, 16, 64. Available online: https://www.jstor.org/stable/26502877 (accessed on 20 October 2022).
18. Kure, H.; Islam, S. Cyber Threat Intelligence for Improving Cybersecurity and Risk Management in Critical Infrastructure. J. Univers. Comput. Sci. 2019, 25, 1478–1502.
19. Herrera, L.-C.; Maennel, O. A comprehensive instrument for identifying critical information infrastructure services. Int. J. Crit. Infrastruct. Prot. 2019, 25, 50–61.
20. Mattioli, R.; Levy-Bencheton, C.; European Union, European Network and Information Security Agency. Methodologies for the Identification of Critical Information Infrastructure Assets and Services: Guidelines for Charting Electronic Data Communication Networks; European Union Agency for Network and Information Security: Heraklion, Greece, 2014.
21. U.S. Homeland Security Office. Homeland Security Presidential Directive 7: Critical Infrastructure Identification, Prioritization, and Protection. Available online: https://www.cisa.gov/homeland-security-presidential-directive-7 (accessed on 17 December 2003).
22. Pătraşcu, P. Emerging Technologies and National Security: The Impact of IoT in Critical Infrastructures Protection and Defence Sector. Land Forces Acad. Rev. 2021, 26, 423–429.
23. Das, S.K.; Kant, K.; Zhang, N. Handbook on Securing Cyber-Physical Critical Infrastructure; Morgan Kaufmann: Waltham, MA, USA, 2012. Available online: https://ezproxy.uniandes.edu.co/login?url=https://search.ebscohost.com/login.aspx?direct=true&db=e000xww&AN=453871&lang=es&site=eds-live&scope=site (accessed on 1 November 2022).
24. Kure, H.I.; Islam, S.; Mouratidis, H. An integrated cyber security risk management framework and risk predication for the critical infrastructure protection. Neural Comput. Appl. 2022, 34, 15241–15271.
25. Dawson, M.; Bacius, R.; Gouveia, L.B.; Vassilakos, A. Understanding the Challenge of Cybersecurity in Critical Infrastructure Sectors. Land Forces Acad. Rev. 2021, 26, 69–75.
26. Malatji, M.; Marnewick, A.L.; Von Solms, S. Cybersecurity capabilities for critical infrastructure resilience. Inf. Comput. Secur. 2022, 30, 255–279.
27. Arora, P.; Kaur, B.; Teixeira, M.A. Evaluation of Machine Learning Algorithms Used on Attacks Detection in Industrial Control Systems. J. Inst. Eng. (India) Ser. B 2021, 102, 605–616.
28. Zeadally, S.; Adi, E.; Baig, Z.; Khan, I.A. Harnessing Artificial Intelligence Capabilities to Improve Cybersecurity. IEEE Access 2020, 8, 23817–23837.
29. Handa, A.; Sharma, A.; Shukla, S.K. Machine learning in cybersecurity: A review. WIREs Data Min. Knowl. Discov. 2019, 9, e1306.
30. Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 2015, 18, 1153–1176.
31. Sarker, I.H.; Kayes, A.S.M.; Badsha, S.; Alqahtani, H.; Watters, P.; Ng, A. Cybersecurity data science: An overview from machine learning perspective. J. Big Data 2020, 7, 1–29.
32. Polat, H.; Türkoğlu, M.; Polat, O.; Şengür, A. A novel approach for accurate detection of the DDoS attacks in SDN-based SCADA systems based on deep recurrent neural networks. Expert Syst. Appl. 2022, 197, 116748.
33. Sarnovsky, M.; Paralic, J. Hierarchical Intrusion Detection Using Machine Learning and Knowledge Model. Symmetry 2020, 12, 203.
34. Mishra, P.; Varadharajan, V.; Tupakula, U.; Pilli, E.S. A Detailed Investigation and Analysis of Using Machine Learning Techniques for Intrusion Detection. IEEE Commun. Surv. Tutor. 2019, 21, 686–728.
35. Shams, E.A.; Rizaner, A.; Ulusoy, A.H. A novel context-aware feature extraction method for convolutional neural network-based intrusion detection systems. Neural Comput. Appl. 2021, 33, 13647–13665.
36. Viegas, E.K.; Santin, A.O.; Oliveira, L.S. Toward a reliable anomaly-based intrusion detection in real-world environments. Comput. Netw. 2017, 127, 200–216.
37. Kanimozhi, V.; Jacob, T.P. Artificial Intelligence based Network Intrusion Detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. ICT Express 2019, 5, 211–214.
38. Sarhan, M.; Layeghy, S.; Portmann, M. Towards a Standard Feature Set for Network Intrusion Detection System Datasets. Mob. Netw. Appl. 2022, 27, 357–370.
39. Kenyon, A.; Deka, L.; Elizondo, D. Are public intrusion datasets fit for purpose characterising the state of the art in intrusion event datasets. Comput. Secur. 2020, 99, 102022.
40. Nechaev, B.; Allman, M.; Paxson, V.; Gurtov, A. Lawrence Berkeley National Laboratory (LBNL)/ICSI Enterprise Tracing Project; LBNL/ICSI: Berkeley, CA, USA, 2004.
41. Sperotto, A.; Sadre, R.; Van Vliet, F.; Pras, A. A labeled data set for flow-based intrusion detection. In IP Operations and Management, Proceedings of the 9th IEEE International Workshop, IPOM 2009, Venice, Italy, 29–30 October 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 39–50.
42. Fontugne, R.; Borgnat, P.; Abry, P.; Fukuda, K. MAWILab: Combining Diverse Anomaly Detectors for Automated Anomaly Labeling and Performance Benchmarking. In Proceedings of the 6th International Conference, Philadelphia, PA, USA, 30 November–3 December 2010; pp. 1–12.
43. Song, J.; Takakura, H.; Okabe, Y.; Eto, M.; Inoue, D.; Nakao, K. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In Proceedings of the EuroSys’11: Sixth EuroSys Conference 2011, Salzburg, Austria, 10 April 2011; pp. 29–36.
44. Gogoi, P.; Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Packet and flow based network intrusion dataset. In Proceedings of the International Conference on Contemporary Computing, Noida, India, 6–8 August 2012; pp. 322–334.
45. Shiravi, A.; Shiravi, H.; Tavallaee, M.; Ghorbani, A.A. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 2012, 31, 357–374.
46. Wheelus, C.; Khoshgoftaar, T.M.; Zuech, R.; Najafabadi, M.M. A Session Based Approach for Aggregating Network Traffic Data—The SANTA Dataset. In Proceedings of the 2014 IEEE International Conference on Bioinformatics and Bioengineering, Boca Raton, FL, USA, 10–12 November 2014; pp. 369–378.
47. Bhattacharya, S.; Selvakumar, S. SSENet-2014 dataset: A dataset for detection of multiconnection attacks. In Proceedings of the 3rd International Conference on Eco-Friendly Computing and Communication Systems, ICECCS 2014, Mangalore, India, 18–21 December 2014; pp. 121–126.
48. Kent, D. Comprehensive, Multi-Source Cyber-Security Events Data Set; Los Alamos National Lab (LANL): Los Alamos, NM, USA, 2015.
49. García, S.; Grill, M.; Stiborek, J.; Zunino, A. An empirical comparison of botnet detection methods. Comput. Secur. 2014, 45, 100–123.
50. Beer, F.; Hofer, T.; Karimi, D.; Bühler, U. A New Attack Composition for Network Security. 2017. Available online: https://openwrt.org/ (accessed on 25 October 2022).
51. Sharma, R.; Singla, R.; Guleria, A. A New Labeled Flow-based DNS Dataset for Anomaly Detection: PUF Dataset. Procedia Comput. Sci. 2018, 132, 1458–1466.
52. Maciá-Fernández, G.; Camacho, J.; Magán-Carrión, R.; García-Teodoro, P.; Therón, R. UGR’16: A new dataset for the evaluation of cyclostationarity-based network IDSs. Comput. Secur. 2018, 73, 411–424.
53. Adepu, S.; Junejo, K.N.; Mathur, A.; Goh, J. A Dataset to Support Research in the Design of Secure Water Treatment Systems. Available online: https://www.researchgate.net/publication/305809559 (accessed on 30 September 2022).
54. Guerra-Manzanares, A.; Medina-Galindo, J.; Bahsi, H.; Nõmm, S. MedBIoT: Generation of an IoT botnet dataset in a medium-sized IoT network. In Proceedings of the ICISSP 2020—6th International Conference on Information Systems Security and Privacy, Valletta, Malta, 25–27 February 2020; pp. 207–218.
55. MVS Datasets z/OS TSO/E Customization SA32-0976-00. Available online: https://www.ibm.com/docs/en/zos/2.1.0?topic=tsoe-mvs-data-sets (accessed on 3 November 2022).
56. Center for Applied Internet Data Analysis at the University of California’s, CAIDA Data—Completed Datasets. Available online: https://www.caida.org/catalog/datasets/completed-datasets/ (accessed on 5 November 2022).
57. Faramondi, L.; Flammini, F.; Guarino, S.; Setola, R. A Hardware-in-the-Loop Water Distribution Testbed Dataset for Cyber-Physical Security Testing. IEEE Access 2021, 9, 122385–122396.
58. Wu, M.; Song, J.; Sharma, S.; Di, J.; He, B.; Wang, Z.; Zhang, J.; Lin, L.W.L.; Greaney, E.A.; Moon, Y. Development of testbed for cyber-manufacturing security issues. Int. J. Comput. Integr. Manuf. 2020, 33, 302–320.
59. Haider, W.; Hu, J.; Slay, J.; Turnbull, B.; Xie, Y. Generating realistic intrusion detection system dataset based on fuzzy qualitative modeling. J. Netw. Comput. Appl. 2017, 87, 185–192.
60. Zoppi, T.; Gharib, M.; Atif, M.; Bondavalli, A. Meta-Learning to Improve Unsupervised Intrusion Detection in Cyber-Physical Systems. ACM Trans. Cyber-Phys. Syst. 2021, 5, 1–27.
61. Alsaedi, A.; Moustafa, N.; Tari, Z.; Mahmood, A.; Anwar, A. TON_IoT Telemetry Dataset: A New Generation Dataset of IoT and IIoT for Data-Driven Intrusion Detection Systems. IEEE Access 2020, 8, 165130–165150.
62. Hindy, H.; Bayne, E.; Bures, M.; Atkinson, R.; Tachtatzis, C.; Bellekens, X. Machine Learning Based IoT Intrusion Detection System: An MQTT Case Study (MQTT-IoT-IDS2020 Dataset). In Selected Papers from the 12th International Networking Conference: INC 2020; Springer International Publishing: Cham, Switzerland, 2021; pp. 73–84. Available online: http://arxiv.org/abs/2006.15340 (accessed on 1 December 2022).
63. Al-Hawawreh, M.; Sitnikova, E.; Aboutorab, N. X-IIoTID: A Connectivity-Agnostic and Device-Agnostic Intrusion Data Set for Industrial Internet of Things. IEEE Internet Things J. 2021, 9, 3962–3977.
64. Ferrag, M.A.; Friha, O.; Hamouda, D.; Maglaras, L.; Janicke, H. Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning. IEEE Access 2022, 10, 40281–40306.
65. Gyamfi, E.; Jurcut, A. Intrusion Detection in Internet of Things Systems: A Review on Design Approaches Leveraging Multi-Access Edge Computing, Machine Learning, and Datasets. Sensors 2022, 22, 3744.
66. Ahsan, R.; Shi, W.; Ma, X.; Croft, W.L. A comparative analysis of CGAN-based oversampling for anomaly detection. IET Cyber-Phys. Syst. Theory Appl. 2022, 7, 40–50.
67. Francia, G.A. A Machine Learning Test Data Set for Continuous Security Monitoring of Industrial Control Systems. In Proceedings of the 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Honolulu, HI, USA, 31 July 2017–4 August 2017; pp. 1043–1048.
68. Fujdiak, R.; Blazek, P.; Mlynek, P.; Misurec, J. Developing Battery of Vulnerability Tests for Industrial Control Systems. In Proceedings of the 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Honolulu, HI, USA, 31 July 2017–4 August 2019; pp. 1–5.
69. Kaouk, M.; Flaus, J.-M.; Potet, M.-L.; Groz, R. A Review of Intrusion Detection Systems for Industrial Control Systems. In Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), Paris, France, 23–26 April 2019; pp. 1699–1704.
70. Kegyes, T.; Süle, Z.; Abonyi, J. The Applicability of Reinforcement Learning Methods in the Development of Industry 4.0 Applications. Complexity 2021, 2021, 1–31.
71. Roberts, C.; Ngo, S.-T.; Milesi, A.; Peisert, S.; Arnold, D.; Saha, S.; Scaglione, A.; Johnson, N.; Kocheturov, A.; Fradkin, D. Deep Reinforcement Learning for DER Cyber-Attack Mitigation. September 2020. Available online: http://arxiv.org/abs/2009.13088 (accessed on 5 December 2022).
72. Shitharth, S.; Kshirsagar, P.R.; Balachandran, P.K.; Alyoubi, K.H.; Khadidos, A.O. An Innovative Perceptual Pigeon Galvanized Optimization (PPGO) Based Likelihood Naïve Bayes (LNB) Classification Approach for Network Intrusion Detection System. IEEE Access 2022, 10, 46424–46441.
73. Prashanth, S.K.; Shitharth, S.; Kumar, B.P.; Subedha, V.; Sangeetha, K. Optimal Feature Selection Based on Evolutionary Algorithm for Intrusion Detection. SN Comput. Sci. 2022, 3, 1–9.
74. MR, G.R.; Ahmed, C.M.; Mathur, A. Machine learning for intrusion detection in industrial control systems: Challenges and lessons from experimental evaluation. Cybersecurity 2021, 4, 27.
75. Mishra, N.; Pandya, S. Internet of Things Applications, Security Challenges, Attacks, Intrusion Detection, and Future Visions: A Systematic Review. IEEE Access 2021, 9, 59353–59377.
76. Le, T.-T.; Kim, H.; Kang, H.; Kim, H. Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method. Sensors 2022, 22, 1154.
77. Faker, O.; Dogdu, E. Intrusion detection using big data and deep learning techniques. In Proceedings of the ACMSE 2019, Kennesaw, GA, USA, 18–20 April 2019; pp. 86–93.
78. Nirmala, P.; Manimegalai, T.; Arunkumar, J.R.; Vimala, S.; Rajkumar, G.V.; Raju, R. A Mechanism for Detecting the Intruder in the Network through a Stacking Dilated CNN Model. Wirel. Commun. Mob. Comput. 2022, 2022, 1955009.
79. Liu, Z.; Ghulam MU, D.; Zhu, Y.; Yan, X.; Wang, L.; Jiang, Z.; Luo, J. Deep Learning Approach for IDS. In Proceedings of the Fourth International Congress on Information and Communication Technology, London, UK, 25–26 February 2020; pp. 471–479.
80. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the International Conference on Information Systems Security and Privacy, Funchal, Portugal, 22–24 January 2018; pp. 108–116.
81. Malik, A.J.; Khan, F.A. A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Clust. Comput. 2017, 21, 667–680.
82. Al Jallad, K.; Aljnidi, M.; Desouki, M.S. Big data analysis and distributed deep learning for next-generation intrusion detection system optimization. J. Big Data 2019, 6, 1–18.
83. Batina, L.; Picek, S.; Mondal, M. Security, Privacy, and Applied Cryptography Engineering, Proceedings of the 10th International Conference, SPACE 2020, Kolkata, India, 17–21 December 2020; Springer Nature: Berlin/Heidelberg, Germany, 2020; Volume 12586.
84. Khan, I.A.; Pi, D.; Khan, Z.U.; Hussain, Y.; Nawaz, A. HML-IDS: A Hybrid-Multilevel Anomaly Prediction Approach for Intrusion Detection in SCADA Systems. IEEE Access 2019, 7, 89507–89521.
85. Sangeetha, K.; Shitharth, S.; Mohammed, G.B. Enhanced SCADA IDS Security by Using MSOM Hybrid Unsupervised Algorithm. Int. J. Web-Based Learn. Teach. Technol. 2022, 17, 1–9.
86. Khadidos, A.O.; Manoharan, H.; Selvarajan, S.; Khadidos, A.O.; Alyoubi, K.H.; Yafoz, A. A Classy Multifacet Clustering and Fused Optimization Based Classification Methodologies for SCADA Security. Energies 2022, 15, 3624.
87. Kwon, H.-Y.; Kim, T.; Lee, M.-K. Advanced Intrusion Detection Combining Signature-Based and Behavior-Based Detection Methods. Electronics 2022, 11, 867.
88. Song, J.Y.; Paul, R.; Yun, J.H.; Kim, H.C.; Choi, Y.J. CNN-based anomaly detection for packet payloads of industrial control system. Int. J. Sens. Netw. 2021, 36, 36–49.
89. Wang, C.; Liu, H.; Sun, Y.; Wei, Y.; Wang, K.; Wang, B. Dimension Reduction Technique Based on Supervised Autoencoder for Intrusion Detection of Industrial Control Systems. Secur. Commun. Netw. 2022, 2022, 5713074.
90. Durairaj, D.; Venkatasamy, T.K.; Mehbodniya, A.; Umar, S.; Alam, T. Intrusion detection and mitigation of attacks in microgrid using enhanced deep belief network. Energy Sources, Part A Recover. Util. Environ. Eff. 2022, 1–23.
91. Chen, J.; Gao, X.; Deng, R.; He, Y.; Fang, C.; Cheng, P. Generating Adversarial Examples Against Machine Learning-Based Intrusion Detector in Industrial Control Systems. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1810–1825.
92. Panagiotis, F.; Taxiarxchis, K.; Georgios, K.; Maglaras, L.; Ferrag, M.A. Intrusion Detection in Critical Infrastructures: A Literature Review. Smart Cities 2021, 4, 1146–1157.
93. Yadav, G.; Paul, K. Architecture and security of SCADA systems: A review. Int. J. Crit. Infrastruct. Prot. 2021, 34, 100433.
94. Jmila, H.; Ibn Khedher, M. Adversarial machine learning for network intrusion detection: A comparative study. Comput. Netw. 2022, 214, 109073.
95. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. Available online: https://github.com/MadryLab/cifar10_challenge (accessed on 2 December 2022).
96. Gao, R.; Liu, F.; Zhang, J.; Han, B.; Liu, T.; Niu, G.; Sugiyama, M. Maximum Mean Discrepancy Test is Aware of Adversarial Attacks. In Proceedings of the International Conference on Machine Learning, Virtual Event, 13–18 July 2020; Available online: http://arxiv.org/abs/2010.11415 (accessed on 17 November 2022).
97. Akhtar, N.; Mian, A. Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey. IEEE Access 2018, 6, 14410–14430.
98. Yurekten, O.; Demirci, M. Citadel: Cyber threat intelligence assisted defense system for software-defined networks. Comput. Netw. 2021, 191, 108013.
Ref | Name | Survey Area | IDS Specific | Methodological Approach | ML | IC or ICS Specific | Dataset Analysis |
---|---|---|---|---|---|---|---|
[11] | A Survey on Industrial Control System Testbeds and Datasets for Security Research | Security Research | x | x | x | ||
[5] | A Survey on Machine Learning Techniques for Cyber Security in the Last Decade | Cybersecurity | Own process for Article Selection | x | x | ||
[12] | A survey of network-based intrusion detection datasets | Intrusion Detection Dataset | x | ||||
[13] | Cybersecurity for industrial control systems: A survey | Cybersecurity | x | x | x | ||
[7] | Survey of intrusion detection systems: techniques, datasets, and challenges | Intrusion Detection Dataset | x | x | x | ||
[14] | A Survey of Security in SCADA Networks: Current Issues and Future Challenges | Cybersecurity | x | ||||
[15] | A Survey of Anomaly Detection in Industrial Wireless Sensor Networks with Critical Water System Infrastructure as a Case Study | Cybersecurity in industrial wireless sensor networks | x | x | x | ||
[16] | Survey on Intrusion Detection Systems based on Machine Learning Techniques for the Protection of Critical Infrastructure | Cybersecurity | x | x | x | x |
Methodology | Criteria | Results |
---|---|---|
Keywords | IDS, NID, Anomaly Detection Method, Signature Detection Method, Hybrid Detection Method, ML, AI, Deep Learning, CI, ICS, SCADA | More than 30,000 results, with considerable redundancy and many inaccurate results |
Keyword filter | IDS, ML, CI | 1396 document results |
1st filter | Published in the last five years (2018 to 2022) | 1192 document results |
2nd filter | Article, Conference Paper, Review, or Short Survey | 1091 document results |
3rd filter | Written in English or Spanish | 1079 document results |
4th filter | Focus on papers that specifically deal with IDSs, ML, and CI | 300 document results |
5th filter | Abstract analysis to filter the documents | 166 document results |
Abstract review | Deep analysis to select the documents that positively contribute to the survey | 98 document results * |
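The selection funnel in the table above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are copied from the table, while the shortened stage labels and the `funnel` helper are assumptions introduced here. It reports how many documents each filter removed and what share of the 1396 keyword-filtered results survived.

```python
# Counts copied from the methodology table; stage labels are shortened
# paraphrases and the helper name `funnel` is hypothetical.
stages = [
    ("Keyword filter (IDS, ML, CI)", 1396),
    ("1st filter: 2018 to 2022", 1192),
    ("2nd filter: document type", 1091),
    ("3rd filter: English or Spanish", 1079),
    ("4th filter: IDS, ML, and CI focus", 300),
    ("5th filter: abstract analysis", 166),
    ("Abstract review: final selection", 98),
]


def funnel(stages):
    """Return (label, count, removed, share_of_first_stage) per stage."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, previous - count, count / first))
        previous = count
    return rows


rows = funnel(stages)
for label, count, removed, share in rows:
    print(f"{label:<36} {count:>5}  removed {removed:>4}  kept {share:6.1%}")
```

Read this way, the topical 4th filter is the most selective step, removing 779 of the 1079 documents that reached it.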
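Classification | IDS Type |
---|---|
According to the scope | Host-based IDSs |
According to the scope | Network-based IDSs |
According to the methodology | Signature-based IDSs |
According to the methodology | Anomaly-based IDSs |
According to the methodology | Rule-based IDSs |
According to the methodology | Hybrid IDSs |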
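To make the methodology-based classification above more concrete, the following toy sketch contrasts a signature-based check, an anomaly-based check, and a hybrid of the two. Everything in it is an assumption chosen for illustration and is not taken from the surveyed works: the two byte-pattern signatures, the use of payload length as the only feature, and the three-standard-deviation threshold.

```python
from statistics import mean, stdev

# Hypothetical signatures: byte patterns assumed to indicate known attacks.
SIGNATURES = {
    b"' OR 1=1": "sql_injection",
    b"/etc/passwd": "path_traversal",
}


def signature_alerts(payload: bytes) -> list[str]:
    """Signature-based check: name every known pattern found in the payload."""
    return [name for pattern, name in SIGNATURES.items() if pattern in payload]


def fit_baseline(normal_sizes: list[float]) -> tuple[float, float]:
    """Learn a baseline (mean, standard deviation) from attack-free traffic."""
    return mean(normal_sizes), stdev(normal_sizes)


def is_anomalous(size: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Anomaly-based check: flag sizes far from the learned baseline."""
    mu, sigma = baseline
    return abs(size - mu) > threshold * sigma


def hybrid_verdict(payload: bytes, baseline: tuple[float, float]) -> str:
    """Hybrid check: try the signatures first, then the anomaly test."""
    alerts = signature_alerts(payload)
    if alerts:
        return "known attack: " + ", ".join(alerts)
    if is_anomalous(len(payload), baseline):
        return "anomaly"
    return "normal"


# Payload sizes (in bytes) assumed to come from normal operation.
baseline = fit_baseline([100, 110, 90, 105, 95])
print(hybrid_verdict(b"id=' OR 1=1", baseline))
print(hybrid_verdict(b"x" * 500, baseline))
print(hybrid_verdict(b"x" * 100, baseline))
```

The signature path can only recognize patterns it already knows, whereas the anomaly path can flag unseen behavior at the cost of false alarms whenever normal traffic drifts from the baseline. A hybrid detector combines the two to offset these weaknesses.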
Release Year | Dataset’s Name | Source |
---|---|---|
2005 | LBNL | [40] |
2009 | TWENTE | [41] |
2010 | MAWILab | [42] |
2011 | KYOTO | [43] |
2012 | TTUIDS | [44] |
2012 | ISCX | [45] |
2014 | SANTA | [46] |
2014 | SSENET(V2) | [47] |
2015 | ARCS | [48] |
2016 | DDoS | [49] |
2017 | NDSec-1 | [50] |
2018 | PUF | [51] |
2018 | UGR’16 | [52] |
2019 | SWAT | [53] |
2020 | MedBIoT | [54] |
2021 | MWS | [55] |
2022 | NF-UQ-NIDS (V2) | [38] |
2022 | NF-CSE-CIC-IDS (V2) | [38] |
2022 | NF-ToN-IoT | [38] |
2002–2020 | CAIDA | [56] |
Ref | Dataset | Dataset Date | Learning Model | Characteristics | Tested Algorithms | Accuracy | Precision | Recall | F1-Score | Other |
---|---|---|---|---|---|---|---|---|---|---|
[76] | NF-BoT-IoT (V2) | 2022 | Supervised learning | Ensemble models | Random forest decision tree classifiers | AUC: 1.0 | ||||
[33] | KDD 99 | 1999 | Naïve Bayes, decision tree classifiers | 0.998 | 0.998 | 0.998 | ||||
[78] | CTU-UNB | 2015 | Convolutional networks | Dilated convolutional neural networks (unsupervised pretraining and supervised fine-tuning) | 0.899 | 0.917 | 0.899 | 0.897 | ||
[35] | CICDS | 2017 | Convolutional neural networks | 0.992 | ||||||
ADFA-LD | 2009 | 0.953 | ||||||||
NSL-KDD | 2009 | 0.834 | ||||||||
[77] | CICDS | 2017 | Deep networks | Deep neural networks | 0.997 | |||||
NF UNSW-NB15 | 2022 | 0.970 | ||||||||
[79] | NSL-KDD | 2009 | 0.954 | 0.962 | 0.935 | |||||
[80] | CICDS | 2017 | Multi-layer perceptron | 0.77 | 0.83 | 0.76 | ||||
[36] | TRAbID | Decision trees | Decision tree (DoS) | 0.900 | FP (%): 0.00, FN (%): 19.84 | |||||
[81] | KDD CUP | 1999 | Decision tree, multi-objective DT pruning | 0.966 | 0.998 | |||||
[77] | NF UNSW-NB15 | 2022 | Random forest (multiclassification) | 0.917 | ||||||
[80] | CICDS | 2017 | Random forest | 0.98 | 0.97 | 0.97 | ||||
Adaboost | 0.77 | 0.84 | 0.77 | |||||||
ID3 | 0.98 | 0.98 | 0.98 | |||||||
[36] | TRAbID | 2017 | Bayesian networks | Naïve Bayes (DoS) | 0.833 | FP: 0.35%, FN: 36.99% | ||||
[80] | CICDS | 2017 | Naïve Bayes | 0.88 | 0.04 | 0.04 | ||||
[72] | NSL-KDD | 2009 | Likelihood naïve Bayes (PPGO-LNB) | 0.965 | 0.975 | Sensitivity: 0.965 Specificity: 0.964 | ||||
CICDS | 2017 | 0.999 | 0.999 | Sensitivity: 0.999 Specificity: 0.999 | ||||||
NF-BoT-IoT (V2) | 2022 | 0.999 | 0.999 | Sensitivity: 0.999 Specificity: 0.999 | ||||||
[80] | CICDS | 2017 | Generative models | Quadratic discriminant analysis | 0.97 | 0.88 | 0.92 | |||
[32] | CSIC | 2018 | Long short-term memory networks | Recurrent neural networks | 0.976 | 0.977 | 0.96 | |||
[82] | MAWI | 2022 | Deep recurrent neural networks | |||||||
[80] | CICDS | 2017 | Neighbor-based models | K-nearest neighbors | 0.96 | 0.96 | 0.96 | |||
[60] | NGIDS-DS | 2009 | Unsupervised learning | ODIN | 0.98 | 0.948 | 0.729 | MCC: 0.824 | ||
COF | 0.87 | 0.253 | 0.759 | MCC: 0.824 | ||||||
[83] | ISOT-CID | 2010 | Reinforcement learning | Deep networks | Double Deep Q-Networks | 0.9217 | AUC: 0.811 | |||
NSL-KDD | 2009 | 0.797 | AUC: 0.798 |
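The measures reported in the table above (accuracy, precision, recall or sensitivity, specificity, F1-score, and MCC) can all be derived from the four cells of a binary confusion matrix. The sketch below uses their standard definitions. The counts are hypothetical and do not come from any of the cited studies; they were chosen to mimic heavily imbalanced traffic, where attacks are rare.

```python
from math import sqrt


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(
            tp * tn - fp * fn,
            sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        ),
    }


# Hypothetical counts: 100 attack flows among 10,000 flows in total.
metrics = binary_metrics(tp=40, fp=20, fn=60, tn=9880)
for name, value in metrics.items():
    print(f"{name:<12} {value:.3f}")
```

With these assumed counts the accuracy is 0.992 even though the detector misses 60 of the 100 attacks (recall 0.4, F1-score 0.5). This is one reason to read accuracy together with recall, F1-score, or MCC when comparing IDS results across datasets with different class balances.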
Ref | Dataset | Tested Algorithms | Advantages | Disadvantages |
---|---|---|---|---|
[76] | NF-BoT-IoT (V2) | Ensemble model: Random forest decision tree classifiers | High prediction accuracy and stability, minimal misclassification | High complexity in algorithm design |
[33] | KDD 99 | Ensemble model: Naïve Bayes, decision tree classifiers | | Outdated dataset (1999). High complexity in algorithm design |
[78] | CTU-UNB | Dilated convolutional neural networks (unsupervised pretraining and supervised fine-tuning) | Well suited to large-scale networks, low detection time, feature extraction capability | Highly dependent on the relevance of the features |
[35] | CICDS | Convolutional neural networks | Context-aware feature extraction, the dataset contains network flows | High computing resource demand |
[35] | ADFA-LD | Convolutional neural networks | Context-aware feature extraction, host-based intrusion detection | Outdated dataset (2009). High computing resource demand |
[35] | NSL-KDD | Convolutional neural networks | Context-aware feature extraction | |
[77] | CICDS | Deep neural networks | Binary and multiclass classification, the dataset contains network flows | Complex model whose results are difficult to interpret, high computing resource demand |
[80] | CICDS | Deep network: Multi-layer perceptron | The dataset contains network flows | Long execution time with nonlinear problems |
[72] | NSL-KDD | Likelihood naïve Bayes (PPGO-LNB) | Low false positives, low computational cost, and low detection time | Outdated dataset (2009). The model was applied only to binary classification. It assumes that the variables are independent. |
[72] | CICDS | Likelihood naïve Bayes (PPGO-LNB) | The dataset contains network flows, low false positives, low computational cost, and low detection time | The model was applied only to binary classification. It assumes that the variables are independent. |
[72] | NF-BoT-IoT (V2) | Likelihood naïve Bayes (PPGO-LNB) | Low false positives, low computational cost, and low detection time | |
ICC | Dataset | Attacks | ML Techniques | Source |
---|---|---|---|---|
SCADA | CSE-CIC-IDS 2018 | “Bot, DDoS, DoS, SSH-Brute Force, FTP-Brute Force, Infiltration, Brute Force Web, Brute Force XSS, SQL Injection” | Multifaceted data clustering model; gradient descent spider monkey optimization-deep sequential long short-term memory | [86] |
SCADA | NSL-KDD | DoS, probe, R2L, U2R | Multifaceted data clustering model; gradient descent spider monkey optimization-deep sequential long short-term memory | [86] |
SCADA | BoT-IoT | “Information Gathering, DDoS, DoS, Information Theft” | Multifaceted data clustering model; gradient descent spider monkey optimization-deep sequential long short-term memory | [86] |
Water treatment system | SWAT | “Single Stage Single Point (SSSP), Single Stage Multi-Point (SSMP), Multi-Stage Single Point (MSSP), Multi-Stage Multi-Point (MSMP)” | Autoencoder neural network (modified) | [87] |
Water treatment system | SWAT | “36 attacks were carried out on communication links attacking different sensors/actuators aiming at one device or multiple devices and/or stages simultaneously”. | Convolutional neural networks (modified) | [88] |
Power system | Dataset developed by Mississippi State University and Oak Ridge National Laboratory | “Data injection, remote tripping command injection, and relay setting change”. | Supervised autoencoder and PCA algorithm | [89] |
Power system | Industrial Control System (ICS), Cyber Attack Datasets | “False Data Injection and Denial of Service attacks”. | Deep belief network | [90] |
 | Custom | “Injection attack, function code attack, and reconnaissance attack”. | GAN | [91] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Pinto, A.; Herrera, L.-C.; Donoso, Y.; Gutierrez, J.A. Survey on Intrusion Detection Systems Based on Machine Learning Techniques for the Protection of Critical Infrastructure. Sensors 2023, 23, 2415. https://doi.org/10.3390/s23052415