Supervised learning and pattern recognition is a crucial area of research in information retrieval, knowledge engineering, image processing, medical imaging, and intrusion detection. Numerous algorithms have been designed to address such complex application domains. Despite an enormous array of supervised classifiers, researchers are
[...] Read more.
Supervised learning and pattern recognition is a crucial area of research in information retrieval, knowledge engineering, image processing, medical imaging, and intrusion detection. Numerous algorithms have been designed to address such complex application domains. Despite an enormous array of supervised classifiers, researchers are yet to recognize a robust classification mechanism that accurately and quickly classifies the target dataset, especially in the field of intrusion detection systems (IDSs). Most of the existing literature considers the accuracy and false-positive rate for assessing the performance of classification algorithms. The absence of other performance measures, such as model build time, misclassification rate, and precision, should be considered the main limitation for classifier performance evaluation. This paper’s main contribution is to analyze the current literature status in the field of network intrusion detection, highlighting the number of classifiers used, dataset size, performance outputs, inferences, and research gaps. Therefore, fifty-four state-of-the-art classifiers of various different groups, i.e., Bayes, functions, lazy, rule-based, and decision tree, have been analyzed and explored in detail, considering the sixteen most popular performance measures. This research work aims to recognize a robust classifier, which is suitable for consideration as the base learner, while designing a host-based or network-based intrusion detection system. The NSLKDD, ISCXIDS2012, and CICIDS2017 datasets have been used for training and testing purposes. Furthermore, a widespread decision-making algorithm, referred to as Techniques for Order Preference by Similarity to the Ideal Solution (TOPSIS), allocated ranks to the classifiers based on observed performance reading on the concern datasets. The J48Consolidated provided the highest accuracy of 99.868%, a misclassification rate of 0.1319%, and a Kappa value of 0.998. Therefore, this classifier has been proposed as the ideal classifier for designing IDSs.
Full article