High Density Sensor Networks Intrusion Detection System for Anomaly Intruders Using the Slime Mould Algorithm
Round 1
Reviewer 1 Report
The paper presents a new IDS method applicable to WSN. I am suggesting a merge of Figure 3 and Figure 5 as well as Figure 8 and Figure 9. Also, an extension of the reference list is required. I am suggesting the addition of papers related to IDS from the MDPI Symmetry/MDPI Electronics journal as well as more references from other prestigious journals. It would be extremely interesting to see what are the weaknesses of the SMA algorithm. Could anything lead it in the wrong direction? Does the attacker have an advantage if he knows that particular system is being used? Can the anomalies be used against the system, i.e. what is the possible scenario? Please make the analysis and discussion. The conclusion should be extended and rewritten. For instance, the phrase 'the confusion matrix measures show significant results' does not mean anything. Please clarify and explain in more detail.
Author Response
- Comments by Reviewer 1
- Opening Remarks
The paper presents a new IDS method applicable to WSN.
Response/Action taken: The authors are humbled by the kind comments from the reviewer. Guided by his/her feedback we have invested a lot of effort to improve the previous submission. We are forever grateful for the criticisms, patience, and feedback.
- Comment 1
“I am suggesting a merge of Figure 3 and Figure 5 as well as Figure 8 and Figure 9.”
Response/Action taken: Thanks for your comments. In the revised version we have merged Figure 3 with Figure 5; they are now one figure, numbered Figure 3. Figure 8 has been merged with Figure 9, and they are now Figure 7.
- Comment 2
“Also, an extension of the reference list is required. I am suggesting the addition of papers related to IDS from the MDPI Symmetry/MDPI Electronics journal as well as more references from other prestigious journals.”
Response/Action taken: Thanks for your comments. In the revised version we have extended the literature review by adding eight more IDS-related references:
[26] D. N. Mhawi, A. Aldallal, and S. Hassan, "Advanced Feature-Selection-Based Hybrid Ensemble Learning Algorithms for Network Intrusion Detection Systems," Symmetry, vol. 14, p. 1461, 2022.
[27] H. Han, H. Kim, and Y. Kim, "An Efficient Hyperparameter Control Method for a Network Intrusion Detection System Based on Proximal Policy Optimization," Symmetry, vol. 14, p. 161, 2022.
[28] F. B. Saghezchi, G. Mantas, M. A. Violas, A. M. de Oliveira Duarte, and J. Rodriguez, "Machine Learning for DDoS Attack Detection in Industry 4.0 CPPSs," Electronics, vol. 11, p. 602, 2022.
[29] E. Jaw and X. Wang, "Feature Selection and Ensemble-Based Intrusion Detection System: An Efficient and Comprehensive Approach," Symmetry, vol. 13, p. 1764, 2021.
[30] A. Adnan, A. Muhammed, A. A. Abd Ghani, A. Abdullah, and F. Hakim, "An Intrusion Detection System for the Internet of Things Based on Machine Learning: Review and Challenges," Symmetry, vol. 13, p. 1011, 2021.
[31] S. Alabdulwahab and B. Moon, "Feature Selection Methods Simultaneously Improve the Detection Accuracy and Model Building Time of Machine Learning Classifiers," Symmetry, vol. 12, p. 1424, 2020.
[32] R. Abdulhammed, H. Musafer, A. Alessa, M. Faezipour, and A. Abuzneid, "Features Dimensionality Reduction Approaches for Machine Learning Based Network Intrusion Detection," Electronics, vol. 8, p. 322, 2019.
[33] A.-U.-H. Qureshi, H. Larijani, N. Mtetwa, A. Javed, and J. Ahmad, "RNN-ABC: A New Swarm Optimization Based Technique for Anomaly Detection," Computers, vol. 8, p. 59, 2019.
Accordingly, the Introduction section now involves the following new paragraphs:
The authors in [26] offer a feature selection strategy for choosing the best features to retain. The selected subsets are then passed to the proposed hybrid ensemble learning, which uses minimal computational and time resources while increasing the IDS's stability and accuracy. The work aims to: reduce the dimensionality of the CICIDS2017 dataset by combining Correlation Feature Selection and Forest Panelized Attributes (CFS–FPA); determine the optimal machine learning strategy for aggregating the four modified classifiers (SVM, Random Forest, Naïve Bayes, and K-Nearest Neighbor (KNN)); validate the effectiveness of the hybrid ensemble scheme; compare CFS–FPA with other feature selection methods in terms of accuracy, detection, and false alarms; and evaluate how each classification technique performs before and after being modified with the AdaBoosting technique. The results are then used to extend the effectiveness of the suggested feature selection method, and the recommended methodology is evaluated against alternative available options.
Proximal policy optimization (PPO) is used in [27] to build an intrusion detection hyperparameter control system (IDHCS) that trains a deep neural network (DNN) feature extractor and a k-means clustering component. Intruders are detected with k-means clustering, while the IDHCS controls the DNN feature extractor so that the most useful attributes are extracted from the network traffic. Performance improvements tailored to the network environment in which the IDHCS is deployed are achieved automatically through iterative learning with a PPO-based reinforcement learning approach. To test the efficacy of the method, the authors ran simulations on the CICIDS2017 and UNSW-NB15 datasets, attaining F1-scores of 0.96552 on CICIDS2017 and 0.94268 on UNSW-NB15. As a further test, they combined the two datasets to create a larger, more realistic scenario, which increased the variety and complexity of the attacks. Compared with the individual CICIDS2017 and UNSW-NB15 results, the combined dataset yielded an F1-score of 0.93567, corresponding to between 97% and 99% of the individual performance. The results demonstrate that the proposed IDHCS improves IDS effectiveness by automatically learning new forms of attack through the management of intrusion detection attributes, independent of changes in the network environment.
In [28], the researchers present an ML-based IDS for detecting DDoS attacks in a practical industrial setting. They collect benign data, such as network traffic statistics, from Infineon's semiconductor manufacturing plants, and mine the DDoSDB database maintained by the University of Twente for fingerprints and samples of real-world DDoS attacks. They train eight supervised learning algorithms, including LR, NB, Bayesian Network (BN), KNN, DT, and RF, employing feature selection techniques such as Principal Component Analysis (PCA) to lower the dimensionality of the input. The authors then investigate one semi-supervised/statistical classifier, the univariate Gaussian algorithm, and two unsupervised learning algorithms, simple K-Means and Expectation-Maximization (EM). To the best of the researchers' knowledge, this is the first study to use data collected from an actual factory to build ML models for identifying DDoS attacks in industry; earlier works applying ML to DDoS detection in Operational Technology (OT) networks relied on data that was either synthesized for the purpose or collected from an IT network.
Jaw and Wang [29] introduced a hybrid feature selection (HFS) method combined with an ensemble classifier. The method selects relevant features effectively and provides a classification suited to the attack. It outperformed all of the selected individual classification methods, state-of-the-art FS approaches, and several existing IDS methodologies, achieving accuracies of 99.99%, 99.73%, and 99.997%, and detection rates of 99.75%, 96.64%, and 99.93% for CIC-IDS2017, NSL-KDD, and UNSW-NB15, respectively. These results, however, depend on the 11, 8, and 13 relevant attributes chosen from the respective datasets.
In [30], the researchers review IDSs from the point of view of ML. They discuss the three primary obstacles an IDS faces in general, as well as the difficulties specific to an IDS for the Internet of Things (IoT): concept drift, high dimensionality, and high computational cost. The directions of ongoing research and studies aimed at addressing each difficulty are discussed. In addition, the authors devote a separate section of the study to datasets associated with IDSs, in particular the KDD99, NSL, and Kyoto datasets. The article concludes that concept drift, the elevated number of features, and computational awareness are three aspects that are symmetric in their impact and need to be resolved in the neural-network-based concept for an IDS in the IoT.
Using the Weka tool and 10-fold cross-validation with and without feature selection/reduction techniques, the authors of [31] assessed six supervised classifiers on the entire NSL-KDD training dataset. The study's authors set out to find new ways to improve upon and protect classifiers that boast the best detection accuracy and the quickest model-building times. It is clear from the results that using a feature selection/reduction technique, such as the wrapper method in conjunction with the discretize filter, the filter method in conjunction with the discretize filter, or just the discretize filter, can drastically cut down on the time spent on model construction without sacrificing detection precision.
Auto-Encoder (AE) and PCA are the two methods that Abdulhammed et al. [32] use to reduce the dimensionality of the features. The low-dimensional features produced by either method are then used to design an intrusion detection system by feeding them into different classifiers, such as RF, Bayesian Networks, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis. The experiments with binary and multi-class categorization using the low-dimensional features demonstrate higher performance in terms of Detection Rate, F-Measure, False Alarm Rate, and Accuracy. This work succeeded in lowering the number of feature dimensions in the CICIDS2017 dataset from 81 to 10 while still achieving an accuracy rate of 99.6% in both multi-class and binary classification. A new IDS based on a random neural network and the artificial bee colony algorithm (RNN-ABC) is proposed in [33]. The model is trained and evaluated on the standard NSL-KDD dataset. The proposed RNN-ABC is compared with a conventional RNN trained with the gradient descent approach on a variety of metrics, and it generally maintains an accuracy rate of 95.02%.
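For concreteness, a minimal PCA sketch in the spirit of the 81-to-10 reduction described for [32] is given below; the synthetic data, the scaling step, and the parameter choices are illustrative assumptions of ours, not the code of [32].
```python
# Illustrative sketch only (not the code of [32]): reducing an 81-feature matrix
# to 10 principal components, mirroring the 81 -> 10 reduction described above.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the CICIDS2017 feature matrix (81 features).
X, y = make_classification(n_samples=1000, n_features=81, random_state=0)

X_scaled = StandardScaler().fit_transform(X)          # PCA is scale-sensitive
X_reduced = PCA(n_components=10).fit_transform(X_scaled)
print(X_reduced.shape)                                 # (1000, 10)
```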
- Comment 3
“It would be extremely interesting to see what are the weaknesses of the SMA algorithm. Could anything lead it in the wrong direction? Does the attacker have an advantage if he knows that particular system is being used? Can the anomalies be used against the system, i.e. what is the possible scenario? Please make the analysis and discussion.”
Response/Action taken: Thanks for your comments. In the revised version we have explained the weaknesses of the SMA algorithm in the first paragraph of the Results and Discussion section, as follows:
One of the noticeable weaknesses of the SMA algorithm is premature convergence. To overcome this problem, the simulation parameters have to be set carefully. Furthermore, the data should be pre-processed, for example normalized and cleaned; otherwise, the algorithm will suffer from premature convergence. Extensive simulations were conducted to find the best settings for our final experiments. One of the most important parameters set during this phase is K, the number of neighbors of the KNN classifier. Using the SMA algorithm is advantageous because it is a nature-inspired approach that suits the FS portion of the IDS. However, the aim of using the SMA algorithm is to reduce the number of attributes, while classification is handled by the other classifiers, DT and SVM. Even when a novel attacker appears, the SMA algorithm can recognize anomalies based on the selected features; in other words, the selected features are updated whenever new anomalies arise.
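As a purely illustrative sketch (not the code used in the paper), a KNN-based fitness function of the kind such a wrapper-style feature selection could evaluate might look as follows; the parameter names and the accuracy/subset-size weighting are assumptions of this sketch.
```python
# Hedged sketch: a KNN-based fitness function that an SMA-style wrapper search
# could call for each candidate feature subset. Not the authors' implementation;
# k and alpha are illustrative assumptions.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def subset_fitness(mask, X, y, k=5, alpha=0.99):
    """Score a binary feature mask: favour high KNN accuracy and few features."""
    selected = np.flatnonzero(mask)
    if selected.size == 0:                      # penalize the empty subset
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=k)   # K is the parameter tuned above
    acc = cross_val_score(knn, X[:, selected], y, cv=5).mean()
    # alpha trades classification accuracy against the size of the subset
    return alpha * acc + (1.0 - alpha) * (1.0 - selected.size / X.shape[1])
```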
- Comment 4
“The conclusion should be extended and rewritten. For instance, the phrase 'the confusion matrix measures show significant results' does not mean anything. Please clarify and explain in more detail.”
Response/Action taken: Thanks for your comments. In the revised version the conclusion section has been rewritten and made more detailed; it now includes the following lines:
…, in other words, the false positive rate, false negative rate, true positive rate, and true negative rate are improved significantly compared with other works, even though only five features were selected using the SMA algorithm (‘Service’, ‘src_bytes’, ‘Hot’, ‘serror_rate’, and ‘dst_host_srv_serror_rate’) and the measures are obtained from these five features only. The accuracy was 99.39%, the error rate reached only 0.61%, the overall sensitivity was 99.36% and the specificity was 99.42%, while the FPR was reduced to 0.58% and the F-measure was 99.34%. It is worth mentioning that the SMA algorithm may suffer from premature convergence if it is not set carefully; in other words, the control parameters of the SMA algorithm should be selected properly before it is deployed in the IDS. As future work, we suggest applying the presented methodology to different datasets and to multi-class classification tasks.
Author Response File: Author Response.pdf
Reviewer 2 Report
I.
The abstract has too many words, exactly 247, which is 22% more than suggested in the instructions for authors. Also, it should be an objective representation of the paper following the recommended structure for an article:
1) Background: in which the questions addressed should be placed in a broad context and the purpose of the study and its contribution to scientific knowledge should be highlighted;
2) Methods: in which the main methods and materials applied should be described briefly;
3) Results: in which the paper's main findings should be given;
4) Conclusion: in which the main conclusions should be given.
The authors did not do this.
II.
The paper is not organized in the recommended structure given in the journal's Instructions for Authors; because of that, it has an unnecessarily large number of sections and is difficult to understand.
Sections 2, 3 and 4 could be one Section 2, Materials and Methods, with two subsections:
2.1. NSL-KDD Dataset Description, with the content currently given in Section 3, and
2.2. Methods, with two subsubsections:
2.2.1. Slime Mould Algorithm Mathematical Model, with the content currently given in Section 2, and
2.2.2. Classification Tools and Evaluation Parameters, with the content currently given in Section 4.
The remaining sections of the article should be renumbered as follows, i.e.
3. Results and Discussion,
5. Conclusion
III.
The following are missing from the Results and Discussion section:
- a comprehensive explanation of the evaluation of the proposed model,
- a comparison of the proposed model with other, similar machine learning methodologies,
- the limitations of the proposed model.
IV.
The Conclusion section is inadmissibly short:
- without a clear explanation of the paper's contributions, and
- without a description of possible future work on the subject of this paper.
V.
For the journal in which the article is to be published, the number of annotated references is far lower than necessary, and there is also a lack of a review of today's most modern ensembles of machine learning methodologies and their use for solving the considered problem.
Author Response
- Comment 1
“The abstract has too many words, exactly 247, which is 22% more than suggested in the instructions for authors. Also, it should be an objective representation of the paper following the recommended structure for an article:
1) Background: in which the questions addressed should be placed in a broad context and the purpose of the study and its contribution to scientific knowledge should be highlighted;
2) Methods: in which the main methods and materials applied should be described briefly;
3) Results: in which the paper's main findings should be given;
4) Conclusion: in which the main conclusions should be given.
The authors did not do this.”
Response/Action taken: Thanks for your comments. In the revised version the abstract has been re-formed completely according to your valuable comments.
- Comment 2
“The paper is not organized in the recommended structure given in the journal's Instructions for Authors; because of that, it has an unnecessarily large number of sections and is difficult to understand.
Sections 2, 3 and 4 could be one Section 2, Materials and Methods, with two subsections:
2.1. NSL-KDD Dataset Description, with the content currently given in Section 3, and
2.2. Methods, with two subsubsections:
2.2.1. Slime Mould Algorithm Mathematical Model, with the content currently given in Section 2, and
2.2.2. Classification Tools and Evaluation Parameters, with the content currently given in Section 4.
The remaining sections of the article should be renumbered as follows, i.e.
- Results and Discussion,
5. Conclusion”
Response/Action taken: Thanks for your comments. In the revised version the section-titles are revised according to your comments as follows:
Section 2 became
- Materials and Methods
The dataset employed in this article is described in this section. Furthermore, the methodology used for the FS process is discussed, and the classification tools are also presented.
2.1 NSL-KDD Dataset Description
2.2 Methods
2.2.1. Slime Mould Algorithm Mathematical Model
2.2.2. Classification Tools and Evaluation Parameters
- Results and Discussion
- Conclusion
- Comment 3
“The following are missing from the Results and Discussion section:
- a comprehensive explanation of the evaluation of the proposed model,
- a comparison of the proposed model with other, similar machine learning methodologies,
- the limitations of the proposed model.”
Response/Action taken: Thanks for your comments. In the revised version of our paper, the results section has been corrected according to your valued comments. For instance, we have added the following paragraphs at the beginning of the section:
One of the noticeable weaknesses of the SMA algorithm is premature convergence. To overcome this problem, the simulation parameters have to be set carefully. Furthermore, the data should be pre-processed, for example normalized and cleaned; otherwise, the algorithm will suffer from premature convergence. Extensive simulations were conducted to find the best settings for our final experiments. One of the most important parameters set during this phase is K, the number of neighbors of the KNN classifier. Using the SMA algorithm is advantageous because it is a nature-inspired approach that suits the FS portion of the IDS. However, the aim of using the SMA algorithm is to reduce the number of attributes, while classification is handled by the other classifiers, DT and SVM. Even when a novel attacker appears, the SMA algorithm can recognize anomalies based on the selected features; in other words, the selected features are updated whenever new anomalies arise.
The KNN algorithm is a straightforward example of the supervised learning subfield of ML. The KNN method assumes a degree of correspondence between a new instance and the existing examples and places the instance in the class to which it is most closely related. The KNN algorithm stores all the available data and then uses it to determine how to categorize a new data point; this means the method can quickly and accurately categorize newly arriving data into a set of predefined categories. While the KNN technique has certain applications in regression, it is most commonly used for classification. KNN is a non-parametric method with respect to the underlying data: in the training phase, the algorithm simply saves the dataset, and when it receives new data, it assigns it to the class that best fits it.
SVM is widely used for both classification and regression tasks, making it one of the most well-known supervised learning methods, although its most common application is in classification problems. The objective of the SVM algorithm is to find the optimal line or decision boundary that divides the n-dimensional space into classes, so that a new data point can easily be assigned to the correct class. This optimal decision boundary is described by a hyperplane. To create the hyperplane, SVM selects the most extreme points (support vectors); the name "Support Vector Machine" reflects the fact that the technique is built around these extreme cases.
The DT algorithm takes its name from the tree-like structure it builds, beginning with a single root node and branching out from there. While DT, like other supervised learning methods, can be applied to regression tasks, it is most commonly used to address classification problems. It is a tree-structured classifier in which internal nodes represent the attributes of a dataset, branches represent the decision rules, and leaf nodes represent the outcomes. The two types of nodes in a DT are decision nodes and leaf nodes: leaf nodes are the results of previous decisions and have no further branches, while decision nodes have multiple branches that are used to reach those decisions. The attributes of the given dataset are used to make a decision or perform a test, and the tree provides a visual way of enumerating every feasible option from a given initial state.
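As a small, purely illustrative sketch of these three classifiers (not the experimental setup of the paper), the following snippet trains each of them on synthetic binary data; the data generation, the split, and the hyperparameters are assumptions made only for this example.
```python
# Hedged sketch: KNN, polynomial-kernel SVM, and DT classifiers on synthetic
# binary data standing in for pre-processed NSL-KDD features.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM (polynomial kernel)": SVC(kernel="poly", degree=3),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, round(accuracy_score(y_te, model.predict(X_te)), 4))
```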
Now, for the comparison, we have added the following lines in the results section, which are highlighted in yellow color:
Although the accuracy of our proposed methodology is lower than that obtained in [29], which was 99.73%, by 0.34% and 3.26% for the DT and the SVM-Polynomial, respectively, the higher accuracy in [29] comes at the expense of a larger number of selected features: the numbers of selected features there were 8, 11, and 13, whereas our approach uses only five features.
Our sensitivity results are close to those reported in [29], but, as explained above, the cost there is the larger number of features. Note that as the number of features increases, the training time, the computational complexity, and the required memory also increase. Thus, the sensitivity obtained with the five selected features is respectable.
- Comment 4
“The Conclusion section is inadmissibly short:
- without a clear explanation of the paper's contributions, and
- without a description of possible future work on the subject of this paper.”
Response/Action taken: Thanks for your comments. The conclusion section was expanded and we have added the following lines accordingly:
…, in other words, the false positive rate, false negative rate, true positive rate, and true negative rate are improved significantly compared with other works, even though only five features were selected using the SMA algorithm (‘Service’, ‘src_bytes’, ‘Hot’, ‘serror_rate’, and ‘dst_host_srv_serror_rate’) and the measures are obtained from these five features only. The accuracy was 99.39%, the error rate reached only 0.61%, the overall sensitivity was 99.36% and the specificity was 99.42%, while the FPR was reduced to 0.58% and the F-measure was 99.34%. It is worth mentioning that the SMA algorithm may suffer from premature convergence if it is not set carefully; in other words, the control parameters of the SMA algorithm should be selected properly before it is deployed in the IDS. As future work, we suggest applying the presented methodology to different datasets and to multi-class classification tasks.
- Comment 5
“For the journal in which the article is to be published, the number of annotated references is far lower than necessary, and there is also a lack of a review of today's most modern ensembles of machine learning methodologies and their use for solving the considered problem.”
Response/Action taken: Thanks for your comments. We have revised the literature review; accordingly, eight more references have been added, as follows:
[26] D. N. Mhawi, A. Aldallal, and S. Hassan, "Advanced Feature-Selection-Based Hybrid Ensemble Learning Algorithms for Network Intrusion Detection Systems," Symmetry, vol. 14, p. 1461, 2022.
[27] H. Han, H. Kim, and Y. Kim, "An Efficient Hyperparameter Control Method for a Network Intrusion Detection System Based on Proximal Policy Optimization," Symmetry, vol. 14, p. 161, 2022.
[28] F. B. Saghezchi, G. Mantas, M. A. Violas, A. M. de Oliveira Duarte, and J. Rodriguez, "Machine Learning for DDoS Attack Detection in Industry 4.0 CPPSs," Electronics, vol. 11, p. 602, 2022.
[29] E. Jaw and X. Wang, "Feature Selection and Ensemble-Based Intrusion Detection System: An Efficient and Comprehensive Approach," Symmetry, vol. 13, p. 1764, 2021.
[30] A. Adnan, A. Muhammed, A. A. Abd Ghani, A. Abdullah, and F. Hakim, "An Intrusion Detection System for the Internet of Things Based on Machine Learning: Review and Challenges," Symmetry, vol. 13, p. 1011, 2021.
[31] S. Alabdulwahab and B. Moon, "Feature Selection Methods Simultaneously Improve the Detection Accuracy and Model Building Time of Machine Learning Classifiers," Symmetry, vol. 12, p. 1424, 2020.
[32] R. Abdulhammed, H. Musafer, A. Alessa, M. Faezipour, and A. Abuzneid, "Features Dimensionality Reduction Approaches for Machine Learning Based Network Intrusion Detection," Electronics, vol. 8, p. 322, 2019.
[33] A.-U.-H. Qureshi, H. Larijani, N. Mtetwa, A. Javed, and J. Ahmad, "RNN-ABC: A New Swarm Optimization Based Technique for Anomaly Detection," Computers, vol. 8, p. 59, 2019.
Accordingly, the Introduction section now involves the following new paragraphs:
The authors in [26] offer a feature selection strategy for choosing the best features to retain. The selected subsets are then passed to the proposed hybrid ensemble learning, which uses minimal computational and time resources while increasing the IDS's stability and accuracy. The work aims to: reduce the dimensionality of the CICIDS2017 dataset by combining Correlation Feature Selection and Forest Panelized Attributes (CFS–FPA); determine the optimal machine learning strategy for aggregating the four modified classifiers (SVM, Random Forest, Naïve Bayes, and K-Nearest Neighbor (KNN)); validate the effectiveness of the hybrid ensemble scheme; compare CFS–FPA with other feature selection methods in terms of accuracy, detection, and false alarms; and evaluate how each classification technique performs before and after being modified with the AdaBoosting technique. The results are then used to extend the effectiveness of the suggested feature selection method, and the recommended methodology is evaluated against alternative available options.
Proximal policy optimization (PPO) is used in [27] to build an intrusion detection hyperparameter control system (IDHCS) that trains a deep neural network (DNN) feature extractor and a k-means clustering component. Intruders are detected with k-means clustering, while the IDHCS controls the DNN feature extractor so that the most useful attributes are extracted from the network traffic. Performance improvements tailored to the network environment in which the IDHCS is deployed are achieved automatically through iterative learning with a PPO-based reinforcement learning approach. To test the efficacy of the method, the authors ran simulations on the CICIDS2017 and UNSW-NB15 datasets, attaining F1-scores of 0.96552 on CICIDS2017 and 0.94268 on UNSW-NB15. As a further test, they combined the two datasets to create a larger, more realistic scenario, which increased the variety and complexity of the attacks. Compared with the individual CICIDS2017 and UNSW-NB15 results, the combined dataset yielded an F1-score of 0.93567, corresponding to between 97% and 99% of the individual performance. The results demonstrate that the proposed IDHCS improves IDS effectiveness by automatically learning new forms of attack through the management of intrusion detection attributes, independent of changes in the network environment.
In [28], the researchers present an ML-based IDS for detecting DDoS attacks in a practical industrial setting. They collect benign data, such as network traffic statistics, from Infineon's semiconductor manufacturing plants, and mine the DDoSDB database maintained by the University of Twente for fingerprints and samples of real-world DDoS attacks. They train eight supervised learning algorithms, including LR, NB, Bayesian Network (BN), KNN, DT, and RF, employing feature selection techniques such as Principal Component Analysis (PCA) to lower the dimensionality of the input. The authors then investigate one semi-supervised/statistical classifier, the univariate Gaussian algorithm, and two unsupervised learning algorithms, simple K-Means and Expectation-Maximization (EM). To the best of the researchers' knowledge, this is the first study to use data collected from an actual factory to build ML models for identifying DDoS attacks in industry; earlier works applying ML to DDoS detection in Operational Technology (OT) networks relied on data that was either synthesized for the purpose or collected from an IT network.
Jaw and Wang [29] introduced a hybrid feature selection (HFS) method combined with an ensemble classifier. The method selects relevant features effectively and provides a classification suited to the attack. It outperformed all of the selected individual classification methods, state-of-the-art FS approaches, and several existing IDS methodologies, achieving accuracies of 99.99%, 99.73%, and 99.997%, and detection rates of 99.75%, 96.64%, and 99.93% for CIC-IDS2017, NSL-KDD, and UNSW-NB15, respectively. These results, however, depend on the 11, 8, and 13 relevant attributes chosen from the respective datasets.
In [30], the researchers review IDSs from the point of view of ML. They discuss the three primary obstacles an IDS faces in general, as well as the difficulties specific to an IDS for the Internet of Things (IoT): concept drift, high dimensionality, and high computational cost. The directions of ongoing research and studies aimed at addressing each difficulty are discussed. In addition, the authors devote a separate section of the study to datasets associated with IDSs, in particular the KDD99, NSL, and Kyoto datasets. The article concludes that concept drift, the elevated number of features, and computational awareness are three aspects that are symmetric in their impact and need to be resolved in the neural-network-based concept for an IDS in the IoT.
Using the Weka tool and 10-fold cross-validation with and without feature selection/reduction techniques, the authors of [31] assessed six supervised classifiers on the entire NSL-KDD training dataset. The study's authors set out to find new ways to improve upon and protect classifiers that boast the best detection accuracy and the quickest model-building times. It is clear from the results that using a feature selection/reduction technique, such as the wrapper method in conjunction with the discretize filter, the filter method in conjunction with the discretize filter, or just the discretize filter, can drastically cut down on the time spent on model construction without sacrificing detection precision.
Auto-Encoder (AE) and PCA are the two methods that Abdulhammed et al. [32] use to reduce the dimensionality of the features. The low-dimensional features produced by either method are then used to design an intrusion detection system by feeding them into different classifiers, such as RF, Bayesian Networks, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis. The experiments with binary and multi-class categorization using the low-dimensional features demonstrate higher performance in terms of Detection Rate, F-Measure, False Alarm Rate, and Accuracy. This work succeeded in lowering the number of feature dimensions in the CICIDS2017 dataset from 81 to 10 while still achieving an accuracy rate of 99.6% in both multi-class and binary classification. A new IDS based on a random neural network and the artificial bee colony algorithm (RNN-ABC) is proposed in [33]. The model is trained and evaluated on the standard NSL-KDD dataset. The proposed RNN-ABC is compared with a conventional RNN trained with the gradient descent approach on a variety of metrics, and it generally maintains an accuracy rate of 95.02%.
Author Response File: Author Response.pdf
Reviewer 3 Report
In this paper, the authors considered security issues. They suggested the SMA algorithm for an Intrusion Detection System (IDS). In my opinion, developing methods for IDSs is significant because many threats await internet users, so we should improve existing methods or suggest new ones. The article is well written, but I found some points that should be improved:
- the authors should indicate their contributions in the Introduction,
- the authors should add a Related Works section and consider the newest papers connected with methods for IDSs,
- the Conclusions section is poor, and the authors should improve it; for example, they should add their plans for future research.
Author Response
- Opening Remarks
In this paper, the authors considered security issues. They suggested the SMA algorithm for an Intrusion Detection System (IDS). In my opinion, developing methods for IDSs is significant because many threats await internet users, so we should improve existing methods or suggest new ones. The article is well written.
Response/Action taken: The authors are humbled by the kind comments from the reviewer. Guided by his/her feedback we have invested a lot of effort to improve the previous submission. We are forever grateful for the criticisms, patience, and feedback.
- Comment 1
“the authors should indicate their contributions in the Introduction”
Response/Action taken: Thank you for your valuable comment. We have added an explanation of our contribution in the Introduction section and modified the earlier part of the explanation accordingly. Overall, the new paragraph reads as follows:
To the best of the authors' knowledge, the SMA algorithm has not previously been employed for FS in an IDS; that is, in this paper the SMA methodology is proposed for FS. The 41 features of the NSL-KDD dataset may include redundant or useless features, hence feature reduction using the SMA algorithm is used to reduce this number of features. To achieve this goal, a classifier must be integrated into the SMA technique; the KNN algorithm was chosen as the evaluation classifier within the SMA algorithm because it has the least complexity and the most straightforward integration, which keeps the design from becoming convoluted. Then, after the feature selection process is completed, an ensemble of classification methodologies is applied, consisting of an SVM with a polynomial kernel and the DT classification method.
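A minimal sketch of this pipeline is shown below for illustration only; it assumes the NSL-KDD records are already loaded into a pandas DataFrame with the listed column names and a binary 'label' column, and it is not the code used in the paper.
```python
# Hedged sketch of the proposed pipeline: keep only the SMA-selected features
# reported in the paper, then train the DT and polynomial-kernel SVM classifiers.
# The DataFrame layout, encoding, and split are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

SELECTED = ["Service", "src_bytes", "Hot", "serror_rate", "dst_host_srv_serror_rate"]

def train_on_selected(df: pd.DataFrame):
    X = pd.get_dummies(df[SELECTED])          # one-hot encode symbolic features
    y = df["label"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    dt = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    svm = SVC(kernel="poly", degree=3).fit(X_tr, y_tr)
    return dt.score(X_te, y_te), svm.score(X_te, y_te)

# Tiny synthetic example just to show the call shape (real use: NSL-KDD records).
demo = pd.DataFrame({
    "Service": ["http", "ftp", "http", "smtp"] * 50,
    "src_bytes": range(200),
    "Hot": [0, 1] * 100,
    "serror_rate": [0.0, 0.5] * 100,
    "dst_host_srv_serror_rate": [0.1, 0.9] * 100,
    "label": [0, 1] * 100,
})
print(train_on_selected(demo))
```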
- Comment 2
“the authors should add a Related Works section and consider newest papers connected with methods for IDSs”
Response/Action taken: Thanks for your comments. We have revised the literature review; accordingly, eight more references have been added, as follows:
[26] D. N. Mhawi, A. Aldallal, and S. Hassan, "Advanced Feature-Selection-Based Hybrid Ensemble Learning Algorithms for Network Intrusion Detection Systems," Symmetry, vol. 14, p. 1461, 2022.
[27] H. Han, H. Kim, and Y. Kim, "An Efficient Hyperparameter Control Method for a Network Intrusion Detection System Based on Proximal Policy Optimization," Symmetry, vol. 14, p. 161, 2022.
[28] F. B. Saghezchi, G. Mantas, M. A. Violas, A. M. de Oliveira Duarte, and J. Rodriguez, "Machine Learning for DDoS Attack Detection in Industry 4.0 CPPSs," Electronics, vol. 11, p. 602, 2022.
[29] E. Jaw and X. Wang, "Feature Selection and Ensemble-Based Intrusion Detection System: An Efficient and Comprehensive Approach," Symmetry, vol. 13, p. 1764, 2021.
[30] A. Adnan, A. Muhammed, A. A. Abd Ghani, A. Abdullah, and F. Hakim, "An Intrusion Detection System for the Internet of Things Based on Machine Learning: Review and Challenges," Symmetry, vol. 13, p. 1011, 2021.
[31] S. Alabdulwahab and B. Moon, "Feature Selection Methods Simultaneously Improve the Detection Accuracy and Model Building Time of Machine Learning Classifiers," Symmetry, vol. 12, p. 1424, 2020.
[32] R. Abdulhammed, H. Musafer, A. Alessa, M. Faezipour, and A. Abuzneid, "Features Dimensionality Reduction Approaches for Machine Learning Based Network Intrusion Detection," Electronics, vol. 8, p. 322, 2019.
[33] A.-U.-H. Qureshi, H. Larijani, N. Mtetwa, A. Javed, and J. Ahmad, "RNN-ABC: A New Swarm Optimization Based Technique for Anomaly Detection," Computers, vol. 8, p. 59, 2019.
Accordingly, the Introduction section now involves the following new paragraphs:
The authors in [26] offer a feature selection strategy for choosing the best features to retain. The selected subsets are then passed to the proposed hybrid ensemble learning, which uses minimal computational and time resources while increasing the IDS's stability and accuracy. The work aims to: reduce the dimensionality of the CICIDS2017 dataset by combining Correlation Feature Selection and Forest Panelized Attributes (CFS–FPA); determine the optimal machine learning strategy for aggregating the four modified classifiers (SVM, Random Forest, Naïve Bayes, and K-Nearest Neighbor (KNN)); validate the effectiveness of the hybrid ensemble scheme; compare CFS–FPA with other feature selection methods in terms of accuracy, detection, and false alarms; and evaluate how each classification technique performs before and after being modified with the AdaBoosting technique. The results are then used to extend the effectiveness of the suggested feature selection method, and the recommended methodology is evaluated against alternative available options.
Proximal policy optimization (PPO) is used in [27] to build an intrusion detection hyperparameter control system (IDHCS) that trains a deep neural network (DNN) feature extractor and a k-means clustering component. Intruders are detected with k-means clustering, while the IDHCS controls the DNN feature extractor so that the most useful attributes are extracted from the network traffic. Performance improvements tailored to the network environment in which the IDHCS is deployed are achieved automatically through iterative learning with a PPO-based reinforcement learning approach. To test the efficacy of the method, the authors ran simulations on the CICIDS2017 and UNSW-NB15 datasets, attaining F1-scores of 0.96552 on CICIDS2017 and 0.94268 on UNSW-NB15. As a further test, they combined the two datasets to create a larger, more realistic scenario, which increased the variety and complexity of the attacks. Compared with the individual CICIDS2017 and UNSW-NB15 results, the combined dataset yielded an F1-score of 0.93567, corresponding to between 97% and 99% of the individual performance. The results demonstrate that the proposed IDHCS improves IDS effectiveness by automatically learning new forms of attack through the management of intrusion detection attributes, independent of changes in the network environment.
In [28], the researchers present an ML-based IDS for detecting DDoS attacks in a practical industrial setting. They collect benign data, such as network traffic statistics, from Infineon's semiconductor manufacturing plants, and mine the DDoSDB database maintained by the University of Twente for fingerprints and samples of real-world DDoS attacks. They train eight supervised learning algorithms, including LR, NB, Bayesian Network (BN), KNN, DT, and RF, employing feature selection techniques such as Principal Component Analysis (PCA) to lower the dimensionality of the input. The authors then investigate one semi-supervised/statistical classifier, the univariate Gaussian algorithm, and two unsupervised learning algorithms, simple K-Means and Expectation-Maximization (EM). To the best of the researchers' knowledge, this is the first study to use data collected from an actual factory to build ML models for identifying DDoS attacks in industry; earlier works applying ML to DDoS detection in Operational Technology (OT) networks relied on data that was either synthesized for the purpose or collected from an IT network.
Jaw and Wang [29] introduced a hybrid feature selection (HFS) method combined with an ensemble classifier. The method selects relevant features effectively and provides a classification suited to the attack. It outperformed all of the selected individual classification methods, state-of-the-art FS approaches, and several existing IDS methodologies, achieving accuracies of 99.99%, 99.73%, and 99.997%, and detection rates of 99.75%, 96.64%, and 99.93% for CIC-IDS2017, NSL-KDD, and UNSW-NB15, respectively. These results, however, depend on the 11, 8, and 13 relevant attributes chosen from the respective datasets.
In [30], the researchers review IDSs from the point of view of ML. They discuss the three primary obstacles an IDS faces in general, as well as the difficulties specific to an IDS for the Internet of Things (IoT): concept drift, high dimensionality, and high computational cost. The directions of ongoing research and studies aimed at addressing each difficulty are discussed. In addition, the authors devote a separate section of the study to datasets associated with IDSs, in particular the KDD99, NSL, and Kyoto datasets. The article concludes that concept drift, the elevated number of features, and computational awareness are three aspects that are symmetric in their impact and need to be resolved in the neural-network-based concept for an IDS in the IoT.
Using the Weka tool and 10-fold cross-validation with and without feature selection/reduction techniques, the authors of [31] assessed six supervised classifiers on the entire NSL-KDD training dataset. The study's authors set out to find new ways to improve upon and protect classifiers that boast the best detection accuracy and the quickest model-building times. It is clear from the results that using a feature selection/reduction technique, such as the wrapper method in conjunction with the discretize filter, the filter method in conjunction with the discretize filter, or just the discretize filter, can drastically cut down on the time spent on model construction without sacrificing detection precision.
Auto-Encoder (AE) and PCA are the two methods that Abdulhammed et al. [32] use to reduce the dimensionality of the features. The low-dimensional features produced by either method are then used to design an intrusion detection system by feeding them into different classifiers, such as RF, Bayesian Networks, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis. The experiments with binary and multi-class categorization using the low-dimensional features demonstrate higher performance in terms of Detection Rate, F-Measure, False Alarm Rate, and Accuracy. This work succeeded in lowering the number of feature dimensions in the CICIDS2017 dataset from 81 to 10 while still achieving an accuracy rate of 99.6% in both multi-class and binary classification. A new IDS based on a random neural network and the artificial bee colony algorithm (RNN-ABC) is proposed in [33]. The model is trained and evaluated on the standard NSL-KDD dataset. The proposed RNN-ABC is compared with a conventional RNN trained with the gradient descent approach on a variety of metrics, and it generally maintains an accuracy rate of 95.02%.
- Comment 3
“the conclusions section is poor, and the authors should improve it; for example, they should add their plans for next research”
Response/Action taken: Thanks for your comments. We have revised the conclusion section as follows:
In this paper, the Slime Mould Algorithm (SMA) was proposed for feature reduction in a WSN intrusion detection system. The proposed algorithm reduced the number of features from 41 to only 5. Detection of attack records in the NSL-KDD dataset, an updated version of the KDD CUP 99 dataset, was performed using two algorithms, namely an SVM with a polynomial kernel and the DT algorithm. The SMA algorithm showed performance comparable to other studies, and the confusion matrix measures show significant improvements; in other words, the false positive rate, false negative rate, true positive rate, and true negative rate are improved significantly compared with other works, even though only five features were selected using the SMA algorithm (‘Service’, ‘src_bytes’, ‘Hot’, ‘serror_rate’, and ‘dst_host_srv_serror_rate’) and the measures are obtained from these five features only. The accuracy was 99.39%, the error rate reached only 0.61%, the overall sensitivity was 99.36% and the specificity was 99.42%, while the FPR was reduced to 0.58% and the F-measure was 99.34%. It is worth mentioning that the SMA algorithm may suffer from premature convergence if it is not set carefully; in other words, the control parameters of the SMA algorithm should be selected properly before it is deployed in the IDS. As future work, we suggest applying the presented methodology to different datasets and to multi-class classification tasks.
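For clarity, the confusion-matrix measures quoted above can be computed from raw counts as in the following sketch; the counts used in the example call are placeholders, not the paper's actual confusion matrix.
```python
# Hedged sketch: the confusion-matrix measures referred to in the conclusion,
# computed from placeholder counts (tp, tn, fp, fn are NOT the paper's values).
def confusion_metrics(tp, tn, fp, fn):
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    error_rate  = 1.0 - accuracy
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    fpr         = fp / (fp + tn)          # false positive rate = 1 - specificity
    precision   = tp / (tp + fp)
    f_measure   = 2 * precision * sensitivity / (precision + sensitivity)
    return dict(accuracy=accuracy, error_rate=error_rate, sensitivity=sensitivity,
                specificity=specificity, fpr=fpr, f_measure=f_measure)

print(confusion_metrics(tp=990, tn=985, fp=15, fn=10))   # placeholder counts
```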
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The paper should now be accepted.
Author Response
We are forever grateful for the criticisms, patience, and feedback.
Reviewer 2 Report
The authors accepted all my suggestions and corrected the paper according to them.
But the authors could:
1. add a larger number of references related to the methodology, and especially to the comparison of the proposed method with existing methods;
2. state the limitations of the proposed solution more clearly and explicitly.
Author Response
- Comment 1
“The abstract has too many words, exactly 247, which is 22% more than suggested in the instructions for authors. Also, it should be an objective representation of the paper following the recommended structure for an article:
1) Background: in which the questions addressed should be placed in a broad context and the purpose of the study and its contribution to scientific knowledge should be highlighted;
2) Methods: in which the main methods and materials applied should be described briefly;
3) Results: in which the paper's main findings should be given;
4) Conclusion: in which the main conclusions should be given.
The authors did not do this.”
Response/Action taken: Thanks for your comments. In the revised version the abstract has been re-formed completely according to your valuable comments.
- Comment 2
“The paper is not organized in the recommended structure given in the journal's Instructions for Authors; because of that, it has an unnecessarily large number of sections and is difficult to understand.
Sections 2, 3 and 4 could be one Section 2, Materials and Methods, with two subsections:
2.1. NSL-KDD Dataset Description, with the content currently given in Section 3, and
2.2. Methods, with two subsubsections:
2.2.1. Slime Mould Algorithm Mathematical Model, with the content currently given in Section 2, and
2.2.2. Classification Tools and Evaluation Parameters, with the content currently given in Section 4.
The remaining sections of the article should be renumbered as follows, i.e.
- Results and Discussion,
5. Conclusion”
Response/Action taken: Thanks for your comments. In the revised version the section-titles are revised according to your comments as follows:
Section 2 became
- Materials and Methods
The dataset employed in this article is described in this section. Furthermore, the methodology used for the FS process is discussed, and the classification tools are also presented.
2.1 NSL-KDD Dataset Description
2.2 Methods
2.2.1. Slime Mould Algorithm Mathematical Model
2.2.2. Classification Tools and Evaluation Parameters
- Results and Discussion
- Conclusion
- Comment 3
“The following are missing from the Results and Discussion section:
- a comprehensive explanation of the evaluation of the proposed model,
- a comparison of the proposed model with other, similar machine learning methodologies,
- the limitations of the proposed model.”
Response/Action taken: Thanks for your comments. In the revised version of our paper, the results section has been corrected according to your valued comments. For instance, we have added the following paragraphs at the beginning of the section:
First of all, the limitations and how to overcome them have been reconsidered in the revised version, in the first paragraph of the results section:
One of the noticeable weaknesses of the SMA algorithm is premature convergence [19]. SMA can also come with other downsides, for instance becoming confined to limited local regions and having an inappropriate balance between the exploitation and exploration phases of the search [41]. Nevertheless, to overcome these problems, the simulation parameters have to be set carefully. Furthermore, the data should be pre-processed, for example normalized and cleaned; otherwise, the algorithm will suffer from premature convergence. Extensive simulations were conducted to find the best settings for our final experiments.
One of the most important parameters set during this phase is K, the number of neighbors of the KNN classifier. Using the SMA algorithm is advantageous because it is a nature-inspired approach that suits the FS portion of the IDS. However, the aim of using the SMA algorithm is to reduce the number of attributes, while classification is handled by the other classifiers, DT and SVM. Even when a novel attacker appears, the SMA algorithm can recognize anomalies based on the selected features; in other words, the selected features are updated whenever new anomalies arise.
The KNN algorithm is a straightforward example of the supervised learning subfield of ML. The KNN method assumes a degree of correspondence between a new instance and the existing examples and places the instance in the class to which it is most closely related. The KNN algorithm stores all the available data and then uses it to determine how to categorize a new data point; this means the method can quickly and accurately categorize newly arriving data into a set of predefined categories. While the KNN technique has certain applications in regression, it is most commonly used for classification. KNN is a non-parametric method with respect to the underlying data: in the training phase, the algorithm simply saves the dataset, and when it receives new data, it assigns it to the class that best fits it.
SVM is widely used for both classification and regression tasks, making it one of the most well-known supervised learning methods, although its most common application is in classification problems. The objective of the SVM algorithm is to find the optimal line or decision boundary that divides the n-dimensional space into classes, so that a new data point can easily be assigned to the correct class. This optimal decision boundary is described by a hyperplane. To create the hyperplane, SVM selects the most extreme points (support vectors); the name "Support Vector Machine" reflects the fact that the technique is built around these extreme cases.
The DT algorithm takes its name from the tree-like structure it builds, beginning with a single root node and branching out from there. While DT, like other supervised learning methods, can be applied to regression tasks, it is most commonly used to address classification problems. It is a tree-structured classifier in which internal nodes represent the attributes of a dataset, branches represent the decision rules, and leaf nodes represent the outcomes. The two types of nodes in a DT are decision nodes and leaf nodes: leaf nodes are the results of previous decisions and have no further branches, while decision nodes have multiple branches that are used to reach those decisions. The attributes of the given dataset are used to make a decision or perform a test, and the tree provides a visual way of enumerating every feasible option from a given initial state.
More comparisons have been added to the revised version of our paper, as two more studies have been included for this purpose; in addition, for convenience, we have added Table 4, which summarizes the comparison. The following lines have been added to the revised version:
Although the accuracy of our proposed methodology is lower than that obtained in [29], which was 99.73%, by 0.34% and 3.26% for the DT and the SVM-Polynomial, respectively, the higher accuracy in [29] comes at the expense of a larger number of selected features: the numbers of selected features there were 8, 11, and 13, whereas our approach uses only five features.
Our sensitivity results are close to those reported in [29], but, as explained above, the cost there is the larger number of features. Note that as the number of features increases, the training time, the computational complexity, and the required memory also increase. Thus, the sensitivity obtained with the five selected features is respectable.
Moreover, the accuracy of the RF-based feature selection in [37] was 95.21% with an FPR of 1.57% when the number of features equals 10. On the other hand, in [23], the best accuracy achieved using the bat algorithm was 96.42% with an FPR of 0.98%; however, the number of selected features was 32. Table 4 compares our proposed approach with those in [23, 29, 37].
Table 4. Comparison of the proposed SMA algorithm with other studies.
Technique | Number of features | Accuracy
Our approach (DT) | 5 | 99.39%
Our approach (SVM) | 5 | 96.11%
Jaw and Wang [29] | Minimum of 8 | 99.73%
Li et al. [23] | 32 | 96.42%
Li et al. [37] | 10 | 95.21%
Accordingly, the literature review was also reconsidered and four more studies have been added, as follows:
An experimental IDS built on ML is presented in [34]. The proposed approach is computationally efficient without sacrificing detection precision. In this study, researchers examine how different oversampling techniques affect the required training sample size for a given model, and then find the smallest feasible training record group. Approaches for optimizing the IDS's performance depending on its hyperparameters are currently under investigation. Using the NSL-KDD dataset, the created system's effectiveness is verified. The experimental results show that the suggested approach cuts down on the size of the feature set by 50% (20-featured), while also decreasing the amount of mandates elements by 74%.
The study in [35] proposes a deep-learning framework for traffic anomaly recognition in an IDS: a convolutional neural network extracts sequence attributes from the data traffic, an attention mechanism then re-weights each channel, and finally a bidirectional long short-term memory (Bi-LSTM) network learns the structure of the sequence attributes. Because publicly available IDS datasets are highly imbalanced, the study uses a redesigned stacked autoencoder for feature reduction to improve knowledge integration, and applies adaptive synthetic sampling to enlarge the minority-class samples, ultimately producing a fairly balanced dataset. Since this deep-learning IDS is an end-to-end structure, it requires no manual feature extraction. The findings show that the technique outperforms the compared methods on the public benchmark NSL-KDD dataset, with an accuracy of 90.73% and an F1-measure of 89.65%.
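A minimal Keras sketch of the CNN followed by Bi-LSTM idea; the channel-attention stage, the stacked autoencoder, and the adaptive synthetic sampling of [35] are omitted, and the layer sizes are placeholders:

# Simplified sketch of a CNN + Bi-LSTM traffic classifier; attention and the
# autoencoder-based feature reduction described in [35] are not reproduced.
import tensorflow as tf

n_timesteps, n_features, n_classes = 10, 12, 2   # placeholder dimensions
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_timesteps, n_features)),
    tf.keras.layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),  # sequence features
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),                       # Bi-LSTM over the sequence
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()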
Using a χ2 statistical approach together with a Bi-LSTM, the authors of [36] propose a new feature-driven IDS called χ2-Bi-LSTM. The χ2-Bi-LSTM process first ranks all the features with the χ2 approach and then searches for the optimal subset with a forward search method. The final step feeds the optimal subset to a Bi-LSTM model for classification. The empirical findings show that the proposed χ2-Bi-LSTM technique outperforms the conventional LSTM technique and several existing feature-based IDS approaches on the NSL-KDD dataset, with an accuracy of 95.62%, an F-score of 95.65%, and a low FPR of 2.11%. Li et al. recommended a smart IDS for software-defined 5G networks in [37]. Taking advantage of software-defined technologies, it unifies the invocation and administration of the various security-related function modules in a common framework. In addition, it uses ML to learn rules automatically from massive amounts of data and to identify unexpected attacks through flow classification. For flow classification, it employs a hybrid of k-means and adaptive boosting (AdaBoost) and uses RF for feature selection. According to the authors, adopting the suggested solution would significantly increase the security of upcoming 5G networks.
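The χ2 ranking stage of [36] can be illustrated with scikit-learn's chi2 scorer; this is only a sketch, and the forward search and the Bi-LSTM classifier are not shown:

# Sketch of chi-square feature ranking; requires non-negative feature values,
# hence the MinMax scaling. The forward-search and Bi-LSTM stages are omitted.
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=40, random_state=0)
X = MinMaxScaler().fit_transform(X)          # chi2 needs non-negative inputs
selector = SelectKBest(chi2, k=10).fit(X, y)
print("top-10 feature indices:", selector.get_support(indices=True))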
- Comment 4
“Conclusion section is unadmissible short:
-without clear explantation of the paper's contributions, and
-without description of possibly future work on the subject of this paper.”
Response/Action taken: Thanks for your comments. The conclusion section was expanded and we have added the following lines accordingly:
In other words, the false positive rate, false negative rate, true positive rate, and true negative rate improve significantly compared with other works, even though only five features were selected by the SMA algorithm ('Service', 'src_bytes', 'Hot', 'serror_rate', and 'dst_host_srv_serror_rate') and all measures are obtained from these five features alone. Thus, the accuracy was 99.39%, the error rate was only 0.61%, the overall sensitivity was 99.36%, the specificity was 99.42%, the FPR was reduced to 0.58%, and the F-measure was 99.34%. It is worth mentioning that the SMA algorithm may suffer from premature convergence if not configured carefully; in other words, the control parameters of the SMA algorithm should be selected properly before deploying it in the IDS. As future work, we suggest applying the presented methodology to different datasets and to multi-class classification tasks.
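For clarity, and purely as an illustrative note in this response (not text added to the paper), the reported measures follow from the standard confusion-matrix definitions, for example:

# Standard confusion-matrix measures (illustrative counts, not the paper's values).
def ids_metrics(tp, tn, fp, fn):
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)            # true positive rate / detection rate
    specificity = tn / (tn + fp)            # true negative rate
    fpr         = fp / (fp + tn)            # false positive rate = 1 - specificity
    precision   = tp / (tp + fp)
    f_measure   = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, fpr, f_measure

print(ids_metrics(tp=950, tn=960, fp=6, fn=7))   # placeholder counts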
- Comment 5
“For the journal in which the article is to be published, the number of annotated references is much less than necessary, and there is also a lack of a review of today's most modern ensembles of machine learning methodologies and their use for solving the considered problem.”
Response/Action taken: Thanks for your comments. We have revised the literature review part. Accordingly, eight more references have been added as follows:
[26] D. N. Mhawi, A. Aldallal, and S. Hassan, "Advanced Feature-Selection-Based Hybrid Ensemble Learning Algorithms for Network Intrusion Detection Systems," Symmetry, vol. 14, p. 1461, 2022.
[27] H. Han, H. Kim, and Y. Kim, "An Efficient Hyperparameter Control Method for a Network Intrusion Detection System Based on Proximal Policy Optimization," Symmetry, vol. 14, p. 161, 2022.
[28] F. B. Saghezchi, G. Mantas, M. A. Violas, A. M. de Oliveira Duarte, and J. Rodriguez, "Machine Learning for DDoS Attack Detection in Industry 4.0 CPPSs," Electronics, vol. 11, p. 602, 2022.
[29] E. Jaw and X. Wang, "Feature Selection and Ensemble-Based Intrusion Detection System: An Efficient and Comprehensive Approach," Symmetry, vol. 13, p. 1764, 2021.
[30] A. Adnan, A. Muhammed, A. A. Abd Ghani, A. Abdullah, and F. Hakim, "An Intrusion Detection System for the Internet of Things Based on Machine Learning: Review and Challenges," Symmetry, vol. 13, p. 1011, 2021.
[31] S. Alabdulwahab and B. Moon, "Feature Selection Methods Simultaneously Improve the Detection Accuracy and Model Building Time of Machine Learning Classifiers," Symmetry, vol. 12, p. 1424, 2020.
[32] R. Abdulhammed, H. Musafer, A. Alessa, M. Faezipour, and A. Abuzneid, "Features Dimensionality Reduction Approaches for Machine Learning Based Network Intrusion Detection," Electronics, vol. 8, p. 322, 2019.
[33] A.-U.-H. Qureshi, H. Larijani, N. Mtetwa, A. Javed, and J. Ahmad, "RNN-ABC: A New Swarm Optimization Based Technique for Anomaly Detection," Computers, vol. 8, p. 59, 2019.
Accordingly, the Introduction section now involves the following new paragraphs:
The authors in [26] offer a feature-selection strategy for choosing the best features. These subsets are then passed to the suggested hybrid ensemble learner, which uses minimal computational and time resources while increasing the stability and accuracy of the IDS. The work aims to: reduce the dimensionality of the CICIDS2017 dataset by combining Correlation Feature Selection and Forest Panelized Attributes (CFS–FPA); determine the best machine learning strategy for aggregating the four modified classifiers (SVM, Random Forest, Naïve Bayes, and K-Nearest Neighbor (KNN)); validate the effectiveness of the hybrid ensemble scheme; compare CFS–FPA with other feature selection methods in terms of accuracy, detection, and false alarms; evaluate each classifier before and after being modified with the AdaBoosting technique; and compare the recommended methodology against alternative available options.
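As a rough illustration only (the CFS–FPA feature selection and the exact aggregation rule of [26] are not reproduced), the four base classifiers could be combined with simple soft voting in scikit-learn:

# Rough sketch of aggregating SVM, RF, NB and KNN predictions with soft voting;
# the CFS-FPA feature selection and the actual ensemble rule of [26] are not shown.
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("rf", RandomForestClassifier()),
                ("nb", GaussianNB()),
                ("knn", KNeighborsClassifier())],
    voting="soft")
ensemble.fit(X, y)
print("training accuracy:", ensemble.score(X, y))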
Proximal policy optimization (PPO) is used in [27] to build an intrusion detection hyperparameter control system (IDHCS) that trains a deep neural network (DNN) feature extractor and a k-means clustering module under the control of the DNN. Intrusions are detected with k-means clustering, while the IDHCS ensures that the most useful attributes are extracted from the network traffic by controlling the DNN feature extractor. Performance improvements tailored to the network environment in which the IDHCS is deployed are achieved automatically through iterative learning with a PPO-based reinforcement learning approach. To test the efficacy of the methodology, the researchers ran simulations with the CICIDS2017 and UNSW-NB15 datasets: an F1-score of 0.96552 was attained on CICIDS2017 and an F1-score of 0.94268 on UNSW-NB15. As a further test, they combined the two datasets to create a larger, more realistic scenario; the variety and complexity of the attacks increased after merging the records. Compared with CICIDS2017 and UNSW-NB15 alone, the combined dataset yielded an F1-score of 0.93567, suggesting a performance level between 97% and 99%. The outcomes demonstrate that the recommended IDHCS improves the IDS's effectiveness by automatically learning new forms of attack through the management of intrusion detection attributes, independent of changes in the network environment.
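The full PPO/DNN pipeline of [27] is not reproduced here; the sketch below shows only the clustering stage, with randomly generated placeholder features standing in for the DNN-extracted ones:

# Minimal sketch of the clustering stage only: k-means on (placeholder) extracted
# features; the DNN feature extractor and PPO controller of [27] are not shown.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

features, _ = make_blobs(n_samples=500, centers=2, n_features=8, random_state=0)  # stand-in for DNN features
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
print("cluster sizes:", [int((kmeans.labels_ == c).sum()) for c in range(2)])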
In [28], researchers present an ML-based IDS for detecting DDoS attacks in a practical Industry 4.0 environment. They acquire benign data, such as network traffic statistics, from Infineon's semiconductor manufacturing plants, and mine the DDoSDB database maintained by the University of Twente for fingerprints and examples of real-world DDoS attacks. They train eight supervised learning algorithms, including LR, NB, Bayesian Network (BN), KNN, DT, and RF, and employ feature selection techniques such as Principal Component Analysis (PCA) to lower the dimensionality of the input. The authors then investigate one semi-supervised/statistical classifier, the univariate Gaussian algorithm, and two unsupervised learning algorithms, simple K-Means and Expectation-Maximization (EM). To the authors' knowledge, this is the first study to use data collected from an actual factory to build ML models for identifying DDoS attacks in Industry 4.0. Some works have applied ML to detecting distributed denial-of-service (DDoS) attacks in Operational Technology (OT) networks, but these approaches relied on data that was either synthesized for the purpose or collected from an IT network.
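As an illustrative sketch of the dimensionality-reduction step (the component count and classifier are placeholders, not the settings used in [28]), PCA can be chained with a supervised classifier in scikit-learn:

# Illustrative PCA dimensionality reduction before a supervised classifier;
# the number of components is a placeholder, not the value used in [28].
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=30, random_state=0)
clf = make_pipeline(PCA(n_components=10), DecisionTreeClassifier(random_state=0))
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))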
Jaw and Wang introduced a hybrid feature selection (HFS) method combined with an ensemble classifier in their paper [29]. This method picks relevant features efficiently and offers a classification that is well matched to the attack. It outperformed all of the selected individual classification methods, cutting-edge feature selection techniques, and several existing IDS methodologies, achieving remarkable accuracies of 99.99%, 99.73%, and 99.997% and detection rates of 99.75%, 96.64%, and 99.93% for CIC-IDS2017, NSL-KDD, and UNSW-NB15, respectively. These findings, however, rely on 11, 8, and 13 relevant attributes selected from the respective datasets.
The researchers in [30] give a review of IDSs from the point of view of ML. They discuss the three primary obstacles an IDS faces in general, as well as the difficulties specific to an IDS for the Internet of Things (IoT): concept drift, high dimensionality, and high computational cost. The directions of ongoing research and studies aimed at addressing each difficulty are discussed. In addition, the authors devote a separate section of the study to datasets associated with IDSs; in particular, the KDD99, NSL, and Kyoto datasets are presented. The article concludes that concept drift, a high number of features, and computational awareness are three aspects that are symmetric in their impact and need to be resolved in neural-network-based IDS concepts for the IoT.
Using the Weka tool and 10-fold cross-validation with and without feature selection/reduction techniques, the authors of [31] assessed six supervised classifiers on the entire NSL-KDD training dataset. The study's authors set out to find new ways to improve upon and protect classifiers that boast the best detection accuracy and the quickest model-building times. It is clear from the results that using a feature selection/reduction technique, such as the wrapper method in conjunction with the discretize filter, the filter method in conjunction with the discretize filter, or just the discretize filter, can drastically cut down on the time spent on model construction without sacrificing detection precision.
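A rough scikit-learn equivalent of such 10-fold cross-validation (the study itself used Weka; the classifier and data below are placeholders):

# Rough equivalent of 10-fold cross-validation as used in [31] (done there in Weka);
# the classifier and data here are placeholders.
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
print("mean 10-fold accuracy:", scores.mean())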
Auto-Encoder (AE) and PCA are the two methods that Abdulhammed et al. [32] utilize to reduce the dimensionality of the features. The resulting low-dimensional features are then used to build an intrusion detection system with different classifiers, such as RF, Bayesian Networks, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis. The experiments with binary and multi-class classification on the low-dimensional features show improved performance in terms of Detection Rate, F-Measure, False Alarm Rate, and Accuracy. This effort successfully lowered the number of feature dimensions in the CICIDS2017 dataset from 81 to 10 while still achieving an accuracy of 99.6% in both multi-class and binary classification. A new IDS based on a random neural network and the artificial bee colony algorithm (RNN-ABC) is proposed in [33]. The model is trained and evaluated on the standard NSL-KDD dataset. The proposed RNN-ABC is compared with the conventional RNN trained via gradient descent on a variety of metrics, including accuracy, and maintains an overall accuracy of 95.02%.
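A minimal autoencoder sketch of the AE-based reduction idea in [32] (e.g. 81 inputs down to a 10-dimensional bottleneck); layer sizes, training settings, and data are illustrative only:

# Minimal autoencoder sketch for reducing feature dimensionality (e.g. 81 -> 10),
# as in the AE approach of [32]; settings and data are illustrative placeholders.
import numpy as np
import tensorflow as tf

n_features, n_latent = 81, 10
inputs = tf.keras.Input(shape=(n_features,))
encoded = tf.keras.layers.Dense(n_latent, activation="relu")(inputs)      # bottleneck
decoded = tf.keras.layers.Dense(n_features, activation="linear")(encoded)
autoencoder = tf.keras.Model(inputs, decoded)
encoder = tf.keras.Model(inputs, encoded)       # produces the low-dimensional features
autoencoder.compile(optimizer="adam", loss="mse")

X = np.random.rand(256, n_features).astype("float32")   # placeholder data
autoencoder.fit(X, X, epochs=2, batch_size=32, verbose=0)
X_low = encoder.predict(X, verbose=0)
print("reduced shape:", X_low.shape)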
Author Response File: Author Response.pdf
Reviewer 3 Report
It's ok
Author Response
We are forever grateful for the criticisms, patience, and feedback