3.2.5. Stopping Criteria

The GA is iterative. Each iteration produces a generation, that is, a set of chromosomes, each of which represents a candidate solution. The maximum number of iterations was set to 100. The GA therefore stopped either after 100 iterations or when the obtained solution stopped improving. Improvement was measured with the fitness function defined above, that is, in terms of the obtained accuracy, and was judged against a threshold: if the difference in accuracy between two consecutive generations was less than the threshold, the process stopped. The threshold for this system was set to 0.05.
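The stopping rule can be sketched as follows. This is a minimal illustration, not the original implementation: the function name, the parameter names, and the idea of keeping the fitness history as a list of per-generation accuracies are assumptions. The accuracies are assumed to be in the same units as the 0.05 threshold, and the difference is taken as current minus previous, so a drop in accuracy also counts as a lack of improvement.

```python
from typing import Sequence


def should_stop(
    accuracy_history: Sequence[float],
    max_generations: int = 100,
    threshold: float = 0.05,
) -> bool:
    """Return True once the GA should stop.

    accuracy_history holds the best accuracy of each generation so far
    (hypothetical bookkeeping, assumed for this sketch).
    """
    generations_run = len(accuracy_history)

    # Criterion 1: the iteration budget is exhausted.
    if generations_run >= max_generations:
        return True

    # Criterion 2: improvement between the last two generations is
    # smaller than the threshold. At least two generations are needed.
    if generations_run >= 2:
        improvement = accuracy_history[-1] - accuracy_history[-2]
        return improvement < threshold

    return False
```

With these defaults, a history of `[90.0, 90.01]` stops the search, whereas `[90.0, 91.0]` lets it continue.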

#### *3.3. Support Vector Machine (SVM)*

The support vector machine (SVM) is one of the best machine learning algorithms used for binary classification. Intrusion detection can be viewed as a binary classification problem, since each transaction is classified as either normal or an intrusion, regardless of the type of attack.

This work adopted an SVM because of the following advantages. SVMs are highly effective in high-dimensional spaces such as the case we studied here. Even if the number of dimensions is greater than the number of samples, an SVM still produces effective results. SVMs are memory efficient, because in the decision function, they use a subset of training points. SVMs are also versatile, as the decision function can be specified using different kernel functions [27].
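For reference, the role of the support vectors and of the kernel can be made explicit with the standard SVM decision function. The notation below is not taken from the original text: $S$ denotes the set of support vectors, $\alpha_i$ and $b$ the learned coefficients and bias, $y_i \in \{-1, +1\}$ the class labels, and $\gamma$, $r$, and $d$ the kernel coefficient, offset, and polynomial degree. The kernel forms are written in the parameterization commonly used by SVM libraries, which is an assumption about the implementation used here.

$$
f(\mathbf{x}) = \operatorname{sign}\Big( \sum_{i \in S} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \Big)
$$

$$
\begin{aligned}
K_{\text{linear}}(\mathbf{x}, \mathbf{x}') &= \langle \mathbf{x}, \mathbf{x}' \rangle, &
K_{\text{poly}}(\mathbf{x}, \mathbf{x}') &= \big( \gamma \langle \mathbf{x}, \mathbf{x}' \rangle + r \big)^{d}, \\
K_{\text{RBF}}(\mathbf{x}, \mathbf{x}') &= \exp\big( -\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^{2} \big), &
K_{\text{sigmoid}}(\mathbf{x}, \mathbf{x}') &= \tanh\big( \gamma \langle \mathbf{x}, \mathbf{x}' \rangle + r \big).
\end{aligned}
$$

Only the training points in $S$ enter the sum, which is the source of the memory efficiency noted above, and replacing $K$ changes the shape of the decision boundary without changing the rest of the formulation.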

#### *3.4. Integration between GA and SVM*

The process starts by applying the genetic algorithm to select the required number of features. At this stage, the preprocessed data (explained later in Section 4.1) are fed into the proposed GA to extract a suitable subset of features, since the original data form a high-dimensional feature set with 77 features. In the next stage, the dataset restricted to these features is divided into training and testing sets. A support vector machine is then applied to the training set, and the fitness is calculated. If the fitness is not accepted, meaning that the accuracy is below the acceptance threshold, the SVM parameters are changed and the SVM is applied to the training set again to calculate the new accuracy. For this work, the acceptance threshold was set to the minimum accuracy achieved by other researchers, which was 94.86% [7]. If the results are accepted, they are saved for later comparison with the results obtained in other iterations with different feature selections. The next step is to check the total number of iterations: if it exceeds the iteration limit, the results, along with the parameters that led to them, are stored, and the process stops; otherwise, the process loops back to the first step. Figure 7 shows the flowchart of the proposed hybrid IDS. In the present work, the GA at the first stage was run for 100 generations. At each generation, a collection of features was examined using the fitness function on that sample. The proposed hybrid IDS algorithm is presented in Algorithm 2:

**Algorithm 2:** Hybrid IDS using GA and SVM.

<sup>1:</sup> Apply Algorithm 1 to select features.

<sup>2:</sup> **loop**

<sup>3:</sup> Create initial generation.

<sup>6:</sup> Apply SVM to the current training set with specific hyperparameters.

<sup>9:</sup> Performance of the current generation = average score of the results of n folds.

<sup>10:</sup> Save the results of step 6.


**Figure 7.** Flowchart of the proposed hybrid IDS.
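The control flow described in this subsection can be sketched as below. This is a hedged illustration under stated assumptions, not the authors' code: `select_features` stands in for the GA step and `fold_scores` for training and scoring the SVM on n folds, and both are hypothetical callables supplied by the caller. Only the 94.86% acceptance threshold, the 100-iteration limit, and the averaging over folds come from the text.

```python
from statistics import mean
from typing import Any, Callable, Optional, Sequence, Tuple

Result = Tuple[float, Any, Any]  # (accuracy, features, SVM parameters)


def hybrid_search(
    select_features: Callable[[int], Any],             # hypothetical GA step
    fold_scores: Callable[[Any, Any], Sequence[float]],  # hypothetical SVM scoring
    param_grid: Sequence[Any],
    max_iterations: int = 100,
    accept: float = 94.86,
) -> Optional[Result]:
    """Return the best accepted (accuracy, features, params), or None."""
    accepted = []
    for iteration in range(max_iterations):
        # Stage 1: the GA proposes a feature subset for this iteration.
        features = select_features(iteration)

        # Stage 2: try SVM parameter settings until one is accepted.
        for params in param_grid:
            # Performance is the average score over the n folds.
            accuracy = mean(fold_scores(features, params))
            if accuracy >= accept:
                # Save accepted results for comparison across iterations.
                accepted.append((accuracy, features, params))
                break

    if not accepted:
        return None
    return max(accepted, key=lambda result: result[0])
```

Stopping at the first accepted parameter setting is one reading of the flowchart description; an exhaustive variant would simply drop the `break` and keep every accepted result.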

#### **4. Implementation and Results**

This section explains the implementation process. It starts with data preparation, then explains the measures used to evaluate the system performance. Finally, it discusses the obtained results and how they were evaluated using different metrics so they could be compared with existing techniques.

#### *4.1. Experiments Procedure and Results*

The experiments had two aims. The first was to find the optimal set of features, while the second was to identify the set of SVM parameter values that maximizes performance in terms of the proposed fitness function. For the first aim, the GA started with a chromosome length of 10, which represents the number of features. For the second aim, the SVM was applied to these 10 features with different parameter values, which are listed in Table 4. The SVM has three basic parameters: kernel, gamma, and degree. The kernel functions used in this work were linear, polynomial, RBF, and sigmoid. The degree has no effect for the linear kernel, but it affects the results when the kernel function is polynomial. The final SVM parameter varied in the present experiment is gamma, which can be either Scale or Auto. The experiment was conducted using 12 different combinations of these parameters. For each combination, the GA was executed with a different set of features for a predefined maximum number of generations, and the best result of the last generation was considered in this study. The results of this experiment are shown in Table 5. The same type of experiment was conducted for chromosome lengths of 15, 20, 25, 30, 35, and 40, producing the results listed in Tables 6–11. Beyond 40 features, the system became unacceptably slow; hence, those results were not considered in the analysis.
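One way to enumerate such a parameter grid is sketched below. The exact combinations are those of Table 4 and are not reproduced here; the sketch assumes polynomial degrees of 1, 2, and 3 (the degrees discussed in Section 4.2) and attaches a degree only to the polynomial kernel, which happens to yield 12 combinations. The function name and the dictionary keys are illustrative.

```python
from itertools import product

KERNELS = ("linear", "poly", "rbf", "sigmoid")
GAMMAS = ("scale", "auto")
POLY_DEGREES = (1, 2, 3)  # assumption: degrees discussed in Section 4.2


def svm_parameter_grid():
    """Enumerate kernel/gamma/degree settings for the SVM.

    The degree is only attached to the polynomial kernel, because it
    has no effect on the other kernels.
    """
    grid = []
    for kernel, gamma in product(KERNELS, GAMMAS):
        if kernel == "poly":
            for degree in POLY_DEGREES:
                grid.append({"kernel": kernel, "gamma": gamma, "degree": degree})
        else:
            grid.append({"kernel": kernel, "gamma": gamma})
    return grid
```

If the SVM is scikit-learn's `SVC` (an assumption; the library is not named in this section), each dictionary can be passed as `SVC(**params)`. In that library, `gamma="scale"` corresponds to 1 / (n_features × Var(X)) and `gamma="auto"` to 1 / n_features.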



**Table 5.** Results of 12 experiments on the proposed hybrid IDS using 10 features and different SVM parameters.


**Names of the selected features:** Total Backward Packets, Bwd Packet Length Std, Fwd IAT Total, Bwd IAT Mean, Fwd Header Length, URG Flag Count, CWE Flag Count, Fwd Avg Packets Bulk, Subflow Fwd Bytes, Active Max.


**Table 6.** Results of 12 experiments on the proposed hybrid IDS using 15 features and different SVM parameters.

**Names of the selected features:** Fwd Packet Length Std, Bwd IAT Max, Bwd IAT Min, Fwd PSH Flags, Fwd Header Length, Fwd Packets\_s, Bwd Packets\_s, ECE Flag Count, Avg Fwd Segment Size, Avg Bwd Segment Size, Fwd Avg Packets\_Bulk, Bwd Avg Packets\_Bulk, Subflow Fwd Bytes, Subflow Bwd Bytes, Min\_Seg\_Size\_Forward.

**Table 7.** Results of 12 experiments on the proposed hybrid IDS using 20 features and different SVM parameters.



**Names of the selected features:** Total Length of Bwd Packets, Bwd Packet Length Mean, Flow IAT Mean, Flow IAT Max, Fwd IAT Std, Bwd IAT Mean, Bwd IAT Max, Bwd Header Length, Fwd Packets\_s, PSH Flag Count, URG Flag Count, ECE Flag Count, Avg Bwd Segment Size, Fwd Avg Bytes Bulk, Bwd Avg Packets Bulk, Subflow Bwd Bytes, Init Win Bytes Backward, Active Min, Idle Mean, Idle Max.

**Table 8.** Results of 12 experiments on the proposed hybrid IDS using 25 features and different SVM parameters.


**Names of the selected features:** Flow Duration, Total Fwd Packets, Fwd Packet Length Std, Flow IAT Min, Fwd IAT Max, Fwd IAT Min, Bwd IAT Total, Fwd PSH Flags, Max Packet Length, Packet Length Variance, SYN Flag Count, RST Flag Count, URG Flag Count, CWE Flag Count, ECE Flag Count, Fwd Header Length, Fwd Avg Bulk Rate, Subflow Fwd Bytes, Init\_Win\_Bytes\_Forward, Init\_Win\_Bytes\_Backward, Act Data Pkt Fwd, Active Min, Idle Mean, Idle Max, Idle Min.

**Table 9.** Results of 12 experiments on the proposed hybrid IDS using 30 features and different SVM parameters.



**Names of the selected features:** Total Fwd Packets, Total Length of Fwd Packets, Total Length of Bwd Packets, Bwd Packet Length Max, Bwd Packet Length Min, Flow IAT Mean, Fwd IAT Total, Fwd IAT Std, Fwd IAT Max, Bwd IAT Mean, Bwd IAT Std, Fwd PSH Flags, Bwd PSH Flags, Bwd Header Length, Fwd Packets\_s, Packet Length Variance, SYN Flag Count, PSH Flag Count, CWE Flag Count, ECE Flag Count, Down/Up Ratio, Average Packet Size, Avg Bwd Segment Size, Fwd Avg Packets\_Bulk, Bwd Avg Bytes\_Bulk, Bwd Avg Bulk Rate, Subflow Bwd Bytes, Active Max, Active Min, Idle Mean.

**Table 10.** Results of 12 experiments on the proposed hybrid IDS using 35 features and different SVM parameters.


**Names of the selected features:** Total Length of Fwd Packets, Fwd Packet Length Max, Fwd Packet Length Min, Fwd Packet Length Mean, Bwd Packet Length Min, Bwd Packet Length Std, Flow IAT Std, Flow IAT Max, Flow IAT Min, Fwd IAT Std, Bwd IAT Std, Fwd URG Flags, Fwd Header Length, Bwd Header Length, Fwd Packets\_s, Bwd Packets\_s, Min Packet Length, Packet Length Variance, SYN Flag Count, Average Packet Size, Avg Bwd Segment Size, Fwd Avg Bytes\_Bulk, Fwd Avg Packets\_Bulk, Fwd Avg Bulk Rate, Bwd Avg Bulk Rate, Subflow Fwd Packets, Subflow Bwd Bytes, Init\_Win\_Bytes\_Forward, Init\_Win\_Bytes\_Backward, Min\_Seg\_Size\_Forward, Active Mean, Active Std, Active Min, Idle Mean, Idle Std.

These experiments produced the following observations. With 10 to 15 features, the results were very close, and the training and testing times were acceptable. With 25 to 40 features, the performance of the system decreased: the accuracy dropped, and the training and testing times increased.

To benchmark the proposed system against other systems that used the KDD CUP 99 dataset, another experiment was conducted on that dataset using the same 12 combinations of SVM parameters. The results of this experiment are shown in Table 12. These experiments are analyzed in the next subsection.


**Table 11.** Results of 12 experiments on the proposed hybrid IDS using 40 features and different SVM parameters.

**Names of the selected features:** Flow Duration, Total Length of Fwd Packets, Total Length of Bwd Packets, Fwd Packet Length Max, Bwd Packet Length Max, Bwd Packet Length Std, Flow IAT Max, Flow IAT Min, Fwd IAT Max, Bwd IAT Total, Bwd IAT Std, Bwd IAT Min, Fwd PSH Flags, Bwd PSH Flags, Bwd URG Flags, Fwd Header Length, Bwd Header Length, Fwd Packets\_s, Bwd Packets\_s, Packet Length Mean, Packet Length Variance, SYN Flag Count, RST Flag Count, PSH Flag Count, URG Flag Count, CWE Flag Count, ECE Flag Count, Average Packet Size, Fwd Avg Packets\_Bulk, Fwd Avg Bulk Rate, Bwd Avg Bytes\_Bulk, Bwd Avg Bulk Rate, Subflow Fwd Packets, Act Data Pkt Fwd, Min\_Seg\_Size\_Forward, Active Std, Active Min, Idle Mean, Idle Max, Idle Min.

**Table 12.** Results of 12 experiments on the proposed hybrid IDS using 15 features from the KDD CUP 99 dataset.


**Names of the selected features:** service, flag, src\_bytes, num\_failed\_logins, logged\_in, lnum\_file\_creations, lnum\_outbound\_cmds, count, srv\_count, rerror\_rate, same\_srv\_rate, dst\_host\_same\_src\_port\_rate, dst\_host\_srv\_diff\_host\_rate, dst\_host\_rerror\_rate, dst\_host\_srv\_rerror\_rate.

#### *4.2. Analysis of Results*

Different types of SVM kernel functions were applied. When the kernel was linear, the accuracy did not depend on the other parameters (neither gamma nor degree). When the kernel was polynomial and gamma was Scale, the accuracy depended on the degree: in most cases, the SVM produced higher accuracy with a degree of two or three than with a degree of one. In terms of the number of features, the highest fitness score, 100%, was obtained when the number of features was 20. This was achieved when the SVM kernel function was linear, or when the kernel function was polynomial and the degree was either one or two. In addition, the highest average over all 12 combinations was also achieved with 20 features, as illustrated in Figure 8. This showed that the optimal number of features was 20, which was 25.6% of the features in the dataset. These features are listed in Table 7. Achieving the optimal performance of 100% using only about a quarter of the features showed that the proposed method could perform well with a limited number of features. It also suggests that the dataset contains many low-value features that could safely be eliminated. However, this conclusion needs to be further examined using other ML techniques.

**Figure 8.** Average fitness per number of features.

The results obtained on the CICIDS2017 dataset were compared with those of the other hybrid IDSs mentioned in Section 2, which used the KDD CUP 99 dataset. Reference [4] achieved a 96.38% detection rate, while our system achieved a maximum of 100%, which means that the proposed system achieved an improvement of 3.62%. Reference [5] achieved a 95.26% detection rate compared with the maximum of 100% achieved in the present study, which means that the proposed system achieved an improvement of 4.74%. Reference [6] achieved a 95.17% detection rate, which was lower than that obtained in the present study by 4.83%. Reference [7] achieved a 94.86% detection rate, which was 5.14% less than that obtained by the proposed hybrid IDS. These results are summarized in Table 13.
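The differences quoted in this comparison can be recomputed directly from the detection rates. The short sketch below does so, under the assumption that each improvement is the absolute difference, in percentage points, between the 100% maximum of the proposed system and the rate reported by each reference. The names `BASELINES` and `gains` are illustrative.

```python
# Detection rates (%) reported by the cited hybrid IDSs.
BASELINES = {"[4]": 96.38, "[5]": 95.26, "[6]": 95.17, "[7]": 94.86}


def gains(proposed: float, baselines: dict) -> dict:
    """Absolute difference (percentage points) between the proposed
    system's detection rate and each baseline, rounded to 2 decimals."""
    return {ref: round(proposed - rate, 2) for ref, rate in baselines.items()}


if __name__ == "__main__":
    for ref, gain in gains(100.0, BASELINES).items():
        print(f"{ref}: +{gain:.2f} percentage points")
```

Expressing the gains as absolute differences keeps them comparable across references; a relative improvement (difference divided by the baseline) would give slightly larger figures.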



The proposed system was also tested with the KDD CUP 99 dataset so that the present results could be compared with previous works that used the same dataset. Reference [4] achieved a 96.38% detection rate, while our system achieved a maximum detection rate of 99.3% using the same dataset, which means that the proposed system achieved an improvement of 3.03%. Reference [5] achieved a 95.26% detection rate, while the proposed hybrid IDS achieved a maximum detection rate of 99.65%, which means that the proposed system achieved an improvement of 4.24%. Reference [6] achieved a 95.17% detection rate, which is lower than that obtained by the proposed system by 4.33%. Reference [7] achieved a 94.86% detection rate, which is 4.68% lower than that obtained by the proposed hybrid IDS.

Among other IDSs that applied a fuzzy vectorized GA as a core technique, [39] used a vectorized fitness function on the NSL-KDD dataset with 42 features. The NSL-KDD dataset is the enhanced version of KDD CUP 99. In the experiment in [39], six different models were applied, four of which achieved accuracies ranging between 91.86% and 95.53%, while the other two achieved accuracies of 99.18% and 99.02%. The last two scores were much closer to the results achieved in this work; they were lower by only 0.12% and 0.28%, respectively. Another work that applied a GA is [47]. The authors developed a GPSVM, which combined genetic programming and an SVM. The detection rate varied depending on the type of attack, and the maximum detection rate achieved was 97.59%, for detecting U2R attacks. This detection rate was 1.75% lower than that obtained by our system. A recent study [48], in which intrusions were detected using a real-time sequential deep extreme learning machine, reported a maximum accuracy of 93.58% on a fused dataset (NSL-KDD and KDD CUP 99). Another recent study is [49], in which a deep extreme learning machine (DELM) was used to identify any malice or intrusion. Its accuracy using KDD CUP 99 was 94.6%, which was better than its accuracy using NSL-KDD by 0.69%. It is worth pointing out that our proposed system outperformed the DELM by up to 5.47%. Table 14 summarizes these results.


**Table 14.** Summary of results analysis using the KDD CUP 99 and NSL-KDD datasets.
