2.1.1. Anomaly-Based IDS

Anomaly-based IDS detect misuse or penetrations in the cloud environment by classifying user behavior as normal or abnormal. The system first collects information on normal user behavior or actions over a period of time; a statistical test is then performed to check whether a suspected behavior is consistent with normal user behavior. Figure 1 presents the general architecture of anomaly-based IDS. The difficulty in maintaining this type of IDS is that it cannot be updated without losing the data on which the previous system was trained. In addition, identification accuracy is low, which yields a high number of false positives for this type of system. Anomaly-based IDS has been addressed by several researchers, as follows.
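The core idea above can be sketched with a simple statistical test: summarize a feature of normal behavior (here, a hypothetical requests-per-minute rate) over a training window, then flag observations whose z-score exceeds a threshold. The function names, values, and the choice of a z-score test are illustrative assumptions, not a specific system from the surveyed literature.

```python
import statistics

def build_profile(samples):
    """Summarize normal behavior (e.g., requests per minute) as mean/stdev."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, profile, z_threshold=3.0):
    """Flag an observation whose z-score exceeds the threshold."""
    mean, stdev = profile
    return abs(value - mean) / stdev > z_threshold

# Hypothetical training window of normal request rates
normal_rates = [42, 38, 45, 40, 39, 44, 41, 43]
profile = build_profile(normal_rates)

print(is_anomalous(41, profile))   # a typical rate
print(is_anomalous(400, profile))  # a burst far outside the profile
```

A low threshold catches more attacks but raises the false-positive rate, which is exactly the trade-off the section describes.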

**Figure 1.** Anomaly-based IDS architecture [13].

Aljawarneh et al. [14] proposed an anomaly-based IDS using feature selection, which resulted in an efficient hybrid model. This hybrid model was useful in estimating intrusions based on network activity and on the optimal data features available for training. Further optimization techniques would be required to build an IDS model with a better accuracy rate.

In reference [15], a web application profile was developed by learning invariants (e.g., the value of the user identifier should remain the same as the value supplied at login). The application's source code was used to verify whether these invariants were violated; if a violation of a static element was observed, an anomaly was reported.

In reference [16], an anomaly-based IDS was developed by analyzing the workflow of all web sessions. Data objects were clustered into small business-level groups, which were applied to typical data-access sequences in the workflow processes. The authors applied a hidden Markov model (HMM). The results showed that this approach could detect anomalous web sessions and suggested that the clustering approach could achieve a relatively low FPR while maintaining its detection accuracy.
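To make the session-scoring idea concrete, the following is a minimal sketch using a first-order Markov model over page-access sequences (the HMM in [16] is more elaborate, with hidden states and clustering). The page names and sessions are hypothetical; sessions whose transitions were rarely or never seen in training receive a low normalized log-likelihood.

```python
import math
from collections import defaultdict

def train_markov(sessions, smoothing=1e-3):
    """Estimate first-order transition probabilities between pages."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for a, b in zip(session, session[1:]):
            counts[a][b] += 1
    probs = {}
    for a, succ in counts.items():
        total = sum(succ.values())
        probs[a] = {b: c / total for b, c in succ.items()}
    return probs, smoothing

def session_score(session, model):
    """Average log-probability of the session's transitions."""
    probs, smoothing = model
    ll = sum(math.log(probs.get(a, {}).get(b, smoothing))
             for a, b in zip(session, session[1:]))
    return ll / max(1, len(session) - 1)  # normalize by transition count

# Hypothetical normal sessions used for training
normal_sessions = [
    ["login", "inbox", "read", "logout"],
    ["login", "inbox", "compose", "logout"],
    ["login", "inbox", "read", "read", "logout"],
]
model = train_markov(normal_sessions)

# A familiar workflow scores higher than an unseen access sequence
print(session_score(["login", "inbox", "read", "logout"], model) >
      session_score(["login", "admin", "dump_db"], model))
```

Thresholding this score separates typical workflows from anomalous ones; the smoothing constant controls how harshly unseen transitions are penalized.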

Le et al. [17] developed the DoubleGuard framework, which examined both database and web-server logs to detect attacks that leak information. They reported a 0.6% FP rate for dynamic webpages and a 0% FP rate for static webpages.

Nascimento and Correia [18] studied an IDS in which they collected a dataset from a web application and trained on it. They considered only "GET" requests, disregarding "POST" requests and response pages. They took logs from the T-Shark tool and converted them to a common format, then filtered the data by the sub-applications accessed. They used nine detection models.

Ariu [19] developed a host-based IDS to protect web applications from attacks by using HMMs. This system modeled the sequences of attributes and values received by a web application. Several HMMs were used to model the different parameters and values, and their outputs were combined to assign a probability to a specific request, based on the training dataset.
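The combination step in such multi-model designs can be sketched as follows: each per-attribute model emits a probability for its parameter value, and the request-level score is their geometric mean, so one very unlikely parameter drags the whole request down. The probabilities below are hypothetical placeholders, not outputs of the HMMs in [19].

```python
import math

def combine_scores(probabilities):
    """Combine per-attribute probabilities via a geometric mean:
    a single near-zero parameter score sinks the request score."""
    logs = [math.log(p) for p in probabilities]
    return math.exp(sum(logs) / len(logs))

# Hypothetical per-parameter probabilities from individual models
normal_request = [0.8, 0.9, 0.7]
attack_request = [0.8, 0.9, 1e-6]  # one injected parameter value

print(combine_scores(normal_request) > combine_scores(attack_request))
```

Averaging in log space rather than probability space keeps the combined score sensitive to a single anomalous attribute, which matches the intuition behind using one model per parameter.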

In reference [20], a web application firewall was developed to detect anomalous requests; it captured behavior through an "XML" file that defined the required attributes of parameter values. Input values that deviated from this profile were considered attacks. However, the approach produced FP warnings because it did not take path and page information into account, which would have made it more accurate.
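Profile-based validation of this kind can be sketched as below. The profile here is a plain Python dictionary standing in for the XML file in [20], and the parameter names, patterns, and length limits are illustrative assumptions: each parameter has an allowed-character pattern and a maximum length, and any deviation is flagged.

```python
import re

# Hypothetical profile mirroring what the XML file might encode:
# an allowed-character pattern and a maximum length per parameter.
PROFILE = {
    "user_id": {"pattern": r"^\d+$", "max_len": 10},
    "comment": {"pattern": r"^[\w\s.,!?-]*$", "max_len": 200},
}

def check_request(params, profile=PROFILE):
    """Return the names of parameters whose values deviate from the profile."""
    violations = []
    for name, value in params.items():
        spec = profile.get(name)
        if spec is None:
            violations.append(name)  # unknown parameter
        elif len(value) > spec["max_len"] or not re.match(spec["pattern"], value):
            violations.append(name)  # length or pattern violation
    return violations

print(check_request({"user_id": "123", "comment": "nice post"}))
print(check_request({"user_id": "1 OR 1=1", "comment": "x"}))
```

Because the profile keys on parameter names alone, the same value is accepted on every page, illustrating the false positives and blind spots that motivated adding path and page context.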
