**3. Materials and Methods**

The proposed architecture is depicted in Figure 2.

**Figure 2.** Steps of the proposed architecture.

The architecture consists of four main steps, described in the following subsections: data preprocessing, data splitting, building and training the RNN autoencoder model, and model evaluation.


#### *3.1. Data Preprocessing*

The Kaggle dataset [13] was utilized in this research to train, evaluate, and compare the performance of the RNN autoencoder with several classifiers. The dataset was prepared by collecting different SQL injection queries from multiple websites. The dataset contained 30,919 SQL query statements of the form "SELECT FROM" and related variations. Each statement had a binary label, with 1 indicating malicious and 0 benign.

In order to enhance the accuracy of our trained models, we performed data cleaning on the selected dataset. This involved removing any null values and eliminating duplicate records. The removal of missing or null values is crucial, as it prevents the model from learning incorrect relationships or making predictions based on incomplete data. After completing the cleaning process, the dataset consisted of a total of 30,907 records, with 19,529 normal statements and 11,378 malicious statements. The statistics for the dataset are depicted in Figure 3. Each record contained two main features: "Query", which represented the statement itself, and "Label", which indicated whether the statement was normal (0) or malicious (1).

**Figure 3.** Distribution of benign and SQL injection attacks in the dataset.
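The cleaning step described above (dropping null values and duplicate records from the "Query"/"Label" table) can be sketched with pandas. The small inline DataFrame is a hypothetical stand-in for the Kaggle dataset, which is not bundled here:

```python
import pandas as pd

# Hypothetical miniature of the dataset: "Query" and "Label" columns follow
# the paper; the rows are illustrative only.
df = pd.DataFrame({
    "Query": ["SELECT * FROM users", "SELECT * FROM users", None,
              "SELECT name FROM t WHERE id = 1 OR 1=1 --"],
    "Label": [0, 0, 0, 1],
})

df = df.dropna(subset=["Query"])  # remove records with missing queries
df = df.drop_duplicates()         # eliminate duplicate records
print(df["Label"].value_counts())
```

On the real dataset, the same two calls reduce the 30,919 raw statements to the 30,907 cleaned records reported above.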

Stratified sampling was applied, which ensured that the training and testing sets had similar proportions of each class. This is important for imbalanced datasets like the SQL injection dataset, where the number of malicious queries is much lower than the number of benign queries [14].

#### *3.2. Data Splitting*

The dataset was divided into two parts: 80% for training and 20% for testing. This division allowed us to train the proposed approach on the majority of the data and assess its performance on unseen samples.
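The stratified 80/20 split can be reproduced with scikit-learn's `train_test_split`; passing the labels to `stratify` preserves the benign/malicious proportions in both partitions. The toy lists below stand in for the real queries and labels:

```python
from sklearn.model_selection import train_test_split

# Stand-in data with roughly the benign/malicious ratio of the paper's dataset.
queries = [f"query_{i}" for i in range(100)]
labels = [0] * 63 + [1] * 37

X_train, X_test, y_train, y_test = train_test_split(
    queries, labels,
    test_size=0.20,    # 80% training / 20% testing
    stratify=labels,   # preserve the class proportions in each split
    random_state=42,   # reproducibility; the seed is our assumption
)
```

Without `stratify`, a random split of an imbalanced dataset can over- or under-represent the malicious class in the test set.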

#### *3.3. Building and Training the RNN Autoencoder Model*

We developed an architecture for an RNN autoencoder that combines an autoencoder and a recurrent neural network (RNN) for SQL injection attack detection. Figure 4 illustrates the architecture of the proposed model.

**Figure 4.** The RNN autoencoder architecture for SQL injection attack detection.

As shown in Figure 4, the proposed architecture consists of two main parts: the autoencoder and the RNN. The autoencoder contains an input layer, an encoder, and a decoder. The encoder takes the input data and compresses it into a lower-dimensional latent space, which is then fed to the decoder. The decoder then reconstructs the input data from the encoded representation. The size of the latent space can affect the performance of the autoencoder and RNN model: a smaller latent space may lead to loss of information, while a larger latent space may lead to overfitting. The dimensionality of the latent space in an autoencoder is therefore a crucial hyperparameter that should be carefully tuned [15]. In this research, we experimented with different values for the latent space dimensionality using a grid search technique to find a value that balanced representation power and computational efficiency. The hyperparameter tuning process showed that 64 was the optimal value, meaning that the encoder compressed the input data into a 64-dimensional latent space. The RNN was designed to take the compressed representation learned by the autoencoder and use it to make binary classification predictions [16]. It consisted of an LSTM layer followed by a dense layer: the encoded data from the autoencoder is processed by the LSTM layer, whose output is fed to the dense layer to produce the final prediction.
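A minimal Keras sketch of this architecture is shown below. The input length, vocabulary handling, and LSTM width are our assumptions (the paper does not specify how queries are vectorized); only the 64-dimensional latent space and the encoder/LSTM/dense topology come from the text:

```python
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN = 120   # hypothetical length of a vectorized query
LATENT = 64     # latent dimensionality found via grid search (from the paper)

# Autoencoder: the encoder compresses the input, the decoder reconstructs it.
inputs = keras.Input(shape=(SEQ_LEN,))
encoded = layers.Dense(LATENT, activation="relu")(inputs)       # encoder
decoded = layers.Dense(SEQ_LEN, activation="sigmoid")(encoded)  # decoder
autoencoder = keras.Model(inputs, decoded)

# RNN classifier on the encoded representation: LSTM layer, then dense output.
rnn_in = layers.Reshape((LATENT, 1))(encoded)   # treat latent dims as a sequence
lstm_out = layers.LSTM(32)(rnn_in)              # width 32 is our assumption
output = layers.Dense(1, activation="sigmoid")(lstm_out)
classifier = keras.Model(inputs, output)
```

The sigmoid output of `classifier` yields the probability that a query is malicious.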

#### *3.4. Model Evaluation*

After training the RNN autoencoder model on the training set, we applied it to the testing set and calculated various performance metrics, namely accuracy, precision, recall, and the F1-score, together with the ROC curve, to measure the effectiveness of the RNN autoencoder in detecting SQLIAs. The mathematical representations of these metrics are as follows.

The accuracy metric measures the percentage of correctly classified samples [17], and it is calculated as follows:

$$Accuracy = \frac{TP + TN}{TP + TN + FN + FP} \tag{1}$$

Precision, another important metric, represents the proportion of samples predicted as positive that are actually positive [17]. It is calculated as follows:

$$Precision = \frac{TP}{TP + FP} \tag{2}$$

Recall, also known as sensitivity or the true-positive rate, indicates the proportion of positive samples that are correctly classified [17]. The recall score is calculated as follows:

$$Recall = \frac{TP}{TP + FN} \tag{3}$$

The F1-score is a combined metric that considers both precision and recall, providing a balanced measure of model performance [18]. It is calculated as follows:

$$F1Score = 2 \times \frac{Precision \times Recall}{Precision + Recall} \tag{4}$$

Here, TP (true positives) is the number of malicious requests correctly classified as malicious; TN (true negatives) is the number of normal requests correctly classified as normal; FP (false positives) is the number of normal requests incorrectly classified as malicious; and FN (false negatives) is the number of malicious requests incorrectly classified as normal.
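Equations (1)-(4) can be checked on a small worked example. The confusion-matrix counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Hypothetical confusion-matrix counts for 200 test queries.
TP, TN, FP, FN = 80, 90, 10, 20

accuracy = (TP + TN) / (TP + TN + FN + FP)                 # Eq. (1)
precision = TP / (TP + FP)                                 # Eq. (2)
recall = TP / (TP + FN)                                    # Eq. (3)
f1_score = 2 * precision * recall / (precision + recall)   # Eq. (4)

print(accuracy, precision, recall, f1_score)
# accuracy = 0.85, recall = 0.8; precision and F1 are 8/9 and 16/19
```

Note how precision and recall diverge when FP and FN differ, which is why the F1-score, their harmonic mean, is reported alongside accuracy.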

#### **4. Results and Discussion**

This section describes the experimental results. The system was implemented in Python. Table 1 summarizes the performance of the RNN autoencoder in terms of the evaluation metrics.


**Table 1.** Performance metrics for the proposed model.

The results in Table 1 show that the RNN autoencoder achieved strong predictive performance, with an accuracy of 94% and an F1-score of 92%. Further, we used the receiver operating characteristic (ROC) curve to check the performance of the proposed approach. The ROC curve is a graph that shows the relationship between the true-positive rate (TPR) and the false-positive rate (FPR) at different classification thresholds [19].

The ROC curve for the RNN autoencoder model is shown in Figure 5. The model achieved an area under the curve (AUC) of 0.94, indicating a 94% probability that it ranks a randomly chosen malicious query higher than a randomly chosen benign one.

**Figure 5.** Receiver operating characteristic (ROC) curve for our proposed approach.
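The ROC curve and AUC can be computed with scikit-learn from the model's predicted probabilities. The labels and scores below are hypothetical, standing in for the test-set outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical predicted probabilities for eight test queries (1 = malicious).
y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])
y_score = np.array([0.1, 0.3, 0.35, 0.8, 0.65, 0.9, 0.6, 0.2])

# TPR vs. FPR at every classification threshold, plus the area under the curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.8125: 13 of the 16 positive/negative pairs are ranked correctly
```

Plotting `fpr` against `tpr` (e.g. with matplotlib) reproduces a curve like the one in Figure 5.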

Regarding RQ1, based on the results provided, it appears that the proposed RNN autoencoder model performed well in correctly identifying instances of SQL injection attacks in the dataset and can be effective for the detection of SQL injection attacks.

Regarding RQ2, one of the most widely used ways to optimize RNN autoencoders and improve their performance in detecting SQL injection attacks is to adjust the hyperparameters of the model, such as the number of epochs [19]. To find the optimal number of epochs, we experimented with various values and observed their effect on accuracy. In the first iteration, we used 10 epochs.

With 10 epochs, we obtained an accuracy of 88%. From Figure 6, we can infer that the validation error decreased. Next, we set the number of epochs to 50.

**Figure 6.** Loss in SQL injection dataset using 10 epochs.

As shown in Figure 7, the accuracy of the model increased to 94% with 50 epochs. Next, we tried to increase the number of epochs to 100.

**Figure 7.** Loss in SQL injection dataset using 50 epochs.

As shown in Figure 8, with 100 epochs, the accuracy increased to 95%, but the validation error also increased, which may indicate overfitting. With too few epochs, the model cannot capture the underlying patterns in the data, which may cause underfitting, whereas training for too many epochs may lead to overfitting, where the model learns noise or unwanted parts of the data [20]. Therefore, from this experiment, we deduced that we could stop the training process early, at around 50 epochs, to obtain better performance from the model without underfitting or overfitting. A grid search technique was then used to find the optimal combination of the remaining hyperparameters, such as the activation function. Table 2 summarizes the choices for the different hyperparameters after using the grid search.

**Figure 8.** Loss in SQL injection dataset using 100 epochs.

**Table 2.** Values for several hyperparameters.


The proposed model achieved the best performance when trained for 50 epochs using the Adam optimizer, a batch size of 128, the ReLU activation function for the encoder layer, and the sigmoid activation function for the decoder layer in the autoencoder and output layer in the RNN.
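The reported training configuration (Adam optimizer, batch size 128, 50 epochs), together with the early-stopping idea from the epoch experiments, can be sketched in Keras as follows; the patience value and validation split are our assumptions:

```python
from tensorflow import keras

# Halt training when the validation error stops improving instead of
# running a fixed large number of epochs.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch the validation error
    patience=5,                 # tolerate 5 stagnant epochs (our assumption)
    restore_best_weights=True,  # roll back to the best epoch seen
)

# Training configuration from the paper (illustrative; `model`, `X_train`,
# and `y_train` are assumed to exist):
# model.compile(optimizer="adam", loss="binary_crossentropy",
#               metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=50, batch_size=128,
#           validation_split=0.1, callbacks=[early_stop])
```

With `restore_best_weights=True`, the model keeps the weights from the epoch with the lowest validation loss, mirroring the decision to stop at around 50 epochs.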

We compared the performance of the proposed approach with the performance of several classifiers, including the ANN, CNN, decision tree, naive Bayes, SVM, random forest, and logistic regression classifiers. The results are presented in Figure 9.

The results in Figure 9 show that the RNN autoencoder and the ANN were effective in detecting SQL injection attacks, achieving a high accuracy of 94% and F1-score of 92%. The RF, LR, and DT models also performed well, achieving accuracy scores of 92%, 93%, and 90%, respectively, and F1-scores of 89%, 90%, and 87%. The CNN model had the highest accuracy of 96% and an F1-score of 49%, indicating its potential for detecting SQL injection attacks. However, the naive Bayes and SVM models had lower accuracy and F1-scores, achieving accuracy scores of 82% and 75%, respectively, and F1-scores of 80% and 49%.

**Figure 9.** The comparison of evaluation metrics for different ML algorithms.
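A comparison like the one in Figure 9 can be run as a simple loop over scikit-learn classifiers. Because the paper's query-vectorization step is not specified, the sketch uses a synthetic dataset purely to show the evaluation loop:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the vectorized SQL queries.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=42),
    "RF": RandomForestClassifier(random_state=42),
    "NB": GaussianNB(),
    "SVM": SVC(),
}
scores = {}
for name, clf in models.items():
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    scores[name] = (accuracy_score(y_te, pred), f1_score(y_te, pred))
print(scores)
```

Collecting accuracy and F1 per model in one dictionary makes it straightforward to produce a grouped bar chart like Figure 9.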

Regarding RQ3, the results indicated that the RNN autoencoder approach outperformed some of the other algorithms, including the logistic regression, decision tree, random forest, SVM, and naive Bayes algorithms, in terms of accuracy, precision, recall, and F1-score. It also performed comparably to the CNN and ANN models; in NLP tasks more broadly, each architecture has its strengths and weaknesses. According to a study by Yin et al. [21], CNNs perform better at tasks that require local feature extraction, such as sentiment analysis, while RNNs perform better at tasks that require an understanding of longer-term dependencies, such as question answering. They also found that both CNNs and RNNs are sensitive to hyperparameter settings, with the degree of sensitivity depending on the task. Banerjee et al. [22] developed CNN and RNN models with similar architectures for classifying radiology reports and found that RNNs were the more powerful model for encoding sequential information. However, the study noted that CNNs required less hyperparameter tuning to prevent overfitting and were more stable, while RNNs needed more careful regularization.

In this research, since the SQL queries could contain longer-term dependencies, it made sense that the RNN autoencoder model achieved comparable accuracy to the CNN model. The added memory and sequencing modeling of the RNN likely helped it perform well with the longer query texts, but it may require additional tuning to match the performance of CNNs in some cases. This may explain why the CNN model slightly outperformed the RNN model.

In summary, our results are consistent with previous findings that indicate that RNNs are well suited for longer textual sequences but may require additional tuning to maximize performance compared to CNN models. The strong accuracy of 94% demonstrates the promise of the RNN autoencoder architecture for detecting SQL injection attacks. The key advantage of the RNN autoencoder is that it can learn a compressed representation of the input data, allowing it to capture the underlying patterns and relationships in the data more effectively than traditional methods.

#### **5. Conclusions and Outlooks**

A deep learning architecture based on an RNN autoencoder was proposed for detecting SQL injection attacks. The autoencoder was trained to learn a compressed representation of the input data, while the RNN used this compressed representation to make binary classification predictions. In this study, the RNN autoencoder was trained with different optimization techniques on a public SQL injection dataset. The performance of the model was evaluated using standard evaluation metrics, such as accuracy, precision, recall, and F1-score. Additionally, an ROC curve was calculated to evaluate the model's performance. The experimental results showed that the proposed approach achieved an accuracy of 94% and an F1-score of 92%, indicating that the RNN autoencoder is a promising method for detecting SQL injection attacks. As part of future research, we plan to explore the use of a more complex architecture for the RNN autoencoder to detect SQL injection attacks. Additionally, we acknowledge that the dataset used in this study was relatively small, and we recommend expanding the dataset and implementing the models in real-world scenarios in future investigations.

**Author Contributions:** Conceptualization, M.A. and D.A.; methodology, M.A.; software, M.A.; validation, M.A., D.A. and S.A.; investigation, M.A.; resources, M.A.; data curation, M.A.; writing original draft preparation, M.A.; writing—review and editing, S.A.; visualization, M.A.; supervision, D.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under grant no. IFPDP-284-22. The authors, therefore, acknowledge with thanks the DSR's technical and financial support.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


*Mathematics* Editorial Office E-mail: mathematics@mdpi.com www.mdpi.com/journal/mathematics



mdpi.com ISBN 978-3-0365-9133-9