**1. Introduction**

Structured query language (SQL) is a programming language used to manage, organize, and manipulate relational databases. It also allows the user or an application program to interact with a database by inserting new data, deleting old data, and changing previously stored data. Structured query language injection attacks (SQLIAs) pose a severe security threat to Web applications [1]. These attacks involve the malicious execution of SQL queries on a server, enabling unauthorized access to and retrieval of restricted data stored within databases [2]. Figure 1 illustrates the basic process of an SQLIA.

**Figure 1.** SQL injection attack process adopted from [3].

**Citation:** Alqhwazi, M.; Alghazzawi, D.; Alarifi, S. Deep Learning Architecture for Detecting SQL Injection Attacks Based on RNN Autoencoder Model. *Mathematics* **2023**, *11*, 3286. https://doi.org/10.3390/math11153286

Academic Editor: Tao Zhou

Received: 7 July 2023; Revised: 23 July 2023; Accepted: 24 July 2023; Published: 26 July 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Attackers can exploit Web applications by injecting SQL statements or sending special symbols through user input to target the database tier and gain unauthorized access to valuable assets [3]. Due to the absence of proper validation in some Web applications, which is usually the programmer's fault, attackers can bypass authentication mechanisms and gain access to databases, enabling them to retrieve or manipulate data without appropriate authorization [2].

In recent years, researchers have proposed many detection methods, including machine learning algorithms and deep neural network models. Deep neural networks, also known as deep learning, are a rapidly evolving research area within the field of machine learning. They were developed to bring machine learning closer to its original goal of achieving artificial intelligence. Deep learning involves training complex models that can learn the underlying patterns and representations of large datasets. This has proven to be a powerful technique for interpreting various forms of data, including text, images, and sounds. Deep learning has also been successfully applied to Web security detection, highlighting its potential impact on a broad range of applications [4].

However, one of the major drawbacks of using neural networks is their tendency to make overconfident predictions. This means that they have a high degree of certainty in their predictions, even when they are incorrect [5,6]. Even though the models perform well on test data from the same distribution as the training data, they do not know the limits of their knowledge and make erroneous guesses outside that domain. This pitfall arises because neural networks learn highly nonlinear functions that do not output calibrated probability estimates for unfamiliar data [7]. To address this issue, researchers have developed various techniques for estimating predictive uncertainty in neural networks. Lakshminarayanan et al. [7] introduced deep ensembles, where multiple models are independently trained on the same data and their predictions are averaged to capture model uncertainty. Mishra et al. [5] evaluated Bayesian neural networks (BNNs) as a technique that can provide accurate predictions along with reliably quantified uncertainties. Amodei et al. [6] suggested using model rollouts/lookahead during training to avoid reward hacking, improve safety, and reduce overconfidence.

In summary, while neural networks have shown great promise in many applications, it is important to be aware of their tendency to make overconfident predictions and the potential pitfalls of overfitting. Estimating predictive uncertainty using techniques such as deep ensembles can help mitigate these issues and improve the reliability of neural network predictions.

Detection of SQL injection attacks is crucial to ensure the security and integrity of Web applications and their associated data. To address this issue, a deep learning architecture based on the recurrent neural network (RNN) autoencoder model is proposed for detecting SQL injection attacks. The RNN autoencoder is a special case of the RNN-based encoder–decoder (RNN-ED) model. The autoencoder consists of an encoder RNN that encodes the input sequence into a hidden state and a decoder RNN that decodes the hidden state back into the original input sequence. The encoder and decoder RNNs are trained jointly using backpropagation to minimize the reconstruction error between the input and output sequences [8].

The aim of this study was to develop an architecture based on a recurrent neural network (RNN) autoencoder to detect SQL injection attacks. Moreover, the proposed approach that addresses this attack is discussed and compared with other approaches. The research questions were:


The main contributions of this paper are as follows:


The paper is structured as follows: Section 2 reviews the related research in this area. The methodology is discussed in Section 3. Experiment results and the discussion are shown in Section 4. The last section provides the conclusion and discusses future work.
