*Article*

***RaKShA*: A Trusted Explainable LSTM Model to Classify Fraud Patterns on Credit Card Transactions**

**Jay Raval 1, Pronaya Bhattacharya 2, Nilesh Kumar Jadav 1, Sudeep Tanwar 1,\*, Gulshan Sharma 3, Pitshou N. Bokoro 3,\*, Mitwalli Elmorsy 4, Amr Tolba 5 and Maria Simona Raboaca 6,7**


**Abstract:** Credit card (CC) fraud has been a persistent problem and has affected financial organizations. Traditional machine learning (ML) algorithms are ineffective owing to the increased attack space, and techniques such as long short-term memory (LSTM) have shown promising results in detecting CC fraud patterns. However, owing to the black box nature of the LSTM model, the decision-making process could be improved. Thus, in this paper, we propose a scheme, *RaKShA*, which presents explainable artificial intelligence (XAI) to help understand and interpret the behavior of black box models. XAI is formally used to interpret these black box models; however, we used XAI to extract essential features from the CC fraud dataset, consequently improving the performance of the LSTM model. The XAI was integrated with LSTM to form an explainable LSTM (X-LSTM) model. The proposed approach takes preprocessed data and feeds it to the XAI model, which computes the variable importance plot for the dataset, which simplifies the feature selection. Then, the data are presented to the LSTM model, and the output classification is stored in a smart contract (SC), ensuring no tampering with the results. The final data are stored on the blockchain (BC), which forms trusted and chronological ledger entries. We have considered two open-source CC datasets. We obtain an accuracy of 99.8% with our proposed X-LSTM model over 50 epochs compared to 85% without XAI (simple LSTM model). We present the gas fee requirements, IPFS bandwidth, and the fraud detection contract specification in blockchain metrics. The proposed results indicate the practical viability of our scheme in real-financial CC spending and lending setups.

**Keywords:** Explainable artificial intelligence; credit card frauds; deep learning; long short-term memory; fraud classification

**MSC:** 91G45

**Citation:** Raval, J.; Bhattacharya, P.; Jadav, N.K.; Tanwar, S.; Sharma, G.; Bokoro, P.N.; Elmorsy, M.; Tolba, A.; Raboaca, M.S. *RaKShA*: A Trusted Explainable LSTM Model to Classify Fraud Patterns on Credit Card Transactions. *Mathematics* **2023**, *11*, 1901. https://doi.org/10.3390/math11081901

Academic Editors: Snezhana Gocheva-Ilieva, Hristina Kulina and Atanas Ivanov

Received: 12 March 2023 Revised: 14 April 2023 Accepted: 15 April 2023 Published: 17 April 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Modern credit-card (CC)-based applications are web/mobile-driven, and the customer base has shifted toward electronic payment modes. The online repayment modes for CC bring users flexibility and quality of service (QoS). Still, on the downside, it also opens the doors for malicious intruders to intercept the web channels. Thus, recent statistics have suggested a surge in security attacks in CC ecosystems and payment gateway services [1,2]. These attacks mainly include banking frauds, attacks on credit and debit payments of CCs due to unsecured authentication, expired certificates, web injection loopholes, attacks on payment gateways (third-party gateway services), and many others [3]. A recent report by the Federal Trade Commission (FTC) has suggested that financial fraud on a global scale has exponentially risen from 2018 to 2023. Consumers reported losing more than \$5.8 billion to fraud in 2021, which was up more than 70% from the year before, according to a newly released FTC report [4]. Thus, it becomes highly imperative to study the nature of these CC frauds conducted by malicious attackers.

The surge in CC frauds has pushed researchers globally to look at possible solutions that can thwart the attack vectors and secure the boundaries of the CC financial system (network, software, and hardware) [5]. The fraud incidents have forced innovative solutions to secure the network perimeters, including privacy- and integrity-based [6,7], authorization- and identity-based [8], and non-repudiation-based solutions [9]. Owing to the complex nature of attacks and zero-day possibilities, it is difficult to build an end-to-end CC fraud detection scheme that addresses financial ecosystems' security, privacy, and accuracy requirements. Traditional crypto-primitives (secured 3D passwords and multi-layer gateway encryption) incur overhead due to proxy channels and the requirement of identity control and multi-attribute signatures. This overhead significantly hampers the Quality-of-Service (QoS) for end CC applications [10]. Existing security schemes for CC fraud detection are application-specific: they are custom-built to support particular end applications and are therefore not generic. Thus, it is crucial to analyze and study the CC attack patterns, the effect, and the disclosure strategy to conceptualize a generic security scheme that can cater to large attack sets.

Lately, artificial intelligence (AI)-based models have been used as a potential tool to detect CC financial fraud (FF) patterns [11,12]. CC-FF detection algorithms work on URL detection, phishing detection, behavior-based authentication, and others. Machine learning (ML) and deep learning (DL) models are proposed to increase the attack detection accuracy in financial payment ecosystems [13]. Thus, the AI scope in CC-FF detection has solved challenges of security vulnerabilities of Android/iOS mobile OS, permission attacks, and web-URL attacks [14]. However, owing to the massive amount of available data and the need for real-time analysis, ML models are not generally considered effective for CC-FF detection. Thus, DL techniques are mostly employed to improve the accuracy and precision of CC-FF detection [15]. A fraudulent transaction closely resembles a genuine transaction and can be detected only by minute-level (fine-grained) pattern analysis. In such cases, the behavioral pattern technique aids in determining the transaction's flow and order. So, the anomaly is quickly identified based on the behavioral trend observed from dictionaries of previous attacks.

For small datasets, standard ML techniques employ decision trees, random forests, and support vector machines. For large data, mostly recurrent neural networks (RNNs) and long short-term memory (LSTM) models are considered as they can process data sequences, which is a common feature in financial transactions. These models maintain a memory of past events (unusual patterns in CC transaction histories, spending behavior, unusual withdrawals, deposits, and small transactions to multiple accounts) [16]. Such events are considered anomalous events. Other suitable models include deep belief networks, autoencoders, and gated recurrent unit models. These models have shown promising results, but the performance varies significantly owing to the application requirements and dataset characteristics. In most average cases, RNNs and LSTM perform well [14]. Thus, in the proposed scheme, we have worked on the CC-FF detection based on the decoded–encoded input using the LSTM model.

With LSTMs, the accuracy of the prediction model improves, but it is equally important to understand the factors the model uses to make its predictions. Thus, including explainable AI (XAI) with LSTM is a preferred choice, as it helps users understand the significant data features the LSTM model uses to predict fraudulent CC transactions [17]. XAI refers to techniques and approaches in machine learning that enable humans to understand and interpret the reasoning behind the decisions made by AI models. XAI aims to improve AI systems' transparency, accountability, and trustworthiness. It is also used in other domains such as healthcare, education, marketing, and agriculture [18]. For instance, the authors of [19] utilize XAI in an autonomous vehicle where they efficiently interpret the black box AI models to enhance the accuracy scores and make autonomous driving safe and reliable. Furthermore, in [20], the authors use the essential properties of XAI for fall detection using wearable devices. They applied the Local Interpretable Model-Agnostic Explanations (LIME) model to obtain important features from the fall detection dataset and provide better interpretability of the applied AI models. The integrated model of XAI and LSTM is termed an explainable LSTM (X-LSTM) model. It reduces the bias in the data and model, which is essential for validating the obtained results. This approach applies XAI before the LSTM in the X-LSTM model. X-LSTM helps to improve the accuracy of simple LSTM models by identifying gaps in the data or model that need to be addressed. It can handle regulatory requirements, which improves the visibility and transparency of CC financial transactions.

Once the prediction results are obtained from the X-LSTM model, there is a requirement for transaction traceability and verification. Thus, the integration of blockchain (BC) and smart contracts (SC) makes the CC-FF detection ecosystem more transparent, auditable, and visible for interpretation to all financial stakeholders (banks, users, CC application, and gateway servers) [21–23]. The obtained model results are stored with action sets in SC, and such contracts are published over decentralized off-chain storage, such as the InterPlanetary File System (IPFS) or Swarm networks. The use of IPFS-assisted SC would improve the scalability of public BC networks, as only the metadata of the transactions are stored on the public BC ledger. The actual data can be fetched through the reference hash from the BC ledger and mapped with the IPFS content key to obtain the executed SC. This makes the CC-FF scheme distributed among networked users and adds a high degree of trust, auditability, and compliance in the ecosystem.

After going through several studies in the literature mentioned in Section 2, we analyzed that recent approaches in CC-FF detection mainly include rule-based systems, statistical models, and sequence-based models (RNNs, LSTMs), which often need more interpretability. Most approaches do not cater to the requirements of new FF patterns and are tightly coupled to the end application only. Thus, from the novelty perspective, we integrate XAI with the LSTM model (our proposed X-LSTM approach). *RaKShA* addresses the issue of transparency and interpretability of CC-FF detection and further strengthens the power and capability of LSTM models. Secondly, our scheme is innovative as we propose the storage of X-LSTM output in SC, which provides financial compliance and addresses auditability concerns in financial systems. Via BC, the proposed scheme ensures that all results are verifiable, traceable, and tamper-proof, which is crucial in the financial industry. Finally, to address the scalability concerns of the public BC networks, we have introduced the storage of SC and associated data in IPFS and the content hash to be stored as metadata (transaction) in public BC. This significantly reduces a transaction's size, allowing more transactions to be packaged in a single block. This makes our scheme resilient, adaptable to real-time financial systems, and generic in CC-FF detection scenarios. Furthermore, the research contributions of the article are as follows.


• The performance analysis is completed on testing and validation accuracy, RMSProp optimizer, and XAI variable importance plot. The transaction costs, IPFS bandwidth, and SC contract are evaluated for BC simulation.

The rest of the paper is structured as follows. Section 2 discusses the existing state-of-the-art (SOTA) approaches. Section 3 presents the proposed scheme's system model and problem formulation. Section 4 presents the proposed scheme and details the data preprocessing, the sliding window formulation, the X-LSTM model, and the proposed SC design. Section 5 presents the performance evaluation of the proposed scheme. Finally, Section 6 concludes the article with the future scope of the work.

#### **2. State-of-the-Art**

The section discusses the potential findings by researchers for FF detection via AI models and BC as a trusted component to design auditable financial systems. Table 1 presents a comparative analysis of our scheme against SOTA approaches. For example, Ketepalli et al. [24] proposed the LSTM autoencoder, vanilla autoencoder, and random forest autoencoder techniques for CC datasets. The results show high accuracy for LSTM and random forest autoencoders over vanilla autoencoders. The authors in [25] explored the potential of DL models and presented a convolutional LSTM model for CC-FF detection. An accuracy of 94.56% is reported in their work. In some works, probability and statistical inferences are presented. For example, Tingfei et al. [26] proposed an oversampling-based method for CC-FF detection using the LSTM approach. Cao et al. [7] described a unique method for identifying frauds that combines two learning modules with DL attention mechanisms. Fang et al. suggested deep neural networks (DNN) mechanisms for Internet and web frauds. The scheme utilized the synthetic minority oversampling approach to deal with data imbalances [27]. Chen et al. [28] proposed using a deep CNN network for fraud classification.

Similarly, trust and provenance-based solutions are proposed via BC integration in financial systems. Balagolla et al. [29] proposed a BC-based CC storage scheme to make the financial stakeholders operate autonomously. Additionally, the authors proposed an AI model with scaling mechanisms to address the scalability issues of the BC. Musbaudeen and Lisa [30] proposed a BC-based accounting scheme to automate daily accounting tasks and simplify audit features for a banking system. The authors in [31] researched the imbalanced classification problem. Additionally, the authors presented limitations of CC datasets (labeled data points), which make it difficult to summarize model findings. Thus, low-cost models are preferred. Tingfei et al. [26] proposed an oversampling strategy based on variational autoencoding (VAE) and DL. This technique was only effective in controlled environments. The study results showed that the unbalanced classification problem could be solved successfully using the VAE-based oversampling method. To deal with unbalanced data, Fang et al. [27] suggested synthetic minority oversampling methods.

Zheng et al. [32] presented boosting mechanisms in CC systems. The authors used AdaBoost ML during the training process. The model incorrectly classified many different symbols. Thus, improved TrAdaBoost is presented that updates the weights of incorrectly classified data. Cao et al. [7] presented a two-level attention model of data representation for fraud detection. The sample-level attention learns in a central manner where the significant information of the misclassified samples goes through a feature-level attention phase, which improves the data representation. The dependency between model fairness and scalability is not discussed.



**Table 1.** *Cont.*



Esenogho et al. [36] observed the nature of typical ML models, which entails a static mapping of the input vector to the output vector. These models are inefficient for the detection of CC frauds. To detect credit card fraud, the authors proposed a neural network ensemble classifier and the SMOTE technique to create a balanced dataset. The ensemble classifier uses the adaptive boosting (AdaBoost) algorithm with an LSTM neural network as the base learner. Combining SMOTE-ENN with the boosted LSTM classifier is efficient in detecting fraud. Research on fraud detection on a dataset of Chinese listed businesses using LSTM and GRU was presented by Xiuguo et al. [14]. A DL model with proposed encoding and decoding techniques for anomaly detection in time series is presented by Zhang et al. [16]. Balagolla et al. [29] proposed a methodology employing BC and machine intelligence to detect fraud before it happens. Chen et al. [37] presented research on loan fraud prediction by introducing a new method named hierarchical multi-task learning (HMTL) over a two-level fraud classification system. Chen et al. [28] proposed a deep CNN model (DCNN) for CC-FF with alert notifications, and the model presented high accuracy.

From the above literature, we observe that many researchers have proposed solutions for CC fraud detection. However, their approaches utilize an obsolete feature space that cannot be considered in the current timespan. None of them have used the benefits of XAI, which efficiently selects the best features from the given feature space. Additionally, once the data are classified using AI algorithms, the classified data are not protected against data manipulation attacks. A broad scope is available to attackers, who can tamper with the classified data (from AI algorithms), i.e., change them from fraud to non-fraud or vice versa. Hence, the amalgamation of XAI with AI algorithms and the integration of blockchain are not yet explored by the aforementioned solutions. In that view, we propose an XAI-based LSTM model (X-LSTM) that collects an efficient feature space and then passes it to the AI algorithm for the classification task. Furthermore, the classified data are forwarded to the IPFS-based public blockchain to tackle data manipulation attacks and preserve data integrity.

### **3.** *RaKShA***: System Flow Model and Problem Formulation**

In this section, we discuss the proposed scheme, *RaKShA*, through a system flow model and present the problem formulation. The details are as follows.

#### *3.1. System Flow Model*

In this subsection, we present the schematics of our scheme *RaKShA*, which presents a classification model to identify fraud patterns in the financial ecosystems. Figure 1 presents the proposed system flow diagram. In the scheme, we consider the entity *EU*, which denotes the user entity (whose financial data are under scrutiny). In the scheme, we assume there are *n EU*, denoted as {*U*1, *U*2,..., *Un*}.

For any *Un*, we consider CC details, denoted by *F*(*Un*) = {*ULB*, *UBA*, *UPA*, *URS*}, where *ULB* denotes the balance limit of CC, *UBA* denotes the pending bill amount of the monthly CC billing cycle, *UPA* denotes the payment amount *EU* is liable to make, and *URS* denotes the repayment status (Step 1). For the experiment, we select the credit card dataset (*UCC*) (Step 2). The data are collected into a comma-separated values (CSV) file, and preprocessing techniques are applied to the collected CSV. The preprocessed data are sent through a sliding window for *Un*, denoted as *W*(*Un*) (Step 3). Based on *W*(*Un*), the data are sent to the XAI for feature importance, which is denoted as *X*(*W*(*Un*)) (Steps 4, 5). The XAI output is then passed to the LSTM model to classify the fraud patterns of *Un* (Step 6). Based on the X-LSTM model output, *EUn* executes an SC to notify the user of the genuineness and the safety of investment on *Un* (Step 7). The classification details are also stored on local IPFS, where any public user can fetch *Un* data based on the IPFS content key (Step 8). Finally, the transaction meta-information obtained from IPFS is stored on public BC (Step 9).
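Steps 7 to 9 can be illustrated with a minimal sketch of content-addressed storage. This is not the authors' implementation: the in-memory dictionary `off_chain_store` stands in for IPFS, the list `ledger` stands in for the public BC, and a SHA-256 hex digest stands in for an IPFS content key. All names here are hypothetical.

```python
import hashlib
import json

off_chain_store = {}  # hypothetical stand-in for IPFS: content key -> payload bytes
ledger = []           # hypothetical stand-in for the public BC: metadata entries only


def publish_result(user_id: str, is_fraud: bool) -> str:
    """Store the full classification record off-chain and only its hash on the ledger."""
    payload = json.dumps({"user": user_id, "fraud": is_fraud}, sort_keys=True).encode()
    content_key = hashlib.sha256(payload).hexdigest()
    off_chain_store[content_key] = payload
    ledger.append({"user": user_id, "content_key": content_key})
    return content_key


def fetch_and_verify(content_key: str) -> dict:
    """Fetch a record by its content key and check that it was not altered."""
    payload = off_chain_store[content_key]
    if hashlib.sha256(payload).hexdigest() != content_key:
        raise ValueError("payload does not match its content key")
    return json.loads(payload)
```

Because the ledger entry holds only the digest, any change to the stored classification (for example, flipping fraud to non-fraud) makes the recomputed hash differ from the key recorded on the ledger.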

**Figure 1.** *RaKShA*: The proposed system flow model.

#### *3.2. Problem Formulation*

As discussed in the above Section 3.1, the AI-based *RaKShA* scheme is proposed for any *nth EU*. The financial resource is collected from the user entity. For simplicity, we consider that each user has a single CC for which fraud detections are classified. Thus, any user {*U*1, *U*2, ... , *Un* ∈ *EU*} has an associated CC denoted as {*C*1, *C*2, ... , *Cn*}. The mapping function *M* : *EU* → *C* is denoted as follows.

$$
\mathcal{U}_i \longrightarrow \mathbb{C}_i \tag{1}
$$

Similarly, the scheme can be generalized for one-to-many mapping, where any user might have multiple CCs for which fraud detection is applied. In such cases, we consider a user identity *Uid* to be mapped to different CCs offered by respective banks. The mapping *M*2 : *Uid* → *Ck* → *Bk* is completed, where any *nth EU* is mapped to a subset *Ck* ⊂ *C*, which is further mapped to associated banking information *Bk* ⊂ *B*, where *C* denotes the overall CC set, and *B* denotes the associated banks who have issued these CCs to *Ui*.

$$\mathcal{U}_i \longrightarrow \mathbb{C}_k \tag{2}$$

In the model, we consider two types of transactions: normal (genuine) transactions and fraudulent (fraud) transactions. Any *Ui* uses its CC at multiple shopping venues (online or point-of-sale transactions), money transfers, cash withdrawals, and others. We consider that fake CC transactions are generated by an adversarial user *Ur* who can exploit the operation of the CC transaction.

$$\mathcal{U}_r \notin \{\mathcal{U}_1, \mathcal{U}_2, \dots, \mathcal{U}_n\} \tag{3}$$

Here, a function *F* represents a *Ur* attack on the normal transaction system, which produces a fake transaction *Tf* in the CC network.

$$F = \left( \mathcal{U}_r \xrightarrow{\text{attack}} \left( \mathcal{U}_i \xrightarrow[\text{with}]{\text{transaction}} \mathbb{C}_i \right) \right) \tag{4}$$

The goal of the proposed scheme is to detect this malicious *Tf* from normal transaction sets *Tr* = {*T*1, *T*2, ... , *Tl*}, where *Tl* ⊂ *T*, which is proposed by *n* genuine users. The main goal is satisfied when every transaction is in normal behavior similarly to *Tr*. In Equation (5), we present the sum of the maximum count of the normal behavior of the CC transaction.

$$\mathbb{O} = \left(\sum_{i=0}^{l} \text{score}(T_r)\right) \tag{5}$$

The models work on the detection of *Tf* and differentiate its anomalous behavior from *Tr*. The detection mechanism is presented in Equation (6) as follows.

$$\mathbb{Q} = \left(\sum_{i=0}^{l} \text{detect}(T_f)\right) \tag{6}$$

We design an XAI model for the CC dataset, which finds the important features for the classification of *Tf*. The important features *Imp*(*Fs*) are passed as inputs to the LSTM model, which generates the classification output. The goal is to maximize the accuracy *A*(*O*) of the output, which is fed to the SC to be stored at the BC layer. In general, the problem formulation *Pf* aims at maximizing *A*(*O*) and *Imp*(*Fs*). Secondly, the LSTM model should minimize the training loss *Tloss* and maximize the validation accuracy *A*(*Val*). Mathematically, the conditions for *Pf* are summarized as follows.
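Reading the paragraph above together with Equation (7), the four conditions can plausibly be written as below. The assignment of the labels to the individual objectives is an assumption, inferred from the order in which the objectives are stated and from the sign pattern in Equation (7), where only the third term is negated (a quantity to be minimized).

$$\begin{aligned} \mathbb{C}_0 &: \max A(O) \\ \mathbb{C}_1 &: \max Imp(F_s) \\ \mathbb{C}_2 &: \min T_{\text{loss}} \\ \mathbb{C}_3 &: \max A(Val) \end{aligned}$$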


The entire problem *Pf* is then represented as a maximization problem.

$$P_f = \max(\mathbb{C}_0, \mathbb{C}_1, -\mathbb{C}_2, \mathbb{C}_3) \tag{7}$$

subject to operational constraints as follows.

$$\begin{aligned} \text{OC}_1: T &\le T_{\text{max}} \\ \text{OC}_2: T_f &\le T_r \\ \text{OC}_3: \mathcal{C}(T_f) &\in \{0, 1\} \\ \text{OC}_4: T_{\text{loss}} &\le T_{\text{thresh}} \\ \text{OC}_5: \mathcal{G}(\mathcal{C}_e) &\le \mathcal{G}(\text{Acc}) \\ \text{OC}_6: \mathcal{E}(\mathcal{C}) &\le \Delta(\max T) \end{aligned} \tag{8}$$

*OC*<sup>1</sup> denotes that the LSTM model should respond with output in a finite bounded period, denoted by *T*, which should not exceed a timeout *Tmax*. *OC*<sup>2</sup> denotes that the scheme is rendered fair only when the number of fake transactions does not exceed the number of genuine transactions; the scheme would have less accuracy if *Tf* exceeded the genuine transactions in the ecosystem. *OC*<sup>3</sup> states a deterministic property of the *Tf* classification *C*(*Tf*): it always outputs a value in {0, 1}, a Boolean identifier that classifies the transaction as genuine (1) or fake (0). Any other state is not acceptable. *OC*<sup>4</sup> indicates that *Tloss* should not exceed a threshold training loss, which is decided in real time based on previous inputs and outputs to the model. *OC*<sup>5</sup> indicates the condition for SC execution: the SC should be executed (*Ce*) only when the account wallet has sufficient funds (in terms of gas limit). Thus, the gas required for execution, *G*(*Ce*), should not exceed the total funds in the wallet, *G*(*Acc*). *OC*<sup>6</sup> denotes that the time to add the update of SC execution to IPFS and block mining in BC should again be finite and should not exceed a maximum timeout Δ(*maxT*) set by the public BC network.
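The six operational constraints of Equation (8) can be read as simple runtime checks. The sketch below is illustrative only: the `RunStats` container, its field names, and the way each quantity is measured are assumptions, not part of the proposed scheme.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunStats:
    """Hypothetical measurements for one run of the scheme."""
    response_time: float   # T: model response time
    timeout: float         # T_max
    n_fake: int            # number of fake transactions T_f
    n_genuine: int         # number of genuine transactions T_r
    label: int             # C(T_f): classifier output
    train_loss: float      # T_loss
    loss_threshold: float  # T_thresh
    gas_required: int      # G(C_e): gas needed to execute the SC
    wallet_gas: int        # G(Acc): funds available in the wallet
    ledger_time: float     # E(C): IPFS update plus block mining time
    ledger_timeout: float  # Delta(max T)


def violated_constraints(s: RunStats) -> List[str]:
    """Return the names of the operational constraints that do not hold."""
    checks = {
        "OC1": s.response_time <= s.timeout,
        "OC2": s.n_fake <= s.n_genuine,
        "OC3": s.label in (0, 1),
        "OC4": s.train_loss <= s.loss_threshold,
        "OC5": s.gas_required <= s.wallet_gas,
        "OC6": s.ledger_time <= s.ledger_timeout,
    }
    return [name for name, ok in checks.items() if not ok]
```

An empty list means all constraints are satisfied for that run.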

### **4.** *RaKShA***: The Proposed Scheme**

This section presents the working of the proposed system, which is organized as a layered model with three sublayers: the data layer, the AI layer, and the BC layer. Figure 2 presents the systematic overview of the architecture. The details of the layers are as follows.

**Figure 2.** Architectural view of the proposed scheme (*RaKShA*).

#### *4.1. Data Layer*

At the data layer, we consider *EU*, which have mapped CCs (one-to-one or one-to-many mapping), and these users conduct transactions from their CCs at multiple venues. We consider that the CC firms can track conditions of lost, inactive, or stolen CC, and thus, any transaction made after a complaint lodged by *EU* should be marked as *Tf*. The challenge lies in identifying fraud transaction patterns *Pf*, which look identical to genuine patterns *Pr*. The real-time data at this layer are collected and compiled to form the CC transaction dataset, which is added as a CSV file to the AI layer.

We consider that *EU* uses different applications (portals) to carry out transactions (both normal and abnormal). We consider transaction instances {*T*1, *T*2,..., *Tl*} for *n* users with *q* CC, with the trivial condition *q* ≥ *n*. We consider any user makes *w* transactions in the billing cycle from CCs, which are mapped to *Uid*, and the overall amount *A* is computed at the billing cycle. Thus, the mapping is denoted as follows.

$$\begin{aligned} n &\xrightarrow{\text{performs}} T_l \\ T_l &\xrightarrow{\text{maps}} q \\ U_{id} &\xrightarrow{\text{bill}} A \end{aligned} \tag{9}$$

Specifically, any *Uid* contains the following information.

$$U_{id} = \{ CC_{num}, B_{id}, Txn, TA, CC_{lim}, OD_{lim} \} \tag{10}$$

where *CCnum* denotes the CC number, *Bid* denotes the bank identifier which has issued the CC, *Txn* denotes the total transactions carried out on *CCnum* in the billing cycle, *TA* denotes the total amount (debited) in the billing cycle, *CClim* denotes the spending limit (different for online transactions and offline transactions), and *ODlim* denotes the overdraft limit set for *Uid* on *CCnum*.

In the case of genuine transactions *Tr*, the values of *Txn* over the billing period are not excessively high (a very high value would indicate that the CC is being used unusually frequently). In addition, the location of the CC swipe (online based on gateway tracking and offline based on geolocation) should not be highly distributed (different locations indicate an anomaly). Furthermore, *TA* should not normally exceed *CClim*, and the overdraft should not reach the maximum *ODlim* set for *CCnum*. Fake transactions *Tf* have a high probability of violating any of these conditions, which the AI model captures based on credit histories.
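The conditions above can be sketched as simple rule checks on a billing-cycle record. This is only an illustration of the stated conditions, not the AI model of the scheme: the `BillingRecord` fields and the thresholds `max_txn` and `max_locations` are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BillingRecord:
    """Hypothetical per-cycle summary for one CC number."""
    txn_count: int          # Txn: transactions in the billing cycle
    total_amount: float     # TA: total debited amount
    cc_limit: float         # CC_lim: spending limit
    overdraft_used: float   # overdraft drawn in this cycle
    overdraft_limit: float  # OD_lim: maximum overdraft
    locations: int          # number of distinct swipe locations


def suspicious_flags(r: BillingRecord, max_txn: int = 200,
                     max_locations: int = 5) -> List[str]:
    """List which of the genuine-transaction conditions the record violates."""
    flags = []
    if r.txn_count > max_txn:
        flags.append("high_txn_count")
    if r.locations > max_locations:
        flags.append("scattered_locations")
    if r.total_amount > r.cc_limit:
        flags.append("over_limit")
    if r.overdraft_used >= r.overdraft_limit:
        flags.append("overdraft_maxed")
    return flags
```

A learned model replaces these fixed thresholds with patterns inferred from credit histories, but the flagged quantities are the same ones the text lists.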

For all users, we collect the transaction details and store them in a CSV file, denoted as *CVd*, and *Ti* represents the transaction data.

$$\forall T_i \in CV_d \tag{11}$$

In the CC fraud detection dataset, there are 31 attributes, where attributes *V*<sup>1</sup> − *V*<sup>28</sup> denote features that resulted from transformation via principal component analysis (PCA) and are numerical values. *T* denotes the elapsed time between the current transaction and the first transaction, and a class *C* attribute signifies whether a transaction is *Tf* or *Tr*.

$$A = \{ \{V_1, \dots, V_{28}\}, T, C \}\tag{12}$$

The prepared CSV file is then sent to the AI layer for analysis.

#### *4.2. AI Layer*

At this layer, the CSV file is sent to the XAI module, whose main aim is to reduce the dataset dimensionality and maximize the accuracy of finding the important features *Imp*(*Fs*) of the dataset, which in turn would maximize *A*(*O*), predicting the required output. The dataset dimension is modeled as *rP*×*Q*, where *P* denotes rows, and *Q* denotes columns. Thus, the goal is to select *Imp*(*Fs*) over *Q* so that only important features are presented to the LSTM model and it achieves high accuracy.

#### 4.2.1. Data Preprocessing

This sublayer considers the data preprocessing techniques, including data cleaning, normalization, and dimensionality reduction. In data cleaning, we correct or delete inaccurate, corrupted, improperly structured, redundant, or incomplete data from the dataset *D* obtained from the CSV file. Not applying data preprocessing techniques leads to poor accuracy and high loss while classifying CC frauds. For instance, not applying data normalization leads to a range problem, where a particular value in a column is much higher or lower than the other values in the same column. Furthermore, missing values are not accepted by the AI models; hence, they must be filled with either 0 or a central tendency value, e.g., the mean. Similarly, it is essential to find the best features from the dataset; otherwise, the AI model will not be trained efficiently and will not improve the accuracy and loss parameters. Toward this aim, we utilize the standard preprocessing techniques on the CC fraud data. In place of NaN values, 0 is inserted, and then, the label encoding technique is used to transform string values in columns into a numeric representation. Finally, we consider *C* to denote the different target classes in *D*, and a group transformation is applied, represented by *ω* and *δ*, which divides the data into different timestamps.

$$
\delta_l = \omega_l(\delta_{\text{instances}\times\text{features}}) \quad \forall l \in \mathbb{C} \tag{13}
$$

The NaN data are denoted as [∅]. We first use the *isnull()* function to find the overview of NULL values in the dataset, and the operation is denoted by *Ns*, for NaN Data[∅].

$$N_s = \text{isnull}(CV_d) \tag{14}$$

From the data, the NULL values are then dropped using the *drop()* function, which is denoted as *Dc*. The function is denoted as *Ri* for a particular row.

$$R_i \longrightarrow D_c \tag{15}$$

Next, the remaining NaN values in a column are filled using the *fillna*() function with the mean computed for that column. The mode is used to replace missing categorical data, whereas the mean and median are used to replace missing numerical data. The column of data is denoted as *ci*. After filling in the empty data, the cleaned data are denoted as *CD*.

$$\begin{aligned} A &= \frac{1}{n} \sum_{i=1}^{n} c_i \\ C_D &\longleftarrow c_i \end{aligned}$$
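The missing-value handling described above can be sketched without any external library. The function names in the text (*isnull()*, *drop()*, *fillna()*) resemble data-frame calls; the version below is an assumption-laden stand-in that represents a column as a Python list and a missing entry as `None`, with helper names that are hypothetical.

```python
from collections import Counter
from statistics import mean


def count_missing(values):
    """Count missing entries in a column (the overview step N_s)."""
    return sum(v is None for v in values)


def impute_column(values):
    """Fill None entries: mean for numeric columns, mode for categorical ones."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to estimate a fill value from
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]
```

For example, a numeric column `[1.0, None, 3.0]` is filled with the mean 2.0, while a categorical column is filled with its most frequent value.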

Each target class has a string data type and needs to be transformed into a one-hot encoded vector. Consider *y* as the original target class column, with Γ unique classes. *y* has the shape (*k*, 1) and the data type string. *y* is subjected to a one-hot encoding transformation.

$$y_{k, \Gamma} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \\ 0 \end{bmatrix}, \dots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}$$
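A minimal sketch of the one-hot transformation is given below. It assumes the target column is a list of strings and orders the classes alphabetically; that ordering and the function name are choices made for the example only.

```python
def one_hot(labels):
    """Map a column of k string labels to k one-hot rows over the unique classes."""
    classes = sorted(set(labels))                  # the unique classes
    index = {c: i for i, c in enumerate(classes)}  # class -> position of the 1
    vectors = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        vectors.append(row)
    return classes, vectors
```

Each row contains exactly one 1, at the position of its class, so a column of shape (*k*, 1) becomes a matrix of shape (*k*, Γ).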

Normalization is typically required when attributes are on different scales. Otherwise, the effect of an equally significant attribute on a smaller scale could be diminished because other attributes have values on a larger scale. In statistics, normalization refers to the process of rescaling the data so that the normalized values fall between 0 and 1, where (∀*CD* ∈ *X*normalized)

$$X\_{\text{normalized}} = \frac{(X - X\_{\text{minimum}})}{(X\_{\text{maximum}} - X\_{\text{minimum}})} \tag{17}$$
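Equation (17) can be checked on a small column of values. The snippet below is a minimal sketch; the function name and the sample amounts are illustrative, and the constant-column guard is an added assumption to avoid division by zero.

```python
def min_max_normalize(column):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: every value maps to 0.0 (assumed convention).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

amounts = [5.0, 10.0, 25.0, 105.0]     # hypothetical transaction amounts
print(min_max_normalize(amounts))      # [0.0, 0.05, 0.2, 1.0]
```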

Dimensionality reduction is the technique of minimizing the number of random variables or attributes taken into account. In many real-world applications, reducing high-dimensional data is a crucial stage of data preprocessing, and it has become one of the key problems in data mining. It is denoted as *Dr*.

$$D\_r = \{x\_i, y\_i\}$$

$$\mathbf{x} \in \Pi \mathbb{R}^{\mathbb{Y}}$$

#### 4.2.2. XAI Module

In this work, we propose an XAI module that uses a boosting approach, combining several weak classifiers to form a strong classifier. The dataset *r* of size *P* × *Q* contains 284,808 rows and 31 columns. The XAI function (Ψ) determines the highest-priority features among the *Q* columns.

We use the feature importance technique of XAI to support CC fraud detection. Selecting features with XAI yields better predictions than training without it. The XAI function (Ψ), which gives the highest-priority features among the dataset columns *(Q)*, is applied to the dataset as follows.

$$\zeta = \Psi \{ \forall r^{284808 \times 31} \} \tag{19}$$

After applying the XAI function to the dataset, we obtain the important features, and the dimensionality is reduced. The output of the XAI function is denoted as *ζ*, and ℝ denotes the new dimension of the dataset.

$$\zeta \longleftarrow ( \mathbb{R}^{284808 \times 20} )\tag{20}$$
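The reduction from 31 to 20 columns amounts to keeping the columns with the largest importance scores. The sketch below shows this selection on a toy table; the function name `select_top_k`, the importance values, and the rows are hypothetical, and in practice the scores would come from the XAI model.

```python
def select_top_k(rows, feature_names, importance, k):
    """Keep the k columns with the largest importance scores."""
    ranked = sorted(feature_names, key=lambda name: importance[name], reverse=True)
    kept = ranked[:k]
    kept_idx = [feature_names.index(name) for name in kept]
    reduced = [[row[i] for i in kept_idx] for row in rows]
    return kept, reduced

feature_names = ["V14", "V17", "V2", "V12"]
# Hypothetical importance scores, for illustration only.
importance = {"V14": 0.40, "V17": 0.30, "V2": 0.05, "V12": 0.25}
rows = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

kept, reduced = select_top_k(rows, feature_names, importance, k=3)
print(kept)      # ['V14', 'V17', 'V12']
print(reduced)   # [[1.0, 2.0, 4.0], [5.0, 6.0, 8.0]]
```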

#### 4.2.3. LSTM Module Post-XAI: The X-LSTM Approach

Once XAI selects *Imp*(*Fs*), we send the selected features as inputs to the LSTM model. We use flattening to feed information from multiple vectors into the classification model as a 1D array. The preprocessed data are called *CVp*. In stochastic gradient descent, the parameters are updated immediately after computing the gradient for one training sample, i.e., it is a mini-batch with a batch size of 1. We repeat the two procedures, flattening and mini-batch selection, for all the training samples. The mini-batch is represented as B, as follows.

$$
\hat{y} = \theta^T \mathbb{B} + r \tag{21}
$$

The loss function, denoted as *μ*, is used to evaluate how well the proposed algorithm predicts on the selected data. All preprocessed feature data move on to the LSTM block. A hidden LSTM unit has several gates, each completing a certain function. The tanh and sigmoid activation functions manage the gates' input and output. Within the LSTM cell, the sigmoid and tanh activation functions are applied to the input vector *x*. Figure 3 shows the details of the LSTM cell for processing our XAI output.

$$\mu(\theta, B) = \sum\_{i=1}^{n} -y \log(\hat{y}) \tag{22}$$

subject to,

$$
\theta\_{k+1} = \theta\_k - \alpha \nabla f\_j(\theta) \tag{23}
$$

where *θ* denotes the weights, B is the mini-batch, and *r* is the bias value. The actual class is *y*, and *ŷ* is the predicted class. *α* is the learning rate, and ∇ denotes the gradient.

$$CV\_p \longrightarrow B\_s \tag{24}$$
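To make the update rule in Equations (21)-(23) concrete, the sketch below trains a single linear unit with a batch size of 1. It is an illustration under stated assumptions, not the LSTM itself: a sigmoid is applied to the linear score so that the prediction lies in (0, 1), the loss is written in its binary cross-entropy form, and the toy samples, learning rate, and function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, bias, x):
    """Linear score theta^T x + bias, squashed to (0, 1) by a sigmoid (assumption)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)) + bias)

def cross_entropy(theta, bias, samples):
    """Binary cross-entropy summed over all samples."""
    total = 0.0
    for x, y in samples:
        p = predict(theta, bias, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total

def sgd_epoch(theta, bias, samples, alpha):
    """One pass with batch size 1: parameters are updated after every sample."""
    for x, y in samples:
        error = predict(theta, bias, x) - y     # gradient of the loss w.r.t. the score
        theta = [t - alpha * error * xi for t, xi in zip(theta, x)]
        bias = bias - alpha * error
    return theta, bias

# Hypothetical, linearly separable toy data: (features, class).
samples = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.8], 1)]
theta, bias = [0.0, 0.0], 0.0
loss_before = cross_entropy(theta, bias, samples)
for _ in range(200):
    theta, bias = sgd_epoch(theta, bias, samples, alpha=0.5)
loss_after = cross_entropy(theta, bias, samples)

print(loss_after < loss_before)   # True
```

Updating after every sample makes each step cheap but noisy; larger batches average the gradient over more samples before each update.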

**Figure 3.** LSTM architecture.

The RMSProp optimizer is used, and the loss function is MSE. The algorithm addresses the *C*1, *C*2, and *C*3 conditions of *Pf* under the operational constraint *OC*1, as our proposed model training preprocesses the data. Thus, the training response is obtained in a finite bounded period (it does not exceed *Tmax*). The algorithm effectively classifies *Tf* from *Tr* transactions, and *OC*2 is satisfied. A deterministic output *Ay* = {0, 1} is obtained from the LSTM block, which satisfies our *OC*3 condition. *Tloss* is under an experimentally observed *Tthresh*, which is obtained from successive process runs on different CC datasets. This satisfies the *OC*4 condition.

#### *4.3. BC Layer*

At this layer, the classification output *Ay* obtained from the CC fraud detection dataset is stored, together with the transaction details, in an SC. We consider the transaction details *X*, where every row corresponds to a user transaction and every column corresponds to a feature. The LSTM model outputs *Ay* based on the classification function *F*(*X*) = *Y*. In the SC, we consider a function *storeDetails*(*X*,*Y*), which takes the inputs *X* and *Y* from the LSTM output. The SC is executed, and from the details we generate a JavaScript Object Notation (JSON) file, which is denoted as *JS*(*X*,*Y*). Next, we use the IPFS API to publish *JS*(*X*,*Y*) on the IPFS network, which generates the content key *CK*(*IPFS*) and the hash of *CK*(*IPFS*), which is denoted as *HCK*. *HCK* is stored as a transaction on a public BC network using the *storeHash(HCK)* function.
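The off-chain part of this flow (serialize, derive a content key, hash the key) can be sketched as follows. The snippet is only an approximation under stated assumptions: a SHA-256 hex digest stands in for the IPFS content key, whereas real IPFS content identifiers use a multihash encoding, and no network or contract call is made. The function names and the sample transaction are hypothetical.

```python
import hashlib
import json

def build_record(transaction, label):
    """Serialize the transaction details X and the class label Y as JSON, i.e., JS(X, Y)."""
    return json.dumps({"transaction": transaction, "label": label}, sort_keys=True)

def content_key(payload):
    """Stand-in for the IPFS content key CK: a SHA-256 digest of the payload."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def hash_of_key(ck):
    """HCK: hash of the content key, the value that would be stored on-chain."""
    return hashlib.sha256(ck.encode("utf-8")).hexdigest()

# Hypothetical transaction record.
tx = {"amount": 120.5, "location": "Pretoria", "timestamp": 1690000000}
payload = build_record(tx, label=1)
ck = content_key(payload)
hck = hash_of_key(ck)

print(len(ck), len(hck))   # 64 64
```

Because the key is derived from the content, any change to the stored record produces a different key and therefore a different on-chain hash, which is what makes tampering detectable.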

Algorithm 1 presents the details of the SC. In the case of fraud transaction detection, all nodes of the BC are notified of the account from which it is detected.


Initially, we connect to the Ethereum network using Web3.js and send a transaction to the SC [44]. Next, an account is created to which the function *FraudNotification*() has access, and a message *M* is added with timestamp *T* in a block in case the function *Notify\_Fraud\_Detected* returns *true*. In such a case, the required gas fee *G*(*Ce*) should be available in the owner's address to deploy the contract. This satisfies the *OC*5 condition of *Pf*. Next, the executed contract produces bytecode, and input data are supplied to the contract. The contract data are also published on the IPFS network, and *CK* and *HCK* are generated, which links the published contract with the BC network. Once complete, the function *close*() closes the connection to the Ethereum network and frees up the EVM memory.

For the experiment, we execute SC functions such as *getFraudStatus*(), which returns a boolean indicating the fraud status of the *cardholder* associated with the Ethereum address that is calling the function. It retrieves the cardholder struct associated with the caller's address through the cardholders mapping and returns the *isFraud* boolean value of the *cardholder* struct. A true value indicates that the cardholder has engaged in fraudulent activities, while a false value indicates that the cardholder has not. *checkFraudTransactionStatus*() is a public view function in the FraudDetection contract that takes an Ethereum address as input, retrieves the associated *cardholder* struct, and returns a boolean indicating whether the cardholder has engaged in fraudulent activities. A true value means that the cardholder has engaged in fraud, while a *false* value means the opposite. *getTransactionsAmounts*() and *getTransactionsLocations*() are public view functions defined in the FraudDetection contract that retrieve and return the transaction amounts and locations, respectively, of the cardholder that is making the function call. Both functions access the *allTransactionAmounts* and *allTransactionLocations* arrays stored in the cardholder struct that is associated with the cardholder's Ethereum address.
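The behavior of these view functions can be mirrored with a small in-memory model. The Python class below is a mock for illustration, not the deployed Solidity contract: the method names are Python renderings of the functions described above, the `record` helper and the addresses are hypothetical, and an unknown address is assumed to return default (non-fraud, empty) values.

```python
from dataclasses import dataclass, field

@dataclass
class Cardholder:
    is_fraud: bool = False
    all_transaction_amounts: list = field(default_factory=list)
    all_transaction_locations: list = field(default_factory=list)

class FraudDetectionMock:
    """In-memory stand-in for the contract state; not deployed code."""

    def __init__(self):
        self.cardholders = {}   # address -> Cardholder, mirrors the cardholders mapping

    def record(self, address, amount, location, is_fraud):
        """Hypothetical helper that stores one classified transaction."""
        holder = self.cardholders.setdefault(address, Cardholder())
        holder.all_transaction_amounts.append(amount)
        holder.all_transaction_locations.append(location)
        holder.is_fraud = holder.is_fraud or is_fraud

    def get_fraud_status(self, caller):
        """Fraud flag of the cardholder that makes the call."""
        return self.cardholders.get(caller, Cardholder()).is_fraud

    def check_fraud_transaction_status(self, address):
        """Fraud flag of the cardholder at the given address."""
        return self.cardholders.get(address, Cardholder()).is_fraud

    def get_transactions_amounts(self, caller):
        return list(self.cardholders.get(caller, Cardholder()).all_transaction_amounts)

    def get_transactions_locations(self, caller):
        return list(self.cardholders.get(caller, Cardholder()).all_transaction_locations)

contract = FraudDetectionMock()
contract.record("0xA1", 120.0, "Pretoria", is_fraud=False)
contract.record("0xA1", 9800.0, "Unknown", is_fraud=True)
contract.record("0xB2", 35.0, "Riyadh", is_fraud=False)

print(contract.get_fraud_status("0xA1"))                 # True
print(contract.check_fraud_transaction_status("0xB2"))   # False
print(contract.get_transactions_amounts("0xA1"))         # [120.0, 9800.0]
```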

### **5.** *RaKShA***: Performance Evaluation**

In this section, we discuss the performance evaluation of the proposed scheme. First, we discuss the tools and setup and present the CC datasets (dataset 1 and dataset 2) used for simulation. Then, we present the simulation parameters for the LSTM model and the BC setup, followed by the X-LSTM performance analysis, the SC design, and the performance analysis of BC metrics. The details are presented as follows.

#### *5.1. Experimental Tools and Setup*

For the experiment, we used the Google Colab platform with Python and imported the required set of libraries: NumPy for linear algebra, Fourier transforms, and matrices; Pandas for data loading and manipulation; and Matplotlib for visualizing the data. The SC is developed in the Solidity programming language on the Remix IDE for BC simulation. The LSTM is defined with parameters such as the number of epochs, batch size, and loss function. We compared different optimizers for the model's accuracy [45].

#### *5.2. Dataset*

Two open-source CC datasets are analyzed for fraud transaction detection. The first dataset [46] contains CC transactions made by European CC users in September 2013. The dataset is first balanced, and the explicit features are hidden via PCA, so that the features {*V*1, *V*2, ... , *V*28} are present. The other features are the time (the elapsed time between each transaction and the initial transaction) and the transaction amount. In this dataset, 0.17% of the records belong to class 1, which is used to predict future transactions. The dataset contains 284,808 records. We experiment on this dataset with and without XAI-selected data to compare the predictions obtained by the two techniques.

The second CC dataset is taken from the UCI repository [47] and contains CC applications. The attribute names are hidden (for privacy preservation and to assure user confidentiality). The dataset contains continuous, nominal, and large values, with 690 instances and 15 attributes. This dataset has attributes that help predict the class labels. There are 22.1% class 1 and 78% class 0 data for prediction.

We also analyze the different vectors to understand the behavior of the data. Figure 4 visualizes each vector against the transaction amount. Each point in the scatter plot represents a transaction in the dataset. The amount variable represents the amount of money involved in the transaction, while each V variable is a transformed feature that the model uses to detect fraud. The X-axis represents the values of the V variable, and the Y-axis represents the transaction amount. By plotting the amount against each V variable, we can see whether there is any relationship between these variables and the transaction amount. For example, if there is a strong positive correlation between the amount and a particular V variable, transactions with higher values of that variable might be more likely to involve larger amounts of money. Conversely, if there is a negative correlation between the amount and a V variable, then transactions with lower values of that variable might be more likely to involve larger amounts of money.

#### *5.3. Simulation Parameters*

For predicting the output, parameter selection plays an important role. In this work, we have considered two epoch values: 50 and 500 epochs. Furthermore, the batch size is a hyperparameter that defines the number of samples to work through before updating the internal model parameters. When training neural networks, the batch size regulates how accurately the error gradient is estimated. The details of the hyperparameters are presented in Table 2. Along similar lines, Table 3 presents the BC and SC setup parameters.

**Figure 4.** Vector visualization of dataset.



**Table 3.** BC and SC parameters.


#### *5.4. X-LSTM Analysis*

In this section, we present the simulation results of the proposed X-LSTM network based on the tuned hyperparameters. We first present the details of the XAI feature selection process, which exploits a boosting mechanism to create a strong classifier from weak classifiers. We consider that XGBoost handles the relationships between the data and the associated distribution. Initially, we consider the SHapley Additive exPlanations (SHAP) model on the CC fraud dataset [46].

To validate the results obtained from the SHAP beeswarm plot, we plot the variable importance plot on the same dataset. Figure 5a presents the details. This plot shows the importance of a variable (attribute) in the output classification. Thus, the plot signifies how much the accuracy is affected by the exclusion of a variable. The variables are presented in decreasing order of importance. The x-axis shows the mean decrease in the Gini coefficient; thus, the higher the mean decrease in the Gini score, the higher the importance of the variable in the output prediction. From the figure, it is evident that attributes *V*14, *V*17, and *V*12 are the most important, and attributes *V*21, *V*6, and *V*2 are the least important. The plot closely agrees with the SHAP beeswarm plot in most instances, thus validating our selection of important attributes for the LSTM model.
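The quantity behind the variable importance plot is the decrease in Gini impurity produced by splitting on a variable. The sketch below computes it for a single split on binary labels; it is a simplified illustration (one split, one threshold), whereas a tree ensemble averages such decreases over many splits. The function names and the toy values are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def gini_decrease(labels, feature, threshold):
    """Impurity of the parent node minus the weighted impurity of the two children."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)

labels = [0, 0, 1, 1]
informative = [0.1, 0.2, 0.8, 0.9]      # separates the classes at 0.5
uninformative = [0.1, 0.9, 0.2, 0.8]    # mixes the classes on both sides

print(gini_decrease(labels, informative, 0.5))     # 0.5
print(gini_decrease(labels, uninformative, 0.5))   # 0.0
```

A variable whose splits remove more impurity receives a larger mean decrease and therefore ranks higher in the plot.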

Figure 5b shows the details of the beeswarm SHAP plot. The results show the importance of the different features. In the plot, the y-axis represents the features, and the x-axis shows the feature's importance. For example, features *V*14, *V*17, and *V*10 have a high SHAP value, which signifies a positive impact on CC fraud prediction. By contrast, features *V*20, *V*8, and *V*15 have a negative SHAP impact and thus are less important to the output prediction.

SHAP offers different types of plots for feature importance, such as the waterfall plot. SHAP works by first computing the baseline value of the output, which is the expected value of the output when all input features have their average or most common values. Then, for each instance to be explained, it calculates the contribution of each feature to the difference between the actual output and the baseline output. In Figure 5c,d, each feature's contribution is represented as a vertical bar that either adds to or subtracts from the baseline value. The height of the bar represents the magnitude of the contribution. Figure 6 shows the SHAP force plot, which also shows the interaction effects between features [48]. These interaction effects are represented as connections between the features, and the thickness of the connection represents the strength of the interaction.
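The baseline-plus-contributions decomposition can be verified by hand for a linear model, for which the SHAP value of feature *i* reduces to the weight times the deviation of the feature from its mean (assuming independent features). The sketch below is a worked example of that special case only; it does not use the SHAP library, and the weights, background rows, and instance are hypothetical.

```python
def linear_model(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

def linear_shap(weights, bias, background, x):
    """Baseline and per-feature contributions for a linear model."""
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(weights))]
    baseline = linear_model(weights, bias, means)    # expected output at average inputs
    contributions = [w * (xi - m) for w, xi, m in zip(weights, x, means)]
    return baseline, contributions

weights, bias = [2.0, -1.0], 0.5
background = [[0.0, 1.0], [2.0, 3.0]]    # hypothetical reference data
x = [3.0, 1.0]                           # instance to be explained

baseline, contributions = linear_shap(weights, bias, background, x)
print(baseline)                          # 0.5
print(contributions)                     # [4.0, 1.0]
print(baseline + sum(contributions) == linear_model(weights, bias, x))   # True
```

The last line checks the additivity property that a waterfall plot visualizes: the baseline plus all contributions equals the model output for the instance.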

**Figure 6.** Force SHAP model.

In the study of feature selection, the ELI5 model is also used to present the feature importance of the data. ELI5 stands for Explain Like I'm Five, and it is used for the interpretation and explanation of machine learning models [49]. These methods help to identify the most influential features and how they affect the model's output. To examine and interpret ML classifiers, ELI5 computes feature weights for tree-based models using the Gini index [48,50]. The tabulated weights determined for each parameter are displayed in Figure 7a. The features are ranked and given weights according to their significance (the most important parameter is at the top).

LIME is a model interpretation and explanation method applied to machine learning [48]. Figure 7b presents the LIME graph for CC fraud detection. In the figure, green indicates features that are positively correlated with the local prediction, and red indicates the opposite correlation. The fundamental goal is to create a collection of "local surrogate models" that may be used to explain how the original model makes predictions in a specific situation. To accomplish this, LIME first creates a collection of "perturbed" instances that each have slightly different feature values from the original instance. A local surrogate model, such as a linear model or decision tree, is then trained using these perturbed instances to mimic the behavior of the original model in the immediate vicinity of the instance to be explained. It thus also serves as a model for presenting feature importance for better prediction.
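The local surrogate idea can be illustrated in one dimension. The sketch below is not the LIME library: it perturbs a single feature on a fixed symmetric grid (LIME samples perturbations randomly), weights each perturbed point by an exponential proximity kernel, and fits a weighted straight line. The black-box function, kernel width, and offsets are assumptions chosen so the result is easy to check.

```python
import math

def black_box(x):
    """Hypothetical non-linear model to be explained."""
    return x * x

def local_surrogate(f, x0, offsets, kernel_width):
    """Fit a proximity-weighted line to f around x0 and return (slope, intercept)."""
    xs = [x0 + d for d in offsets]                     # perturbed instances
    ys = [f(x) for x in xs]                            # black-box predictions
    ws = [math.exp(-(d * d) / kernel_width ** 2) for d in offsets]   # closer = heavier
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    return slope, y_bar - slope * x_bar

offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
slope, intercept = local_surrogate(black_box, 1.0, offsets, kernel_width=0.25)
print(round(slope, 6))   # 2.0
```

The fitted slope matches the local rate of change of the black box at the explained point, which is the sense in which the surrogate "explains" the prediction there.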

#### 5.4.1. LSTM Performance without XAI Input Selection

Firstly, we present the accuracy comparison obtained by directly applying the model without considering the XAI output. For the LSTM, we check the accuracy for 50 and 500 epochs, with a batch size of 200. We have applied the LSTM model to both datasets [46,47]. Figure 8a shows the accuracy and loss graphs of the LSTM for 50 epochs. A maximum accuracy of 60% is achieved with the RMSProp optimizer on the CC fraud detection dataset. For 500 epochs, Figure 8b shows the results. The model gives 85% accuracy with the RMSProp optimizer. Similarly, for the UCI dataset, Figure 8c reports an accuracy of 76% with 50 epochs, and Figure 8d reports an accuracy of 80% with 500 epochs.

#### 5.4.2. LSTM Performance with XAI Input Selection

Next, we present the performance comparison of the model with inputs taken from the XAI output. We ran the model for 50 and 500 epochs on the CC fraud detection dataset [46]. Figure 8e shows the result on the CC-FF dataset for 50 epochs, and Figure 8f shows the result for 500 epochs. Table 4 presents the comparative analysis for the CC-FF dataset (with and without XAI) for 50 and 500 epochs, respectively. Furthermore, the proposed work is compared with [41], which used the same dataset to detect CC fraud patterns. However, that work was carried out without applying XAI, which implies that its AI model did not analyze the essential features of CC fraud. Hence, that approach offers 96% training accuracy. In contrast, the proposed work exploits the properties of XAI and offers an accuracy of 99.8% without overfitting the LSTM model (as shown in Table 5). In addition, once the AI models classify the data, the data must be secured against data manipulation attacks. However, [41] has not adopted any security feature in the proposed work. By contrast, we have used an IPFS-based public blockchain for secure storage against data manipulation attacks. This addresses the security and privacy concerns of the proposed scheme.

**Figure 7.** Comparison of Eli5 and LIME XAI models. (**a**) Eli5 Model. (**b**) Lime XAI Model.

**Table 4.** Accuracy and loss comparison of X-LSTM.


**Figure 8.** Performance of AI algorithm based on datasets 1 and 2. (**a**) LSTM 50 epochs RMSprop optimizer of dataset 1. (**b**) LSTM 500 epochs RMSprop optimizer of dataset 1. (**c**) LSTM 50 epochs RMSprop optimizer of dataset 2. (**d**) LSTM 500 epochs RMSprop optimizer of dataset 2. (**e**) LSTM and XAI model on 50 epochs of CC-FF dataset. (**f**) LSTM and XAI model on 500 epochs of CC-FF dataset.


**Table 5.** Performance analysis of XAI models for 500 epochs.

#### 5.4.3. Evaluation Metrics

In AI, precision, recall, and accuracy are crucial performance metrics that enable us to quantitatively assess a model's capacity to correctly classify positive and negative instances. These metrics allow us to compare the performance of various models, identify particular areas where a model may need improvement, and are simple enough for both technical and non-technical audiences to comprehend. They are therefore heavily considered when assessing the effectiveness of binary classification models.

• Precision (P): Out of all the positive predictions produced by the model, precision is the percentage that are actually positive. In other words, precision assesses the reliability of the positive predictions. A high precision value means that the model rarely labels a negative instance as positive.

$$
\mathfrak{P} = \frac{\psi}{\psi + \Xi} \tag{25}
$$

• Recall (R): Out of all the actual positive instances in the dataset, recall is the percentage that are predicted as positive. Recall, then, gauges the model's ability to recognize every positive instance in the dataset. A high recall value means that the model rarely misses a positive instance.

$$\mathfrak{R} = \frac{\psi}{\psi + \xi} \tag{26}$$

• Accuracy (A): The percentage of accurate forecasts among all the model's predictions is known as accuracy. In other terms, accuracy assesses how well the model can categorize positive and negative instances accurately. A high accuracy rating shows that the model can accurately classify the majority of the dataset's instances.

$$\mathfrak{A} = \frac{\psi + \varsigma}{\psi + \varsigma + \Xi + \xi} \tag{27}$$

where true positive, true negative, false positive, and false negative are represented as *ψ*, *ς*, Ξ, and *ξ*, respectively.
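The three metrics follow directly from the four confusion-matrix counts. The sketch below evaluates Equations (25)-(27) for a hypothetical set of counts; the numbers are illustrative and are not results from the experiments.

```python
def precision(tp, fp):
    """Share of positive predictions that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual positives that the model finds."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: true positives, true negatives, false positives, false negatives.
tp, tn, fp, fn = 90, 880, 10, 20

print(precision(tp, fp))              # 0.9
print(round(recall(tp, fn), 4))       # 0.8182
print(accuracy(tp, tn, fp, fn))       # 0.97
```

On heavily imbalanced fraud data, accuracy alone can look high even when recall is poor, which is why the three values are reported together.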

A binary classification model's effectiveness is graphically depicted by the Receiver Operating Characteristic (ROC) curve. It plots the true positive rate (TPR) against the false positive rate (FPR) at various classification thresholds. The TPR, often called sensitivity or recall, is the percentage of true positives (positive examples that were classified correctly) among all actual positive instances. The FPR, on the other hand, is the ratio of false positives (negative cases that were wrongly classified as positive) to all actual negative instances. The ROC curve depicts the trade-off between these two rates at different classification thresholds. The area under the ROC curve (AUC) is a metric that quantifies the overall performance of the model across all possible classification thresholds. The AUC ranges from 0 to 1, where a perfect classifier has an AUC of 1, while a random classifier has an AUC of 0.5. Generally, a higher AUC indicates a better performance of the model. Our model achieves a ROC-AUC of 0.97, as shown in Figure 9.
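Both quantities can be computed by hand on a toy example. The sketch below evaluates the TPR and FPR at one threshold and computes the AUC through its rank interpretation, i.e., the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (ties count one half). The labels and scores are hypothetical.

```python
def tpr_fpr(labels, scores, threshold):
    """True and false positive rates when scores >= threshold are predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)

def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]     # hypothetical model scores

print(tpr_fpr(labels, scores, 0.5))   # (0.5, 0.0)
print(roc_auc(labels, scores))        # 0.75
```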

**Figure 9.** ROC-AUC Graph.

#### *5.5. Smart Contract Design*

In the proposed scheme, the LSTM classification output is published via the SC, which helps any public user to find whether a new transaction is fake or real. In the SC, we have considered the transaction details, the amount, the sender and receiver addresses, the location of the transaction, and the transaction timestamp. The fraud conditions are set based on the anomalies reported by the X-LSTM model. Figure 10 presents the capture of the fraud transaction (Call\_Notify\_Fraud\_Detected(*Ms*) function), as depicted in Algorithm 1, which is indicated by the red box in the figure. Some common operating conditions include the execution of a transaction from a new location (not close to the user location), a transaction amount exceeding a specified threshold, and account debits to multiple unknown parties. In the SC, there are two boolean functions: *checkFraudTransaction*, which checks whether the transaction is fraudulent based on the LSTM classification, and *getFraudStatus*, which reports the fake transaction details.
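The operating conditions listed above can be expressed as simple rule checks. The sketch below is a hypothetical Python illustration of those three conditions, not the contract logic from Algorithm 1: the field names, the profile structure, and the threshold are assumptions, and location matching is reduced to a membership test instead of a distance calculation.

```python
def fraud_flags(tx, profile):
    """Return the names of the operating conditions that a transaction violates."""
    flags = []
    if tx["location"] not in profile["usual_locations"]:
        flags.append("new_location")
    if tx["amount"] > profile["amount_threshold"]:
        flags.append("amount_over_threshold")
    unknown = [r for r in tx["recipients"] if r not in profile["known_parties"]]
    if len(unknown) > 1:
        flags.append("multiple_unknown_parties")
    return flags

# Hypothetical user profile and transactions.
profile = {
    "usual_locations": {"Pretoria", "Johannesburg"},
    "amount_threshold": 5000.0,
    "known_parties": {"0xB2", "0xC3"},
}
normal_tx = {"location": "Pretoria", "amount": 120.0, "recipients": ["0xB2"]}
odd_tx = {"location": "Unknown", "amount": 9800.0, "recipients": ["0xD4", "0xE5"]}

print(fraud_flags(normal_tx, profile))   # []
print(fraud_flags(odd_tx, profile))
# ['new_location', 'amount_over_threshold', 'multiple_unknown_parties']
```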

**Figure 10.** Fraud transaction SC functions.

#### *5.6. BC Performance Metrics*

In this section, we discuss the performance of the BC, which stores the information of the SC details. We consider the gas value for the SC design and the IPFS bandwidth for the analysis. The details are presented as follows. After classification, we forward the non-attack data to be stored on IPFS and the content hash to be stored on the public BC [51]. Financial stakeholders authenticate the non-attack data in the contract, and the contract is executed.

#### 5.6.1. Gas Value for SC

Gas is a unit of measurement for the computing work needed to execute the code in an SC on the BC network. Figure 11a presents the gas cost of transaction and execution. The complexity of the code within an SC determines how much gas is needed for the contract to function. When a user starts a transaction that interacts with a smart contract, the user specifies the quantity of gas they are ready to pay for the transaction to be carried out. The transaction might fail, and all fees paid would be forfeited, if the gas limit is too low. Conversely, the user will pay more fees than necessary if the gas limit is too high.

#### 5.6.2. IPFS Bandwidth

IPFS is a peer-to-peer network where the data are stored and exchanged between nodes in the BC network. Figure 11b indicates the IPFS transfer and receive bandwidth over time. When a user requests data from IPFS, the data are retrieved from the network of nodes rather than from a centralized server. The data are thus dispersed across many nodes, which can speed up data retrieval. However, it also means that bandwidth is an important consideration for IPFS users, as the speed of data transfer depends on the available bandwidth of the nodes on the network.

**Figure 11.** (**a**) Gas cost consumption. (**b**) IPFS bandwidth utilization.

#### 5.6.3. Scalability Graph

The scalability of a blockchain is determined by the transactions per second (TPS) offered by the blockchain network (BN). Figure 12 compares the scalability of the BC used in the proposed system (Ethereum) with that of a conventional blockchain network (BCN). The X-axis in this graph shows the transaction time in milliseconds, and the Y-axis shows the number of transactions. The proposed method enables more transactions to be added to the BC. Moreover, IPFS can store a large amount of data and fetch it quickly: the data are kept in IPFS, and only the hashes of the IPFS data are sent to the BC. According to the graph, the proposed strategy using the Ethereum blockchain (EB) outperforms the conventional approach using Bitcoin, which lacks the advanced technological features offered by the EB.

#### *5.7. Potential Limitations and Future Scope*

In this section, we discuss the potential limitations and future improvements of the proposed *RaKShA* scheme. The scheme offers the benefits of CC-FF detection via the proposed X-LSTM approach, with the results then stored via an SC executed on a public BC network. However, there are some potential limitations that we need to consider in the approach.

Firstly, using public BC for real-time financial data analysis might not be feasible owing to a large amount of real-time data collection (transactions) by financial stakeholders. Secondly, financial data are highly confidential and are subjected to financial laws, and thus, the privacy preservation of the records becomes a critical issue.

Thirdly, the proposed approach requires a significant amount of computing power and resources to assure user scalability. Thus, it requires access to cloud servers for resources, which again jeopardizes the privacy and security of data storage. Fourthly, the proposed X-LSTM approach must be resilient enough to detect emerging CC fraud patterns. In this case, the model needs to continuously train and update itself to recognize zero-day patterns, which might make the model bulky over time and limit its effectiveness. Thus, the presented limitations must be carefully studied while designing practical solutions for financial ecosystems.

These limitations open up new avenues for the future expansion of the proposed scheme. To address the issue of privacy preservation, the proposed scheme needs to incorporate random noise (differential privacy) to thwart any possible linkage attacks. To improve the learning and accuracy of the X-LSTM model, more specific features must be generated by XAI models, which calls for the design of optimized XAI models that could improve the LSTM output. Finally, the proposed SC can be further optimized to improve the IPFS bandwidth, improving the transaction scalability of public BC networks.

**Figure 12.** Scalability graph.

#### **6. Concluding Remarks**

The paper proposed a novel CC fraud detection scheme, *RaKShA*, in which we integrated XAI with the LSTM model (X-LSTM) and verified the output via an SC. The results are stored in IPFS, which is referenced on the public BC network. The proposed approach addresses the limitations of traditional fraud detection by providing model interpretability, improved accuracy, security, and transparency. Modeling X-LSTM augmented the power of the LSTM model in CC-FF detection and made the scheme scalable and adaptable, which helps users to protect themselves from FF. We validated the proposed layered reference scheme against two CC datasets and presented a comparative analysis of LSTM accuracy and loss (with and without XAI interpretation). For 500 epochs, an accuracy of 99.8% is reported with XAI, which shows an improvement of 17.41% over the simple LSTM model. The use of the SC and public BC ensures that the fraud detection data are accessible and verifiable by all users, which makes the proposed scheme a useful CC-FF auditing tool at a low cost.

The presented scheme opens exciting opportunities to improve financial ecosystems' security and transparency barriers. The scheme applies not only to CC frauds but is extensible to insurance, tax evasion, and web transaction frauds. In different use cases, the underlying semantics remain common; however, fine-tuning the proposed scheme according to use case practicality needs to be considered for optimal solutions.

**Author Contributions:** Conceptualization: J.R., P.B., N.K.J., G.S., P.N.B. and S.T.; writing (original draft preparation): M.E., A.T., J.R. and M.S.R.; methodology: S.T., G.S., A.T. and P.N.B.; writing (review and editing): S.T., P.B., J.R., N.K.J. and M.S.R.; investigation: M.S.R., S.T., G.S. and M.E.; supervision: S.T., P.N.B. and P.B.; visualization: P.B., J.R., M.E., N.K.J. and M.S.R.; software: A.T., J.R., S.T. and M.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the Researchers Supporting Project Number (RSPD2023R681) King Saud University, Riyadh, Saudi Arabia and also funded by the University of Johannesburg, Johannesburg 2006, South Africa.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** No new data were created or analyzed in this study. Data sharing is not applicable to this article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
