*3.5. Interpretability Criteria*

The interpretability criteria are essential for evaluating the model's behavior throughout the evaluations. The eFNN-SODA model adopts the approaches proposed in [37] to validate these criteria, so that the generated fuzzy rules are reliable. To this end, the model applies the following procedures throughout the experiments to ensure that the generated fuzzy rules add knowledge about the evaluated dataset:

• Simplicity and Distinguishability: These two criteria verify whether the proposed model is the simplest and whether its structures remain distinguishable during training. This means that the evaluation revolves around low complexity and high accuracy. Regarding simplicity, the eFNN-SODA is expected to have a compact structure and a high degree of assertiveness. The criterion defined for the identification of model simplicity (in the comparison between models) can be expressed by:

$$\text{if } \{(L_a < L_b) \vee (accuracy_{model_a} \geqslant accuracy_{model_b})\} \text{ then } model_a \text{ is simpler than } model_b \tag{46}$$

where $L_a$ and $L_b$ are, respectively, the number of fuzzy rules of $model_a$ and $model_b$. Regarding distinguishability, the aim is to assess whether there is overlapping among the structures formed in the fuzzification process. The SODA approach already assesses overlapping during the evolution process in its sixth stage, thus ensuring that this situation does not occur. To evaluate the distinguishability criterion, eFNN-SODA compares the similarity between the Gaussian neurons formed in the first layer (termed $z_i(bef)$ and $z_i(after)$) dimension-wise, and the similarity degree $Sim(z_i(bef), z_i(after))$ can be used to calculate an amalgamated value. The degree of change $\Delta(z_i)$ is then given by [11]:

$$\Delta(z_i) = 1 - Sim(z_i(bef), z_i(after))\tag{47}$$

where $bef = N - n$ and $after = N$, assuming that $n$ new samples have passed through the data-stream-based transformation phase, with $N$ samples processed so far for model training and adaptation [11].

Therefore, two rules are identical only if all their antecedent parts are equivalent. The x coordinates of the points of intersection of the two Gaussians used as fuzzy sets in the identical antecedent part (here, the $j$th) of rule $i$ before and after its update can be estimated by [56]:

$$inter_x(1,2) = \frac{c_{bef,j}\,\sigma_{after,j}^2 - c_{after,j}\,\sigma_{bef,j}^2}{\sigma_{after,j}^2 - \sigma_{bef,j}^2} \pm \sqrt{\left(\frac{c_{bef,j}\,\sigma_{after,j}^2 - c_{after,j}\,\sigma_{bef,j}^2}{\sigma_{after,j}^2 - \sigma_{bef,j}^2}\right)^2 - \frac{c_{bef,j}^2\,\sigma_{after,j}^2 - c_{after,j}^2\,\sigma_{bef,j}^2}{\sigma_{after,j}^2 - \sigma_{bef,j}^2}} \tag{48}$$

with $c_{bef,j}$ and $\sigma_{bef,j}$ being the $j$th center coordinate and standard deviation of the Gaussian neuron before its update, and $c_{after,j}$ and $\sigma_{after,j}$ the $j$th center coordinate and standard deviation of the Gaussian neuron after its update [11].

The maximal membership degree of the two Gaussian membership functions in the intersection coordinates is then used as overlap. Consequently, the similarity degree of the corresponding rules' antecedent parts in the *j*th dimension is [11]:

$$Sim^{bef,after}(j) = \max(\mu_i(inter_x(1)), \mu_i(inter_x(2)))\tag{49}$$

with $\mu_i(inter_x(1))$ being the membership degree of the $j$th fuzzy set in rule $i$ at the intersection point $inter_x(1)$. The amalgamation over all rule antecedent parts leads to the final similarity degree between the rule before and after its update:

$$Sim(z_i(bef), z_i(after)) = T_{j=1}^{p}\, Sim^{bef,after}(j) \tag{50}$$

where $T$ denotes a t-norm operator and $p$ is the number of inputs, as a robust non-overlap along one single dimension is sufficient for the clouds to not overlap at all [56].
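Equations (47)–(50) can be illustrated with a minimal sketch. All function and variable names here are illustrative, the t-norm is taken as the minimum, and the two standard deviations in each dimension are assumed to differ (otherwise the intersection formula degenerates):

```python
import math

def gaussian(x, c, s):
    """Gaussian membership degree with center c and standard deviation s."""
    return math.exp(-((x - c) ** 2) / (2 * s ** 2))

def intersection_points(c1, s1, c2, s2):
    """x coordinates where the two Gaussians intersect, Equation (48).
    Assumes s1 != s2 (two real intersection points always exist then)."""
    p = (c1 * s2 ** 2 - c2 * s1 ** 2) / (s2 ** 2 - s1 ** 2)
    q = (c1 ** 2 * s2 ** 2 - c2 ** 2 * s1 ** 2) / (s2 ** 2 - s1 ** 2)
    d = math.sqrt(p ** 2 - q)
    return p - d, p + d

def sim_dimension(c1, s1, c2, s2):
    """Equation (49): the larger membership degree at the two intersection
    points (both Gaussians take the same value there)."""
    x1, x2 = intersection_points(c1, s1, c2, s2)
    return max(gaussian(x1, c1, s1), gaussian(x2, c1, s1))

def degree_of_change(centers_bef, sigmas_bef, centers_aft, sigmas_aft):
    """Equations (47) and (50): one minus the t-norm (minimum) of the
    per-dimension similarities between the neuron before and after update."""
    sims = [sim_dimension(cb, sb, ca, sa)
            for cb, sb, ca, sa in zip(centers_bef, sigmas_bef,
                                      centers_aft, sigmas_aft)]
    return 1.0 - min(sims)
```

A small update of a neuron (e.g., center moved from 0 to 0.5 with a slightly widened spread) yields a degree of change close to 0, flagging the structures as still highly similar.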

• Consistency, Coverage, and Completeness: Consistency in evolving fuzzy systems is attained when the fuzzy rule set does not exhibit a high noise level or an inconsistently learned output behavior. A set of fuzzy rules is therefore considered inconsistent when two or more rules overlap in their antecedents but not in their consequents. In this paper, the consistency of the fuzzy rules (comparing a rule before and after its evolution) is measured by evaluating the similarity of the rule antecedents ($S_{ante}$) and consequents ($S_{cons}$). This can be expressed by [37]:

$$\text{Rule } z_1 \text{ is inconsistent with Rule } z_2 \text{ if and only if } S_{ante}(z_1, z_2) \geqslant S_{cons}(z_1, z_2) \text{ with } S_{ante}(z_1, z_2) \geqslant thr \tag{51}$$

where

$$consistency = \left\{ \begin{array}{ll} 1 & \text{if Equation (51) is false} \\ 0 & \text{if Equation (51) is true} \end{array} \right.$$

Values of $S_{ante}$ and $S_{cons}$ close to 1 indicate a high similarity, while values close to 0 indicate a low similarity [37]. $thr$ is a threshold value, usually set to 0.8 or 0.9.
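The consistency test of Equation (51) reduces to a simple comparison; the sketch below uses illustrative names and assumes the antecedent and consequent similarities have already been computed:

```python
def consistency(s_ante: float, s_cons: float, thr: float = 0.8) -> int:
    """Return 1 (consistent) or 0 (inconsistent) per Equation (51):
    a rule pair is inconsistent when its antecedent similarity is at least
    as large as its consequent similarity and also exceeds the threshold."""
    inconsistent = s_ante >= s_cons and s_ante >= thr
    return 0 if inconsistent else 1

print(consistency(0.95, 0.30))  # 0: antecedents overlap, consequents do not
print(consistency(0.95, 0.97))  # 1: antecedents and consequents both similar
```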

The coverage criterion identifies whether there are holes in the feature space that would generate undefined input states. This issue is avoided by applying Gaussian membership functions, which have unlimited support; since eFNN-SODA uses this type of function throughout the model's training, the criterion is guaranteed [37].

Finally, the $\epsilon$-completeness criterion in the eFNN-SODA is defined by [37]:

$$\left(\forall \vec{x}\ \exists i \ \left(z_i = \underset{j=1,\ldots,rl}{T}(\mu_{ij}) > \epsilon\right)\right) \Rightarrow \left(\forall \vec{x}\ \exists i \ (\forall j \ \mu_{ij} > \epsilon)\right) \tag{52}$$

where $\mu_{ij}$ is the membership degree of a fuzzy set $A_j$ appearing in the $j$th antecedent part of the $i$th rule, $rl$ is the rule length, and $\epsilon = 0.135$ according to definitions made in other research, which is considered an evaluation standard for this criterion [37].
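The $\epsilon$-completeness test of Equation (52) can be sketched as follows, with the t-norm taken as the minimum and all names being assumptions for illustration: every sample must activate at least one rule above $\epsilon = 0.135$.

```python
EPS = 0.135  # evaluation standard for epsilon-completeness

def rule_activation(memberships):
    """z_i: t-norm (minimum) over a rule's antecedent membership degrees."""
    return min(memberships)

def epsilon_complete(samples_memberships):
    """samples_memberships[k][i] holds the membership degrees of rule i's
    antecedents for sample k. The rule base is epsilon-complete when every
    sample activates at least one rule above EPS (Equation (52))."""
    return all(any(rule_activation(mu) > EPS for mu in sample)
               for sample in samples_memberships)

# One sample, two rules: rule 0 fires at min(0.9, 0.8) = 0.8 > EPS.
print(epsilon_complete([[[0.9, 0.8], [0.1, 0.2]]]))  # True
```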


#### **4. Auction Fraud Testing**

The following sections present the details and procedures of the tests that were performed, as well as the models and the dataset. All tests were performed on a computer with the following configuration: Intel(R) Core(TM) i7-6700 CPU at 3.40 GHz with 16 GB of RAM. The models in the tests of this paper were implemented and executed in Matlab, and for data analysis, the Orange Data Mining tool was used (developed in Python (https://orangedatamining.com/ (accessed on 17 August 2022))).

#### *4.1. Data Set Features*

As mentioned previously, this study aims to analyze Shill Bidding fraud. The analyzed dataset was collected by [4] and published in the UCI Machine Learning Repository (archive.ics.uci.edu/ml/datasets/Shill+Bidding+Dataset (accessed on 17 August 2022)). The data correspond to fraudulent bids on one of the largest auction sites on the Internet, eBay.

The collected database [4] features completed iPhone 7 auctions over three months (March to June 2017). The original dataset contains 12 input features. However, for the studies described in this paper, the dimensions related to personal IDs (Record ID, Auction ID, and Bidder ID) were removed since they are ID values and, therefore, irrelevant to the experiments. The remaining nine input dimensions are Bidder Tendency, Bidding Ratio, Successive Outbidding, Last Bidding, Auction Bids, Auction Starting Price, Early Bidding, Winning Ratio, and Auction Duration, plus the Class label (1 for normal and −1 for fraud). Figure 5 presents statistical values and the data distribution by class involved in the problem.

**Figure 5.** Statistical values and the data distribution—Auction Fraud data set.

As shown in Figure 5, the dataset used in this experiment is imbalanced, making the classification task challenging.

Figure 6 presents a representation of the data using the FreeViz technique [57]. In it, points in the same class attract each other, while those from different classes repel each other, and the resulting forces are exerted on the anchors of the attributes, that is, on unit vectors of each dimensional axis. With this technique, it is possible to identify projections of unit vectors that are very short compared with the others. This indicates that their associated attribute is not very informative for a particular classification task.

**Figure 6.** FreeViz projection—Auction Fraud data set.

Based on Figure 6, the most representative dimensions for fraud classification in auctions are successive outbidding, bidding, and winning ratios. In Figure 7, a data visualization according to the Radviz technique [58] is presented to demonstrate a nonlinear multidimensional visualization that can display data defined by three or more variables in a two-dimensional projection. Visualized variables are presented as anchor points evenly spaced around the perimeter of a unit circle. Data instances close to a variable anchor set have higher values for those variables than for others.

**Figure 7.** Radviz projection—Auction Fraud data set.

#### *4.2. Models and Hyperparameters*

The models used in the experiment are evolving fuzzy systems considered state-of-the-art in classifying binary patterns. They have different architectures and training methods. The parameters of each model were defined through pre-tests using 10-fold cross-validation over a group of candidate parameters for each approach. The models used in the experiment are listed below:

EFDDC—Evolving fuzzy data density cluster. The evolving model uses data-density-based clustering based on empirical data analysis operators and nullneurons. The model's training is based on the Extreme Learning Machine and the Bolasso technique to select the best neurons. The model parameters are *ρ* = 0.01, bootstraps *bt* = 16, and consensus threshold *λ* = 0.7 (the best parameters were selected from: *ρ* = {0.01, 0.02, 0.03, 0.04}, *bt* = {4, 8, 16, 32}, and *λ* = {0.5, 0.6, 0.7}) [59].

EFNHN—Evolving fuzzy neural hydrocarbon network. The model combines an evolving fuzzification technique based on data density (Autonomous Data Partitioning), training based on the Extreme Learning Machine, and unineurons. The defuzzification process is based on an Artificial Hydrocarbon network. The model parameter is the learning rate = 0.1 (the best parameter was selected from: learning rate = {0.01, 0.05, 0.1, 0.2}) [60].

EFNNS—Evolving fuzzy neural network and Self-Organized direction aware. The evolving fuzzy neural network model uses the Self-Organized direction aware method for fuzzification, unineurons, the Extreme Learning Machine, and a pruning technique. The only parameter used in the model is *ϑ* = 3 (the best parameter was selected from: *ϑ* = {2, 3, 4, 5}) [61].

ALMMo-0\*—Autonomous zero-order multiple learning with pre-processing. The model is a neuro-fuzzy approach to autonomous zero-order multiple learning with pre-processing that improves the classifier's accuracy, as it creates stable models. The parameter is radius = $\sqrt{2 - 2\cos(30°)}$ (the best parameter was selected from: radius = {$\sqrt{2 - 2\cos(15°)}$, $\sqrt{2 - 2\cos(30°)}$, $\sqrt{2 - 2\cos(45°)}$, $\sqrt{2 - 2\cos(60°)}$}) [62].

ALMMo—Autonomous zero-order multiple learning. A neuro-fuzzy approach for autonomous zero-order multiple models without pre-processing. The parameter is radius = $\sqrt{2 - 2\cos(30°)}$ (the best parameter was selected from: radius = {$\sqrt{2 - 2\cos(15°)}$, $\sqrt{2 - 2\cos(30°)}$, $\sqrt{2 - 2\cos(45°)}$, $\sqrt{2 - 2\cos(60°)}$}) [63].

eGNN—Evolving Granular Neural Network. A model that uses concepts of hyperboxes and nullnorms. The parameters are Rho = 0.85, eta = 2, hr = 40, Beta = 0, chi = 0.1, zeta = 0.9, c = 0, counter = 1, and alpha = 0 (for this model, the reference values proposed by the model's author were used. More information can be seen at https://sites.google.com/view/dleite-evolving-ai/algorithms (accessed on 18 August 2022)) [64].
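The parameter pre-tests described above can be sketched as a plain grid search with 10-fold cross-validation. This is only an illustration, not the authors' actual procedure: `train_and_score` is a hypothetical placeholder that would fit the model on the training indices and return its accuracy on the test indices.

```python
from itertools import product
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle sample indices once and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def grid_search(n_samples, grid, train_and_score, k=10):
    """Try every parameter combination in `grid` (dict of name -> values),
    score each with k-fold cross-validation, and return the best one."""
    folds = k_fold_indices(n_samples, k)
    best_params, best_acc = None, -1.0
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        accs = []
        for fold in folds:
            test = set(fold)
            train = [i for i in range(n_samples) if i not in test]
            accs.append(train_and_score(params, train, sorted(test)))
        acc = sum(accs) / len(accs)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc
```

For example, sweeping *ρ* = {0.01, 0.02, 0.03, 0.04} for EFDDC would call `grid_search` with that grid and a `train_and_score` wrapper around the model's training routine.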

#### *4.3. Evaluation Criteria of Experiments*

Evolving fuzzy system approaches can evaluate stream data, where each sample is considered separately; the accuracy of each result is evaluated individually and incrementally added to the final result. In this context, accuracy trend lines are a suitable evaluation, as they allow the evolution of the model results to be observed as new samples are evaluated.

These trend lines were calculated using the following criterion [65]:

$$Accuracy(K+1) = \frac{Accuracy(K) \cdot K + I_{\hat{y}=y}}{K+1},\tag{53}$$

where the accuracy is computed as:

$$Accuracy = \frac{TP + TN}{TP + FN + TN + FP} \times 100.\tag{54}$$

where *TP* = true positive, *TN* = true negative, *FN* = false negative, and *FP* = false positive.
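The incremental update of Equation (53) can be sketched in a few lines; the names below are illustrative, and the indicator is 1 when the streamed sample was classified correctly:

```python
def update_accuracy(acc_k: float, k: int, correct: bool) -> float:
    """Equation (53): Accuracy(K+1) = (Accuracy(K)*K + indicator) / (K+1),
    where the indicator is 1 if the (K+1)th prediction matched the label."""
    return (acc_k * k + (1.0 if correct else 0.0)) / (k + 1)

# Streaming three predictions: correct, wrong, correct.
acc, k = 0.0, 0
for correct in (True, False, True):
    acc = update_accuracy(acc, k, correct)
    k += 1
print(round(acc, 4))  # 0.6667 – two of three samples classified correctly
```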
