
Memory-Based Sand Cat Swarm Optimization for Feature Selection in Medical Diagnosis

by Amjad Qtaish 1, Dheeb Albashish 2,*, Malik Braik 2, Mohammad T. Alshammari 1, Abdulrahman Alreshidi 1 and Eissa Jaber Alreshidi 1

1 Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Ha’il 81481, Saudi Arabia
2 Computer Science Department, Prince Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Al Salt 19117, Jordan
* Author to whom correspondence should be addressed.
Electronics 2023, 12(9), 2042; https://doi.org/10.3390/electronics12092042
Submission received: 24 February 2023 / Revised: 19 April 2023 / Accepted: 23 April 2023 / Published: 28 April 2023

Abstract:
The rapid expansion of medical data poses numerous challenges for Machine Learning (ML) tasks due to its potential to include excessive noisy, irrelevant, and redundant features. As a result, it is critical to pick the most pertinent features for the classification task, which is referred to as Feature Selection (FS). Among the FS approaches, wrapper methods are designed to select the most appropriate subset of features. In this study, two intelligent wrapper FS approaches are implemented using a new meta-heuristic algorithm called Sand Cat Swarm Optimizer (SCSO). First, the binary version of SCSO, known as BSCSO, is constructed by utilizing the S-shaped transfer function to effectively handle the binary nature of the FS domain. However, BSCSO suffers from a poor search strategy because it has no internal memory to maintain the best location; thus, it converges very quickly to local optima. Therefore, the second proposed FS method is devoted to formulating an enhanced BSCSO called Binary Memory-based SCSO (BMSCSO), which integrates a memory-based strategy into the position-updating process of SCSO to exploit and further preserve the best solutions. Twenty-one benchmark disease datasets were used to implement and evaluate the two improved FS methods, BSCSO and BMSCSO. As per the results, BMSCSO performed better than BSCSO in terms of fitness values, accuracy, and number of selected features. Based on the obtained results, BMSCSO as an FS method can efficiently explore the feature domain for the optimal feature set.

1. Introduction

Feature Selection (FS) is an effective data mining method to reduce the dimensionality of a feature space. Instead of generating all possible feature subsets for large datasets, FS is applied to select the most informative features in the dataset and exclude redundant and irrelevant features. Redundant features are those that are closely correlated with each other, so eliminating any of them does not noticeably harm the classification results [1]. On the other hand, relevant features are those that are closely related to the class labels. Therefore, they have a great impact on the classification process and should be kept to train the model [2]. There are two commonly used FS methods: filter-based and wrapper-based. Filter methods use the characteristics of the dataset to score features without performing a learning process [3]. In contrast, wrapper methods are driven by a learning algorithm that determines the quality of each subset of features depending on the learning process [4]. This may yield more accurate results, but it takes more time compared to quick filter methods. Wrapper-based FS methods consist of search and evaluation processes. In the search process, a search algorithm explores the feature space for a feature subset that achieves two goals simultaneously: the minimum number of selected features and the maximum classification performance. In designing an FS method, finding the optimal or near-optimal feature set is crucial. It is broadly known that FS can be considered an NP-hard problem, so exact search algorithms are less practical than heuristic FS algorithms [5]. In the exact search methodology, all possible combinations of features are generated in order to choose the best among them, which makes exact search computationally expensive. For instance, consider a feature selection task over a dataset with N features: an exhaustive wrapper FS method would produce and evaluate $2^N$ feature subsets using a learning model (i.e., the classifier). Since the primary goal of FS is to find a minimum-dimensional subset of features while augmenting classification accuracy, FS can be considered an optimization task. To make the best decision, it is usually necessary to make a trade-off between these two objectives. On this account, instead of viewing the feature selection problem as a single-objective problem [6], it should be viewed as a multi-objective problem. Thus, Sand Cat Swarm Optimization (SCSO) [7], as a meta-heuristic optimizer, could be very effective in handling FS tasks.
Many multi-objective meta-heuristic methods, such as the Binary Snake Optimizer (BSO) [8], Binary Rat Swarm Optimizer (BRSO) [9], and Binary Ali Baba and the Forty Thieves (BAFT) algorithm [10], have recently been used for FS tasks in a wrapper-based technique. However, no single method can address the curse of dimensionality in the original dataset [6]. The No-Free-Lunch (NFL) theorem [11] states that no single optimization technique can solve all optimization problems with a higher level of performance than all other optimization techniques. For many years, this has encouraged many researchers to present new algorithms with improved performance and better exploration and exploitation actions.
The demanding need for FS in various domains, as well as the stunning capabilities of meta-heuristics, fueled the desire to utilize a newly designed meta-heuristic algorithm to tackle a wide range of FS tasks. In this work, a newly proposed method, referred to as Sand Cat Swarm Optimization (SCSO) [7], was adopted to handle FS problems. At the outset, this algorithm was integrated with an appropriate transfer function to develop a binary algorithm, known as Binary SCSO (BSCSO), from the parent continuous version of SCSO. This is to handle FS problems, as the nature of these problems requires binary algorithms.
The SCSO algorithm is a recently developed swarm-based algorithm [7] that mimics the behavior of sand cats in nature. The main abilities of sand cats are searching for and attacking prey. Sand cats can detect even low-frequency noises, so they can capture prey on the ground or underground. These search and attack stages inspired the two major components of the SCSO search process. The main feature of SCSO that prompted us to use it for tackling FS problems is its high capability to control the balance between exploration and exploitation with few parameters and operators. Although the standard SCSO has been improved and applied to address a variety of real-world optimization problems, including inverse robot arm kinematics [12], intrusion detection [13], and transformer defect diagnostics [14], it still suffers from several issues. First, it is prone to premature convergence and stagnation of solutions in local minima. Second, the standard SCSO was designed for continuous optimization problems, while FS tasks are binary optimization problems. Third, the SCSO algorithm does not keep a record of its best positions from earlier iterations, which limits its potential for exploitation and results in early convergence to local optima. With randomization and static swarm behavior, SCSO’s global search ability is strong, but its local search ability is modest, which may lead solutions to fall into local optima [15]. All of the above gaps motivate this work to produce two new binary versions of the standard SCSO, namely BSCSO and Binary Memory-based SCSO (BMSCSO), to handle FS tasks.
The proposed BSCSO is the binary version of the basic SCSO, employing the S-shaped transfer function to convert the continuous search space of this algorithm into a binary one. In more detail, a solution obtained using BSCSO is represented by binary values (i.e., 1 or 0) depending on whether the features are selected or not: selected features are represented by one, while non-selected features are denoted by zero. The proposed BMSCSO, in turn, integrates the memory-based strategy [16,17] into the position-updating process of SCSO to further exploit the best solutions. More particularly, BMSCSO adds two main phases at the core of BSCSO: in the first phase, a new random operator is added to the SCSO for additional exploration; in the second phase, a memory component is incorporated into the SCSO algorithm to keep the coordinates in the solution space connected to the best solution (i.e., fitness score) that each sand cat has realized so far. Thus, this inclusion ensures early-stage exploration and later-stage exploitation capability, helping the algorithm reach the global optimum with an enhanced performance level.
The performance of the proposed FS methods was assessed on many low- and high-dimensional medical datasets, where the efficiency of these algorithms was compared with other algorithms. The relevance of the experimental outcomes was demonstrated by statistical analysis using Friedman’s and Holm’s test methods [18]. These methods were applied to evaluate the significance of the obtained results compared to other feature selection methods derived from other meta-heuristic algorithms.
As a result of this work, the following contributions have been made:
  • Investigation of the impact of the binary SCSO, as a wrapper-based FS method, to tackle FS tasks in the medical domain.
  • Investigation of the influence of the memory-based technique on BSCSO, yielding a new wrapper-based FS method called BMSCSO. This new BMSCSO was designed to improve the exploitation potential of BSCSO in addressing FS problems.
  • Examination of the performance of BMSCSO on medical FS problems compared to other popular wrapper FS methods.
The rest of the paper is arranged as follows: In Section 2, a literature review is presented. The SCSO algorithm is described in Section 3. In Section 4, the proposed memory-based SCSO is presented, while in Section 5, the proposed binary methods for feature selection are presented. The performance of BMSCSO is evaluated and compared with other meta-heuristic algorithms in Section 6, and finally, conclusions and future work are given in Section 7.

2. Literature Review

The use of meta-heuristic algorithms to solve NP-complete and multimodal problems in real-world optimization has shown promising results. In recent years, several meta-heuristic types have been adapted by scholars for solving optimization problems, which can be categorized into nature-inspired and non-nature-inspired methods [19]. Several nature-inspired methods have been introduced in the literature, including the Whale Optimization Algorithm (WOA) [20] and its improvements [21], the wrapper-based binary Sine Cosine method [22], discrete moth-flame optimization algorithms [23], which were developed using the moth’s navigation method, and the multi-trial vector-based differential evolution (MTDE) [24]. In recent studies, a sand cat optimization algorithm [7] has been developed based on desert cats’ lifestyle, and it is designed to solve continuous optimization problems. Table 1 shows most of the modifications made to the standard SCSO. Because feature selection is an NP-complete problem [10], swarm intelligence methods are frequently used to tackle it.
The main goal of FS is to investigate the characteristics of the data features and then pick the smallest number of essential and significant features capable of describing the original data [9,10,25]. Generally, FS can be accomplished using wrapper or filter approaches. The FS method described in this work is a wrapper-based approach. As a result, filter-based FS and hybridizations of filter- and wrapper-based FS approaches are outside the scope of this study. Wrapper FS methods utilize a search algorithm to generate a subset of features that can accurately represent the entire set. After that, the selected subset is evaluated using a specific classifier model (e.g., k-Nearest Neighbor (k-NN) or Support Vector Machine (SVM)). One of the best-known ways to create feature subsets is meta-heuristic algorithms, which are utilized to minimize the time required to generate subsets of features.
Wrapper feature selection based on meta-heuristic methods has been used in many real-life applications, including intrusion detection [26], image processing [27,28], sentiment analysis [29], and many other fields [30,31]. Nevertheless, one of the remaining challenges in using the FS process is its application in medical data diagnosis, which is the focus of this study. Thus, most of the work relevant to the use of FS in the diagnosis of medical data is reviewed in this section.
Table 1. A summary of the main adaptations of SCSO in the literature.

Citation | Evolution of SCSO | Domain
Iraji et al. [32] | Hybrid chaotic SCSO and pattern search approach | Safety factors of the exposed earth slope
Wu et al. [33] | Hybrid of SCSO with a wandering strategy | Engineering optimization problems
Li and Wang [34] | Hybrid of stochastic variation and elite collaboration with SCSO | Engineering optimization problems
Kiani et al. [35] | Updates SCSO positions by the political system | Engineering optimization problems
Arasteh et al. [36] | Discretized sand cat swarm optimization | Software clustering
In a promising study, Alweshah et al. [37] designed a wrapper FS method based on the Greedy Crossover (GC) operator using a binary Coronavirus Herd Immunity Optimizer (BCHIO) to improve the exploration of BCHIO. The proposed algorithm was evaluated on 24 benchmark medical datasets. Experimental results show that BCHIO surpassed state-of-the-art wrapper FS methods in terms of average fitness and accuracy. In [38], the authors presented an improved binary version of the flamingo search algorithm, referred to as IBFSA, by incorporating the Levy flight technique to increase diversity and provide a high level of randomization. The proposed IBFSA was employed to select the most vital features in COVID-19 datasets and was evaluated on two benchmark sets of COVID-19 text data. The results show that IBFSA can select essential features to improve diagnostic performance. However, the algorithm’s improvement focused on the exploration phase while ignoring the exploitation phase, resulting in an imbalance between these two phases [39]. Nadimi-Shahraki et al. [40] developed a feature selection method for medical data classification and diagnosis based on an improved binary whale optimization method (BE-WOA). The main goal of the E-WOA was to enhance the WOA by utilizing three search techniques: encircling prey, migrating, and preferential selection. The suggested E-WOA aimed to overcome a poor search strategy and low population diversity. The BE-WOA outperformed other methods regarding accuracy and average fitness values when tested on COVID-19 numerical data. In [41], the authors utilized the Sine Cosine Algorithm (SCA) to solve the FS task for medical datasets, using various transfer functions (S-shaped and V-shaped) to represent the solution in binary format. The proposed binary SCA was tested on five medical datasets, and the results showed the effectiveness of the BSCA method; however, the benchmark set of datasets is relatively small compared to most related studies. In another study [42], the authors proposed two binary versions of the Aquila Optimizer (AO) algorithm for handling the FS task in the medical domain. The first version of the AO was based on the S-shaped transfer function, while the second was based on the V-shaped transfer function. The proposed methods were tested on seven medical datasets, and the results showed that the S-shaped version outperformed the V-shaped version on most of the datasets.
In a recent study, the authors of [43] introduced a binary version of the Starling Murmuration Optimization (BSMO) algorithm for feature selection tasks. The BSMO was developed to select the best feature subset from large medical datasets and was converted to a binary version using S- and V-shaped transfer functions. The reported experimental results show that BSMO can dominate all competing methods on medical datasets; nevertheless, the authors should have paid more attention to the balance between exploration and exploitation search in BSMO. The authors in [44] introduced several binary versions of the quantum-based avian navigation optimizer algorithm (BQANA) [45], based on variant transfer functions, to assist in solving feature selection tasks. The proposed BQANA was applied to ten medical benchmark datasets to select the essential features for diagnosis tasks. The results exhibit that BQANA has merit for medical feature selection tasks; however, the authors ignored the balance between the exploration and exploitation phases in the binary version of QANA. The study presented in [46] adapted Binary Particle Swarm Optimization (BPSO) for feature selection tasks using the inertia weight operator, which was used to update the velocity of BPSO while also balancing its exploitation and exploration features. The proposed BPSO was tested on well-known heart disease datasets. The findings of this study showed that the proposed technique generates superior outcomes in terms of classification accuracy as well as convergence rate with a smaller feature set.
The primary contribution of this study lies in the development of an effective wrapper FS method for various medical datasets used in diagnosis. According to the literature, wrapper FS methods have been successfully applied to select the most relevant features from a wide range of large datasets. However, due to the large number and variety of medical datasets, it is difficult to determine an appropriate wrapper FS method for a particular medical dataset. In more detail, the key contributions of this study are two wrapper FS methods based on the SCSO algorithm. First, a binary version of the basic SCSO was proposed to handle FS problems in the binary domain. Then, an improved version of SCSO was developed using a new random operator and a memory-based technique. This is to increase the exploitation capability and balance the exploitation and exploration capabilities of the basic SCSO. Finally, a binary version of the memory-based SCSO was obtained to tackle FS problems.

3. Sand Cat Swarm Optimization Algorithm

The sand cat is a mammal that lives in harsh environments such as deserts, and it exhibits distinctive hunting and living behavior. The hunting mechanism of sand cats is very interesting: these animals use their great sense of hearing to detect low-frequency noises. In this way, they can detect prey moving underground. The sand cat also has a peculiar ability to dig quickly if the prey is underground. The search and attack stages of this hunting behavior were adopted to design the two stages of the SCSO algorithm [7].

3.1. Initialization of the Population

The SCSO solutions are created by assigning initial random values to the sand cats. These initial values represent positions in the search space. Each solution is depicted by a one-dimensional vector whose dimension (d) corresponds to the number of features in the original dataset. In SCSO, each feature in the solution (sand cat) has a value $X_i^d \in [L, U]$, where $U$ and $L$ are the upper and lower bounds of the feature, respectively. Based on the set of solutions, the population in the SCSO is represented by a matrix $X$ of shape $(N \times d)$, where $N$ is the number of sand cats and $d$ is the number of features of the problem at hand. The SCSO is an iterative method like other swarm-based methods. In each iteration, the fitness value is computed for each solution. The initial position of each sand cat in the search space is determined at random, as given in Equation (1).
$X_i(k) = l_i + r \cdot (u_i - l_i)$
where $X_i(k)$ denotes the position of the $i$th sand cat in the search space at iteration $k$, $r$ denotes a random value in the range $[0, 1]$, and $l_i$ and $u_i$ represent the lower and upper bounds of the decision variables at the $i$th dimension in the search space.
In the basic SCSO, the most appropriate solution is the one that is closest to the prey. In addition, the best “sand cats” (i.e., solutions) change positions in the next iteration based on the suitability of the solutions. However, a solution with good fitness found in one iteration is not saved to memory for the next iterations.
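As a minimal, hedged sketch of this initialization step (Equation (1)), the following Python function draws each sand cat uniformly inside the bounds; the function and variable names are ours, not from the original implementation.

```python
import numpy as np

def init_population(num_cats, dim, lb, ub, seed=None):
    """Randomly initialize N sand cats inside [lb, ub] per Equation (1)."""
    rng = np.random.default_rng(seed)
    r = rng.random((num_cats, dim))   # r ~ U(0, 1)
    return lb + r * (ub - lb)         # X_i = l_i + r * (u_i - l_i)

# Example: 30 sand cats in a 10-dimensional space bounded by [0, 1]
X = init_population(30, 10, lb=0.0, ub=1.0, seed=42)
```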

3.2. Exploration of the Search Space

The sand cat’s capability to hear low-frequency sounds is one of its most distinguishing characteristics. The sensitivity range of the sand cat lies between 2 kHz and 0 kHz: it begins at 2 kHz and linearly decreases until it approaches 0 kHz. In SCSO, the sensitivity level is referred to as $S_p$, and its value is given by Equation (2).
$S_p = s_M - \dfrac{2\, s_M \times k}{K + K}$
The value of $S_p$ decreases as the algorithm iterates. Based on SCSO [7], the value of the constant $s_M$ was set to 2 according to the hearing characteristics of sand cats. It should be noted that $k$ and $K$ are the current and maximum iterations, respectively.
In addition, $R$ is a switching or transition parameter that is responsible for controlling the switching between the exploration and exploitation phases.
$R = 2 \times S_p \times random(0,1) - S_p$
Equation (3) shows the controller $R$, which utilizes $S_p$ and a random variable. During the search process, the sand cats use this random variable to change their positions, and thus new areas are discovered.
SCSO uses the variable $r$ as a sensitivity range for each solution (i.e., sand cat) to avoid getting trapped in local minima. As shown in Equation (4), $r$ utilizes $S_p$ to generate a new value in the search space; thus, $r$ is the sensitivity parameter in SCSO.
$r = S_p \times random(0,1)$
In SCSO, the position update process for each sand cat is affected by the sensitivity range ($r$), the current location ($X_c$), and the best-candidate sand cat ($X_{bc}$). These three factors help the sand cat locate the next best possible location close to prey, as illustrated in Equation (5).
$X(k+1) = r \times (X_{bc}(k) - random(0,1) \times X_c(k))$
Equation (5) shows that different local optima in the search space can be found; the computed location lies between $X_c$ and the location of the prey.

3.3. Exploitation of the Search Space

In the sand cat’s life, hunting is achieved by exploiting a local area. To mathematically model the hunting phase in the SCSO algorithm, the distance between $X_c$ and $X_b$ is computed.
$X_{rnd} = |rand \times X_b(k) - X_c(k)|$
Equation (6) illustrates this computation. Moreover, the sensitivity range of the sand cat is assumed to be a circle. Based on the behavior of the circle shape, the direction of movement is obtained by an arbitrary angle ($\theta$) that can span between 0° and 360°, so its cosine value lies between −1 and 1. SCSO selects a random $\theta$ for each solution in the population using the well-known roulette wheel selection method. Thus, $\theta$ guides the sand cat to hunt the prey. $X_{rnd}$ in Equation (6) represents a random location that allows the sand cats to approach prey.

3.4. Exploration and Exploitation of the Search Space

During the SCSO search process, the parameters $S_p$ and $R$ are in charge of ensuring a soft dynamic transition between the exploration and exploitation phases. In SCSO, the value of the parameter $R$ depends on the value of $S_p$; thus, the value of $R$ decreases as the value of $S_p$ decreases. In addition, when the distribution of the value of $S_p$ is balanced, the value of $R$ will also be balanced, giving both phases sufficient chances of operation depending on the problem. The next position of the sand cat is determined based on the value of $R$, which ranges between −1 and 1. When $|R| \le 1$, the SCSO method focuses on exploitation; otherwise, the algorithm is forced to explore in search of prey.
$X(k+1) = X_b(k) - r \times X_{rnd} \times \cos(\theta)$
Equation (7) shows these two phases. The exploration and exploitation phases are based on the use of different radius values. In exploration, traps in local optima can be avoided; in contrast, exploitation assists in hunting prey. The balanced approach of the SCSO algorithm yields an accurate convergence rate, making it suitable for multi-objective tasks. The mathematical formula for the position update of each sand cat in the exploration and exploitation stages of the basic SCSO can be summed up by Equation (8).
$X(k+1) = \begin{cases} X_b(k) - r\,\cos(\theta) \times X_{rnd}(k) & |R| \le 1 \\ r \times (X_{bc}(k) - rand \times X_c(k)) & |R| > 1 \end{cases}$
The flowchart and pseudo code describing the steps of the standard sand cat swarm optimization algorithm are presented in Figure 1 and Algorithm 1, respectively.
Algorithm 1 A pseudo code showing the main steps of the basic sand cat swarm optimization algorithm.
1: Input parameters: K and k
2: Outcome: The global best solution
3: Randomly initialize the population (i.e., sand cats)
4: Define and initialize $S_p$, $r$, and $R$
5: while $k \le K$ do
6:   Assess each search agent (i.e., sand cat)
7:   for each search agent do
8:     Randomly obtain an angle $\theta$ with the roulette wheel selection method, in which $0 \le \theta \le 360$
9:     if ($|R| \le 1$) then
10:      Update the positions of the search agents using $X_b(k) - r\,\cos(\theta) \times X_{rnd}$
11:    else
12:      Update the positions of the search agents using $r \times (X_{bc}(k) - random(0,1) \times X_c(k))$
13:    end if
14:  end for
15:  $k = k + 1$
16: end while
17: Return the global best solution
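To make the update scheme concrete, the following Python sketch implements one iteration of Equations (2)–(8) under our reconstruction of those formulas. It is an illustrative reading rather than the authors' reference code: the global best `X_b` stands in for the best-candidate position $X_{bc}$ in the exploration branch, and the roulette-wheel angle selection is simplified to a uniform draw.

```python
import numpy as np

def scso_step(X, X_b, k, K, s_M=2.0, seed=None):
    """One iteration of basic SCSO (Eqs. (2)-(8)); X is (N, d), X_b the best cat."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    Sp = s_M - (2.0 * s_M * k) / (K + K)      # Eq. (2): decays from s_M to 0
    for i in range(N):
        R = 2.0 * Sp * rng.random() - Sp      # Eq. (3): exploration/exploitation switch
        r = Sp * rng.random()                 # Eq. (4): sensitivity range
        theta = rng.uniform(0.0, 2.0 * np.pi) # random angle (roulette wheel simplified)
        if abs(R) <= 1.0:                     # exploitation: attack the prey, Eqs. (6)-(7)
            X_rnd = np.abs(rng.random(d) * X_b - X[i])
            X[i] = X_b - r * X_rnd * np.cos(theta)
        else:                                 # exploration: search for prey, Eq. (5)
            X[i] = r * (X_b - rng.random(d) * X[i])
    return X
```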

4. Proposed Memory-Based Improved SCSO Algorithm

For any optimization method, exploration and exploitation of the search space must be properly balanced to arrive at the global optimum solution. Exploration, also known as diversification, entails searching globally in the search space, whereas exploitation, also known as intensification, entails searching locally around the best solution currently available. Excessive exploration or exploitation negatively impacts the algorithm’s performance by lengthening its convergence time and raising the likelihood of dropping into a local optimum. The traditional SCSO uses a random initial group of sand cats, which move around to scout the search area in a random manner. This random initialization broadens the variety of possible solutions and improves the algorithm’s capacity for exploration. Additionally, SCSO only needs to fine-tune a few parameters, and adaptive adjustment of these swarming elements helps strike a proper balance between local and global search capacities. SCSO, however, lacks an internal memory that could store information about previously obtained candidate solutions. During its iterative process, SCSO never keeps track of the prospective set of solutions that may converge to the global optimum and discards all candidate solutions other than the current global best. This reduces SCSO’s capacity to properly exploit the search space, so it tends to converge slowly and may stall at local optima. To overcome this issue, a memory-based mechanism, as detailed below, is proposed to boost the exploration and exploitation behaviors of SCSO. The proposed Binary Memory-based Sand Cat Swarm Optimization (BMSCSO), as an FS method, improves the standard BSCSO method for solving FS problems through three main features that boost its performance, which can be described as follows:
  • First, amendment of the initialization process.
  • Second, iterative adaptation of the dominant parameters of BMSCSO during the positioning update process.
  • Third, an internal memory to store potential solutions that may eventually converge to the global optimum.
These enhancements to the basic BSCSO are explained in detail below:

4.1. Improved Initialization Process

In the proposed BMSCSO, the initial population is constructed using a chaotic variable ($y(k)$) combined with uniform random initialization [47,48]. The initialization process employs the number of sand cats in the population and the dimension of the problem of interest, as given in Equation (9).
$y_i(k+1) = 1 - 2\,(y_i(k))^2$
where $y_i(k) \in [0, 1]$ is a chaotic variable, $k \in [1, Dim]$, where $Dim$ is the dimension of the problem, and $i \in [1, Num]$, where $Num$ is the number of sand cats in the population.
Thus, the initial vector of the ith sand cat can be defined as shown in Equation (10) after the improved initialization process is achieved.
$X_i(k) = l_i + (u_i - l_i) \times y_i(k+1)$
where $X_i(k)$ is the new position of the $i$th sand cat at the $k$th dimension, and $l_i$ and $u_i$ are the lower and upper bounds of the search space at dimension $i$, respectively.
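A hedged sketch of this chaotic initialization (Equations (9) and (10)) is given below. Since the reconstructed map of Equation (9) can produce values in [−1, 1], taking the absolute value before scaling is our assumption to keep positions within the stated [0, 1] range of the chaotic variable.

```python
import numpy as np

def chaotic_init(num, dim, lb, ub, seed=None):
    """Initialize sand cats with the chaotic map of Eqs. (9)-(10)."""
    rng = np.random.default_rng(seed)
    y = rng.random((num, dim))      # seed chaotic variables uniformly in [0, 1]
    y = 1.0 - 2.0 * y**2            # Eq. (9): one chaotic update (maps into [-1, 1])
    y = np.abs(y)                   # our assumption: fold back into [0, 1]
    return lb + (ub - lb) * y       # Eq. (10): scale into [lb, ub]
```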

4.2. Adaptive Positioning Update Process

The technique for amending the mathematical model of BSCSO is described in full in this section. The proposed adaptive positioning update process is then described as a way to enhance the algorithm’s performance score and convergence process.
The proposed BMSCSO tends to provide a new solution $X_i(k+1)$ at iteration $k+1$ that outperforms the present solution $X_i(k)$ at iteration $k$ through the position update process described in Equation (5). In BMSCSO, a slightly different mathematical formula than that employed in BSCSO is utilized to update the locations of the sand cats. Equation (11) provides the position-updating method of the sand cats in BMSCSO [49].
$X_i(k+1) = rand \times W_1 \times X_{bc}(k) - rand \times W_2 \times X_c(k)$
where $W_1$ and $W_2$ are adaptive parameters over the iteration loops of the proposed BMSCSO, which are computed according to Equations (12) and (13), respectively:
$W_1 = c_0\, e^{-c_1 (k/K)^2}$

$W_2 = d_0 - 2\, e^{-d_1 (k/K)^2}$
where $k$ and $K$ are the current and maximum iteration values, respectively, $c_0$ is the starting estimate of the duration of the sand cats’ search, and $c_1$ is the final estimate of the length of the search that could be attained. The parameters $d_0$ and $d_1$ are the coefficients of the growth exponential function that represents the capacity of sand cats to seek prey at the conclusion of the iterative process of BMSCSO.
During the iteration loops of BMSCSO, the exponential functions specified in Equations (12) and (13) are updated iteratively. These adaptive parameters are updated exponentially with regard to the time $k$ and lifetime $K$ of the sand cats.
To estimate the parameters $c_0$, $c_1$, $d_0$, and $d_1$ for the exponential functions of $W_1$ and $W_2$, there are numerous conventional and intelligent methods reported in the literature that employ meta-heuristics to adjust the parameters of other meta-heuristic algorithms when solving optimization problems [50]. These techniques can incur a considerable computational burden. In this study, the parameters $c_0$, $c_1$, $d_0$, and $d_1$ of the models proposed for the adaptive parameters $W_1$ and $W_2$ of BMSCSO were determined using experimental design through the investigation of the proposed BMSCSO on a significant subset of benchmark feature selection problems. The coefficients $c_0$, $c_1$, $d_0$, and $d_1$ are all set to 2.0 for all feature selection problems covered in this study. Yet, in the majority of test cases, only good values (and sometimes not even the finest ones) are acquired by experimentation. As a result, these control settings can be changed as needed for various optimization problems. For each iteration loop of the BMSCSO, the values of the parameters $W_1$ and $W_2$ are updated exponentially.
The BMSCSO is presented with time-varying parameters, where a higher $c_0$ and a lower $d_0$ were originally set and subsequently reversed over the search process, in light of Equations (12) and (13). Hence, it is anticipated that BMSCSO will reveal superior overall performance compared to the standard SCSO. This may be a result of the time variation of $c_0$, $c_1$, $d_0$, and $d_1$, which can balance the capabilities of both local and global search. This indicates that the performance of SCSO may be improved by adapting these settings. As the iterative BMSCSO process continues, the sand cats group up and converge to a locally or globally optimal location; as a result, the population distribution differs from that of the early stage.
The adaptive parameters $W_1$ and $W_2$ are identified as functions of iterations, as displayed in Figure 2a and Figure 2b, respectively.
The curve of $W_1$ in Figure 2a demonstrates that this parameter decreases exponentially. This impacts the behavior of sand cats in the BMSCSO and may steer them towards greater exploration and exploitation, where the sand cats complete their quest by locating prey at the conclusion of their excursion; moreover, it may help avoid local optimum solutions. Similarly, Figure 2b shows that the parameter $W_2$ of BMSCSO grows exponentially. This also affects the sand cats’ behavior in the BMSCSO and may lead to more exploration and exploitation, as the sand cats finish their hunt by finding prey at the end of their expedition. In this way, the exploration and exploitation stages of the basic SCSO are improved, allowing the sand cats to finally discover prey without becoming separated from the other group members while foraging.
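Under our reading of Equations (12) and (13), with the minus signs lost in extraction restored, the two adaptive weights can be computed as follows; the default coefficients of 2.0 follow the experimental setting stated above.

```python
import numpy as np

def adaptive_weights(k, K, c0=2.0, c1=2.0, d0=2.0, d1=2.0):
    """W1 decays and W2 grows exponentially over the iterations (Eqs. (12)-(13))."""
    W1 = c0 * np.exp(-c1 * (k / K) ** 2)        # Eq. (12): high early, low late
    W2 = d0 - 2.0 * np.exp(-d1 * (k / K) ** 2)  # Eq. (13): low early, high late
    return W1, W2

# Example: the weights at the start, middle, and end of a 100-iteration run
for k in (0, 50, 100):
    print(k, adaptive_weights(k, 100))
```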

4.3. Further Exploration Behavior

The exploration phase of the BSCSO method is influenced by the random behavior of sand cats. As a result, finding a better location and moving toward the prey can become problematic, lowering the quality of the solution [34]. To alleviate this, a new random operator based on the hyperbolic tangent function is applied to the updated positions, as given in Equation (14).
$Sol(X_i(k+1), \phi) = |\tanh(\phi\, (X_i(k+1)))|$
where $X_i(k+1)$ is the position of the $i$th sand cat at iteration $(k+1)$, and $\phi$ increases with the number of iterations, as given by Equation (15).
$\phi = (\phi_{max} - \phi_{min}) \left( \dfrac{k}{K} \right)$
where $\phi_{max}$ and $\phi_{min}$ are the maximum and minimum values of $\phi$, respectively.
A binary matrix of size $Num \times Dim$ is used to initialize the population of sand cats, where $Dim$ is the number of features. When a dimension of a sand cat is set to 0, the corresponding feature is excluded; when it is set to 1, the feature is selected.
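A minimal sketch of this operator is shown below, assuming Equation (15) ramps $\phi$ linearly with the iteration counter; the bound values $\phi_{min}$ and $\phi_{max}$ are placeholders, as the paper does not state them here.

```python
import numpy as np

def random_operator(X_next, k, K, phi_min=0.0, phi_max=1.0):
    """Exploration operator of Eq. (14) with the iteration-dependent phi of Eq. (15)."""
    phi = (phi_max - phi_min) * (k / K)   # Eq. (15): grows with the iterations
    return np.abs(np.tanh(phi * X_next))  # Eq. (14): output bounded in [0, 1)
```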

4.4. Implementation of the Interior Memory

With the inclusion of interior memory, each sand cat is given the ability to store its coordinates in the problem hyperspace associated with its fitness value. This is comparable to locating the best local solutions in the proximate neighborhood. During the iterative process of MSCSO, the fitness value of the sand cats in the present population is compared to the best fitness value at each iteration; better solutions are stored, and the local best solutions obtained by SCSO are framed. The sand cats of MSCSO are also designed to keep track of the best fitness values obtained thus far by any sand cat in the vicinity, which is equivalent to finding the global best solution. The concepts of global and local best solutions are highly effective for boosting the exploitation capacity of SCSO. This interior memory offers MSCSO the potential to escape local optima and delivers better performance than the traditional SCSO. In order to realize globally optimal solutions, MSCSO therefore combines the exploration features of SCSO in the early iterations with the exploitation capabilities of local and global solutions, integrated with the memory concept, in the final iterations. Thus, in this work, the memory-based approach is employed to raise the quality of the potential solutions in BSCSO. Additionally, it is anticipated that these updating processes will significantly enhance BSCSO’s exploratory and exploitative behavior, which in turn will assist BSCSO in finding better solutions. The memory matrix is used to store the current position of each solution in order to employ it in the exploitation stage. In the initialization procedure, the memory is represented by the current positions of the sand cats. As per this, the memory can be represented for all sand cats as shown in Equation (16):
$m = \begin{bmatrix} m_1^1 & m_2^1 & \cdots & m_d^1 \\ m_1^2 & m_2^2 & \cdots & m_d^2 \\ \vdots & \vdots & \ddots & \vdots \\ m_1^n & m_2^n & \cdots & m_d^n \end{bmatrix}$
where $m_j^i$ denotes the memory of the sand cat in the $i$th solution at the $j$th dimension.
The mathematical model for the new positioning of sand cats in the proposed MSCSO was amended over that of the basic SCSO. Accordingly, the new positions of the sand cats in the proposed MSCSO can be located as presented in Equation (17).
$X(k+1) = \begin{cases} f_b - r_a \times |f_b - X(k)| \cdot S_B & |R| \le 1.0 \\ X(k) + r_g \times r_c\, (m(c(k)) - l_b(k)) & \text{else} \end{cases}$
where $X(k+1)$ and $X(k)$ represent the new and present positions of the sand cats at iterations $k+1$ and $k$, respectively; $f_b$ represents the global best position ever obtained by any sand cat up until the $k$th iteration; $m(c(k))$ represents the best solutions found and reserved so far at iteration $k$ by the best sand cats, where $c$ denotes the memory component of the sand cats utilized at iteration $k$; $l_b(k)$ stands for the best position reached so far by the sand cats at iteration $k$; $S_B = \mathrm{sign}(r_b - 0.5)$, which yields either −1 or 1 to modify the orientation of the search process; $r_a$, $r_b$, and $r_c$ are random values formed with a uniform distribution in the range $[0, 1]$; and $r_g$ and $R$ are two adaptive parameters specified as functions of iterations, as demonstrated in Equations (18) and (19), respectively.
$r_g = \tau_0 - \tau_1 \times k / K$
where $k$ and $K$ identify the present iteration value and the maximum number of iterations, respectively; $\tau_0$ ($\tau_0 = 2$) stands for the incipient estimate of the function $r_g$ at the initial iteration, and $\tau_1$ ($\tau_1 = 2$) is a fixed value employed to control the exploitation and exploration abilities. These parameters’ values were determined through thorough investigation and application to a sizable portion of FS problems, where they reported the best outcomes and served as the basis for all succeeding findings.
$R = 2\, r_g \times r_k - r_g$
where $r_k$ is a random value at the $k$th iteration drawn from a uniform distribution in the interval $[0, 1]$.
Equation (19) demonstrates that $R$ is repeatedly updated over the course of the iterations of the MSCSO algorithm.
The parameter $c$ in $m(c(k))$ can be identified as presented in Equation (20).
$c = (n - 1) \cdot rand(n, 1)$
where $n$ is the total number of sand cats, and $rand(n, 1)$ is a vector of random values produced with a uniform distribution in the interval $[0, 1]$.
Sand cats update their memory at each iteration loop throughout the processing of MSCSO. In this way, they can identify whether the new solution they find is of higher quality than their prior position. In this regard, Equation (21) is utilized to iteratively update the memory of the sand cats.
$m(c(k)) = \begin{cases} X(k) & \text{if } f(X(k)) \ge f(m(c(k))) \\ m(c(k)) & \text{if } f(X(k)) < f(m(c(k))) \end{cases}$
where $f(\cdot)$ denotes the fitness function’s score.
It is expected from the discussions and the mathematical model of MSCSO described above that the proposed MSCSO can be highly efficient in solving feature selection problems.
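The memory bookkeeping of Equations (16) and (21) can be sketched as follows. Since the fitness of Equation (28) is minimized, "better" is read here as a lower fitness value; the array-based layout and function name are our own, not the authors' reference code.

```python
import numpy as np

def update_memory(m, m_fit, X, X_fit):
    """Per-cat memory update in the spirit of Eq. (21): keep the better position.

    m, X         : (N, d) arrays of remembered and current positions
    m_fit, X_fit : (N,) fitness values (lower is better, as in Eq. (28))
    """
    improved = X_fit < m_fit            # cats whose new position is better
    m[improved] = X[improved]           # overwrite memory with the new position
    m_fit[improved] = X_fit[improved]   # and remember its fitness
    return m, m_fit
```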

5. The Proposed Binary SCSO Methods for Feature Selection

Feature selection tasks are binary optimization processes where solutions are represented in binary values (i.e., ‘0’ or ‘1’). In a specific solution, the selected features are assigned a value of one, while features that are not selected are assigned a value of zero. This means that any meta-heuristic strategy used to solve FS tasks requires a binary form of the solution. The first version of SCSO was created to deal with continuous domains. To generate a binary version of SCSO (BSCSO), a solution must be characterized by a binary vector whose elements can only have values of ‘0’ or ‘1’. Regarding the algorithm’s update strategy, the solutions shift their positions in the feature space, which necessitates using transfer functions to ensure that the solution’s elements are either ‘0’ or ‘1’.
As they produce results in the range [0, 1], logistic transformation functions, also referred to as the S-shaped family, are reliable for mapping operations. This is necessary to express the probability of changing a binary solution element from ‘0’ to ‘1’ and conversely [51]. Furthermore, Mirjalili et al. presented the V-shaped family of potent transformation functions in the field of feature selection, which serves the same purpose as the S-shaped family [52]. The slope of the transformation function can be used to trade off exploitation and exploration characteristics. When the curve of the transformation function is very steep, it contributes to poor exploration; when the curve is flat or less steep, it leads to poor exploitation and readily slips into local minima [53,54]. In addition, a U-shaped transfer function was developed in [55], with $\eta$ and $\rho$ serving as two control parameters that, respectively, set the slope and the width of the U-shaped function’s basin. In practice, no transfer function has yet been shown to be the best for problems involving feature selection. Three alternative transfer functions from three categories, referred to as S-shaped, V-shaped, and U-shaped, were studied in this work, although only an S-shaped TF was ultimately adopted, because it is the most effective TF used in this field; it allowed for the creation of binary versions of the proposed MSCSO and the basic SCSO. As a result, this study explains three different transfer functions, and Figure 3 illustrates the essential graphs of these functions. These transfer functions define the likelihood of updating the elements of the binary solution from ‘0’ to ‘1’ and conversely.
Later, in the experimental findings section, the effectiveness of the S-shaped transfer function depicted in Figure 3a will be observed. As an example of how to convert a continuous search space into a binary one to address feature selection problems, three distinct transfer function types from the S-shaped, V-shaped, and U-shaped families are discussed below (a code sketch of these binarization rules appears after the list).
  • S−shaped transfer function: The sigmoid function from the S-shaped family, as stated in Equation (22) [51], and shown in Figure 3a, was used to transform the search space of the proposed and basic algorithms applied in this work from continuous to binary.
$S(x_{t+1}^{i,j}) = \dfrac{1}{1 + e^{-x_t^{i,j}}}$
where $S$ is the transformation vector, $S(x_{t+1}^{i,j})$ indicates the probability value generated by this transfer function, and $x_t^{i,j}$ and $x_{t+1}^{i,j}$ indicate the current and next positions of search agent $i$ at dimension $j$ and iterations $t$ and $t+1$, respectively.
Equation (22) is a stable S-shaped transformation function that converts an unbounded input into a limited output, mapping any interval’s domain to the range $[0, 1]$. The probability of changing the position value increases as the slope of the S-shaped transformation function lowers, as shown in Figure 3a, which may effectively update the search agents’ locations and find the best solutions. The increasing speed of the S-shaped transformation function also makes it easier to calculate position values. Here, the search agent’s position is converted into a probability value using the transfer function presented in Equation (22), from which the search agent’s next position $x_{t+1}^{i,j}$ is computed. Accordingly, Equation (23) guarantees the binary value of the sigmoid expression by using the commonly utilized stochastic threshold.
$x_{t+1}^{i,j} = \begin{cases} 1 & \text{if } rand < S(x_{t+1}^{i,j}) \\ 0 & \text{if } rand \ge S(x_{t+1}^{i,j}) \end{cases}$
where $rand$ stands for a random number created from a uniform distribution in the range $[0, 1]$, $x_{t+1}^{i,j}$ represents the position of search agent $i$ at iteration $t+1$ and dimension $j$, and $S(x_{t+1}^{i,j})$ produces a probability value as presented in Equation (22).
Equation (23) states that the probability of changing the search agents’ next positions is calculated using their present positions. From the sigmoid transfer function in Equation (23), it is clear that this version of the function might not provide a good balance between exploration and exploitation, since the exploration amount should be higher than the exploitation amount at the beginning of the optimization process. Therefore, it is possible that some intriguing areas of the search space may go unexplored, and there is a chance that the proposed BMSCSO might become trapped in local optima. In spite of several attempts to fix this problem, none has fully prevented entrapment in local optima [55].
  • V-shaped transfer function: The V-shaped transfer function shown in Equation (24) [56] and displayed in Figure 3b could be employed in further work of this study to calculate the probability of altering the search agents’ position from continuous to binary in the basic SCSO and the developed BMSCSO as feature selection algorithms, as illustrated in Figure 4.
$V(x_{t+1}^{i,j}) = \left| \tanh(x_t^{i,j}) \right|$
where $V$ denotes the V-shaped transfer function, and $V(x_{t+1}^{i,j})$ indicates the likelihood of this transfer function for the position $x_{t+1}^{i,j}$ of search agent $i$ at iteration $t+1$ and dimension $j$.
The V-shaped transfer function presented in Equation (24) differs from the S-shaped transfer function in Equation (22) in that it updates positions with new rules, as is evident by comparing Figure 3b with Figure 3a. Simply expressed, Equation (25) may be used to convert the continuous solution obtained from Equation (24) into a binary one depending on the probability outcomes of the V-shaped transfer function.
$x_{t+1}^{i,j} = \begin{cases} \neg x_t^{i,j} & \text{if } rand < V(x_{t+1}^{i,j}) \\ x_t^{i,j} & \text{if } rand \ge V(x_{t+1}^{i,j}) \end{cases}$
where $\neg x_t^{i,j}$ signifies the complement of the solution $x_t^{i,j}$, $rand$ specifies a random value generated between 0 and 1, and $V(x_{t+1}^{i,j})$ embodies the probability value of the V-shaped transfer function.
Figure 3b illustrates that the V-shaped transfer function is symmetrical. From Equation (25), it may be observed that the position updating is also different, since the search agents are reversed and are not required to accept values of ‘0’ or ‘1’. When the present position value is low during an iteration, the V-shaped transfer function encourages the search agents to stay in their current position. Alternatively, the search agents change to the complement when the existing position value is high. Meta-heuristics may still struggle with the problem of falling into local optima as a result of this process, which may affect how search agents update their positions and find the optimal solution. Clearly, issues similar to those of the S-shaped transfer function may still lead a meta-heuristic algorithm to exhibit a biased balance between the exploration and exploitation phases. It could be essential to research additional transfer functions in an effort to better balance exploration and exploitation. Therefore, in addition to the transfer functions previously discussed, U-shaped transfer functions may be employed as alternatives to transform continuous algorithms into binary ones.
  • U-shaped transfer function: The U-shaped transfer function presented in Equation (26) [55] and displayed in Figure 3c could be applied to calculate the probability of altering the search agents’ position from continuous to binary in the basic and proposed feature selection algorithms.
$U(x_{t+1}^{i,j}) = \eta \left| x_t^{i,j} \right|^{\rho}, \quad \eta = 1,\ \rho \in \{1.5, 2.0, 3.0, 4.0\}$
where $\eta$ and $\rho$ are two basic parameters: $\eta$ determines the function’s slope, $\rho$ represents the width of the curve’s basin, and $U(x_{t+1}^{i,j})$ denotes the probability of the position of search agent $i$.
The U-shaped transfer function was produced using two control parameters, $\eta$ and $\rho$, where $\eta$ permits modifying the saturation point of this function and $\rho$ specifies the width of the transfer function’s trough. The pace at which the U-shaped function reaches the saturation point affects how likely it is that a bit will be flipped; this promotes exploration, since variables can change quickly, whereas the wider the U-shaped basin, the lower the exploratory behavior. Using Equation (26), the values of the continuous solution elements can be converted to binary values by Equation (27).
$x_{t+1}^{i,j} = \begin{cases} \neg x_t^{i,j} & \text{if } rand < U(x_{t+1}^{i,j}) \\ x_t^{i,j} & \text{if } rand \ge U(x_{t+1}^{i,j}) \end{cases}$
where $U(x_{t+1}^{i,j})$ forms the probability value of the U-shaped transfer function and $rand$ denotes a uniform random number generated between 0 and 1.
Equation (27) shows how the search agent’s current position might vary based on the probability value $U(x_{t+1}^{i,j})$ obtained by Equation (26). The random values produced by $rand$ are essential in establishing whether the value of the solution $x_t^{i,j}$ at the current iteration is flipped. Exploration is an important stage in the early rounds for exploring the full search space, while the exploitation phase is required in the last iterations to find better solutions after switching from exploration.
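The three transfer-function families and their binarization rules (Equations (22)–(27)) can be sketched in Python as follows; note that the S-shaped rule sets bits directly, whereas the V- and U-shaped rules flip the current bit. This is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def s_shaped(x):                      # Eq. (22): sigmoid transfer function
    return 1.0 / (1.0 + np.exp(-x))

def v_shaped(x):                      # Eq. (24): |tanh| transfer function
    return np.abs(np.tanh(x))

def u_shaped(x, eta=1.0, rho=2.0):    # Eq. (26): U-shaped transfer function
    return eta * np.abs(x) ** rho

def binarize_s(x):                    # Eq. (23): bits are set directly
    return (rng.random(x.shape) < s_shaped(x)).astype(int)

def binarize_flip(x_bin, prob):       # Eqs. (25)/(27): the current bit is flipped
    flip = rng.random(x_bin.shape) < prob
    return np.where(flip, 1 - x_bin, x_bin)

# Example: binarize a continuous position vector with the S-shaped rule
x_cont = rng.normal(size=8)
print(binarize_s(x_cont))
```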
As mentioned earlier, there are a variety of transfer functions, including S-shaped, V-shaped, and U-shaped ones, which have been extensively used in the literature to convert continuous data into binary data. The authors explained in [52] that the S-shaped and V-shaped transfer functions provided outcomes in binary optimization feature selection approaches that were quite similar. In order to get binary variants of the native continuous algorithms of the proposed MSCSO and basic SCSO, this work applied the sigmoid S-shaped transfer function described in Equation (22) and displayed in Figure 3a. This transfer function was used in this work due to its simplicity and small number of parameters as reported extensively in [10,57,58].
In the FS task, the goal is to reduce the number of features while improving the performance of the classifier (i.e., the learning model) [37]. In this sense, the FS method is a multi-objective optimization method designed to obtain the global solution that maximizes model performance while preserving the minimum number of relevant features.
In the current study, the objective function (i.e., fitness) of a solution is determined by the classification error rate of the k-NN classifier (i.e., one minus the accuracy) on the validation dataset together with the number of selected features. Minimizing the k-NN error rate while minimizing the number of selected features yields a multi-objective function. As a result, to achieve a balance between these two criteria, one fitness function is used to merge both, as shown in Equation (28):
$fitness = \alpha\, \zeta_k + \beta\, \dfrac{|R|}{|N|}$
where $\zeta_k$ represents the classification error rate arrived at by the k-NN classifier, $|R|$ and $|N|$ stand for the number of selected and original features in the dataset, respectively, and $\alpha$ and $\beta$ are two counteractive parameters in the interval from 0 to 1. They represent, respectively, the weights of the classification error rate and the selection ratio of the selected features, where $\alpha \in [0, 1]$ and $\beta$ is the complement of $\alpha$, i.e., $\beta = 1 - \alpha$ [59].
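A direct, hedged transcription of Equation (28) is given below, assuming a scikit-learn-style k-NN classifier (with k = 5, as in the experimental setup of Section 6.2) for the error term; the data-handling details and function name are ours, not the paper's.

```python
from sklearn.neighbors import KNeighborsClassifier

def fitness(mask, X_tr, y_tr, X_val, y_val, alpha=0.99):
    """Eq. (28): alpha * error_rate + (1 - alpha) * |R| / |N|."""
    beta = 1.0 - alpha
    if mask.sum() == 0:                  # an empty subset gets the worst fitness
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=5)            # k = 5, as in Section 6.2
    knn.fit(X_tr[:, mask == 1], y_tr)
    error = 1.0 - knn.score(X_val[:, mask == 1], y_val)  # zeta_k: 1 - accuracy
    return alpha * error + beta * mask.sum() / mask.size
```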

Complexity Analysis of BSCSO and BMSCSO

Big-O notation was utilized to estimate the time complexity of the proposed FS methods, namely BSCSO and BMSCSO. The time complexity analysis of these methods for feature selection tasks primarily depends on the dataset dimensions (d), initialization stage, number of iterations (K), fitness function cost (C), population size (n), and number of experiments (V). The S-shaped transfer function is also used to generate binary versions of BSCSO and BMSCSO. The overall computational complexity of BSCSO and BMSCSO can be represented using Big-O notation as presented below:
$O(BSCSO) = O(init.) + O(K \times pop.\ update) + O(K \times fitness\ eval.) + O(K \times selection)$
By calculating the Big-O cost of each phase in Equation (29), the time complexity of BSCSO can be expressed as follows:
$O(BSCSO) = O(nd) + O(VKnd) + O(VKnc) + O(VKnd)$
For BMSCSO, the time complexity is the same as for BSCSO, except that the memory-based update and the adaptive update of positions are added in each iteration. Since the memory update is evaluated for each search agent within each iteration, it costs $O(VKnd)$. The adaptive update of positions uses a separate loop, as shown in Algorithm 2; thus, its time complexity is also $O(VKnd)$. Based on these two additional procedures, the time complexity of BMSCSO, as illustrated in Figure 4, can be given as shown below:
$O(BMSCSO) = O(nd) + O(VKnd) + O(VKnc) + O(2VKnd) + O(2VKnd)$
Algorithm 2 A pseudo code showing the main steps of the proposed binary memory-based sand cat swarm optimizer (BMSCSO).
1: Input parameters: $S_p$, $r$, $R$, $K$, $k$
2: Outcome: The global best solution
3: Initialize the population of the BMSCSO algorithm
4: Initialize the memory for the initial population
5: Define and initialize the input parameters
6: while $k \le K$ do
7:   Assess each search agent (i.e., sand cat)
8:   for each search agent do
9:     if (the position of the sand cat is between the lower and upper limits) then
10:      Initialize the memory with the current position using Equation (16)
11:    end if
12:    Randomly obtain an angle $\theta$ using the roulette wheel selection method, for which $0 \le \theta \le 360$
13:    if ($|R| \le 1$) then
14:      Update the positions of the search agents using $f_b - r_a \times |f_b - X(k)| \cdot S_B$
15:    else
16:      Update the positions of the search agents using $X(k) + r_g \times r_c\,(m(c(k)) - l_b(k))$, which uses the memory-based technique
17:    end if
18:  end for
19:  for each search agent do
20:    Adapt the solution using $X_i(k+1) = rand \times W_1 \times X_{bc}(k) - rand \times W_2 \times X_c(k)$
21:  end for
22:  Assess each search agent
23:  Update the current solution
24:  Update the global and best solutions
25:  $k = k + 1$
26: end while
27: Return the global best solution
As indicated by Equation (31), the major factors in the complexity are the iteration count and the population size. Moreover, because $nd \ll VKnd$ and $nd \ll VKnc$, the term $nd$ can be excluded from the time complexity given in Equation (31). As a consequence, the BMSCSO’s time complexity can be presented as:
$O(BMSCSO) \approx O\big(VK(n_f + n_m)d + VK(n_f + n_m)c\big)$

6. Experimental Results

6.1. Training and Testing Datasets

Table 2 shows the datasets used in this study. Twenty-one medical benchmark datasets were used in the experiments. Nine of them were downloaded from the UCI repository (Diagnostic, Coimbra, BreastEW, Prognostic, Retinopathy, ILPD-Liver, Lymphography, Parkinsons, and ParkinsonC). Seven datasets were downloaded from KEEL (SPECT, Cleveland, HeartEW, Hepatitis, SAHear, Spectfheart, and Thyroid0387). Two datasets (Heart and Pima-diabetes) were downloaded from Kaggle. The remaining three datasets, namely Leukemia, Colon, and Prostate_GE, were downloaded from a different location:
Prostate_GE, Colon, and Leukemia were all downloaded from https://jundongl.github.io/scikit-feature/datasets.html (accessed on 23 February 2023).

6.2. Experimental Setup

In our experiments, each dataset’s instances were randomly divided into training and testing partitions, with 80% for training and 20% for testing, as in a previous, closely related memory-based FS method [16]. Furthermore, to assess the generalizability of the suggested techniques (i.e., BSCSO and BMSCSO), the selected subset of features was evaluated using a k-NN classifier (k = 5) with the Euclidean distance [31]. As shown in Equation (28), the objective function’s main parameters were set to α = 0.99 and β = 0.01; these values have frequently appeared in many related works [31,37,60]. The upper and lower bounds were set to 1 and 0, respectively. The population size was set to 30, and the maximum number of iterations was 100 for each algorithm to ensure a fair comparison of the contestant algorithms. The details of the parameter settings for the comparison algorithms in this study are shown in Table 3.
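As a hedged illustration of this evaluation protocol (the data, labels, and feature mask below are random placeholders, not the paper's datasets):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 30))          # placeholder data: 200 samples, 30 features
y = rng.integers(0, 2, 200)        # placeholder binary labels
selected = rng.integers(0, 2, 30)  # a candidate binary feature mask

# 80/20 train/test split, as in the protocol above
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20, random_state=0)

# Selected subsets are scored with a k-NN classifier (k = 5, Euclidean distance)
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_tr[:, selected == 1], y_tr)
accuracy = knn.score(X_te[:, selected == 1], y_te)
print(f"Test accuracy of the candidate subset: {accuracy:.3f}")
```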

6.3. Performance Evaluation

To thoroughly examine the proposed BMSCSO's performance, its results on FS problems were compared to those of the standard BSCSO and seven other FS methods published in the literature: Binary Biogeography-Based Optimization (BBBO) [61], the Binary Moth-Flame Optimization (BMFO) algorithm [62], Binary Particle Swarm Optimization (BPSO) [63], Binary Teaching-Learning-Based Optimization (BTLBO) [64], the Binary Ali Baba and the Forty Thieves (BAFT) algorithm [10], the Binary Honey Badger Algorithm (BHBA) [65], and Binary Success-History-based Adaptive Differential Evolution with Linear population size reduction (BLSHADE) [66]. In this comparison, not only are the specifics of the experimental findings provided, but a thorough and meticulous comparison with these optimization techniques is also made. To fairly compare the proposed BMSCSO with the competing techniques, the same experimental settings, including population size and number of iterations, were applied to all of the competing algorithms. The parameter settings for each of the contending methods are listed in Table 3.
The experiments were conducted over 30 separate runs in order to obtain statistically significant results. The outcomes of the statistical analysis were then derived from the overall capabilities and conclusions gained during these runs. The dimension of each problem equals the number of features in the corresponding dataset. The proposed BMSCSO's performance was assessed using classification accuracy, sensitivity, specificity, fitness values, and the average number of selected features over the independent runs. This performance was then compared to that of the other FS methods using all of the aforementioned criteria. In all comparison tables, the best results are emboldened to give them more prominence over the other results.
The average classification accuracy scores for the standard BSCSO, the proposed BMSCSO, and the other competing methods are presented in Table 4, along with their associated standard deviation (SD) results.
Greater accuracy results indicate superior robustness, while lower SD values indicate algorithmic stability. Table 4 shows that the proposed BMSCSO obtained the maximum accuracy in a total of 14 datasets and was rated first solely based on having the best accuracy results in 10 datasets. BSCSO was able to acquire the optimal accuracy results in 3 datasets, namely Hepatitis, Leukemia and ProstateGE. BHBA came in second place by exclusively achieving the best results in 5 datasets, realizing the best accuracy score in a total of 6 datasets. BBBO ranked third by exclusively getting the best accuracy result in one dataset, namely Parkinsons. Outstandingly, both BSCSO and BMSCSO obtained 100% accuracy in the Hepatitis, Leukemia and ProstateGE datasets. When reading Table 4, one can notice that the proposed BMSCSO was more reliable than its competitors, having the lowest SD values in several datasets, much better than many of the other rival methods.
Table 5 shows the average fitness value and standard deviation of all competitors in each dataset.
It should be understood that lower fitness scores reveal better performance degrees for the optimization techniques. From Table 5, it can be observed that BHBA and BMSCSO ranked first and second, where each exclusively obtained the minimum fitness values in a total of 11 and 4 datasets, respectively. BTLBO ranked third by exclusively obtaining the minimal fitness scores in 4 datasets, namely, Coimbra, Parkinsons, Cleveland and Hepatitis. BBBO exclusively obtained the minimum fitness value in one dataset, namely the Leukemia dataset. BLSHADE and BHBA shared the same minimum fitness value of 0.0144 in the Thyroid dataset. Finally, BAFT, BMFO, BSCSO and BPSO did not attain the lowest fitness value in any of the datasets investigated in this study. A second reading of the standard deviation results listed in Table 5 shows that BHBA's and BMSCSO's performances are superior to those of their rivals, as they achieved the lowest SD results across the majority of the datasets.
Table 6 provides the sensitivity findings of the proposed BMSCSO compared to BSCSO and the other competitors.
Next, we evaluate the proposed BMSCSO against the other competing algorithms in terms of the sensitivity findings that the FS algorithms aim to improve. It should be mentioned that a better performance level corresponds to higher sensitivity outcomes. Table 6 summarizes the outcomes of these opposing algorithms in terms of the average sensitivity findings together with their standard deviation values. Regarding the outcomes shown in this table, there is no doubt that BLSHADE has the highest sensitivity values when compared to all of its competitors. In particular, in 9 out of 21 datasets, BLSHADE alone obtained the greatest sensitivity values. By having the greatest sensitivity findings across 6 out of 21 datasets, the proposed BMSCSO clearly came in second place. BMFO has the best exclusive sensitivity results in 4 datasets, namely, Retinopathy, ILPD, Parkinsons and Heart, whereas BHBA placed next despite not achieving the best sensitivity values compared to all other competing algorithms. BBBO came in the next place with the best sensitivity findings only in the ParkinsonC and ProstateGE datasets, with values of 0.8829 and 0.8947, respectively. Finally, BAFT, BPSO, and BTLBO did not produce any distinguished sensitivity results. The proposed BMSCSO's standard deviation values are low, which shows that the proposed BMSCSO is stable and well-established.
Similarly, Table 7 provides an overview of the averages and standard deviations of the specificity findings for BMSCSO, BSCSO, and all other competing methods.
Reading the specificity findings reported in Table 7 reveals that BMSCSO and BLSHADE came in first and second, respectively, with the greatest specificity values over a total of 8 and 6 datasets out of 21, respectively. By achieving the highest specificity outcomes across four datasets, BBBO came in third. BHBA alone had the greatest specificity values in the Retinopathy and Spectfheart datasets, with values of 0.7210 and 0.8448, respectively. The results produced by BMFO and BPSO are respectable and superior to those of other competing algorithms like BTLBO and BAFT, even though they did not attain the best specificity results in any of the datasets. Compared to the other competitors, the proposed BMSCSO has relatively low standard deviation values throughout the majority of the test datasets. These results demonstrate the stability of BMSCSO's advantage.
As demonstrated in Table 8, the number of selected features is also taken into account when comparing the performance of the proposed BMSCSO to that of BSCSO and all other FS methods described above.
When evaluating any feature selection technique, the number of features selected during the classification process is just as crucial as the classification accuracy. There are many different outcomes when comparing the proposed method to the eight competing algorithms, as shown in Table 8. The nine competing algorithms can be grouped as follows based on the average number of selected features: by exclusively accumulating the fewest selected features across a total of 8 out of 21 datasets, BHBA beat its competitors. BMSCSO came in second place, exclusively attaining the minimal number of features in 6 datasets. Whereas BSCSO merely reduced the number of features in the Lymphography dataset, BMFO had the fewest features in the Retinopathy, ILPD and HeartEW datasets, and BPSO showed its superiority over these two methods by having the fewest selected features in the Diagnostic, ParkinsonC, PimaDiabetes and Colon datasets. The BAFT, BLSHADE, BTLBO and BBBO algorithms did not achieve the minimal number of features in any of the considered datasets. A second reading of the results in Table 8 shows that BMSCSO outperformed its competitors, as it obtained very small values across the majority of the datasets. This indicates that BMSCSO was able to choose around the same number of features when running the algorithm 30 separate times.
The comparisons between the proposed BMSCSO and other state-of-the-art feature selection algorithms found in the literature demonstrate the robustness of the proposed BMSCSO, as noted from the results presented in Table 4, Table 5, Table 6, Table 7 and Table 8. By examining these outcomes in further detail and comparing the margins between BMSCSO and its rivals, one can observe that algorithms like BTLBO, BBBO, and BPSO lag behind BMSCSO. Additionally, the proposed BMSCSO's standard deviations are small and lower than those of the other rival algorithms.
This demonstrates that the proposed algorithm's advantage is unquestionably strong. The key to BMSCSO's acceptable level of performance is the algorithm's sought-after balance between its exploration and exploitation features, which is made possible by the use of both the proposed mathematical model for this algorithm and the FS process. In this way, the search agents of the proposed BMSCSO were able to explore and exploit each potential location in the search space as per its mathematical model, striking a reasonable balance between exploration and exploitation. In this regard, the search agents of BMSCSO have the option of leaving their immediate area if they find themselves trapped in local optima.

6.4. Convergence Curves

Figure 5 exhibits the convergence curves of the proposed BMSCSO, the basic BSCSO, and the other comparative optimization algorithms for the 21 datasets based on the fitness metric measure.
In the convergence plots illustrated in Figure 5, the y-axis displays the fitness values and the x-axis the number of iterations. These curves show the average convergence behavior of the best solution for the basic BSCSO, the proposed BMSCSO, and the other competing algorithms over a total of 30 independent runs. The algorithm that achieves the lowest fitness outcomes with the fewest iterations while also avoiding local optima is selected as the best algorithm. When exploring the search space of each feature selection problem, the behavior of the proposed BMSCSO and the other competing algorithms diverges by significant margins, as shown in Figure 5. This is because the proposed BMSCSO algorithm successfully balances the capacities for exploitation and exploration. Re-examining the curves in Figure 5 reveals that the convergence patterns of the proposed BMSCSO, the basic BSCSO, and the other rival algorithms in the first 40 iterations in the Diagnostic, Breast, Prognostic, Coimbra, Retinopathy, and ILPD datasets are noticeably different. Distinctly, the convergence behavior of the proposed BMSCSO is somewhat better than that of some of its competitors, namely BMFO, BPSO, BBBO, and BLSHADE. After two-thirds of the iterations in the SPECT, Cleveland, Saheart, PimaDiabetes, and ProstateGE datasets, the convergence behavior of the two proposed companions (i.e., BSCSO and BMSCSO) becomes nearly similar. Overall, the convergence behavior of the proposed BMSCSO is slightly, and sometimes much, better than that of the rival algorithms, namely BMFO, BPSO, BBBO, BLSHADE, BAFT, BHBA, and BTLBO. The convergence curves of the HeartEW, Leukemia, and Colon datasets demonstrate that the convergence behaviors of the two proposed FS methods (that is, BSCSO and BMSCSO) and the other competing algorithms, BMFO, BBBO, BLSHADE, BAFT, and BTLBO, are dissimilar in most iterations and become indistinguishable in the final few iterations. In the ParkinsonC and Spectfheart datasets, however, the convergence behavior of the proposed BMSCSO is clearly superior to that of the standard BSCSO, as well as other competing algorithms including BTLBO and BBBO. Finally, in the Thyroid, Heart, and Hepatitis datasets, the convergence behavior of BMSCSO is superior to that of the basic BSCSO, BTLBO, BBBO, and BAFT. It should be noted that Table 5 shows that the proposed BMSCSO performs better than the basic BSCSO and the other competing algorithms, namely BMFO, BPSO, BBBO, BLSHADE, BAFT, BHBA, and BTLBO, in terms of the average fitness values in these datasets.
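Such averaged curves can be produced with a few lines of code; in the sketch below, `histories` is a hypothetical mapping from an algorithm's name to a runs-by-iterations array of the best fitness recorded so far in each run.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_convergence(histories, title):
    """histories: {name: array of shape (n_runs, n_iterations)} of best-so-far fitness."""
    for name, h in histories.items():
        plt.plot(np.asarray(h).mean(axis=0), label=name)  # average over the independent runs
    plt.xlabel("Iteration")
    plt.ylabel("Average fitness value")
    plt.title(title)
    plt.legend()
    plt.show()
```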

6.5. Exploration and Exploitation

The two most crucial characteristics of optimization algorithms are exploration and exploitation, which can help solve optimization problems more effectively [67]. Empirical studies have demonstrated a strong correlation between an optimization algorithm's capacity for exploration and exploitation and its rate of convergence. In particular, exploitation techniques are known to increase the convergence rate toward the global optimum, but also to increase the risk of being trapped in local optima [67]. On the other hand, search methods that prioritize exploration over exploitation are more likely to locate regions within the search space where the global optimum is more likely to exist, at the expense of a declining convergence speed [68]. Although it may seem minor, the question of how exploration and exploitation of solutions are accomplished in optimization algorithms has remained open in recent years and has continued to be a cause of disagreement among many researchers [69]. Although many ideas and beliefs may appear to be at odds with one another, there seems to be agreement among researchers that such search techniques must maintain an appropriate balance between exploration and exploitation in order to function reasonably. In order to identify reasonable solutions to an optimization problem, optimization algorithms employ a set of potential solutions to explore the search space. In general, the search agents with the best solutions should direct the search process in their direction. This attraction causes the gap between the search agents to narrow, strengthening the effects of exploitation; on the other hand, the impact of the exploration approach becomes more noticeable as the separation between the search agents widens. A diversity measurement is used to quantify the increase and decrease in distance between the search agents [70]. According to this approach, population diversity can be defined as follows [67]:
$$Exp_j = \frac{1}{N}\sum_{i=1}^{N}\left|\,\mathrm{median}\left(x^{j}\right) - x_{i}^{j}\,\right|$$
$$Exp = \frac{1}{m}\sum_{j=1}^{m} Exp_j$$
where N is the number of search agents, m identifies the dimension of the problem, $x_i^j$ stands for dimension j of search agent i, and $\mathrm{median}(x^j)$ stands for the median of dimension j over the total population.
The distance between dimension j of each search agent and the median of that dimension is referred to as the diversity in each dimension, $Exp_j$. The percentages of exploration and exploitation that a specific optimization algorithm uses constitute the overall balance response. The following formulas are used to compute these values at each iteration step [67]:
$$XPL\% = \frac{Exp}{Exp_{max}} \times 100$$
$$XPT\% = \frac{\left| Exp - Exp_{max} \right|}{Exp_{max}} \times 100$$
where $Exp_{max}$ denotes the highest diversity value found throughout the whole optimization process.
The percentage of exploration (XPL%) represents the relationship between the diversity at each iteration and the greatest diversity obtained. The percentage of exploitation (XPT%) represents the amount of exploitation [67]. As can be seen, the factors XPL% and XPT% are complementary and in conflict with one another. The use of the median value when evaluating the balance response prevents inconsistencies by providing a reference element. The $Exp_{max}$ value discovered over the whole optimization process also has an impact on this balance response, as the rates of exploration and exploitation are estimated using this value as a reference.
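A compact sketch of this balance computation is shown below, assuming the positions of all agents are recorded at every iteration; the function name is illustrative.

```python
import numpy as np

def balance_response(position_history):
    """position_history: array of shape (n_iterations, N agents, m dimensions).
    Returns the per-iteration exploration (XPL%) and exploitation (XPT%) percentages."""
    div = []
    for X in position_history:
        exp_j = np.abs(np.median(X, axis=0) - X).mean(axis=0)  # per-dimension diversity Exp_j
        div.append(exp_j.mean())                               # Exp: average over the m dimensions
    div = np.asarray(div)
    xpl = 100.0 * div / div.max()                              # percentage of exploration
    xpt = 100.0 * np.abs(div - div.max()) / div.max()          # percentage of exploitation
    return xpl, xpt
```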
Optimization of the BreastEW dataset is used as an example to illustrate the evaluation of the balance response of the proposed BMSCSO. Figure 6 depicts the performance conduct produced by the proposed BMSCSO in optimizing the BreastEW dataset over 100 iterations with respect to the balance assessment presented by Equations (35) and (36).
Five points, denoted as (A), (B), (C), (D), and (E), in Figure 6 have been selected to demonstrate the diversity of solutions and the balance evaluations at each of them. Point (A) denotes an early stage of the proposed BMSCSO, when XPL% and XPT% have balance evaluation values of 90 and 10, respectively. These percentages give BMSCSO a clear direction to operate within while it explores the search space. This implies that the solutions maintain a significant amount of dispersion across the search space. Point (B) corresponds to iteration 70, where the balance assessment preserves a value of XPL% = 70 together with XPT% = 30. At this position, the proposed BMSCSO mostly engages in exploration with minimal exploitation. Points (C) and (D) correspond to iterations 75 and 100, respectively, where the balance assessments' exploration and exploitation values are XPL% = 25, XPT% = 75 and XPL% = 5, XPT% = 95, respectively. When these percentages were reached, the proposed BMSCSO's behavior changed to encourage more exploitation than exploration. These configurations cause the solutions to be dispersed among numerous clusters, which lessens the overall diversity. Finally, point (E) marks the BMSCSO's final stage, where the proposed BMSCSO maintains a subtle tendency toward the exploitation of the top solutions without taking any exploration approach into consideration.

6.6. Sensitivity Analysis

To pinpoint the ideal parameter values of the proposed memory-based BMSCSO FS method, a detailed sensitivity analysis based on the Design of Experiments (DoE) approach was performed. The suggested technique employs k-NN as a classifier in the DoE to investigate the sensitivity of the critical control parameters ($\tau_1$ and $\tau_0$). The ranges of the essential control parameters were initially created, and the values of these parameters were examined to check whether the best values fell within these ranges or whether further tests were necessary. The FS experiments then varied one parameter at a time over the generated DoE values in the desired range while retaining the remaining parameters at their initial levels. These tests were carried out systematically to investigate the influence of the input parameters on the accuracy values of the suggested BMSCSO and to arrive at reasonable settings. The values examined for each parameter in this experiment were $\tau_0$ = 0.5, 1.0, 1.5, 2.0 and $\tau_1$ = 0.5, 1.0, 1.5, 2.0.
The numerical findings in Table 9 reveal that BMSCSO performed best when the parameters $\tau_1$ and $\tau_0$ of BMSCSO are both set to 2.0. This highlights the need for using a reasonable range of these critical parameters to improve BMSCSO's resilience. As shown by the standard deviation values of the classification accuracy of BMSCSO, the suggested approach is nearly stable to changes in the necessary control parameters when the datasets have varying degrees of dimensionality. To summarize, the results in Table 9 show that BMSCSO's performance depends on the choices of $\tau_1$ and $\tau_0$, and the best value is 2.0 for both of them.
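A DoE sweep of this kind reduces to a small grid search; the sketch below is a hypothetical outline in which `run_bmscso` stands in for running BMSCSO 30 times with a given ($\tau_0$, $\tau_1$) pair and returning the mean classification accuracy.

```python
import itertools

def run_bmscso(tau0, tau1):
    """Hypothetical stand-in: run BMSCSO 30 times with (tau0, tau1) and
    return the mean classification accuracy; a placeholder is returned here."""
    return 0.0

levels = [0.5, 1.0, 1.5, 2.0]                       # DoE levels for both parameters
scores = {(t0, t1): run_bmscso(t0, t1)
          for t0, t1 in itertools.product(levels, levels)}
best_tau0, best_tau1 = max(scores, key=scores.get)  # Table 9 reports (2.0, 2.0) as best
```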

6.7. Statistical Analysis

The accuracy of the proposed BSCSO and BMSCSO algorithms was assessed in the aforementioned subsections using a variety of performance indicators, and their performance on FS problems was compared with that of other optimization FS methods reported in the literature. Over the course of 30 separate runs, the optimal solutions (i.e., classification accuracy and fitness values) were summarized using the average and standard deviation as statistical measures. These measures provide a general overview of how well the proposed algorithms handle FS problems of varied complexity. The first metric shows an algorithm's average performance, while the second one shows how stable the method is throughout all of the separate runs. These evaluation measures may show the overall effectiveness and power of the FS method under evaluation, but they cannot compare the separate runs individually. In other words, they have demonstrated the potential exploitation and exploration behaviors of the proposed FS methods, but they cannot establish whether the observed differences between methods are statistically meaningful. To demonstrate the significance of the results and that they were not obtained by chance, Friedman's and Holm's statistical test methods [18,71] were carried out. Using the null hypothesis that there is no difference in the accuracy of any of the compared FS methods, Friedman's test is used to determine if there is a fundamental difference in the results of the competing FS methods.
The method with the best performance always receives the lowest rank, while the method with the poorest performance always receives the highest rank. For the findings of the FS problems being studied, the p-value of Friedman's test must be determined. If this value is equal to or less than the level of significance, which in this study is 0.05, the null hypothesis is rejected. In this case, there are statistically significant differences in how well the compared strategies perform. Following this statistical test, a post-hoc test (in this case, Holm's test procedure) is used to examine pairwise comparisons of the comparative algorithms. For the post-hoc analysis, the method with the lowest rank obtained by Friedman's test is typically employed as the control method. According to Friedman's and Holm's statistical methods, the average fitness and classification accuracy results of the proposed BMSCSO algorithm presented in Table 4 and Table 5 are, therefore, statistically meaningful and did not arise by chance.
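For reference, a minimal SciPy sketch of this two-stage procedure is given below; the ranking direction and the z statistic follow the standard Friedman/Holm formulation described in [18], and the function name is illustrative.

```python
import numpy as np
from scipy.stats import friedmanchisquare, norm, rankdata

def friedman_holm(acc, names, alpha=0.05):
    """acc: (n_datasets, k_methods) matrix of mean accuracies per method."""
    stat, p = friedmanchisquare(*acc.T)                 # omnibus Friedman test
    ranks = rankdata(-acc, axis=1).mean(axis=0)         # lower mean rank = better accuracy
    n, k = acc.shape
    control = int(np.argmin(ranks))                     # best-ranked method is the control
    se = np.sqrt(k * (k + 1) / (6.0 * n))
    z = (ranks - ranks[control]) / se                   # z statistic versus the control
    pvals = 2.0 * (1.0 - norm.cdf(np.abs(z)))
    order = [i for i in np.argsort(pvals) if i != control]
    for pos, i in enumerate(order):                     # Holm's step-down correction
        reject = pvals[i] <= alpha / (k - 1 - pos)
        print(f"{names[i]}: z={z[i]:.3f}, p={pvals[i]:.3g}, reject H0: {reject}")
    return stat, p, ranks
```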
Table 10 shows the ranking results of the Friedman’s method on the basis of the results presented in Table 4.
Friedman's test method yielded a p-value of 9.442002E-11 based on the average classification accuracy results shown in Table 10. The existence of a statistically significant difference in the accuracy of the compared algorithms was shown by the rejection of the null hypothesis of similar performance. According to the statistical results shown in Table 10, BMSCSO is statistically significant and the most successful FS method among all the others, obtaining the best (lowest) rank of 1.714285 at a significance level of 5%. In particular, BMSCSO was put in first place, followed in order by BSCSO, BHBA, BLSHADE, BBBO, BAFT, BMFO, BPSO, and BTLBO.
Following Friedman's test, Holm's test method was used to assess whether there are statistically significant differences between the competing methods indicated in Table 10 and the control algorithm with the lowest rank (i.e., BMSCSO). Table 4 and Table 10 reflect the collected results; Table 11 presents the statistical findings produced by Holm's test procedure based on those results. In this table, z stands for the statistical difference between the two FS methods being compared, $R_0$ for the Friedman's rank assigned to the control method, $R_i$ for the Friedman's rank assigned to method i, and ES for the effect size of the control algorithm relative to algorithm i.
Using Holm's test, which is shown in Table 11 and rejects hypotheses with a p-value ≤ 0.025000, BMSCSO was compared to the other competing algorithms. As can be seen from the findings in Table 11, BMSCSO is a potent FS approach for obtaining favorable results on the datasets under examination, and it performs noticeably better than the other skilled methods utilized to handle these FS problems. The p-values also divulge the BMSCSO's level of performance on FS tasks and demonstrate the effectiveness of this approach. In brief, the results in Table 10 and Table 11 demonstrate that BMSCSO has successfully avoided local optimal solutions and is regarded as statistically significant by achieving high classification accuracy scores in comparison to the classification accuracy rates obtained by the basic BSCSO and the other competing FS algorithms.
A secondary statistical comparison was conducted between the proposed BMSCSO and the other competing FS methods with regard to the mean fitness results displayed in Table 5. The ranking results of the Friedman's test-based statistical analysis of the fitness values are listed in Table 12.
As per the fitness scores exhibited in Table 5, the p-value determined by Friedman's test is 6.380973E-11, so the null hypothesis of equal performance is again rejected. According to the ranking results presented in Table 12, BHBA received the top (lowest) rank at a significance level of α = 5%, and BTLBO is the second-best approach, with an average ranking score of 3.833333, slightly behind the rank attained by BHBA. The proposed BMSCSO is ranked among the top three FS methods out of all those in competition, with a rank of 4.095, which it owes to the nature of its local and global search strategies. In sum, and in light of the ranking results presented in Table 12, BMSCSO performed much better than most of the other algorithms on the FS problems, coming in third place and being followed by BLSHADE, BBBO, BSCSO, BAFT, BMFO, and, in the last rank, BPSO.
These outcomes approximately match those previously shown in Table 10. Holm's test procedure is then used to compare the control algorithm (i.e., BHBA) with the other algorithms to see if there are statistically discernible differences. The outcomes obtained after applying Holm's test with α = 0.05 to the results shown in Table 12 are provided in Table 13.
For the outcomes in Table 13, Holm's technique rejected the hypotheses with a p-value ≤ 0.016666. These findings clearly show that BMSCSO outperformed many other binary algorithms in solving FS problems, as evidenced by its high performance score compared to those of its rivals. These results demonstrate that BMSCSO has effectively avoided local solutions throughout the search space by maintaining sensible exploration and exploitation behavior. Table 10, Table 11, Table 12 and Table 13 show that the results of BMSCSO are statistically significant, demonstrating that this FS method is superior to many other FS methods, with excellent results over other well-known established optimization algorithms. This finding demonstrates the high efficiency of the proposed BMSCSO, a binary extension of the SCSO method, in solving a wide range of well-known FS problems.

7. Conclusions and Future Works

This paper proposes a new effective Memory-based Sand Cat Swarm Optimization (MSCSO) for addressing Feature Selection (FS) problems. The proposed MSCSO makes use of the exploration ability of SCSO to scout the search space efficiently and of an internal memory component for quicker convergence and for locating the global best solution. For this study, the basic SCSO and the proposed MSCSO were converted to binary form, named BSCSO and BMSCSO, respectively, to properly handle FS tasks. The goals of these FS algorithms include producing clear and non-redundant data, improving data mining performance, and yielding straightforward and comprehensive models. Meanwhile, these proposed algorithms may be utilized to successfully reduce the dimensionality of data for machine learning applications. The performance of the proposed methods was evaluated on 21 benchmark medical datasets using a number of evaluation criteria. For comparative assessment, the proposed BMSCSO and the basic binary SCSO were compared with other well-known FS algorithms. According to the analysis based on Friedman's and Holm's test methods, the proposed BMSCSO achieved the highest classification accuracy rate. This encourages the use of MSCSO in expert systems as a dependable and robust optimization method. Similar to SCSO, MSCSO has relatively few parameters that need to be adjusted, and because these variables are self-adaptive, it is simple to employ MSCSO to solve optimization problems. MSCSO has a great capacity to avoid local optima compared to other algorithms since it is a population-based algorithm. Additionally, by incorporating the internal memory idea into MSCSO, stagnation at local optima is avoided and the quality of the solution is improved during the iterative updating process. Since the proposed BMSCSO demonstrated appealing performance in solving FS problems, further enhancements of this version might be pursued in future studies. For instance, researchers working on multi-objective optimization problems could employ the proposed MSCSO. The applicability of the proposed methods may also be confirmed on high-dimensional datasets, as in gene selection. Other types of transfer functions, such as V-shaped, X-shaped and U-shaped, could be applied to examine their impact on the effectiveness of the developed FS algorithms.

Author Contributions

Methodology, D.A.; Validation, M.B.; Investigation, A.A.; Writing—original draft, M.T.A.; Writing—review & editing, E.J.A.; Supervision, A.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been funded by Scientific Research Deanship at University of Ha’il—Saudi Arabia through project number RG-22 021.

Data Availability Statement

Not applicable.

Acknowledgments

We want to acknowledge the Scientific Research Deanship at University of Ha’il—Saudi Arabia, for funding this research.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210. [Google Scholar] [CrossRef]
  2. Alsahaf, A.; Petkov, N.; Shenoy, V.; Azzopardi, G. A framework for feature selection through boosting. Expert Syst. Appl. 2022, 187, 115895. [Google Scholar] [CrossRef]
  3. Hu, G.; Du, B.; Wang, X.; Wei, G. An enhanced black widow optimization algorithm for feature selection. Knowl.-Based Syst. 2022, 235, 107638. [Google Scholar] [CrossRef]
  4. Feofanov, V.; Devijver, E.; Amini, M.R. Wrapper feature selection with partially labeled data. Appl. Intell. 2022, 52, 12316–12329. [Google Scholar] [CrossRef]
  5. Albashish, D.; Hammouri, A.I.; Braik, M.; Atwan, J.; Sahran, S. Binary biogeography-based optimization based SVM-RFE for feature selection. Appl. Soft Comput. 2021, 101, 107026. [Google Scholar] [CrossRef]
  6. Vommi, A.M.; Battula, T.K. A hybrid filter-wrapper feature selection using Fuzzy KNN based on Bonferroni mean for medical datasets classification: A COVID-19 case study. Expert Syst. Appl. 2023, 218, 119612. [Google Scholar] [CrossRef]
  7. Seyyedabbasi, A.; Kiani, F. Sand Cat swarm optimization: A nature-inspired algorithm to solve global optimization problems. Eng. Comput. 2022, 1, 1–25. [Google Scholar] [CrossRef]
  8. Khurma, R.A.; Albashish, D.; Braik, M.; Alzaqebah, A.; Qasem, A.; Adwan, O. An augmented Snake Optimizer for diseases and COVID-19 diagnosis. Biomed. Signal Process. Control 2023, 84, 104718. [Google Scholar] [CrossRef]
  9. Awadallah, M.A.; Al-Betar, M.A.; Braik, M.S.; Hammouri, A.I.; Doush, I.A.; Zitar, R.A. An enhanced binary Rat Swarm Optimizer based on local-best concepts of PSO and collaborative crossover operators for feature selection. Comput. Biol. Med. 2022, 147, 105675. [Google Scholar] [CrossRef]
  10. Braik, M. Enhanced Ali Baba and the forty thieves algorithm for feature selection. Neural Comput. Appl. 2022, 35, 6153–6184. [Google Scholar] [CrossRef]
  11. Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1997, 1, 67–82. [Google Scholar] [CrossRef]
  12. Seyyedabbasi, A. Solve the Inverse Kinematics of Robot Arms using Sand Cat Swarm Optimization (SCSO) Algorithm. In Proceedings of the 2022 International Conference on Theoretical and Applied Computer Science and Engineering (ICTASCE), Ankara, Turkey, 29 September–1 October 2022; pp. 127–131. [Google Scholar]
  13. Jovanovic, D.; Marjanovic, M.; Antonijevic, M.; Zivkovic, M.; Budimirovic, N.; Bacanin, N. Feature Selection by Improved Sand Cat Swarm Optimizer for Intrusion Detection. In Proceedings of the 2022 International Conference on Artificial Intelligence in Everything (AIE), Lefkosa, Cyprus, 2–4 August 2022; pp. 685–690. [Google Scholar]
  14. Lu, W.; Shi, C.; Fu, H.; Xu, Y. A Power Transformer Fault Diagnosis Method Based on Improved Sand Cat Swarm Optimization Algorithm and Bidirectional Gated Recurrent Unit. Electronics 2023, 12, 672. [Google Scholar] [CrossRef]
  15. Albashish, D.; Aburomman, A. Weighted heterogeneous ensemble for the classification of intrusion detection using ant colony optimization for continuous search spaces. Soft Comput. 2022, 27, 4779–4793. [Google Scholar] [CrossRef]
  16. Too, J.; Liang, G.; Chen, H. Memory-based Harris hawk optimization with learning agents: A feature selection approach. Eng. Comput. 2022, 38, 4457–4478. [Google Scholar] [CrossRef]
  17. Braik, M.; Al-Zoubi, H.; Ryalat, M.; Sheta, A.; Alzubi, O. Memory based hybrid crow search algorithm for solving numerical and constrained global optimization problems. Artif. Intell. Rev. 2023, 56, 27–99. [Google Scholar] [CrossRef]
  18. Derrac, J.; García, S.; Molina, D.; Herrera, F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol. Comput. 2011, 1, 3–18. [Google Scholar] [CrossRef]
  19. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S. An improved grey wolf optimizer for solving engineering problems. Expert Syst. Appl. 2021, 166, 113917. [Google Scholar] [CrossRef]
  20. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
  21. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S.; Abualigah, L.; Abd Elaziz, M.; Oliva, D. Ewoa-opf: Effective whale optimization algorithm to solve optimal power flow problem. Electronics 2021, 10, 2975. [Google Scholar] [CrossRef]
  22. Taghian, S.; Nadimi-Shahraki, M.H. A binary metaheuristic algorithm for wrapper feature selection. Int. J. Comput. Sci. Eng. (IJCSE) 2019, 8, 168–172. [Google Scholar]
  23. Nadimi-Shahraki, M.H.; Moeini, E.; Taghian, S.; Mirjalili, S. DMFO-CD: A discrete moth-flame optimization algorithm for community detection. Algorithms 2021, 14, 314. [Google Scholar] [CrossRef]
  24. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S.; Faris, H. MTDE: An effective multi-trial vector-based differential evolution algorithm and its applications for engineering design problems. Appl. Soft Comput. 2020, 97, 106761. [Google Scholar] [CrossRef]
  25. Awadallah, M.A.; Hammouri, A.I.; Al-Betar, M.A.; Braik, M.S.; Abd Elaziz, M. Binary Horse herd optimization algorithm with crossover operators for feature selection. Comput. Biol. Med. 2022, 141, 105152. [Google Scholar] [CrossRef]
  26. Malibari, A.A.; Alotaibi, S.S.; Alshahrani, R.; Dhahbi, S.; Alabdan, R.; Al-wesabi, F.N.; Hilal, A.M. A novel metaheuristics with deep learning enabled intrusion detection system for secured smart environment. Sustain. Energy Technol. Assess. 2022, 52, 102312. [Google Scholar] [CrossRef]
  27. Braik, M.; Sheta, A.; Aljahdali, S. Diagnosis of brain tumors in MR images using metaheuristic optimization algorithms. In Innovation in Information Systems and Technologies to Support Learning Research: Proceedings of EMENA-ISTL 2019 3; Springer: Berlin/Heidelberg, Germany, 2020; pp. 603–614. [Google Scholar]
  28. Braik, M. Hybrid enhanced whale optimization algorithm for contrast and detail enhancement of color images. Clust. Comput. 2022, 1, 1–37. [Google Scholar] [CrossRef]
  29. Gokalp, O.; Tasci, E.; Ugur, A. A novel wrapper feature selection algorithm based on iterated greedy metaheuristic for sentiment classification. Expert Syst. Appl. 2020, 146, 113176. [Google Scholar] [CrossRef]
  30. Braik, M.; Al-Zoubi, H.; Al-Hiary, H. Artificial neural networks training via bio-inspired optimisation algorithms: Modelling industrial winding process, case study. Soft Comput. 2021, 25, 4545–4569. [Google Scholar] [CrossRef]
  31. Braik, M. A hybrid multi-gene genetic programming with capuchin search algorithm for modeling a nonlinear challenge problem: Modeling industrial winding process, case study. Neural Process. Lett. 2021, 53, 2873–2916. [Google Scholar] [CrossRef]
  32. Iraji, A.; Karimi, J.; Keawsawasvong, S.; Nehdi, M.L. Minimum safety factor evaluation of slopes using hybrid chaotic sand cat and pattern search approach. Sustainability 2022, 14, 8097. [Google Scholar] [CrossRef]
  33. Wu, D.; Rao, H.; Wen, C.; Jia, H.; Liu, Q.; Abualigah, L. Modified Sand Cat Swarm Optimization Algorithm for Solving Constrained Engineering Optimization Problems. Mathematics 2022, 10, 4350. [Google Scholar] [CrossRef]
  34. Li, Y.; Wang, G. Sand Cat Swarm Optimization Based on Stochastic Variation with Elite Collaboration. IEEE Access 2022, 10, 89989–90003. [Google Scholar] [CrossRef]
  35. Kiani, F.; Anka, F.A.; Erenel, F. PSCSO: Enhanced sand cat swarm optimization inspired by the political system to solve complex problems. Adv. Eng. Softw. 2023, 178, 103423. [Google Scholar] [CrossRef]
  36. Arasteh, B.; Seyyedabbasi, A.; Rasheed, J.M.; Abu-Mahfouz, A. Program Source-Code Re-Modularization Using a Discretized and Modified Sand Cat Swarm Optimization Algorithm. Symmetry 2023, 15, 401. [Google Scholar] [CrossRef]
  37. Alweshah, M.; Alkhalaileh, S.; Al-Betar, M.A.; Bakar, A.A. Coronavirus herd immunity optimizer with greedy crossover for feature selection in medical diagnosis. Knowl.-Based Syst. 2022, 235, 107629. [Google Scholar] [CrossRef]
  38. Mahdi, A.Y.; Yuhaniz, S.S. Optimal feature selection using novel flamingo search algorithm for classification of COVID-19 patients from clinical text. Math. Biosci. Eng. 2023, 20, 5268–5297. [Google Scholar] [CrossRef]
  39. Ryalat, M.H.; Dorgham, O.; Tedmori, S.; Al-Rahamneh, Z.; Al-Najdawi, N.; Mirjalili, S. Harris hawks optimization for COVID-19 diagnosis based on multi-threshold image segmentation. Neural Comput. Appl. 2023, 35, 6855–6873. [Google Scholar] [CrossRef]
  40. Nadimi-Shahraki, M.H.; Zamani, H.; Mirjalili, S. Enhanced whale optimization algorithm for medical feature selection: A COVID-19 case study. Comput. Biol. Med. 2022, 148, 105858. [Google Scholar] [CrossRef] [PubMed]
  41. Taghian, S.; Nadimi-Shahraki, M.H. Binary sine cosine algorithms for feature selection from medical data. arXiv 2019, arXiv:1911.07805. [Google Scholar] [CrossRef]
  42. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S.; Abualigah, L. Binary aquila optimizer for selecting effective features from medical data: A COVID-19 case study. Mathematics 2022, 10, 1929. [Google Scholar] [CrossRef]
  43. Nadimi-Shahraki, M.H.; Asghari Varzaneh, Z.; Zamani, H.; Mirjalili, S. Binary Starling Murmuration Optimizer Algorithm to Select Effective Features from Medical Data. Appl. Sci. 2022, 13, 564. [Google Scholar] [CrossRef]
  44. Nadimi-Shahraki, M.H.; Fatahi, A.; Zamani, H.; Mirjalili, S. Binary Approaches of Quantum-Based Avian Navigation Optimizer to Select Effective Features from High-Dimensional Medical Data. Mathematics 2022, 10, 2770. [Google Scholar] [CrossRef]
  45. Zamani, H.; Nadimi-Shahraki, M.H.; Gandomi, A.H. QANA: Quantum-based avian navigation optimizer algorithm. Eng. Appl. Artif. Intell. 2021, 104, 104314. [Google Scholar] [CrossRef]
  46. Wadhawan, S.; Maini, R. EBPSO: Enhanced binary particle swarm optimization for cardiac disease classification with feature selection. Expert Syst. 2022, 39, e13002. [Google Scholar] [CrossRef]
  47. Braik, M.; Hammouri, A.; Atwan, J.; Al-Betar, M.A.; Awadallah, M.A. White Shark Optimizer: A novel bio-inspired meta-heuristic algorithm for global optimization problems. Knowl.-Based Syst. 2022, 243, 108457. [Google Scholar] [CrossRef]
  48. Braik, M.S. Chameleon Swarm Algorithm: A bio-inspired optimizer for solving engineering design problems. Expert Syst. Appl. 2021, 174, 114685. [Google Scholar] [CrossRef]
  49. Bao, G.; Mao, K. Particle swarm optimization algorithm with asymmetric time varying acceleration coefficients. In Proceedings of the 2009 IEEE International Conference on Robotics and Biomimetics (ROBIO), Guilin, China, 19–23 December 2009; pp. 2134–2139. [Google Scholar]
  50. Braik, M.; Sheta, A.; Turabieh, H.; Alhiary, H. A novel lifetime scheme for enhancing the convergence performance of salp swarm algorithm. Soft Comput. 2021, 25, 181–206. [Google Scholar] [CrossRef]
  51. Kennedy, J.; Eberhart, R.C. A discrete binary version of the particle swarm algorithm. In Proceedings of the 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, Orlando, FL, USA, 12–15 October 1997; Volume 5, pp. 4104–4108. [Google Scholar]
  52. Mirjalili, S.; Lewis, A. S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm Evol. Comput. 2013, 9, 1–14. [Google Scholar] [CrossRef]
  53. Zhang, X.; Wu, C.; Li, J.; Wang, X.; Yang, Z.; Lee, J.M.; Jung, K.H. Binary artificial algae algorithm for multidimensional knapsack problems. Appl. Soft Comput. 2016, 43, 583–595. [Google Scholar] [CrossRef]
  54. Taghian, S.; Nadimi-Shahraki, M.H.; Zamani, H. Comparative analysis of transfer function-based binary Metaheuristic algorithms for feature selection. In Proceedings of the 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Turkey, 28–30 September 2018; pp. 1–6. [Google Scholar]
  55. Mirjalili, S.; Zhang, H.; Mirjalili, S.; Chalup, S.; Noman, N. A novel U-shaped transfer function for binary particle swarm optimisation. In Soft Computing for Problem Solving 2019: Proceedings of SocProS 2019; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1, pp. 241–259. [Google Scholar]
  56. Rashedi, E.; Nezamabadi-Pour, H.; Saryazdi, S. BGSA: Binary gravitational search algorithm. Nat. Comput. 2010, 9, 727–745. [Google Scholar] [CrossRef]
  57. Nadimi-Shahraki, M.H.; Banaie-Dezfouli, M.; Zamani, H.; Taghian, S.; Mirjalili, S. B-MFO: A binary moth-flame optimization for feature selection from medical datasets. Computers 2021, 10, 136. [Google Scholar] [CrossRef]
  58. Mostafa, R.R.; Gaheen, M.A.; Abd ElAziz, M.; Al-Betar, M.A.; Ewees, A.A. An improved gorilla troops optimizer for global optimization problems and feature selection. Knowl.-Based Syst. 2023, 269, 110462. [Google Scholar] [CrossRef]
  59. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary ant lion approaches for feature selection. Neurocomputing 2016, 213, 54–65. [Google Scholar] [CrossRef]
  60. Abualigah, L.; Diabat, A. Chaotic binary group search optimizer for feature selection. Expert Syst. Appl. 2022, 192, 116368. [Google Scholar] [CrossRef]
  61. Simon, D. Biogeography-based optimization. IEEE Trans. Evol. Comput. 2008, 12, 702–713. [Google Scholar] [CrossRef]
  62. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015, 89, 228–249. [Google Scholar] [CrossRef]
  63. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
  64. Rao, R.V.; Savsani, V.J.; Vakharia, D. Teaching–learning-based optimization: An optimization method for continuous non-linear large scale problems. Inf. Sci. 2012, 183, 1–15. [Google Scholar] [CrossRef]
  65. Hashim, F.A.; Houssein, E.H.; Hussain, K.; Mabrouk, M.S.; Al-Atabany, W. Honey Badger Algorithm: New metaheuristic algorithm for solving optimization problems. Math. Comput. Simul. 2022, 192, 84–110. [Google Scholar] [CrossRef]
  66. Viktorin, A.; Pluhacek, M.; Senkerik, R. Success-history based adaptive differential evolution algorithm with multi-chaotic framework for parent selection performance on CEC2014 benchmark set. In Proceedings of the 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, Canada, 24–29 July 2016; pp. 4797–4803. [Google Scholar]
  67. Morales-Castañeda, B.; Zaldivar, D.; Cuevas, E.; Fausto, F.; Rodríguez, A. A better balance in metaheuristic algorithms: Does it exist? Swarm Evol. Comput. 2020, 54, 100671. [Google Scholar] [CrossRef]
  68. Yang, X.S.; Deb, S.; Fong, S. Metaheuristic algorithms: Optimal balance of intensification and diversification. Appl. Math. Inf. Sci. 2014, 8, 977. [Google Scholar] [CrossRef]
  69. Yang, X.S.; Deb, S.; Hanne, T.; He, X. Attraction and diffusion in nature-inspired optimization algorithms. Neural Comput. Appl. 2019, 31, 1987–1994. [Google Scholar] [CrossRef]
  70. Cheng, S.; Shi, Y.; Qin, Q.; Zhang, Q.; Bai, R. Population diversity maintenance in brain storm optimization algorithm. J. Artif. Intell. Soft Comput. Res. 2014, 4, 83–97. [Google Scholar] [CrossRef]
  71. García, S.; Molina, D.; Lozano, M.; Herrera, F. A study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: A case study on the CEC’2005 special session on real parameter optimization. J. Heuristics 2009, 15, 617–644. [Google Scholar] [CrossRef]
Figure 1. A flowchart describing the basic sand cat swarm optimization algorithm [7].
Figure 2. Proposed functions for $W_1$ and $W_2$ in the proposed BMSCSO: (a) $W_1$, (b) $W_2$.
Figure 3. Various kinds of transfer functions.
Figure 4. The flowchart of the proposed BMSCSO, where Equations (10), (11), (16) and (17) are illustrated above.
Figure 5. The convergence characteristic curves of the proposed BMSCSO, the basic BSCSO, and the other rival algorithms on the 21 studied datasets.
Figure 6. Performance of the proposed BMSCSO over 100 iterations describing the balance assessment provided by Equations (35) and (36).
Table 2. Medical benchmark datasets.

Number | Dataset | Number of Features | Number of Instances | Number of Classes
1 | Diagnostic | 30 | 569 | 2
2 | BreastEW | 30 | 596 | 2
3 | Prognostic | 33 | 194 | 2
4 | Coimbra | 9 | 115 | 2
5 | Retinopathy | 19 | 1151 | 2
6 | ILPD-Liver | 10 | 583 | 2
7 | Lymphography | 18 | 148 | 4
8 | Parkinsons | 22 | 194 | 2
9 | ParkinsonC | 753 | 755 | 2
10 | SPECT | 22 | 267 | 2
11 | Cleveland | 13 | 297 | 5
12 | HeartEW | 13 | 270 | 2
13 | Hepatitis | 18 | 79 | 2
14 | SAHeart | 9 | 461 | 2
15 | Spectfheart | 43 | 266 | 2
16 | Thyroid0387 | 21 | 7200 | 3
17 | Heart | 13 | 302 | 5
18 | Pima-diabetes | 9 | 768 | 2
19 | Leukemia | 7129 | 72 | 2
20 | Colon | 2000 | 62 | 2
21 | Prostate_GE | 5966 | 102 | 2
Table 3. Parameter settings of the competing FS methods.

Algorithm | Parameter | Value
All algorithms | Population size, Number of iterations | 30, 100
BBBO | Immigration probability bounds for each gene | [0, 1]
 | Habitat modification probability | 1
 | Immigration rates for each island and maximum migration | 1
 | Mutation probability | 0
 | Step size for numerical integration of probabilities | 1
BMFO | Logarithmic spiral | 0.75
 | Convergence constant | [−1, −2]
BPSO | Cognitive factor, Social factor | 1.8, 2.0
 | Inertia weight | [0.9, 0.4]
BTLBO | No particular parameters | Population size is 5 as this method has two phases
BAFT | $\alpha_0$, $\alpha_1$, $\beta_0$, $\beta_1$ | 1.0, 2.0, 0.1, 2.0
 | $T_{d0}$, s, L | 2.0, 2.0, 100
BLSHADE | Scaling factor $M_F$ | 0.5
 | $M_{CR}$ | 0.5
 | $P_{best}$ | 0.11
 | Archive rate, memory size | 1.4, 5
BHBA | $\beta$, C | 6.0, 2.0
BMSCSO | Sensitivity range (rG) | [2, 0]
 | Phases control range (R) | [−2rG, 2rG]
 | Maximum sensitivity range (S) | 2
Table 4. Comparison results between the proposed BMSCSO and other FS methods based on classification accuracy.

Dataset | Measure | BSCSO | BMSCSO | BBBO | BMFO | BPSO | BTLBO | BAFT | BLSHADE | BHBA
Diagnostic | AV | 0.9439 | 0.9825 | 0.9634 | 0.9431 | 0.8575 | 0.8044 | 0.9540 | 0.9699 | 0.9823
 | SD | 0.0863 | 0.0000 | 0.0086 | 0.0731 | 0.1112 | 0.1044 | 0.0074 | 0.0101 | 0.0000
Breast | AV | 0.9857 | 0.9931 | 0.9730 | 0.9191 | 0.8502 | 0.7331 | 0.9824 | 0.9868 | 0.9926
 | SD | 0.0000 | 0.0064 | 0.0035 | 0.1213 | 0.1558 | 0.1661 | 0.0066 | 0.0033 | 0.0000
Prognostic | AV | 0.8737 | 0.8947 | 0.8184 | 0.7754 | 0.7211 | 0.7018 | 0.8895 | 0.8947 | 0.8947
 | SD | 0.0372 | 0.0471 | 0.0144 | 0.0669 | 0.0506 | 0.0348 | 0.0118 | 0.0000 | 0.0000
Coimbra | AV | 0.8939 | 1.0000 | 0.8783 | 0.7971 | 0.6522 | 0.6014 | 0.8174 | 0.8261 | 0.8261
 | SD | 0.1611 | 0.0000 | 0.0289 | 0.1258 | 0.2157 | 0.1827 | 0.0194 | 0.0000 | 0.0000
Retinopathy | AV | 0.7391 | 0.7391 | 0.7077 | 0.7236 | 0.6306 | 0.6225 | 0.6939 | 0.7139 | 0.7417
 | SD | 0.0000 | 0.0000 | 0.0108 | 0.0359 | 0.0511 | 0.0236 | 0.0196 | 0.0200 | 0.0024
ILPD | AV | 0.7490 | 0.7492 | 0.7371 | 0.7130 | 0.6899 | 0.6896 | 0.7374 | 0.7461 | 0.7739
 | SD | 0.0171 | 0.0151 | 0.0118 | 0.0549 | 0.0266 | 0.0284 | 0.0167 | 0.0113 | 0.0000
Lymphography | AV | 0.9333 | 0.9467 | 0.8575 | 0.7908 | 0.7333 | 0.7310 | 0.5389 | 0.5594 | 0.6402
 | SD | 0.0298 | 0.0471 | 0.0150 | 0.0646 | 0.0701 | 0.0668 | 0.0415 | 0.0437 | 0.0208
Parkinsons | AV | 0.9900 | 0.9900 | 0.9974 | 0.9718 | 0.9231 | 0.9094 | 0.9744 | 0.9744 | 0.9744
 | SD | 0.0224 | 0.0224 | 0.0103 | 0.0483 | 0.0476 | 0.0385 | 0.0000 | 0.0000 | 0.0000
ParkinsonC | AV | 0.7973 | 0.8271 | 0.6956 | 0.6916 | 0.6826 | 0.6770 | 0.7483 | 0.7550 | 0.7550
 | SD | 0.0304 | 0.0102 | 0.0021 | 0.0118 | 0.0087 | 0.0057 | 0.0094 | 0.0000 | 0.0000
SPECT | AV | 0.8060 | 0.8496 | 0.7377 | 0.7006 | 0.6252 | 0.6088 | 0.7585 | 0.7849 | 0.8377
 | SD | 0.0921 | 0.0379 | 0.0181 | 0.0445 | 0.0573 | 0.0615 | 0.0207 | 0.0215 | 0.0169
Cleveland | AV | 0.8067 | 0.8200 | 0.6808 | 0.6661 | 0.6113 | 0.5910 | 0.5729 | 0.5763 | 0.5763
 | SD | 0.0506 | 0.0548 | 0.0100 | 0.0500 | 0.0477 | 0.0419 | 0.0076 | 0.0000 | 0.0000
HeartEW | AV | 0.8889 | 0.9259 | 0.8395 | 0.8463 | 0.7414 | 0.6796 | 0.8667 | 0.8741 | 0.9630
 | SD | 0.0370 | 0.0000 | 0.0301 | 0.0567 | 0.1028 | 0.0672 | 0.0275 | 0.0304 | 0.0000
Hepatitis | AV | 1.0000 | 1.0000 | 1.0000 | 0.9708 | 0.9208 | 0.8792 | 0.8750 | 0.8750 | 0.9250
 | SD | 0.0000 | 0.0000 | 0.0000 | 0.0426 | 0.0769 | 0.0675 | 0.0000 | 0.0000 | 0.0280
Saheart | AV | 0.7826 | 0.7957 | 0.7246 | 0.6953 | 0.6054 | 0.6022 | 0.7109 | 0.7304 | 0.7609
 | SD | 0.0000 | 0.0424 | 0.0096 | 0.0491 | 0.0476 | 0.0531 | 0.0182 | 0.0179 | 0.0000
Spectfheart | AV | 0.8584 | 0.8875 | 0.8516 | 0.8063 | 0.7597 | 0.7497 | 0.8679 | 0.8717 | 0.9094
 | SD | 0.0252 | 0.0300 | 0.0155 | 0.0470 | 0.0547 | 0.0364 | 0.0267 | 0.0207 | 0.0084
Thyroid | AV | 0.9797 | 0.9686 | 0.9751 | 0.9737 | 0.9542 | 0.9401 | 0.9800 | 0.9836 | 0.9875
 | SD | 0.0182 | 0.0214 | 0.0036 | 0.0171 | 0.0192 | 0.0105 | 0.0023 | 0.0008 | 0.0010
Heart | AV | 0.8963 | 0.9037 | 0.8233 | 0.8222 | 0.7161 | 0.6633 | 0.7867 | 0.8067 | 0.8500
 | SD | 0.0310 | 0.0203 | 0.0230 | 0.0552 | 0.0864 | 0.0683 | 0.0361 | 0.0190 | 0.0000
PimaDiabetes | AV | 0.8156 | 0.8260 | 0.7712 | 0.7495 | 0.7004 | 0.7192 | 0.7765 | 0.7974 | 0.7974
 | SD | 0.0116 | 0.0531 | 0.0000 | 0.0300 | 0.0679 | 0.0490 | 0.0286 | 0.0000 | 0.0000
Leukemia | AV | 1.0000 | 1.0000 | 0.9952 | 0.9524 | 0.9238 | 0.9262 | 0.9286 | 0.9429 | 0.9857
 | SD | 0.0000 | 0.0000 | 0.0181 | 0.0433 | 0.0261 | 0.0130 | 0.0000 | 0.0319 | 0.0319
Colon | AV | 0.8714 | 0.9333 | 0.7222 | 0.7222 | 0.6139 | 0.5500 | 0.8333 | 0.8667 | 0.9167
 | SD | 0.0726 | 0.0913 | 0.0505 | 0.0911 | 0.0742 | 0.0603 | 0.0000 | 0.0456 | 0.0000
ProstateGE | AV | 1.0000 | 1.0000 | 0.8517 | 0.8117 | 0.7700 | 0.7517 | 0.9000 | 0.9000 | 0.9000
 | SD | 0.0000 | 0.0000 | 0.0091 | 0.0468 | 0.0466 | 0.0404 | 0.0000 | 0.0000 | 0.0000
Table 5. Comparison results between the proposed BMSCSO and other methods based on fitness values.

Dataset | Measure | BSCSO | BMSCSO | BBBO | BMFO | BPSO | BTLBO | BAFT | BLSHADE | BHBA
Diagnostic | AV | 0.0802 | 0.0134 | 0.0413 | 0.0599 | 0.1446 | 0.0926 | 0.0512 | 0.0354 | 0.0217
 | SD | 0.1494 | 0.0006 | 0.0088 | 0.0725 | 0.1102 | 0.0049 | 0.0080 | 0.0103 | 0.0004
Breast | AV | 0.0182 | 0.0176 | 0.0308 | 0.0832 | 0.1524 | 0.0494 | 0.0217 | 0.0179 | 0.0107
 | SD | 0.0000 | 0.0015 | 0.0033 | 0.1204 | 0.1542 | 0.0034 | 0.0073 | 0.0039 | 0.0005
Prognostic | AV | 0.1495 | 0.1399 | 0.1848 | 0.2266 | 0.2802 | 0.1895 | 0.1148 | 0.1089 | 0.1080
 | SD | 0.0110 | 0.0138 | 0.0141 | 0.0661 | 0.0500 | 0.0143 | 0.0118 | 0.0004 | 0.0006
Coimbra | AV | 0.2373 | 0.1347 | 0.1253 | 0.2049 | 0.3492 | 0.1234 | 0.1879 | 0.1797 | 0.1762
 | SD | 0.1439 | 0.0000 | 0.0282 | 0.1242 | 0.2137 | 0.0289 | 0.0190 | 0.0009 | 0.0015
Retinopathy | AV | 0.2749 | 0.2745 | 0.2951 | 0.2777 | 0.3700 | 0.2953 | 0.3083 | 0.2884 | 0.2606
 | SD | 0.0059 | 0.0062 | 0.0110 | 0.0359 | 0.0503 | 0.0219 | 0.0194 | 0.0198 | 0.0027
ILPD | AV | 0.2733 | 0.2650 | 0.2651 | 0.2870 | 0.3115 | 0.2614 | 0.2640 | 0.2556 | 0.2268
 | SD | 0.0132 | 0.0159 | 0.0123 | 0.0542 | 0.0263 | 0.0156 | 0.0174 | 0.0113 | 0.0000
Lymphography | AV | 0.1401 | 0.1343 | 0.1465 | 0.2108 | 0.2680 | 0.1439 | 0.4621 | 0.4409 | 0.3613
 | SD | 0.0242 | 0.0160 | 0.0144 | 0.0639 | 0.0694 | 0.0083 | 0.0418 | 0.0432 | 0.0204
Parkinsons | AV | 0.1160 | 0.1109 | 0.0078 | 0.0313 | 0.0798 | 0.0056 | 0.0305 | 0.0296 | 0.0280
 | SD | 0.0133 | 0.0108 | 0.0098 | 0.0475 | 0.0467 | 0.0048 | 0.0005 | 0.0010 | 0.0006
ParkinsonC | AV | 0.2448 | 0.2415 | 0.3068 | 0.3108 | 0.3191 | 0.3049 | 0.2554 | 0.2484 | 0.2476
 | SD | 0.0049 | 0.0087 | 0.0021 | 0.0116 | 0.0086 | 0.0027 | 0.0091 | 0.0009 | 0.0005
SPECT | AV | 0.1960 | 0.1810 | 0.2656 | 0.3012 | 0.3760 | 0.2558 | 0.2453 | 0.2187 | 0.1662
 | SD | 0.0337 | 0.0158 | 0.0179 | 0.0438 | 0.0566 | 0.0151 | 0.0201 | 0.0203 | 0.0173
Cleveland | AV | 0.3514 | 0.3337 | 0.3211 | 0.3342 | 0.3894 | 0.3059 | 0.4287 | 0.4243 | 0.4226
 | SD | 0.0506 | 0.0551 | 0.0100 | 0.0493 | 0.0476 | 0.0238 | 0.0078 | 0.0015 | 0.0008
HeartEW | AV | 0.1006 | 0.1077 | 0.1631 | 0.1562 | 0.2601 | 0.1457 | 0.1366 | 0.1288 | 0.0411
 | SD | 0.0159 | 0.0374 | 0.0295 | 0.0563 | 0.1018 | 0.0273 | 0.0279 | 0.0306 | 0.0006
Hepatitis | AV | 0.0638 | 0.0772 | 0.0036 | 0.0313 | 0.0818 | 0.0033 | 0.1289 | 0.1270 | 0.0778
 | SD | 0.0005 | 0.0285 | 0.0007 | 0.0420 | 0.0757 | 0.0006 | 0.0004 | 0.0007 | 0.0268
Saheart | AV | 0.2524 | 0.2840 | 0.2792 | 0.3068 | 0.3958 | 0.2812 | 0.2911 | 0.2709 | 0.2401
 | SD | 0.0024 | 0.0490 | 0.0093 | 0.0481 | 0.0470 | 0.0130 | 0.0177 | 0.0181 | 0.0000
Spectfheart | AV | 0.1404 | 0.1367 | 0.1526 | 0.1969 | 0.2425 | 0.1439 | 0.1373 | 0.1321 | 0.0959
 | SD | 0.0647 | 0.0298 | 0.0150 | 0.0466 | 0.0542 | 0.0166 | 0.0263 | 0.0195 | 0.0078
Thyroid | AV | 0.0324 | 0.0475 | 0.0297 | 0.0299 | 0.0493 | 0.0252 | 0.0152 | 0.0144 | 0.0144
 | SD | 0.0198 | 0.0243 | 0.0039 | 0.0168 | 0.0187 | 0.0038 | 0.0030 | 0.0015 | 0.0008
Heart | AV | 0.1180 | 0.0915 | 0.1793 | 0.1791 | 0.2855 | 0.1656 | 0.2174 | 0.1974 | 0.1531
 | SD | 0.0410 | 0.0078 | 0.0229 | 0.0543 | 0.0857 | 0.0217 | 0.0362 | 0.0182 | 0.0000
PimaDiabetes | AV | 0.2972 | 0.2589 | 0.2320 | 0.2526 | 0.3011 | 0.2320 | 0.2283 | 0.2058 | 0.2056
 | SD | 0.0150 | 0.0458 | 0.0007 | 0.0289 | 0.0665 | 0.0006 | 0.0294 | 0.0006 | 0.0000
Leukemia | AV | 0.1040 | 0.0756 | 0.0101 | 0.0527 | 0.0803 | 0.0095 | 0.0771 | 0.0615 | 0.0182
 | SD | 0.0001 | 0.0387 | 0.0178 | 0.0429 | 0.0258 | 0.0129 | 0.0001 | 0.0316 | 0.0308
Colon | AV | 0.3679 | 0.3845 | 0.2803 | 0.2806 | 0.3871 | 0.2498 | 0.1712 | 0.1369 | 0.0878
 | SD | 0.0452 | 0.0452 | 0.0498 | 0.0902 | 0.0734 | 0.0380 | 0.0001 | 0.0451 | 0.0005
ProstateGE | AV | 0.1836 | 0.1840 | 0.1522 | 0.1919 | 0.2326 | 0.1981 | 0.1052 | 0.1039 | 0.1015
 | SD | 0.0439 | 0.0657 | 0.0091 | 0.0463 | 0.0461 | 0.0125 | 0.0000 | 0.0000 | 0.0000
Table 6. Comparison results between the proposed BMSCSO and other methods based on sensitivity results.

Dataset | Measure | BSCSO | BMSCSO | BBBO | BMFO | BPSO | BTLBO | BAFT | BLSHADE | BHBA
Diagnostic | AV | 0.9541 | 0.9667 | 0.8916 | 0.8728 | 0.7251 | 0.6186 | 0.8654 | 0.8900 | 0.8821
 | SD | 0.0574 | 0.0254 | 0.0522 | 0.1332 | 0.1973 | 0.2016 | 0.0352 | 0.0302 | 0.0425
Breast | AV | 0.9863 | 0.9919 | 0.9680 | 0.9294 | 0.8796 | 0.7830 | 0.9574 | 0.9630 | 0.9684
 | SD | 0.0074 | 0.0090 | 0.0157 | 0.0982 | 0.1190 | 0.1305 | 0.0100 | 0.0074 | 0.0055
Prognostic | AV | 0.1044 | 0.1200 | 0.1648 | 0.1855 | 0.1843 | 0.1024 | 0.8441 | 0.9027 | 0.8675
 | SD | 0.1005 | 0.1255 | 0.1457 | 0.1541 | 0.1996 | 0.1079 | 0.1584 | 0.1636 | 0.1244
Coimbra | AV | 0.5924 | 0.7390 | 0.6909 | 0.6764 | 0.6275 | 0.5525 | 0.7123 | 0.6504 | 0.5610
 | SD | 0.2239 | 0.1020 | 0.1006 | 0.1669 | 0.2032 | 0.1849 | 0.1610 | 0.0532 | 0.1652
Retinopathy | AV | 0.5998 | 0.6308 | 0.6336 | 0.6654 | 0.6189 | 0.6315 | 0.5936 | 0.5978 | 0.5546
 | SD | 0.0534 | 0.0520 | 0.0396 | 0.0402 | 0.0692 | 0.0456 | 0.0412 | 0.0510 | 0.0635
ILPD | AV | 0.8176 | 0.8144 | 0.8178 | 0.8423 | 0.8272 | 0.8148 | 0.7736 | 0.8247 | 0.8083
 | SD | 0.0460 | 0.0444 | 0.0563 | 0.0593 | 0.0444 | 0.0398 | 0.0470 | 0.0564 | 0.0558
Lymphography | AV | 0.7312 | 0.8938 | 0.6368 | 0.4850 | 0.4678 | 0.4644 | 0.6351 | 0.7109 | 0.6299
 | SD | 0.0853 | 0.4123 | 0.4262 | 0.4370 | 0.4217 | 0.4200 | 0.1329 | 0.1093 | 0.1989
Parkinsons | AV | 0.8905 | 0.8886 | 0.9532 | 0.9536 | 0.9418 | 0.9397 | 0.9202 | 0.9018 | 0.9152
 | SD | 0.0281 | 0.0306 | 0.0327 | 0.0384 | 0.0514 | 0.0549 | 0.0468 | 0.0304 | 0.0415
ParkinsonC | AV | 0.8766 | 0.8723 | 0.8829 | 0.8753 | 0.8941 | 0.8786 | 0.8228 | 0.8692 | 0.8325
 | SD | 0.0260 | 0.0349 | 0.0310 | 0.0265 | 0.0254 | 0.0405 | 0.0588 | 0.0488 | 0.0320
SPECT | AV | 0.5553 | 0.6457 | 0.5240 | 0.5479 | 0.5370 | 0.5332 | 0.5868 | 0.6298 | 0.5478
 | SD | 0.1781 | 0.0908 | 0.1500 | 0.1135 | 0.1160 | 0.1149 | 0.0932 | 0.1735 | 0.0550
Cleveland | AV | 0.2808 | 0.1225 | 0.1532 | 0.1423 | 0.1361 | 0.1076 | 0.6028 | 0.6554 | 0.6196
 | SD | 0.1017 | 0.1034 | 0.1304 | 0.1482 | 0.1159 | 0.0845 | 0.0358 | 0.0609 | 0.0524
HeartEW | AV | 0.8138 | 0.8423 | 0.8115 | 0.8458 | 0.7613 | 0.7134 | 0.7699 | 0.9172 | 0.8189
 | SD | 0.0810 | 0.0547 | 0.0776 | 0.0789 | 0.1014 | 0.1084 | 0.0571 | 0.0400 | 0.0548
Hepatitis | AV | 0.5000 | 0.4867 | 0.3428 | 0.4067 | 0.3428 | 0.2613 | 0.8155 | 0.8997 | 0.8795
 | SD | 0.5000 | 0.2805 | 0.3961 | 0.3533 | 0.3857 | 0.3748 | 0.4183 | 0.3419 | 0.2236
Saheart | AV | 0.3126 | 0.3574 | 0.2988 | 0.2959 | 0.3094 | 0.3151 | 0.7174 | 0.7461 | 0.7296
 | SD | 0.0578 | 0.0564 | 0.0687 | 0.0740 | 0.0841 | 0.1016 | 0.1262 | 0.1126 | 0.1299
Spectfheart | AV | 0.8472 | 0.8724 | 0.8686 | 0.8516 | 0.8573 | 0.8541 | 0.8622 | 0.8520 | 0.8661
 | SD | 0.0413 | 0.0433 | 0.0661 | 0.0640 | 0.0507 | 0.0676 | 0.0610 | 0.0650 | 0.0683
Thyroid | AV | 0.7437 | 0.7433 | 0.7313 | 0.7663 | 0.6516 | 0.5339 | 0.9264 | 0.9594 | 0.9508
 | SD | 0.1370 | 0.1287 | 0.0987 | 0.1167 | 0.2175 | 0.2018 | 0.0490 | 0.0500 | 0.0491
Heart | AV | 0.8558 | 0.8911 | 0.9161 | 0.9210 | 0.7931 | 0.6613 | 0.7336 | 0.7908 | 0.8146
 | SD | 0.0458 | 0.0343 | 0.0643 | 0.0531 | 0.1258 | 0.1071 | 0.0578 | 0.0361 | 0.0692
PimaDiabetes | AV | 0.4932 | 0.5172 | 0.5713 | 0.5385 | 0.4867 | 0.5499 | 0.7224 | 0.7524 | 0.7467
 | SD | 0.0517 | 0.1006 | 0.0648 | 0.0760 | 0.1191 | 0.0768 | 0.0856 | 0.0504 | 0.0633
Leukemia | AV | 0.7343 | 0.7371 | 0.6652 | 0.6947 | 0.7117 | 0.6912 | 0.8719 | 0.9514 | 0.9459
 | SD | 0.2453 | 0.1731 | 0.2172 | 0.1998 | 0.2443 | 0.2167 | 0.1404 | 0.2191 | 0.2453
Colon | AV | 0.5990 | 0.4200 | 0.5695 | 0.6211 | 0.5490 | 0.6391 | 0.8284 | 0.8349 | 0.8284
 | SD | 0.2469 | 0.2387 | 0.2815 | 0.2605 | 0.2224 | 0.2160 | 0.3140 | 0.2033 | 0.1394
ProstateGE | AV | 0.8616 | 0.8588 | 0.8947 | 0.8704 | 0.8365 | 0.8605 | 0.8497 | 0.8845 | 0.8793
 | SD | 0.0529 | 0.0578 | 0.1002 | 0.1351 | 0.1433 | 0.1080 | 0.0806 | 0.1019 | 0.0966
Table 7. Comparison results between the proposed BMSCSO and other methods based on specificity results.

| Dataset | Measure | BSCSO | BMSCSO | BBBO | BMFO | BPSO | BTLBO | BAFT | BLSHADE | BHBA |
|---|---|---|---|---|---|---|---|---|---|---|
| Diagnostic | AV | 0.8289 | 0.8811 | 0.9685 | 0.9515 | 0.9279 | 0.9233 | 0.9288 | 0.9257 | 0.9361 |
| | SD | 0.0574 | 0.0256 | 0.0251 | 0.0453 | 0.0652 | 0.0538 | 0.0209 | 0.0259 | 0.0147 |
| Breast | AV | 0.9792 | 0.9876 | 0.9792 | 0.9070 | 0.8194 | 0.6768 | 0.9485 | 0.9649 | 0.9578 |
| | SD | 0.0074 | 0.0073 | 0.0168 | 0.1493 | 0.2046 | 0.2120 | 0.0153 | 0.0083 | 0.0078 |
| Prognostic | AV | 0.9307 | 0.9361 | 0.8936 | 0.9132 | 0.9144 | 0.9145 | 0.8736 | 0.8739 | 0.8879 |
| | SD | 0.1005 | 0.0267 | 0.0629 | 0.0613 | 0.0545 | 0.0461 | 0.0472 | 0.0652 | 0.0718 |
| Coimbra | AV | 0.7985 | 0.9305 | 0.7138 | 0.6919 | 0.6906 | 0.6575 | 0.7918 | 0.7942 | 0.7355 |
| | SD | 0.2239 | 0.0875 | 0.1458 | 0.1589 | 0.1546 | 0.1644 | 0.0708 | 0.2436 | 0.0517 |
| Retinopathy | AV | 0.6816 | 0.6921 | 0.7056 | 0.7036 | 0.6521 | 0.6562 | 0.6886 | 0.7106 | 0.7210 |
| | SD | 0.0534 | 0.0200 | 0.0507 | 0.0446 | 0.0767 | 0.0466 | 0.0236 | 0.0241 | 0.0570 |
| ILPD | AV | 0.3155 | 0.3637 | 0.3159 | 0.3018 | 0.3173 | 0.3084 | 0.7084 | 0.7518 | 0.7307 |
| | SD | 0.0460 | 0.1307 | 0.0807 | 0.0951 | 0.0781 | 0.0849 | 0.0739 | 0.0887 | 0.0659 |
| Lymphography | AV | 0.7726 | 0.8019 | 0.7888 | 0.7302 | 0.7244 | 0.7021 | 0.5795 | 0.6132 | 0.6111 |
| | SD | 0.0853 | 0.0691 | 0.0606 | 0.0739 | 0.0917 | 0.0863 | 0.1025 | 0.0357 | 0.1065 |
| Parkinsons | AV | 0.6738 | 0.8181 | 0.6713 | 0.7042 | 0.5804 | 0.5687 | 0.9243 | 0.9353 | 0.9313 |
| | SD | 0.0281 | 0.1476 | 0.1233 | 0.1438 | 0.1764 | 0.1812 | 0.2560 | 0.1006 | 0.1495 |
| ParkinsonC | AV | 0.2897 | 0.3670 | 0.2928 | 0.2861 | 0.2457 | 0.2878 | 0.7138 | 0.7569 | 0.7399 |
| | SD | 0.0260 | 0.1372 | 0.0729 | 0.0710 | 0.0697 | 0.0734 | 0.0829 | 0.0819 | 0.0455 |
| SPECT | AV | 0.6458 | 0.6926 | 0.7533 | 0.7093 | 0.7328 | 0.7322 | 0.7794 | 0.8036 | 0.7645 |
| | SD | 0.1781 | 0.0525 | 0.1296 | 0.1181 | 0.1362 | 0.1270 | 0.0494 | 0.0186 | 0.1047 |
| Cleveland | AV | 0.6426 | 0.6398 | 0.6235 | 0.6421 | 0.5891 | 0.5693 | 0.6164 | 0.6433 | 0.6258 |
| | SD | 0.1017 | 0.0634 | 0.0841 | 0.0575 | 0.0944 | 0.0685 | 0.0307 | 0.0722 | 0.0508 |
| HeartEW | AV | 0.7669 | 0.8253 | 0.7012 | 0.7096 | 0.6334 | 0.5675 | 0.7831 | 0.8037 | 0.7867 |
| | SD | 0.0810 | 0.0601 | 0.0847 | 0.1002 | 0.1465 | 0.0897 | 0.0797 | 0.0764 | 0.0847 |
| Hepatitis | AV | 0.9571 | 0.9004 | 0.9582 | 0.9060 | 0.9168 | 0.9039 | 0.7960 | 0.7480 | 0.7932 |
| | SD | 0.5000 | 0.1157 | 0.0497 | 0.1989 | 0.1876 | 0.2496 | 0.0654 | 0.2000 | 0.0000 |
| Saheart | AV | 0.8259 | 0.7961 | 0.8285 | 0.8205 | 0.7908 | 0.7817 | 0.9269 | 0.9243 | 0.9126 |
| | SD | 0.0578 | 0.0570 | 0.0595 | 0.0576 | 0.0612 | 0.0527 | 0.0747 | 0.0579 | 0.0544 |
| Spectfheart | AV | 0.4639 | 0.5331 | 0.3894 | 0.4133 | 0.3614 | 0.3972 | 0.7921 | 0.8238 | 0.8448 |
| | SD | 0.0413 | 0.1231 | 0.1665 | 0.1698 | 0.1359 | 0.1965 | 0.1433 | 0.2457 | 0.1828 |
| Thyroid | AV | 0.9703 | 0.9644 | 0.9821 | 0.9769 | 0.9614 | 0.9493 | 0.9513 | 0.9617 | 0.9621 |
| | SD | 0.1370 | 0.0225 | 0.0049 | 0.0170 | 0.0188 | 0.0126 | 0.0054 | 0.0055 | 0.0028 |
| Heart | AV | 0.6959 | 0.7730 | 0.6317 | 0.5990 | 0.5956 | 0.5728 | 0.7662 | 0.7615 | 0.7524 |
| | SD | 0.0458 | 0.0483 | 0.1095 | 0.1788 | 0.1342 | 0.0971 | 0.2024 | 0.0612 | 0.1418 |
| PimaDiabetes | AV | 0.7957 | 0.8329 | 0.8223 | 0.8233 | 0.8015 | 0.8087 | 0.7843 | 0.8148 | 0.7796 |
| | SD | 0.0517 | 0.0849 | 0.0456 | 0.0326 | 0.0477 | 0.0489 | 0.0824 | 0.0526 | 0.0431 |
| Leukemia | AV | 0.9560 | 0.9556 | 0.9776 | 0.9661 | 0.9459 | 0.9880 | 0.9546 | 0.9665 | 0.9653 |
| | SD | 0.2453 | 0.0609 | 0.0458 | 0.0491 | 0.0736 | 0.0368 | 0.0373 | 0.0994 | 0.0000 |
| Colon | AV | 0.7631 | 0.7451 | 0.8553 | 0.8610 | 0.8504 | 0.8956 | 0.8657 | 0.9168 | 0.8478 |
| | SD | 0.2469 | 0.1518 | 0.1472 | 0.1281 | 0.1323 | 0.1210 | 0.1087 | 0.0786 | 0.1337 |
| ProstateGE | AV | 0.8588 | 0.8891 | 0.8273 | 0.8556 | 0.8287 | 0.8385 | 0.8659 | 0.8643 | 0.8683 |
| | SD | 0.0529 | 0.0578 | 0.0966 | 0.0991 | 0.1323 | 0.1061 | 0.1251 | 0.1911 | 0.1218 |
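The sensitivity and specificity figures in Tables 6 and 7 follow the standard confusion-matrix definitions, sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP). A minimal sketch of the binary-class computation is given below (scikit-learn's 2 × 2 confusion matrix unpacks in the order tn, fp, fn, tp); for multi-class datasets such as Lymphography, the reported values would presumably be per-class figures averaged across classes, which the tables do not break down.

```python
from sklearn.metrics import confusion_matrix

def sensitivity_specificity(y_true, y_pred):
    """Binary-case sensitivity (true-positive rate) and
    specificity (true-negative rate) from the confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return tp / (tp + fn), tn / (tn + fp)

# e.g. sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]) -> (0.5, 1.0)
```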
Table 8. Comparison results between the proposed BMSCSO and other methods based on the average number of selected features.

| Dataset | Measure | BSCSO | BMSCSO | BBBO | BMFO | BPSO | BTLBO | BAFT | BLSHADE | BHBA |
|---|---|---|---|---|---|---|---|---|---|---|
| Diagnostic | AV | 15.20 | 14.400 | 15.70 | 11.03 | 11.00 | 19.43 | 17.40 | 17.40 | 13.00 |
| | SD | 2.59 | 1.82 | 3.80 | 1.61 | 2.42 | 2.93 | 3.21 | 2.51 | 1.22 |
| Breast | AV | 3.60 | 3.00 | 4.07 | 3.17 | 4.17 | 6.23 | 4.20 | 4.80 | 3.40 |
| | SD | 0.00 | 1.34 | 0.78 | 0.91 | 1.18 | 1.52 | 1.31 | 1.30 | 0.55 |
| Prognostic | AV | 15.20 | 12.20 | 17.17 | 14.53 | 13.90 | 21.63 | 18.40 | 16.00 | 12.80 |
| | SD | 2.59 | 1.79 | 2.96 | 2.21 | 2.80 | 2.13 | 2.88 | 1.41 | 1.92 |
| Coimbra | AV | 4.40 | 5.00 | 4.27 | 3.67 | 4.37 | 5.77 | 6.40 | 6.80 | 3.60 |
| | SD | 0.89 | 0.00 | 0.74 | 0.76 | 1.27 | 1.41 | 0.89 | 0.84 | 1.34 |
| Retinopathy | AV | 8.80 | 9.60 | 10.83 | 7.73 | 8.03 | 12.77 | 10.00 | 9.80 | 9.40 |
| | SD | 1.79 | 1.14 | 2.60 | 1.31 | 2.14 | 1.94 | 2.65 | 1.30 | 1.14 |
| ILPD | AV | 3.80 | 3.60 | 4.87 | 2.93 | 4.47 | 6.10 | 4.00 | 4.20 | 3.00 |
| | SD | 1.79 | 0.55 | 1.43 | 0.78 | 1.07 | 1.69 | 1.41 | 1.30 | 0.00 |
| Lymphography | AV | 6.40 | 8.20 | 9.73 | 6.60 | 7.17 | 11.43 | 10.20 | 8.40 | 9.20 |
| | SD | 2.19 | 2.86 | 1.72 | 1.35 | 2.07 | 1.79 | 2.28 | 0.55 | 1.30 |
| Parkinsons | AV | 9.40 | 9.40 | 11.67 | 7.50 | 8.00 | 13.87 | 11.20 | 9.20 | 5.80 |
| | SD | 1.52 | 1.82 | 2.43 | 1.36 | 1.98 | 1.93 | 1.10 | 2.17 | 1.30 |
| ParkinsonC | AV | 412.80 | 463.40 | 409.03 | 413.43 | 365.07 | 490.30 | 471.20 | 439.80 | 379.00 |
| | SD | 41.88 | 11.04 | 57.48 | 14.45 | 14.21 | 14.24 | 13.86 | 64.52 | 34.38 |
| SPECT | AV | 12.00 | 12.00 | 13.07 | 10.70 | 10.80 | 13.67 | 13.60 | 12.60 | 9.20 |
| | SD | 2.00 | 2.12 | 1.46 | 2.35 | 2.52 | 2.71 | 1.95 | 2.30 | 2.28 |
| Cleveland | AV | 6.20 | 7.40 | 6.67 | 4.73 | 5.90 | 8.90 | 7.60 | 6.20 | 4.00 |
| | SD | 0.84 | 2.30 | 1.75 | 1.14 | 1.58 | 1.47 | 1.52 | 1.92 | 1.00 |
| HeartEW | AV | 6.80 | 6.60 | 5.53 | 5.23 | 5.30 | 8.17 | 6.00 | 5.40 | 5.80 |
| | SD | 1.48 | 2.61 | 1.41 | 1.52 | 1.49 | 2.05 | 9.80 | 6.20 | 6.80 |
| Hepatitis | AV | 3.40 | 5.40 | 6.87 | 4.60 | 6.47 | 12.27 | 1.73 | 0.89 | 0.84 |
| | SD | 0.89 | 1.67 | 1.33 | 1.00 | 2.75 | 2.53 | 0.84 | 1.30 | 1.64 |
| Saheart | AV | 4.40 | 3.80 | 5.90 | 4.63 | 4.70 | 5.80 | 4.40 | 3.60 | 3.00 |
| | SD | 2.19 | 1.79 | 0.71 | 0.72 | 1.18 | 1.06 | 0.89 | 1.82 | 0.00 |
| Spectfheart | AV | 26.00 | 20.59 | 24.80 | 22.70 | 20.67 | 28.73 | 29.00 | 22.40 | 27.40 |
| | SD | 2.24 | 5.10 | 4.15 | 3.21 | 2.40 | 3.06 | 1.41 | 7.70 | 2.79 |
| Thyroid | AV | 10.60 | 8.00 | 10.53 | 8.10 | 8.27 | 13.37 | 12.40 | 10.20 | 8.40 |
| | SD | 1.87 | 2.30 | 1.96 | 1.60 | 2.05 | 1.71 | 1.95 | 1.79 | 0.55 |
| Heart | AV | 5.20 | 4.20 | 5.70 | 4.03 | 5.73 | 8.57 | 8.00 | 7.80 | 6.00 |
| | SD | 1.64 | 0.84 | 1.49 | 0.96 | 1.36 | 1.77 | 1.58 | 3.27 | 0.00 |
| PimaDiabetes | AV | 5.20 | 4.80 | 4.40 | 3.63 | 3.60 | 5.03 | 5.60 | 4.20 | 4.00 |
| | SD | 0.45 | 1.64 | 0.56 | 0.96 | 1.07 | 1.40 | 0.89 | 0.45 | 0.00 |
| Leukemia | AV | 3498.60 | 3586.00 | 3789.33 | 3898.57 | 3462.80 | 4588.53 | 4487.20 | 3500.00 | 2899.80 |
| | SD | 76.31 | 51.43 | 487.92 | 35.00 | 40.27 | 52.02 | 49.09 | 17.68 | 1157.29 |
| Colon | AV | 988.20 | 997.80 | 1064.47 | 1119.33 | 971.37 | 1300.20 | 1249.40 | 985.00 | 1051.80 |
| | SD | 21.42 | 9.23 | 140.25 | 26.95 | 20.16 | 20.23 | 11.72 | 23.72 | 98.62 |
| ProstateGE | AV | 3210.60 | 3442.40 | 3195.90 | 3267.93 | 2935.20 | 3889.83 | 3728.00 | 2904.40 | 1512.20 |
| | SD | 301.29 | 411.16 | 390.89 | 59.76 | 33.83 | 26.71 | 19.66 | 7.37 | 23.73 |
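The averages in Table 8 are simply the sizes of the binary masks that each optimizer converges to. In BSCSO and BMSCSO, a continuous sand-cat position is mapped to such a mask through the S-shaped transfer function described earlier; the following is a minimal sketch of that binarization step, assuming the standard sigmoid form of the transfer (the exact variant used is the one defined in the methodology section).

```python
import numpy as np

rng = np.random.default_rng(42)

def s_transfer_binarize(position):
    """Map a continuous position vector to a binary feature mask:
    bit j is set when a uniform draw falls below S(x_j) = 1/(1 + e^-x_j)."""
    s = 1.0 / (1.0 + np.exp(-position))
    return rng.random(position.shape) < s

mask = s_transfer_binarize(np.array([-2.0, 0.0, 3.0]))
print(mask, mask.sum())   # mask.sum() is the "number of selected features"
```

Averaging mask.sum() over the independent runs of each algorithm yields exactly the kind of per-dataset figures reported above.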
Table 9. Classification accuracy of BMSCSO using different values of τ1 and τ0 with the k-NN classifier.

| Dataset | Measure | τ0 = 0.5 | τ0 = 1.0 | τ0 = 1.5 | τ0 = 2.0 | τ1 = 0.5 | τ1 = 1.0 | τ1 = 1.5 | τ1 = 2.0 |
|---|---|---|---|---|---|---|---|---|---|
| Diagnostic | AV | 0.78 | 0.81 | 0.80 | 0.94 | 0.73 | 0.95 | 0.87 | 0.65 |
| | SD | 0.002 | 0.002 | 0.007 | 0.003 | 0.003 | 0.003 | 0.002 | 0.003 |
| Breast | AV | 0.81 | 0.86 | 0.83 | 0.98 | 0.71 | 0.99 | 0.8414 | 0.64 |
| | SD | 0.003 | 0.003 | 0.003 | 0.001 | 0.004 | 0.007 | 0.003 | 0.003 |
| Prognostic | AV | 0.71 | 0.75 | 0.78 | 0.83 | 0.67 | 0.88 | 0.74 | 0.663 |
| | SD | 0.003 | 0.003 | 0.003 | 0.002 | 0.007 | 0.009 | 0.004 | 0.007 |
| Coimbra | AV | 0.74 | 0.706 | 0.744 | 0.921 | 0.60 | 0.940 | 0.81 | 0.6391 |
| | SD | 0.0037 | 0.0034 | 0.0035 | 0.0022 | 0.0043 | 0.0024 | 0.0029 | 0.0045 |
| Retinopathy | AV | 0.56 | 0.636 | 0.623 | 0.745 | 0.511 | 0.71 | 0.627 | 0.50 |
| | SD | 0.0038 | 0.0038 | 0.0032 | 0.0039 | 0.0038 | 0.0038 | 0.0032 | 0.0037 |
| ILPD | AV | 0.62 | 0.68 | 0.68 | 0.77 | 0.57 | 0.74 | 0.61 | 0.5 |
| | SD | 0.0121 | 0.0114 | 0.0114 | 0.0100 | 0.0219 | 0.0123 | 0.0201 | 0.0223 |
| Lymphography | AV | 0.67 | 0.80 | 0.87 | 0.94 | 0.72 | 0.912 | 0.88 | 0.6954 |
| | SD | 0.0209 | 0.0209 | 0.0174 | 0.0211 | 0.0207 | 0.0205 | 0.0174 | 0.0203 |
| Parkinsons | AV | 0.790 | 0.874 | 0.85 | 0.973 | 0.74 | 0.97 | 0.86 | 0.662 |
| | SD | 0.0132 | 0.0132 | 0.0110 | 0.0133 | 0.0131 | 0.0130 | 0.0110 | 0.0128 |
| ParkinsonC | AV | 0.674 | 0.678 | 0.781 | 0.82 | 0.66 | 0.869 | 0.75 | 0.55 |
| | SD | 0.0236 | 0.0236 | 0.0197 | 0.0239 | 0.0234 | 0.0232 | 0.0197 | 0.0230 |
| SPECT | AV | 0.78 | 0.74 | 0.75 | 0.87 | 0.69 | 0.899 | 0.71 | 0.650 |
| | SD | 0.0114 | 0.0114 | 0.0095 | 0.0115 | 0.0113 | 0.0112 | 0.0095 | 0.0111 |
| Cleveland | AV | 0.41 | 0.51 | 0.54 | 0.60 | 0.42 | 0.66 | 0.52 | 0.4267 |
| | SD | 0.0037 | 0.0037 | 0.0031 | 0.0038 | 0.0037 | 0.0037 | 0.0031 | 0.0036 |
| HeartEW | AV | 0.707 | 0.768 | 0.782 | 0.953 | 0.656 | 0.89 | 0.75 | 0.643 |
| | SD | 0.0056 | 0.0056 | 0.0047 | 0.0057 | 0.0056 | 0.0055 | 0.0047 | 0.0055 |
| Hepatitis | AV | 0.75 | 0.84 | 0.8 | 0.94 | 0.72 | 0.935 | 0.83 | 0.65 |
| | SD | 0.0284 | 0.0284 | 0.0238 | 0.0287 | 0.0281 | 0.0279 | 0.0237 | 0.0276 |
| Saheart | AV | 0.58 | 0.56 | 0.64 | 0.705 | 0.55 | 0.602 | 0.6902 | 0.50 |
| | SD | 0.0241 | 0.0238 | 0.0228 | 0.0198 | 0.0281 | 0.0198 | 0.0187 | 0.0309 |
| Spectfheart | AV | 0.734 | 0.78 | 0.73 | 0.98 | 0.67 | 0.92 | 0.802 | 0.69 |
| | SD | 0.0137 | 0.0137 | 0.0114 | 0.0138 | 0.0135 | 0.0134 | 0.0114 | 0.0133 |
| Thyroid | AV | 0.71 | 0.8322 | 0.86 | 0.99 | 0.79 | 0.98 | 0.86 | 0.66 |
| | SD | 0.0010 | 0.0010 | 0.0008 | 0.0010 | 0.0009 | 0.0009 | 0.0008 | 0.0009 |
| Heart | AV | 0.77 | 0.70 | 0.74 | 0.80 | 0.65 | 0.8 | 0.703 | 0.69 |
| | SD | 0.0192 | 0.0192 | 0.0160 | 0.0194 | 0.0190 | 0.0188 | 0.0160 | 0.0186 |
| PimaDiabetes | AV | 0.690 | 0.68 | 0.68 | 0.71 | 0.56 | 0.73 | 0.65 | 0.66 |
| | SD | 0.0238 | 0.0229 | 0.218 | 0.0187 | 0.0252 | 0.0189 | 0.0210 | 0.0278 |
| Leukemia | AV | 0.81 | 0.850 | 0.88 | 0.99 | 0.76 | 0.99 | 0.87 | 0.72 |
| | SD | 0.0045 | 0.0038 | 0.0030 | 0.0035 | 0.0009 | 0.0023 | 0.0013 | 0.0018 |
| Colon | AV | 0.76 | 0.77 | 0.79 | 0.91 | 0.69 | 0.97 | 0.88 | 0.64 |
| | SD | 0.0043 | 0.0040 | 0.0038 | 0.0012 | 0.0048 | 0.0015 | 0.0033 | 0.0056 |
| ProstateGE | AV | 0.73 | 0.766 | 0.79 | 0.89 | 0.69 | 0.8 | 0.79 | 0.620 |
| | SD | 0.006 | 0.003 | 0.003 | 0.0093 | 0.003 | 0.009 | 0.002 | 0.0066 |
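The precise roles of τ0 and τ1 are defined in the methodology section and are not restated here. If, as in many binary metaheuristics, they scale the slope of the S-shaped transfer, their effect on the probability of flipping a bit can be illustrated as follows; this is an illustrative sketch only, and s_shape with its τ parameter is an assumption rather than the paper's exact operator.

```python
import numpy as np

def s_shape(x, tau=1.0):
    """Sigmoid transfer with slope parameter tau: a larger tau sharpens
    the curve, making bit selection more deterministic when |x| >> 0."""
    return 1.0 / (1.0 + np.exp(-tau * x))

for tau in (0.5, 1.0, 1.5, 2.0):                 # the grid examined in Table 9
    print(tau, s_shape(np.array([-2.0, 0.0, 2.0]), tau).round(3))
```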
Table 10. Statistical comparison between the presented BMSCSO and other competing methods based on the mean accuracy results.

| Algorithm | Rank |
|---|---|
| BSCSO | 2.857142 |
| BMSCSO | 1.714285 |
| BBBO | 4.785714 |
| BMFO | 6.119047 |
| BPSO | 7.714285 |
| BTLBO | 8.523809 |
| BAFT | 5.690476 |
| BLSHADE | 4.500000 |
| BHBA | 3.095238 |
Table 11. Results of Holm’s test between the control algorithm (i.e., BMSCSO) and all other algorithms in accordance with the results of Friedman’s test.

| i | Algorithm | z = (R0 − Ri)/SE | p-Value | α/i | Hypothesis |
|---|---|---|---|---|---|
| 8 | BTLBO | 8.057137 | 7.810195E-16 | 0.006250 | Rejected |
| 7 | BPSO | 7.099295 | 1.253942E-12 | 0.007142 | Rejected |
| 6 | BMFO | 5.211784 | 1.870326E-07 | 0.008333 | Rejected |
| 5 | BAFT | 4.704692 | 2.542494E-06 | 0.010000 | Rejected |
| 4 | BBBO | 3.634163 | 2.788841E-04 | 0.012500 | Rejected |
| 3 | BLSHADE | 3.296101 | 9.803655E-04 | 0.016666 | Rejected |
| 2 | BHBA | 1.633964 | 0.102266 | 0.025000 | Not rejected |
| 1 | BSCSO | 1.352246 | 0.176296 | 0.050000 | Not rejected |
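Tables 10 and 11 follow the usual Friedman-plus-Holm protocol: Friedman mean ranks are computed over the N = 21 datasets, and Holm's step-down test then compares each method against the control using z = |R0 − Ri|/SE with SE = sqrt(k(k + 1)/(6N)) for k = 9 methods (≈ 0.8452 here). The following minimal sketch reproduces the z-scores, two-sided p-values, and α/i thresholds of Table 11 from the ranks in Table 10.

```python
import math
from scipy.stats import norm

def holm_vs_control(ranks, control, n_datasets=21, alpha=0.05):
    """Holm step-down post-hoc after a Friedman test: compare each method's
    mean rank with the control's. Strict Holm stops rejecting at the first
    non-rejection; the per-row check below matches Tables 11 and 13, whose
    rejections are monotone."""
    k = len(ranks)
    se = math.sqrt(k * (k + 1) / (6.0 * n_datasets))
    r0 = ranks[control]
    rows = sorted(((name, (r - r0) / se) for name, r in ranks.items()
                   if name != control), key=lambda t: -abs(t[1]))
    for i, (name, z) in zip(range(k - 1, 0, -1), rows):
        p = 2.0 * norm.sf(abs(z))                    # two-sided p-value
        verdict = "Rejected" if p < alpha / i else "Not rejected"
        print(f"{i}  {name:8s} z={z:.6f}  p={p:.6e}  alpha/i={alpha/i:.6f}  {verdict}")

# Mean accuracy ranks from Table 10 (control: BMSCSO)
acc_ranks = {"BSCSO": 2.857142, "BMSCSO": 1.714285, "BBBO": 4.785714,
             "BMFO": 6.119047, "BPSO": 7.714285, "BTLBO": 8.523809,
             "BAFT": 5.690476, "BLSHADE": 4.500000, "BHBA": 3.095238}
holm_vs_control(acc_ranks, control="BMSCSO")
```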
Table 12. Statistical comparison between the proposed BMSCSO and all other rivals in reference to the average fitness outcomes.

| Method | Rank |
|---|---|
| BSCSO | 5.142857 |
| BMSCSO | 4.095238 |
| BBBO | 4.976190 |
| BMFO | 6.380952 |
| BPSO | 8.476190 |
| BTLBO | 3.833333 |
| BAFT | 5.571428 |
| BLSHADE | 4.119047 |
| BHBA | 2.404761 |
Table 13. Results of Holm’s test between the control method and all other methods in reference to Friedman’s test results.

| i | Method | z = (R0 − Ri)/SE | p-Value | α/i | Hypothesis |
|---|---|---|---|---|---|
| 8 | BPSO | 7.183811 | 6.779448E-13 | 0.00625 | Rejected |
| 7 | BMFO | 4.704692 | 2.542494E-06 | 0.007142 | Rejected |
| 6 | BAFT | 3.746850 | 1.790687E-04 | 0.008333 | Rejected |
| 5 | BSCSO | 3.239757 | 0.001196 | 0.010000 | Rejected |
| 4 | BBBO | 3.042555 | 0.002345 | 0.012500 | Rejected |
| 3 | BLSHADE | 2.028370 | 0.042522 | 0.016666 | Rejected |
| 2 | BMSCSO | 2.000198 | 0.045478 | 0.025000 | Rejected |
| 1 | BTLBO | 1.690308 | 0.090968 | 0.050000 | Not rejected |
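Feeding the same sketch the Table 12 ranks, with BHBA (the best-ranked method on fitness) as the control, reproduces the z and p values of Table 13:

```python
# Mean fitness ranks from Table 12 (control: BHBA, the lowest mean rank)
fit_ranks = {"BSCSO": 5.142857, "BMSCSO": 4.095238, "BBBO": 4.976190,
             "BMFO": 6.380952, "BPSO": 8.476190, "BTLBO": 3.833333,
             "BAFT": 5.571428, "BLSHADE": 4.119047, "BHBA": 2.404761}
holm_vs_control(fit_ranks, control="BHBA")
```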