Mathematics
  • Article
  • Open Access

3 November 2021

Boosting Atomic Orbit Search Using Dynamic-Based Learning for Feature Selection

1 School of Cyber Science & Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
2 Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
3 Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates
4 Department of Artificial Intelligence Science & Engineering, Galala University, Galala 44011, Egypt
This article belongs to the Section E: Applied Mathematics

Abstract

Feature selection (FS) is a well-known preprocessing step in soft computing and machine learning algorithms. It plays a critical role in different real-world applications since it aims to determine the relevant features and remove the irrelevant ones. This process reduces the time and space complexity of the learning technique used to handle the collected data. FS methods based on metaheuristic (MH) techniques have demonstrated better performance than conventional FS methods. So, in this paper, we present a modified version of a recent MH technique, Atomic Orbital Search (AOS), as an FS technique. The modification uses the dynamic opposite-based learning (DOL) strategy, which enhances the ability of AOS to explore the search domain by increasing the diversity of the solutions during the search process and by updating the search domain. A set of eighteen datasets has been used to evaluate the efficiency of the developed FS approach, named AOSD, and the results of AOSD are compared with those of other MH methods. The results show that AOSD can reduce the number of features while preserving or increasing the classification accuracy better than the other MH techniques.

1. Introduction

Data have become the backbone of different fields and domains in recent decades, such as artificial intelligence, data science, and data mining. The vast increase in data volumes produced by the web, sensors, and different systems has raised considerable problems. High dimensionality and large data size particularly affect machine learning classification techniques, increasing the computational cost and decreasing the classification accuracy [1,2,3]. To address these challenges, Dimensionality Reduction (DR) techniques can be employed [4,5,6]. There are two main types of DR: feature selection (FS) and feature extraction (FE). FS methods remove noisy, irrelevant, and redundant data, which also improves classifier performance. In general, FS techniques select a subset of the data that captures the characteristics of the whole dataset. To do so, two main types of FS, called filter and wrapper methods, have been widely used. Wrapper methods leverage learning classifiers to evaluate the chosen features, whereas filter methods leverage the characteristics of the original data. Filter methods can be considered more efficient than wrapper methods [7]. FS techniques are used in various domains, for example, big data analysis [8], text classification [9], chemical applications [10], speech emotion recognition [11], neuromuscular disorders [12], hand gesture recognition [13], COVID-19 CT image classification [14], and many other topics [15].
FS is a complex optimization process with two objectives: minimizing the number of selected features and minimizing the error rate (i.e., maximizing the classification accuracy). Therefore, metaheuristic (MH) optimization algorithms have been widely employed for different FS applications, such as differential evolution (DE) [16], genetic algorithm (GA) [17], particle swarm optimization (PSO) [18], Harris Hawks optimization (HHO) algorithm [7], salp swarm algorithm (SSA) [19], grey wolf optimizer [20], butterfly optimization algorithm [21], multi-verse optimizer (MVO) algorithm [22], krill herd algorithm [23], moth-flame optimization (MFO) algorithm [24], Henry gas solubility optimization (HGS) algorithm [25], and many other MH optimization algorithms [26,27].
In the same context, Atomic Orbital Search (AOS) [28] has been proposed as a metaheuristic technique that belongs to the physics-based category. AOS simulates the laws of quantum mechanics and the quantum-based atomic model, in which electrons are arranged around the nucleus in discrete orbitals. According to its characteristics, AOS has been applied to different applications such as global optimization [28]. In [29], AOS was used to find the optimal solution to various engineering problems. Despite these advantages, AOS suffers from some limitations, such as being attracted to local optima, which degrades the convergence rate. This motivated us to provide an improved version of AOS.
The enhanced AOS depends on using the dynamic opposite-based learning (DOL) strategy to improve the exploration and maintain the diversity of solutions during the search process. DOL is used in this study since it has several properties that enhance the performance of different MH techniques. For example, it has been applied to improve the performance of the antlion optimizer in [30], and this modification was applied to solve the CEC 2014 and CEC 2017 benchmark problems. In [31], the SCA was enhanced using DOL, and the developed method was applied to the problem of designing plate-fin heat exchangers. In [32], the flexible job scheduling problem was solved using a modified version of the grasshopper optimization algorithm (GOA) based on DOL. An enhanced teaching–learning-based optimization (TLBO) has also been presented using DOL and applied to the CEC 2014 benchmark functions.
The main contributions of this study are:
  • We propose an alternative feature selection method that improves the behavior of Atomic Orbital Search (AOS).
  • We use the dynamic opposite-based learning to enhance the exploration and maintain the diversity of solutions during the searching process.
  • We compare the performance of the developed AOSD with other MH techniques using different datasets.
The other sections of this study are organized as follows. Section 2 presents the related works and Section 3 introduces the background of AOS and DOL. The developed method is introduced in Section 4. Section 5 introduces the experiment results and the discussion of the experiments using different FS datasets. The conclusion and future works are presented in Section 6.

3. Background

3.1. Atomic Orbital Search

The AOS is a newly developed optimization method [28] inspired by the laws of quantum mechanics, in which electrons are arranged around the nucleus in discrete orbitals. The mathematical representation of the AOS is given as follows.
The AOS algorithm uses several solutions ($X$) as shown in Equation (1), and each solution ($X_i$) holds several decision variables ($x_i^j$).

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_i \\ \vdots \\ X_N \end{bmatrix} = \begin{bmatrix} x_1^1 & x_1^2 & \cdots & x_1^j & \cdots & x_1^D \\ x_2^1 & x_2^2 & \cdots & x_2^j & \cdots & x_2^D \\ \vdots & \vdots & & \vdots & & \vdots \\ x_i^1 & x_i^2 & \cdots & x_i^j & \cdots & x_i^D \\ \vdots & \vdots & & \vdots & & \vdots \\ x_N^1 & x_N^2 & \cdots & x_N^j & \cdots & x_N^D \end{bmatrix}, \quad i = 1, 2, \ldots, N, \; j = 1, 2, \ldots, D \tag{1}$$
where N represents the number of used solutions, and D indicates the dimension length of the tested problem.
The first solutions are randomly initialized using Equation (2).
$$x_i^j = x_{i,\min}^j + rand \times (x_{i,\max}^j - x_{i,\min}^j), \tag{2}$$
where $x_i^j$ is the $j$th decision variable of the $i$th solution, and $x_{i,\min}^j$ and $x_{i,\max}^j$ denote the lower and upper bounds of that variable, respectively.
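As a concrete illustration, the random initialization of Equation (2) can be sketched in Python; the function name and the NumPy-based vectorization are our own choices, not taken from the paper:

```python
import numpy as np

def init_population(n_solutions, dim, lower, upper, rng=None):
    """Place N candidate solutions uniformly at random inside [lower, upper]^D,
    following Eq. (2): x = L + rand * (U - L). `lower`/`upper` may be scalars
    or per-dimension arrays."""
    rng = np.random.default_rng(rng)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + rng.random((n_solutions, dim)) * (upper - lower)
```

Each row of the returned array is one solution $X_i$ of Equation (1).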
The objective function values of the different solutions are stored in an energy vector, as presented in Equation (3).

$$E = \begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_i \\ \vdots \\ E_N \end{bmatrix} \tag{3}$$

where $E$ represents the vector of objective values, and $E_i$ refers to the energy level of the solution number $i$.
The positions of the solutions follow the electron probability density around the nucleus, modeled using a Probability Density Function (PDF). Based on this distribution, each imaginary layer (IL) contains several solutions. The positions $X^k$ and objective values $E^k$ of the solutions in the imaginary layers are given as follows:
$$X^k = \begin{bmatrix} X_1^k \\ X_2^k \\ \vdots \\ X_i^k \\ \vdots \\ X_p^k \end{bmatrix} = \begin{bmatrix} x_1^1 & x_1^2 & \cdots & x_1^j & \cdots & x_1^D \\ x_2^1 & x_2^2 & \cdots & x_2^j & \cdots & x_2^D \\ \vdots & \vdots & & \vdots & & \vdots \\ x_i^1 & x_i^2 & \cdots & x_i^j & \cdots & x_i^D \\ \vdots & \vdots & & \vdots & & \vdots \\ x_p^1 & x_p^2 & \cdots & x_p^j & \cdots & x_p^D \end{bmatrix}, \quad i = 1, 2, \ldots, p, \; j = 1, 2, \ldots, D, \tag{4}$$
$$E^k = \begin{bmatrix} E_1^k \\ E_2^k \\ \vdots \\ E_p^k \end{bmatrix}, \quad k = 1, 2, \ldots, n \tag{5}$$
where $X_i^k$ is the solution number $i$ in the imaginary layer (IL) number $k$, $n$ represents the number of ILs, $p$ indicates the number of solutions in the $k$th IL, and $E_i^k$ represents the objective value of the solution number $i$ in the IL number $k$.
Then, the binding state and binding energy are defined for each IL by averaging the positions and objective values of all solutions in that layer. The mathematical representation of this scheme is given as:
$$BS^k = \frac{\sum_{i=1}^{p} X_i^k}{p} \tag{6}$$

$$BE^k = \frac{\sum_{i=1}^{p} E_i^k}{p} \tag{7}$$
In Equations (6) and (7), $BS^k$ and $BE^k$ denote the binding state and binding energy of the layer number $k$, respectively, while $X_i^k$ and $E_i^k$ stand for the position and fitness value of the solution number $i$ in the $k$th layer.
Likewise, the binding state and binding energy of the whole atom are defined by averaging the positions and objective values of all solutions as follows:
$$BS = \frac{\sum_{i=1}^{N} X_i}{N} \tag{8}$$

$$BE = \frac{\sum_{i=1}^{N} E_i}{N} \tag{9}$$
where $BS$ and $BE$ are the binding state and binding energy of the atom.
The energy level $E_i^k$ of each solution $X_i^k$ in an IL is compared with the binding energy of that layer, $BE^k$. If the energy of the current solution is greater than or equal to the binding energy of its layer (i.e., $E_i^k \geq BE^k$), photon emission takes place. In this case, the solution emits a photon, with a cost of energy controlled by $\beta$ and $\gamma$, and moves simultaneously toward the binding state of the atom ($BS$) and the position of the electron with the lowest energy level ($LE$) in the atom. The updating process is formulated as:
$$X_{i+1}^k = X_i^k + \alpha_i \times (\beta_i \times LE - \gamma_i \times BS), \quad k = 1, 2, \ldots, n, \; i = 1, 2, \ldots, p \tag{10}$$
In Equation (10), $X_i^k$ and $X_{i+1}^k$ denote the current and updated positions of individual $i$ in the $k$th layer, and $\alpha_i$, $\beta_i$, and $\gamma_i$ are random vectors.
If the energy of a solution in a particular layer is smaller than the binding energy of that layer (i.e., $E_i^k < BE^k$), photon absorption takes place. The position update in this case is presented as follows:
$$X_{i+1}^k = X_i^k + \alpha_i \times (\beta_i \times LE^k - \gamma_i \times BS^k) \tag{11}$$
In addition, a random number $\phi \in [0, 1]$ is generated for each individual; if it is smaller than the photon rate $PR$ (i.e., $\phi < PR$), the interaction with photons is considered infeasible. In this case, the movement of particles between the layers around the nucleus is modeled, and the position update is given as follows:
$$X_{i+1}^k = X_i^k + r_i \tag{12}$$
where r i is a vector of random numbers.
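The three update rules above (Equations (10)–(12)) can be sketched for one imaginary layer as follows; the function signature, argument names, and the choice of fresh per-solution random vectors are illustrative assumptions of this sketch, not the authors' implementation:

```python
import numpy as np

def aos_layer_update(X_k, E_k, BE_k, LE, BS, LE_k, BS_k, PR=0.1, rng=None):
    """One AOS position update for the p solutions of a single imaginary layer.
    X_k: (p, D) positions; E_k: (p,) objective values; BE_k: binding energy of
    the layer; LE, BS: lowest-energy position and binding state of the whole
    atom; LE_k, BS_k: the same quantities for this layer; PR: photon rate."""
    rng = np.random.default_rng(rng)
    p, D = X_k.shape
    X_new = np.empty_like(X_k)
    for i in range(p):
        if rng.random() < PR:
            # Photon interaction infeasible: random move near the nucleus, Eq. (12)
            X_new[i] = X_k[i] + rng.random(D)
        elif E_k[i] >= BE_k:
            # Photon emission toward the atom's LE and BS, Eq. (10)
            a, b, g = rng.random(D), rng.random(D), rng.random(D)
            X_new[i] = X_k[i] + a * (b * LE - g * BS)
        else:
            # Photon absorption toward the layer's LE^k and BS^k, Eq. (11)
            a, b, g = rng.random(D), rng.random(D), rng.random(D)
            X_new[i] = X_k[i] + a * (b * LE_k - g * BS_k)
    return X_new
```

A full AOS iteration would partition the population into layers via the PDF, compute the binding quantities of Equations (6)–(9), and call such an update per layer.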

3.2. Dynamic-Opposite Learning

This subsection presents the primary steps of the Dynamic-Opposite Learning (DOL) approach. We begin with the conventional Opposition-Based Learning (OBL) approach [48], which is used in this paper to enhance the performance of the proposed method. The OBL approach creates an opposite solution for each existing solution; it attempts to find better solutions and thereby accelerate convergence.
The opposite ($X^o$) of a given real number $X \in [L, U]$ can be calculated as follows.
$$X^o = U + L - X \tag{13}$$
Opposite point [49]: Suppose that $X = [X_1, X_2, \ldots, X_{Dim}]$ is a point in a $Dim$-dimensional search space, where $X_1, X_2, \ldots, X_{Dim} \in \mathbb{R}$ and $X_j \in [L_j, U_j]$. Then, the opposite point $X^o$ of $X$ is presented as follows:
$$X_j^o = U_j + L_j - X_j, \quad j = 1, \ldots, Dim \tag{14}$$
Then, the better of the two points ($X$ and $X^o$) is kept according to the fitness function values, and the other is neglected. For a minimization problem, if $f(X) \leq f(X^o)$, $X$ is maintained; otherwise, $X^o$ is maintained.
Based on the opposite point, the dynamic opposite value ($X^{DO}$) of $X$ is represented as follows:
$$X^{DO} = X + w \times r_8 \times (r_9 \times X^o - X), \quad w > 0 \tag{15}$$
where $r_8$ and $r_9$ are random values in the range $[0, 1]$, and $w$ is a weighting factor.
Consequently, for a point $X = [X_1, X_2, \ldots, X_{Dim}]$, the dynamic opposite value $X_j^{DO}$ of each dimension is presented as follows:
$$X_j^{DO} = X_j + w \times rand \times (rand \times X_j^o - X_j), \quad w > 0 \tag{16}$$
Accordingly, DOL-based optimization begins by creating the initial solutions $X = (X_1, \ldots, X_{Dim})$ and calculating their dynamic opposite values $X^{DO}$ using Equation (16). Next, based on the fitness values, the better of the two candidates (i.e., $X$ and $X^{DO}$) is kept, and the other is excluded.
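A minimal sketch of the DOL candidate generation (Equations (14) and (16)) is given below; clipping the candidate back into $[L, U]$ is our assumption for keeping solutions feasible, since the paper does not state how out-of-range values are handled:

```python
import numpy as np

def dol_candidate(X, lower, upper, w=1.0, rng=None):
    """Dynamic-opposite candidates of a population X (shape (N, D)):
    x_j^DO = x_j + w * rand * (rand * (U_j + L_j - x_j) - x_j)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    X_opp = upper + lower - X                      # opposite point, Eq. (14)
    r1, r2 = rng.random(X.shape), rng.random(X.shape)
    X_do = X + w * r1 * (r2 * X_opp - X)           # dynamic opposite, Eq. (16)
    return np.clip(X_do, lower, upper)             # assumption: clip to bounds
```

After generation, each $X^{DO}$ competes with its parent $X$ on fitness, as described above.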

4. Developed AOSD Feature Selection Algorithm

To improve the performance of the traditional AOS algorithm and use it as an FS method, we apply dynamic opposite-based learning. The steps of the developed DOL-based AOS are given in Figure 1. These steps form two phases: the first learns the developed method on the training set, while the second assesses the method's performance on the testing set.
Figure 1. Steps of AOSD for FS problem.

4.1. Learning Phase

In this phase, the training set, representing 70% of the input data, is applied to learn the model by selecting the optimal subset of relevant features. The developed AOSD begins by constructing the initial population using the following formula:
$$X_{ij} = rand \times (U_j - L_j) + L_j, \quad i = 1, 2, \ldots, N, \; j = 1, 2, \ldots, N_F \tag{17}$$
In Equation (17), $N_F$ is the number of features (which also represents the problem dimension), and $U$ and $L$ are the upper and lower limits of the search domain. The next process in AOSD is to convert each agent $X_i$ to a binary form $BX_i$, as defined in Equation (18).
$$BX_{ij} = \begin{cases} 1 & \text{if } X_{ij} > 0.5 \\ 0 & \text{otherwise} \end{cases} \tag{18}$$
Thereafter, the fitness value of each $X_i$ is computed, which represents its quality. The fitness value depends on the features selected from the training set:
$$Fit_i = \lambda \times \gamma_i + (1 - \lambda) \times \frac{|BX_i|}{N_F}, \tag{19}$$
where $|BX_i|$ is the number of features corresponding to ones in $BX_i$, and $\gamma_i$ refers to the classification error of a KNN classifier trained on the training set reduced to the features in $BX_i$. The parameter $\lambda$ balances the two objectives of reducing the classification error and reducing the number of selected features.
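The binarization of Equation (18) and the fitness of Equation (19) can be sketched as follows; the handling of an all-zero mask and the pluggable `error_rate` callable (standing in for the paper's KNN evaluation step) are assumptions of this sketch:

```python
import numpy as np

def binarize(x, threshold=0.5):
    """Eq. (18): feature j is selected when the continuous position exceeds 0.5."""
    return (np.asarray(x) > threshold).astype(int)

def fitness(x, error_rate, n_features, lam=0.99):
    """Eq. (19): Fit = lambda * gamma + (1 - lambda) * |BX| / NF.
    `error_rate` is any callable returning the classification error for a
    binary mask (e.g., from a KNN classifier on the reduced training set)."""
    bx = binarize(x)
    if bx.sum() == 0:
        return 1.0          # assumption: no feature selected gets worst fitness
    return lam * error_rate(bx) + (1.0 - lam) * bx.sum() / n_features
```

In practice, `error_rate` would wrap a KNN classifier (K = 5, Euclidean distance, per Section 5.1) evaluated on the columns where the mask is 1.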
The following process applies the DOL, as defined in Equation (16), to each $X_i$ to find $X_i^{DO}$. Then, the best $N$ solutions (those with the smallest fitness values) are selected from $X \cup X^{DO}$. In addition, the best solution $X_b$, with the best fitness $Fit_b$, is determined.
After that, AOSD starts to update the solutions $X$ using the operators of AOS, as discussed in Section 3.1. To maintain the diversity of the solutions $X$, their opposite values are computed using the following formula:
$$X = \begin{cases} X & \text{if } Pr_{DO} > 0.5 \\ X^N & \text{otherwise} \end{cases} \tag{20}$$
where $Pr_{DO}$ is a random probability used to switch between $X$ and $X^N$, and $X^N$ represents the $N$ solutions chosen from $X \cup X^{DOJ}$ based on their fitness values. The value $X_{ij}^{DOJ}$ for each $X_i$ at dimension $j$ is given as:
$$X_{ij}^{DOJ} = X_{ij} + w \times rand \times (rand \times X_{ij}^o - X_{ij}), \quad w > 0 \tag{21}$$
where $X_{ij}^o$ is the opposite value defined in Equation (14). In the developed AOSD, the limits of the search space are updated dynamically using the following formulas:
$$L_j = \min_i(X_{ij}) \tag{22}$$

$$U_j = \max_i(X_{ij}) \tag{23}$$
Thereafter, the terminal conditions are checked; if they are met, the best solution $X_b$ is returned. Otherwise, the updating steps of AOSD are repeated.
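The greedy selection from the union $X \cup X^{DO}$ and the dynamic bound update of Equations (22) and (23) can be sketched as follows (function names are illustrative):

```python
import numpy as np

def select_best(X, X_do, fitness_fn, n):
    """Keep the n best solutions (smallest fitness) from the union of the
    current population X and its dynamic-opposite candidates X_do."""
    pool = np.vstack([X, X_do])
    fits = np.array([fitness_fn(x) for x in pool])
    order = np.argsort(fits)[:n]
    return pool[order], fits[order]

def update_bounds(X):
    """Dynamic search-domain update, Eqs. (22)-(23): per-dimension min/max
    over the current population."""
    return X.min(axis=0), X.max(axis=0)
```

The shrinking bounds concentrate later DOL candidates inside the region currently occupied by the population.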

4.2. Evaluation Phase

In this phase, the best solution $X_b$ is employed to reduce the number of features of the testing set, which represents 30% of the given data. This is performed by selecting only those features corresponding to ones in its binary version $BX_b$ (computed using Equation (18)). Then, the KNN classifier is applied to the reduced testing set, and its predictions are assessed using the performance measures.
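Reducing a dataset with the binary mask $BX_b$ amounts to boolean column indexing; a one-line sketch (the function name is our own):

```python
import numpy as np

def reduce_features(X, bx):
    """Keep only the columns of a dataset whose entry in the binary mask is 1."""
    return np.asarray(X)[:, np.asarray(bx, dtype=bool)]
```

The same reduction is applied to the training set during the learning phase and to the testing set here.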

5. Experimental Results

This section introduces the experimental evaluation of the developed AOSD method. Additionally, extensive comparisons to several existing optimization methods are carried out to verify the performance of the developed AOSD method.

5.1. Experimental Datasets and Parameter Settings

We considered a comprehensive set of eighteen datasets with different categories, including low and high dimensionality, to evaluate the proposed AOSD method. The low-dimensionality datasets are well-known UCI datasets [50]. The properties of the used datasets are given in Table 1, including the number of classes, number of features, and number of samples. It is worth mentioning that the used datasets cover several domains, such as games, biology, biomedicine, and physics.
Table 1. Datasets’ characteristics.
Furthermore, we set up the essential parameters and strategies to evaluate the proposed AOSD method. For example, we use the hold-out strategy as the classification strategy, with 80% and 20% for the training and testing sets, respectively. Moreover, we repeat each experiment for 30 independent runs. The K nearest neighbor (KNN) classifier is adopted with the Euclidean distance metric (K = 5).
In addition, a number of well-known optimization algorithms have been considered for the comparison: Atomic Orbital Search (AOS), arithmetic optimization algorithm (AOA) [51], Marine Predators Algorithm (MPA) [46], Manta ray foraging optimizer (MRFO) [52], Harris Hawks optimization (HHO), Henry gas solubility optimization (HGSO) algorithm, Whale optimization algorithm (WOA), grey wolf optimization (GWO) [53], GA, and BPSO. For all methods, the initial solutions are uniformly distributed, the maximum number of iterations is set to 100, and the population size is 10. In addition, the dimension for each method is fixed to the number of features of the corresponding dataset.

5.2. Performance Measures

We used several evaluation measures to test the proposed AOSD method. The confusion matrix (CM) is described in Table 2; it is used to compute the performance measures of a classifier, including accuracy, specificity, and sensitivity [54].
Table 2. Confusion Matrix.
  • Average accuracy ($AVG_{Acc}$): This measure is the rate of correctly classified data, and it is computed as [22,55,56,57]:

    $$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{24}$$

    Each method is run 30 times ($N_r = 30$); thus, the $AVG_{Acc}$ is computed as:

    $$AVG_{Acc} = \frac{1}{N_r} \sum_{k=1}^{N_r} Acc_{Best_k} \tag{25}$$
  • Average fitness value ($AVG_{Fit}$): This measure is used to assess the performance of an applied algorithm; it combines the classification error rate and the feature selection ratio, as in the following equation [22,55,56,57]:

    $$AVG_{Fit} = \frac{1}{N_r} \sum_{k=1}^{N_r} Fit_{Best_k} \tag{26}$$
  • Average number of selected features ($AVG_{|BX_{Best}|}$): This metric measures the ability of the applied method to reduce the number of features over all runs, and it is computed as [22,55,56,57]:

    $$AVG_{|BX_{Best}|} = \frac{1}{N_r} \sum_{k=1}^{N_r} |BX_{Best_k}| \tag{27}$$

    in which $|BX_{Best_k}|$ represents the cardinality of the selected features for the $k$th run.
  • Average computation time ($AVG_{Time}$): This measure is used to compute the average of the CPU time (s), as in the following equation [22,55,56,57]:

    $$AVG_{Time} = \frac{1}{N_r} \sum_{k=1}^{N_r} Time_{Best_k} \tag{28}$$
  • Standard deviation (STD): STD is employed to assess the quality of each applied method and analyze the achieved results over different runs. It is computed as [22,55,56,57]:

    $$STD_Y = \sqrt{\frac{1}{N_r} \sum_{k=1}^{N_r} \left( Y_{Best_k} - AVG_Y \right)^2} \tag{29}$$

    Note: $STD_Y$ is computed for each metric: accuracy, fitness, time, number of selected features, sensitivity, and specificity.
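The accuracy of Equation (24) and the run-averaging scheme of Equations (25)–(29) can be sketched as follows (function names are our own):

```python
import numpy as np

def accuracy(tp, tn, fp, fn):
    """Eq. (24): Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def avg_and_std(per_run):
    """Eqs. (25)-(29): mean of a per-run metric over the Nr independent runs,
    together with its (population) standard deviation."""
    v = np.asarray(per_run, dtype=float)
    return v.mean(), v.std(ddof=0)
```

The same `avg_and_std` helper applies to any of the per-run metrics (accuracy, fitness, time, number of selected features).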

5.3. Comparisons

In this section, the developed AOSD is evaluated over eighteen well-known datasets. The evaluation uses ten algorithms to compare the performance of the developed AOSD, namely AOS, AOA, MPA, MRFO, HHO, HGSO, WOA, bGWO, GA, and BPSO. Six measures are used: the average, best (minimum), and worst (maximum) fitness values, the number of selected features, the classification accuracy (Acc), and the standard deviation (Std). The values obtained by the compared algorithms are recorded in Table 3, Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9, where a smaller value means a better result, except in Table 8, where a higher value is better; the best values in the tables are shown in boldface.
Table 3. Average of the fitness values for FS approaches.
Table 4. Standard deviation of fitness values for FS approaches.
Table 5. Results of the best fitness function values for FS approaches.
Table 6. Results of the worst fitness values’ results for FS approaches.
Table 7. Selected features numbers for FS approaches.
Table 8. Accuracy results for FS approaches.
Table 9. Friedman rank test results for all methods.
The results of the fitness function values are listed in Table 3, where a smaller fitness value means a better result. This table contains the average fitness of the developed AOSD method and the comparison methods for all datasets. From these results, the AOSD obtained the best results in 6 out of 18 datasets (i.e., S2, S4, S7, S9, S15, and S16); therefore, it took the first rank. The AOA obtained the best values in three datasets (i.e., S3, S8, and S18) and was ranked second, followed by MPA, MRFO, BPSO, and HHO, respectively; the GA showed the worst results. The average makes it possible to analyze the typical behavior of each algorithm on each dataset over the experiments. Figure 2 shows the performance of the AOSD using the average of the fitness functions.
Figure 2. Average of the fitness functions’ values.
Table 4 shows the results of the standard deviation (Std) for all methods. The Std is used here to verify the dispersion of the results over the experiments with different datasets. A low Std represents low dispersion, which means the algorithm is more stable across the experiments. The AOSD showed good stability compared to the other methods, achieving the lowest Std value in 6 out of 18 datasets (i.e., S6, S7, S9, S13, S17, and S18). It was ranked first, followed by BPSO, which showed good stability on the S14, S8, S10, S11, and S15 datasets. In addition, the AOA, MPA, and MRFO also showed good stability. The bGWO and WOA showed the worst Std values in this measure.
In addition, the best fitness values are listed in Table 5. Analyzing the best values obtained by the compared algorithms over all runs for each dataset shows which algorithm can provide the best solution in the best case (i.e., the best run). This table shows that the AOSD obtained the best Min values in 33% of the datasets; it achieved the best Min results on S14, S2, S4, S7, S8, S16, and S18. The HHO and MPA received the best values in this measure in two datasets each, ranking second and third, respectively. All methods obtained the same results on the S4 dataset, except for HGSO and GA. The GA recorded the worst performance in this measure.
In terms of the worst fitness values, Table 6 shows these results. Studying the worst values helps verify that, even in the worst case, some algorithms provide reasonable solutions; it also shows which algorithm performs worst in the worst case. The developed AOSD showed good results compared to the other methods and achieved the best results in 7 out of 18 datasets (i.e., S14, S3, S9, S12, S13, S16, and S18), with competitive results on the other datasets. The AOA achieved the second rank by obtaining the best results in six datasets (i.e., S4, S6, S7, S8, S10, and S17), followed by MPA. The remaining methods were ordered as MRFO, BPSO, AOS, HGSO, HHO, bGWO, WOA, and GA.
Moreover, the number of features selected by each method is recorded in Table 7. In this measure, the best method selects the fewest features while achieving high accuracy. As shown in Table 7, the AOSD reached the second rank by obtaining the lowest number of features in 7 out of 18 datasets, whereas the first rank was taken by the WOA method, which selected the lowest number of features in 8 datasets. The third rank was obtained by HGSO, followed by MRFO, HHO, AOS, bGWO, MPA, AOA, and BPSO, whereas the GA showed the worst performance on all datasets.
In addition, Table 8 illustrates the results of all compared methods in terms of classification accuracy. Accuracy evaluates the correctly predicted data points out of all data points and thus identifies whether an algorithm is outstanding in classification; it is essential in many real applications. In this measure, the developed AOSD showed the best results in 17% of the datasets, classifying them with higher accuracy than the other methods, and it obtained the same accuracy as the other methods in 22% of the datasets. The MRFO was ranked second, followed by MPA, AOA, BPSO, HHO, AOS, HGSO, and bGWO, whereas the lowest accuracy was shown by the WOA method. Figure 3 illustrates the performance of the AOSD based on the average classification accuracy over all datasets.
Figure 3. Average of the classification accuracy among tested datasets.
Moreover, Table 9 records the statistical results of the Friedman rank test, which ranks all methods using both the classification accuracy and the fitness function values. This test studies the statistical differences between the algorithms considering the results of the 30 independent runs on all datasets. From Table 9, we can see that the developed AOSD achieved the first rank in classification accuracy, followed by MRFO, MPA, AOA, BPSO, AOS, and HHO; the WOA was ranked last. In the fitness function, the AOSD came second after the AOA with a slight difference, followed by MPA, BPSO, MRFO, HHO, and HGSO; the GA was ranked last. From these results, we notice that the AOSD showed the best results in accuracy and the second-best results in the fitness function. These results indicate the superiority of the AOSD, since the classification accuracy can be more important than the fitness function value in solving classification problems.
In general, the aforementioned results show that the developed AOSD method showed a noticeable enhancement in solving classification problems by selecting the essential features. The DOL approach improves the performance of the AOS by increasing the ability of the AOS to discover the search domain and save it from getting stuck in a local point.
Furthermore, the results of the AOSD showed its advantages over the compared algorithms by achieving the best fitness function values in 33% of the datasets, whereas the second-ranked HHO method achieved the best values in 16% of the datasets. This result was also observed for the rest of the measures. In addition, comparing the proposed AOSD with its original version AOS in terms of accuracy, the proposed method outperformed the original version in 16 out of 18 datasets and showed similar accuracy in the other two cases. Besides, the proposed method ranked first according to the statistical test (i.e., the Friedman test) for the accuracy measure, which indicates a significant difference between the AOSD and the compared methods at a significance level of 0.05. Based on these results, we will work in the future on increasing the performance of the proposed method by improving its exploitation phase and applying it to different optimization problems.

6. Conclusions

This paper developed a modified Atomic Orbital Search (AOS) and used it as a feature selection (FS) approach. The modification was performed using dynamic opposite-based learning (DOL) to enhance the exploration and the diversity of solutions. This improves the convergence rate toward the feasible region that contains the optimal solution (the relevant features). To justify the performance of the AOSD as an FS approach, a set of eighteen datasets collected from different real-life applications was used. In addition, the results of AOSD were compared with those of other well-known MH-based FS approaches, namely AOS, AOA, MPA, MRFO, HHO, HGSO, WOA, bGWO, GA, and BPSO. The obtained results show that the developed AOSD provides higher efficiency than the other FS approaches.
Moreover, the developed AOSD can be extended to other real-life applications, including medical imaging, superpixel-based clustering, the Internet of Things (IoT), security, and other fields.

Author Contributions

Conceptualization, D.Y.; Data curation, M.A.E., L.A., A.A.E. and S.L.; Formal analysis, R.A.I.; Funding acquisition, A.A.E.; Investigation, M.A.E., L.A., D.Y., M.A.A.A.-Q. and M.H.N.-S.; Methodology, D.O., S.L. and R.A.I.; Software, M.A.E., A.A.E. and R.A.I.; Supervision, M.H.N.-S.; Validation, M.A.A.A.-Q. and M.H.N.-S.; Visualization, D.Y.; Writing, D.O., M.A.A.A.-Q., A.A.E. and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Acknowledgments

This work is supported by the Hubei Provincial Science and Technology Major Project of China under Grant No. 2020AEA011 and the Key Research & Development Plan of Hubei Province of China under Grant No. 2020BAB100 and the project of Science, Technology and Innovation Commission of Shenzhen Municipality of China under Grant No. JCYJ20210324120002006.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tubishat, M.; Idris, N.; Shuib, L.; Abushariah, M.A.; Mirjalili, S. Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection. Expert Syst. Appl. 2020, 145, 113122. [Google Scholar] [CrossRef]
  2. Shao, Z.; Wu, W.; Li, D. Spatio-temporal-spectral observation model for urban remote sensing. Geo-Spat. Inf. Sci. 2021, 17, 372–386. [Google Scholar] [CrossRef]
  3. Ibrahim, R.A.; Ewees, A.A.; Oliva, D.; Abd Elaziz, M.; Lu, S. Improved salp swarm algorithm based on particle swarm optimization for feature selection. J. Ambient Intell. Humaniz. Comput. 2019, 10, 3155–3169. [Google Scholar] [CrossRef]
  4. Zebari, R.; Abdulazeez, A.; Zeebaree, D.; Zebari, D.; Saeed, J. A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction. J. Appl. Sci. Technol. Trends 2020, 1, 56–70. [Google Scholar] [CrossRef]
  5. Venkatesh, B.; Anuradha, J. A review of feature selection and its methods. Cybern. Inf. Technol. 2019, 19, 3–26. [Google Scholar] [CrossRef] [Green Version]
  6. Shao, Z.; Sumari, N.S.; Portnov, A.; Ujoh, F.; Musakwa, W.; Mandela, P.J. Urban sprawl and its impact on sustainable urban development: A combination of remote sensing and social media data. Geo-Spat. Inf. Sci. 2021, 24, 241–255. [Google Scholar] [CrossRef]
  7. Abdel-Basset, M.; Ding, W.; El-Shahat, D. A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection. Artif. Intell. Rev. 2021, 54, 593–637. [Google Scholar] [CrossRef]
  8. El-Hasnony, I.M.; Barakat, S.I.; Elhoseny, M.; Mostafa, R.R. Improved feature selection model for big data analytics. IEEE Access 2020, 8, 66989–67004. [Google Scholar] [CrossRef]
  9. Deng, X.; Li, Y.; Weng, J.; Zhang, J. Feature selection for text classification: A review. Multimed. Tools Appl. 2019, 78, 3. [Google Scholar] [CrossRef]
  10. Ewees, A.A.; Abualigah, L.; Yousri, D.; Algamal, Z.Y.; Al-qaness, M.A.; Ibrahim, R.A.; Abd Elaziz, M. Improved Slime Mould Algorithm based on Firefly Algorithm for feature selection: A case study on QSAR model. Eng. Comput. 2021, 31, 1–15. [Google Scholar]
  11. Alex, S.B.; Mary, L.; Babu, B.P. Attention and Feature Selection for Automatic Speech Emotion Recognition Using Utterance and Syllable-Level Prosodic Features. Circuits Syst. Signal Process. 2020, 39, 11. [Google Scholar] [CrossRef]
  12. Benazzouz, A.; Guilal, R.; Amirouche, F.; Slimane, Z.E.H. EMG Feature selection for diagnosis of neuromuscular disorders. In Proceedings of the 2019 International Conference on Networking and Advanced Systems (ICNAS), Annaba, Algeria, 26–27 June 2019; pp. 1–5. [Google Scholar]
  13. Al-qaness, M.A. Device-free human micro-activity recognition method using WiFi signals. Geo-Spat. Inf. Sci. 2019, 22, 128–137. [Google Scholar] [CrossRef]
  14. Yousri, D.; Abd Elaziz, M.; Abualigah, L.; Oliva, D.; Al-Qaness, M.A.; Ewees, A.A. COVID-19 X-ray images classification based on enhanced fractional-order cuckoo search optimizer using heavy-tailed distributions. Appl. Soft Comput. 2021, 101, 107052. [Google Scholar] [CrossRef]
  15. Nadimi-Shahraki, M.H.; Banaie-Dezfouli, M.; Zamani, H.; Taghian, S.; Mirjalili, S. B-MFO: A Binary Moth-Flame Optimization for Feature Selection from Medical Datasets. Computers 2021, 10, 136. [Google Scholar] [CrossRef]
  16. Hancer, E. A new multi-objective differential evolution approach for simultaneous clustering and feature selection. Eng. Appl. Artif. Intell. 2020, 87, 103307. [Google Scholar] [CrossRef]
  17. Amini, F.; Hu, G. A two-layer feature selection method using genetic algorithm and elastic net. Expert Syst. Appl. 2021, 166, 114072. [Google Scholar] [CrossRef]
  18. Song, X.F.; Zhang, Y.; Gong, D.W.; Sun, X.Y. Feature selection using bare-bones particle swarm optimization with mutual information. Pattern Recognit. 2021, 112, 107804. [Google Scholar] [CrossRef]
  19. Tubishat, M.; Ja’afar, S.; Alswaitti, M.; Mirjalili, S.; Idris, N.; Ismail, M.A.; Omar, M.S. Dynamic salp swarm algorithm for feature selection. Expert Syst. Appl. 2021, 164, 113873. [Google Scholar] [CrossRef]
  20. Sathiyabhama, B.; Kumar, S.U.; Jayanthi, J.; Sathiya, T.; Ilavarasi, A.; Yuvarajan, V.; Gopikrishna, K. A novel feature selection framework based on grey wolf optimizer for mammogram image analysis. Neural Comput. Appl. 2021, 33, 14583–14602. [Google Scholar] [CrossRef]
  21. Sadeghian, Z.; Akbari, E.; Nematzadeh, H. A hybrid feature selection method based on information theory and binary butterfly optimization algorithm. Eng. Appl. Artif. Intell. 2021, 97, 104079. [Google Scholar] [CrossRef]
  22. Ewees, A.A.; Abd El Aziz, M.; Hassanien, A.E. Chaotic multi-verse optimizer-based feature selection. Neural Comput. Appl. 2019, 31, 991–1006. [Google Scholar] [CrossRef]
  23. Abualigah, L.M.Q. Feature Selection and Enhanced Krill Herd Algorithm for Text Document Clustering; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  24. Abd Elaziz, M.; Ewees, A.A.; Ibrahim, R.A.; Lu, S. Opposition-based moth-flame optimization improved by differential evolution for feature selection. Math. Comput. Simul. 2020, 168, 48–75. [Google Scholar] [CrossRef]
  25. Neggaz, N.; Houssein, E.H.; Hussain, K. An efficient henry gas solubility optimization for feature selection. Expert Syst. Appl. 2020, 152, 113364. [Google Scholar] [CrossRef]
  26. Helmi, A.M.; Al-qaness, M.A.; Dahou, A.; Damaševičius, R.; Krilavičius, T.; Elaziz, M.A. A Novel Hybrid Gradient-Based Optimizer and Grey Wolf Optimizer Feature Selection Method for Human Activity Recognition Using Smartphone Sensors. Entropy 2021, 23, 1065. [Google Scholar] [CrossRef]
  27. Al-qaness, M.A.; Ewees, A.A.; Abd Elaziz, M. Modified whale optimization algorithm for solving unrelated parallel machine scheduling problems. Soft Comput. 2021, 25, 9545–9557. [Google Scholar] [CrossRef]
  28. Azizi, M. Atomic orbital search: A novel metaheuristic algorithm. Appl. Math. Model. 2021, 93, 657–683. [Google Scholar] [CrossRef]
  29. Azizi, M.; Talatahari, S.; Giaralis, A. Optimization of Engineering Design Problems Using Atomic Orbital Search Algorithm. IEEE Access 2021, 9, 102497–102519. [Google Scholar] [CrossRef]
  30. Dong, H.; Xu, Y.; Li, X.; Yang, Z.; Zou, C. An improved antlion optimizer with dynamic random walk and dynamic opposite learning. Knowl.-Based Syst. 2021, 216, 106752. [Google Scholar] [CrossRef]
  31. Zhang, L.; Hu, T.; Yang, Z.; Yang, D.; Zhang, J. Elite and dynamic opposite learning enhanced sine cosine algorithm for application to plat-fin heat exchangers design problem. Neural Comput. Appl. 2021, 1–14. [Google Scholar] [CrossRef]
  32. Feng, Y.; Liu, M.; Zhang, Y.; Wang, J. A Dynamic Opposite Learning Assisted Grasshopper Optimization Algorithm for the Flexible Job Scheduling Problem. Complexity 2020, 2020, 1–19. [Google Scholar]
  33. Agrawal, P.; Abutarboush, H.F.; Ganesh, T.; Mohamed, A.W. Metaheuristic Algorithms on Feature Selection: A Survey of One Decade of Research (2009–2019). IEEE Access 2021, 9, 26766–26791. [Google Scholar] [CrossRef]
  34. Sharma, M.; Kaur, P. A Comprehensive Analysis of Nature-Inspired Meta-Heuristic Techniques for Feature Selection Problem. Arch. Comput. Methods Eng. 2021, 28, 1103–1127. [Google Scholar] [CrossRef]
  35. Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzande, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210. [Google Scholar] [CrossRef]
  36. Nguyen, B.H.; Xue, B.; Zhang, M. A survey on swarm intelligence approaches to feature selection in data mining. Swarm Evol. Comput. 2020, 54, 100663. [Google Scholar] [CrossRef]
  37. Hu, P.; Pan, J.S.; Chu, S.C. Improved binary grey wolf optimizer and its application for feature selection. Knowl.-Based Syst. 2020, 195, 105746. [Google Scholar] [CrossRef]
  38. Hu, Y.; Zhang, Y.; Gong, D. Multiobjective particle swarm optimization for feature selection with fuzzy cost. IEEE Trans. Cybern. 2020, 51, 874–888. [Google Scholar] [CrossRef]
  39. Gao, Y.; Zhou, Y.; Luo, Q. An efficient binary equilibrium optimizer algorithm for feature selection. IEEE Access 2020, 8, 140936–140963. [Google Scholar] [CrossRef]
  40. Al-Tashi, Q.; Abdulkadir, S.J.; Rais, H.M.; Mirjalili, S.; Alhussian, H.; Ragab, M.G.; Alqushaibi, A. Binary multi-objective grey wolf optimizer for feature selection in classification. IEEE Access 2020, 8, 106247–106263. [Google Scholar] [CrossRef]
  41. Alazzam, H.; Sharieh, A.; Sabri, K.E. A feature selection algorithm for intrusion detection system based on pigeon inspired optimizer. Expert Syst. Appl. 2020, 148, 113249. [Google Scholar] [CrossRef]
  42. Zhang, Y.; Gong, D.W.; Gao, X.Z.; Tian, T.; Sun, X.Y. Binary differential evolution with self-learning for multi-objective feature selection. Inf. Sci. 2020, 507, 67–85. [Google Scholar] [CrossRef]
  43. Dhiman, G.; Oliva, D.; Kaur, A.; Singh, K.K.; Vimal, S.; Sharma, A.; Cengiz, K. BEPO: A novel binary emperor penguin optimizer for automatic feature selection. Knowl.-Based Syst. 2021, 211, 106560. [Google Scholar] [CrossRef]
  44. Hammouri, A.I.; Mafarja, M.; Al-Betar, M.A.; Awadallah, M.A.; Abu-Doush, I. An improved dragonfly algorithm for feature selection. Knowl.-Based Syst. 2020, 203, 106131. [Google Scholar] [CrossRef]
  45. Zhang, Y.; Liu, R.; Wang, X.; Chen, H.; Li, C. Boosted binary Harris hawks optimizer and feature selection. Eng. Comput. 2020, 37, 3741–3770. [Google Scholar] [CrossRef]
  46. Sahlol, A.T.; Yousri, D.; Ewees, A.A.; Al-Qaness, M.A.; Damasevicius, R.; Abd Elaziz, M. COVID-19 image classification using deep features and fractional-order marine predators algorithm. Sci. Rep. 2020, 10, 15364. [Google Scholar] [CrossRef] [PubMed]
  47. Abdel-Basset, M.; Mohamed, R.; Chakrabortty, R.K.; Ryan, M.J.; Mirjalili, S. An efficient binary slime mould algorithm integrated with a novel attacking-feeding strategy for feature selection. Comput. Ind. Eng. 2021, 153, 107078. [Google Scholar] [CrossRef]
  48. Tizhoosh, H.R. Opposition-based learning: A new scheme for machine intelligence. In Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06), Vienna, Austria, 28–30 November 2005; Volume 1, pp. 695–701. [Google Scholar]
  49. Houssein, E.H.; Hussain, K.; Abualigah, L.; Abd Elaziz, M.; Alomoush, W.; Dhiman, G.; Djenouri, Y.; Cuevas, E. An improved opposition-based marine predators algorithm for global optimization and multilevel thresholding image segmentation. Knowl.-Based Syst. 2021, 229, 107348. [Google Scholar] [CrossRef]
  50. Frank, A. UCI Machine Learning Repository. Available online: http://archive.ics.uci.edu/ml (accessed on 1 August 2020).
  51. Abualigah, L.; Diabat, A.; Mirjalili, S.; Abd Elaziz, M.; Gandomi, A.H. The arithmetic optimization algorithm. Comput. Methods Appl. Mech. Eng. 2021, 376, 113609. [Google Scholar] [CrossRef]
  52. Abd Elaziz, M.; Yousri, D.; Al-qaness, M.A.; AbdelAty, A.M.; Radwan, A.G.; Ewees, A.A. A Grunwald–Letnikov based Manta ray foraging optimizer for global optimization and image segmentation. Eng. Appl. Artif. Intell. 2021, 98, 104105. [Google Scholar] [CrossRef]
  53. Ibrahim, R.A.; Abd Elaziz, M.; Lu, S. Chaotic opposition-based grey-wolf optimization algorithm based on differential evolution and disruption operator for global optimization. Expert Syst. Appl. 2018, 108, 1–27. [Google Scholar] [CrossRef]
  54. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437. [Google Scholar] [CrossRef]
  55. Ferri, C.; Hernández-Orallo, J.; Modroiu, R. An experimental comparison of performance measures for classification. Pattern Recognit. Lett. 2009, 30, 27–38. [Google Scholar] [CrossRef]
  56. Elaziz, M.A.; Hosny, K.M.; Salah, A.; Darwish, M.M.; Lu, S.; Sahlol, A.T. New machine learning method for image-based diagnosis of COVID-19. PLoS ONE 2020, 15, e0235187. [Google Scholar] [CrossRef]
  57. Neggaz, N.; Ewees, A.A.; Abd Elaziz, M.; Mafarja, M. Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Expert Syst. Appl. 2020, 145, 113103. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
