Article

An Interpolation-Based Evolutionary Algorithm for Bi-Objective Feature Selection in Classification

School of Mechanical, Electrical & Information Engineering, Putian University, Putian 351100, China
Mathematics 2024, 12(16), 2572; https://doi.org/10.3390/math12162572
Submission received: 14 July 2024 / Revised: 4 August 2024 / Accepted: 19 August 2024 / Published: 20 August 2024

Abstract

When aimed at minimizing both the classification error and the number of selected features, feature selection can be treated as a bi-objective optimization problem suitable for solving with multi-objective evolutionary algorithms (MOEAs). However, traditional MOEAs may encounter difficulties due to discrete optimization environments and the curse of dimensionality in the feature space, especially for high-dimensional datasets. Therefore, in this paper an interpolation-based evolutionary algorithm (termed IPEA) is proposed for tackling bi-objective feature selection in classification, where an interpolation-based initialization method is designed to cover a wide range of the search space and to explore adaptively detected regions of interest. In experiments, IPEA is compared with four state-of-the-art MOEAs in terms of two widely used performance metrics on 20 public real-world classification datasets with dimensionality ranging from low to high. The overall empirical results suggest that IPEA generally performs the best of all tested algorithms, with significantly better search abilities and much lower computational time cost.

1. Introduction

Evolutionary algorithms (EAs) have been widely used to solve optimization problems for decades [1], especially multi-objective optimization problems (MOPs) with multiple contradictory objectives [2], for which they are called multi-objective evolutionary algorithms (MOEAs) [3]. Today, a great variety of MOEAs have been proposed; these can be roughly divided into several categories: dominance-based [4,5,6], which adopt the nondominated sorting method for environmental selection; decomposition-based [7,8,9,10], which decompose an MOP into a series of simpler single-objective problems with weight vectors; indicator-based [11,12,13,14], which make use of a certain performance indicator for environmental selection; surrogate-based [15,16,17], which introduce a surrogate model for tackling expensive optimization problems; and multi-task [18,19,20], which generate multiple independent and cooperative tasks for different evolutionary purposes.
What is more, owing to their population-based search capability and ability to function without domain knowledge, MOEAs have also been applied for solving many real-world optimization problems [21,22,23], such as task offloading [24,25], network construction [26,27], community detection [28,29], and the feature selection problem [30,31] that is the focus of this work. To be more specific, feature selection is a common and important data preprocessing technique [32,33] which selects only a subset of useful features for classification rather than all of them. It is especially useful for large-scale datasets or those with high dimensionality [34]. When aimed at minimizing both the ratio of selected features and the ratio of classification errors, feature selection turns out to be a multi-objective (bi-objective) optimization problem, which can be formally described as follows [35]:
minimize F(x) = (f_1(x), f_2(x), …, f_M(x))^T
subject to x = (x_1, x_2, …, x_D), x_i ∈ {0, 1}
where M is the number of objectives (set to 2 in bi-objective feature selection), D is the total number of features in the decision space, F(x) is the objective vector of x, with f_i(x) denoting the objective value in the f_i direction, and x = (x_1, x_2, …, x_D) is the decision vector of x, where x_i = 1 means selecting the ith feature and x_i = 0 means not selecting it. Here, f_1(x) denotes the ratio of selected features, which can be further defined as follows:
f_1(x) = (Σ_{i=1}^{D} x_i) / D
with discrete values ranging from 0 to 1 (i.e., 0, 1/D, 2/D, …, 1). Moreover, given the results in terms of TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative), f_2(x) (representing the ratio of classification errors related to the above selected features) can be further defined as follows:
f_2(x) = (FP + FN) / (TP + TN + FP + FN).
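As a concrete illustration of the two objectives above, the following sketch evaluates f_1 and f_2 for a binary feature mask; the function name and the confusion-matrix inputs are illustrative, standing in for whatever a wrapper classifier would return:

```python
import numpy as np

def evaluate(x, tp, tn, fp, fn):
    """Compute the two objectives for a binary feature mask x: f1 is the ratio of
    selected features; f2 is the classification error rate derived from the
    confusion matrix (tp, tn, fp, fn) a wrapper classifier would return."""
    x = np.asarray(x)
    f1 = x.sum() / x.size                  # in {0, 1/D, 2/D, ..., 1}
    f2 = (fp + fn) / (tp + tn + fp + fn)   # ratio of classification errors
    return f1, f2

# e.g., 3 of D = 10 features selected with a 40/45/10/5 confusion matrix
f1, f2 = evaluate([1, 0, 1, 0, 0, 0, 1, 0, 0, 0], tp=40, tn=45, fp=10, fn=5)
# f1 = 0.3, f2 = 0.15
```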
Despite the aforementioned advantages, traditional MOEAs still face the curse of dimensionality in tackling bi-objective feature selection, as the number of candidate feature subsets explodes with the total number of features. While many existing works have attempted to deal with this challenge [36,37], most have either introduced complicated frameworks or required many essential preset parameters. Moreover, it is difficult to balance an algorithm's performance on both low-dimensional and high-dimensional datasets. Therefore, in this paper an interpolation-based EA (termed IPEA) is proposed that is specifically designed for tackling bi-objective feature selection in classification and is suitable for both low- and high-dimensional datasets. In IPEA, an interpolation-based initialization method is designed to provide a promising hybrid initial population that not only covers a wide range of the search space but also allows the algorithm to explore adaptively within the most promising local areas of interest. In addition, a simple and efficient reproduction method is adopted to provide more population diversity and faster convergence during evolution. Combining the above initialization and reproduction processes increases the search ability of IPEA in handling both low- and high-dimensional datasets.
The remainder of this paper is organized as follows: first, the related works are introduced in Section 2; then, the proposed IPEA is detailed in Section 3; the experimental setup is described in Section 4, while the empirical results are presented in Section 5; finally, the paper is concluded in Section 6.

2. Related Works

Evolutionary feature selection [38] can be generally categorized into wrapper-based and filter-based approaches [39,40]. In brief, wrapper-based approaches [41,42] adopt a classification model such as SVM (Support Vector Machine) or KNN (K-Nearest Neighbors) [43] to evaluate the classification accuracy corresponding to the selected feature subset. By contrast, filter-based approaches [44,45] are independent of any classifier and directly analyze the classification data to explore the explicit or implicit relationships between features and the corresponding classes, ignoring the classification results of the feature subset that is selected later. Therefore, wrapper-based approaches are relatively more accurate but typically have higher computational costs due to the additional classification process [20,46,47].
In this paper, a wrapper-based approach for evolutionary bi-objective feature selection is chosen. This is a widely used approach that has been discussed in many existing feature selection works over the last several years. For example, in 2020 Xu et al. [37] proposed a segmented initialization method and offspring modification method; however, the key parameter settings are all fixed and cannot be adaptively altered for different classification datasets or high-dimensional feature selection. Subsequently, Xu et al. [35] proposed an evolutionary algorithm named DAEA based on duplication analysis along with an efficient reproduction method; however, this algorithm’s performance has not been tested under the circumstances of high-dimensional feature selection. Following the idea of DAEA, Jiao et al. [48] further improved the method for handling solution duplication and designed a problem reformulation mechanism named PRDH; however, its applicability across different MOEA frameworks remains unconfirmed.
Therefore, this work attempts to design an interpolation-based evolutionary algorithm, termed IPEA, which is suitable for tackling both relatively low-dimensional datasets and relatively high-dimensional ones. The framework of IPEA is not as complicated as many other existing MOEAs designed for feature selection, and the algorithm can effectively adapt to different optimization environments across a wide variety of features while maintaining a good balance between low- and high-dimensional datasets. This is the major motivation behind this paper.

3. Proposed Algorithm

In this section, the general framework of the proposed algorithm IPEA is first introduced, then its two specially designed components, i.e., the initialization and reproduction processes, are illustrated further.

3.1. General Framework

The general framework of IPEA is shown in Algorithm 1, where the population size N and the total number of features D are input as the primary parameters. The general framework of IPEA is rather similar to that of traditional MOEAs, but uses the initialization and reproduction processes specially designed in this paper. Moreover, the environmental selection process has been modified to remove duplicated solutions from the current union population Pop, as can be seen in Line 5 of Algorithm 1. Nevertheless, the widely used truncation method based on nondominated sorting and crowding distances, first introduced in the classic dominance-based algorithm NSGA-II [4], is still adopted in IPEA to select and reserve the N best solutions in Pop. The termination criterion of IPEA is preset as a maximum number of objective function evaluations, which is normally equal to the total number of solutions generated during evolution, as described in detail in Section 4.
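The environmental selection step (duplicate removal followed by NSGA-II-style truncation by nondominated sorting and crowding distance) can be sketched as follows. This is an illustrative reimplementation rather than the authors' code, and for brevity the crowding distance is computed over the whole candidate set instead of per front:

```python
import numpy as np

def nondominated_sort(objs):
    """Assign each row of objs (a minimization objective matrix) its Pareto front rank."""
    n = len(objs)
    ranks = np.empty(n, dtype=int)
    remaining = set(range(n))
    rank = 0
    while remaining:
        # a point is in the current front if no other remaining point dominates it
        front = [i for i in remaining
                 if not any(np.all(objs[j] <= objs[i]) and np.any(objs[j] < objs[i])
                            for j in remaining if j != i)]
        for i in front:
            ranks[i] = rank
        remaining -= set(front)
        rank += 1
    return ranks

def crowding_distance(objs):
    """NSGA-II crowding distance (computed over the whole set here for brevity)."""
    n, m = objs.shape
    dist = np.zeros(n)
    for k in range(m):
        order = np.argsort(objs[:, k])
        dist[order[0]] = dist[order[-1]] = np.inf   # boundary points are kept
        span = objs[order[-1], k] - objs[order[0], k]
        if span > 0 and n > 2:
            dist[order[1:-1]] += (objs[order[2:], k] - objs[order[:-2], k]) / span
    return dist

def environmental_selection(pop, objs, n):
    """Remove duplicated solutions, then keep the n best by rank and crowding distance."""
    pop, idx = np.unique(pop, axis=0, return_index=True)
    objs = objs[idx]
    ranks = nondominated_sort(objs)
    crowd = crowding_distance(objs)
    order = np.lexsort((-crowd, ranks))  # ascending rank, then descending crowding
    return pop[order[:n]], objs[order[:n]]
```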
Algorithm 1 GeneralFramework(N, D)
  • Input: population size N, total feature number D;
  • Output: final population Pop;
1: Pop = Initialize(N, D); // Algorithm 2, also invoking Algorithms 3–5
2: while termination criterion is not reached do
3:     Pop* = Reproduce(Pop, N, D); // Algorithm 6
4:     Pop = Pop ∪ Pop*;
5:     Pop ← remove duplicated solutions in Pop;
6:     Pop ← select N solutions in Pop by nondominated sorting and crowding distances;
7: end while
Algorithm 2 Initialize(N, D)
  • Input: population size N, total feature number D;
  • Output: initial population Pop;
1: N_ip = ⌈√N⌉; // number of interpolation positions
2: T = 1/(N_ip + 1); // average interpolation interval
3: N_sp = ⌈N/N_ip⌉; // interpolation subpopulation size
4: Best_f2 = ∞; // best objective value in the f_2 direction
5: for i = 1, 2, …, N_ip do
6:     t1 = min(N_sp, N − N_sp(i − 1)); // get the smaller one
7:     t2 = T · i; // get the interpolation position
8:     subPop = NewSubPop(t1, D, t2);
9:     Pop1 = Pop1 ∪ subPop;
10:    t3 ← get the best f_2 objective value in subPop;
11:    if Best_f2 > t3 then
12:        Best_f2 = t3;
13:        Best_ip = t2;
14:    end if
15: end for
16: Pop2 = NewSubPop(N, D, Best_ip);
17: Pop3 = ForwardSubPop(min(N, D), D);
18: Pop4 = BackwardSubPop(min(N, D), D);
19: Pop ← select N best solutions from (Pop1 ∪ Pop2 ∪ Pop3 ∪ Pop4) by nondominated sorting and crowding distances;

3.2. Initialization Process

The initialization process of IPEA is shown in Algorithm 2, which also invokes Algorithms 3–5. The initialization process is the major innovation of IPEA and is based on the use of interpolations in the objective space. Overall, the final initial population Pop is made up of four hybrid subpopulations: Pop1 is based on a series of adaptively generated interpolation positions in the objective space, Pop2 is based on the analysis of the previously generated Pop1, and Pop3 and Pop4 are respectively inspired by the classic forward and backward search ideas in feature selection. To be more specific, the number of interpolation positions for new subpopulations generated in the objective space is adaptively set in Line 1 of Algorithm 2, the average interval between every two adjacent interpolation positions is accordingly set in Line 2, and the standard interpolation subpopulation size is set in Line 3. The detailed generation process of each interpolation-based subpopulation is shown in Lines 5–15 of Algorithm 2, which iteratively searches for the best f_2 objective value over all the subpopulations. During this process, Algorithm 3 is invoked to generate new subpopulations according to the input interpolation position value. Two further pseudocodes, Algorithms 4 and 5, are respectively invoked to generate the forward search-inspired subpopulation and the backward search-inspired one. As can be seen in Algorithms 4 and 5, the former contains unique solutions with only one selected feature, while the latter contains unique solutions with only one unselected feature.
Algorithm 3 NewSubPop(K, D, Ip)
  • Input: subpopulation size K, total feature number D, interpolation position Ip;
  • Output: new subpopulation SubPop;
1: SubPop = Zeros(K, D); // create a matrix of zeros
2: for i = 1, 2, …, K do
3:     for j = 1, 2, …, D do
4:         if ρ < Ip then // ρ is a random probability
5:             SubPop(i, j) = 1; // select the jth feature
6:         end if
7:     end for
8: end for
Algorithm 4 ForwardSubPop(K, D)
  • Input: subpopulation size K, total feature number D;
  • Output: new subpopulation SubPop;
1: H ← randomly select K unique integers from 1 to D;
2: SubPop = Zeros(K, D); // create a matrix of zeros
3: for i = 1, 2, …, K do
4:     SubPop(i, H(i)) = 1; // select that feature
5: end for
Algorithm 5 BackwardSubPop(K, D)
  • Input: subpopulation size K, total feature number D;
  • Output: new subpopulation SubPop;
1: H ← randomly select K unique integers from 1 to D;
2: SubPop = Ones(K, D); // create a matrix of ones
3: for i = 1, 2, …, K do
4:     SubPop(i, H(i)) = 0; // not select that feature
5: end for
Figure 1 provides an illustrative example of how the adaptive interpolation-based initialization process in IPEA works when the population size is set to N = 16. According to Algorithm 2, the number of interpolation positions is set to N_ip = 4, the average interpolation interval is set to T = 0.2, and the standard size of each interpolation-based subpopulation is set to N_sp = 4. As can be seen from Figure 1, four interpolation-based subpopulations (together forming Pop1) with an average size of four solutions are generated around their related interpolation positions, i.e., 0.2, 0.4, 0.6, and 0.8. Apart from these four interpolation positions, two boundary interpolation positions, i.e., 1/D and (D − 1)/D, are also utilized; these respectively correspond to the forward search-inspired subpopulation (Pop3, generated by Algorithm 4) and the backward search-inspired subpopulation (Pop4, generated by Algorithm 5). In the case of Figure 1, the best f_2 objective value lies in the subpopulation related to the 0.4 interpolation position. Then, another new subpopulation with the full population size is generated by Algorithm 3 with the interpolation position value 0.4 input as the key parameter. In this way, a full population size of new solutions (i.e., Pop2) is initialized around the most promising interpolation position found above. Therefore, combining all four of the above interpolation-related subpopulations (Pop1 ∪ Pop2 ∪ Pop3 ∪ Pop4), as executed in Line 19 of Algorithm 2, covers a wide range of the objective space while exploring adaptively within the boundary and promising local areas. It should be noted that Figure 1 only shows Pop1 with 16 solutions; Pop2, Pop3, and Pop4 are not displayed.
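Under the assumption, consistent with the worked example (N = 16 gives N_ip = 4, T = 0.2, N_sp = 4), that N_ip = ⌈√N⌉ and N_sp = ⌈N/N_ip⌉, the interpolation-based part of the initialization can be sketched as follows. Pop2 is omitted because locating the best interpolation position requires evaluating f_2 with a classifier, and all function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def new_subpop(k, d, ip):
    """Algorithm 3: select each of the d features with probability ip."""
    return (rng.random((k, d)) < ip).astype(int)

def forward_subpop(k, d):
    """Algorithm 4: k unique solutions, each with exactly one selected feature."""
    pop = np.zeros((k, d), dtype=int)
    pop[np.arange(k), rng.choice(d, size=k, replace=False)] = 1
    return pop

def backward_subpop(k, d):
    """Algorithm 5: k unique solutions, each with exactly one unselected feature."""
    pop = np.ones((k, d), dtype=int)
    pop[np.arange(k), rng.choice(d, size=k, replace=False)] = 0
    return pop

def interpolation_init(n, d):
    """Interpolation-based part of Algorithm 2 (Pop2, which needs classifier
    feedback to locate the best interpolation position, is omitted here)."""
    n_ip = int(np.ceil(np.sqrt(n)))   # number of interpolation positions (assumed)
    t = 1.0 / (n_ip + 1)              # average interpolation interval
    n_sp = int(np.ceil(n / n_ip))     # standard subpopulation size (assumed)
    pops = []
    for i in range(1, n_ip + 1):
        size = min(n_sp, n - n_sp * (i - 1))     # last subpopulation may be smaller
        pops.append(new_subpop(size, d, t * i))  # Pop1, around position t*i
    pops.append(forward_subpop(min(n, d), d))    # Pop3, boundary position 1/D
    pops.append(backward_subpop(min(n, d), d))   # Pop4, boundary position (D-1)/D
    return np.vstack(pops)

pop = interpolation_init(16, 60)   # N_ip = 4, T = 0.2, N_sp = 4, as in Figure 1
```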

3.3. Reproduction Process

The reproduction process of IPEA is shown in Algorithm 6, which acts as an efficient supplement to the interpolation-based initialization process introduced above. As can be seen from Algorithm 6, a totally random mating process is first conducted in Line 1 to select N pairs of parent solutions from the current population Pop, meaning that every solution has the same opportunity for reproduction. Then, the indexes at which the two parents hold different decision variable values are found for each pair; these are used in the later crossover operation. This is key to realizing efficient crossover, i.e., avoiding all invalid crossover operations between identical decision variable values. The adaptive crossover operation is shown in Lines 4–9, and is based on the dynamic length of t1 and a random integer limited by t2. After crossover, the mutation process is quite simple, adopting the traditional bitwise mutation method widely used in many other MOEAs. The bitwise mutation rate is set to 1/D, implying a relatively delicate and cautious mutation operation within a parent decision vector. Finally, the mutated parent solution is reserved as the expected new offspring.
Algorithm 6 Reproduce(Pop, N, D)
  • Input: current population Pop, expected offspring number N, total feature number D;
  • Output: offspring set Pop*;
1: Pars ← randomly select N pairs of parent solutions from Pop;
2: for i = 1, 2, …, N do
3:     p1 and p2 ← get the ith pair of parents from Pars;
4:     t1 ← find indexes with different decision variable values between p1 and p2;
5:     t2 ← get the length of vector t1;
6:     t3 ← randomly select an integer from 1 to t2;
7:     t4 ← randomly select t3 unique integers from 1 to t2;
8:     j = t1(t4); // get decision variable indexes for crossover
9:     p1(j) = p2(j); // crossover between two parents
10:    ρ ← get a vector of D random probabilities;
11:    j ← find indexes with values in ρ smaller than 1/D;
12:    p1(j) = ¬p1(j); // mutation within one parent
13:    Pop*(i) = p1;
14: end for
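A compact sketch of Algorithm 6 follows; it is an illustrative reimplementation, with crossover restricted to the bits where the two parents differ and bitwise mutation at rate 1/D, as described above:

```python
import numpy as np

rng = np.random.default_rng(1)

def reproduce(pop, n, d):
    """Generate n offspring: crossover exchanges a random subset of the bits
    where the two parents differ, then each bit mutates with probability 1/D."""
    offspring = np.empty((n, d), dtype=int)
    for i in range(n):
        pair = pop[rng.integers(len(pop), size=2)].copy()  # random mating
        p1, p2 = pair[0], pair[1]
        diff = np.flatnonzero(p1 != p2)          # indexes with differing values
        if diff.size > 0:
            k = rng.integers(1, diff.size + 1)   # number of bits to exchange
            j = rng.choice(diff, size=int(k), replace=False)
            p1[j] = p2[j]                        # crossover between the parents
        mask = rng.random(d) < 1.0 / d           # bitwise mutation mask
        p1[mask] = 1 - p1[mask]
        offspring[i] = p1
    return offspring
```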

4. Experimental Setup

4.1. Comparison Algorithms

In this work, four state-of-the-art MOEAs are used as comparison algorithms against the proposed algorithm, IPEA. The compared algorithms are NSGA-II (a nondominated sorting based genetic algorithm) [4], MOEA/D (an MOEA based on decomposition) [7], MOEA/HD (an MOEA based on hierarchical decomposition) [10], and PMEA [13] (a polar metric-based evolutionary algorithm). To be more specific, NSGA-II and MOEA/D are among the most classic and well-known MOEAs, respectively based on dominance and decomposition; MOEA/HD is a recently published MOEA based on both dominance and decomposition, especially designed for solving MOPs with complex Pareto fronts; and PMEA is a recently published MOEA based on a performance indicator called the polar metric, which balances both the population diversity and the algorithm convergence during evolution. Overall, the three mainstream MOEA approaches, i.e., those based on dominance, decomposition and indicator, are all included in the experiments.

4.2. Classification Datasets

A total of 20 open-source classification datasets [49] are used as test problems to check the general optimization performance of all comparison algorithms for bi-objective feature selection. The attribute values of each tested dataset are shown in Table 1, with the total number of features sorted in ascending order. It can be seen from Table 1 that the total number of features per dataset ranges from 60 to 10,509, covering a wide variety of features from relatively lower to relatively higher dimensionality. Moreover, the number of classification samples ranges from 50 to 606 and the number of corresponding classes ranges from 2 to 15, suggesting the comprehensiveness of the test instances. In fact, most classification datasets used in this paper originate from real-world scenarios; for example, the “Sonar” dataset contains various patterns obtained by bouncing sonar signals off a metal cylinder at various angles.

4.3. Performance Indicators

For comprehensive analyses based on empirical data, this paper uses multiple performance indicators to measure the general performance of all compared algorithms on all of the tested datasets. More specifically, the Hypervolume (HV) [50] indicator is used as the main metric to measure general optimization performance regarding both population diversity and algorithm convergence. The reference point for HV is uniformly set to (1, 1) in the bi-objective space. In addition, the Inverted Generational Distance Plus (IGD+) [51] indicator, which also measures both diversity and convergence, is used to supplement HV. The preference of IGD+ is highly related to the choice of reference points; thus, for the sake of fairness, the reference points of IGD+ are set to the combination of all final nondominated solutions obtained by all of the tested algorithms. Normally, a greater HV value and a smaller IGD+ value are preferable. Lastly, Wilcoxon's test with a 5% significance level is adopted to check for significant differences between each pair of compared algorithms; results below the level of significance are prefixed by the symbol ★.
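For the bi-objective case with reference point (1, 1), HV reduces to a sum of rectangular slices and can be sketched as follows (an illustrative implementation, not the one used by the experimental platform):

```python
import numpy as np

def hypervolume_2d(objs, ref=(1.0, 1.0)):
    """HV of a bi-objective minimization front with respect to a reference point:
    sort by f1 and accumulate the rectangular slices dominated below ref."""
    pts = np.asarray(objs, dtype=float)
    pts = pts[np.all(pts < ref, axis=1)]   # only points strictly dominating ref count
    if pts.size == 0:
        return 0.0
    pts = pts[np.argsort(pts[:, 0])]
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                   # skip points dominated within the set
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

A more diverse and better-converged front yields a larger value, e.g., hypervolume_2d([[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]]) gives 0.37 versus 0.25 for the single point (0.5, 0.5).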

4.4. Parameter Settings

In the experiments, all of the compared algorithms adopt the parameter settings stated in their original papers. All algorithms are coded on an open-source MATLAB platform called PlatEMO [52]. Prior to evolution, each classification dataset is randomly divided into training and test subsets in a proportion of 70/30 according to the stratified split process [35]. Then, a KNN (K = 5) classification model is utilized with 10-fold cross-validation in order to avoid feature selection bias [53]. Lastly, each experiment is independently run 20 times, the population size for each algorithm is set to 100, and the termination criterion for each algorithm (i.e., the number of objective function evaluations) is set to 10,000, which corresponds to about 100 evolutionary generations.
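The wrapper evaluation can be illustrated with a small stand-in for the paper's KNN setup; the sketch below uses a leave-one-out 5-NN error rate on a masked feature subset rather than the stratified split and 10-fold cross-validation actually used in the experiments:

```python
import numpy as np

def knn_error_rate(x_mask, X, y, k=5):
    """Leave-one-out error rate of a k-NN classifier restricted to the features
    selected by the binary mask x_mask (a simplified stand-in for the paper's
    KNN wrapper with stratified split and 10-fold cross-validation)."""
    Xs = X[:, np.asarray(x_mask, dtype=bool)]
    errors = 0
    for i in range(len(y)):
        d = np.linalg.norm(Xs - Xs[i], axis=1)   # Euclidean distances to all samples
        d[i] = np.inf                            # leave the query point out
        nearest = np.argsort(d)[:k]
        labels, counts = np.unique(y[nearest], return_counts=True)
        if labels[np.argmax(counts)] != y[i]:    # majority vote among k neighbors
            errors += 1
    return errors / len(y)
```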

5. Experimental Studies

In this section, the empirical results are first analyzed in terms of the HV and IGD+ metrics, shown in Table 2 and Table 3, respectively. In addition, the minimum classification error obtained by each algorithm along with the corresponding number of selected features are studied in Table 4. Then, the final nondominated solution distributions in the objective space, shown in Figure 2, are further analyzed. Finally, the computational time cost is analyzed by recording the experimental running time, with the results shown in Table 5.

5.1. Empirical Result Analyses

The general performance of each algorithm on all classification datasets in terms of the HV and IGD+ metrics is shown in Table 2 and Table 3, respectively, with the mean metric values and corresponding standard deviations displayed in every two rows. First, it can be seen from Table 2 and Table 3 that the proposed IPEA performs the best of all the compared algorithms on all tested datasets, and its advantages over the other algorithms are significant on almost all of them. Because both the HV and IGD+ metrics take population diversity and convergence into consideration, the success of IPEA in terms of these two metrics suggests its superiority in both. Moreover, the excellent performance of IPEA on all of the datasets implies an outstandingly effective search ability in tackling bi-objective feature selection not only for low-dimensional datasets but also for high-dimensional ones. In fact, the advantages of IPEA in tackling the high-dimensional datasets appear to be even more obvious, showing huge differences in magnitude in the results in Table 3 due to the sparsity of the feature space and the difficulty of finding more optimal solutions.

5.2. Classification Performance Analyses

The empirical results of each algorithm on each dataset in terms of classification performance are shown in Table 4, with the mean minimum classification error values and corresponding mean numbers of selected features displayed in every two rows. First, it can be seen from Table 4 that the proposed IPEA performs the best of all compared algorithms on all tested datasets in terms of the classification results. Again, the advantages of IPEA over the other algorithms are quite significant on almost all the datasets, with the exception of the first two lower-dimensional ones, which do not fully challenge IPEA's search abilities. The minimum classification error (representing the best classification accuracy) obtained by IPEA is generally much better than that of the other algorithms on each dataset. Moreover, because the corresponding number of selected features (shown as the NSF values in Table 4) generally affects the classification time and cost (as more selected features mean more time and cost), the obviously superior NSF values obtained by IPEA imply excellent computational efficiency, which is confirmed in Section 5.4 below.
The implementation process of the whole wrapper-based evolutionary feature selection experiment is as follows. First, a population of N initial solutions containing different subsets of selected features is evolved generation by generation. During this process, the features selected in each solution are applied only to the training data previously split from the tested dataset in a proportion of 70/30; meanwhile, a KNN (K = 5) model with 10-fold cross-validation is executed for classification, returning the f_1 and f_2 objective values for the current solutions. When the iterative evolution process terminates, a set of nondominated solutions is selected from the last population and applied to the test data to obtain the final classification results shown in Table 4 and the HV and IGD+ performance results shown in Table 2 and Table 3.

5.3. Nondominated Solution Distributions

The final nondominated solution distributions in the objective space obtained by each algorithm on each dataset are shown in Figure 2, with 20 subfigures showing the final nondominated solutions obtained on the training data during evolution. For the sake of fairness, we choose the run with the median HV performance from among the 20 independent runs on each dataset; the median is used instead of the mean because there is no meaningful "mean" of a set of nondominated solution distributions. Overall, it can be seen from Figure 2 that IPEA performs the best among all the algorithms, with more diverse solution distributions and more converged objective values in both the f_1 and f_2 directions.
To be more specific, the nondominated solution distributions of all the algorithms compared in the first four subfigures relate to relatively low-dimensional datasets. In Figure 2a–d the results appear quite similar, although the IPEA still performs the best. However, in the rest of the subfigures, i.e., Figure 2e–t, which are related to relatively high-dimensional datasets, the advantages of the IPEA in terms of the nondominated solution distributions become tremendous, leaving all the other algorithms’ nondominated solution distributions far behind.
Moreover, it can be observed from Figure 2 that the nondominated solution distributions are relatively sparse on the high-dimensional datasets compared to the low-dimensional ones. This is because the feature space (search space) becomes sparser as the total number of features increases, making it much more difficult for MOEAs to find enough optimal solutions during evolution. This phenomenon also affects the proposed IPEA; however, the negative impact is greatly reduced compared with the other algorithms.

5.4. Computational Time Analyses

The mean computational time cost of each algorithm on each dataset is recorded in seconds, with the results shown in Table 5. It can be seen from Table 5 that the mean computational time cost of IPEA is always the smallest among all the algorithms on each dataset, with significant advantages over all other algorithms. Moreover, the advantages of IPEA in terms of computational time are even more significant on the relatively high-dimensional datasets, where it generally incurs only half the other algorithms' mean computational time cost. In fact, considering only the MOEA framework, the theoretical time complexity of IPEA is not very different from that of many other traditional MOEAs, as its general framework inherits the basic ideas of most MOEAs. The worst-case theoretical time complexity of IPEA can be estimated as O(MN²), where M is the number of objectives and N is the population size; this mainly comes from the nondominated sorting process. However, the real running time of IPEA on each dataset is much smaller than this comparison would suggest. This is mainly because IPEA saves a great deal of time during the classification process owing to its better selection of feature subsets, with much smaller numbers of selected features being used for classification. Therefore, IPEA is rather computationally efficient over the whole evolutionary feature selection process, with significant efficiency advantages over the compared algorithms on each tested dataset, including both the relatively low-dimensional and the relatively high-dimensional ones.

6. Conclusions

In this work, an interpolation-based EA (termed the IPEA) is proposed for tackling bi-objective feature selection on both low-dimensional and high-dimensional classification datasets. The design of the IPEA incorporates an interpolation-based initialization method in order to provide a promising hybrid initial population, allowing it to cover a wide range of search space while adaptively exploring the regions of interest. Moreover, an efficient reproduction method is adopted to provide greater population diversity and better algorithm convergence during evolution. The framework of the IPEA is simple but effective, and can adapt to different kinds of optimization environments across a wide variety of features. The outstanding performance and significant advantages of the IPEA have been verified in the experiments and comprehensively analyzed. When compared with four state-of-the-art MOEAs on a list of 20 public real-world classification datasets, the IPEA always performs best in terms of two widely-used performance metrics, showing excellent search abilities. Furthermore, the IPEA also shows the best performance in terms of the final nondominated solution distributions in the objective space, and is proven to be the most computationally efficient among all tested algorithms in terms of the real running time recorded on each tested dataset.
In future work, it is planned to study the feasibility of the proposed algorithm IPEA on additional kinds of discrete optimization problems, such as neural network construction and community node detection.

Funding

This research was funded by National Natural Science Foundation of China (grant number 62103209), the Natural Science Foundation of Fujian Province (grant number 2020J05213), the Scientific Research Project of Putian Science and Technology Bureau (grant number 2021ZP07), the Startup Fund for Advanced Talents of Putian University (grant number 2019002), and the Research Projects of Putian University (grant number JG202306).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.


Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Eiben, A.E.; Smith, J.E. What is an evolutionary algorithm? In Introduction to Evolutionary Computing; Springer: Berlin/Heidelberg, Germany, 2015; pp. 25–48. [Google Scholar]
  2. Coello, C.A.C.; Lamont, G.B.; Van Veldhuizen, D.A. Evolutionary Algorithms for Solving Multi-Objective Problems; Springer: New York, NY, USA, 2007; Volume 5. [Google Scholar]
  3. Zhou, A.; Qu, B.Y.; Li, H.; Zhao, S.Z.; Suganthan, P.N.; Zhang, Q. Multiobjective evolutionary algorithms: A survey of the state of the art. Swarm Evol. Comput. 2011, 1, 32–49. [Google Scholar]
  4. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. [Google Scholar] [CrossRef]
  5. Deb, K.; Jain, H. An Evolutionary Many-Objective Optimization Algorithm Using Reference-Point-Based Nondominated Sorting Approach, Part I: Solving Problems With Box Constraints. IEEE Trans. Evol. Comput. 2014, 18, 577–601. [Google Scholar] [CrossRef]
  6. Tian, Y.; Cheng, R.; Zhang, X.; Su, Y.; Jin, Y. A Strengthened Dominance Relation Considering Convergence and Diversity for Evolutionary Many-Objective Optimization. IEEE Trans. Evol. Comput. 2019, 23, 331–345. [Google Scholar]
  7. Zhang, Q.; Li, H. MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition. IEEE Trans. Evol. Comput. 2007, 11, 712–731. [Google Scholar] [CrossRef]
  8. Li, H.; Zhang, Q. Multiobjective Optimization Problems With Complicated Pareto Sets, MOEA/D and NSGA-II. IEEE Trans. Evol. Comput. 2009, 13, 284–302. [Google Scholar] [CrossRef]
  9. Li, K.; Zhang, Q.; Kwong, S.; Li, M.; Wang, R. Stable Matching-Based Selection in Evolutionary Multiobjective Optimization. IEEE Trans. Evol. Comput. 2014, 18, 909–923. [Google Scholar] [CrossRef]
  10. Xu, H.; Zeng, W.; Zhang, D.; Zeng, X. MOEA/HD: A Multiobjective Evolutionary Algorithm Based on Hierarchical Decomposition. IEEE Trans. Cybern. 2019, 49, 517–526. [Google Scholar] [CrossRef]
  11. Bader, J.; Zitzler, E. HypE: An Algorithm for Fast Hypervolume-Based Many-Objective Optimization. Evol. Comput. 2011, 19, 45–76. [Google Scholar] [CrossRef]
  12. Liang, Z.; Luo, T.; Hu, K.; Ma, X.; Zhu, Z. An Indicator-Based Many-Objective Evolutionary Algorithm With Boundary Protection. IEEE Trans. Cybern. 2021, 51, 4553–4566. [Google Scholar] [CrossRef]
  13. Xu, H.; Zeng, W.; Zeng, X.; Yen, G.G. A Polar-Metric-Based Evolutionary Algorithm. IEEE Trans. Cybern. 2021, 51, 3429–3440. [Google Scholar] [CrossRef] [PubMed]
  14. Lopes, C.L.d.V.; Martins, F.V.C.; Wanner, E.F.; Deb, K. Analyzing Dominance Move (MIP-DoM) Indicator for Multiobjective and Many-Objective Optimization. IEEE Trans. Evol. Comput. 2022, 26, 476–489. [Google Scholar] [CrossRef]
  15. Wang, H.; Jin, Y.; Sun, C.; Doherty, J. Offline data-driven evolutionary optimization using selective surrogate ensembles. IEEE Trans. Evol. Comput. 2018, 23, 203–216. [Google Scholar]
  16. Lin, Q.; Wu, X.; Ma, L.; Li, J.; Gong, M.; Coello, C.A.C. An Ensemble Surrogate-Based Framework for Expensive Multiobjective Evolutionary Optimization. IEEE Trans. Evol. Comput. 2022, 26, 631–645. [Google Scholar] [CrossRef]
  17. Sonoda, T.; Nakata, M. Multiple Classifiers-Assisted Evolutionary Algorithm Based on Decomposition for High-Dimensional Multiobjective Problems. IEEE Trans. Evol. Comput. 2022, 26, 1581–1595. [Google Scholar] [CrossRef]
  18. Da, B.; Gupta, A.; Ong, Y.S.; Feng, L. Evolutionary multitasking across single and multi-objective formulations for improved problem solving. In Proceedings of the 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, Canada, 24–29 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1695–1701. [Google Scholar]
  19. Gupta, A.; Ong, Y.S.; Feng, L.; Tan, K.C. Multiobjective Multifactorial Optimization in Evolutionary Multitasking. IEEE Trans. Cybern. 2017, 47, 1652–1665. [Google Scholar]
  20. Chen, K.; Xue, B.; Zhang, M.; Zhou, F. Evolutionary Multitasking for Feature Selection in High-Dimensional Classification via Particle Swarm Optimization. IEEE Trans. Evol. Comput. 2022, 26, 446–460. [Google Scholar] [CrossRef]
  21. Cao, F.; Tang, Z.; Zhu, C.; Zhao, X. An Efficient Hybrid Multi-Objective Optimization Method Coupling Global Evolutionary and Local Gradient Searches for Solving Aerodynamic Optimization Problems. Mathematics 2023, 11, 3844. [Google Scholar] [CrossRef]
  22. Garces-Jimenez, A.; Gomez-Pulido, J.M.; Gallego-Salvador, N.; Garcia-Tejedor, A.J. Genetic and Swarm Algorithms for Optimizing the Control of Building HVAC Systems Using Real Data: A Comparative Study. Mathematics 2021, 9, 2181. [Google Scholar] [CrossRef]
  23. Ramos-Pérez, J.M.; Miranda, G.; Segredo, E.; León, C.; Rodríguez-León, C. Application of Multi-Objective Evolutionary Algorithms for Planning Healthy and Balanced School Lunches. Mathematics 2021, 9, 80. [Google Scholar] [CrossRef]
  24. Long, S.; Zhang, Y.; Deng, Q.; Pei, T.; Ouyang, J.; Xia, Z. An Efficient Task Offloading Approach Based on Multi-Objective Evolutionary Algorithm in Cloud-Edge Collaborative Environment. IEEE Trans. Netw. Sci. Eng. 2023, 10, 645–657. [Google Scholar] [CrossRef]
  25. Zhang, Z.; Ma, S.; Jiang, X. Research on Multi-Objective Multi-Robot Task Allocation by Lin-Kernighan-Helsgaun Guided Evolutionary Algorithms. Mathematics 2022, 10, 4714. [Google Scholar] [CrossRef]
  26. Xue, Y.; Chen, C.; Slowik, A. Neural Architecture Search Based on a Multi-Objective Evolutionary Algorithm With Probability Stack. IEEE Trans. Evol. Comput. 2023, 27, 778–786. [Google Scholar] [CrossRef]
  27. Ponti, A.; Candelieri, A.; Giordani, I.; Archetti, F. Intrusion Detection in Networks by Wasserstein Enabled Many-Objective Evolutionary Algorithms. Mathematics 2023, 11, 2342. [Google Scholar] [CrossRef]
  28. Zhu, W.; Li, H.; Wei, W. A Two-Stage Multi-Objective Evolutionary Algorithm for Community Detection in Complex Networks. Mathematics 2023, 11, 2702. [Google Scholar] [CrossRef]
  29. Gao, C.; Yin, Z.; Wang, Z.; Li, X.; Li, X. Multilayer Network Community Detection: A Novel Multi-Objective Evolutionary Algorithm Based on Consensus Prior Information [Feature]. IEEE Comput. Intell. Mag. 2023, 18, 46–59. [Google Scholar] [CrossRef]
  30. Xu, H.; Xue, B.; Zhang, M. A Bi-Search Evolutionary Algorithm for High-Dimensional Bi-Objective Feature Selection. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 1–14. [Google Scholar] [CrossRef]
  31. Xu, H.; Xue, B.; Zhang, M. Probe Population Based Initialization and Genetic Pool Based Reproduction for Evolutionary Bi-Objective Feature Selection. IEEE Trans. Evol. Comput. 2024, 1. [Google Scholar] [CrossRef]
  32. Nguyen, B.H.; Xue, B.; Andreae, P.; Ishibuchi, H.; Zhang, M. Multiple Reference Points-Based Decomposition for Multiobjective Feature Selection in Classification: Static and Dynamic Mechanisms. IEEE Trans. Evol. Comput. 2020, 24, 170–184. [Google Scholar] [CrossRef]
  33. Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the 2014 Science and Information Conference, London, UK, 27–29 August 2014; pp. 372–378. [Google Scholar] [CrossRef]
  34. Cheng, F.; Cui, J.; Wang, Q.; Zhang, L. A Variable Granularity Search-Based Multiobjective Feature Selection Algorithm for High-Dimensional Data Classification. IEEE Trans. Evol. Comput. 2023, 27, 266–280. [Google Scholar] [CrossRef]
  35. Xu, H.; Xue, B.; Zhang, M. A Duplication Analysis-Based Evolutionary Algorithm for Biobjective Feature Selection. IEEE Trans. Evol. Comput. 2021, 25, 205–218. [Google Scholar] [CrossRef]
  36. Cheng, F.; Chu, F.; Xu, Y.; Zhang, L. A Steering-Matrix-Based Multiobjective Evolutionary Algorithm for High-Dimensional Feature Selection. IEEE Trans. Cybern. 2022, 52, 9695–9708. [Google Scholar] [CrossRef] [PubMed]
  37. Xu, H.; Xue, B.; Zhang, M. Segmented Initialization and Offspring Modification in Evolutionary Algorithms for Bi-Objective Feature Selection. In Proceedings of the 2020 Genetic and Evolutionary Computation Conference, GECCO’20, Cancun, Mexico, 8–12 July 2020; pp. 444–452. [Google Scholar]
  38. De La Iglesia, B. Evolutionary computation for feature selection in classification problems. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2013, 3, 381–407. [Google Scholar]
  39. Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A survey on evolutionary computation approaches to feature selection. IEEE Trans. Evol. Comput. 2015, 20, 606–626. [Google Scholar]
  40. Dokeroglu, T.; Deniz, A.; Kiziloz, H.E. A comprehensive survey on recent metaheuristics for feature selection. Neurocomputing 2022, 494, 269–296. [Google Scholar]
  41. Mukhopadhyay, A.; Maulik, U. An SVM-wrapped multiobjective evolutionary feature selection approach for identifying cancer-microRNA markers. IEEE Trans. Nanobiosci. 2013, 12, 275–281. [Google Scholar]
  42. Vignolo, L.D.; Milone, D.H.; Scharcanski, J. Feature selection for face recognition based on multi-objective evolutionary wrappers. Expert Syst. Appl. 2013, 40, 5077–5084. [Google Scholar]
  43. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182. [Google Scholar]
  44. Lazar, C.; Taminau, J.; Meganck, S.; Steenhoff, D.; Coletta, A.; Molter, C.; de Schaetzen, V.; Duque, R.; Bersini, H.; Nowe, A. A survey on filter techniques for feature selection in gene expression microarray analysis. IEEE/ACM Trans. Comput. Biol. Bioinform. TCBB 2012, 9, 1106–1119. [Google Scholar]
  45. Xue, B.; Cervante, L.; Shang, L.; Browne, W.N.; Zhang, M. Multi-objective evolutionary algorithms for filter based feature selection in classification. Int. J. Artif. Intell. Tools 2013, 22, 1350024. [Google Scholar]
  46. Xue, B.; Zhang, M.; Browne, W.N. Particle swarm optimisation for feature selection in classification: Novel initialisation and updating mechanisms. Appl. Soft Comput. 2014, 18, 261–276. [Google Scholar]
  47. Chen, K.; Xue, B.; Zhang, M.; Zhou, F. An Evolutionary Multitasking-Based Feature Selection Method for High-Dimensional Classification. IEEE Trans. Cybern. 2022, 52, 7172–7186. [Google Scholar] [CrossRef] [PubMed]
  48. Jiao, R.; Xue, B.; Zhang, M. Solving Multi-objective Feature Selection Problems in Classification via Problem Reformulation and Duplication Handling. IEEE Trans. Evol. Comput. 2022, 28, 846–860. [Google Scholar] [CrossRef]
  49. Kelly, M.; Longjohn, R.; Nottingham, K. The UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu (accessed on 18 August 2024).
  50. While, L.; Hingston, P.; Barone, L.; Huband, S. A faster algorithm for calculating Hypervolume. IEEE Trans. Evol. Comput. 2006, 10, 29–38. [Google Scholar]
  51. Ishibuchi, H.; Imada, R.; Masuyama, N.; Nojima, Y. Comparison of Hypervolume, IGD and IGD+ from the Viewpoint of Optimal Distributions of Solutions. In Proceedings of the Evolutionary Multi-Criterion Optimization 2019, East Lansing, MI, USA, 10–13 March 2019; Deb, K., Goodman, E., Coello Coello, C.A., Klamroth, K., Miettinen, K., Mostaghim, S., Reed, P., Eds.; Springer: Cham, Switzerland, 2019; pp. 332–345. [Google Scholar]
  52. Tian, Y.; Cheng, R.; Zhang, X.; Jin, Y. PlatEMO: A MATLAB Platform for Evolutionary Multi-Objective Optimization. IEEE Comput. Intell. Mag. 2017, 12, 73–87. [Google Scholar]
  53. Tran, B.; Xue, B.; Zhang, M.; Nguyen, S. Investigation on particle swarm optimisation for feature selection on high-dimensional data: Local search and selection bias. Connect. Sci. 2016, 28, 270–294. [Google Scholar]
Figure 1. An example of how the adaptive interpolation-based initialization process in the IPEA works when the population size is set to N = 16 .
Figure 2. Nondominated solution distributions in the objective space corresponding to the runs of median HV performances obtained by each algorithm on each dataset. (a) Sonar. (b) HillValley. (c) Synthetic. (d) Arrhythmia. (e) Yale. (f) Colon. (g) SRBCT. (h) AR10P. (i) PIE10P. (j) Leukemia1. (k) Tumor9. (l) Brain1. (m) Leukemia2. (n) CNS. (o) ALLAML. (p) Nci9. (q) Pixraw10P. (r) Orlraws10P. (s) Brain2. (t) Prostate.
Table 1. Attributes of each classification dataset used as test problems.
No.  Dataset      Features  Samples  Classes
1    Sonar            60      208       2
2    HillValley      100      606       2
3    Synthetic       100      200       5
4    Arrhythmia      278      452      13
5    Yale           1024      165      15
6    Colon          2000       62       2
7    SRBCT          2308       83       4
8    AR10P          2400      130      10
9    PIE10P         2420      210      10
10   Leukemia1      5327       72       3
11   Tumor9         5726       60       9
12   Brain1         5920       90       5
13   Leukemia2      7070       72       2
14   CNS            7129       60       2
15   ALLAML         7129       72       2
16   Nci9           9712       60       9
17   Pixraw10P    10,000      100      10
18   Orlraws10P   10,304      100      10
19   Brain2       10,367       50       4
20   Prostate     10,509      102       2
Table 2. Mean HV performance of each algorithm on each dataset, with the best results marked in gray (the greater the better) and insignificant differences prefixed by ★.
Dataset      IPEA                   NSGA-II                 MOEA/D                  MOEA/HD                 PMEA
Sonar        8.072e−01 ±2.36e−02    ★7.955e−01 ±2.59e−02    ★7.940e−01 ±2.87e−02    ★8.027e−01 ±2.53e−02    7.740e−01 ±3.09e−02
HillValley   6.310e−01 ±1.08e−02    5.849e−01 ±2.51e−02     ★6.259e−01 ±1.04e−02    ★6.259e−01 ±8.54e−03    5.819e−01 ±2.18e−02
Synthetic    4.343e−01 ±3.76e−02    3.846e−01 ±3.88e−02     3.852e−01 ±4.16e−02     3.956e−01 ±5.93e−02     3.860e−01 ±3.74e−02
Arrhythmia   6.932e−01 ±1.64e−02    6.308e−01 ±2.66e−02     6.655e−01 ±1.58e−02     4.826e−01 ±1.69e−02     4.715e−01 ±4.16e−02
Yale         7.043e−01 ±2.72e−02    4.878e−01 ±1.34e−02     5.127e−01 ±2.62e−02     4.920e−01 ±3.39e−02     4.868e−01 ±1.92e−02
Colon        8.990e−01 ±3.06e−02    5.500e−01 ±2.65e−02     6.078e−01 ±4.78e−02     5.239e−01 ±3.30e−02     5.387e−01 ±2.68e−02
SRBCT        9.230e−01 ±5.64e−02    2.841e−01 ±2.05e−03     3.178e−01 ±2.03e−03     2.554e−01 ±1.81e−03     2.759e−01 ±2.03e−03
AR10P        7.778e−01 ±3.52e−02    3.631e−01 ±2.01e−02     3.709e−01 ±2.26e−02     3.449e−01 ±1.75e−02     3.536e−01 ±1.48e−02
PIE10P       9.347e−01 ±2.07e−02    6.023e−01 ±1.06e−02     6.458e−01 ±1.13e−02     5.883e−01 ±1.22e−02     5.902e−01 ±9.87e−03
Leukemia1    9.647e−01 ±2.43e−02    5.290e−01 ±1.81e−02     5.431e−01 ±3.10e−02     5.132e−01 ±2.40e−02     5.160e−01 ±1.93e−02
Tumor9       4.897e−01 ±5.65e−02    2.790e−01 ±2.80e−02     2.832e−01 ±2.54e−02     2.673e−01 ±2.35e−02     2.789e−01 ±2.24e−02
Brain1       7.844e−01 ±4.93e−02    4.718e−01 ±3.11e−03     4.906e−01 ±1.09e−02     4.535e−01 ±1.00e−02     4.698e−01 ±2.77e−03
Leukemia2    9.337e−01 ±6.48e−02    5.360e−01 ±8.95e−03     5.450e−01 ±1.94e−02     5.126e−01 ±1.79e−02     5.270e−01 ±1.55e−02
CNS          6.842e−01 ±6.93e−02    3.781e−01 ±3.28e−02     3.743e−01 ±3.32e−02     3.707e−01 ±3.25e−02     3.663e−01 ±3.52e−02
ALLAML       9.813e−01 ±3.66e−02    5.205e−01 ±1.52e−02     5.358e−01 ±1.34e−02     5.060e−01 ±1.44e−02     5.148e−01 ±1.52e−02
Nci9         4.712e−01 ±8.13e−02    2.406e−01 ±2.54e−02     2.616e−01 ±2.94e−02     2.370e−01 ±2.21e−02     2.447e−01 ±2.46e−02
Pixraw10P    9.802e−01 ±2.83e−02    5.795e−01 ±2.22e−03     5.911e−01 ±9.88e−03     5.640e−01 ±7.64e−03     5.778e−01 ±3.00e−03
Orlraws10P   9.605e−01 ±3.28e−02    5.390e−01 ±7.53e−03     5.447e−01 ±9.58e−03     5.297e−01 ±8.00e−03     5.392e−01 ±6.12e−03
Brain2       7.181e−01 ±5.65e−02    3.903e−01 ±2.15e−02     3.824e−01 ±2.43e−02     3.782e−01 ±2.12e−02     3.790e−01 ±1.80e−02
Prostate     9.486e−01 ±2.31e−02    4.629e−01 ±1.29e−02     4.599e−01 ±1.51e−02     4.559e−01 ±1.16e−02     4.598e−01 ±2.09e−02
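For reference, the HV values in Table 2 reward fronts that are both convergent and well spread. For a bi-objective minimization front, HV with respect to a reference point can be computed by a simple sweep; the sketch below is illustrative code (not the experimental implementation) and assumes objectives normalized to [0, 1] with a reference point of (1, 1), which may differ from the paper's exact settings:

```python
def hypervolume_2d(front, ref=(1.0, 1.0)):
    """Hypervolume (larger is better) of a 2-objective minimization front
    with respect to reference point `ref`; assumes all points dominate ref."""
    pts = sorted(front)                  # ascending in the first objective
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                 # skip points dominated within the set
            hv += (ref[0] - f1) * (prev_f2 - f2)   # add the new rectangle
            prev_f2 = f2
    return hv
```

For instance, the front {(0.2, 0.8), (0.5, 0.4)} covers the rectangles [0.2, 1]×[0.8, 1] and [0.5, 1]×[0.4, 0.8], giving an HV of 0.36.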
Table 3. Mean IGD+ performance of each algorithm on each dataset, with the best results marked in gray (the smaller the better) and insignificant differences prefixed by ★.
Dataset      IPEA                   NSGA-II                 MOEA/D                  MOEA/HD                 PMEA
Sonar        5.352e−02 ±1.54e−02    ★6.239e−02 ±2.31e−02    ★6.637e−02 ±2.18e−02    ★5.961e−02 ±1.70e−02    7.886e−02 ±2.94e−02
HillValley   2.504e−02 ±6.73e−03    6.517e−02 ±2.45e−02     ★2.614e−02 ±9.07e−03    ★2.688e−02 ±1.06e−02    7.049e−02 ±2.43e−02
Synthetic    9.448e−02 ±2.35e−02    1.408e−01 ±3.14e−02     1.341e−01 ±3.52e−02     1.328e−01 ±4.19e−02     1.402e−01 ±3.06e−02
Arrhythmia   1.767e−02 ±7.17e−03    6.646e−02 ±1.92e−02     3.616e−02 ±8.31e−03     2.092e−01 ±1.40e−02     2.317e−01 ±4.63e−02
Yale         2.232e−02 ±1.19e−02    3.175e−01 ±1.06e−02     2.643e−01 ±1.39e−02     3.197e−01 ±1.47e−02     3.166e−01 ±8.91e−03
Colon        6.077e−02 ±2.96e−02    3.889e−01 ±1.69e−02     3.241e−01 ±3.84e−02     4.155e−01 ±2.42e−02     4.011e−01 ±2.01e−02
SRBCT        3.704e−02 ±3.28e−02    6.225e−01 ±3.15e−03     5.751e−01 ±2.55e−03     6.693e−01 ±3.12e−03     6.352e−01 ±3.24e−03
AR10P        5.458e−02 ±2.13e−02    4.639e−01 ±1.59e−02     4.353e−01 ±1.92e−02     4.810e−01 ±1.51e−02     4.725e−01 ±1.27e−02
PIE10P       1.366e−02 ±8.19e−03    3.730e−01 ±8.95e−03     3.164e−01 ±1.00e−02     3.867e−01 ±9.94e−03     3.806e−01 ±6.47e−03
Leukemia1    1.942e−02 ±1.34e−02    4.343e−01 ±7.64e−03     4.077e−01 ±2.30e−02     4.502e−01 ±1.30e−02     4.415e−01 ±9.49e−03
Tumor9       9.052e−02 ±5.05e−02    4.515e−01 ±1.79e−02     4.323e−01 ±1.43e−02     4.650e−01 ±1.20e−02     4.516e−01 ±1.36e−02
Brain1       6.266e−02 ±3.95e−02    4.300e−01 ±4.38e−03     4.036e−01 ±1.53e−02     4.558e−01 ±1.41e−02     4.328e−01 ±3.90e−03
Leukemia2    7.277e−02 ±7.13e−02    4.499e−01 ±6.09e−03     4.349e−01 ±1.71e−02     4.738e−01 ±1.31e−02     4.571e−01 ±1.05e−02
CNS          1.806e−01 ±7.62e−02    5.032e−01 ±2.82e−02     4.974e−01 ±3.43e−02     5.127e−01 ±2.91e−02     5.165e−01 ±3.02e−02
ALLAML       2.056e−02 ±4.03e−02    4.608e−01 ±1.01e−02     4.411e−01 ±9.58e−03     4.758e−01 ±1.06e−02     4.636e−01 ±9.55e−03
Nci9         9.879e−02 ±6.21e−02    4.775e−01 ±1.43e−02     4.513e−01 ±1.49e−02     5.029e−01 ±1.07e−02     4.756e−01 ±1.19e−02
Pixraw10P    1.172e−02 ±1.88e−02    4.431e−01 ±2.51e−03     4.300e−01 ±1.12e−02     4.607e−01 ±8.65e−03     4.450e−01 ±3.40e−03
Orlraws10P   1.684e−02 ±1.71e−02    4.491e−01 ±3.95e−03     4.386e−01 ±4.68e−03     4.603e−01 ±5.75e−03     4.498e−01 ±3.23e−03
Brain2       1.001e−01 ±4.86e−02    4.813e−01 ±1.33e−02     4.828e−01 ±1.94e−02     4.934e−01 ±1.34e−02     4.888e−01 ±1.11e−02
Prostate     1.465e−02 ±1.95e−02    4.890e−01 ±1.09e−02     4.865e−01 ±1.20e−02     4.968e−01 ±9.49e−03     4.929e−01 ±1.55e−02
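The IGD+ indicator reported in Table 3 measures, roughly, how far a reference front is from being weakly dominated by the obtained solutions [51]: for each reference point, only the objective components in which a solution is worse contribute to the distance. A minimal sketch for minimization problems (illustrative code, not the experimental implementation):

```python
import math

def igd_plus(front, ref_set):
    """IGD+ (smaller is better) of solution set `front` against the
    reference set `ref_set`, for minimization problems."""
    def d_plus(a, z):
        # Count only components where the solution a is worse than z.
        return math.sqrt(sum(max(ai - zi, 0.0) ** 2 for ai, zi in zip(a, z)))
    return sum(min(d_plus(a, z) for a in front) for z in ref_set) / len(ref_set)
```

Note that a solution that beats a reference point in some objective incurs no penalty in that component, which is what distinguishes IGD+ from plain IGD.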
Table 4. Mean minimum classification errors obtained by each algorithm on each dataset, with the best results marked in gray (the smaller the better) and insignificant differences prefixed by ★. In addition, the mean rounded number of selected features (NSF) related to the obtained minimum classification error is shown beneath as NSF.
Dataset      IPEA                 NSGA-II               MOEA/D                MOEA/HD               PMEA
Sonar        1.957e−01 (NSF 6)    ★2.000e−01 (NSF 7)    ★2.065e−01 (NSF 6)    ★2.032e−01 (NSF 5)    ★2.097e−01 (NSF 7)
HillValley   3.984e−01 (NSF 4)    4.211e−01 (NSF 10)    ★4.048e−01 (NSF 3)    ★4.038e−01 (NSF 4)    4.218e−01 (NSF 11)
Synthetic    6.161e−01 (NSF 8)    6.439e−01 (NSF 15)    6.600e−01 (NSF 9)     6.433e−01 (NSF 12)    6.406e−01 (NSF 15)
Arrhythmia   3.317e−01 (NSF 8)    3.775e−01 (NSF 17)    3.530e−01 (NSF 12)    4.650e−01 (NSF 51)    4.566e−01 (NSF 58)
Yale         3.185e−01 (NSF 10)   3.519e−01 (NSF 329)   3.711e−01 (NSF 265)   3.504e−01 (NSF 332)   3.459e−01 (NSF 323)
Colon        1.088e−01 (NSF 3)    2.070e−01 (NSF 703)   1.947e−01 (NSF 552)   2.246e−01 (NSF 735)   2.123e−01 (NSF 720)
SRBCT        7.600e−02 (NSF 4)    6.400e−01 (NSF 810)   6.400e−01 (NSF 610)   6.400e−01 (NSF 990)   6.400e−01 (NSF 866)
AR10P        2.408e−01 (NSF 7)    4.958e−01 (NSF 909)   5.150e−01 (NSF 780)   5.150e−01 (NSF 930)   5.075e−01 (NSF 931)
PIE10P       7.389e−02 (NSF 11)   9.722e−02 (NSF 917)   1.044e−01 (NSF 767)   1.028e−01 (NSF 933)   1.050e−01 (NSF 927)
Leukemia1    4.697e−02 (NSF 3)    1.636e−01 (NSF 2236)  1.833e−01 (NSF 2052)  1.712e−01 (NSF 2308)  1.773e−01 (NSF 2251)
Tumor9       5.537e−01 (NSF 6)    6.019e−01 (NSF 2455)  6.148e−01 (NSF 2329)  6.037e−01 (NSF 2517)  5.944e−01 (NSF 2466)
Brain1       2.370e−01 (NSF 4)    2.593e−01 (NSF 2499)  2.593e−01 (NSF 2321)  2.593e−01 (NSF 2640)  2.593e−01 (NSF 2512)
Leukemia2    6.818e−02 (NSF 2)    1.318e−01 (NSF 3056)  1.470e−01 (NSF 2908)  1.439e−01 (NSF 3181)  1.500e−01 (NSF 3067)
CNS          3.537e−01 (NSF 2)    4.056e−01 (NSF 3104)  4.370e−01 (NSF 2956)  4.167e−01 (NSF 3172)  4.352e−01 (NSF 3122)
ALLAML       1.818e−02 (NSF 2)    1.576e−01 (NSF 3081)  1.652e−01 (NSF 2957)  1.606e−01 (NSF 3181)  1.621e−01 (NSF 3077)
Nci9         5.860e−01 (NSF 6)    6.491e−01 (NSF 4278)  6.421e−01 (NSF 4092)  6.404e−01 (NSF 4606)  6.526e−01 (NSF 4292)
Pixraw10P    1.778e−02 (NSF 2)    3.333e−02 (NSF 4428)  3.333e−02 (NSF 4277)  3.333e−02 (NSF 4628)  3.333e−02 (NSF 4455)
Orlraws10P   3.667e−02 (NSF 7)    1.033e−01 (NSF 4582)  1.100e−01 (NSF 4463)  1.044e−01 (NSF 4702)  1.022e−01 (NSF 4590)
Brain2       3.089e−01 (NSF 4)    3.778e−01 (NSF 4637)  3.978e−01 (NSF 4572)  3.933e−01 (NSF 4733)  3.844e−01 (NSF 4656)
Prostate     5.484e−02 (NSF 3)    2.462e−01 (NSF 4718)  2.591e−01 (NSF 4585)  2.452e−01 (NSF 4797)  2.462e−01 (NSF 4721)
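Table 4 pairs each minimum classification error with the number of selected features (NSF), i.e., the two objectives evaluated in wrapper-based feature selection. The following is a toy sketch of such a bi-objective evaluation using a simple 1-NN classifier on illustrative data; the classifier, data split, and function names here are assumptions for illustration and may differ from the paper's actual experimental protocol:

```python
import math

def one_nn_error(train, test, mask):
    """Error rate of a 1-NN classifier restricted to features where mask[i]=1.
    `train` and `test` are lists of (feature_vector, label) pairs."""
    idx = [i for i, m in enumerate(mask) if m]
    def dist(a, b):
        return math.sqrt(sum((a[i] - b[i]) ** 2 for i in idx))
    errors = 0
    for x, y in test:
        pred = min(train, key=lambda t: dist(t[0], x))[1]  # nearest neighbor's label
        errors += (pred != y)
    return errors / len(test)

def evaluate(mask, train, test):
    """Bi-objective fitness: (classification error, fraction of selected features)."""
    if not any(mask):                 # an empty subset is penalized outright
        return (1.0, 0.0)
    return (one_nn_error(train, test, mask), sum(mask) / len(mask))
```

On a toy dataset whose first feature separates the classes and whose second is noise, the mask [1, 0] would achieve zero error using half of the features, illustrating how a small, well-chosen subset can win on both objectives.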
Table 5. Mean computational time cost in seconds for each algorithm run on each dataset, with the best results marked in gray (the smaller the better) and insignificant differences prefixed by ★.
Dataset      IPEA                   NSGA-II                MOEA/D                 MOEA/HD                PMEA
Sonar        3.881e+01 ±3.25e−01    3.923e+01 ±4.31e−01    4.050e+01 ±3.27e−01    3.944e+01 ±4.60e−01    3.937e+01 ±3.51e−01
HillValley   1.455e+02 ±2.17e+00    1.723e+02 ±7.25e+00    1.506e+02 ±1.46e+00    1.634e+02 ±4.00e+00    1.758e+02 ±3.90e+00
Synthetic    3.990e+01 ±5.41e−01    4.278e+01 ±1.25e+00    4.311e+01 ±6.22e−01    4.353e+01 ±8.03e−01    4.257e+01 ±6.76e−01
Arrhythmia   1.333e+02 ±1.21e+00    1.571e+02 ±1.18e+01    1.516e+02 ±6.88e+00    1.957e+02 ±5.01e+00    1.887e+02 ±7.70e+00
Yale         2.251e+02 ±4.75e+00    3.871e+02 ±9.78e+00    3.698e+02 ±9.55e+00    4.042e+02 ±1.49e+01    3.789e+02 ±1.05e+01
Colon        8.224e+01 ±3.08e+00    1.926e+02 ±6.56e+00    1.666e+02 ±7.77e+00    1.946e+02 ±8.58e+00    1.823e+02 ±7.88e+00
SRBCT        2.754e+02 ±3.46e+00    4.754e+02 ±6.34e+00    4.368e+02 ±5.39e+00    5.117e+02 ±7.46e+00    4.727e+02 ±5.93e+00
AR10P        6.210e+02 ±1.78e+01    1.087e+03 ±2.46e+01    1.053e+03 ±3.21e+01    1.119e+03 ±3.56e+01    1.075e+03 ±3.68e+01
PIE10P       1.167e+03 ±2.69e+01    2.145e+03 ±1.28e+02    2.087e+03 ±1.08e+02    2.309e+03 ±1.18e+02    2.113e+03 ±8.12e+01
Leukemia1    8.660e+02 ±3.40e+01    1.586e+03 ±1.12e+02    1.499e+03 ±1.00e+02    1.614e+03 ±1.13e+02    1.499e+03 ±5.18e+01
Tumor9       7.800e+02 ±4.19e+01    1.441e+03 ±1.05e+02    1.379e+03 ±1.09e+02    1.458e+03 ±1.07e+02    1.338e+03 ±5.66e+01
Brain1       1.269e+03 ±5.98e+01    2.641e+03 ±2.46e+02    2.542e+03 ±2.36e+02    2.668e+03 ±2.35e+02    2.408e+03 ±4.55e+01
Leukemia2    1.210e+03 ±5.99e+01    2.529e+03 ±2.35e+02    2.402e+03 ±2.17e+02    2.541e+03 ±2.37e+02    2.286e+03 ±4.97e+01
CNS          1.022e+03 ±5.68e+01    1.874e+03 ±1.34e+02    1.792e+03 ±1.42e+02    1.905e+03 ±1.50e+02    1.769e+03 ±5.26e+01
ALLAML       1.241e+03 ±6.69e+01    2.579e+03 ±2.69e+02    2.461e+03 ±2.53e+02    2.598e+03 ±2.61e+02    2.306e+03 ±1.78e+01
Nci9         1.490e+03 ±8.17e+01    2.924e+03 ±2.76e+02    2.832e+03 ±3.03e+02    3.042e+03 ±3.01e+02    2.678e+03 ±2.47e+01
Pixraw10P    2.721e+03 ±1.87e+02    5.213e+03 ±4.87e+02    5.096e+03 ±4.98e+02    5.320e+03 ±5.02e+02    4.774e+03 ±4.74e+01
Orlraws10P   2.841e+03 ±1.78e+02    5.396e+03 ±5.12e+02    5.272e+03 ±5.59e+02    4.905e+03 ±1.51e+02    4.833e+03 ±1.25e+02
Brain2       1.438e+03 ±1.05e+02    2.469e+03 ±6.16e+01    2.313e+03 ±5.46e+01    2.493e+03 ±7.12e+01    2.438e+03 ±5.03e+01
Prostate     3.098e+03 ±2.09e+02    5.129e+03 ±1.59e+02    4.943e+03 ±1.24e+02    5.198e+03 ±1.78e+02    5.107e+03 ±1.31e+02
