A Knowledge-Guided Competitive Co-Evolutionary Algorithm for Feature Selection
Abstract
1. Introduction
- We propose a correlation-based feature grouping method and, building on it, introduce a Knowledge-Guided Evolutionary Algorithm (KGEA). First, the correlations between features are assessed using Spearman’s correlation coefficient. Features are then grouped by comparing these correlations against a predefined optimal threshold. Finally, the resulting grouping information serves as knowledge to guide the evolutionary process, significantly enhancing both the speed and the quality of evolutionary feature selection;
- We design a Knowledge-Guided Competitive–Cooperative Evolutionary Algorithm (KCCEA) framework. Initially, an allocation ratio divides the computational resources so that the two constituent algorithms cooperate. During the evolutionary process, this ratio is updated dynamically according to each algorithm’s success rate in producing superior offspring, so that the two algorithms also compete. This mechanism improves both search speed and solution quality;
- To verify the performance of the proposed methods, we conduct a series of experiments. The experimental results demonstrate that the proposed methods effectively enhance the performance of evolutionary algorithms in feature selection tasks.
2. Background
2.1. Related Work
2.1.1. Multi-Objective Optimization Problem
- Maximize all sub-objective functions;
- Minimize all sub-objective functions;
- Maximize some sub-objective functions while minimizing others.
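Central to all three forms is the notion of Pareto dominance. As a concrete illustration, a minimal Python sketch for the common all-minimization form (the function name `dominates` is our own):

```python
from typing import Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if solution a Pareto-dominates b, assuming all
    sub-objectives are minimized: a is no worse than b on every
    objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Example with a (classification error, feature count) pair:
# (0.10, 5) dominates (0.12, 7), but (0.10, 7) and (0.12, 5)
# are mutually non-dominated trade-offs.
print(dominates((0.10, 5), (0.12, 7)))  # True
```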
2.1.2. Non-Dominated Sorting Genetic Algorithm II
- Solutions in the first front are non-dominated: no other solution in the population dominates them;
- Each subsequent front contains solutions that are dominated only by solutions in the preceding fronts.
1. Population Generation: NSGA-II begins with the generation of a random initial population;
2. Evaluation and Sorting: Each solution is evaluated and ranked based on objective function performance and then sorted into fronts;
3. Selection: Selection favors solutions on better-ranked fronts. Within these fronts, a further selection criterion based on crowding distance determines choice, emphasizing less crowded areas to enhance diversity among the chosen solutions (a sketch of this computation follows the list);
4. Genetic Operations: Following selection, genetic mechanisms like crossover and mutation are employed to generate new solutions, aiding in the exploration and expansion of the solution space;
5. Elitism and Generational Transition: To ensure robustness across generations, NSGA-II integrates elitism, preserving top-performing solutions and carrying them forward. The blended population of parents and offspring undergoes another round of sorting and selection, emphasizing the superior solutions for the next generation.
2.2. Major Motivations
3. Knowledge-Guided Evolutionary Algorithm
3.1. Feature Grouping Based on Correlation
Algorithm 1 Feature_Grouping(correlation_matrix, threshold)
Input: correlation_matrix of size n × n, threshold
Output: list of feature groups groups
1: Initialize groups to an empty list
2: n ← number of columns in correlation_matrix
3: for i ← 1 to n do
4:     placed ← false
5:     for each group in groups do
6:         if correlation_matrix[i][j] ≥ threshold for all j in group then
7:             Append i to group
8:             placed ← true
9:             Break
10:        end if
11:    end for
12:    if not placed then
13:        Append {i} to groups
14:    end if
15: end for
16: return groups
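A runnable Python sketch of Algorithm 1 under our reading of the reconstructed pseudocode (the identifiers `groups`, `n`, and `placed` above were lost in extraction and are our own; whether the comparison should use the absolute correlation is an assumption, since Spearman’s coefficient can be negative):

```python
import numpy as np
from scipy.stats import spearmanr

def feature_grouping(correlation_matrix: np.ndarray, threshold: float) -> list[list[int]]:
    """Greedily group features: feature i joins the first existing group
    whose every member correlates with i at or above the threshold;
    otherwise i starts a new group."""
    groups: list[list[int]] = []
    n = correlation_matrix.shape[1]
    for i in range(n):
        placed = False
        for group in groups:
            # Assumption: compare absolute correlations so that strongly
            # negatively correlated features are also grouped together.
            if all(abs(correlation_matrix[i, j]) >= threshold for j in group):
                group.append(i)
                placed = True
                break
        if not placed:
            groups.append([i])
    return groups

# Usage: build the Spearman correlation matrix from a data matrix X
# (rows = instances, columns = features), then group.
X = np.random.rand(100, 8)
corr, _ = spearmanr(X)  # 8 x 8 matrix of pairwise coefficients
print(feature_grouping(corr, 0.7))
```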
3.2. Knowledge-Guided Crossover and Mutation
4. Competition–Cooperation Evolution
4.1. Competition–Cooperation Evolutionary Mechanism
4.2. Dynamic Resource Allocation Based on Success Rate
5. Experimental Setup
5.1. Classification Datasets
5.2. Performance Metrics
5.2.1. Hypervolume
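As a concrete reference for this metric, a minimal Python sketch of the standard hypervolume computation, assuming the bi-objective minimization case (the reference point and function name are our own illustration):

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a 2-objective minimization front: the total area
    dominated by the front and bounded above by the reference point."""
    # Keep points that actually dominate the reference point; sort by f1.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:  # skip points dominated within the sweep
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

# Two trade-off points measured against reference point (1.0, 1.0).
print(hypervolume_2d([(0.2, 0.8), (0.6, 0.3)], (1.0, 1.0)))  # 0.36
```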
5.2.2. Inverted Generational Distance
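Likewise, a minimal sketch of IGD, which averages, over a reference Pareto front, each reference point’s distance to its nearest obtained solution (Euclidean distance is the usual choice; lower is better):

```python
import math

def igd(reference_front, obtained_front):
    """Inverted Generational Distance: mean Euclidean distance from each
    reference-front point to the closest point in the obtained front."""
    return sum(
        min(math.dist(r, s) for s in obtained_front)
        for r in reference_front
    ) / len(reference_front)

print(igd([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)],
          [(0.1, 0.9), (0.9, 0.1)]))  # ~0.283
```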
5.3. Parameter Settings
6. Experimental Results and Analysis
6.1. Grouping Threshold Sensitivity Analysis
6.2. Effectiveness of Knowledge-Guided Evolution
6.3. Effectiveness of Dynamic Resource Allocation
6.4. Knowledge-Guided Competitive–Cooperative Evolutionary Algorithm against the Baseline Algorithm
6.5. Analysis of Pareto Front Distributions
6.6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Special Case | Allocation_Rate
---|---
allocation_rate > 0.9 | 0.9
allocation_rate < 0.1 | 0.1
success_rate1 = 0 AND success_rate2 ≠ 0 | 0.1
success_rate1 ≠ 0 AND success_rate2 = 0 | 0.9
success_rate1 = 0 AND success_rate2 = 0 | 0.5
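A minimal Python sketch of the dynamic allocation update implied by this table. Only the special cases above are given by the source; the base rule of splitting resources in proportion to the two success rates is our own assumption:

```python
def update_allocation_rate(success_rate1: float, success_rate2: float) -> float:
    """Update the share of offspring produced by algorithm 1.

    Assumption: the base rule allocates in proportion to each
    algorithm's success rate; the special cases follow the table.
    """
    if success_rate1 == 0 and success_rate2 == 0:
        return 0.5  # no evidence either way: split resources evenly
    if success_rate1 == 0:
        return 0.1  # keep a minimum 10% share for algorithm 1
    if success_rate2 == 0:
        return 0.9  # keep a minimum 10% share for algorithm 2
    rate = success_rate1 / (success_rate1 + success_rate2)
    return min(0.9, max(0.1, rate))  # clamp to [0.1, 0.9]

print(update_allocation_rate(0.30, 0.10))  # 0.75
print(update_allocation_rate(0.00, 0.25))  # 0.1
```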
No. | Dataset Name | Instances | Features | Classes
---|---|---|---|---
1 | Wine | 178 | 13 | 3
2 | MUSK1 | 476 | 166 | 2
3 | LSVT_Voice | 126 | 310 | 2
4 | ISOLET5 | 1559 | 617 | 26
5 | Toxic | 171 | 1203 | 2
Dataset | Optimal Threshold | Features | Groups
---|---|---|---
1 | 0.65 | 13 | 11
2 | 0.76 | 166 | 71
3 | 0.77 | 310 | 96
4 | 0.72 | 617 | 260
5 | 0.75 | 1203 | 350
Dataset | KCCEA (Optimal) Mean | KCCEA (Optimal) Variance | KCCEA (0.5) Mean | KCCEA (0.5) Variance
---|---|---|---|---
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
Dataset | NSGA-II Mean | NSGA-II Variance | KGEA Mean | KGEA Variance | KCCEA Mean | KCCEA Variance
---|---|---|---|---|---|---
1 | | | | | |
2 | | | | | |
3 | | | | | |
4 | | | | | |
5 | | | | | |
Dataset | NSGA-II Mean | NSGA-II Variance | KGEA Mean | KGEA Variance | KCCEA Mean | KCCEA Variance
---|---|---|---|---|---|---
1 | | | | | |
2 | | | | | |
3 | | | | | |
4 | | | | | |
5 | | | | | |