A Model for Feature Selection with Binary Particle Swarm Optimisation and Synthetic Features
Abstract
1. Introduction
2. Literature Review
2.1. Particle Swarm Optimisation
2.2. Long Short-Term Memory Networks
- An Input gate: Responsible for determining which new information is written to the current state of the LSTM cell, by applying a sigmoid function and a tanh function.
- A Forget gate: Responsible for determining which information from the previous cell state should be retained and which should be forgotten or discarded.
- An Output gate: Determines what information is passed out of the LSTM cell as the hidden state.
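The three gates above follow the standard LSTM cell formulation; the sketch below is a minimal NumPy illustration of one time step, not the authors' implementation (weight names `W`, `U`, `b` and the dict layout are assumptions for readability).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b are dicts keyed by gate: i, f, o, g."""
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell state
    c = f * c_prev + i * g        # forget old state, admit new information
    h = o * np.tanh(c)            # expose a gated view of the cell state
    return h, c
```

The forget gate multiplies the previous cell state (discarding information when `f` is near 0), while the input gate scales the candidate update, matching the gate roles described above.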
2.3. Related Work
3. Materials and Methods
3.1. Data Analysis and Pre-Processing
3.2. Building Synthetic Features
Algorithm 1: Create synthetic features from features with weak correlation to the label
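Algorithm 1's pseudocode is not reproduced here; the following is a plausible sketch of the idea under stated assumptions: weakness is judged by the absolute Pearson correlation against a threshold, and a synthetic feature is the element-wise product of a weak feature with another feature, named `<A>_<B>` to match the combined columns (e.g. `Open_Volume`) reported later. The threshold value and the product operation are assumptions, not the paper's exact procedure.

```python
import pandas as pd

def add_synthetic_features(df, label, threshold=0.5):
    """Combine features whose |Pearson correlation| with the label is weak.

    Each weakly correlated feature is paired with every other feature by
    element-wise product, producing a column named '<A>_<B>'.
    """
    out = df.copy()
    features = [c for c in df.columns if c != label]
    weak = [c for c in features if abs(df[c].corr(df[label])) < threshold]
    for a in weak:
        for b in features:
            if a != b and f"{a}_{b}" not in out:
                out[f"{a}_{b}"] = df[a] * df[b]
    return out
```

The synthetic columns are then appended to the original feature set, leaving the subsequent selection step to decide which combinations are actually useful.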
3.3. Model Efficiency Considerations
3.4. Model Training and Binary Particle Swarm Optimization
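In binary PSO, each particle's position is a 0/1 mask over the feature set and a sigmoid transfer function maps velocities to bit-flip probabilities. The paper uses the PySwarms toolkit; the sketch below is an illustrative from-scratch version of the standard BPSO update (Kennedy and Eberhart), where the `fitness` callback, hyperparameter values, and variable names are assumptions rather than the authors' exact configuration.

```python
import numpy as np

def bpso_select(fitness, n_features, n_particles=20, iters=50,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over features.

    `fitness` maps a mask to a score to MINIMISE (e.g. validation error
    of a model trained on the selected features)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_particles, n_features))       # positions
    V = rng.normal(scale=0.1, size=(n_particles, n_features))    # velocities
    pbest = X.copy()
    pbest_f = np.array([fitness(x) for x in X])
    gbest = pbest[pbest_f.argmin()].copy()
    gbest_f = pbest_f.min()
    for _ in range(iters):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        # Standard velocity update: inertia + cognitive + social terms.
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
        prob = 1.0 / (1.0 + np.exp(-V))                  # sigmoid transfer
        X = (rng.random(X.shape) < prob).astype(int)     # sample new bits
        f = np.array([fitness(x) for x in X])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = X[improved], f[improved]
        if pbest_f.min() < gbest_f:
            gbest_f = pbest_f.min()
            gbest = pbest[pbest_f.argmin()].copy()
    return gbest, gbest_f
```

The returned `gbest` mask corresponds to the per-feature relevance indicators reported in the results.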
3.5. Data Pre-Processing and Model Design
3.6. Model Evaluation
4. Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| LSTM | Long Short-Term Memory |
| BPSO | Binary Particle Swarm Optimisation |
| FS | Feature Selection |
References
- Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210.
- Dhal, P.; Azad, C. A comprehensive survey on feature selection in the various fields of machine learning. Appl. Intell. 2022, 52, 4543–4581.
- Pudjihartono, N.; Fadason, T.; Kempa-Liehr, A.W.; O’Sullivan, J.M. A review of feature selection methods for machine learning-based disease risk prediction. Front. Bioinform. 2022, 2, 927312.
- Effrosynidis, D.; Arampatzis, A. An evaluation of feature selection methods for environmental data. Ecol. Inform. 2021, 61, 101224.
- Pearson, K.; Lee, A. Mathematical contributions to the theory of evolution. VIII. on the inheritance of characters not capable of exact quantitative measurement. Part I. introductory. Part II. on the inheritance of coat-colour in horses. Part III. on the inheritance of eye-colour in man. Philos. Trans. R. Soc. Lond. Ser. A Contain. Pap. Math. Phys. Character 1900, 195, 79–150.
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
- Miranda, L.J.V. PySwarms, a research-toolkit for Particle Swarm Optimization in Python. J. Open Source Softw. 2018, 3, 433.
- McHugh, M.L. The chi-square test of independence. Biochem. Medica 2013, 23, 143–149.
- Alshaer, H.N.; Otair, M.A.; Abualigah, L.; Alshinwan, M.; Khasawneh, A.M. Feature selection method using improved CHI Square on Arabic text classifiers: Analysis and application. Multimed. Tools Appl. 2021, 80, 10373–10390.
- Jahan, S.; Islam, M.S.; Islam, L.; Rashme, T.Y.; Prova, A.A.; Paul, B.K.; Islam, M.M.; Mosharof, M.K. Automated invasive cervical cancer disease detection at early stage through suitable machine learning model. SN Appl. Sci. 2021, 3, 806.
- Cañete-Sifuentes, L.; Monroy, R.; Medina-Pérez, M.A. A review and experimental comparison of multivariate decision trees. IEEE Access 2021, 9, 110451–110479.
- Mienye, I.D.; Sun, Y. A survey of ensemble learning: Concepts, algorithms, applications, and prospects. IEEE Access 2022, 10, 99129–99149.
- Raileanu, L.E.; Stoffel, K. Theoretical comparison between the gini index and information gain criteria. Ann. Math. Artif. Intell. 2004, 41, 77–93.
- Kennedy, J. The particle swarm: Social adaptation of knowledge. In Proceedings of the 1997 IEEE International Conference on Evolutionary Computation (ICEC’97), Indianapolis, IN, USA, 13–16 April 1997; pp. 303–308.
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
- Chen, C.; Dai, J. Mitigating backdoor attacks in lstm-based text classification systems by backdoor keyword identification. Neurocomputing 2021, 452, 253–262.
- Huang, W.; Liu, M.; Shang, W.; Zhu, H.; Lin, W.; Zhang, C. LSTM with compensation method for text classification. Int. J. Wirel. Mob. Comput. 2021, 20, 159–167.
- Behera, R.K.; Jena, M.; Rath, S.K.; Misra, S. Co-LSTM: Convolutional LSTM model for sentiment analysis in social big data. Inf. Process. Manag. 2021, 58, 102435.
- Gandhi, U.D.; Kumar, P.M.; Babu, G.C.; Karthick, G. Sentiment Analysis on Twitter Data by Using Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). Wirel. Pers. Commun. 2021, 1–10.
- Zhang, B.; Zou, G.; Qin, D.; Lu, Y.; Jin, Y.; Wang, H. A novel Encoder-Decoder model based on read-first LSTM for air pollutant prediction. Sci. Total Environ. 2021, 765, 144507.
- Nguyen, H.; Tran, K.; Thomassey, S.; Hamad, M. Forecasting and Anomaly Detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management. Int. J. Inf. Manag. 2021, 57, 102282.
- Zhou, H.; Zhang, J.; Zhou, Y.; Guo, X.; Ma, Y. A feature selection algorithm of decision tree based on feature weight. Expert Syst. Appl. 2021, 164, 113842.
- Cui, X.; Li, Y.; Fan, J.; Wang, T. A novel filter feature selection algorithm based on relief. Appl. Intell. 2022, 52, 5063–5081.
- Dhiman, G.; Oliva, D.; Kaur, A.; Singh, K.K.; Vimal, S.; Sharma, A.; Cengiz, K. BEPO: A novel binary emperor penguin optimizer for automatic feature selection. Knowl.-Based Syst. 2021, 211, 106560.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Bukhari, A.H.; Raja, M.A.Z.; Sulaiman, M.; Islam, S.; Shoaib, M.; Kumam, P. Fractional neuro-sequential ARFIMA-LSTM for financial market forecasting. IEEE Access 2020, 8, 71326–71338.
| Feature | Relevance Indicator |
|---|---|
| Open | 0 |
| High | 1 |
| Low | 0 |
| Close | 0 |
| Adj Close | 0 |
| Volume | 1 |
| Open_Volume | 1 |
| High_Volume | 0 |
| Low_Volume | 1 |
| Close_Volume | 1 |
| Adj Close_Volume | 1 |
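The relevance indicators in the table above act as the binary mask produced by BPSO. A minimal sketch of applying that mask, assuming the data sits in a pandas DataFrame whose columns match the table (the dict below is transcribed from the table; the helper function itself is illustrative):

```python
import pandas as pd

# Relevance indicators as reported for each original and synthetic feature.
relevance = {
    "Open": 0, "High": 1, "Low": 0, "Close": 0, "Adj Close": 0, "Volume": 1,
    "Open_Volume": 1, "High_Volume": 0, "Low_Volume": 1,
    "Close_Volume": 1, "Adj Close_Volume": 1,
}

def select_relevant(df):
    """Keep only the columns whose relevance indicator is 1."""
    keep = [c for c, flag in relevance.items() if flag == 1 and c in df.columns]
    return df[keep]
```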
| Proposed Feature Selection | MAE | MSE |
|---|---|---|
| YES | 0.000305 | 0.012886 |
| NO | 0.000433 | 0.016486 |
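The MAE and MSE figures above follow the standard definitions, sketched below for reference (function names are ours, not the paper's):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute prediction error."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def mse(y_true, y_pred):
    """Mean Squared Error: average squared prediction error."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
```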
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ojo, S.O.; Adisa, J.A.; Owolawi, P.A.; Tu, C. A Model for Feature Selection with Binary Particle Swarm Optimisation and Synthetic Features. AI 2024, 5, 1235-1254. https://doi.org/10.3390/ai5030060