Algorithms for Feature Selection

A special issue of Algorithms (ISSN 1999-4893).

Deadline for manuscript submissions: closed (25 December 2022) | Viewed by 16296

Special Issue Editor


Prof. Dr. Muhammad Adnan Khan
Guest Editor
Department of Software, Faculty of Artificial Intelligence and Software, Gachon University, Seongnam 13120, Republic of Korea
Interests: algorithms; computational intelligence and its applications

Special Issue Information

Dear Colleagues,

In recent years, feature selection has become one of the most active research fields, driven by the emergence of datasets comprising large numbers of features. It is now regarded as an effective technique both for improving the modeling of the underlying data-generation process and for lowering the cost of acquiring features. From a machine learning perspective, because feature selection can reduce the complexity of a problem, it can be used to preserve or even improve the effectiveness of algorithms while reducing computing costs. The emergence of Big Data has created new hurdles for machine learning researchers, who must now deal with massive amounts of data, in terms of both instances and features, which makes the learning process more complicated and computationally intensive than ever before. In particular, when the number of features is large, the performance of learning algorithms can degrade owing to overfitting, and as learned models become more complicated, their interpretability decreases. Unfortunately, some of the most widely used algorithms were designed when datasets were considerably smaller and therefore do not scale well today, so these effective methods need to be adapted to address Big Data concerns.

In this Special Issue, we invite researchers to publish papers concerning current advances in feature selection algorithms for high-dimensional settings, as well as review papers that will motivate ongoing efforts to grasp the challenges commonly faced in this field. We are seeking high-quality articles that address both theoretical and practical challenges relating to feature selection algorithms.

Prof. Dr. Muhammad Adnan Khan
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • techniques for feature selection based on evolutionary search
  • ensemble methods for feature selection
  • feature selection for high-dimensional data
  • feature selection for time series data
  • feature selection applications
  • feature selection for textual data
  • deep feature selection

Published Papers (7 papers)

Editorial

3 pages, 158 KiB  
Editorial
Special Issue “Algorithms for Feature Selection”
by Muhammad Adnan Khan
Algorithms 2023, 16(8), 368; https://doi.org/10.3390/a16080368 - 31 Jul 2023
Cited by 1 | Viewed by 904
Abstract
This Special Issue of the open access journal Algorithms is dedicated to showcasing cutting-edge research in algorithms for feature selection [...]
(This article belongs to the Special Issue Algorithms for Feature Selection)

Research

17 pages, 2914 KiB  
Article
The Use of Correlation Features in the Problem of Speech Recognition
by Nikita Andriyanov
Algorithms 2023, 16(2), 90; https://doi.org/10.3390/a16020090 - 7 Feb 2023
Cited by 2 | Viewed by 1516
Abstract
The problem addressed in this article is improving the efficiency of recognizing phraseological radio exchange messages, which sometimes takes place under conditions of increased stress for the pilot. High-quality recognition requires signal preprocessing methods. The article considers new data preprocessing algorithms used to extract features from a speech message. Two approaches are proposed. The first builds autocorrelation functions of messages based on the Fourier transform; the second builds autocorrelation portraits of speech signals. The proposed approaches are quite simple to implement, although they require cyclic operators, since they work with pairs of samples from the original signal. The developed method was validated on the problem of recognizing phraseological radio exchange messages in Russian. The algorithm with preliminary feature extraction provides a gain of 1.7% in recognition accuracy. The use of convolutional neural networks also increases recognition efficiency. The gain for autocorrelation portrait processing is about 3–4%. Quantization is used to optimize the proposed models. The algorithm’s performance increased by a factor of 2.8 after quantization. It was also possible to increase recognition accuracy by 1–2% using digital signal processing algorithms. An important feature of the proposed algorithms is that they can be generalized to arbitrary data with time correlation. The speech message preprocessing algorithms discussed in this article are based on classical digital signal processing algorithms. The idea of constructing autocorrelation portraits from the time series of a signal is novel. At the same time, this approach ensures high recognition accuracy. However, the study also showed that all the algorithms under consideration perform quite poorly under the influence of strong noise.
(This article belongs to the Special Issue Algorithms for Feature Selection)
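
As an illustration of the first approach named in the abstract (autocorrelation functions computed via the Fourier transform), here is a minimal sketch. It is not the author's implementation: the function names, the naive standard-library DFT, and the zero-padding choice are all assumptions made for this example. It relies on the Wiener-Khinchin relation, under which the autocorrelation is the inverse transform of the power spectrum.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (hypothetical helper, stdlib only)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalization.
        out.append(total / n if inverse else total)
    return out


def autocorrelation_fft(signal):
    """Linear autocorrelation of a real signal via the power spectrum."""
    n = len(signal)
    # Zero-pad to length 2n so circular correlation matches linear correlation
    # for lags 0..n-1.
    padded = list(signal) + [0.0] * n
    power = [abs(c) ** 2 for c in dft(padded)]
    corr = dft(power, inverse=True)
    return [corr[lag].real for lag in range(n)]


def autocorrelation_direct(signal):
    """Reference implementation: sum over pairs of samples at each lag."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t + lag] for t in range(n - lag))
        for lag in range(n)
    ]
```

For a short signal such as `[1.0, 2.0, 3.0, 4.0]`, both functions agree up to floating-point error. In practice a fast FFT routine would replace the naive `dft` helper.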

18 pages, 553 KiB  
Article
Discovering Critical Factors in the Content of Crowdfunding Projects
by Kai-Fu Yang, Yi-Ru Lin and Long-Sheng Chen
Algorithms 2023, 16(1), 51; https://doi.org/10.3390/a16010051 - 12 Jan 2023
Cited by 2 | Viewed by 1989
Abstract
Crowdfunding can simplify the financing process to raise large amounts of money to complete projects for startups. However, improving the success rate has become one of the critical issues. To achieve this goal, fundraisers need to create a short video, attractive promotional content, and present themselves on social media to attract investors. Previous studies merely discussed project factors that affect crowdfunding success rates. However, relatively few studies in the available literature have examined what elements should be included in the project content for crowdfunding projects to succeed. Consequently, this study aims to extract the crucial factors that can enhance the crowdfunding project success rate based on the project content description. To identify the crucial project content factors of movie projects, this study analyzed two real cases from well-known platforms using natural language processing (NLP) and feature selection algorithms, including rough set theory (RST), decision trees (DT), and ReliefF, applied to 12 pre-defined candidate factors. Then, support vector machines (SVM) were used to evaluate the performance. Finally, “Role”, “Cast”, “Merchandise”, “Sound effects”, and “Sentiment” were identified as important content factors for movie projects. The findings could also provide fundraisers with suggestions on how to make their movie crowdfunding projects more successful.
(This article belongs to the Special Issue Algorithms for Feature Selection)

12 pages, 1788 KiB  
Article
Augmentation of Densest Subgraph Finding Unsupervised Feature Selection Using Shared Nearest Neighbor Clustering
by Deepesh Chugh, Himanshu Mittal, Amit Saxena, Ritu Chauhan, Eiad Yafi and Mukesh Prasad
Algorithms 2023, 16(1), 28; https://doi.org/10.3390/a16010028 - 3 Jan 2023
Cited by 1 | Viewed by 1538
Abstract
Determining the optimal feature set is a challenging problem, especially in an unsupervised domain. To address it, this paper presents a new unsupervised feature selection method, termed densest feature graph augmentation with disjoint feature clusters. The proposed method works in two phases. The first phase focuses on finding the maximally non-redundant feature subset, and disjoint features are added to the feature set in the second phase. For experimental validation, the proposed method is compared against five existing unsupervised feature selection methods on five UCI datasets in terms of three performance criteria, namely clustering accuracy, normalized mutual information, and classification accuracy. The experimental analyses show that the proposed method outperforms the considered methods.
(This article belongs to the Special Issue Algorithms for Feature Selection)

22 pages, 3781 KiB  
Article
Automated Pixel-Level Deep Crack Segmentation on Historical Surfaces Using U-Net Models
by Esraa Elhariri, Nashwa El-Bendary and Shereen A. Taie
Algorithms 2022, 15(8), 281; https://doi.org/10.3390/a15080281 - 11 Aug 2022
Cited by 5 | Viewed by 2573
Abstract
Crack detection on historical surfaces is of significant importance for credible and reliable inspection in heritage structural health monitoring. Thus, several object detection deep learning models have been used for crack detection. However, most of these models are effective at classification at best, with only primitive detection of the crack location. On the other hand, several state-of-the-art studies have shown that pixel-level crack segmentation can effectively locate objects in images for more accurate and reasonable classification. In order to realize pixel-level deep crack segmentation in images of historical buildings, this paper proposes an automated deep crack segmentation approach designed based on an exhaustive investigation of several U-Net deep learning network architectures. The use of pixel-level crack segmentation with U-Net deep learning ensures the identification of pixels that are important for the decision of image classification. Moreover, the proposed approach employs the deep learned features extracted by the U-Net deep learning model to precisely describe crack characteristics for better pixel-level crack segmentation. A primary image dataset of various crack types and severity is collected from historical building surfaces and used for training and evaluating the performance of the proposed approach. Three variants of the U-Net convolutional network architecture are considered for the deep pixel-level segmentation of different types of cracks on historical surfaces. Promising results of the proposed approach using the U2Net deep learning model are obtained, with a Dice score and mean Intersection over Union (mIoU) of 71.09% and 78.38% achieved, respectively, at the pixel level. In conclusion, the significance of this work lies in investigating the impact of pixel-level deep crack segmentation, supported by deep learned features, by adopting variants of the U-Net deep learning model for crack detection on historical surfaces.
(This article belongs to the Special Issue Algorithms for Feature Selection)
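
The abstract reports a Dice score and a mean Intersection over Union at the pixel level. As a reminder of how these two overlap metrics are commonly defined for a single binary mask, here is a minimal sketch. The function name, the flat 0/1 mask representation, and the convention of returning 1.0 for two empty masks are assumptions for this example, not details taken from the paper.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and IoU for two flat binary masks of equal length.

    pred, truth: sequences of 0/1 pixel labels (hypothetical representation).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_sum = sum(1 for p in pred if p)
    truth_sum = sum(1 for t in truth if t)
    union = pred_sum + truth_sum - intersection

    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * intersection / (pred_sum + truth_sum) if (pred_sum + truth_sum) else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou


# Example: one overlapping pixel out of two predicted and two true pixels.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

A mean IoU is obtained by computing the IoU per class (for example, crack and background) and averaging the results.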

13 pages, 737 KiB  
Article
False Information Detection via Multimodal Feature Fusion and Multi-Classifier Hybrid Prediction
by Yi Liang, Turdi Tohti and Askar Hamdulla
Algorithms 2022, 15(4), 119; https://doi.org/10.3390/a15040119 - 29 Mar 2022
Cited by 5 | Viewed by 2689
Abstract
In existing false information detection methods, the quality of the extracted single-modality features is low, the information from different modalities cannot be fully fused, and original information is lost when the modalities are fused. This paper proposes a false information detection method based on multimodal feature fusion and multi-classifier hybrid prediction. In this method, bidirectional encoder representations from transformers (BERT) are first used to extract the text features and a Swin Transformer is used to extract the picture features. Then, a trained deep autoencoder is used as an early fusion method to fuse the text features and visual features, and the resulting low-dimensional features are taken as the joint features of the modalities. The original features of each modality are concatenated into the joint features to reduce the loss of original information. Finally, the text features, image features, and joint features are processed by three classifiers to obtain three probability distributions, and the three probability distributions are added proportionally to obtain the final prediction result. Compared with attention-based multimodal factorized bilinear pooling, the model achieves 4.3% and 1.2% improvements in accuracy on the Weibo and Twitter datasets, respectively. The experimental results show that the proposed model can effectively integrate multimodal information and improve the accuracy of false information detection.
(This article belongs to the Special Issue Algorithms for Feature Selection)
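
The final step described in the abstract, adding three probability distributions proportionally, can be read as a weighted average of classifier outputs. The sketch below shows one such reading. The function name, the weight values, and the example probabilities are hypothetical and are not taken from the paper.

```python
def fuse_probabilities(distributions, weights):
    """Weighted sum of per-classifier class-probability vectors.

    distributions: list of probability vectors, one per classifier.
    weights: one non-negative weight per classifier (normalized here).
    """
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(distributions[0])
    if any(len(d) != n_classes for d in distributions):
        raise ValueError("all distributions must cover the same classes")

    normalized = [w / total for w in weights]
    return [
        sum(w * d[c] for w, d in zip(normalized, distributions))
        for c in range(n_classes)
    ]


# Hypothetical outputs of the text, image, and joint-feature classifiers
# over two classes, with illustrative weights.
text_probs = [0.7, 0.3]
image_probs = [0.4, 0.6]
joint_probs = [0.6, 0.4]
fused = fuse_probabilities([text_probs, image_probs, joint_probs], [0.3, 0.2, 0.5])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Normalizing the weights keeps the fused vector a valid probability distribution whenever the inputs are.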

19 pages, 1242 KiB  
Article
Ensemble Machine Learning Model to Predict the Waterborne Syndrome
by Mohammed Gollapalli
Algorithms 2022, 15(3), 93; https://doi.org/10.3390/a15030093 - 11 Mar 2022
Cited by 17 | Viewed by 3356
Abstract
The COVID-19 epidemic has highlighted the significance of sanitization and maintaining hygienic access to clean water to reduce mortality and morbidity cases worldwide. Diarrhea is one of the prevalent waterborne diseases caused by contaminated water in many low-income countries with similar living conditions. According to the latest statistics from the World Health Organization (WHO), diarrhea is among the top five primary causes of death in low-income nations. The condition affects people in every age group due to a lack of suitable water for daily living. In this study, a stacking ensemble machine learning model was employed against traditional models to extract clinical knowledge for better understanding patients’ characteristics; disease prevalence; hygienic conditions; quality of water used for cooking, bathing, and toiletries; chemicals used; therapist’s medications; and symptoms that are reflected in the field study data. Results revealed that the ensemble model provides higher accuracy, reaching 98.90% in the training and testing phases, when compared against the frequently used J48, Naïve Bayes, SVM, NN, PART, Random Forest, and Logistic Regression models. Acting on the outcomes of this research in the early stages could help people in low-income countries achieve a better lifestyle, suffer fewer infections, and minimize expensive hospital visits.
(This article belongs to the Special Issue Algorithms for Feature Selection)