Machine-Learning-Based Feature Extraction and Selection

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 July 2024) | Viewed by 13699

Special Issue Editor


Guest Editor
SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Department of Computer Science, Universidade de Vigo, ESEI—Escola Superior de Enxeñaría Informática, Edificio Politécnico, Campus Universitario As Lagoas S/N, 32004 Ourense, Spain
Interests: text mining; artificial intelligence; image processing; machine learning; deep learning; big data

Special Issue Information

Dear Colleagues,

The technological advances attained during the last decade, together with the enhancement of data storage and computation capabilities, have stimulated the continuous generation and storage of large volumes of high-dimensional heterogeneous data at an unprecedented speed.

In this context, feature extraction and selection methods have become a crucial mechanism to alleviate two key issues related to high-dimensional data: (i) the increase in computational efforts required for its processing and/or analysis, and (ii) the existence of additional duplicated and/or meaningless information associated with the curse of dimensionality phenomenon.

In this Special Issue, we will explore the potential of applying Machine-Learning-Based Feature Extraction and Selection methods to reduce model complexity by decreasing data dimensionality. This Special Issue is open for the publication of experimental works, properly validated designs, theoretical studies, and state-of-the-art review papers.

Dr. David Ruano Ordás
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • information retrieval and text mining
  • machine learning
  • data mining and knowledge discovery
  • deep learning
  • information extraction
  • machine learning for NLP
  • dimensionality reduction

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Editorial


5 pages, 190 KiB  
Editorial
Machine Learning-Based Feature Extraction and Selection
by David Ruano-Ordás
Appl. Sci. 2024, 14(15), 6567; https://doi.org/10.3390/app14156567 - 27 Jul 2024
Viewed by 1050
Abstract
Over the last decade, technological advances have brought breakthroughs in the landscape of data management, transmission, processing, and storage [...] Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)

Research


38 pages, 1981 KiB  
Article
Investigating the Performance of a Novel Modified Binary Black Hole Optimization Algorithm for Enhancing Feature Selection
by Mohammad Ryiad Al-Eiadeh, Raneem Qaddoura and Mustafa Abdallah
Appl. Sci. 2024, 14(12), 5207; https://doi.org/10.3390/app14125207 - 14 Jun 2024
Cited by 1 | Viewed by 576
Abstract
High-dimensional datasets often harbor redundant, irrelevant, and noisy features that detrimentally impact classification algorithm performance. Feature selection (FS) aims to mitigate this issue by identifying and retaining only the most pertinent features, thus reducing dataset dimensions. In this study, we propose an FS approach based on black hole algorithms (BHOs) augmented with a mutation technique termed MBHO. BHO typically comprises two primary phases. During the exploration phase, a set of stars is iteratively modified based on existing solutions, with the best star selected as the “black hole”. In the exploration phase, stars nearing the event horizon are replaced, preventing the algorithm from being trapped in local optima. To address the potential randomness-induced challenges, we introduce inversion mutation. Moreover, we enhance a widely used objective function for wrapper feature selection by integrating two new terms based on the correlation among selected features and between features and classification labels. Additionally, we employ a transfer function, the V2 transfer function, to convert continuous values into discrete ones, thereby enhancing the search process. Our approach undergoes rigorous evaluation experiments using fourteen benchmark datasets, and it compares favorably against the Binary Cuckoo Search (BCS), Mutual Information Maximization (MIM), Joint Mutual Information (JMI), and minimum Redundancy Maximum Relevance (mRMR) approaches. The results demonstrate the efficacy of our proposed model in selecting superior features that enhance classifier performance metrics. Thus, MBHO is presented as a viable alternative to the existing state-of-the-art approaches. We make our implementation source code available for community use and further development. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)
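
The abstract above mentions two building blocks that are easy to illustrate in isolation: a V-shaped transfer function that turns continuous positions into a binary feature mask, and inversion mutation. The following is only a minimal sketch, not the authors' implementation. It assumes the commonly used definition V2(x) = |tanh(x)| and a simple "select the feature if a random draw falls below V2(x)" rule; the function names and the example position vector are hypothetical.

```python
import math
import random


def v2_transfer(x: float) -> float:
    """V-shaped transfer function, ASSUMED here to be |tanh(x)|."""
    return abs(math.tanh(x))


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    Assumption: a feature is selected when a uniform random draw
    falls below the transfer value of its coordinate.
    """
    return [1 if rng.random() < v2_transfer(x) else 0 for x in position]


def inversion_mutation(mask, rng):
    """Reverse the order of the bits inside a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(mask)), 2))
    return mask[:i] + mask[i:j + 1][::-1] + mask[j + 1:]


rng = random.Random(0)                        # fixed seed for repeatability
position = [-2.0, -0.1, 0.0, 0.3, 1.5, 3.0]   # hypothetical star position
mask = binarize(position, rng)                # 1 = feature kept, 0 = dropped
mutated = inversion_mutation(mask, rng)
```

Because inversion only reorders a segment, the mutated mask keeps the same number of selected features as the original while changing which ones are selected.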

17 pages, 2515 KiB  
Article
Exploring the Role of Self-Adaptive Feature Words in Relation Quintuple Extraction for Scientific Literature
by Yujiang Liu, Lijun Fu, Xiaojun Xia and Yonghong Zhang
Appl. Sci. 2024, 14(10), 4020; https://doi.org/10.3390/app14104020 - 9 May 2024
Cited by 1 | Viewed by 663
Abstract
Extracting relation quintuples and feature words from unstructured text is a prelude to the construction of a scientific knowledge base. Prior works use explicit clues between entities to study this task but ignore the use and the association of the feature words. In this work, we propose a new method to generate self-adaptive feature words from the original text for every single sample. These words can add additional correlation information to the knowledge graph. We allow the model to generate a new word representation and apply it to the original sentence to judge the relation type and locate the head and tail of the relation quintuple. Compared with previous works, the feature words increase the flexibility of relying on information and improve the explanatory ability. Extensive experiments on scientific field datasets illustrate that the self-adaptive feature words method (SAFW) is good at ferreting out the unique feature words and obtaining the core part for the quintuple. It achieves good performance on four public datasets and obtains a remarkable performance improvement compared with other baselines. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)

16 pages, 1102 KiB  
Article
A New Permutation-Based Method for Ranking and Selecting Group Features in Multiclass Classification
by Iqbal Muhammad Zubair, Yung-Seop Lee and Byunghoon Kim
Appl. Sci. 2024, 14(8), 3156; https://doi.org/10.3390/app14083156 - 9 Apr 2024
Cited by 1 | Viewed by 1099
Abstract
The selection of group features is a critical aspect in reducing model complexity by choosing the most essential group features, while eliminating the less significant ones. The existing group feature selection methods select a set of important group features, without providing the relative importance of all group features. Moreover, few methods consider the relative importance of group features in the selection process. This study introduces a permutation-based group feature selection approach specifically designed for high-dimensional multiclass datasets. Initially, the least absolute shrinkage and selection operator (lasso) method was applied to eliminate irrelevant individual features within each group feature. Subsequently, the relative importance of the group features was computed using a random-forest-based permutation method. Accordingly, the process selected the highly significant group features. The performance of the proposed method was evaluated using machine learning algorithms and compared with the performance of other approaches, such as group lasso. We used real-world, high-dimensional, multiclass microarray datasets to demonstrate its effectiveness. The results highlighted the capability of the proposed method, which not only selected significant group features but also provided the relative importance and ranking of all group features. Furthermore, the proposed method outperformed the existing method in terms of accuracy and F1 score. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)
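
The idea of ranking group features by permutation, described in the abstract above, can be sketched as follows. This is an illustrative toy, not the authors' method: to stay dependency-free it swaps the random forest for a nearest-centroid classifier, omits the lasso pre-filtering step, and scores on the training data. All names and the synthetic data are hypothetical.

```python
import random


def fit_centroids(X, y):
    """Per-class mean vector (a simple STAND-IN for a random forest)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, v in enumerate(row):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest in squared distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))


def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)


def group_permutation_importance(centroids, X, y, groups, rng, repeats=20):
    """Mean accuracy drop when all columns of a group are shuffled together."""
    base = accuracy(centroids, X, y)
    scores = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(repeats):
            order = list(range(len(X)))
            rng.shuffle(order)
            X_perm = [row[:] for row in X]
            for i, src in enumerate(order):
                for c in cols:          # same row permutation for the whole group
                    X_perm[i][c] = X[src][c]
            drops.append(base - accuracy(centroids, X_perm, y))
        scores[name] = sum(drops) / repeats
    return scores


# Synthetic three-class data: columns 0-1 carry the label, columns 2-3 are noise.
rng = random.Random(0)
y = [i % 3 for i in range(150)]
X = [[label + rng.gauss(0, 0.3), label + rng.gauss(0, 0.3),
      rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]

centroids = fit_centroids(X, y)
groups = {"informative": [0, 1], "noise": [2, 3]}
scores = group_permutation_importance(centroids, X, y, groups, rng)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Shuffling every column of a group with the same row permutation keeps the within-group structure intact while breaking the link to the labels, which is what lets the accuracy drop be read as a group-level importance and used to rank all groups.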

15 pages, 6690 KiB  
Article
Application of Remote Sensing and Geographic Information System Technologies to Assess the Impact of Mining: A Case Study at Emalahleni
by Monica Naa Morkor Cudjoe, Efiba Vidda Senkyire Kwarteng, Enoch Anning, Idowu Racheal Bodunrin and Samuel Ato Andam-Akorful
Appl. Sci. 2024, 14(5), 1739; https://doi.org/10.3390/app14051739 - 21 Feb 2024
Cited by 2 | Viewed by 1193
Abstract
This article presents an assessment of the impact of mining activities in the Emalahleni municipality, using GIS and RS technologies. The random forest algorithm was used to classify the land use and land cover in the Emalahleni municipality over a three-decade period (1990–2020). The classifications are settlement, water, mining area, vegetation, and bare land. The majority of the study area was found to be rocky ground, accounting for approximately 60% of the total study area. Change detection maps were created for vegetation and mining to assess the extent of land degradation in the study area over the three-decade period. The findings in this study highlight the importance of understanding the changes in land use and vegetation cover in the study area and their impact on the environment, as well as the local community. It is crucial to develop sustainable land management strategies that balance economic development activities, such as mining, with environmental management to ensure long-term viability for future generations. The data presented in this study provide a useful baseline for further research and can inform land-use planning and decision-making processes in Emalahleni. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)

11 pages, 4484 KiB  
Article
A Hybrid CNN and RNN Variant Model for Music Classification
by Mohsin Ashraf, Fazeel Abid, Ikram Ud Din, Jawad Rasheed, Mirsat Yesiltepe, Sook Fern Yeo and Merve T. Ersoy
Appl. Sci. 2023, 13(3), 1476; https://doi.org/10.3390/app13031476 - 22 Jan 2023
Cited by 25 | Viewed by 5618
Abstract
Music genre classification has a significant role in information retrieval for the organization of growing collections of music. It is challenging to classify music with reliable accuracy. Many methods have utilized handcrafted features to identify unique patterns but are still unable to determine the original music characteristics. Comparatively, music classification using deep learning models has been shown to be dynamic and effective. Among the many neural networks, the combination of a convolutional neural network (CNN) and variants of a recurrent neural network (RNN) has not been significantly considered. To address the flaws of individual neural network classification models, this paper proposes a hybrid architecture of CNN and variants of RNN such as long short-term memory (LSTM), Bi-LSTM, gated recurrent unit (GRU), and Bi-GRU. We also compared the performance based on Mel-spectrogram and Mel-frequency cepstral coefficient (MFCC) features. Empirically, the proposed hybrid architecture of CNN and Bi-GRU using Mel-spectrogram achieved the best accuracy at 89.30%, whereas with MFCC features the hybridization of CNN and LSTM achieved the best accuracy at 76.40%. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)

14 pages, 904 KiB  
Article
Malicious URL Detection Model Based on Bidirectional Gated Recurrent Unit and Attention Mechanism
by Tiefeng Wu, Miao Wang, Yunfang Xi and Zhichao Zhao
Appl. Sci. 2022, 12(23), 12367; https://doi.org/10.3390/app122312367 - 2 Dec 2022
Cited by 9 | Viewed by 2196
Abstract
With the rapid development of Internet technology, numerous malicious URLs have appeared, which bring a large number of security risks. Efficient detection of malicious URLs has become one of the keys for defense against cyber attacks. Deep learning methods bring new developments to the identification of malicious web pages. This paper proposes a malicious URL detection method based on a bidirectional gated recurrent unit (BiGRU) and attention mechanism. The method is based on the BiGRU model. A regularization operation called a dropout mechanism is added to the input layer to prevent the model from overfitting, and an attention mechanism is added to the middle layer to strengthen the feature learning of URLs. Finally, the deep learning network DA-BiGRU model is formed. The experimental results demonstrate that the proposed method can achieve better classification results in malicious URL detection, which has high significance for practical applications. Full article
(This article belongs to the Special Issue Machine-Learning-Based Feature Extraction and Selection)
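
The abstract above does not specify which form of attention is used. As a rough illustration of what an attention layer over recurrent hidden states does, the sketch below applies dot-product attention pooling to a short sequence. The hidden states, the query vector, and the function names are hypothetical, and no recurrent network or dropout layer is included.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h_t . query) and sum the states."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * state[k] for w, state in zip(weights, hidden_states))
               for k in range(dim)]
    return weights, context


# Hypothetical outputs of a recurrent layer for three URL tokens.
hidden_states = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.8]]
query = [1.0, 0.0]   # in a trained model this vector would be learned
weights, context = attention_pool(hidden_states, query)
```

The weights sum to one, so the context vector is a convex combination of the hidden states that emphasizes the time steps most aligned with the query.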