Automated Methods for Speech Processing and Recognition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 15 June 2024

Special Issue Editors


Guest Editor
eVIDA Research Group, Faculty of Engineering, University of Deusto, 48007 Bilbao, Spain
Interests: recommendation systems; machine learning; voice detection and classification; assistive technology

Guest Editor
The Deustotech-LIFE (eVIDA) Research Group, Faculty of Engineering, University of Deusto, 48007 Bilbao, Spain
Interests: new technologies applied to physical activity; health and wellbeing; image processing

Guest Editor

Special Issue Information

Dear Colleagues,

Automated Methods for Speech Processing and Recognition focuses on the latest advancements in the field of speech technology. The goal of this collection of articles is to showcase cutting-edge research and innovations aimed at automating various aspects of speech analysis, synthesis, and recognition. Researchers and experts in the field explore topics such as automatic speech recognition (ASR), natural language processing (NLP), speaker recognition, and speech synthesis using machine learning and deep learning techniques. The special issue serves as a comprehensive resource for professionals and enthusiasts interested in the development of automated systems that can understand and generate human speech, with applications ranging from virtual assistants to transcription services and more.

Dr. Ibon Oleagordia
Dr. Amaia Méndez-Zorrilla
Prof. Dr. Begoña Garcia-Zapirain
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • automatic speech recognition (ASR)
  • natural language processing (NLP)
  • speaker recognition
  • speech synthesis using machine learning
  • deep learning techniques

Published Papers (2 papers)


Research

13 pages, 2241 KiB  
Article
Advantages and Pitfalls of Dataset Condensation: An Approach to Keyword Spotting with Time-Frequency Representations
by Pedro Henrique Pereira, Wesley Beccaro and Miguel Arjona Ramírez
Electronics 2024, 13(11), 2097; https://doi.org/10.3390/electronics13112097 - 28 May 2024
Abstract
With the exponential growth of data, the need for efficient techniques to extract relevant information from datasets becomes increasingly imperative. Reducing the training data can be useful for applications wherein storage space or computational resources are limited. In this work, we explore the concept of data condensation (DC) in the context of keyword spotting systems (KWS). Using deep learning architectures and time-frequency speech representations, we have obtained condensed speech signal representations using gradient matching with Efficient Synthetic-Data Parameterization. From a series of classification experiments, we analyze the models and condensed data performances in terms of accuracy and number of data per class. We also present results using cross-model techniques, wherein models are trained with condensed data obtained from a different architecture. Our findings demonstrate the potential of data condensation in the context of the speech domain for reducing the size of datasets while retaining their most important information and maintaining high accuracy for the model trained with the condensed dataset. We have obtained an accuracy of 80.75% with 30 condensed speech representations per class with ConvNet, representing an addition of 24.9% in absolute terms when compared to 30 random samples from the original training dataset. However, we demonstrate the limitations of this approach in the cross-model tests. We also highlight the challenges and opportunities for further improving the accuracy of condensed data obtained and trained with different neural network architectures.
(This article belongs to the Special Issue Automated Methods for Speech Processing and Recognition)

24 pages, 1490 KiB  
Article
Next-Generation Spam Filtering: Comparative Fine-Tuning of LLMs, NLPs, and CNN Models for Email Spam Classification
by Konstantinos I. Roumeliotis, Nikolaos D. Tselikas and Dimitrios K. Nasiopoulos
Electronics 2024, 13(11), 2034; https://doi.org/10.3390/electronics13112034 - 23 May 2024
Abstract
Spam emails and phishing attacks continue to pose significant challenges to email users worldwide, necessitating advanced techniques for their efficient detection and classification. In this paper, we address the persistent challenges of spam emails and phishing attacks by introducing a cutting-edge approach to email filtering. Our methodology revolves around harnessing the capabilities of advanced language models, particularly the state-of-the-art GPT-4 Large Language Model (LLM), along with BERT and RoBERTa Natural Language Processing (NLP) models. Through meticulous fine-tuning tailored for spam classification tasks, we aim to surpass the limitations of traditional spam detection systems, such as Convolutional Neural Networks (CNNs). Through an extensive literature review, experimentation, and evaluation, we demonstrate the effectiveness of our approach in accurately identifying spam and phishing emails while minimizing false positives. Our methodology showcases the potential of fine-tuning LLMs for specialized tasks like spam classification, offering enhanced protection against evolving spam and phishing attacks. This research contributes to the advancement of spam filtering techniques and lays the groundwork for robust email security systems in the face of increasingly sophisticated threats.
(This article belongs to the Special Issue Automated Methods for Speech Processing and Recognition)