Advances and Applications of Audio and Speech Signal Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 September 2024 | Viewed by 1051

Special Issue Editors


Dr. Jie Zhang
Guest Editor
National Engineering Research Center for Speech and Language Information Processing (NERC-SLIP), University of Science and Technology of China, Hefei, China
Interests: voice signal processing; wireless acoustic sensor networks; automatic speech recognition; sound source localization

Dr. Douglas O'Shaughnessy
Guest Editor
Énergie Matériaux Télécommunications Research Centre, Institut National de la Recherche Scientifique, Quebec, QC J3X 1P7, Canada
Interests: DSP; speech embedding; speech processing; deep learning; speech recognition; speaker recognition

Special Issue Information

Dear Colleagues,

In this Special Issue, original research articles and reviews are welcome. Topics may include (but are not limited to) the following:

  1. Audio and speech modeling, speech coding and transmission.
  2. Single/multiple microphone signal processing for speech enhancement/separation, sound source localization/tracking, speech dereverberation, active noise control, and echo cancellation.
  3. Audio for multimedia and audio processing systems, e.g., audiovisual speech enhancement/recognition/localization.
  4. Bioacoustics and medical acoustics, e.g., using EEG/fMRI measurements to assist speech processing.
  5. The detection and classification of acoustic scenes and events, and the modeling, analysis and synthesis of acoustic environments.
  6. Music information retrieval, and music signal analysis, processing and synthesis.
  7. Speech quality and intelligibility measures, auditory modeling and hearing instruments.
  8. Speech production, speech perception and psychoacoustics.
  9. Speech synthesis and generation, and spatial stereo sound production.
  10. Automatic speech recognition.
  11. Speaker recognition and identity/privacy preservation.
  12. Advanced machine learning methods with application to audio and speech signal processing.

Dr. Jie Zhang
Dr. Douglas O'Shaughnessy
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • speech signal processing
  • audio signal processing
  • speech recognition
  • music signal analysis

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)

Research

23 pages, 421 KiB  
Article
On Beamforming with the Single-Sideband Transform
by Vitor Probst Curtarelli and Israel Cohen
Appl. Sci. 2024, 14(17), 7514; https://doi.org/10.3390/app14177514 - 25 Aug 2024
Viewed by 280
Abstract
In this paper, we examine the use of the Single-Sideband Transform (SSBT) for convolutive beamformers. We explore its unique properties and implications for beamformer design. Our study sheds light on the tradeoffs involved in using the SSBT in beamforming applications, offering insights into both its strengths and limitations. Despite the advantage of having real-valued coefficients, we show that the convolution handling of the transform presents challenges that impact fundamental beamforming principles. When compared to the Short-Time Fourier Transform (STFT), the SSBT displays lower robustness, especially in scenarios involving mismatch and modeling noise. Notably, we establish a direct equivalence between the SSBT and STFT when using identical transform parameters, enabling their seamless interchangeability and joint use in time–frequency signal enhancements. We validate our theoretical findings through realistic simulations using the Minimum-Power Distortionless Response beamformer. These simulations illustrate that although the STFT performs only marginally better than the SSBT under optimal conditions, it significantly outperforms the SSBT in non-ideal scenarios.
(This article belongs to the Special Issue Advances and Applications of Audio and Speech Signal Processing)

15 pages, 507 KiB  
Article
Automatic Age and Gender Recognition Using Ensemble Learning
by Ergün Yücesoy
Appl. Sci. 2024, 14(16), 6868; https://doi.org/10.3390/app14166868 - 6 Aug 2024
Viewed by 470
Abstract
The use of speech-based recognition technologies in human–computer interactions is increasing daily. Age and gender recognition, one of these technologies, is a popular research topic used directly or indirectly in many applications. In this research, a new age and gender recognition approach based on the ensemble of different machine learning algorithms is proposed. In the study, five different classifiers, namely KNN, SVM, LR, RF, and E-TREE, are used as base-level classifiers and the majority voting and stacking methods are used to create the ensemble models. First, using MFCC features, five base-level classifiers are created and the performance of each model is evaluated. Then, starting from the one with the highest performance, these classifiers are combined and ensemble models are created. In the study, eight different ensemble models are created and the performance of each is examined separately. The experiments conducted with the Turkish subsection of the Mozilla Common Voice dataset show that the ensemble models increase the recognition accuracy, and the highest accuracy of 97.41% is achieved with the ensemble model created by stacking five classifiers (SVM, E-TREE, RF, KNN, and LR). According to this result, the proposed ensemble model achieves superior accuracy compared to similar studies in recognizing age and gender from speech signals.
(This article belongs to the Special Issue Advances and Applications of Audio and Speech Signal Processing)