Computational Acoustic Scene Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (15 June 2018)

Special Issue Editors


Dr. Maurizio Omologo
Guest Editor
Fondazione Bruno Kessler (FBK), 38122 Trento, Italy
Interests: digital signal processing; audio and music signal processing; robustness in ASR; audio and speech corpora; microphone arrays for speech recognition and acoustic scene analysis

Prof. Dr. Stefano Squartini
Guest Editor
Department of Information Engineering, Università Politecnica delle Marche, 60121 Ancona, Italy
Interests: computational intelligence and digital signal processing, with special focus on speech/audio/music processing and energy management

Prof. Dr. Tuomas Virtanen
Guest Editor
Laboratory of Signal Processing, Tampere University of Technology, 33720 Tampere, Finland
Interests: machine listening; audio content analysis; audio signal processing; sound source separation; sound event detection

Special Issue Information

Dear Colleagues,

Computational acoustic scene analysis is a highly active research field where audio signal processing and machine learning meet several scientific topics, such as room acoustics, microphone arrays, sound source localization, source separation, acoustic event detection, pattern classification, and many others. Emerging application fields include surveillance, environmental monitoring, hearing aids, and distant-speech interaction, for example in smart homes and industrial automation. In most of these cases, state-of-the-art techniques are still inadequate for deployment in real-world contexts.

Indeed, very challenging research problems remain to be solved, since real-world noisy and reverberant environments are typically characterized by the presence of multiple speakers and noise sources, often overlapping one another. With recent advances in machine learning, a significant transformation is under way in these fields, as witnessed by the many papers presented at recent conferences and workshops, and by challenges such as DCASE, ACE, REVERB, and CHiME.

In this Special Issue, we aim to present current advances in computational methods for acoustic scene analysis, on topics including, but not limited to, the following:

  • Acoustic event detection and classification

  • Acoustic scene classification

  • Environmental monitoring by means of audio signals

  • Sound source localization and tracking

  • Sound source and speech activity detection

  • Blind source separation

  • Acoustic scene understanding

Preference will be given to works describing advanced digital signal processing and machine learning techniques applied to challenging contexts, such as multiple, often overlapping, sound sources in real-world noisy and reverberant environments.

Dr. Maurizio Omologo
Prof. Dr. Stefano Squartini
Prof. Dr. Tuomas Virtanen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (2 papers)


Research

16 pages, 1051 KiB  
Article
Deep Learning for Audio Event Detection and Tagging on Low-Resource Datasets
by Veronica Morfi and Dan Stowell
Appl. Sci. 2018, 8(8), 1397; https://doi.org/10.3390/app8081397 - 18 Aug 2018
Cited by 28
Abstract
In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve training performance when dealing with this kind of low-resource dataset. We evaluate three data-efficient approaches to training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that the different training methods have different advantages and disadvantages.
(This article belongs to the Special Issue Computational Acoustic Scene Analysis)
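
As a rough illustration of the kind of model described in the abstract, the following minimal sketch trains a stacked convolutional and recurrent network (CRNN) from weak, clip-level labels by aggregating per-frame predictions over time. It is written in PyTorch; the layer sizes, the mean-pooling aggregation, and the class count are illustrative assumptions, not the authors' exact configuration.

    # Minimal CRNN sketch for weakly labelled audio tagging (assumptions:
    # 40 mel bands, 10 classes, mean pooling over time; not the paper's model).
    import torch
    import torch.nn as nn

    class CRNNTagger(nn.Module):
        def __init__(self, n_mels=40, n_classes=10):
            super().__init__()
            # Convolutional front end: learns local spectro-temporal patterns.
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, padding=1),
                nn.BatchNorm2d(32), nn.ReLU(),
                nn.MaxPool2d((2, 1)),  # pool frequency only, keep time resolution
                nn.Conv2d(32, 64, kernel_size=3, padding=1),
                nn.BatchNorm2d(64), nn.ReLU(),
                nn.MaxPool2d((2, 1)),
            )
            # Recurrent layer: models temporal context across frames.
            self.gru = nn.GRU(64 * (n_mels // 4), 64,
                              batch_first=True, bidirectional=True)
            self.frame_fc = nn.Linear(128, n_classes)

        def forward(self, x):            # x: (batch, 1, n_mels, n_frames)
            h = self.conv(x)             # (batch, 64, n_mels // 4, n_frames)
            b, c, f, t = h.shape
            h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)
            h, _ = self.gru(h)
            frame_probs = torch.sigmoid(self.frame_fc(h))  # per-frame activity
            # Weak labels give only clip-level presence, so aggregate the
            # frame-wise predictions over time before computing the loss.
            return frame_probs.mean(dim=1), frame_probs

    model = CRNNTagger()
    clip_probs, frame_probs = model(torch.randn(4, 1, 40, 500))
    loss = nn.functional.binary_cross_entropy(
        clip_probs, torch.randint(0, 2, (4, 10)).float())

Even though training uses only the clip-level loss, the intermediate frame_probs output provides a rough temporal localization of the detected events.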

12 pages, 1969 KiB  
Article
Acoustic Scene Classification Using Efficient Summary Statistics and Multiple Spectro-Temporal Descriptor Fusion
by Jiaxing Ye, Takumi Kobayashi, Nobuyuki Toyama, Hiroshi Tsuda and Masahiro Murakawa
Appl. Sci. 2018, 8(8), 1363; https://doi.org/10.3390/app8081363 - 13 Aug 2018
Cited by 16
Abstract
This paper presents a novel approach for acoustic scene classification based on efficient acoustic feature extraction using spectro-temporal descriptor fusion. Grounded on the finding in neuroscience that "the auditory system summarizes the temporal details of sounds using time-averaged statistics to understand acoustic scenes", we devise an efficient computational framework for sound scene classification using multiple time-frequency descriptors with discriminant information enhancement. To characterize the rich information of sound, i.e., local structures on the time-frequency plane, we adopt two-dimensional local descriptors. A more critical issue arises in how to logically 'summarize' those local details into a compact feature vector for scene classification. Although 'time-averaged statistics' are suggested by the psychological investigation, directly computing the time average of local acoustic features is not appropriate, since the arithmetic mean is vulnerable to extreme values, which are likely to be generated by interference sounds irrelevant to the scene category. To tackle this problem, we develop a time-frame weighting approach to enhance sound textures and suppress scene-irrelevant events, so that robust acoustic features for scene classification can be efficiently characterized. The proposed method was validated on the Rouen dataset, which consists of 19 acoustic scene categories with 3029 real samples. Extensive results demonstrate the effectiveness of the proposed scheme.
(This article belongs to the Special Issue Computational Acoustic Scene Analysis)
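
To illustrate the time-frame weighting idea from the abstract, the sketch below (NumPy) replaces the plain arithmetic mean of per-frame descriptors with a weighted time average that down-weights frames lying far from the median frame, so that brief interfering events contribute less to the summary statistics. The inverse-distance weighting rule is an assumption chosen for illustration, not the paper's exact scheme.

    # Robust time-averaged summary of per-frame descriptors (illustrative).
    import numpy as np

    def weighted_summary(frames, eps=1e-8):
        """frames: (n_frames, dim) array of per-frame local descriptors."""
        median = np.median(frames, axis=0)              # robust reference frame
        dist = np.linalg.norm(frames - median, axis=1)  # per-frame deviation
        weights = 1.0 / (dist + eps)                    # outlier frames get small weight
        weights /= weights.sum()
        return weights @ frames                         # (dim,) summary vector

    rng = np.random.default_rng(0)
    frames = rng.normal(size=(500, 64))                 # zero-mean scene texture
    frames[100] += 50.0                                 # one loud interfering event
    print("plain mean error:   ", np.abs(frames.mean(axis=0)).max())
    print("weighted mean error:", np.abs(weighted_summary(frames)).max())

The plain mean is pulled away from the true (zero) texture statistics by the single interfering frame, while the weighted summary stays much closer to them.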
