### **1. Introduction**

Cities are currently growing in both size and population, and the consequent increase in vehicles is making the traffic noise problem more acute, with a clear effect on the quality of life of their citizens [1]. Noise is one of the main environmental health concerns [2,3], and its impact on social and economic aspects has been demonstrated [4]. To address this issue, European authorities have promoted the European Noise Directive (END) [5], which focuses on the creation of noise-level maps to inform citizens of their exposure to noise and to help the authorities take appropriate action to minimize its impact.

Noise maps have historically been generated by means of costly expert measurements using certified sound-level meters, based on short-term measurement periods intended to be sufficiently representative. This approach is now being superseded thanks to the technological advances of the Internet of Things in the framework of smart cities, which have allowed the emergence of Wireless Acoustic Sensor Networks (WASN) [6,7]. In the literature, several WASNs have been designed for urban sound monitoring, some of them focused on security and surveillance and others on city noise management, involving noise mapping, the development of action plans, and public awareness campaigns. For example, the SENSEable project [8] deployed a WASN to collect information from the acoustic environment by means of a set of low-cost acoustic sensors, with the goal of analyzing those data together with public health information. Similar projects include the IDEA project in Belgium [9], the RUMEUR network in France [10], which has a special focus on aircraft noise, and the Barcelona noise-monitoring network [11], whose data are integrated into the Sentilo city management platform [12]. Recently, the SONYC project has deployed 56 low-cost acoustic sensors across New York City to monitor urban noise and perform a multi-label classification of urban sound sources in real time, although not on site [4]. Finally, the LIFE DYNAMAP project [13] aims to monitor the noise level generated by road infrastructures by means of two WASNs installed in two pilot areas, one within an urban environment in Milan (District 9), and another in a suburban area surrounding Rome (A90 highway). To monitor Road-Traffic Noise (RTN) levels reliably, all Anomalous Noise Events (ANE) unrelated to regular RTN (e.g., sirens, horns, speech) should be removed before updating the corresponding noise maps [13].

The deployment of these projects has shown that the WASN paradigm entails several challenges, ranging from technical issues [14,15] to other aspects related to the WASN-based application, such as the automation of data collection and the subsequent signal processing [16–18], especially if the system is intended to detect acoustic events in real operation and locally in each sensor. Acoustic event detection and classification belong to the Computational Auditory Scene Analysis (CASA) paradigm [19] and are usually based on the segmentation of the input acoustic data into slices that each represent a single occurrence of the target class, focusing on individual simultaneous events [20]. To do so, Acoustic Event Detection (AED) algorithms are typically trained with databases designed ad hoc for each problem to be solved, and hence they typically consider a finite set of predefined acoustic classes [21,22].

Therefore, the development of AED-based applications requires representative audio databases containing all kinds of sounds of interest, such as the one obtained in the SONYC project [4], which gathers data from 56 sensors deployed in different neighborhoods of New York and considers 10 different kinds of common urban sound sources labeled in an urban soundscape. As a first attempt to create an acoustic dataset to model the acoustic environments of the urban and suburban pilot areas in the framework of the DYNAMAP project, an expert-based recording campaign was conducted before the two WASNs were deployed [23]. The analyses showed that the highly local, unpredictable, and diverse nature of ANEs in real acoustic environments can differ greatly from previous models obtained by means of synthetically generated datasets [23]. After labeling the gathered acoustic data, the dataset was used to train the AED-based algorithm designed to detect ANEs, known as the Anomalous Noise Event Detector (ANED) [18]. Although that preliminary dataset collected a representative number of acoustic events of interest from both acoustic environments, it missed several key aspects, such as the different RTN patterns observed between day and night and between weekdays and weekends, and the effect of diverse weather conditions [24]. This work describes the generation of the acoustic dataset to model Rome's acoustic environment in real-operation conditions, after deploying the 19-node WASN in its final location. The paper describes the recording campaign conducted and the subsequent labeling of ANEs into 16 different subcategories (without considering combined sounds in a sample), as well as the analysis of their occurrences, duration, Signal-to-Noise Ratio (SNR), and impact on the A-weighted equivalent noise level (*LAeq*) computation.
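
To make the notion of "impact on the *LAeq*" more concrete, the following minimal Python sketch computes an equivalent level from a sequence of equal-duration short-term levels using the standard energy average, and then compares the result with and without the frames labeled as ANE. It is only an illustration under stated assumptions: the function names `leq` and `ane_impact`, the 1-second frame length, and the example levels are hypothetical, and the exact impact metric used later in this work may be defined differently (for instance, this sketch simply discards the ANE frames rather than replacing them).

```python
import math


def leq(levels_db):
    """Energy-average a list of equal-duration short-term levels (in dB)."""
    if not levels_db:
        raise ValueError("at least one level is required")
    # Convert each level to relative energy, average, and convert back to dB.
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def ane_impact(levels_db, is_ane):
    """Difference (dB) between the level with and without ANE-labeled frames.

    Assumption: ANE frames are simply dropped before recomputing the level.
    """
    rtn_only = [lvl for lvl, flag in zip(levels_db, is_ane) if not flag]
    return leq(levels_db) - leq(rtn_only)


# Hypothetical 1-second A-weighted levels: steady traffic plus one loud event.
levels = [60.0, 60.0, 80.0, 60.0]
labels = [False, False, True, False]
impact = ane_impact(levels, labels)
print(f"Impact of the labeled ANE: {impact:.1f} dB")
```

In this toy example a single 80 dB frame among 60 dB traffic frames raises the equivalent level by about 14 dB, which illustrates why short but salient events can bias an RTN noise map if they are not removed.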

The remainder of this paper is organized as follows. Section 2 details the most relevant previous attempts to generate environmental audio databases. Section 3 describes the generation and labeling of the environmental audio database recorded under real-operation conditions in the suburban scenario. Section 4 analyzes the ANEs of the dataset in terms of occurrences, duration, SNR, and impact on the *LAeq*. Finally, Section 5 discusses in detail the results obtained in the analysis and the future applications of the designed dataset.

### **2. Related Work**

In the literature, several audio databases related to machine hearing (or machine listening) have been released for benchmarking purposes under the umbrella of CASA, mainly oriented toward evaluating the performance of acoustic scene classification and AED. This section reviews the literature on environmental acoustic databases and the datasets designed for challenges (e.g., DCASE), and describes their characteristics and limitations.
