*5.1. WASN-Based Dataset Analysis vs. Expert-Based Dataset*

As previously stated, a preliminary dataset of the suburban environment in Rome was published in [23]; it consisted of a set of expert-based recordings of limited duration and scope. The work presented in this article builds on the knowledge acquired from that first dataset. It responds to the need to extend the coverage of the RTN and the ANEs to all hours of the day and night, to weekends, and to situations in which factors unrelated to the noise sources, such as adverse weather, affect the measurements; that is, to real-operation conditions.

Compared with the previous dataset, the WASN-based dataset presented in this work has been enriched with seven classes: *rain*, *thun*, *tran*, *bird*, *alrm*, *inte* and *airp*. Conversely, it has not been possible to record the noise of people talking (*peop*), which in the previous dataset was caused by the presence of workers at the portals. The remaining ANE subcategories were already part of the expert-based dataset, but with fewer occurrences because that dataset was much smaller. ANEs accounted for 3.2% of the total recorded time in the preliminary dataset, whereas they account for 1.8% in this new dataset. A possible explanation for this difference is that the expert-based recordings were concentrated in the daytime, whereas the WASN-based dataset covers both day and night, and ANEs are less frequent at night than during the day.
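The effect of pooling day and night recordings on the overall ANE share can be illustrated with a minimal sketch. The function name `ane_share`, the background label `"rtn"`, and all durations below are assumptions made purely for illustration; they are not values or tooling from either dataset.

```python
from collections import defaultdict


def ane_share(segments, background_label="rtn"):
    """Return (overall ANE fraction of recorded time, per-label fractions).

    `segments` is a list of (label, duration_in_seconds) pairs; every label
    other than `background_label` is counted as an ANE.
    """
    total = sum(duration for _, duration in segments)
    if total <= 0:
        raise ValueError("total recorded duration must be positive")

    per_label = defaultdict(float)
    for label, duration in segments:
        if label != background_label:
            per_label[label] += duration

    shares = {label: d / total for label, d in per_label.items()}
    return sum(per_label.values()) / total, shares


# Made-up durations in seconds (hypothetical, NOT taken from the datasets).
day = [("rtn", 950.0), ("horn", 20.0), ("sire", 30.0)]
night = [("rtn", 995.0), ("door", 5.0)]

day_share, _ = ane_share(day)
night_share, _ = ane_share(night)
pooled_share, pooled_by_label = ane_share(day + night)
```

With these hypothetical inputs, the pooled share falls between the night and day shares, which is the mechanism suggested above: adding night-time recordings with few ANEs lowers the overall percentage without any change in daytime behavior.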

The longest ANEs have been found within the *sire* subcategory, while the shortest ones belong to the *door* subcategory, as was also the case in the expert-based recordings. Moreover, the ANEs labeled as *horn* and *sire* present the highest SNR in both datasets. However, it is worth noting that, in this new dataset, many occurrences of the *tran* and *inte* subcategories also exhibit high SNRs, a characteristic that was not found in [23].

From this comparative analysis, it can be deduced that the data captured in the expert-based dataset were suitable for a first characterization of the suburban soundscape. Nevertheless, the WASN-based recording campaign has revealed several noise subcategories that were not recorded and labeled in the preliminary campaign and that present critical characteristics in terms of SNR and duration.
