1. Introduction
Human well-being in cities is partly determined by the acoustic environment, which has been changing in multiple ways in the past decades [
1,
2,
3]. Noise pollution has been an important topic for society since it is clear that negative effects are spreading rapidly across different environments and across species [
4,
5,
6,
7]. Environmental acoustics deals with noise and vibration caused by traffic, aircraft, industrial equipment, and recreational activities [
8,
9]. Not only are there seriously detrimental effects for humans [
10,
11], but anthropogenic noise is causing a serious loss of habitat, biodiversity, and natural soundscapes [
5,
12,
13]. Consequently, approaching the noise pollution problem from just an acoustical point of view is not enough. There is a strong need for a multidisciplinary approach and collaboration between experts from different disciplines (e.g., acousticians, biologists, ecologists, urban planners, etc.) [
5,
14].
City sounds typically contain a lot of traffic noise and frequent human voices, and it is well known how urban residents appreciate these sounds. From the human perspective, many people live in very noisy conditions and can often cope with the constant presence of urban noise. However, exposure to exceptionally high and unpredictable levels of noise can have several negative consequences for human health, including both physical (e.g., respiratory agitation, racing pulse, high blood pressure, headaches) and psychological symptoms (e.g., attacks of stress, fatigue, depression, anxiety) [
15,
16]. Furthermore, exposure to high noise levels can cause sleep, memory, and concentration disorders [
17,
18,
19]. The appreciation of noisy urban conditions can be improved by masking irritating or frustrating sounds or by adding sounds that listeners appreciate.
Noisy activities of humans also affect animals and have caused, for example, a decline in bird diversity and abundance in urban and natural environments [
5,
10,
11]. Given the continuous rise of the human population and the global spread of urbanization and anthropogenic noise, biodiversity conservation has become a global priority for policy makers; industry and nongovernmental organizations; and researchers of urban ecology, human perception, and well-being. Biodiversity not only has intrinsic value but is also beneficial for humans since many studies have shown that people find natural environments and the associated sounds pleasant and desirable (e.g., the sound of water or birds) [
20,
21,
22,
23]. Traffic noise may therefore have a direct negative impact on people as well as an indirect one, through the loss of bird diversity and the resulting reduction in vocal diversity.
The vocal presence of birds can positively affect appreciation of urban soundscapes and increase the perceived soundscape restorativeness [
24,
25], but few studies have addressed this psycho-acoustic hypothesis [
26]. A convenient way to explore human perception of natural sounds in urban acoustic environments is by using the soundscape concept [
21,
27,
28]. The soundscape consists of three components: biophony, including all natural sounds of animals; geophony, including all natural sounds of abiotic origin; and anthrophony, including all sounds made by humans, human machinery, and activity [
3]. The appreciation of a particular biophony component is best tested in the presence of realistic levels and diversity of both other components. The impact of the presence of birdsong, for example, can be tested experimentally by playing back natural recordings that contain the most common anthropogenic sounds (e.g., traffic and human voices) and some wind turbulence or rustling leaves, presented with and without a bird singing.
An adequate psycho-acoustic study on preference variation in urban soundscapes requires appropriate objective sound measurements of the test stimuli used for exposure. When conducting this type of study, one needs to investigate objective acoustic parameters (e.g., spectrum, loudness, sharpness, roughness, and fluctuation strength) alongside subjective parameters. Together, these can help determine the public perception of a certain acoustic environment, which could ultimately serve as a guideline for identifying and creating a pleasant acoustic environment [
27,
28]. An adequate psycho-acoustic study also requires an appropriate assessment method for the subjective experience by exposed test persons. In our previous work [
29], we established the statistical significance for six bipolar adjective pairs, allowing us to use them in this study together with the assigned numerical score, which enables and facilitates the statistical analysis of specific sound properties [
30].
In the current study, we used a common urban soundscape recording enriched with birdsong from one of five different species at a time to test for the impact of avian vocal presence on the appraisal of urban soundscapes. We aimed to test the effect of natural birdsong in general and explored how the appraisal varied among the species-specific song variants. We enriched the recording of a small urban park with a children’s playground with birdsong and investigated the appraisal of birdsong presence compared to the unaltered control in a replicate sample of human residents.
3. Results
We calculated objective acoustic parameters for all recordings, i.e., mean values, standard deviation, and maximum values of loudness, sharpness, roughness, and fluctuation strength. This was carried out using Zwicker’s method [
45,
46] and is shown in
Table 2,
Table 3,
Table 4 and
Table 5.
In addition to the tables,
Figure 14 shows the loudness analysis for the recording of a children’s park, which was played back to the control group (marked in black), and for the recording of the children’s park together with the sound of the blackcap, which was played back to one of the experimental groups (marked in red). We compared these two recordings since they showed the largest difference in mean and maximum values of loudness.
A direct analysis of the obtained listeners’ responses based on the descriptive marks is shown in
Table 6. For each considered acoustic environment, the mean value of response (x̄), the standard deviation of the sample (σ), and its square, i.e., the variance (σ²), were calculated.
Figure 15 shows the comparison of mean values for each bipolar adjective pair and every recording considered.
The direct analysis shows that average ratings differ between the control and experimental groups for each pair of bipolar attributes, although the size of the difference varies between attribute pairs.
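The descriptive statistics above can be sketched in a few lines of Python. This is only an illustration: the function name `describe` and the rating values are hypothetical, and the sample (N − 1) form of the standard deviation is an assumption based on the phrase "standard deviation of the sample".

```python
import statistics


def describe(ratings):
    """Return (mean, sample standard deviation, variance) for one attribute pair."""
    mean = statistics.mean(ratings)
    std = statistics.stdev(ratings)      # sample standard deviation (N - 1 denominator)
    variance = std ** 2                  # the variance is the square of the deviation
    return mean, std, variance


# Hypothetical ratings given by one group for one attribute pair
control_ratings = [3, 4, 4, 5, 3, 4]
mean, std, variance = describe(control_ratings)
print(f"mean={mean:.2f}, std={std:.2f}, variance={variance:.2f}")
```

Running `describe` once per recording and attribute pair would yield the kind of summary reported in a table of means and variances.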
Regarding the second part of the questionnaire, which helped us to rank the sound sources according to their pleasantness, the results obtained are shown in
Figure 16.
To analyze the data obtained from the questionnaire more rigorously, we examined the statistical significance of the noted differences (shown in
Table 6 and
Figure 15). To identify statistically significant differences in attribute pairs between groups, we used the
t-test and, subsequently, ANOVA (analysis of variance) [
47,
48]. Before performing the tests, certain assumptions must be fulfilled (e.g., normality of data, homogeneity of variance, independence, random sampling). The aforementioned calculations are included in
Appendix A.
The experimental research methodology postulates the manipulation of one independent variable, which is, in this case, added birdsong sounds, while other variables are kept constant for both the experimental groups and the control group [
47]. In each iteration, the two groups (i.e., one control and one experimental) represent independent samples on which statistical significance can be established. When using a two-sample
t-test for evaluation of the difference between means of small independent samples, it is necessary to specify the level of significance α and to determine the degrees of freedom (
N − 1), where
N is defined as the size of the sample. As is common in similar statistical analyses, we set α = 0.05, which corresponds to a 95% confidence level. The results of the
t-test applied to the listeners’ responses on each environment for each attribute pair are shown in
Figure 10 and
Figure 11. The so-called
t-value, which is the core of the
t-test, is calculated according to the following expression [
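47]:
$$t = \frac{\bar{x}_{\mathrm{exp}} - \bar{x}_{\mathrm{cont}}}{\sqrt{\dfrac{\sigma_{\mathrm{exp}}^{2}}{N_{\mathrm{exp}}} + \dfrac{\sigma_{\mathrm{cont}}^{2}}{N_{\mathrm{cont}}}}}$$
where x̄, σ², and N denote mean values of responses, variance, and number of samples, respectively, while indices “exp” and “cont” refer to the experimental and control groups, respectively.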
Using the standard table of significance [
47], it is found that the selected significance level of α = 0.05 corresponds to the demand for a
t-value to be greater than 2.01 to achieve a statistically significant difference between the mean values obtained for control and experimental groups.
Figure 17 shows the calculated
t-values between the control group and experimental groups. The attribute pairs on the x-axis are numbered as in
Table 1. For comparison, the value of
t = 2.01, which is the significance threshold, is also drawn in
Figure 17 (dashed line).
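The comparison against the threshold can be sketched as follows. This is a minimal illustration, assuming the common unpooled two-sample form t = (mean_exp − mean_cont) / sqrt(var_exp/N_exp + var_cont/N_cont). The function names and rating values are hypothetical; only the threshold of 2.01 is taken from the text.

```python
import math
import statistics

T_THRESHOLD = 2.01  # significance threshold quoted in the text for alpha = 0.05


def t_value(exp, cont):
    """Two-sample t-value for an experimental and a control group (unpooled form)."""
    mean_diff = statistics.mean(exp) - statistics.mean(cont)
    std_error = math.sqrt(
        statistics.variance(exp) / len(exp) + statistics.variance(cont) / len(cont)
    )
    return mean_diff / std_error


def is_significant(exp, cont, threshold=T_THRESHOLD):
    """True if the magnitude of the t-value exceeds the chosen threshold."""
    return abs(t_value(exp, cont)) > threshold


# Hypothetical ratings for one attribute pair
experimental = [5, 6, 6, 7, 5, 6]
control = [3, 4, 4, 5, 3, 4]
print(round(t_value(experimental, control), 2), is_significant(experimental, control))
```

The absolute value is used so that the check does not depend on which group has the higher mean; whether the original analysis was one-sided or two-sided is not stated in this section.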
In the final step, it is necessary to test whether listeners’ responses differ in a statistically significant way between environments for the different attribute pairs. To differentiate the attribute pairs, we independently performed the one-way analysis of variance (ANOVA) [
44,
for control and experimental groups. ANOVA is a test used to determine differences between research results from three or more unrelated samples or groups. This method calculates the so-called F-value, which is defined as the ratio of the variance between sequences to the pooled variance within sequences, each normalized by its degrees of freedom (in this case, each sequence represents one acoustic environment recording).
Figure 18 presents the calculated F-values for each attribute pair across all groups. With a significance level set at α = 0.05 (95% confidence), an F-value exceeding 2.21 is required [
47]. It can be observed that this criterion is met for all adjective pairs in each group.
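For readers who want to reproduce this kind of check, the one-way ANOVA F-value can be computed as in the sketch below. It follows the textbook definition (between-group mean square divided by within-group mean square); the function name `f_value` and the example ratings are hypothetical and are not taken from the study data.

```python
import statistics


def f_value(groups):
    """One-way ANOVA F-value for a list of groups (each a list of ratings)."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total number of ratings
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual ratings around their own group mean
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)                # normalized by k - 1
    ms_within = ss_within / (n_total - k)            # normalized by N - k
    return ms_between / ms_within


# Hypothetical ratings of one attribute pair for three recordings
ratings_per_recording = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(f_value(ratings_per_recording))
```

The resulting value would then be compared with the critical F-value for the chosen significance level and the corresponding degrees of freedom.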
4. Discussion
We analyzed the spectrograms of the obtained recordings from
Figure 9,
Figure 10,
Figure 11,
Figure 12 and
Figure 13. When comparing all the spectrograms, it is visible that by adding the sound of birds, a distinct acoustic pattern for each bird species appears in a higher frequency band (i.e., from 1 kHz to 8 kHz). In addition, it is necessary to emphasize that each bird species has a different vocalization pattern, which can also be easily observed in
Figure 9,
Figure 10,
Figure 11,
Figure 12 and
Figure 13. By observing the spectrograms, it can be concluded that adding bird sounds complements and enriches the soundscape. However, the question remains how the residents will perceive these newly designed soundscapes, and this is why we investigated the subjective perception and feedback attained from the study participants.
Table 2,
Table 3,
Table 4 and
Table 5 and
Figure 14 show variations in certain objective acoustic parameters, specifically in loudness and sharpness, while other parameters, such as roughness and fluctuation strength, did not change significantly. Loudness and sharpness are slightly higher in the recordings with added birdsong. Given the high-frequency content of the original bird recordings prior to mixing them with a recording of a children’s park, an increase in sharpness was anticipated in the experimental group. Generally, heightened sharpness aligns with an increase in high-frequency content, a range to which human hearing is more sensitive; thus, it is likely that listeners notice the bird sounds more distinctly. The analysis confirmed that adding the sounds of specific bird species to the original signal did not significantly affect roughness or fluctuation strength, as these values were initially low. Given the careful design of the recordings to integrate bird sounds naturally, this outcome aligns well with our expectations.
The first part of the questionnaire consists of six adjective pairs with a semantic differential that can describe a certain acoustic environment. When we compared only the mean values of all adjective pairs (shown in
Figure 15), more favorable feedback was achieved for the recordings with birds, except in the case of the great tit.
The second part of the questionnaire is focused on the participants’ choices of the most pleasant sound source that appears in each created soundscape. When observing
Figure 16, it is apparent that by adding the sound of birds to the existing soundscape recording, the participants’ choices of the most pleasant sound source shift in favor of the added singing bird. This is especially visible in cases of the blackcap, chaffinch, and great tit.
In the case of blackbird and robin, the distribution is more divided between the sound sources. The reason for this could be the fact that a blackbird and robin “fitted better” in the original soundscape of the children’s park, or in other words, their singing was not that noticeable for most study participants. If we look at the spectrograms, these two birds do have a distinct pattern; however, they are not as intense as the others. In addition, if we observe the results of the conducted
t-test for these two cases, the obtained t-values do not indicate a statistically significant difference from the control group (
Appendix B), which aligns with the drawn conclusion. Finally, only a few participants chose the answer “none of the above”. It can be concluded that those soundscapes were perceived as very natural and, in general, pleasant for the participants.
Regarding the results obtained from statistical methods, namely
t-test and ANOVA, from
Figure 17, it can be concluded that the majority of attribute pairs (i.e., 1, 2, 4, 5, and 6) exhibit statistically significant values in two considered environments: children’s park with added sound of blackcap and children’s park with added sound of chaffinch. It can be determined that these two birdsong choices had a positive and significant impact on the listeners’ appreciation of newly designed and enriched recordings.
Figure 18 shows the calculated F-values for each attribute pair for every group. The set level of significance α = 0.05 (95% certainty) is equivalent to the demand for the F-value to be larger than 2.21 [
47]. This demand is fulfilled for all the adjective pairs in every group. The ANOVA calculation is also included in
Appendix B. A more detailed analysis shows that the biggest statistical difference (i.e., the largest F-value) among the acoustic environments is achieved for the attribute pair inconspicuous–conspicuous, followed by rough–gentle, stressful–soothing, unpleasant–pleasant, monotonous–diverse, and artificial–natural. It can be concluded that all six attribute pairs (namely monotonous–diverse, unpleasant–pleasant, artificial–natural, stressful–soothing, inconspicuous–conspicuous, and rough–gentle) can describe and differentiate the selected environments in a statistically significant way.
In addition, to compare the results and reach an appropriate conclusion, we utilized the calculation of loudness, sharpness, roughness, and fluctuation strength that occur 95% of the time (N95 factor) (shown in
Table 7,
Table 8,
Table 9 and
Table 10). This factor is derived from the probability density function (PDF) of a certain value. The density distribution provides a better insight into the distribution of a parameter than mere average or maximum values. We calculated the density distributions for all considered objective acoustical parameters. Loudness is important because the sounds of birds need to be “noticed” within a sound environment for listeners to perceive that environment as pleasant. Sharpness is the parameter that reveals the extent of high frequencies present in the audio signal. In the spectrogram images of several sound environments, it is evident that birds occupy a frequency range above 2 kHz, suggesting that the sharpness of their sounds will be higher. This is also reflected in the last column of the calculated sharpness table (
Table 8), which includes an analysis of the birdsong alone. The same observation applies to loudness: the loudness and sharpness values of the birdsong alone are significantly higher. To provide a more detailed analysis, we calculated two additional objective parameters, roughness and fluctuation strength, since they also provide valuable information about the sound modulation that can be annoying for people (shown in
Table 9 and
Table 10). Finally, the peak frequency of each bird species is included in
Table 11.
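A percentile value of this kind can be read from a time series of a parameter roughly as sketched below. This is a hedged illustration, not the authors' procedure: it assumes that N95 denotes the value exceeded during 95% of the recording (the usual convention for percentile levels in acoustics), it uses the simple nearest-rank method, and the function name `exceeded_level` and the sample values are hypothetical.

```python
import math


def exceeded_level(samples, percent):
    """Value that is reached or exceeded by `percent` % of the samples (nearest rank)."""
    ordered = sorted(samples, reverse=True)              # largest value first
    rank = math.ceil(percent * len(ordered) / 100)       # how many samples must be >= result
    rank = min(max(rank, 1), len(ordered))               # keep the rank inside the list
    return ordered[rank - 1]


# Hypothetical time series of a parameter (e.g., loudness sampled over time)
loudness_series = list(range(1, 101))

n95 = exceeded_level(loudness_series, 95)   # exceeded 95% of the time
n5 = exceeded_level(loudness_series, 5)     # exceeded only 5% of the time
print(n95, n5)
```

If the 95th percentile of the distribution is meant instead, the same function can be called with `percent=5`.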
Among all the parameters in the questionnaire, we selected the adjective pair unpleasant–pleasant. We chose this pair because it closely aligns with the participants’ understanding of a “good” sound environment and their overall sense of well-being in this study.
Figure 19 illustrates how the rating of this parameter depends on the calculated loudness at the 95% level across individual recordings. From the figure, it can be concluded that for listeners to assign a high rating to the sound environment, the birds must be louder than the surrounding noise, allowing them to “stand out”.
Figure 20 shows how the rating of this parameter depends on the calculated sharpness at the 95% level across individual recordings. It can be concluded from the figure that the sharpness of birdsong should not be too high; an excessive proportion of high frequencies (high sharpness) tends to diminish sound environment appraisal.
Figure 21 and
Figure 22 illustrate how the adjective pair unpleasant–pleasant depends on roughness and fluctuation strength 95% of the time. It is evident that listeners show a preference for sounds with slower modulation.
From
Figure 19,
Figure 20,
Figure 21 and
Figure 22, we can infer that listeners rate the sound environment as pleasant if they can hear birds, which means the birds’ sounds must be relatively loud. Conversely, listeners tend to evaluate the songs of birds with a lower proportion of high frequencies more favorably.
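One way to put a number on such relationships is a correlation coefficient between the mean unpleasant–pleasant rating and a percentile parameter across recordings. The sketch below is illustrative only: this section does not state which measure was used, so Pearson's r is an assumption, the function name `pearson_r` is hypothetical, and the values are placeholders rather than data from the tables.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Placeholder values, one entry per recording (NOT taken from the study)
percentile_loudness = [10.0, 11.5, 12.0, 13.2, 12.8, 11.0]
mean_pleasantness = [3.1, 3.6, 3.9, 4.4, 4.0, 3.3]
print(round(pearson_r(percentile_loudness, mean_pleasantness), 2))
```

With only six recordings, any such coefficient should be read as a rough indication of a trend rather than as a statistically robust result.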
Among the species analyzed, the chaffinch slightly stands out. It is loud enough, yet its sharpness is relatively low, indicating that its song is not overly sharp. Furthermore, the sound of this bird did not significantly alter the roughness or fluctuation strength parameters, which may enhance its appeal and contribute to a more pleasant listening experience. The peak frequency of this bird’s sound falls within the mid-range of the observed frequency spectrum. The objective acoustic parameters combined with the subjective questionnaire analysis and statistics have shown that the best results are achieved with the added sound of chaffinch, which shows that not all singing bird sounds will contribute to the overall positive rating of soundscapes.
Finally, it should be noted that this type of study has certain limitations. The main limitation was the involvement of people, i.e., obtaining a sample large enough to establish the statistical significance of the results. Such studies are therefore very time-consuming and require considerable effort from the researchers when designing and implementing experiments. The experiment must be accessible to a large number of people, and it should not take too long for the participant (to avoid fatigue and annoyance) while still providing enough quality data for the researchers. Throughout the whole experiment, it is desirable to spread awareness about the problem that is being researched because it piques the residents’ interest and involvement.
5. Conclusions
In this paper, one pre-recorded acoustic environment, which typically occurs in contemporary urban areas, was used to design five new recordings, each enriched with the song of a different bird species, namely blackbird, blackcap, chaffinch, great tit, and robin.
We investigated all six recordings in objective and subjective terms. The objective acoustical differences in the studied environments are loudness and spectral distributions, while subjective psychological parameters are determined by examining the responses of the study participants using a specially designed questionnaire. The designed questionnaire relied on the semantic differential method implemented by defining attribute pairs of opposite meanings, where each pair described a sound characteristic for a particular acoustic environment.
After a detailed analysis of the results attained from the semantic differential and direct participant feedback, it can be concluded that adding birdsong improves the perception of the acoustic environment, although the size and statistical significance of the improvement depend on the bird species.
When taking into consideration all the obtained data, mainly the overall rating (i.e., a combination of the attribute pairs and direct responses), the best result was achieved for the recording with the added sound of chaffinch. From an acoustical point of view, each bird species has a distinct pattern in a higher frequency band (i.e., from 1 kHz to 8 kHz), which can be observed in the spectrograms. In addition, listeners assigned different ratings and provided different responses for each bird species. Therefore, it can be established that urban residents do have certain preferences regarding bird species, especially when we consider the calculation and correlation of objective acoustic parameters (loudness, sharpness, roughness, and fluctuation strength) with subjective questionnaire results.
The results presented here offer valuable insights into human responses to enriched soundscapes and suggest several avenues for future research. We plan to continue our work of raising awareness regarding the decrease in biodiversity as a consequence of noise pollution, especially in urban areas. Since the study has shown that listeners favor and enjoy the enriched soundscapes, it can serve as a guideline for future work regarding the selection of singing bird species.
Future work will be oriented towards creating more soundscapes with different bird species in order to obtain a more extensive “ranking system” of singing birds in terms of listeners’ perception and overall pleasantness. Thus, we believe that in collaboration with scientists from other fields, this type of finding will have a great impact on the quality of life in urban areas.