**6. Conclusions**

This study described the construction of a soundscape-based exhibition environment using deep neural networks. The baseline was improved from several perspectives, including modeling methods, learning methods, and domain adaptation methods. In addition, the soundscape music selected by our system was played through hyper-directional speakers to enhance the appreciation experience. To measure the improvements in user experience, we devised a soundscape music evaluation method and an appreciation experience evaluation method and conducted extensive experiments with 70 subjects.

However, our research has three major limitations. First, our models did not achieve state-of-the-art performance; in particular, the models selected for mutual learning had relatively low accuracy in terms of feature extraction. Models with improved audio feature representations therefore need to be developed. Second, our current database comprised only 2000 music pieces, which were also limited in terms of genre. Future studies should use a broader database containing music from various cultures and eras. Third, although we conducted experiments with 70 individuals in total, each experiment involved a group of only 10 people, so our experimental results cannot be generalized; large-scale experiments are needed for this purpose. Finally, this study reports the results of a pilot test with sighted participants. We hope that its findings will help people with visual impairments appreciate art and promote the cultural enjoyment rights of people with visual impairments.

**Author Contributions:** Conceptualization, methodology, and software, Y.K.; Data collection, experiments, and data curation, H.J.; Review and validation of UX experiments, J.-D.C.; Writing—review and editing, funding acquisition, and project administration, J.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2018-0-01798) supervised by the IITP (Institute for Information and Communications Technology Promotion).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
