*Article* **Analysis of Forest Landscape Preferences and Emotional Features of Chinese Forest Recreationists Based on Deep Learning of Geotagged Photos**

**Xitong Zeng <sup>1</sup>, Yongde Zhong <sup>1,2,3,</sup>\*, Lingfan Yang <sup>4</sup>, Juan Wei <sup>1</sup> and Xianglong Tang <sup>1</sup>**


**Abstract:** Forest landscape preference studies have an important role and significance for forest landscape conservation, quality improvement and utilization. However, there are few studies on objective forest landscape preferences from the perspective of plants and using photos. This study relies on Deep Learning technology to select six case sites in China and uses geotagged photos of forest landscapes posted by the forest recreationists on the "2BULU" app as research objects. The preferences of eight forest landscape scenes, including look down landscape, look forward landscape, look up landscape, single-tree-composed landscape, detailed landscape, overall landscape, forest trail landscape and intra-forest landscape, were explored. It also uses Deepsentibank to perform sentiment analysis on forest landscape photos to better understand Chinese forest recreationists' forest landscape preferences. The research results show that: (1) From the aesthetic spatial angle, people prefer the flat view, while the attention of the elevated view is relatively low. (2) From the perspective of forest scale and level, forest trail landscape has a high preference, implying that trail landscape plays an important role in forest landscape recreation. The landscape within the forest has a certain preference, while the preference of individual, detailed and overall landscape is low. (3) Although forest landscape photographs are extremely high in positive emotions and emotional states, there are also negative emotions, thus, illustrating that people's preferences can be both positive and negative.

**Keywords:** forest landscape; deepsentibank; deep learning; geotagged photos; sentiment analysis

### **1. Introduction**

Forest aesthetics first originated in Germany, founded and proposed by Salisch in 1885, signaling the beginning of a focus on the natural aesthetics of forests. The increase in the dissection and discussion of forest beauty also laid the foundation for the study of forest landscapes and their preferences [1]. Forest landscape is a landscape composed of a forest ecosystem as the main body, and its research aims to reveal certain basic laws through the structure and function of forest landscape, and to implement the protection, construction, planning, restoration and management of forest landscape through scientific means on this basis [2]. Globally, forests perform a variety of functions for people and the social value of the forest environment has received a great deal of attention, especially the health function, recreational value and landscape appreciation brought by forest tourism [3–6]. People involved in forest tourism are generally inclined to prefer the plants element, the main landscape element in the forest landscape, and to care about the trees in the forest [7,8]. The aesthetic role of plants is the most important and the first to be recognized and used [9].

**Citation:** Zeng, X.; Zhong, Y.; Yang, L.; Wei, J.; Tang, X. Analysis of Forest Landscape Preferences and Emotional Features of Chinese Forest Recreationists Based on Deep Learning of Geotagged Photos. *Forests* **2022**, *13*, 892. https://doi.org/10.3390/f13060892

Academic Editors: Radu-Daniel Pintilii and Diego Varga

Received: 13 May 2022 Accepted: 6 June 2022 Published: 8 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

People have also learned from their experience of forest recreation that the forest landscape is an integral part of the visual image of the forest [10]. However, there are still few studies on forest landscapes from the perspective of plants or animals [11]. We therefore explore forest landscape preferences from a botanical point of view, to help fill this gap.

China reported 6 billion forest visitors in a 4-year period (2016–2019), with an average annual growth rate of 15%. Among them, the number of forest tourism visitors reached 1.8 billion in 2019, creating a comprehensive social output value of USD 25.7 million; in the wake of the COVID-19 pandemic, forest tourism has also gained momentum and reached 84.2% of the annual visitor arrivals for 2019 [12]. As forest tourism has gradually been chosen and accepted by the public, forest landscape preference studies have started to emerge, focusing on human interaction. An early study of forest landscape preferences from a human perspective was conducted by Misgav and Amir, who examined the degree of people's preference for forest vegetation landscapes from a visual quality perspective [13]. Studies have shown that forest beauty does influence preferences and the number of visits to forested areas [14,15]. Investigating forest tourism can improve the planning and management of forest resources, especially in terms of forest landscape, which is an important component of forest tourism [16,17]. Forest classification has been a means of describing complex and diverse forest resources, based on vegetation features, landscape features or a combination of both. Forest landscape classification is important for ecological studies of forest landscapes. In reviewing the relevant literature, we find that forest landscapes mostly appear as one research element within destination landscape preference studies, with few studies dedicated to them. Forest landscape classification systems are also rarely used as a basis for identifying people's preferences for different categories of forest landscapes.

The advent of the Web 2.0 era has opened up more possibilities for the study of forest landscape preferences. Studies have conducted Internet-based studies on preferences for alpine forest landscapes, preferences for forest features and preferences for forest structure [18–20]. This has driven the emergence and catalyst of Travel 2.0, where travelers exchange travel-related content and engage in high levels of social interaction on the Internet. With the popularity and convenience of photography tools, travel photography is one of the indispensable behaviors in travel [21,22]. Travel experience is, again, the most important visual experience, and taking photos is also a fashionable behavior for people to share their lives and publish their experiences. Therefore, the photos shared by tourists have become the main dissemination channel for their preferences, and photo content analysis has been widely used in tourism research and is considered to play a more important role in the tourism process, with more scholars focusing on the nature behind the photographic behavior [23–26]. Some scholars have already conducted early experimental studies, in which tourists were asked to rate the content of photos and found that natural conditions, trail design and forest conditions all affect tourists' perceptions of the forest landscape and its trails [27]. Photographs are intrinsically linked to tourism, as a person must take some form of photography during their travels, and the photographs they take can reflect their unique personal motivations [24,26]. Studies have shown that there are no substantial differences in landscape preferences between visitors who post photos and participants who do not post photos [28]. Therefore, we want to explore more possibilities by using geotagged photos that people post, using the content of the photos to explore landscape preferences. 
In fact, travel photos shared by tourists contain not only objective information but also hidden information, so the photos posted on social media platforms can capture the emotions of tourists and the conditions of their experiences [29,30]. The visual perception of a forest environment does influence people's psychological emotions and adjusts their psychological state, and almost all studies agree that forests trigger positive emotions. Is it true that people in forest landscape environments feel only positive emotions? We wanted to explore whether other answers existed.

The field of artificial intelligence has made significant advances in computer vision (CV), image processing techniques and deep learning, offering many new possibilities and ideas for the study of preferences and of travel photos. Currently, in addition to manual coding with tools such as NVIVO and Textblob, or using tools for smart tagging of photos [31–34], computer vision technology provides a better solution path for visual content analysis in tourism. Recently, Transformer and Multilayer Perceptron (MLP)-based models, such as MLP-Mixer and Vision Transformer, have started to lead a new trend because of their excellent performance in ImageNet classification tasks [35]. The spread of programming languages, such as Python, has made interdisciplinary collaboration between the fields of tourism research and computing possible, and computer technologies, such as computer vision, image processing techniques and deep learning, have facilitated content processing, recognition and analysis of photos published on user-generated content (UGC) platforms [36]. Deep learning is mainly applied to two segments of tourism research: tourist volume prediction and image content mining. The application methods are mainly divided into two types: using pre-trained models as is and training models by transfer learning [37]. Tourism photo content mining mainly draws on big data from UGC platforms for visual content analysis. Image-based tourism research is also increasingly focused on latent sentiment analysis; e.g., Deng et al. started to use the Deepsentibank deep learning tool to study tourism group imagery, as well as destination images [38,39]. In addition, several studies have used pre-trained models to classify and analyze tourism photos. Payntar et al. analyzed tourism photographs of the Cusco World Heritage Site in Peru using the ResNet model [40]. Cho et al. used deep learning techniques to classify photos on Flickr in an attempt to develop a photo classification system for tourist destinations [41]. Kim et al. used the Inception-v3 model to classify tourist photos of Seoul and used it as the basis for their study [42]. These studies suggest that deep learning can mine photo content accurately. Although data generated by the widespread use of the Internet and social media reflect people's real preferences, there are still few articles that combine social media with computer vision algorithms in an attempt to understand individual preferences [43]. However, computer vision has been used successfully to characterize human–animal interactions [44], so we think the interaction between forest landscapes and humans can also be studied and analyzed by computer vision. In addition, deep learning in the tourism field has mostly been applied to broad tourism categories, with less segmentation of the internal elements within those categories. This literature inspired us to explore preferences for different classes of forest landscape using the MLP-Mixer model, which is simple and suited to a large number of classifications. This will help forest managers to plan, design and manage the development of forest tourism, forest beauty and forest landscape quality in a more focused manner.

Existing studies have mostly explored preferences by using all landscape elements or images of the destination as research themes, assuming by default that the photos people take reflect their preferences. Thus, we use deep learning and Deepsentibank to analyze image content, with the core problem of "how to explore people's preference of forest landscape through photos posted on UGC platforms". In other words, the preference analysis is supplemented by sentiment analysis of the pictures, aiming to make up for the lack of dedicated research on forest landscape preference and to try new angles and methods for discovering valuable information in the photos people publish, for use in forest tourism research. The specific objectives are: (1) Through data mining of the outdoor website 2BULU (https://www.2bulu.com/ accessed on 1 December 2021), a dataset of 15,052 photos of forest plants from six places in China (Kanas, Gongga, Four Girls Mountains, Shennongjia, Changbai Mountain and Moganshan) was established, and through deep learning the visual content of the photos was divided into eight scenes and three categories to determine Chinese forest recreationists' preferences for forest landscapes on this basis. (2) Explore the emotional attitudes carried by the photos using the Deepsentibank program to complement the preference study with a more objective perspective.

### **2. Case Sites and Datasets**

### *2.1. Case Sites*

The formation of forests is closely related to the long-term effects of the surrounding natural conditions. China is a vast country with five major climatic zones: cold temperate, temperate, warm temperate, subtropical and tropical, from north to south; precipitation generally decreases from south to north and decreases from east to west, and there are various topographies, such as high mountains, plateaus and hills, and basins. These all make the distribution of forests in different regions of China different, with obvious zonality, and also mean it has more types of forests. With the development of modern times, China has gradually increased both the importance and protection of forest cover. According to the State Forestry and Grassland Administration of China, China's forest cover has reached 23.04%, with a forest accumulation of over 17.5 billion cubic meters [45]. China is also gradually promoting forest tourism and actively fostering forest tourism products—building national forest parks, national forest trails, etc., calling for people to go to the forest to get in touch with nature. In addition, forest tourism festivals have made the social influence of forest tourism in China grow rapidly.

The vast majority of China's forest resources are concentrated in the northeast and southeast, while the vast northwest is poor in forest resources. In addition, 80% of China's population is located in the southeastern region, and tourism flows are heavily concentrated in the southeastern half of the country [46]. We selected the specific case sites considering that they have a certain number of forest recreationists and that these recreationists have uploaded a certain number of forest landscape photos for analysis in the "2BULU" app. Therefore, the majority of the case sites were in southeastern China. The difference in the level of economic development between the north and south, east and west of China limits the construction of forest recreation facilities and the accessibility of the forest for different vegetation types. Through pre-experiments and surveys, we set the case sites as Kanas, Gongga, Four Girls Mountains, Shennongjia, Changbai Mountain and Moganshan, in order to better ensure that the case sites are representative. The six selected case sites are all highly visible and influential in China, even world class, such as Shennongjia being selected as a World Biosphere Reserve Network and Changbai Mountain being selected as a United Nations "Man and Biosphere" nature reserve and an international A-class nature reserve. Based on this, we believe that the photographs of the six case sites are representative of the forest landscape preferences of Chinese forest tourists. See Figure 1 for location diagram.

### *2.2. Datasets*

The data comes from the UGC platform "2BULU"—an app for outdoor resource sharing and community interaction—which is widely used in daily trips, travel trips, wilderness camping, etc. It has a large user base and a large amount of data, and is also a more mature platform in China for obtaining photos taken by travelers and their metadata [47]. It should be noted that the users of "2BULU" are not only outdoor travelers, but also ordinary tourists and even local residents, school students, etc. The range of users is relatively wide, so we believe that its data source tends to encompass all kinds of travelers, rather than just outdoor travelers.

We used Python scripts to acquire the photo data using the names of the case sites as keywords. Due to the large amount of data and the fact that we only needed forest photos, we used the Tencent API (Application Programming Interface) filtering port for the first round of filtering in the data acquisition phase, selecting photos that contain plants. This improved the accuracy and efficiency of data screening and reduced the difficulty and effort of processing raw data from "2BULU". After the first round of selection, we obtained a total of 35,675 photos containing plants. We use a narrow sense of forest landscape, namely natural scenery with forest vegetation as the main part within the view of people at a certain point in time and space [48], and referred to the data processing method of White et al. for analyzing forest landscapes [49]. Therefore, we conducted a second and third round of manual screening to remove photos containing a large number of people, animals, lakes and other elements, which reduces distractions and shows the true attractiveness and preference of the natural environment [49]. Finally, 15,052 photos were obtained in the final dataset.
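The screening rounds described above can be pictured as a small staged pipeline. The following is only a minimal sketch, not the authors' script: the `Photo` record, its `labels` and `dominant` fields, and the tag vocabularies are hypothetical stand-ins for whatever the tagging service returned and the manual reviewers recorded.

```python
from dataclasses import dataclass, field

# Hypothetical tag vocabularies; the real labels used in the automatic and
# manual screening rounds are not given in the text.
PLANT_TAGS = {"tree", "forest", "plant", "leaf"}
DISTRACTOR_TAGS = {"people", "animal", "lake"}


@dataclass
class Photo:
    photo_id: str
    labels: set = field(default_factory=set)  # tags from automatic screening
    dominant: str = ""                        # main subject noted in manual review


def automatic_round(photos):
    """Round 1: keep photos whose automatic tags mention any plant."""
    return [p for p in photos if p.labels & PLANT_TAGS]


def manual_round(photos):
    """Rounds 2-3: drop photos dominated by people, animals, lakes, etc."""
    return [p for p in photos if p.dominant not in DISTRACTOR_TAGS]


def screen(photos):
    """Apply the automatic round first, then the manual rounds."""
    return manual_round(automatic_round(photos))


sample = [
    Photo("a", {"tree", "sky"}, "tree"),
    Photo("b", {"building"}, "building"),
    Photo("c", {"forest", "people"}, "people"),
]
kept = screen(sample)
print([p.photo_id for p in kept])  # ['a']

# Retention implied by the reported counts: 15,052 of 35,675 plant photos.
print(round(15052 / 35675, 3))  # 0.422
```

By the reported counts, the manual rounds retained roughly 42% of the photos that passed the automatic plant filter.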

> **Figure 1.** Schematic diagram of case site location. Chinese vegetation zoning data from the Resource and Environment Science and Data Center, Chinese Academy of Sciences. Base map according to the standard map of the Ministry of Natural Resources of China.

### **3. Methods**

### *3.1. Research Flow*

First, we crawled the photos on 2BULU and formed a dataset by filtering them with "Tencent API screening port + manual" screening. Second, the MLP-Mixer model was trained on a training set formed by randomly selecting photos from the dataset, yielding a classification model. At the same time, sentiment analysis was performed on the dataset to obtain adjectives and determine the sentiment status. The results of the two analyses complement each other to provide a more comprehensive assessment of Chinese forest recreationists' preferences for forest landscapes (see Figure 2).


**Figure 2.** Research Flowchart.

### *3.2. Forest Landscape Photo Classification*


The proportion of forest landscape in the photos people take has a relatively large impact on their preferences, and analyzing the content of the photos can recover, in greater detail, what people appreciate and describe the landscape more accurately [50,51]. In order to classify the forest landscape photographs taken by tourists, we developed the forest landscape photograph classification for this study based on the characteristics of the dataset and with reference to the forest landscape classification proposed by Chen et al. in 2001 based on forest beauty. Chen et al.'s classification combines both distance and aesthetic object scales to classify forest (plant) landscapes into seven levels: Single-tree Composed Landscape, Intra-forest Landscape, Forest Trail Landscape, Detailed Landscape, and Near, Medium and Far Landscape [52]. However, during the pre-experiment, we found that it was difficult for machine learning to recognize and judge the near, medium and far views based on the distance between the observer and the aesthetic object, so we replaced them with look down, look forward and look up, classified according to the vertical position of the observer relative to the aesthetic object; the overall view was added to the scale level of the aesthetic object. We therefore settled on 8 forest landscape scenes (Look Down Landscape, Look Forward Landscape, Look Up Landscape, Single-tree Composed Landscape, Detailed Landscape, Overall Landscape, Forest Trail Landscape and Intra-forest Landscape) in 3 forest landscape categories (Spatial Hierarchy, Forest Hierarchy and Scale Level) (see Figure 3). Spatial Hierarchy contains Look Down Landscape, Look Forward Landscape and Look Up Landscape. Scale Level includes Single-tree Composed Landscape, Detailed Landscape and Overall Landscape. Forest Hierarchy contains Forest Trail Landscape and Intra-forest Landscape, which differ from each other: Forest Trail Landscape is a landscape composed of roads and the forest stands along them, whereas Intra-forest Landscape is a landscape inside the forest, without a forest trail, composed of forest plants in groups rather than singly and of multiple components rather than single parts. Both are included in Forest Hierarchy because they are non-confined, extensive photographs of forest recreation in the forest interior and contain more components, which distinguishes them from the other types.
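The classification scheme described above is compact enough to write down as a lookup table. The sketch below is only an illustration of the three categories and eight scenes named in the text; the names `SCHEME`, `CATEGORY_OF` and `category_of` are hypothetical and not part of the study's code.

```python
# Three categories and their eight scenes, as named in the text.
SCHEME = {
    "Spatial Hierarchy": [
        "Look Down Landscape",
        "Look Forward Landscape",
        "Look Up Landscape",
    ],
    "Scale Level": [
        "Single-tree Composed Landscape",
        "Detailed Landscape",
        "Overall Landscape",
    ],
    "Forest Hierarchy": [
        "Forest Trail Landscape",
        "Intra-forest Landscape",
    ],
}

# Reverse lookup: scene -> category.
CATEGORY_OF = {scene: cat for cat, scenes in SCHEME.items() for scene in scenes}


def category_of(scene: str) -> str:
    """Return the category a scene belongs to (raises KeyError if unknown)."""
    return CATEGORY_OF[scene]


print(len(CATEGORY_OF))                       # 8
print(category_of("Forest Trail Landscape"))  # Forest Hierarchy
```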


**Figure 3.** Examples of forest landscape classification systems.


### *3.3. MLP-Mixer Model*

We decided to try the MLP-Mixer model proposed by Google in 2021 for the image classification task of scene recognition, in order to perform a more objective analysis of forest landscape preferences. This is an architecture based only on multilayer perceptrons (MLPs), with a simpler structure than models such as the convolutional neural network (CNN). The model first splits the input image into patches, converts each patch into a feature embedding with a per-patch fully connected layer, sends the embeddings through N mixer layers, and finally classifies the result with a fully connected layer. Each mixer layer uses a spatial-mixing (token-mixing) MLP to transfer information between spatial locations (tokens) and a channel-mixing MLP to transfer information between channels [35,53]. These two types of layers are alternately stacked to facilitate the exchange of information along both input dimensions. Each MLP consists of two fully connected layers and one GELU activation. For Top-1 accuracy on the ImageNet validation set, Mixer is reported to achieve slightly better performance than ResNet and basically the same performance as ViT, while its training speed (img/sec/core) is faster than both, demonstrating the potential of a simple structure such as MLP [54]. Thus, on large-scale datasets, MLP-Mixer achieves very promising performance and can effectively help us in forest landscape classification. The specific operation is illustrated in Figures 4 and 5.
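To make the two kinds of mixing concrete, the following is a minimal, dependency-free sketch of a single mixer layer operating on a table of shape (tokens, channels). It is not the model used in this study: the weights are random, biases and the learned scale and shift of layer normalization are omitted, the patch embedding and classification head are left out, and all function names and toy sizes are assumptions made for illustration. A practical implementation would use a deep learning framework.

```python
import math
import random


def gelu(x):
    """GELU activation (tanh approximation)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def layer_norm(a, eps=1e-6):
    """Normalize each row (token) over its channels; no learned scale/shift."""
    out = []
    for row in a:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def mlp(a, w1, w2):
    """Two fully connected layers with a GELU in between (biases omitted)."""
    hidden = [[gelu(v) for v in row] for row in matmul(a, w1)]
    return matmul(hidden, w2)


def add(a, b):
    """Element-wise sum, used for the skip connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mixer_layer(x, tok_w1, tok_w2, ch_w1, ch_w2):
    """One mixer layer; x has shape (tokens, channels)."""
    # Token mixing: transpose so the MLP acts across spatial locations.
    mixed_tokens = transpose(mlp(transpose(layer_norm(x)), tok_w1, tok_w2))
    u = add(x, mixed_tokens)
    # Channel mixing: the MLP acts across the channels of each token.
    return add(u, mlp(layer_norm(u), ch_w1, ch_w2))


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
S, C, DS, DC = 4, 6, 8, 12  # toy sizes: 4 patches, 6 channels, hidden widths
x = rand_matrix(S, C, rng)
y = mixer_layer(
    x,
    rand_matrix(S, DS, rng), rand_matrix(DS, S, rng),
    rand_matrix(C, DC, rng), rand_matrix(DC, C, rng),
)
print(len(y), len(y[0]))  # 4 6
```

The transpose is the only difference between the two halves of the layer: the same two-layer MLP shape is applied once along the token axis and once along the channel axis, and the output keeps the input's (tokens, channels) shape so that layers can be stacked.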


**Figure 4.** MLP-Mixer Flow Chart.


**Figure 5.** MLP-Mixer Model Diagram.

We use the MLP-Mixer model to build a classifier for the forest landscape data, which is a new attempt. There are no ready-to-use models or training sets, so the dataset is divided into 2 parts: 50% of the data are split 8:2 into a training set and a test set, and the other 50% are classified by the trained model to determine forest recreationists' forest landscape preferences. Because an image may belong to two categories at once and the classifier assigns only one label per image, we divided the classes into two groups. The first group comprises Single-tree Composed Landscape, Detailed Landscape, Overall Landscape, Forest Trail Landscape and Intra-forest Landscape, and reached 69% accuracy after repeated training; the second group comprises look down, look forward and look up, and reached 70% accuracy. This indicates that the model can classify forest landscape images fairly stably and initially meets our expectations.
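The partition described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the function name `split_dataset`, the random shuffle, the fixed seed and the rounding of the 8:2 split are all assumptions, since the text gives only the proportions.

```python
import random


def split_dataset(photo_ids, seed=0):
    """Half for model building (split 8:2 into train/test), half for analysis."""
    ids = list(photo_ids)
    random.Random(seed).shuffle(ids)  # assumed: photos are assigned at random
    half = len(ids) // 2
    model_part, analysis_part = ids[:half], ids[half:]
    n_train = int(len(model_part) * 0.8)  # assumed: round the 8:2 split down
    return model_part[:n_train], model_part[n_train:], analysis_part


train, test, analysis = split_dataset(range(15052))
print(len(train), len(test), len(analysis))  # 6020 1506 7526
```

Under these assumptions, the 15,052 photos would yield about 6,020 training images, 1,506 test images and 7,526 images left for the preference analysis.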

### *3.4. Deepsentibank*

In an era of rapid information development, people's impressions of their surroundings increasingly appear on the Internet, and some studies regard this as a mental conceptualization of people's relationship with their surroundings [55]. Studying what people capture and the emotions they feel during their forest landscape experiences can provide a basis and indicators for forest planning and management [28,56]. To explore tourists' perception of forest landscape images, this study uses Deepsentibank, a visual emotion concept classification tool developed by Chen et al. at Columbia University based on deep learning of images (CNN). It works by obtaining text tags from web photos and establishing "adjective + noun" relationships, which serve as a basis for identifying image content, forming emotional keywords, and transforming image information into textual information [57]. Their method uses over 1 million geotagged photos to train the classifier, which comprises a total of 2089 ANPs (Adjective Noun Pairs) built from 231 adjectives and 424 nouns, and thus has a high accuracy. This concept-level sentiment analysis can extract the implicit sentiment from the ontology and gives us a basis for analyzing the emotional attitude of forest landscape images. The content analysis process is shown in Figure 6 (only the first 7 items are listed in the figure). The program parses a forest landscape photo into JSON output containing a ranked list of ANPs, in which the top-ranked pairs have higher weight and greater relevance to the image.

**Figure 6.** The process of photos analyzed by Deepsentibank.


The results of that study show that Deepsentibank performs well, and significantly better in retrieval and annotation than support vector machine models. It has since been used increasingly in studies of tourism destination images, which also indicate that Deepsentibank performs well in sentiment analysis. Since our study mainly aims to understand the emotional state of images, we are interested in adjectives only.
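Keeping only the adjective part of the ranked ANPs could look like the sketch below. The JSON layout (an `anps` list with `anp` and `score` fields, and ANP strings joined by an underscore) is a hypothetical format chosen for illustration, not the documented output of Deepsentibank, and the sample values are invented.

```python
import json

# Hypothetical Deepsentibank-style output for one photo (invented values).
sample_output = json.dumps({
    "anps": [
        {"anp": "sweet_flowers", "score": 0.12},
        {"anp": "classic_view", "score": 0.31},
        {"anp": "colorful_leaves", "score": 0.27},
        {"anp": "cute_animals", "score": 0.05},
    ]
})

def top_adjectives(raw_json, k=10):
    """Return the adjectives of the k highest-scoring ANPs of one photo."""
    entries = json.loads(raw_json)["anps"]
    ranked = sorted(entries, key=lambda e: e["score"], reverse=True)
    # An ANP is assumed to be written as "adjective_noun"; keep the adjective.
    return [e["anp"].split("_", 1)[0] for e in ranked[:k]]

print(top_adjectives(sample_output, k=3))  # ['classic', 'colorful', 'sweet']
```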

### **4. Results**


### *4.1. MLP-Mixer Model Classification Results*


The classification results of the MLP-Mixer model show clear differences in people's preferences among forest landscape categories in both groups. In terms of aesthetic spatial angle, people prefer the look forward (flat) landscape, which accounts for 78.32% of the viewing-angle group, followed by the look down landscape (13.8%) and the look up landscape (7.81%). In terms of forest level, people take more photos on forest trails, which account for 64.99% of the scale-and-level group; the intra-forest landscape also attracts attention, at 21.55%. In terms of scale, attention is lower. The single-tree-composed landscape and the overall landscape receive similar attention: the open overall landscape is slightly more popular, at 5.65%, while the single-tree-composed landscape, which is also relatively easy for people to find and record, accounts for 5.05%. The detailed landscape is the lowest, at only 2.76%. The results are shown in Figure 7.
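The percentages above are per-group shares of predicted labels. A minimal sketch of that tally is given below, assuming each photo receives one label from each of the two classifiers; the toy predictions and the function name `class_shares` are invented for illustration.

```python
from collections import Counter

def class_shares(labels):
    """Percentage of photos assigned to each class within one group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 2) for label, n in counts.items()}

# One (viewing-angle label, scale-and-level label) pair per photo; toy data.
predictions = [
    ("look forward", "forest trail"),
    ("look forward", "intra-forest"),
    ("look down", "forest trail"),
    ("look forward", "forest trail"),
]
angle_shares = class_shares(p[0] for p in predictions)
scale_shares = class_shares(p[1] for p in predictions)
print(angle_shares)  # {'look forward': 75.0, 'look down': 25.0}
```

Each group is normalized separately, so the shares within one group sum to 100% up to rounding.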

### *4.2. Deepsentibank Sentiment Analysis*

### 4.2.1. Emotional High-Frequency Word Analysis

We extracted the top 10 adjectives from the ANPs of each forest landscape photo taken by travelers at the six case sites as the salient sentiment of that photo [39,58,59]. A sentiment lexicon, such as HowNet, was used to compare and analyze the photo adjectives parsed by Deepsentibank. In general, positive words dominate the emotions embedded in the photographs taken by Chinese forest recreationists and are far more frequent than negative words. Because of the large volume of word data, only words that occurred more than 1000 times were extracted. As Table 1 shows, "Classic", "Cute", "Sweet" and "Colorful" were the most frequent emotion words. As Table 2 shows, the percentages of the extracted high-frequency words holding positive, neutral and negative sentiments toward the forest landscape were 74.06%, 11.54% and 14.41%, respectively.
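The filtering and polarity tally described above can be sketched as follows. The word counts and the tiny lexicon are invented, and two choices are assumptions rather than the documented procedure: shares are weighted by word frequency, and words missing from the lexicon are treated as neutral.

```python
from collections import Counter

def sentiment_shares(adjective_counts, lexicon, min_count=1000):
    """Share of positive, neutral and negative sentiment among adjectives
    that occur more than min_count times."""
    totals = Counter()
    for word, n in adjective_counts.items():
        if n > min_count:  # keep high-frequency words only
            # Assumption: words absent from the lexicon count as neutral.
            totals[lexicon.get(word, "neutral")] += n
    grand_total = sum(totals.values())
    return {pol: round(100.0 * n / grand_total, 2) for pol, n in totals.items()}

# Invented counts and lexicon entries, for illustration only.
counts = {"classic": 5000, "cute": 3000, "sad": 1500, "ancient": 1500, "wild": 500}
lexicon = {"classic": "positive", "cute": "positive", "sad": "negative"}
shares = sentiment_shares(counts, lexicon)
print(shares)  # {'positive': 72.73, 'negative': 13.64, 'neutral': 13.64}
```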

**Figure 7.** Deep Learning Result Graph.

**Table 1.** High-frequency adjectives of related images (frequency above one thousand).


**Table 2.** Proportions of emotional tendencies.


### 4.2.2. Emotional Dimension Analysis

We classified the high-frequency words (word frequency over one thousand) extracted by Deepsentibank's analysis by sentiment dimension, based on the Circumplex Model of Affect proposed by Russell in 1980 [60]. The proportions of adjectives in each dimension are shown in Table 3. "Pleased" has the highest proportion, 45.81%, representing the main emotional tendency of the forest recreationists towards the forest landscape; "Excited" has the second highest proportion among Chinese forest recreationists' emotional preferences (14.45%). The proportions of "Happy", "Delighted" and "Relaxed" are also relatively high (10.11%, 6.65% and 5.02%, respectively). It is worth noting that although people's emotional tendency toward the forest landscape is dominated by positive emotions, negative emotions and feelings toward the forest landscape still exist. The photos also revealed emotions such as "Sad" and "Miserable" (5.47% and 2.37%, respectively), while a small number of photos showed the emotions "Bored" and "Annoyed".

**Table 3.** Emotional dimension scale.


### **5. Discussion and Conclusions**

*5.1. Discussion*

5.1.1. Preference Characteristics of Different Forest Landscape Categories

The classification results of the MLP-Mixer model show significant differences among forest landscape scenes and categories (see Figure 7), indicating that people have different preferences for different forest landscapes [61].

The results show that people prefer forest trails, which is consistent with the findings of several studies. Forest trails are important landscape corridors that organize the landscape space, so good trails allow people to enjoy the forest landscape more, and whether people can come into contact with plants and other elements along the way also affects their preferences [62]. Further, most forest recreationists still enter the forest primarily for walking activities and nature experiences [63]. Forest trail photos outnumber those of the related landscape types because people mostly take part in forest activities by walking, and the "2BULU" platform we chose hosts photo records made in this context. This supports the result of You et al. that forest recreationists prefer forest landscapes along forest trails [64]. We also corroborate the findings of Gao et al. in 2021, who found that forest tourism participants were prone to interact with the forest on forest trails, for example, by taking photographs [65]. Our results further corroborate that forest trail coverage has a direct impact on visual quality, increasing the proportion of forested landscapes perceived as safe, and that people prefer forests with multiple trails or distinct hiking trails [66–69]. Providing satisfactory trails is therefore crucial for forest managers and planners: a well-planned forest trail can increase the attractiveness of the forest landscape from an aesthetic point of view [16,70–73]. In-forest landscapes are less preferred than forest trail landscapes, but they still hold an important position in forest landscape evaluation and management and can affect the perceived natural beauty and the beauty of the forest stand [52].

In terms of aesthetic spatial angle, people's landscape preferences within the same space differ by spatial segment [74,75]. The results show that people prefer look forward landscapes (see Figure 7), i.e., forest landscapes that can be seen with only horizontal rotation and without looking up or down. They less often look down at ground cover, fine details or individual partial landscapes, or look up at the forest landscape. The lower preference for look down landscapes indirectly supports the view that ground cover plants affect people's preferences and are unattractive or less attractive in forest landscape preferences [76]. This also demonstrates that visual preferences vary with distance, area and space, which has a significant impact on forest recreationists' preferences [77,78]. The results for the overall landscape and the detailed landscape are in line with the views of Gill and Ryan that people prefer a relatively open forest landscape with good visual access that generates points of interest [79,80]. In terms of individual views, people are attracted to unique or old trees in the forest and stop to take pictures. Helman, for example, demonstrated in 2021 that family forest owners in Michigan's Upper Peninsula prefer single trees, and You et al. confirmed that people are interested in old-growth trees [64,81]. For detailed views, our results do not match those of Nielsen [82]: flowers, roots, mosses, etc., did not gain more preference but were, on the contrary, the least preferred of all landscape types. This suggests that people's objective preference for flowers, fruits, etc., is weaker than their subjective one, and that perceived preferences can differ from actual preferences [83]. The overall level of basic botanical knowledge among Chinese forest recreationists may also play a role, which deserves in-depth study. The preference for detailed, overall and individual landscapes is low, which we believe may be because people pay more attention to the plants in their field of vision during forest tourism, and a plant attracts attention and generates preference only when it is very special or magnificent in its overall view.

### 5.1.2. Emotional Characteristics Contained in Forest Landscape Photographs

Human perception of forest landscape space is a cognitive process that moves from curiosity about the unknown to an understanding of the whole, and it involves an interaction between visual behavior and psychological perception [84]. We addressed the question "Are people really positive in a forest landscape setting?".

The results show (see Table 2) that recreationists feel different emotions during forest visits and that reactions to the surrounding environment usually involve both positive and negative emotions, or two bipolar orthogonal dimensions [60,85,86]; however, participation in forest recreation is still dominated by a positive and pleasant emotional state. The heightened perception of the forest is expressed through vegetation, which has a strong psychological impact on what people see, and this confirms that visual stimulation is useful as a communication channel in the landscape. Emotional words such as "Classic", "Cute", "Sweet" and "Colorful" reflect people's positive emotions in the forest (see Table 1), which are directly related to people's innate pursuit of happiness [87–89]. Some studies have shown that people feel comfortable and peaceful in forests and that the main emotions generated are positive [65]. Our results are also similar to Nielsen's suggestion that people develop emotional and cognitive structures in response to forest landscapes and experience, including "Cosy/uncosy", "Safety", "Serenity", "Care", "Mystery" and "Coherence" [82]. People experience emotions such as "Pleased", "Excited" and "Happy" (see Table 3) because they gain pleasure through the dynamic function of the landscape and a deeper understanding of the ecological state, which evokes a mental response through direct sensory processes and interventions in cognitive structures [90].

The presence of the forest has always been treated as a positive criterion, but this is actually the result of expectations in people's minds. Current approaches, whether spatial analysis based on photographs, objective methods such as eye-tracking technology, or subjective analysis such as questionnaires, all consider the forest landscape to be merely attractive, without tapping into the negative emotions that actually exist behind it [8,19,51,74,91–93]. However, we found that people in fact also have negative emotions towards forest landscapes, such as "Sad", "Miserable" and "Bored" (see Table 3), and the negative sentiment even exceeds the neutral sentiment. This corroborates Foltête's suggestion that forest landscapes cannot always be interpreted in a positive way in terms of preferences, which, in his view, can be influenced by forest characteristics and cover to produce negative or positive perceptions [74]. The study by Deng et al. likewise demonstrated that British tourists experience sleepiness when traveling to Beijing [39]. In the course of our analysis, we also found that photographs of forest landscapes with a high degree of grey clutter are more likely to produce negative, or more strongly negative, emotional states, so the state of the surrounding forest during a visit can strongly influence the participants' preference for the landscape [77].

We can think of preferences as encompassing emotional factors, with a strong correlation between the two [49,94]. Emotion can therefore be considered a predictor of preference, with preference scores being higher when the environment evokes positive and relaxing emotions and lower when it does not [86,95]. When positive emotions are amplified, people's visual attention increases, and it is the visual appeal combined with the emotional response evoked by the content that leads people to develop preferences [90,96,97]. Indirectly, this also shows that the photos people take reflect both positive and negative preferences.

### 5.1.3. Shortcomings and Outlook

Social media data still have certain shortcomings for landscape preference research. Selecting samples from forest recreationists' photos introduces a certain bias [98], and the photos are difficult to link to the context of the participants, so we have no way to explore in depth the reasons for the preferences [99,100]. Many studies use deep learning to classify travel photos, but they mainly focus on broad travel categories, such as architecture, plants, food and people [99,101]. Our initial attempt to discriminate forest landscape types visually by computer still has room for improvement in accuracy and in the feasibility of the dataset [43]. Further, landscape perception is not only visual: it involves multiple senses, including hearing and smell, and even the objective physical environment, such as temperature and negative air ion concentration, affects the visual behavior of participants [102,103]. Thus, the analysis of forest landscape preferences from a photographic perspective alone is still somewhat inadequate. Questionnaires and interviews can be combined with it afterwards to explore the reasons for preference behavior and whether the perceptions produced by different sensory combinations have an impact.

### *5.2. Conclusions*

In this study, we analyzed Chinese forest recreationists' preferences among forest landscape categories, as well as their sentiment, focusing on UGC photos from the "2BULU" website. In a novel approach, we used the MLP-Mixer model and the Deepsentibank deep learning tool to conduct an objective study. In terms of categories, we found that people prefer forest trail landscapes and in-forest landscapes over detailed landscapes (e.g., flowers, fruits, etc.). In terms of aesthetic spatial angle, the flat view is the most favored, followed by the look down view, which focuses on ground cover plants or flowers, while the elevated view is ignored by most people. In terms of people's emotions when participating in forest tourism, both positive and negative emotional states exist, and not all emotions toward forest landscapes are positive, so we need to look at people's preferences dialectically. This study is intended to provide a basis for forest planning and design and to help managers balance resources and needs. China has planned 12 national forest trails since 2017, passing through 20 provinces and covering more than 22,000 km, and is gradually implementing the specific route selection and construction. Our research results are of reference value for the planning and design of nearly 3000 forest parks nationwide and can provide a basis for trail construction, for example: constructing high-quality trails that allow recreationists to get in touch with nature on foot; constructing traceless trails that pass through diverse forest landscape resource areas or maintain the original appearance along the route; planning more landscapes within the forest to highlight features; and adopting more tree species to enrich the colors of the forest in different seasons and enhance people's positive emotions.

**Author Contributions:** Conceptualization, X.Z.; Data curation, X.Z. and X.T.; Funding acquisition, Y.Z. and L.Y.; Methodology, X.Z.; Project administration, Y.Z.; Supervision, Y.Z. and J.W.; Validation, Y.Z., J.W. and X.T.; Visualization, X.Z. and J.W.; Writing—original draft, X.Z.; Writing—review and editing, Y.Z. and L.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Ministry of Science and Technology of the People's Republic of China, grant number 2019YFD1100400; by the Science and Technology Innovation Program of Hunan Province (Research and application of expressway landscape and regional culture integration technology based on Huxiang Culture), grant number 2021SK2050; and by the Central South University of Forestry and Technology, grant number CX202102098.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the author.

**Acknowledgments:** Thanks to the students, especially Ding Ningning and Cui Shuaihu, and to the teachers who gave support during the research process and greatly helped with the writing of the article.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**

	- Neural Networks. *arXiv* **2014**, arXiv:1410.8586.
