Article

A Hybridization of Spatial Modeling and Deep Learning for People’s Visual Perception of Urban Landscapes

by
Mahsa Farahani
1,
Seyed Vahid Razavi-Termeh
2,
Abolghasem Sadeghi-Niaraki
2,* and
Soo-Mi Choi
2
1
Geoinformation Technology, Center of Excellence, Faculty of Geodesy and Geomatics Engineering, K.N. Toosi University of Technology, Tehran 19697, Iran
2
Department of Computer Science & Engineering and Convergence Engineering for Intelligent Drone, XR Research Center, Sejong University, Seoul 05006, Republic of Korea
*
Author to whom correspondence should be addressed.
Sustainability 2023, 15(13), 10403; https://doi.org/10.3390/su151310403
Submission received: 30 April 2023 / Revised: 24 June 2023 / Accepted: 27 June 2023 / Published: 1 July 2023
(This article belongs to the Special Issue Geographical Data and Analysis for Sustainable Urban Studies)

Abstract

The visual qualities of the urban environment influence people's perception of and reaction to their surroundings; hence, the visual quality of the urban environment affects people's mental states and can have detrimental societal effects. Understanding how people perceive the urban environment is therefore necessary. This study used a deep learning-based approach to address the relationship between effective spatial criteria and people's visual perception, as well as to perform spatial modeling and prepare a potential map of people's visual perception in urban environments. Dependent data on people's visual perception of Tehran, Iran, were gathered through a questionnaire covering 663 people, 517 pleasant places, and 146 unpleasant places. The independent data consisted of distances to industrial areas, public transport stations, recreational attractions, primary streets, secondary streets, local passages, billboards, restaurants, shopping malls, dilapidated areas, cemeteries, and religious places, as well as traffic volume, population density, night light, air quality index (AQI), and normalized difference vegetation index (NDVI). The convolutional neural network (CNN) algorithm created the potential map. The potential visual perception map was evaluated using the receiver operating characteristic (ROC) curve and area under the curve (AUC), with AUC estimates of 0.877 and 0.823 for pleasant and unpleasant sights, respectively. The maps obtained with the CNN algorithm showed that the northern, northwestern, central, eastern, and some southern areas of the city have high potential for pleasant sights, whereas the southeastern, some central, and southern regions have high potential for unpleasant sights. The OneR method results demonstrated that distance to local passages, population density, and traffic volume are the most important criteria for both pleasant and unpleasant sights.

1. Introduction

1.1. Visual Perception

The phenomenon of perception is a mental process during which sensory experiences become meaningful and through which a person can recognize the relationships between objects and the meanings of objects. Environmental perception occurs through the five human senses, and the sense of sight plays the most significant role in the visual perception of urban environments [1]. Visual perception of the essential characteristics of the environment (such as colors, forms, and structures) provides vital information, among other inputs, through which humans analyze and comprehend their surroundings [2]. Visual perception is the brain's ability to receive, interpret, and act on visual stimuli, and comprises seven components: visual discrimination, visual memory, visual-spatial relationships, visual form constancy, visual sequential memory, visual figure/ground, and visual closure [3]. Visual perception also interprets the surroundings through photopic vision (day vision), color vision, scotopic vision (night vision), and mesopic vision (twilight vision), using the light in the visible spectrum reflected off various objects in the environment. The sense of sight depends on the transfer of photic stimuli received through the eye and helps visual perception interpret and organize the data. The visual environment is our initial perception of any environment, natural or manufactured [4].

1.2. Importance of People’s Perception of the Urban Visual Environment

People communicate with their environment in order to perceive it, interact with it appropriately, and meet their needs. Therefore, they must know the environment and understand it adequately so that they can receive an appropriate response from it. One of the urban environment's requirements is people's connection with their environment. Using their five senses, people can perceive the environment, its spaces, and its phenomena [5]. Therefore, it is crucial to know people's perception of the visual space of the urban environment through their sense of sight. As an essential reflection of the urban environment, the visual space shows the residents' visual perception of urban landscapes [6], and visual perception creates interaction between people and the city [7]. The visual landscape is an essential part of people's everyday urban experience [8], which leads to the perception, interpretation, and evaluation of the environment by the people in it [9]. People live well in environments they perceive as pleasant, and human well-being is influenced by the physical characteristics of urban space [10]. Landscapes are essential for people as a living environment, for recreation and well-being, and for people's emotional relationships with particular places [11]. In visual terms, urban areas are categorized as pleasant or unpleasant (containing visual pollution) [4]. Thus, we have divided visual perception into pleasant and unpleasant sights. The visual landscape attracts people to pleasant sights and repels them from unpleasant ones through emotional arousal that affects their experience of the urban environment [12]. The benefits of pleasant landscapes include enhancing metropolitan attractions, advancing the economy and tourism, and positively influencing people's psyches.
People comprehend unpleasant visual landscapes (containing visual pollution) through their sense of sight and are influenced by them; therefore, visual pollution is defined as anything in the environment that does not evoke pleasant feelings and has contrastive effects (creating unpleasant visual feelings) [13]. Unpleasant visual landscapes are the ones most people consider as unattractive, ugly, disturbing, and annoying [14]. The majority of the urban population wants comfort outside of their houses (in public places) to satisfy their physical and emotional needs, and this can be accomplished through the design of pleasing visuals [15]. Consequently, it is essential to identify unpleasant (having damaging effects on people) and pleasant (having constructive impacts on people) metropolitan places and to attempt to boost pleasant sights while eliminating unpleasant ones.

1.3. Literature Survey

The research reviewed in this article was divided into two categories: studies on visual sense perception and the factors affecting it, and research regarding machine learning and deep learning, presented in Table 1 and Table 2, respectively.

1.4. Spatial Modeling

Places are defined as spatial locations that have acquired meaning through human experience and are a crucial part of everyday life. Therefore, a sense of place relates directly to people's perceptions and their nebulous definitions [33]. The sense of sight is directly related to spatial perception, through which one can deeply perceive one's surroundings [34]. Spatial analysis helps study and perceive the characteristics of a place and the relationships between them using analytic techniques; by gathering relevant information, it allows people to answer questions and solve complicated space-related problems.
The initial step of analyzing any phenomenon in GIS is spatial modeling, as it enables the spatial display of the natural world [5]. The use of the data gathered through GIS to extract hidden information (in patterns, behaviors, and assumptions) is limited due to the need for powerful analytic methods. However, analytical methods based on deep learning algorithms opened up new frameworks for spatial utilization [35]. Deep learning leads to new problem-solving approaches in many domains (such as data analysis); in other words, deep learning alters data through a cascade of layers and facilitates data analysis [36]. Deep learning is one of the branches of machine learning, and it aims to create a computational model with multiple processing layers to support high-level data abstraction. The human nervous system inspires deep learning concepts, so most deep learning architectures are designed using the ANN framework [37]. The word deep refers to a chain of layers through which data display transforms from one level to another [38]. The difference between deep learning and other traditional machine learning techniques is that deep learning emphasizes automatic learning with large datasets. The advantages of using this method are extracting distinguished traits in the modeling process [39], learning the optimal features directly through the data (without human interference and guidance), and understanding hidden relationships in data automatically [40]. As a result, a deep learning convolutional neural network (CNN) algorithm was employed for spatial modeling of people’s visual perception in the urban environment. The CNN algorithm has proven to be of acceptable accuracy in several urban studies, such as short-term traffic flow prediction [41], predicting and understanding urban perception [42], mapping urban trees within cadastral parcels [43], and air quality [44].

1.5. Research Objectives and Research Questions

Previous studies have frequently examined the sense of sight and its perception (such as visual landscapes, visual pollution, and the factors impacting them) and the psychological and physical repercussions of good and bad visual landscapes. However, previous research lacks spatial modeling and a potential map of people's visual perception prepared with a deep learning algorithm; the present study addresses this gap. This study aimed to find and extract effective spatial criteria for people's perception of the sense of sight, perform spatial modeling, and develop a potential map of two modes of visual perception (pleasant and unpleasant) in Tehran, Iran, using the CNN algorithm. Then, through the potential maps, pleasant and unpleasant landscapes for each perception state of people's sense of sight were identified.
According to the objectives of the research, the main questions of this research are as follows:
  • How strongly are the criteria affecting the sense of sight correlated with one another?
  • What is the importance of the criteria affecting each pleasant and unpleasant sight?
  • How accurate is the CNN algorithm in spatial modeling of the two states of people’s sense of sight?
  • What is the status of each region of Tehran in terms of the perception of each of the states of pleasant and unpleasant sights?

1.6. Research Innovation

Past research has often investigated the visual perception of street images using deep learning [6,26,28], the psychological and physical effects of visual landscapes [45], visual pollution [14,22,23,46], visual preferences [17,47], and factors affecting the visual sense [15]. That research paid no attention to the relationship between effective spatial criteria and people's sense of sight in the urban environment, to the simultaneous consideration of two states of sight (pleasant and unpleasant), or to the modeling and preparation of a potential map of people's sense of sight using the CNN algorithm in the urban environment. The present research addresses these issues. Its innovation is to conduct spatial modeling and prepare a potential map of two modes of people's visual perception, pleasant and unpleasant, in an urban environment using the CNN algorithm. To achieve this, people's sense of sight was collected through a questionnaire, and the spatial criteria affecting each state of people's sense of sight were identified. Finally, spatial modeling and the preparation of a map of the potential of people's sense of sight in the urban environment were performed.

1.7. Research Structure

The first and second sections of this paper present the introduction and the study area. The third section presents the methodology, with subsections on database construction and the spatial criteria affecting people's visual perception. The fourth section presents the research methods, including Pearson's correlation coefficient, the convolutional neural network (CNN), feature importance (OneR), and model accuracy evaluation. Section 5 presents the results of these methods. Section 6, the discussion, includes subsections on the assessment of influential factors, the assessment of modeling, landscape policies, and limitations and future directions. The final section presents the conclusions of the research.

2. Study Area

Tehran, the capital of Tehran Province and of Iran, is located between 35°36′ and 35°44′ north latitude and 51°17′ and 51°33′ east longitude. Its population is approximately 8,693,706 (almost 20% of Iran's population). Tehran is divided into 22 regions, 134 districts, and 370 neighborhoods. Tehran may be crowded and chaotic, but its mountainous setting, waters, trees, and mix of new and old landscapes make it beautiful. Tehran is one of Iran's main tourist attractions, containing beautiful castles and museums. The Azadi Tower and Square, the Tabiat Bridge, and the Golestan Palace are among the city's most prominent attractions. Tehran attracted 64.1 million tourists in 2016 and became one of the Middle East's leading tourism centers. Its visual pollution factors include extensive advertising, garbage and construction waste in the streets, poorly built buildings, and broken and garbled signs. Figure 1 shows the study area together with the modeling sample locations.

3. Methodology

Spatial modeling of people’s visual perception in Tehran was completed in 4 stages (Figure 2), as follows:
  • The spatial database containing dependent and independent data was constructed in the first stage. The dependent data were gathered through a questionnaire (Appendix A) and consisted of the visual perception of 663 people. The independent data consisted of distances to industrial areas, public transport stations, recreational attractions, primary streets, secondary streets, local passages, billboards, restaurants, shopping malls, dilapidated areas, cemeteries, and religious places, as well as traffic volume, population density, night light, air quality index (AQI), and normalized difference vegetation index (NDVI). For each of the visual sense states (pleasant sight and unpleasant sight), random points (points with value 0) were generated separately, matching the number of places selected by users in the questionnaire (points with value 1). For each state, all the points and their values were prepared as dependent data, along with the values of the relevant effective spatial criteria as independent data, forming the spatial database.
  • In the second stage, the spatial correlation between the criteria was determined using the Pearson correlation coefficient method, the relevance of the criteria was determined using the OneR approach, and spatial modeling was performed using the deep learning model. The spatial database prepared in the previous step was used for these methods. In modeling, 70% of the data in the database was used for training and 30% for validation.
  • After the CNN learned from the training data in the database, this learning was extended to all parts of Tehran city (with a pixel size of 30 m). In the third step, the values estimated for each point of Tehran by the CNN algorithm were interpolated with the kriging method to produce a raster map, yielding the potential map of people's visual perception in Tehran for the two states of pleasant sight and unpleasant sight (a code sketch of this step is given after this overview).
  • Finally, the mean square error (MSE), accuracy, receiver operating characteristic (ROC) curve, and area under the curve (AUC) were employed to evaluate the modeling and the potential map of people's visual perception. The ROC curve and AUC were prepared using 30% of the spatial database data.
This research identified practical criteria for the sense of sight from previous studies, and together with the questionnaire data, a database was obtained from which a potential map of people's sense of sight can be created using the CNN algorithm; this has not been investigated in previous research. The spatial criteria used in this research have significant effects on people's sense of sight, and using them together with the questionnaire data enabled the model's training and its subsequent generalization to the whole city of Tehran. The CNN algorithm was chosen for its flexibility, higher predictive accuracy (especially for big data), superiority over machine learning models, and incremental approach to learning high-level features from datasets [48].
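The following is a minimal sketch of the generalization step described above, under stated assumptions: the trained CNN is loaded from a hypothetical file, the 17 criterion rasters are assumed to be stacked in a NumPy array with 30 m pixels, and the kriging interpolation used in the study is not shown.

```python
import numpy as np
from tensorflow.keras.models import load_model

# Hypothetical inputs: a CNN trained on the questionnaire database and a stack of
# the 17 criterion rasters of Tehran (rows x cols x 17, 30 m pixels).
model = load_model("cnn_pleasant_sight.h5")
criteria_stack = np.load("tehran_criteria_30m.npy")
rows, cols, n_criteria = criteria_stack.shape

# Flatten the pixels into a feature matrix, add the channel axis expected by a 1D CNN,
# and predict a pleasant-sight potential value for every pixel of the city.
pixels = criteria_stack.reshape(-1, n_criteria, 1).astype("float32")
potential = model.predict(pixels, batch_size=4096).reshape(rows, cols)

np.save("pleasant_sight_potential.npy", potential)  # raster to be mapped in GIS
```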

3.1. Database Construction

People’s Visual Perception Data

Questionnaires were used to assess people’s visual perception in Tehran (online, in person, or through Instagram). This questionnaire was developed in January 2021 and contained three questions:
  • Identifying the place in Tehran on a map (it was impossible to mark the location in the written and Instagram questionnaires; thus, the participants were asked to give an accurate address, and the geographic coordinates were extracted from Google Maps).
  • Identifying the type of the selected place (home address, work address, or another chosen location).
  • Choosing the type of sense of sight perceived by users in the selected location in a question containing two options (pleasant and unpleasant).
The design of this questionnaire reflects the fact that urban areas are visually classified into two categories, pleasant and unpleasant (with visual pollution) [4]. The questionnaire was collected in three ways (online, in person, or through Instagram). In the in-person questionnaire, young and healthy people were asked to respond (to fully ensure the accuracy of their sense of sight). In-person questionnaires were collected in different areas of the city (center, north, south, east, and west) and in public places to ensure that data related to people's sense of sight were collected throughout the city. In the other questionnaires (online and through Instagram), respondents were asked at the beginning to answer only if they had a healthy sense of sight and were present in Tehran. All respondents in all three types of questionnaires were people who were thoroughly familiar with the place in question. To ensure a varied statistical population and to avoid repetition, each person with a specific IP was allowed to answer the online and Instagram questionnaires only once. The data gathered from this questionnaire consisted of the sense of sight of 663 people, with 517 pleasant sights and 146 unpleasant ones (regarding people's perception of their surroundings). This information was categorized as dependent data. For each visual sense state, pleasant and unpleasant, the same number of random points (points with value 0) as places chosen by users (points with value 1) was considered separately. For each of the states of pleasant and unpleasant sight, all the points (questionnaire and random) and their values (dependent data), along with the values of the relevant effective criteria (independent data), were prepared in an Excel file. This Excel file was prepared to enter the modeling process (in Python). In this file, 70% of the data was considered for training and 30% for validation. The locations of these people are presented in Figure 1 (separated by the perceived sense of sight). The questionnaire provided supplementary information through images to capture the perceived sense of sight in the chosen places (extracted from expert opinions and popular pictures in the media). Figure 3 presents the places most frequently selected in the final questionnaire.
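As an illustration only, the spatial database and the 70/30 split could look like the following sketch; the file name, column names, and the use of scikit-learn are assumptions, since the study stored the data in an Excel file before modeling in Python.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical table: one row per location, the 17 criterion values as columns,
# and a label of 1 (place chosen in the questionnaire) or 0 (random point).
data = pd.read_excel("pleasant_sight_database.xlsx")

criteria_cols = [c for c in data.columns if c != "label"]
X = data[criteria_cols].values
y = data["label"].values

# 70% of the records for training and 30% for validation, as in the study.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)
print(X_train.shape, X_val.shape)
```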

3.2. Spatial Criteria Affecting People’s Visual Perception

The criteria that affect people's visual perception (pleasant and unpleasant) were determined based on previous research and expert opinions [4,43,49,50]. These criteria include distances to industrial areas, public transport stations, recreational attractions, primary streets, secondary streets, local passages, billboards, restaurants, shopping malls, dilapidated areas, cemeteries, and religious places, as well as traffic volume, population density, night light, AQI, and NDVI. These criteria were produced in ArcGIS 10.8 software with 30 × 30 m pixels. Table 3 shows the independent variables along with the data sources.
  • Night light
Lighting is one of the main factors in improving city landscapes [51]. Using lights in space improves its quality, aesthetics, attraction, liveliness, delight, and visual ease [46]. Nightlights can either enhance or diminish visual landscapes (Figure 4a).
  • Distance to recreational attractions
Recreational centers are manufactured structures with geographical and recreational substructures built for recreational purposes. Recreational attractions include tourist sites and places with cultural, recreational, and social activities. The design of recreational centers’ architecture is intertwined with culture and art and creates a visual aesthetic through the utilization of color and light [52]. Refurbishment and augmentation of monuments and the addition of cultural and art amenities are examples of ways to increase a city’s quality, aesthetics, attractiveness, vitality, and visual ease [49] (Figure 4b).
  • Distance to industrial areas
Due to landscape transformation and fragmentation, visual field aggressiveness and homogeneity, light pollution, and chemical pollution (visual perception of smoke emissions and oil spills), industrial facilities (companies and technological facilities) have adverse effects on visual perception of the surroundings and cause visual pollution [46] (Figure 4c).
  • Distance to public transport stations
Sources of visual pollution include facilities such as transportation infrastructure and advertisement billboards, which are not homogeneous with the landscape, block the view of natural objects and scenes, and make them look off-balance and violated [46] (Figure 4d).
  • Distance to streets
The senseless diversity of color, shape, and light, the combination of incoherent visual factors, and the hideous and unappealing manufactured environment are all causes of visual pollution. These causes are seen in public environments like streets and squares [49]. Transportation substructures such as highways can also cause visual disturbance [46] (Figure 4e,f).
  • Distance to local passages
Local passages are spaces within cities that create balance and where people can spend their free time playing and resting; therefore, many people are attracted to them. Improving and enhancing the qualities of such places for interaction and walking can increase the aesthetics and appeal of a city [49] (Figure 4g).
  • Traffic volume
Traffic makes people feel unpleasant and annoyed and disturbs them; therefore, it is one of the factors of visual pollution [50] (Figure 4h).
  • Distance to billboards
Due to factors such as severe density, inappropriate position, inappropriate size, inappropriate heights, inappropriate colors, and defective structure, billboards detract from the beauty of the urban environment [4]. The visual pollution of billboards leads to disturbance for people and reduces their concentration (Figure 4i).
  • Distance to restaurants
The architecture of most restaurants is splendid, and light and color are employed in them to attract people. Architecture, proper lighting, and color enhance visual qualities, and places containing these traits have the same effects [49] (Figure 4j).
  • Distance to shopping malls
Shopping malls are among the locations in the urban space that have visual beauty and appeal; they provide visual comfort and improve the quality of the urban environment through lighting and the use of bright and pleasant colors [49] (Figure 4k).
  • Distance to dilapidated areas
Dilapidated buildings with poor architecture can cause pollution and destructive visual perception [14] and decrease pleasant landscapes [4]. Improving, organizing, and enhancing abandoned places can increase the aesthetic and appeal of city environments [49] (Figure 4l).
  • Population density
The population density indirectly affects the increase of vehicles [53], traffic, damaging and altering green areas [54], and increasing garbage [55] (Figure 4m).
  • Distance to cemeteries
Cemeteries have destructive effects on aesthetics and create a stressful landscape for people [56] (Figure 4n).
  • Distance to religious places
In the history of Iranian architecture, religious places like mosques were always the primary sources of beauty, art, and Islamic architecture. The primary sources of these arts were simple, whereas the contemporary ones are ornamented [57]. Adding color to landscapes reflects art, causing excitement and esthetics [58] (Figure 4o).
  • NDVI
Parks and green spaces are attractive, beautiful, and pleasant places in cities; they significantly impact the enhancement of visual quality and ease [15,49]. NDVI is the most standard index of vegetation [59]. Equation (1) is used to estimate the NDVI.
NDVI = \frac{NIR - Red}{NIR + Red},
In Equation (1), NIR stands for the near-infrared band and Red stands for the red band. NDVI values range from −1 to +1, and higher NDVI indicates greater vegetation density [59] (Figure 4p). (A short code illustration of this computation is given after the list of criteria.)
  • AQI
Air pollutant particles can directly influence people's visual perception, as air pollution reduces the color contrast of landscapes [60]. The primary visual effects of air pollution include smoke emitted from polluting sources and changes in the visual experience, color, and contrast of the landscape. Visual perception of air pollution is directly related to the density of suspended particles and dust [61]. The AQI is calculated from all air pollutants and ranges between 0 and 500. Equation (2) is used to calculate the air quality index of each station for a pollutant using raw data.
AQI_p = \frac{I_{Hi} - I_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_p - BP_{Lo}\right) + I_{Lo},
In Equation (2), AQI_p is the AQI for pollutant P, C_p is the measured concentration of pollutant P, BP_{Hi} is the breakpoint greater than or equal to C_p, BP_{Lo} is the breakpoint less than or equal to C_p, I_{Hi} is the AQI value corresponding to BP_{Hi}, and I_{Lo} is the AQI value corresponding to BP_{Lo}. The index of each station is computed for every pollutant, and the highest value is reported as the station's AQI (Figure 4q).
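As a small illustration of Equation (1), the NDVI can be computed with NumPy from made-up reflectance values for the near-infrared and red bands:

```python
import numpy as np

# Hypothetical reflectance values for the near-infrared (NIR) and red bands.
nir = np.array([0.45, 0.30, 0.55])
red = np.array([0.10, 0.20, 0.05])

# Equation (1): values close to +1 indicate dense vegetation, values near or
# below 0 indicate bare surfaces or water.
ndvi = (nir - red) / (nir + red)
print(ndvi)
```

Equation (2) can likewise be checked with a short function; the breakpoint and index values below are standard US EPA PM2.5 numbers used only for illustration and are not taken from the paper:

```python
def aqi_for_pollutant(c_p, bp_lo, bp_hi, i_lo, i_hi):
    """Equation (2): linear interpolation of the index between two breakpoints."""
    return (i_hi - i_lo) / (bp_hi - bp_lo) * (c_p - bp_lo) + i_lo

# Example: a PM2.5 concentration of 40 ug/m3 lies between the breakpoints
# 35.5-55.4, which map to index values 101-150, giving an AQI of about 112.
print(aqi_for_pollutant(c_p=40.0, bp_lo=35.5, bp_hi=55.4, i_lo=101, i_hi=150))
```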

4. Methods

4.1. Pearson Correlation Coefficient

The Pearson correlation coefficient is used to depict the linear correlation between two variables [62]; it is also one of the main methods of measuring the similarity of multiple data variables. The value of this coefficient lies between −1 and 1. A coefficient of 1 indicates a completely positive correlation and −1 a completely negative one; in other words, as the absolute value of the coefficient increases, the correlation strengthens, and vice versa [63]. Equation (3) is used to calculate the Pearson correlation coefficient:
r = \frac{\sum \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum \left(x_i - \bar{x}\right)^2 \sum \left(y_i - \bar{y}\right)^2}},
In this equation, r is the correlation coefficient, x_i and y_i are the values of variables x and y in one sample, and \bar{x} and \bar{y} are the mean values of variables x and y, respectively. Evans' (1996) guide labels the correlation strength for the absolute value of r as very weak (0–0.19), weak (0.2–0.39), moderate (0.4–0.59), strong (0.6–0.79), and very strong (0.8–1).
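Equation (3) is the standard sample correlation; a minimal pandas sketch for computing it between all pairs of criteria is shown below, with a hypothetical file name and the |r| ≥ 0.6 threshold taken from Evans' strength labels.

```python
import pandas as pd

# Hypothetical table: one row per sample point, one column per spatial criterion.
criteria = pd.read_csv("criteria_at_sample_points.csv")

# Pairwise Pearson correlation coefficients (Equation (3)) between the criteria.
corr = criteria.corr(method="pearson")

# Report pairs whose correlation is strong or very strong (|r| >= 0.6);
# each pair appears twice because the matrix is symmetric.
pairs = corr.abs().stack()
print(pairs[(pairs >= 0.6) & (pairs < 1.0)])
```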

4.2. Convolutional Neural Network (CNN)

CNN is one of the primary methods of deep learning inspired by biological vision mechanisms [64]. CNN, one of the deep learning techniques, is mainly used for automatically identifying significant traits in studies (without human supervision) [65]. CNN is employed in numerous fields and sciences, like earth science, to classify and predict information; the primary function of this technique is to enter data in the form of photos and photo interpretations [66]. Entering data in images eliminates many parameters and accelerates the process [67]. Complex spatial features can easily be gathered through CNN and the convolution of multiple convolutional kernels (high capacity for extracting spatial features) [68]. The benefits of utilizing CNN are equivalent representations, sparse interactions, and parameter sharing [65]. CNN is a classification of deep learning neural nets and consists of one entry layer, one or a few convolutional, pooling, fully connected layers, and one output layer [69]. Figure 5 presents a CNN containing four layers.
  • Convolutional Layer: The convolutional layer is essential in CNN as a local feature extractor. The convolutional layer employs three fundamental strategies to increase network operation and eliminate numerous free variables: receptive fields, sparse connectivity, and parameter sharing. The convolutional layer is measured based on Equation (4):
    x_q^l = f\left(\sum_{p \in M_q} x_p^{l-1} * k_{pq}^l + b_q^l\right),
    where x_q^l is the q-th feature map in the l-th layer, k_{pq}^l is the trainable convolution kernel in the l-th layer, b_q^l is the bias matrix, M_q is the selection of input maps, and f is the activation function.
  • Pooling layer: This layer produces down-sampled versions of input maps through the pooling operation. The input map is initially divided into non-overlapping rectangular regions (of u∗u size), then the new map is measured by summarizing the maximum values or average of the rectangular areas. Equation (5) is used to measure pooling layers.
    x_q^l = down\left(x_q^{l-1}\right),
    where down(·) is a sub-sampling function.
  • Fully-connected layer: This is a standard neural network layer that captures the non-linear relationships between the input and output using activation functions and biases. These layers are placed after the convolutional and sub-sampling layers and transform two-dimensional maps into one-dimensional vectors. Data are processed in these layers according to Equation (6):
    x^l = f\left(w^l x^{l-1} + b^l\right),
    where w^l and b^l stand for the weight matrix and bias matrix of the fully-connected layer, respectively (both are trainable parameters) [68].
The parameters of each layer of the neural network affect the training speed and prediction ability of the whole CNN, so one of the critical parts of CNN training is the design of optimal parameters. Using optimal parameters saves calculation time and increases model performance [70].
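The paper does not report the exact network configuration, so the following Keras/TensorFlow sketch is only one plausible arrangement of the layers described above, applied to the 17 criterion values of each sample point; the filter count, kernel size, and dense-layer width are arbitrary assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

n_criteria = 17  # the spatial criteria used as independent variables

model = models.Sequential([
    layers.Input(shape=(n_criteria, 1)),
    layers.Conv1D(32, kernel_size=3, activation="relu"),   # convolutional layer, Equation (4)
    layers.MaxPooling1D(pool_size=2),                       # pooling layer, Equation (5)
    layers.Flatten(),
    layers.Dense(64, activation="relu"),                    # fully connected layer, Equation (6)
    layers.Dense(1, activation="sigmoid"),                  # pleasant (1) vs. random point (0)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training would then use the 70/30 split described earlier, e.g.:
# model.fit(X_train[..., None], y_train, validation_data=(X_val[..., None], y_val), epochs=100)
```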

4.3. Feature Importance

The OneR (One Rule) method was used in this research to evaluate the importance of the criteria that affect the two states of sight, pleasant and unpleasant. This method is based on a one-level decision tree that consists of a set of rules for the data, all evaluated on a single attribute [71]. The OneR algorithm creates a set of association rules and selects the one with the lowest error rate. If an attribute is numeric, the algorithm uses a simple technique that divides its values into multiple intervals [72].
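A compact illustration of the OneR idea (not the exact implementation used in the study): each numeric criterion is discretized into intervals, the majority class of each interval forms the rule, and the criterion whose single rule classifies the labels best is considered most important.

```python
import numpy as np

def one_r_accuracy(feature, labels, n_bins=10):
    """Score a single criterion: bin its values, predict the majority class per bin,
    and return the fraction of samples classified correctly by that one rule."""
    feature, labels = np.asarray(feature, float), np.asarray(labels)
    edges = np.unique(np.quantile(feature, np.linspace(0, 1, n_bins + 1)))[1:-1]
    bin_idx = np.digitize(feature, edges)
    correct = 0
    for b in np.unique(bin_idx):
        in_bin = labels[bin_idx == b]
        correct += max(np.sum(in_bin == 0), np.sum(in_bin == 1))
    return correct / labels.size

# Hypothetical ranking of the criteria columns of X against the labels y:
# importance = {name: one_r_accuracy(X[:, j], y) for j, name in enumerate(criteria_cols)}
```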

4.4. Evaluation of Model Accuracy

Access to various algorithms with different success rates has created a need to use evaluation criteria [73]. MSE, accuracy, ROC, and AUC were utilized to evaluate the modeling findings and the potential map of sense of sight.
Indicators such as MSE and accuracy were used to evaluate the performance of the CNN algorithm in the potential mapping of people's visual perception. MSE is computed with Equation (7) from the difference between the predicted values of the model and the actual observed ones.
MSE = \frac{1}{N}\sum_{i=1}^{N} \left(p_i - \bar{p}_i\right)^2,
In this equation, N is the sample size, p_i is the actual value, and \bar{p}_i is the predicted value.
Equation (8) was used to measure the accuracy.
Accuracy = \frac{TP + TN}{TP + FN + TN + FP},
In this equation, TN stands for true negative (accurately predicting the data to be negative), TP for true positive (accurately predicting the data to be positive), FN for false negative (falsely predicting the data to be negative), and FP for false positive (falsely predicting the data to be positive).
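Equations (7) and (8) can be checked with a few lines of NumPy; the values below are invented purely to show the computation.

```python
import numpy as np

def mse(actual, predicted):
    """Equation (7): mean of the squared differences between actual and predicted values."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.mean((actual - predicted) ** 2)

def accuracy(actual, predicted_label):
    """Equation (8): (TP + TN) / (TP + FN + TN + FP), the share of correct labels."""
    return np.mean(np.asarray(actual) == np.asarray(predicted_label))

y_true = np.array([1, 0, 1, 1, 0])            # hypothetical observed labels
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1])  # hypothetical CNN output probabilities
print(mse(y_true, y_prob), accuracy(y_true, (y_prob >= 0.5).astype(int)))
```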
The ROC curve and the AUC were used to determine the accuracy of the potential maps of pleasant and unpleasant sights produced by the CNN algorithm in the urban environment. The ROC curve is derived from a confusion matrix comprising the four categories TN, TP, FN, and FP [74]. The performance measures of the model result from the confusion matrix and include the true positive rate (TPR; samples correctly labeled as positive) and the false positive rate (FPR; samples falsely labeled as positive). TPR and FPR are plotted on the y and x axes, respectively [75]. The x and y axes of the ROC curve are computed through Equations (9) and (10).
x = 1 - specificity = 1 - \frac{TN}{FP + TN},
y = sensitivity = \frac{TP}{FN + TP},
The value of the AUC is estimated between 0.5 to 1 and is classified as weak (0.5–0.6), moderate (0.6–0.7), good (0.7–0.8), very good (0.8–0.9), and excellent (0.9–1) [5]. The higher the value of AUC, the better the quality of the model [76].
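A short scikit-learn sketch of how the ROC curve and AUC of Equations (9) and (10) could be computed from the 30% validation set; the labels and scores here are placeholders.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical validation labels and predicted potentials from the CNN.
y_val = np.array([1, 1, 0, 1, 0, 0, 1, 0])
y_score = np.array([0.92, 0.81, 0.35, 0.66, 0.48, 0.12, 0.58, 0.40])

# fpr = 1 - specificity (Equation (9)), tpr = sensitivity (Equation (10)).
fpr, tpr, thresholds = roc_curve(y_val, y_score)
print("AUC =", auc(fpr, tpr))  # 0.8-0.9 would be rated "very good"
```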

5. Results

5.1. Pearson Correlation Coefficient Results

The Pearson correlations of the criteria that affect the sense of sight are presented in Figure 6. Distances to restaurants and billboards (0.67), distances to shopping centers and billboards (0.63), and distances to recreational attractions and billboards (0.53) had the highest correlations among the criteria.
Measuring the Pearson correlation coefficient before modeling helps identify criteria with strong correlations, because a strong correlation between criteria can be misleading [5]. All criteria showed very weak, weak, or moderate correlations, except for the distances to restaurants and billboards, with a correlation of 0.67, and the distances to shopping centers and billboards, with a correlation of 0.63. Consequently, there is no significant association between the criteria, and each criterion can be used in modeling.

5.2. Feature Importance Results

The significance of the criteria for two modes of visual perception (pleasant and unpleasant) as determined by the OneR algorithm is presented in Figure 7. The most significant criteria in the state of pleasant sight were distance to local passages (77.82), population density (77.67), and traffic volume (77.22). The most significant ones in the unpleasant sight were population density (78.13), distance to local passages (77.82), and traffic volume (77.37). The most important criteria in both sights were distance to local passages, population density, and traffic volume.

5.3. Modeling Results

A spatial dataset was employed to implement the CNN algorithm in the two modes of pleasant and unpleasant sight. The dependent data consisted of two states of people's visual perception in urban environments, gathered through questionnaires from 663 people; these questionnaires contained 517 pleasant sights and 146 unpleasant ones. In each mode, the places selected in the questionnaire were valued at 1 and the random points at 0. The modeling input was extracted from the values of each variable in the maps of effective criteria; 70% of the data was used for training and 30% for validation. The TensorFlow library of the Python programming language was used to implement the CNN algorithm.
Table 4 presents the MSE and accuracy results for evaluating the CNN algorithm. For the pleasant sight, the MSE and accuracy were 0.02 and 0.97 in the training state and 0.22 and 0.75 in the test state, respectively. For the unpleasant sight, the MSE and accuracy were 0.05 and 0.93 in the training state and 0.20 and 0.74 in the test state, respectively. In the training mode, the pleasant sight model was more accurate than the unpleasant sight model, but in the test mode the unpleasant sight model had the lower error. Figure 8 shows the error rate of the training and validation datasets for pleasant and unpleasant sights. The findings demonstrated that the CNN algorithm was effective at spatially modeling people's visual perception.
Other research has also applied the CNN algorithm: in spatial prediction of groundwater potential mapping, CNN achieved an accuracy of about 0.844 for training data and 0.843 for testing data [67], and in data-driven predictive modeling of mineral prospectivity, CNN achieved a classification accuracy of 92.38% [69]. Comparing these studies with the current research confirms the good accuracy of the CNN algorithm.
The potential map of each state of people’s visual perception, including pleasant and unpleasant sights, was prepared by generalizing the output of the CNN algorithm in ArcGIS 10.8 software to the whole city of Tehran. The potential map of the people’s visual perception (pleasant and unpleasant) is shown in Figure 9. In the case of pleasant sight, the northern areas, a part of the northwest, some central and eastern parts of the city, and a few parts of the south of the city had high potential. Unpleasant sights often had a high potential in parts of the southwest edge, parts of the center, and a small part of the city’s south.
The high potential for pleasant visual perception in the northern, central, and eastern regions can be attributed to factors such as fewer industrial areas, more recreational attractions, and suitable NDVI (more parks and green spaces) in the northern and central parts, and relatively low AQI in the east. The city's center also has a high density of religious places and appropriate night light, which can help create a pleasant sight perception. Low night light, low NDVI, the absence of recreational attractions and shopping centers, and the presence of industrial areas in the southwest can explain the high potential for unpleasant sight perception in these areas.

5.4. Validation

Thirty percent of the data related to the sense of sight (of the spatial dataset) was used for evaluation. Figure 10 presents the results of both the unpleasant and pleasant states using ROC curves and AUC values. The AUC was 0.844 for the pleasant sight and 0.823 for the unpleasant sight. The evaluation shows that the CNN algorithm was accurate in both states but more accurate for the pleasant sight.
The higher AUC for pleasant than unpleasant sight perception could be due to the larger amount of pleasant-sight data in the spatial database: of the visual perception data received from 663 people, 517 places were related to pleasant sight and only 146 to unpleasant sight.
To match the potential maps of the two visual perception states with reality, regions with higher potential than other areas were identified in both states. The places with higher potential included Sorkheh Hesar National Park, National Botanical Garden, and Golestan Palace for the pleasant sight; South Bus Terminal, Shadabad Iron Market, and Azadegan Expressway for the unpleasant sight. The mentioned places are shown separately for each of the perceptions (pleasant sight and unpleasant sight) on the map of Tehran. According to the potential maps of two different vision perception states, these famous places in Tehran corresponded with the results. Figure 11 and Figure 12 depict the areas with higher potentials for pleasant sight and unpleasant sight, respectively.

6. Discussion

6.1. Assessment of Effective Factors

The Pearson correlation results revealed that the distances to restaurants, shopping malls, and recreational attractions had the strongest correlations with the distance to billboards among all the parameters that affect the sense of sight. Public advertisements (such as billboards) are the most influential kind of advertising; these displays are typically placed along significant thoroughfares and boulevards in cities to attract the attention of passersby [77]. The results showed that the distance to advertising billboards had a higher correlation with the distance to primary streets than with the distance to secondary streets; previous research pointed out that advertising billboards are placed near primary streets because of their size and color [78]. Urban advertising is one of the most important principles of advertising and one of the constituent elements of the urban landscape in human habitation; it is part of the framework of cities, and the most crucial part of commercial activity is dedicated to it [79]. Formal public spaces may be a place to rest, socialize, or play while creating a visual pause in the flow of streets in urban areas [80]. Restaurants, shopping centers, and recreational attractions are public spaces where people are present and which bring a visual pause to the flow of the streets. Therefore, advertising billboards are built near these centers to be seen better, so the correlation of billboard locations with these areas increases.
According to the OneR method, distance to local passages, population density, and traffic volume were among the most important criteria affecting pleasant and unpleasant sights. In addition to these criteria, the night light criterion was also of great importance for the pleasant sight state, and the distance to primary streets for the unpleasant sight state. Local passages are mostly far from trades such as mechanics and blacksmithing, public transportation stations, and industrial areas, which create an unpleasant environment; distance from such facilities can create visually pleasant environments. People usually walk in neighborhood streets and public open spaces [81]; while walking, pedestrians' eyes gather visual information [82]. Affecting people's visual perception is among the impacts of traffic [83], and the chaos of car traffic causes visual dissatisfaction [84]. A study by [85] emphasized that people's experience of road traffic, in general, significantly impacts their aesthetic experience in cities. The continuous movement and speed of road traffic in urban streets turn people's experience of moving in the city into continuous sensory stimulation and cognitive activity and shape their aesthetic experience of the city. The high density of road traffic causes congestion and slow movement of cars, making people's aesthetic experience dull and monotonous. [86] argued that an increased population increases environmental pollution, including visual pollution, and that pollution would not exist without population. Lights employed in the architecture of cities and night lights (of great visual potential) enhance the visual quality of an urban environment [87]. Night lights used in building attraction sites attract people to visit these places at night [88]. The results obtained by [89] indicated that the correct application of, and attention to, lighting standards is the first factor in forming visual aesthetics in urban space. In metropolitan environments, the population is the primary factor of visual pollution [90]. Population density raises the number of automobiles [53], traffic, and garbage [91]. Several studies have reported garbage or public waste as a cause of visual pollution in urban environments [92]. Research conducted by [93] stated that urban phenomena can be divided into the three categories of neighborhood, building, and street based on the spaces in the urban environment, and that a place's cleanliness leads to higher visual comfort. High-rise construction is taking place in cities due to the increase in population and urbanization, while there is not enough land for home construction. High apartment buildings are a distinct housing trait in crowded cities [94], causing problems such as blocking view corridors and diminishing urban landscapes [95]. Vehicle density and traffic cause visual pollution and make the environment unpleasant [96]. Streets can either enhance the visual qualities of a city or diminish them [97]. In a study by [98], crowded spaces (cars) and cluttered streets were also identified as sources of visual pollution. In other research, the facades of buildings, green space, sky view, pedestrian space, motorization, and diversity were among the key elements affecting the visual quality of streets (an essential part of urban public space, closely related to people) [99]. Primary streets are travel routes for vehicles and public transportation and carry billboards, all of which cause unpleasant sights in urban environments.

6.2. Assessment of Modeling

The ROC and AUC results for the sense of sight in the pleasant and unpleasant states were in the range of 0.8–0.9, indicating very good accuracy of the potential maps of the sense of sight. CNN advantages include the ability to consider the correlation of adjacent spatial information, to maintain spatial relationships between pixels by learning internal feature representations from factor vectors, and to reduce the computational complexity of the network through weight sharing [100]. Reducing the computational complexity of the network through weight sharing, one of the advantages of CNN, reduces the number of trainable network parameters and helps the network increase generalization and avoid overfitting. Large-scale network implementation is much easier with CNN than with other neural networks. The CNN output is highly organized because this model learns the feature extraction and classification layers simultaneously [65]. The CNN algorithm can also reduce the data dimension, extract features sequentially, and classify within one network structure [101].
Tehran's northern, northwestern, central, eastern, and some southern areas had high potential for pleasant visuals. Most recreational attractions, restaurants, and shopping centers are built in these areas, and they can enhance the quality of city visuals. Recreational attractions, restaurants, and shopping centers are intertwined with art and splendid architecture, and light and color elements are used in them to enhance their visuals and attract people. Color is among the most vital factors in urban environments and is experienced through light; light and color play a significant role in increasing the appeal and attraction of a city. Proper utilization of light and color (following lighting and coloring standards) is the initial step in enhancing the visual qualities of urban environments [89]. Shopping centers have exceptional city space qualities, making them inseparable parts of cities [102]. Buildings and shopping centers are urban features interrelated with urban spatial mapping, architecture, and design. Vegetation density is concentrated mainly in the northern areas of the city. Vegetation creates pleasant visual landscapes that give people joy and peace [103]. The southeastern, some central, and southern regions of the city have high potential for unpleasant visuals. These areas are distant from recreational attractions, local passages, restaurants, and retail complexes, which detracts from their appearance. The west and some central areas have fewer religious places and lower vegetation density. Islamic mosques are manifestations of visual aesthetics and great examples of the combination of symbols and deep-rooted religious beliefs [57]. Several studies have introduced industrial facilities as causes of reduced visual landscape quality; the southeastern, some southern, and eastern areas of Tehran are close to industrial facilities and are perceived as unpleasant [104,105]. The city's southeastern, some southern, and central areas contain dilapidated areas that can affect their visuals negatively, as mentioned in previous research [106,107].

6.3. Landscape Policies

The rapid development of cities and urbanism, alongside population growth, causes physical inconsistencies in urban environments. Urban planners can design, map, and enhance visually pleasant sights in urban environments, improving people's living conditions and mental peace. This research mapped the potential of people's visual perception in Tehran, helping urban authorities and planners identify the visual conditions of all districts of the city and make plans to eliminate visual pollution and to enhance and preserve visual landscapes. The results showed that closeness to industrial and dilapidated areas damages the visual qualities of the city; thus, transferring industrial companies and facilities outside the city would reduce visual pollution. Organizing and enhancing dilapidated areas and replacing them with recreational attractions, restaurants, and shopping centers (given their vital role in creating visually pleasant environments) are further measures to eliminate unpleasant sights and create pleasant ones, as is building parks and green spaces, because vegetation enhances the visual qualities of cities.

6.4. Limitations and Future Directions

The limitations of this research include the lack of access to location data for garbage cans, urban furniture, graffiti, and cultural billboards in Tehran. Another limitation was the lack of information about where occasional activities such as paving works, ceremonies, events, and street sales occurred.
The results of this research can be used in future work, such as locating new recreational areas (given people's desire to be in areas with visual beauty), serving as a parameter for determining real-estate value and quality of life, and assessing the level of mental peace in different places in relation to visual landscapes.

7. Conclusions

This research presented an approach based on the combination of spatial modeling and the CNN algorithm to prepare the potential map of people’s perceived sense of sight in two states of pleasant and unpleasant urban environments. The general results of the research can be stated as follows:
  • The results showed that the strongest correlation among criteria that affect people’s visual perception belonged to the distances to restaurants and billboards, shopping centers and billboards, and recreational attractions and billboards.
  • Distance to local passages, population density, and traffic volume were the most significant criteria in the two sight modes of pleasant and unpleasant.
  • The results of the ROC curve showed good accuracy of the CNN algorithm in modeling the two modes of sight sense, and the pleasant sight mode had a higher accuracy than the unpleasant sight.
  • The results demonstrated that the northern, northwestern, central, southern, and eastern areas of the city had high potential in terms of pleasant visuals, while the southwestern, central, and southern areas had high potential for unpleasant sights.
  • In general, for pleasant sight, areas with very high potential occupy a small percentage of Tehran, and areas with high potential predominate. For unpleasant sight, there are relatively large areas with high and very high potential, which is cause for concern. One advantage of identifying these areas through the potential maps is that the need to create visually beautiful areas is evident throughout Tehran, and this need is felt most strongly on the city's southwestern edge. According to the maps of effective criteria, the presence of industrial areas, the lack of recreational attractions, relatively high AQI, low night light, and the lack of restaurants and shopping centers in the southwest can explain the unpleasant visual perception in those areas. Proposing measures, based on the potential maps, to eliminate negative factors affecting visual perception and to create visually beautiful areas that increase citizens' quality of life is among the achievements of this research. The potential maps obtained from this study can help urban planners and managers provide a more favorable visual space to urban users (removing unpleasant sight elements and creating pleasant ones where required), because residents spend much time in the city, it has many effects on them, and being in an excellent visual environment is the desire of all residents.

Author Contributions

Conceptualization, M.F. and S.V.R.-T.; Data curation, M.F.; Formal analysis, M.F., S.V.R.-T. and S.-M.C.; Funding acquisition, A.S.-N. and S.-M.C.; Investigation, S.V.R.-T.; Methodology, M.F.; Project administration, A.S.-N. and S.-M.C.; Resources, A.S.-N.; Software, M.F. and S.V.R.-T.; Supervision, A.S.-N. and S.-M.C.; Validation, M.F.; Visualization, M.F.; Writing—original draft, M.F.; Writing—review & editing, S.V.R.-T., A.S.-N. and S.-M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-RS-2022-00156354) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation) and the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project No. P0016038).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, [Abolghasem Sadeghi-Niaraki], upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

[Appendix A figure: the questionnaire used to collect people's visual perception data]

References

  1. Majdzadeh, S.A.; Mirzaei, R.; Madahi, S.M.; Mabhoot, M.R.; Heidari, A. Identifying and Assessing the Semantic and Visual Perception Signs in the Identification of Fahadan Neighborhood of Yazd. Creat. City Des. 2021, 4, 55–68.
  2. Cole, S.; Balcetis, E. Motivated perception for self-regulation: How visual experience serves and is served by goals. In Advances in Experimental Social Psychology; Elsevier: Amsterdam, The Netherlands, 2021; Volume 64, pp. 129–186.
  3. Orloff, S. Learning Re-Enabled: A Practical Guide to Helping Children with Learning Disabilities; Mosby: Maryland Heights, MO, USA, 2004.
  4. Jana, M.K.; De, T. Visual pollution can have a deep degrading effect on urban and suburban community: A study in few places of Bengal, India, with special reference to unorganized billboards. Eur. Sci. J. 2015, 8, 94–101.
  5. Farahani, M.; Razavi-Termeh, S.V.; Sadeghi-Niaraki, A. A spatially based machine learning algorithm for potential mapping of the hearing senses in an urban environment. Sustain. Cities Soc. 2022, 80, 103675.
  6. Dai, L.; Zheng, C.; Dong, Z.; Yao, Y.; Wang, R.; Zhang, X.; Ren, S.; Zhang, J.; Song, X.; Guan, Q. Analyzing the correlation between visual space and residents’ psychology in Wuhan, China using street-view images and deep-learning technique. City Environ. Interact. 2021, 11, 100069.
  7. Perovic, S.; Folic, N.K. Visual perception of public open spaces in Niksic. Procedia-Soc. Behav. Sci. 2012, 68, 921–933.
  8. Abkar, M.; Kamal, M.; Maulan, S.; Davoodi, S.R. Determining the visual preference of urban landscapes. Sci. Res. Essays 2011, 6, 1991–1997.
  9. Golkar, K. Conceptual evolution of urban visual environment; from cosmetic approach through to sustainable approach. Environ. Sci. 2008, 5, 90–114.
  10. Sottini, V.A.; Barbierato, E.; Capecchi, I.; Borghini, T.; Saragosa, C. Assessing the perception of urban visual quality: An approach integrating big data and geostatistical techniques. Aestimum 2021, 79, 75–102.
  11. Wartmann, F.M.; Frick, J.; Kienast, F.; Hunziker, M. Factors influencing visual landscape quality perceived by the public. Results from a national survey. Landsc. Urban Plan. 2021, 208, 104024.
  12. Nasar, J.L. The evaluative image of the city. J. Am. Plan. Assoc. 1990, 56, 41–53.
  13. Elena, E.; Cristian, M.; Suzana, P. Visual pollution: A new axiological dimension of marketing. Eur. Integr.–New Chall. 2011, 1, 1836.
  14. Wakil, K.; Naeem, M.A.; Anjum, G.A.; Waheed, A.; Thaheem, M.J.; Hussnain, M.Q.u.; Nawaz, R. A hybrid tool for visual pollution Assessment in urban environments. Sustainability 2019, 11, 2211.
  15. Polat, A.T.; Akay, A. Relationships between the visual preferences of urban recreation area users and various landscape design elements. Urban For. Urban Green. 2015, 14, 573–582.
  16. Van Zanten, B.T.; Van Berkel, D.B.; Meentemeyer, R.K.; Smith, J.W.; Tieskens, K.F.; Verburg, P.H. Continental-scale quantification of landscape values using social media data. Proc. Natl. Acad. Sci. USA 2016, 113, 12974–12979.
  17. Tenerelli, P.; Püffel, C.; Luque, S. Spatial assessment of aesthetic services in a complex mountain region: Combining visual landscape properties with crowdsourced geographic information. Landsc. Ecol. 2017, 32, 1097–1115.
  18. Wigness, M.; Eum, S.; Rogers, J.G.; Han, D.; Kwon, H. A RUGD dataset for autonomous navigation and visual perception in unstructured outdoor environments. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macao, China, 4–8 November 2019; pp. 5000–5007.
  19. Abd-Alhamid, F.; Kent, M.; Bennett, C.; Calautit, J.; Wu, Y. Developing an innovative method for visual perception evaluation in a physical-based virtual environment. Build. Environ. 2019, 162, 106278.
  20. Jeon, J.Y.; Jo, H.I. Effects of audio-visual interactions on soundscape and landscape perception and their influence on satisfaction with the urban environment. Build. Environ. 2020, 169, 106544.
  21. Jo, H.I.; Jeon, J.Y. Effect of the appropriateness of sound environment on urban soundscape assessment. Build. Environ. 2020, 179, 106975.
  22. Wakil, K.; Tahir, A.; Hussnain, M.Q.u.; Waheed, A.; Nawaz, R. Mitigating urban visual pollution through a multistakeholder spatial decision support system to optimize locational potential of billboards. ISPRS Int. J. Geo-Inf. 2021, 10, 60.
  23. Ahmed, N.; Islam, M.N.; Tuba, A.S.; Mahdy, M.; Sujauddin, M. Solving visual pollution with deep learning: A new nexus in environmental management. J. Environ. Manag. 2019, 248, 109253.
  24. Gosal, A.; Ziv, G. Landscape aesthetics: Spatial modelling and mapping using social media images and machine learning. Ecol. Indic. 2020, 117, 106638.
  25. Jamil, A.; ali Hameed, A.; Bazai, S.U. Land Cover Classification using Machine Learning Approaches from High Resolution Images. J. Appl. Emerg. Sci. 2021, 11, 108–112.
  26. Wei, J.; Yue, W.; Li, M.; Gao, J. Mapping human perception of urban landscape from street-view images: A deep-learning approach. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102886. [Google Scholar] [CrossRef]
  27. Hameed, M.; Yang, F.; Bazai, S.U.; Ghafoor, M.I.; Alshehri, A.; Khan, I.; Baryalai, M.; Andualem, M.; Jaskani, F.H. Urbanization detection using LiDAR-based remote sensing images of azad Kashmir using novel 3D CNNs. J. Sens. 2022, 2022, 6430120. [Google Scholar] [CrossRef]
  28. Li, Y.; Yabuki, N.; Fukuda, T. Measuring visual walkability perception using panoramic street view images, virtual reality, and deep learning. Sustain. Cities Soc. 2022, 86, 104140. [Google Scholar] [CrossRef]
  29. Tasnim, N.H.; Afrin, S.; Biswas, B.; Anye, A.A.; Khan, R. Automatic classification of textile visual pollutants using deep learning networks. Alex. Eng. J. 2023, 62, 391–402. [Google Scholar] [CrossRef]
  30. Sun, P.; Lu, W.; Jin, L. How the natural environment in downtown neighborhood affects physical activity and sentiment: Using social media data and machine learning. Health Place 2023, 79, 102968. [Google Scholar] [CrossRef]
  31. Yasmin, F.; Hassan, M.M.; Hasan, M.; Zaman, S.; Kaushal, C.; El-Shafai, W.; Soliman, N.F. PoxNet22: A fine-tuned model for the classification of monkeypox disease using transfer learning. IEEE Access 2023, 11, 24053–24076. [Google Scholar] [CrossRef]
  32. Hassan, M.M.; Zaman, S.; Mollick, S.; Hassan, M.M.; Raihan, M.; Kaushal, C.; Bhardwaj, R. An efficient Apriori algorithm for frequent pattern in human intoxication data. Innov. Syst. Softw. Eng. 2023, 19, 61–69. [Google Scholar] [CrossRef]
  33. Zhang, F.; Zhou, B.; Liu, L.; Liu, Y.; Fung, H.H.; Lin, H.; Ratti, C. Measuring human perceptions of a large-scale urban region using machine learning. Landsc. Urban Plan. 2018, 180, 148–160. [Google Scholar] [CrossRef]
  34. Rodaway, P. Sensuous Geographies: Body, Sense and Place; Routledge: Oxford, UK, 2002. [Google Scholar]
  35. Kiwelekar, A.W.; Mahamunkar, G.S.; Netak, L.D.; Nikam, V.B. Deep learning techniques for geospatial data analysis. In Machine Learning Paradigms; Springer: Berlin/Heidelberg, Germany, 2020; pp. 63–81. [Google Scholar]
  36. Das, H.; Pradhan, C.; Dey, N. Deep Learning for Data Analytics: Foundations, Biomedical Applications, and Challenges; Academic Press: Cambridge, MA, USA, 2020. [Google Scholar]
  37. Miglani, A.; Kumar, N. Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges. Veh. Commun. 2019, 20, 100184. [Google Scholar] [CrossRef]
  38. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  39. Zhang, P.; Ke, Y.; Zhang, Z.; Wang, M.; Li, P.; Zhang, S. Urban land use and land cover classification using novel deep learning models based on high spatial resolution satellite imagery. Sensors 2018, 18, 3717. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  40. Shickel, B.; Tighe, P.J.; Bihorac, A.; Rashidi, P. Deep EHR: A survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J. Biomed. Health Inform. 2017, 22, 1589–1604. [Google Scholar] [CrossRef]
  41. Zhang, W.; Yu, Y.; Qi, Y.; Shu, F.; Wang, Y. Short-term traffic flow prediction based on spatio-temporal analysis and CNN deep learning. Transp. A: Transp. Sci. 2019, 15, 1688–1711. [Google Scholar] [CrossRef]
  42. Porzi, L.; Rota Bulò, S.; Lepri, B.; Ricci, E. Predicting and understanding urban perception with convolutional neural networks. In Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 26–30 October 2015; pp. 139–148. [Google Scholar]
  43. Timilsina, S.; Sharma, S.; Aryal, J. Mapping urban trees within cadastral parcels using an object-based convolutional neural network. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 4, 111–117. [Google Scholar] [CrossRef] [Green Version]
  44. Chauhan, R.; Kaur, H.; Alankar, B. Air quality forecast using convolutional neural network for sustainable development in urban environments. Sustain. Cities Soc. 2021, 75, 103239. [Google Scholar] [CrossRef]
  45. Tyrväinen, L.; Ojala, A.; Korpela, K.; Lanki, T.; Tsunetsugu, Y.; Kagawa, T. The influence of urban green environments on stress relief measures: A field experiment. J. Environ. Psychol. 2014, 38, 1–9. [Google Scholar] [CrossRef]
  46. Radomska, M.; Yurkiv, M.; Nazarkov, T. The Assessment of the Visual Pollution from Industrial Facilities in Natural Landscapes. Available online: http://www.kdu.edu.ua/EKB_jurnal/2019_1(27)/PDF/45_49.pdf (accessed on 30 January 2021).
  47. Galindo, M.P.; Hidalgo, M.C. Aesthetic preferences and the attribution of meaning: Environmental categorization processes in the evaluation of urban scenes. Int. J. Psychol. 2005, 40, 19–27. [Google Scholar] [CrossRef]
  48. Ghorbanzadeh, O.; Meena, S.R.; Blaschke, T.; Aryal, J. UAV-based slope failure detection using deep-learning convolutional neural networks. Remote Sens. 2019, 11, 2046. [Google Scholar] [CrossRef] [Green Version]
  49. Nami, P.; Jahanbakhsh, P.; Fathalipour, A. The role and heterogeneity of visual pollution on the quality of urban landscape using GIS; case study: Historical Garden in City of Maraqeh. Open J. Geol. 2016, 6, 20–29. [Google Scholar] [CrossRef] [Green Version]
  50. Zaeimdar, M.; Khalilnezhad Sarab, F.; Rafati, M. Investigation of the relation between visual pollution and citizenry health in the city of Tehran (case study: Municipality districts No. 1 & 12 of Tehran). Anthropog. Pollut. 2019, 3, 1–10. [Google Scholar]
  51. Rombauts, P. Aspects of visual Task Comfort in an Urban Environment. In Proceedings of the Lighting and City Beautification Congress, Istanbul, Turkey, 12–14 September 2001; pp. 99–104. [Google Scholar]
  52. Mokras-Grabowska, J. New urban recreational spaces. Attractiveness, infrastructure arrangements, identity. The example of the city of Łódź. Misc. Geogr. Reg. Stud. Dev. 2018, 22, 219–224. [Google Scholar] [CrossRef] [Green Version]
  53. Aljoufie, M. The impact assessment of increasing population density on Jeddah road transportation using spatial-temporal analysis. Sustainability 2021, 13, 1455. [Google Scholar] [CrossRef]
  54. Bakhshi, M. The position of green space in improving beauty and quality of sustainable space of city. Environ. Conserv. J. 2015, 16, 269–276. [Google Scholar] [CrossRef]
  55. Hiremath, S. Population Growth and Solid Waste Disposal: A burning Problem in the Indian Cities. Indian Streams Res. J. 2016, 6, 141–147. [Google Scholar]
  56. Tudor, C.A.; Iojă, I.C.; Hersperger, A.; Pǎtru-Stupariu, I. Is the residential land use incompatible with cemeteries location? Assessing the attitudes of urban residents. Carpathian J. Earth Environ. Sci. 2013, 8, 153–162. [Google Scholar]
  57. Nejad, J.M.; Azemati, H.; Abad, A.S.H. Investigating Sacred Architectural Values of Traditional Mosques Based on the Improvement of Spiritual Design Quality in the Architecture of Modern Mosques. Int. J. Architect. Eng. Urban Plan 2019, 29, 47–59. [Google Scholar]
  58. Nowghabi, A.S.; Talebzadeh, A. Psychological influence of advertising billboards on city sight. Civ. Eng. J. 2019, 5, 390–397. [Google Scholar] [CrossRef] [Green Version]
  59. Kshetri, T. NDVI, NDBI & NDWI calculation using Landsat 7, 8. GeoWorld 2018, 2, 32–34. [Google Scholar]
  60. Malm, W.C.; Leiker, K.K.; Molenar, J.V. Human perception of visual air quality. J. Air Pollut. Control Assoc. 1980, 30, 122–131. [Google Scholar] [CrossRef]
  61. Oltra, C.; Sala, R. A Review of the Social Research on Public Perception and Engagement Practices in Urban Air Pollution; IAEA: Vienna, Austria, 2014. [Google Scholar]
  62. Xu, H.; Deng, Y. Dependent evidence combination based on shearman coefficient and pearson coefficient. IEEE Access 2017, 6, 11634–11640. [Google Scholar] [CrossRef]
  63. Zhu, H.; You, X.; Liu, S. Multiple ant colony optimization based on pearson correlation coefficient. IEEE Access 2019, 7, 61628–61638. [Google Scholar] [CrossRef]
  64. Ding, A.; Zhang, Q.; Zhou, X.; Dai, B. Automatic recognition of landslide based on CNN and texture change detection. In Proceedings of the 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 444–448. [Google Scholar]
  65. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53. [Google Scholar] [CrossRef]
  66. Saha, S.; Sarkar, R.; Roy, J.; Hembram, T.K.; Acharya, S.; Thapa, G.; Drukpa, D. Measuring landslide vulnerability status of Chukha, Bhutan using deep learning algorithms. Sci. Rep. 2021, 11, 16374. [Google Scholar] [CrossRef]
  67. Panahi, M.; Sadhasivam, N.; Pourghasemi, H.R.; Rezaie, F.; Lee, S. Spatial prediction of groundwater potential mapping based on convolutional neural network (CNN) and support vector regression (SVR). J. Hydrol. 2020, 588, 125033. [Google Scholar] [CrossRef]
  68. Zhu, Q.; Chen, J.; Zhu, L.; Duan, X.; Liu, Y. Wind speed prediction with spatio–temporal correlation: A deep learning approach. Energies 2018, 11, 705. [Google Scholar] [CrossRef] [Green Version]
  69. Sun, T.; Li, H.; Wu, K.; Chen, F.; Zhu, Z.; Hu, Z. Data-driven predictive modelling of mineral prospectivity using machine learning and deep learning methods: A case study from southern Jiangxi Province, China. Minerals 2020, 10, 102. [Google Scholar] [CrossRef] [Green Version]
  70. Lu, Y.; Huo, Y.; Yang, Z.; Niu, Y.; Zhao, M.; Bosiakov, S.; Li, L. Influence of the Parameters of the Convolutional Neural Network Model in Predicting the Effective Compressive Modulus of Porous Structure; Frontiers Media S.A: Lausanne, Switzerland, 2022. [Google Scholar]
  71. Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Farhangi, F.; Choi, S.-M. COVID-19 risk mapping with considering socio-economic criteria using machine learning algorithms. Int. J. Environ. Res. Public Health 2021, 18, 9657. [Google Scholar] [CrossRef]
  72. Al Sayaydeha, O.N.; Mohammad, M.F. Diagnosis of the Parkinson disease using enhanced fuzzy min-max neural network and OneR attribute evaluation method. In Proceedings of the 2019 International Conference on Advanced Science and Engineering (ICOASE), Duhok, Iraq, 2–4 April 2019; pp. 64–69. [Google Scholar]
  73. Wu, S.; Flach, P. A scored AUC metric for classifier evaluation and selection. In Proceedings of the Second Workshop on ROC Analysis in ML, Bonn, Germany, 11 August 2005. [Google Scholar]
  74. Davis, J.; Goadrich, M. The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2006; pp. 233–240. [Google Scholar]
  75. Lei, X.; Chen, W.; Panahi, M.; Falah, F.; Rahmati, O.; Uuemaa, E.; Kalantari, Z.; Ferreira, C.S.S.; Rezaie, F.; Tiefenbacher, J.P. Urban flood modeling using deep-learning approaches in Seoul, South Korea. J. Hydrol. 2021, 601, 126684. [Google Scholar] [CrossRef]
  76. Youssef, A.M.; Al-Kathery, M.; Pradhan, B. Landslide susceptibility mapping at Al-Hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosci. J. 2015, 19, 113–134. [Google Scholar] [CrossRef]
  77. Siddiqui, K.A.; Tarani, S.S.A.; Fatani, S.A.; Raza, A.; Butt, R.M.; Azeema, N. Effect of size, location and content of billboards on brand awareness. J. Bus. Stud. Q. 2016, 8, 40. [Google Scholar]
  78. Edquist, J.; Horberry, T.; Hosking, S.; Johnston, I. Effects of advertising billboards during simulated driving. Appl. Ergon. 2011, 42, 619–626. [Google Scholar] [CrossRef] [PubMed]
  79. Zamiri, M. The role of urban advertising in quality of urban landscape. Curr. World Environ. 2016, 11, 14. [Google Scholar] [CrossRef]
  80. Carmona, M. Principles for public space design, planning to do better. Urban Des. Int. 2019, 24, 47–59. [Google Scholar] [CrossRef] [Green Version]
  81. Foster, S.; Giles-Corti, B. The built environment, neighborhood crime and constrained physical activity: An exploration of inconsistent findings. Prev. Med. 2008, 47, 241–251. [Google Scholar] [CrossRef]
  82. Hillnhütter, H. Stimulating urban walking environments—Can we measure the effect? Environ. Plan. B Urban Anal. City Sci. 2022, 49, 275–289. [Google Scholar] [CrossRef]
  83. Wright, C.; Curtis, B. Aesthetics and the urban road environment. Proc. Inst. Civ. Eng.-Munic. Eng. 2002, 151, 145–150. [Google Scholar] [CrossRef]
  84. Ahmed, S.A.G.; Mushref, Z.J. Three-Dimensional Modeling of Visual Pollution of Generator Wires in Ramadi City. PalArch’s J. Archaeol. Egypt/Egyptol. 2021, 18, 1659–1668. [Google Scholar]
  85. Taylor, N. The aesthetic experience of traffic in the modern city. Urban Stud. 2003, 40, 1609–1625. [Google Scholar] [CrossRef]
  86. Bankole, O.E. Urban environmental graphics: Impact, problems and visual pollution of signs and billboards in Nigerian cities. Int. J. Educ. Res. 2013, 1, 1–12. [Google Scholar]
  87. Rozman Cafuta, M. Visual perception and evaluation of artificial night light in urban open areas. Informatologia 2014, 47, 257–263. [Google Scholar]
  88. Boyce, P.R. The benefits of light at night. Build. Environ. 2019, 151, 356–367. [Google Scholar] [CrossRef]
  89. Dabbagh, E. The Effects of Color and Light on the Beautification of Urban Space and the Subjective Perception of Citizens. Int. J. Eng. Sci. Invent. 2019, 8, 20–25. [Google Scholar]
  90. Allahyari, H.; Nasehi, S.; Salehi, E.; Zebardast, L. Evaluation of visual pollution in urban squares, using SWOT, AHP, and QSPM techniques (Case study: Tehran squares of Enghelab and Vanak). Pollution 2017, 3, 655–667. [Google Scholar]
  91. Alam, P.; Ahmade, K. Impact of solid waste on health and the environment. Int. J. Sustain. Dev. Green Econ. 2013, 2, 165–168. [Google Scholar]
  92. Azeema, N.; Nazuk, A. Is billboard a visual pollution in Pakistan. Int. J. Sci. Eng. Res 2016, 7, 862–874. [Google Scholar]
  93. Achsani, R.A.; Wonorahardjo, S. Studies on Visual Environment Phenomena of Urban Areas: A Systematic Review. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2020; p. 012016. [Google Scholar]
  94. Romanova, E. Increase in population density and aggravation of social and psychological problems in areas with high-rise construction. E3S Web Conf. 2018, 33, 03061. [Google Scholar] [CrossRef] [Green Version]
  95. Karimimoshaver, M.; Hajivaliei, H.; Shokri, M.; Khalesro, S.; Aram, F.; Shamshirband, S. A model for locating tall buildings through a visual analysis approach. Appl. Sci. 2020, 10, 6072. [Google Scholar] [CrossRef]
  96. Voronych, Y. Visual pollution of urban space in Lviv. Przestrz. I Forma 2013, 20, 309–314. [Google Scholar]
  97. Song, Y.; Wang, R.; Fernandez, J.; Li, D. Investigating sense of place of the Las Vegas Strip using online reviews and machine learning approaches. Landsc. Urban Plan. 2021, 205, 103956. [Google Scholar] [CrossRef]
  98. Saghir, B. Tackling Urban Visual Pollution to Enhance the Saudi Cityscape; CLG: Riyadh, Saudi Arabia, 2019. [Google Scholar]
  99. Ye, Y.; Zeng, W.; Shen, Q.; Zhang, X.; Lu, Y. The visual quality of streets: A human-centred continuous measurement based on machine learning algorithms and street view images. Environ. Plan. B Urban Anal. City Sci. 2019, 46, 1439–1457. [Google Scholar] [CrossRef]
  100. Zhang, G.; Wang, M.; Liu, K. Forest fire susceptibility modeling using a convolutional neural network for Yunnan province of China. Int. J. Disaster Risk Sci. 2019, 10, 386–403. [Google Scholar] [CrossRef] [Green Version]
  101. Rere, L.; Fanany, M.I.; Arymurthy, A.M. Metaheuristic algorithms for convolution neural network. Comput. Intell. Neurosci. 2016, 2016, 1537325. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  102. Wong, J.; Tam, K. Spatial identity of fashion brands: The visibility network in complex shopping malls. In Proceedings of the IFFTI 2019 Conference, Manchester, UK, 8–12 April 2019; pp. 168–185. Available online: https://fashioninstitute.mmu.ac.uk/ifftipapers/paper-85 (accessed on 8 April 2019).
  103. Du, H.; Jiang, H.; Song, X.; Zhan, D.; Bao, Z. Assessing the visual aesthetic quality of vegetation landscape in urban green space from a visitor’s perspective. J. Urban Plan. Dev. 2016, 142, 04016007. [Google Scholar] [CrossRef]
  104. Uzun, O.; Müderrisoğlu, H. Visual landscape quality in landscape planning: Examples of Kars and Ardahan cities in Turkey. Afr. J. Agric. Res. 2011, 6, 1627–1638. [Google Scholar]
  105. Pascal, M.; Pascal, L.; Bidondo, M.-L.; Cochet, A.; Sarter, H.; Stempfelet, M.; Wagner, V. A review of the epidemiological methods used to investigate the health impacts of air pollution around major industrial areas. J. Environ. Public Health 2013, 2013, 737926. [Google Scholar] [CrossRef]
  106. Khanal, K.K. Visual pollution and eco-dystopia: A study of billboards and signs in Bharatpur metropolitan city. Res. J. Engl. Lang. Lit 2018, 6, 202–208. [Google Scholar]
  107. Mohamed, M.A.S.; Ibrahim, A.O.; Dodo, Y.A.; Bashir, F.M. Visual pollution manifestations negative impacts on the people of Saudi Arabia. Int. J. Adv. Appl. Sci 2021, 8, 94–101. [Google Scholar]
Figure 1. The geographical location of the study area.
Figure 2. Four stages of research.
Figure 3. Images perceived by users for sight: (a) pleasant; (b) unpleasant in Tehran.
Figure 4. Spatial criteria affecting people’s visual perception: (a) Night light; (b) Distance to recreational attractions; (c) Distance to industrial areas; (d) Distance to public transport stations; (e) Distance to primary streets; (f) Distance to secondary streets; (g) Distance to local passages; (h) Traffic volume; (i) Distance to billboards; (j) Distance to restaurants; (k) Distance to shopping malls; (l) Distance to dilapidated areas; (m) Population density; (n) Distance to cemeteries; (o) Distance to religious places; (p) NDVI; and (q) AQI.
Figure 5. Structure of a CNN with four layers.
Figure 6. Pearson correlation coefficient results (Distance to public transport stations = A; Traffic volume = B; Distance to recreational attractions = C; Distance to secondary streets = D; Distance to primary streets = E; Distance to restaurants = F; Population density = G; Night light = H; NDVI = I; Distance to religious places = J; Distance to shopping malls = K; Distance to local passages = L; Distance to industrial areas = M; Distance to dilapidated areas = N; Distance to cemeteries = O; Distance to billboards = P; AQI = Q).
Figure 7. The importance of criteria affecting the sense of sight.
Figure 8. Result of prediction error: (a,b) pleasant; (c,d) unpleasant sight.
Figure 9. Potential map of people’s visual perception by CNN algorithm: (a) pleasant; (b) unpleasant sight.
Figure 10. Validation using ROC curve for pleasant and unpleasant sight.
Figure 11. Places with high potential in pleasant sight.
Figure 12. Places with high potential in unpleasant sight.
Table 1. Comparison of research conducted in the field of visual sense perception and the factors affecting it.
Researcher | Purpose | Methodology | Results
[15] | Studied the relationship between users’ visual preferences in recreational activities in urban areas and various factors of landscape design | Taking photographs of the area in phases, using a photo-questionnaire design, and applying a statistical analysis | Identifying factors with constructive impacts and destructive effects on visual quality
[16] | Quantification of landscape values on a continental scale | Social media data | Results show that Flickr and Instagram can be used to determine the properties of a landscape, and social media data can be used to comprehend how people value a landscape in social, political, and ecological terms
[17] | Studied the aesthetic services of a mountain region | Combination of visual traits of landscapes and crowdsourced geographic information | Identifying visual factors that attract domestic and foreign tourists
[14] | Presented a new tool for Visual Pollution Assessment (VPA) in urban environments to quantify visual pollution | The explicit and systematic combination of expert and general opinions for classifying and categorizing Visual Pollution Objects (VPOs) | The significant role of VPA in evaluating visual pollution
[18] | Autonomous navigation and visual perception in unstructured outdoor environments | A Robot Unstructured Ground Driving (RUGD) dataset | Introducing the unique challenges of these data as they relate to navigation tasks
[19] | Visual perception evaluation in a physical-based virtual environment | Using virtual reality technology to compare a 3-dimensional virtual office simulator with a real office | There is no significant difference between the two environments based on the studied parameters; the proposed method can provide realistic, immersive environments
[20] | Studied effects of audio-visual interactions on soundscape and landscape perception and their influence on satisfaction with the urban environment | Virtual reality technology | The availability of visual information affects the auditory perception of a number of human-made and natural sounds, and the availability of audio information affects the visual perception of various visual elements; audio information affects the perceived naturalness of a landscape; audio and visual information account for 24% and 76% of overall satisfaction, respectively
[21] | Effect of the appropriateness of sound environment on urban soundscape assessment | Virtual reality technology | The appropriateness of sound sources interacts with individuals’ perception of visual elements; traffic sounds and birdsong affect participants’ initial perception of urban soundscape quality; there is a relationship between “human” sounds originating from human activity and the “comfort” aspects of soundscape quality
[22] | Studied the reduction in urban visual pollution | Multilateral decision-making and geographic information system | Identifying the best spots for billboard installation through focused management
Table 2. Comparison of research performed in the field of machine learning and deep learning.
Researcher | Purpose | Methodology | Results
[23] | Solving visual pollution | Deep learning | Training accuracy of 95% and validation accuracy of 85% were achieved by the deep learning model; the upper limit of accuracy is related to the size of the dataset
[24] | Studying the aesthetics of landscapes | Spatial modeling and mapping, social media images, and machine learning | The predictive model found important variables for aesthetic value, such as the pleasant nature of rural areas, mountainous landforms, and vegetation
[6] | Analyzing the correlation between visual space and residents’ psychology in Wuhan, China | Street-view images and deep-learning technique | There is a strong relationship between urban visual space indicators and residents’ psychological perceptions
[25] | Land cover classification | Machine learning approaches from high-resolution images | Higher accuracy of ANN than SVM; improved average accuracy after applying post-processing using majority analysis
[26] | Studying people’s perception of Shanghai landscapes | Deep learning and street view images | More assurance and liveliness, but also more depression, in highly urbanized areas
[27] | Urbanization detection using LiDAR-based remote sensing images of Azad Kashmir | Novel 3D CNNs | The overall accuracy and kappa value of the suggested 3D CNN approach are very good; the proposed 3D CNN approach captures urbanization better than the commonly used pixel-based support vector machine classifier
[28] | Measuring visual walkability perception | Panoramic street view images, virtual reality, and deep learning | Validating the accuracy of the VWP classification deep multitask learning (VWPCL) model for predicting visual walkability perception (VWP)
[29] | Automatic classification of textile visual pollutants | Deep learning networks (Faster R-CNN, YOLOv5, and EfficientDet) | The best performance belongs to the EfficientDet framework
[30] | Studying the effect of the natural environment in the downtown neighborhood on physical activity and sentiment | Social media data and machine learning | Favorable influence of blue space visibility, activity facilities, street furniture, and safety on physical activity, with a social gradient; positive correlation of amenities, perceived street safety, and beauty with public sentiment; consistency of social media findings on the environment and physical activity with traditional surveys from the same period
[31] | Classification of monkeypox disease | Transfer learning | PoxNet22 outperforms other methods in classifying monkeypox
[32] | Investigation of frequent patterns in human intoxication data | Apriori algorithm | Eight significant rules were discovered, with a confidence level of 95% and a support level of 45%
Table 3. Independent variables, along with the source of data.
Independent Variables | Source of Data
Distance to local passages | OpenStreetMap (https://www.openstreetmap.org) (accessed on 1 January 2021) (1:100,000)
Population density | Statistical Centre of Iran (2017)
Traffic volume | Tehran Traffic Control Company (2015–2020)
Night light | VIIRS (Visible Infrared Imaging Radiometer Suite) imagery in Google Earth Engine (https://earthengine.google.com/) (accessed on 1 January 2021)
Distance to public transport stations | Land use layers (1:10,000)
Distance to billboards | Tehran Enhancement Organization
Distance to primary streets | OpenStreetMap (2021) (1:100,000)
Distance to recreational attractions | Land use layers (1:10,000)
Distance to shopping malls | Land use layers (1:10,000)
Distance to restaurants | Land use layers (1:10,000)
Distance to religious places | Land use layers (1:10,000)
Distance to cemeteries | Land use layers (1:10,000)
Distance to dilapidated areas | Land use layers (1:10,000)
Distance to secondary streets | OpenStreetMap (2021) (1:100,000)
Distance to industrial areas | Land use layers (1:10,000)
AQI | Tehran Air Quality Control Company (23 stations) (2010–2020)
NDVI | Landsat 8 images in the Google Earth Engine platform (2010–2020)
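Table 3 lists NDVI as a Landsat 8 product processed in Google Earth Engine. As a hedged sketch of how such a layer could be derived locally rather than in Earth Engine, the snippet below computes NDVI = (NIR - Red) / (NIR + Red) from a Landsat 8 scene using rasterio; the file names are hypothetical, and bands 5 and 4 are assumed to hold the NIR and red reflectance, following the standard Landsat 8 band ordering.

```python
import numpy as np
import rasterio

# Illustrative sketch, not the authors' pipeline: compute NDVI from a
# multiband Landsat 8 GeoTIFF (band 5 = NIR, band 4 = red). Paths are hypothetical.
with rasterio.open("landsat8_tehran.tif") as src:
    red = src.read(4).astype("float32")
    nir = src.read(5).astype("float32")
    profile = src.profile

# NDVI = (NIR - Red) / (NIR + Red), guarding against division by zero.
denom = nir + red
ndvi = np.where(denom == 0, np.nan, (nir - red) / denom)

profile.update(count=1, dtype="float32", nodata=np.nan)
with rasterio.open("ndvi_tehran.tif", "w", **profile) as dst:
    dst.write(ndvi.astype("float32"), 1)
```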
Table 4. Result of evaluation indices.
Sight Type | Train MSE | Train Accuracy | Test MSE | Test Accuracy
Pleasant | 0.02 | 0.97 | 0.22 | 0.75
Unpleasant | 0.05 | 0.93 | 0.20 | 0.74
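Table 4 reports MSE and accuracy on the training and testing data, and the maps are further validated with the ROC curve and AUC (Figure 10). The following is a minimal sketch of how such indices can be computed with scikit-learn; it is illustrative only, and the label and probability arrays are synthetic placeholders rather than the study’s data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error, roc_auc_score

# Illustrative sketch only: evaluation indices of the kind shown in Table 4.
# `y_test` holds 0/1 labels (unpleasant/pleasant locations) and `p_test` the
# model's predicted probabilities; both are synthetic placeholders here.
rng = np.random.default_rng(0)
y_test = rng.integers(0, 2, size=200)
p_test = np.clip(0.6 * y_test + 0.5 * rng.random(200), 0.0, 1.0)

mse = mean_squared_error(y_test, p_test)                    # error of the raw scores
acc = accuracy_score(y_test, (p_test >= 0.5).astype(int))   # accuracy at a 0.5 threshold
auc = roc_auc_score(y_test, p_test)                         # area under the ROC curve
print(f"MSE = {mse:.2f}, Accuracy = {acc:.2f}, AUC = {auc:.3f}")
```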
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
