Evaluating Urban Greenery Through the Front-Facing Street View Imagery: Insights from a Nanjing Case Study

Zhu, Jin; Huang, Yingjing; Cao, Ziyue; Zhang, Yue; Ding, Yuan; Du, Jinglong

doi:10.3390/ijgi14080287


by Jin Zhu ¹,²,*, Yingjing Huang ³, Ziyue Cao ¹, Yue Zhang ¹, Yuan Ding ⁴ and Jinglong Du ¹,²

¹ School of Geography Science and Geomatics Engineering, Suzhou University of Science and Technology, Suzhou 215009, China

² Suzhou Key Laboratory of Spatial Information Intelligent Technology and Application, Suzhou 215009, China

³ Institute of Remote Sensing and Geographical Information System, School of Earth and Space Sciences, Peking University, Beijing 100091, China

⁴ College of Geography and Remote Sensing, Hohai University, Nanjing 211100, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2025, 14(8), 287; https://doi.org/10.3390/ijgi14080287

Submission received: 7 May 2025 / Revised: 1 July 2025 / Accepted: 23 July 2025 / Published: 24 July 2025


Abstract

Street view imagery has become a vital tool for assessing urban street greenery, with the Green View Index (GVI) serving as the predominant metric. However, while GVI effectively quantifies overall greenery, it fails to capture the nuanced, human-scale experience of urban greenery. This study introduces the Front-Facing Green View Index (FFGVI), a metric designed to reflect the perspective of pedestrians traversing urban streets. The FFGVI computation involves three key steps: (1) calculating azimuths for road points, (2) retrieving front-facing street view images, and (3) applying semantic segmentation to identify green pixels in street view imagery. Building on this, this study proposes the Street Canyon Green View Index (SCGVI), a novel approach for identifying boulevards that evoke perceptions of comfort, spaciousness, and aesthetic quality akin to room-like streetscapes. Applying these indices to a case study in Nanjing, China, this study shows that (1) FFGVI exhibited a strong correlation with GVI (R = 0.88), whereas the association between SCGVI and GVI was marginally weaker (R = 0.78). GVI tends to overestimate perceived greenery due to the influence of lateral views dominated by side-facing vegetation; (2) FFGVI provides a more human-centered perspective, mitigating biases introduced by sampling point locations and obstructions such as large vehicles; and (3) SCGVI effectively identifies prominent boulevards that contribute to a positive urban experience. These findings suggest that FFGVI and SCGVI are valuable metrics for informing urban planning, enhancing urban tourism, and supporting greening strategies at the street level.

Keywords: green view index; front-facing street view image; urban greenery; boulevards

1. Introduction

Urban green spaces, encompassing trees, shrubs, lawns, and other vegetative elements, constitute an essential component of urban environments, offering a wide array of ecological and societal benefits [1]. These spaces play a pivotal role in reducing air pollution, mitigating urban noise, sequestering carbon dioxide, alleviating the urban heat island effect, and fostering urban biodiversity. In addition, urban greenery is increasingly recognized for its contributions to human health and well-being [2], including promoting physical activity such as walking [3], facilitating leisure activities, reducing body mass index, and enhancing mental health outcomes [4].

In recent years, researchers have used street view imagery to extract the Green View Index (GVI) to represent urban street greenery from a pedestrian’s perspective. GVI provides a perceptually aligned and robust measure of pedestrians’ greenery exposure [5]. Insights from social psychology, such as priming theory, suggest that habitual behaviors, including spatial perception by local residents, remain largely stereotypical and stable in everyday contexts [6]. During routine daily travel, people look mainly in the direction in which they are moving. Empirical evidence further supports the utility of forward-facing imagery for accurately estimating parameters such as tree cover [7] or indices including GVI, Building View Index (BVI), and Sky View Index (SVI) [8], yielding superior results compared to multi-directional images [9]. While some recent studies have incorporated forward-facing imagery [10,11,12], the predominant approach continues to rely on four-directional street-view imagery for GVI calculations [13,14,15,16].

The standard GVI has been widely utilized to quantify the overall greenery within a given scene. However, its ability to capture the complexity and richness of urban green spaces remains limited. To overcome this shortcoming, this study introduces two novel metrics aimed at refining the calculation methodologies of GVI to provide a more nuanced representation of urban greenery. Among these, the Front-Facing Green View Index (FFGVI) is designed to reflect the green views perceived by pedestrians as they navigate streetscapes in their everyday environments. Furthermore, this study proposes the Street Canyon Green View Index (SCGVI), which identifies boulevards that convey a sense of comfort, spatial enclosure, and aesthetic memorability, embodying qualities characteristic of “room-like” streetscapes. Through a systematic evaluation of GVI, FFGVI, and SCGVI, this study seeks to elucidate the distinctive properties of these metrics, offering new insights into the representation and assessment of urban green spaces.

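Specifically, this study addresses the following questions:

(1) What are the relationships among GVI, FFGVI, and SCGVI?
(2) To what kinds of scenarios does each of them apply?
(3) Why is FFGVI more accurate than GVI in measuring urban street greenery?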

Using Nanjing as a representative case study, this study quantitatively evaluated and compared FFGVI and SCGVI against the standard GVI. This study proposes that the traditional GVI, derived from images taken along four cardinal directions, is particularly well-suited for open environments such as parks, expansive green areas, and intersections [17]. However, in dense urban streets where the visual field is often constrained, GVI estimates based on front-facing, street-level imagery may provide a more accurate representation of the green visual experience. To test this hypothesis, this study deconstructed the GVI into directional components and analyzed their values and variability under diverse spatial conditions. Additionally, the SCGVI was demonstrated to be particularly effective in identifying boulevards with pronounced spatial enclosure, a characteristic associated with the perception of positive, comfortable, and memorable “room-like” urban streetscapes [18].

The paper is structured as follows: Section 2 reviews the existing literature on street view imagery and GVI. Section 3 describes the study area and details the research methodology. Section 4 reports the experimental results, and Section 5 provides an in-depth discussion of the key findings, including the study’s limitations and broader implications. Finally, Section 6 concludes the paper with a synthesis of the main insights.

2. Literature Review

2.1. Street View Images

In recent years, street view imagery has emerged as a valuable data source for urban analytics [19]. These images, typically captured by dedicated collection vehicles traversing urban road networks, are offered by major map service providers such as Google [20], Baidu [15], and Tencent [21]. Unlike remote sensing imagery, which adopts a top-down perspective for earth observation, street view imagery employs an eye-level perspective, offering a more accurate representation of pedestrians’ perceptions of urban street greenery. The availability of such data through publicly accessible web APIs has substantially reduced the time, labor, and financial resources needed for urban studies [22]. Among these services, Google Street View, launched in 2007, remains the most extensive, covering over 114 countries worldwide. However, its restricted availability in mainland China has led to the predominant use of Baidu and Tencent street view imagery for studies focused on Chinese cities. Baidu, in particular, has been providing comprehensive street view coverage within China since 2013 [19].

2.2. Green View Index

Yang introduced GVI as a metric for quantifying the visibility of greenery from a ground-level perspective [17]. This approach utilized color photographs to evaluate the visibility of urban trees. The green vegetation area within each photograph was manually extracted using Photoshop, and the GVI was calculated as the proportion of pixels representing green vegetation in the total image. The GVI at each sampling point was determined by averaging the GVIs derived from four photographs taken in different cardinal directions at that location. Building on this concept, Li developed a modified GVI to assess street-level greenery using imagery from Google Street View [23]. To enhance the accuracy of greenery measurement, images were captured from six horizontal and three vertical angles, resulting in 18 images per sampling point.

The Panoramic View Green View Index (PVGVI), introduced by Xia, quantifies the proportion of green pixels within a panoramic image captured at specific sampling points [24]. Kumakoshi developed the Standardized Green View Index (sGVI), designed to enhance the robustness of GVI measurements across regions [25]. The sGVI is calculated as a weighted average GVI for a given area, with weights determined using Voronoi tessellation. Beyond regional aggregation, GVI can also be applied to path analysis, enabling the identification of optimal “green view” paths through network algorithms [13].

Traditionally, as Figure 1a illustrates, the GVI is determined by capturing images in the four cardinal directions (north, east, south, and west) at a given sampling point, followed by calculating the proportion of green pixels in each image and averaging these values [17]. For instance, Ye utilized Google Street View (GSV) imagery captured from four directions (front, back, left, and right), yet their GVI calculation was based on the average values from these directions, necessitating the collection and processing of a substantial number of images [10]. While this methodology offers a broad visual quantification of urban greenery, it may not fully capture the nuanced experiences of greenery encountered during everyday pedestrian activity. As pedestrians traverse urban streets, whether commuting to work or school, their line of sight predominantly focuses on the forward-facing direction, emphasizing the landscape along the path of movement [13], as shown in Figure 1b. A few studies have utilized front-facing street view images to calculate GVI. Law, for example, assessed street quality using only front-facing images taken from the midpoint of roads [11], and Rui employed front-facing street view images to evaluate streetscape perceptions [26]. Furthermore, a small body of literature has investigated the influence of different street view image acquisition parameters, including image orientation, on derived metrics [8,9]. These studies suggest that GVI and related metrics derived from front-facing street view images demonstrate greater accuracy compared to those obtained from multi-directional images.

2.3. Semantic Segmentation

Early extraction of greenery from images relied on manual work in general-purpose image processing software, such as Photoshop [17]. However, this approach was both time-intensive and laborious, rendering it unsuitable for handling large datasets efficiently. To address these limitations, methods leveraging image spectral information were subsequently developed. While effective to some extent, these techniques were prone to inaccuracies, such as misclassifying green objects (billboards, for example) as vegetation [23]. In recent years, advances in deep learning technologies have revolutionized greenery extraction through semantic segmentation techniques. These methods categorize the individual pixels within an image, assigning labels such as vegetation, sky, and buildings. Prominent semantic segmentation models include PSPNet [27], SegNet [10], FCN8s [3], and DeepLab [28], among others. These approaches are both accurate and efficient.

3. Materials and Methods

3.1. Study Area

Our study was conducted in Nanjing, a major city in Jiangsu Province, China, with a total area of 6587 km² and a population of approximately 9.54 million as of 2023. Notably, in the late 1920s, the first modern boulevards in China—Zhongshan North Road, Zhongshan Road, Zhongshan East Road, and Mausoleum Road—were planned and constructed, characterized by their tall and dense chinars. Renowned for its aesthetically remarkable and historically significant boulevards, Nanjing represents an ideal setting for this analysis. The central urban region was selected as the primary study area (Figure 2).

3.2. Research Framework

The research framework, depicted in Figure 3, encompasses three primary steps. Initially, the azimuths of the sampling points were determined using the OpenStreetMap (OSM) road network dataset, while front-facing street view images were retrieved from the Baidu Maps platform. Specifically, the OSM dataset facilitated the generation of sampling points and the calculation of their azimuths, which were subsequently utilized as heading parameters for acquiring front-facing street view images via the Baidu API. In the second step, the MP-Former model was employed as the semantic segmentation approach to ensure an objective and precise classification of pixel-level visual elements [29]. Based on the segmentation outcomes, the FFGVI and SCGVI were computed for each sampling point. Lastly, the FFGVI and SCGVI indices were analyzed through distribution analysis, correlation analysis, and comparative analysis. Additionally, boulevards were identified using the SCGVI.

3.3. Azimuth Calculation and Street View Images Collection

The road network of Nanjing was sourced from the OSM platform. Sampling points were generated along road segments at 20 m intervals, resulting in a total of 133,800 points. Corresponding street view images were retrieved using the Baidu Map platform, which required precise parameter configuration in the Baidu Map API to ensure the capture of front-facing images aligned parallel to the streets. The heading, pitch, and fov parameters control the orientation and field of view of the camera (Figure 4a). Specifically, the heading parameter dictates the compass direction of the camera, ranging from 0° (north) clockwise to 360°, enabling alignment with a specific azimuth. The pitch parameter adjusts the vertical tilt of the camera, while fov specifies the horizontal field of view. For this study, the pitch and fov values were fixed at 0° and 90°, respectively, while the heading parameter was dynamically set to the azimuth angle of each sampling point. The azimuth for each point was computed based on the coordinates of adjacent points. Using these parameters, along with the latitude and longitude of each sampling point, HTTP requests were constructed to retrieve the front-facing street view images for all points (Figure 4b).
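The request construction described above might be sketched as follows. This is a minimal illustration rather than the authors' code: the endpoint URL, the parameter names `ak`, `width`, `height`, and `location`, the image size, and the helper name `build_request_url` are all assumptions made for the example and should be checked against the provider's current documentation. Only `heading`, `pitch`, and `fov` and their values (azimuth, 0°, 90°) come from the text.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; verify against the provider's documentation.
BASE_URL = "https://api.map.baidu.com/panorama/v2"


def build_request_url(lng, lat, heading, api_key, width=1024, height=512):
    """Build (but do not send) a request URL for one front-facing image.

    heading is the azimuth of the sampling point in degrees; pitch and fov
    are fixed at 0 and 90 degrees, as stated in the text. The width, height,
    and key parameter names are hypothetical placeholders.
    """
    params = {
        "ak": api_key,                # hypothetical API key parameter
        "width": width,               # hypothetical image width in pixels
        "height": height,             # hypothetical image height in pixels
        "location": f"{lng},{lat}",   # assumed "longitude,latitude" order
        "heading": round(heading, 2),
        "pitch": 0,
        "fov": 90,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Example with made-up coordinates and a placeholder key.
print(build_request_url(118.78, 32.04, 135.456, "YOUR_KEY"))
```

Building the URL separately from sending it keeps the heading logic easy to check before any network traffic is generated.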

While prior studies have leveraged street view image metadata to derive azimuths [30], this approach is hindered by computational inefficiency, as large-scale datasets necessitate excessive network requests. Here, this study presents a method that computes azimuths directly from road network data, significantly reducing reliance on network requests and enabling rapid azimuth calculation for extensive urban road networks.

The azimuth of a road sampling point is derived from its adjacent sampling points. As depicted in Figure 5a, points A and B represent neighboring sampling locations with coordinates $(x_{a}, y_{a})$ and $(x_{b}, y_{b})$, respectively. The azimuth $\alpha$ of point B is calculated as follows:

$$\alpha = \frac{180}{\pi}\,\operatorname{atan2}(x_{b} - x_{a},\ y_{b} - y_{a}) \qquad (1)$$

The azimuth calculation employs the two-argument arctangent function, θ = atan2(y, x) (https://en.wikipedia.org/wiki/Atan2, accessed 17 April 2025), which returns the counterclockwise angle (in radians, θ ∈ (−π, π]) between the positive x-axis and the vector from the origin to point (x, y), as illustrated in Figure 5b. Notably, while atan2 standardly takes the y-coordinate as its first argument, Equation (1) uses the x-increment as the first term and the y-increment as the second. Swapping the arguments in this way measures the angle clockwise from the positive y-axis (north) rather than counterclockwise from the positive x-axis, which matches the compass convention of the heading parameter. To convert the output to degrees, the result is scaled by 180/π. The atan2 function is natively supported in most programming languages (e.g., Python 3.11). This study implemented the calculation in Python, which was also used for subsequent street view image retrieval.
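As an illustration of Equation (1), the calculation might look like the sketch below. The function name `azimuth_deg` is hypothetical. Two assumptions go beyond the text: the coordinates are treated as planar (projected) x and y values, and the result is wrapped from (−180°, 180°] into [0°, 360°) so that it can be used directly as a heading.

```python
import math


def azimuth_deg(xa, ya, xb, yb):
    """Compass azimuth of the step A -> B, in degrees clockwise from north.

    Follows Equation (1): the x-increment is passed first and the
    y-increment second, so the angle is measured from the +y axis.
    """
    angle = math.degrees(math.atan2(xb - xa, yb - ya))
    # Assumption (not stated in the text): wrap negative angles into [0, 360).
    return angle % 360.0


# Four unit steps from the origin, one per cardinal direction.
for name, (xb, yb) in {"north": (0, 1), "east": (1, 0),
                       "south": (0, -1), "west": (-1, 0)}.items():
    print(name, azimuth_deg(0, 0, xb, yb))
```

For the westward step, atan2 alone gives −90°, which the modulo maps to 270°.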

3.4. Image Segmentation

Following the acquisition of street view imagery, the data were processed using the MP-Former model [29], a universal image segmentation framework capable of addressing diverse segmentation tasks, including panoptic, instance, and semantic segmentation. MP-Former builds upon the foundation of the Mask2Former model (Masked-attention Mask Transformer) [31] by incorporating a mask-piloted (MP) training strategy. This enhancement enables MP-Former to achieve superior segmentation accuracy compared to Mask2Former.

The semantic segmentation performance of MP-Former was assessed using the Cityscapes dataset [32], a large-scale benchmark designed for the analysis of urban street scenes. Employing Swin-L as its backbone, MP-Former achieved a mean intersection over union (mIoU) score of 83.9, surpassing Mask2Former by 0.6 [29]. In this study, the official implementation of MP-Former with the Swin-L backbone was trained on the Cityscapes dataset, with vegetation and terrain classes selected to represent greenery for calculating GVI. Figure 6 illustrates the semantic segmentation outcomes on sample images. Although minor segmentation errors are observed, the trees, grass, and plants are consistently and accurately delineated.

The accuracy of greenery extraction was evaluated using a randomly selected sample of 50 street view images. Ground truth data were manually annotated via Photoshop, serving as a benchmark for comparison. A Pearson correlation coefficient of 0.903 (p = 0.027) was observed between the ground truth green ratios and those estimated by MP-Former, demonstrating that MP-Former achieves a reliable level of accuracy in quantifying greenery.
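The agreement check described above could be reproduced along the following lines. This sketch computes only the Pearson coefficient, not the p-value, and the green ratios in the example are made-up placeholders rather than data from the study. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical green ratios (%): manual annotation vs. model output.
manual = [12.0, 30.5, 8.2, 55.1, 21.7]
model = [13.1, 28.9, 9.0, 52.4, 24.0]
print(pearson_r(manual, model))
```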

3.5. FFGVI and SCGVI Calculation

The GVI [17], the modified GVI [23], and the PVGVI [24] all aim to describe the greenery visible at each sampling point as comprehensively as possible. However, pedestrians typically orient their gaze forward as they travel along roads, primarily perceiving the street environment through the front-facing view. To capture this perspective, this study proposes the Front-Facing Green View Index (FFGVI), a metric specifically designed to quantify street greenery as observed from the pedestrian’s line of sight while moving along a road. The FFGVI is derived using the front-facing street-view image, as illustrated in Figure 7a, and is computed using Equation (2).

$$\mathrm{FFGVI} = \frac{\mathrm{Area}_{\mathrm{green}}}{\mathrm{Area}_{\mathrm{total}}} \times 100\% \qquad (2)$$

where $\mathrm{Area}_{\mathrm{green}}$ is the total number of green pixels in the front-facing image, and $\mathrm{Area}_{\mathrm{total}}$ is the total number of pixels in that image. By comparison, GVI is the average of the green ratios of the four images taken in the four directions (north, east, south, and west), as shown in Figure 7c.

To further emphasize the shading effect of street trees on the sky, this study introduces the SCGVI (Street Canyon Green View Index) metric, which is derived from the GVI of the upper half of the front-facing street view image in Figure 7b. The upper half of such images captures the morphology of the street canyon [33]. SCGVI is specifically designed to identify boulevards, as the upper portion of the image predominantly features the sky. By focusing on this region, SCGVI more effectively reflects the extent to which tree canopies obscure the sky, thereby serving as an indicator of boulevard characteristics. Higher SCGVI values correspond to more greenery and reduced sky visibility, which are associated with enhanced enclosure and favorable perception [18]. SCGVI is computed using Equation (3).

$$\mathrm{SCGVI} = \frac{\mathrm{Area}_{\mathrm{green\_half}}}{\mathrm{Area}_{\mathrm{total\_half}}} \times 100\% \qquad (3)$$

where $\mathrm{Area}_{\mathrm{green\_half}}$ is the total number of green pixels in the upper half of the front-facing image, and $\mathrm{Area}_{\mathrm{total\_half}}$ is the total number of pixels in the upper half of the image.
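Equations (2) and (3), together with the four-direction GVI average, can be sketched on a segmentation label mask as follows. This is a minimal illustration under stated assumptions: masks are plain lists of rows of integer class IDs, the green classes use the Cityscapes training IDs 8 (vegetation) and 9 (terrain), and for an odd number of rows the upper half is taken with floor division. The function names are hypothetical, and a real pipeline would more likely operate on arrays.

```python
# Assumption: Cityscapes training IDs for vegetation (8) and terrain (9).
GREEN_CLASS_IDS = {8, 9}


def green_ratio(label_rows):
    """Percentage of pixels whose class ID is in GREEN_CLASS_IDS."""
    total = sum(len(row) for row in label_rows)
    if total == 0:
        raise ValueError("empty mask")
    green = sum(1 for row in label_rows for v in row if v in GREEN_CLASS_IDS)
    return 100.0 * green / total


def ffgvi(front_labels):
    """Equation (2): green ratio of the whole front-facing image."""
    return green_ratio(front_labels)


def scgvi(front_labels):
    """Equation (3): green ratio of the upper half of the front-facing image."""
    upper_half = front_labels[: len(front_labels) // 2]
    return green_ratio(upper_half)


def gvi(direction_masks):
    """Mean green ratio over the masks of the four directional images."""
    return sum(green_ratio(m) for m in direction_masks) / len(direction_masks)


# Toy 4x4 mask: 8 = vegetation, 9 = terrain, 10 = sky, 0 = road, 2 = building.
front = [
    [8, 8, 10, 10],
    [8, 10, 10, 10],
    [0, 0, 9, 2],
    [0, 0, 0, 2],
]
print(ffgvi(front), scgvi(front))
```

In the toy mask, 4 of 16 pixels are green overall, while 3 of the 8 upper-half pixels are green, so the two indices differ even though they come from the same image.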

3.6. Evaluation of FFGVI and SCGVI in Comparison with GVI

To comprehensively assess FFGVI and SCGVI, this study analyzed the data distributions and correlations among GVI, FFGVI, and SCGVI. Given the relatively open field of view provided by the front-facing street view images used for FFGVI, along with the GVI calculation formula, this study hypothesizes that the traditional GVI calculation based on four-directional images is better suited for locations with open views, such as parks, green spaces, and road intersections. Conversely, FFGVI may be more appropriate for urban streets where the field of view is constrained. To facilitate the systematic comparison, road intersections and sampling points along streets were selected as representative scenes for GVI and FFGVI, respectively. Additionally, GVI was decomposed into its four directional components, and their variations were examined and compared across the two scenes.

Using the SCGVI values from all sampling points, the SCGVI values for the corresponding road segments were derived through spatial joining. The SCGVI value for each road was calculated as the mean SCGVI of all sampling points along that segment. Roads were classified as boulevards based on a predefined SCGVI threshold (e.g., 70%), such that a road was designated as a boulevard if its SCGVI exceeded this threshold. This process was applied to all roads within Nanjing to identify the city’s boulevards.
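The aggregation and thresholding step might be sketched as below. It assumes the spatial join has already been done, so that each sampling point arrives as a `(road_id, scgvi)` pair. The function names, the road identifiers, and the values are hypothetical, and the 70% default simply mirrors the example threshold mentioned in the text.

```python
from collections import defaultdict


def road_mean_scgvi(point_records):
    """Mean SCGVI per road from (road_id, scgvi) pairs."""
    sums = defaultdict(lambda: [0.0, 0])  # road_id -> [running sum, count]
    for road_id, value in point_records:
        sums[road_id][0] += value
        sums[road_id][1] += 1
    return {road_id: total / count for road_id, (total, count) in sums.items()}


def find_boulevards(point_records, threshold=70.0):
    """Road IDs whose mean SCGVI strictly exceeds the threshold."""
    means = road_mean_scgvi(point_records)
    return sorted(road_id for road_id, mean in means.items() if mean > threshold)


# Hypothetical sampling points already joined to their road segments.
records = [("A", 90.0), ("A", 80.0), ("B", 10.0), ("B", 30.0),
           ("C", 70.0), ("C", 70.0)]
print(road_mean_scgvi(records))
print(find_boulevards(records))
```

Road C has a mean of exactly 70 and is therefore not selected, since the text designates a boulevard only when the SCGVI exceeds the threshold.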

4. Results

4.1. Distribution and Correlation Between Different GVIs

The distribution of the GVIs and the correlation between them for all sampling points is shown in Figure 8. GVI values predominantly range from 5 to 35, exhibiting a moderate right skewness (skewness = 0.63). In comparison, the FFGVI distribution is more right-skewed (skewness = 0.89) with values primarily between 1 and 20. The SCGVI distribution shows an even greater right skewness (skewness = 1.14), with most SCGVI values remaining below 5. This is expected, as SCGVI represents the GVI of the upper half of front-facing street view images, which are typically dominated by elements such as the sky, buildings, and traffic signs, thereby limiting higher SCGVI values. However, the SCGVI distribution features a long tail, with elevated values clustering between 60 and 100. Additionally, both FFGVI and GVI exhibit higher values ranging from 60 to 80, with only a few instances exceeding 80.

The analysis reveals a strong positive correlation between FFGVI and GVI, with a correlation coefficient (R) of 0.88. The correlation density plot indicates that the majority of sampling points exhibit GVI values within the range of 0–45 and FFGVI values within 0–40. The GVI values generally exceed their corresponding FFGVI values. Notably, a total of 89,566 points (66.94% of all points) were observed with GVI greater than FFGVI.

The correlation between SCGVI and GVI exhibits the lowest correlation coefficient (R = 0.78) among the analyzed indices. This discrepancy arises from the methodological differences in their computation: SCGVI is derived solely from the upper half of a single front-facing image, whereas GVI integrates data from four images captured in different orientations. The correlation density plot reveals that, for a substantial proportion of sampling points, SCGVI values range from 0 to 20, while the corresponding GVI values lie between 0 and 30. Notably, SCGVI values tend to be slightly lower than their GVI counterparts. Furthermore, a small subset of sampling points exhibits high SCGVI values (60–100) despite relatively low GVI values (20–80).

The correlation between SCGVI and FFGVI exhibits a strong relationship, with a coefficient of 0.91. This high correlation can be attributed to the fact that both indices are derived from front-facing street view images, and the primary distinction lies in SCGVI’s focus on the upper portion of the image. The correlation density plots further illustrate that a substantial proportion of the points have SCGVI values ranging from 0 to 30, while FFGVI values span from 0 to 40. In general, SCGVI values tend to be slightly lower than their FFGVI counterparts. Notably, for a smaller subset of data points with higher SCGVI values (ranging from 60 to 100), the corresponding FFGVI values are relatively lower (ranging from 30 to 70).

The patterns and variations of the three GVIs for three sampling points are presented in Figure 9. Figure 9a depicts the GVIs for a sampling point, where the GVI and FFGVI values are approximately 14%, with the SCGVI significantly lower, near 0%. This is primarily attributed to the relatively wide street at this location; despite the presence of street trees on both sides, the trees are neither tall nor dense, allowing for considerable visibility of the sky in all four directional images. Figure 9b illustrates the GVIs for another sampling point, where the GVI is 17.39%, while both the SCGVI and FFGVI are close to 0%. This point is situated at an intersection, and the front-facing image predominantly captures the intersection with minimal tree cover. However, denser tree canopies are observed on both sides of the road, leading to significant discrepancies between the three GVIs. Figure 9c displays the GVIs for a third sampling point, where the SCGVI is 94.57%, the FFGVI is 77.98%, and the GVI is 80.29%. Here, the taller and denser row of trees on both sides of the road contributes to high values for all three GVIs. Notably, the tree density is greater in the front-facing image, while the back image (the east oriented image) reveals more sky, resulting in a marked difference between the FFGVI (77.98%) and the GVI of the back image (63.08%).

The three GVIs for different street canyon geometries are shown in Figure 10. The classification of street canyon geometries was adapted from Hu’s research [33]. As illustrated in the figures, both FFGVI and SCGVI effectively capture street greenery exposure during pedestrian movement across various street canyon geometries, including (a) deep street canyons and (b) shallow street canyons. In Figure 10b, due to the shallow street canyon, the sky occupies a larger proportion of the view, resulting in a relatively low SCGVI (2.82%). For general intersections (Figure 10c), where traffic conditions are more complex, GVI represents the surrounding greenery better than FFGVI and SCGVI. The points in Figure 10d,e are both located under a viaduct, but (d) is an intersection while (e) is not. Intriguingly, despite the viaduct obscuring a substantial portion of the sky, (e) exhibits remarkably high FFGVI and SCGVI values (FFGVI = 29.72%, SCGVI = 36.78%). This can be attributed to tree branches extending upwards around the viaduct, creating effective shading. For street view images taken on the viaduct (Figure 10f), SCGVI is 0%, but due to the presence of tall trees beneath the viaduct, both FFGVI and GVI remain around 9%.

4.2. Comparison of GVIs for Intersections and Road Points

Road intersections and sampling points along roads were selected as representative scenes for GVI and FFGVI, respectively. The GVI was decomposed into four directional components, and their variations were analyzed for the two scenes. For the road intersections, 30 locations were randomly chosen, with the intersections positioned along roadways aligned to the east-west and north-south axes. The four directional GVIs, for 0° (north), 90° (east), 180° (south), and 270° (west), and the mean GVI were calculated. The results are plotted in Figure 11a, where each blue line indicates the relationship between the four orientation GVIs and the mean GVI. For the road sampling points, 30 locations were randomly selected from the midsections of roads, all situated at least 50 m from any intersection. Using azimuthal angles, street view images were captured from the front, back, left, and right orientations. The four directional GVIs and the mean GVI were subsequently computed for each point, as shown in Figure 11b.

Figure 11a shows that, at road intersections, the correlation between each of the four directional GVIs and the mean GVI is weak. This suggests that, at intersections characterized by open views, no single direction is representative, and the four-direction mean GVI is the appropriate summary of greenery at these locations. Conversely, Figure 11b reveals that, for sampling points along the roads, the relationship between the front and back GVIs and the mean GVI is nearly linear, indicating a strong correlation. In contrast, the lines for the left and right GVIs are steeper, suggesting a weaker correlation. This is mainly because the front and back images face the same road and are relatively similar, albeit in opposite directions, whereas the left and right images differ more from each other.

4.3. The Variations of the Four Orientation GVIs

To elucidate the spatial variation of GVIs across different orientations on a single road, 20 sampling points were selected along Hongwu Road. At each point, the GVIs for the four orientations (front, back, left, and right) were computed, along with the mean GVI, as illustrated in Figure 12. Statistical metrics, including the mean, standard deviation, and coefficient of variation (CV), were then derived for each of these indices and are presented in Table 1. The CV, defined as the ratio of the standard deviation to the mean, quantifies the relative variability of the data distribution. As shown in Table 1, the CVs for the front and back GVIs are comparatively low, both being less than 0.35, indicating limited variation. In contrast, the left and right GVIs exhibit higher CVs, exceeding 0.5, suggesting greater variability. Notably, the overall GVI also demonstrates relatively low variability, reinforcing its robustness as a general indicator. These trends are further corroborated by the curve plots in Figure 12, which visually represent the variations across the four orientations. Although these observations were made on a single road, similar patterns were observed on other roads. The pronounced variability in the left and right GVIs is largely attributable to the specific locations of the sampling points, while the more consistent front GVI suggests that FFGVI may offer a more stable and potentially more reliable metric for assessing street greenery than the left and right GVIs.
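The CV defined above is straightforward to compute. The sketch below uses the population standard deviation, which is an assumption, since the text does not say whether the population or sample form was used. The values are made-up placeholders rather than the Hongwu Road measurements, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / mean


# Hypothetical directional GVIs (%) at a handful of points along one road.
front_gvi = [30.0, 32.0, 28.0, 31.0, 29.0]
left_gvi = [5.0, 60.0, 12.0, 45.0, 8.0]
print(coefficient_of_variation(front_gvi))
print(coefficient_of_variation(left_gvi))
```

With these placeholder values the front series gives a much smaller CV than the left series, which is the kind of contrast the comparison in Table 1 is meant to expose.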

4.4. Spatial Distribution of Boulevards

Boulevards are defined as roads exhibiting high SCGVI values, with those exceeding 90 specifically highlighted in Figure 13. Ten such roads are identified by circled numbers within the figure, and detailed information regarding these roads is provided in Table 2. These roads are predominantly situated in the western and central regions of Nanjing, with fewer located in the southern and northern parts of the city. The SCGVI values for these roads range between 94 and 98. These boulevards typically feature narrow cross-sections, generally comprising two lanes in each direction, and are characterized by the presence of tall street trees that provide substantial canopy cover. Notably, several of these roads (➀, ➄, ➅, ➈, and ➉) pass through forested parks, where the roadside vegetation includes a variety of plant species beyond trees. In contrast, other roads (➁, ➂, ➃, ➆, and ➇) are located in urbanized areas.

Streets exhibiting higher values of FFGVI tend to be narrower and feature a denser distribution of street trees. Conversely, streets with high SCGVI values not only display a greater density of street trees but also require these trees to grow in a manner that directs their canopy towards the center of the roadway to maximize overhead shading. This growth pattern necessitates specific tree species, such as London plane (Platanus × acerifolia (Aiton) Willd.) [34]. The London plane is a symbolic tree species in Nanjing and is known as the “city card”.

The boulevards in Nanjing provide pedestrians with an immersive experience of natural landscapes. Among the city's tree-lined roads, Lingyuan Road stands out as the most renowned thoroughfare. Its SCGVI score of 70.4 falls below the threshold of 90, so it does not appear in Table 2. The road is characterized by rows of London plane trees and is situated within the Sun Yat-sen Mausoleum, a historically significant site. Our findings further reveal that Lingyuan Road is part of a broader network of boulevards in Nanjing, each offering diverse opportunities for engagement with the city’s natural and cultural heritage.
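The threshold rule used in this section can be sketched as a simple filter over road-level SCGVI values. This is only an illustration: the first three scores are taken from Table 2, the Lingyuan Road score is the one quoted in the text, and the function name and data layout are assumptions. How point-level SCGVI values are aggregated to a road-level score is not restated here, so the sketch starts from road-level values.

```python
BOULEVARD_THRESHOLD = 90.0  # SCGVI cut-off used for the roads shown in Figure 13

# Road-level SCGVI values (three rows of Table 2 plus Lingyuan Road from the text).
road_scgvi = {
    "Bo ai east road": 97.69,
    "Jin jiang road": 97.45,
    "Weng zhong road": 94.64,
    "Lingyuan Road": 70.4,
}


def select_boulevards(scores, threshold=BOULEVARD_THRESHOLD):
    """Return (name, score) pairs whose SCGVI exceeds the threshold, highest first."""
    selected = [(name, score) for name, score in scores.items() if score > threshold]
    return sorted(selected, key=lambda item: item[1], reverse=True)


boulevards = select_boulevards(road_scgvi)
for name, score in boulevards:
    print(f"{name}: SCGVI = {score}")
```

With these inputs, the three Table 2 roads are retained and Lingyuan Road is excluded because its score is below 90.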

5. Discussion

5.1. Measuring Street Greenery from the Front-Facing Street View Images

This study introduces two distinct GVI metrics, namely FFGVI and SCGVI, derived from the front-facing street view images. The FFGVI is a human-centered metric that better aligns with the GVI perceived by individuals as they traverse urban streets. In contrast, the SCGVI is designed to classify boulevards, particularly those that foster a sense of enclosure and aesthetic appeal, while also mitigating solar glare for pedestrians [35] and reducing solar radiation on the ground [30,36,37].
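The two metrics can be illustrated with a minimal sketch that operates on a semantic segmentation label mask: FFGVI as the share of green pixels in the whole front-facing image, and SCGVI as the share of green pixels in its upper half (as described in the caption of Figure 7). The label ID for vegetation, the list-of-lists mask format, and the function names are assumptions made for this example, and the set of classes counted as greenery may differ in the actual pipeline.

```python
from typing import List

VEGETATION_ID = 8  # assumption: Cityscapes-style train ID for "vegetation"


def green_ratio(mask: List[List[int]], green_id: int = VEGETATION_ID) -> float:
    """Percentage of pixels in `mask` that carry the greenery label."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    green = sum(row.count(green_id) for row in mask)
    return 100.0 * green / total


def ffgvi(mask: List[List[int]]) -> float:
    """Green share of the full front-facing image."""
    return green_ratio(mask)


def scgvi(mask: List[List[int]]) -> float:
    """Green share of the upper half of the front-facing image."""
    upper_half = mask[: len(mask) // 2]
    return green_ratio(upper_half)


# Toy 4 x 4 mask: 8 = vegetation, 10 = sky, 0 = road (hypothetical labels).
toy_mask = [
    [8, 8, 10, 8],
    [8, 8, 8, 8],
    [8, 0, 0, 8],
    [0, 0, 0, 0],
]

print(ffgvi(toy_mask))  # 9 of 16 pixels are green
print(scgvi(toy_mask))  # 7 of 8 pixels in the upper half are green
```

In the toy mask the canopy dominates the upper half, so SCGVI is higher than FFGVI, which is the pattern expected for a tree-covered street canyon.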

5.2. FFGVI, SCGVI, and GVI

The correlation analysis presented in Section 4.1 reveals that, for the majority of roads, the FFGVI and SCGVI values are lower than those of GVI. This discrepancy can be attributed to the nature of the front-facing street view images used in the FFGVI and SCGVI calculations, which look along the road and therefore cover a deep field of view. These images capture a wider expanse of the road, sky, and distant trees, resulting in a perspective effect whereby trees farther from the viewer occupy fewer pixels. Consequently, both FFGVI and SCGVI tend to yield lower values. In contrast, the GVI metric is derived from street view images captured in all four directions, including the two that face the sides of the road. These side-facing images generally capture a larger number of trees that are closer to the viewer, thus occupying more pixels and contributing to higher GVI values. This observation aligns with findings from Aikoh [38]. The phenomenon whereby the area occupied by trees in an image is influenced by their proximity to the camera has been termed the proximity-to-camera effect [7]. Therefore, GVI may overestimate the actual green visibility encountered by individuals traveling along the road on a daily basis.
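The proximity-to-camera effect can be illustrated with a simple pinhole-camera approximation, in which the projected size of an object scales with the inverse of its distance and its pixel area with the inverse square. This is a simplified sketch, not the imaging model of the street view service: the focal length, crown size, and distances below are hypothetical, and the tree crown is treated as a flat rectangle facing the camera.

```python
def projected_pixel_area(width_m: float, height_m: float,
                         distance_m: float, focal_px: float) -> float:
    """Approximate pixel area of a flat, camera-facing rectangle (pinhole model)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    scale = focal_px / distance_m  # pixels per metre at this distance
    return (width_m * scale) * (height_m * scale)


FOCAL_PX = 500.0      # hypothetical focal length in pixels
CROWN_SIZE_M = 6.0    # hypothetical crown width and height in metres

# A roadside tree seen in a side-facing image (close to the camera).
near_area = projected_pixel_area(CROWN_SIZE_M, CROWN_SIZE_M, 5.0, FOCAL_PX)
# The same tree seen down the road in a front-facing image (farther away).
far_area = projected_pixel_area(CROWN_SIZE_M, CROWN_SIZE_M, 20.0, FOCAL_PX)

print(near_area, far_area, near_area / far_area)
```

Under these assumptions, quadrupling the distance reduces the crown's pixel area by a factor of 16, which is consistent with side-facing images contributing more green pixels per tree than front-facing ones.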

Previous research has demonstrated that front-facing street view images can yield superior results in certain applications. Seiferling utilized Google Street View images to estimate perceived tree cover, finding that “when the camera heading aligned with the road heading, the tree cover predictor performed best compared to the three other road-to-image orientations” [7]. Seiferling suggested that road orientation typically provides a comprehensive view of street trees along both sides of the road and that future data collection could benefit from aligning the camera heading accordingly. Similarly, GVI could be more accurately determined by employing front-facing images to calculate GVI along the optimal path [13]. Additionally, sensitivity analyses affirmed that calculating environmental indices such as GVI, the Building View Index (BVI), and the Sky View Index (SVI) yielded greater accuracy when images were oriented towards the front or back, rather than the sides [8].

The comparison experiments in Section 4.2 indicate that, for road intersections, the four directional GVIs differ substantially and are only weakly correlated. This suggests that the GVI is more effectively used to characterize greenery visible from intersections with unobstructed views. In contrast, for urban streets and street canyons, stronger correlations are observed between the FFGVI, back GVI, and GVI, making the FFGVI a more appropriate metric for representing greenery. As shown in Table 1, the left and right GVIs vary substantially along the same road, which can be attributed to the limited field of view in these directions. Only a small subset of objects is visible from each point, leading to significant disparities between what is observed from different sampling points on the same road. The left and right GVIs are thus influenced by the specific locations of sampling points, whereas FFGVI is less sensitive to this because of its broader field of view.

Figure 14a,b present the sampling points on the same road, with identical sampling intervals but varying locations. Notably, the right-facing image in Figure 14a is devoid of trees, whereas trees are present in Figure 14b, resulting in a significant disparity in GVI values between the two images. However, this difference is less pronounced when examining FFGVI. The sampling points in this case are situated on an internal road within a residential area characterized by sparse tree cover along both sides. While this example may represent an extreme scenario, since urban roads generally exhibit higher tree densities, it effectively demonstrates how the impact of sampling location can be mitigated by using FFGVI. The influence of sampling point location on GVI has not been extensively investigated, with potential contributing factors including the average spacing between trees, road width, and other contextual variables. Further research is warranted to explore these factors in detail.

This study also observed that GVI is notably susceptible to occlusion by large vehicles, such as trucks and buses, in both left-facing and right-facing images. As illustrated in Figure 15, a bus was present in the left-facing street view image at this sampling point. Given its proximity to the data collection vehicle, the bus obstructed a significant portion of the surrounding greenery, resulting in a lower GVI at this location than would be obtained with no large vehicle present. In contrast, the front-facing street view images have a more open field of view, so such large vehicles have less impact on FFGVI. To our knowledge, this is the first analysis of the impact of large vehicles on GVI.

The calculation of the FFGVI and SCGVI requires only the urban road network as input, from which the program automatically determines the azimuth of each sampling point, offering a flexible and efficient approach with minimal human intervention. In contrast to traditional methods, which typically require four street view images from different directions for GVI computation at each sampling point, this method requires only a single front-facing image. Consequently, the volume of image data is reduced to one quarter, which lowers data storage requirements and also speeds up subsequent processes, such as semantic segmentation and GVI calculation [12], streamlining the overall workflow.

5.3. Differences from Other Methods

Wang derives the azimuths of sampled points from the image metadata and uses these to compute solar radiation from panoramic images [30]. However, extracting metadata from a large number of images necessitates numerous network requests, making this approach less efficient. In contrast, our method directly obtains the front-facing images based on azimuth, offering a more streamlined and convenient solution. Sánchez addressed the challenges posed by the irregular orientations and angles of Mapillary images by developing a complex image processing algorithm designed to detect road centers from semantically segmented images [39]. Although images containing road centers bear resemblance to the front-facing street view images, their method requires preprocessing steps such as semantic segmentation, thereby introducing additional complexity into the workflow.

5.4. Methodology Extensions

This study uses BSV as the primary data source for acquiring the front-facing street view images. BSV covers mainland China, but the approach is equally applicable to GSV, which offers broader global coverage. During GSV image retrieval, the azimuth can be passed directly as the heading parameter, ensuring that the image is acquired in the specified direction [40].

Furthermore, certain applications require street view images of both sides of the road, such as the examination of sidewalks [26,40,41] and of stores and buildings along the street [42,43,44]. In these cases, it is only necessary to add 90 and 270 degrees to the azimuth calculated by our method, wrap the results into the range 0–360 degrees, and set the heading parameter accordingly.
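The heading logic described in this section can be sketched as follows. The sketch assumes planar (projected) coordinates with x pointing east and y pointing north, computes the road azimuth clockwise from north with atan2 (the function shown in Figure 5), and takes the two side views as the azimuth plus 90 and 270 degrees, wrapped into the range 0–360. The function names are illustrative, and the resulting values are what would be supplied as the heading parameter of the street view request.

```python
import math


def road_azimuth(x1: float, y1: float, x2: float, y2: float) -> float:
    """Azimuth of the segment (x1, y1) -> (x2, y2), in degrees clockwise from north.

    Assumes planar coordinates with x = east and y = north.
    """
    if x1 == x2 and y1 == y2:
        raise ValueError("the two points must differ")
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 360.0


def view_headings(azimuth: float) -> dict:
    """Headings for the front, right, back, and left views, each within [0, 360)."""
    front = azimuth % 360.0
    return {
        "front": front,
        "right": (front + 90.0) % 360.0,
        "back": (front + 180.0) % 360.0,
        "left": (front + 270.0) % 360.0,
    }


# A road segment running due west: the front view faces 270 degrees.
azimuth = road_azimuth(0.0, 0.0, -1.0, 0.0)
print(azimuth)
print(view_headings(azimuth))
```

For geographic coordinates in degrees, a great-circle bearing formula would be needed instead of the planar atan2 call. The wrap-around step stays the same.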

5.5. Recommendations for Planning Practice and Urban Governance

FFGVI and SCGVI provide valuable metrics and tools for quantifying urban greenery and monitoring street-level greening initiatives. These indices enable urban planners to identify streets with higher or lower greenery levels, thereby facilitating targeted interventions to enhance urban green spaces. Given its advantages over the standard four-direction approach, the FFGVI presents a viable alternative to the standard GVI for urban greenery planning. Both indices also serve as effective tools for assessing the visibility of urban vegetation and evaluating the visual impacts of urban forest management practices. By integrating network flow algorithms, these indices can also support route planning, offering the most scenic paths between locations [13]. Specifically, SCGVI-identified boulevards can be recommended as travel routes for tourists seeking natural landscapes. Collectively, FFGVI and SCGVI present significant potential for advancing urban planning, greening strategies, and tourism management.
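One way to picture greenery-aware route planning is a shortest-path search in which greener road segments are cheaper to traverse. The sketch below uses Dijkstra's algorithm rather than a network flow formulation, and the cost function (segment length scaled by one minus FFGVI/100) is a hypothetical choice for illustration, not the method of [13]. The toy graph and its values are invented.

```python
import heapq


def greenest_path(graph, source, target):
    """Dijkstra search over {node: [(neighbor, length_m, ffgvi_percent), ...]}.

    Edge cost is length_m * (1 - ffgvi / 100), a hypothetical weighting that
    makes greener segments cheaper. Returns the node list or None.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length_m, ffgvi in graph.get(node, []):
            candidate = dist + length_m * (1.0 - ffgvi / 100.0)
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1]


# Toy network: the route via B is shorter, the route via C is greener.
toy_graph = {
    "A": [("B", 400.0, 10.0), ("C", 500.0, 60.0)],
    "B": [("D", 400.0, 10.0)],
    "C": [("D", 500.0, 60.0)],
}

print(greenest_path(toy_graph, "A", "D"))
```

In the toy network the longer but greener route through C is preferred, since its weighted cost (400) is lower than that of the shorter route through B (720).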

5.6. Limitations

In this study, the greenery visible in the front-facing street view images is used as the GVI experienced by individuals traveling along the road. However, previous research has indicated that street view images, such as BSV and GSV, which are captured by specialized collection vehicles traveling within the road lane, differ from what pedestrians observe from the sidewalk, particularly on wider roads [45]. Therefore, the FFGVI and SCGVI proposed in this study are more consistent with the GVI observed when people drive on the roads, and a subtle difference remains between the FFGVI and the GVI observed by people strolling on the sidewalk.

Existing semantic segmentation algorithms, which are constrained by their training datasets, tend to aggregate large leaves with gaps as a unified patch of greenery. This tendency is particularly pronounced when calculating the SCGVI. For instance, when a street tree obscures the sky but allows some gaps between the leaves where the sky is visible, the MP-Former algorithm may erroneously classify the entire area, including the gaps, as continuous greenery. This can lead to inflated SCGVI values, with some sampling points reaching 100%. Upon inspection, however, it becomes evident that portions of the sky are still visible, indicating that current semantic segmentation techniques may overestimate the true GVI.

6. Conclusions

This study uses front-facing street view images to assess urban greenery and introduces two novel green visibility metrics: FFGVI and SCGVI. These metrics are evaluated against the GVI. FFGVI exhibited a strong correlation with GVI (R = 0.88), whereas the association between SCGVI and GVI was marginally weaker (R = 0.78). Our findings suggest that GVI may overestimate the actual greenery visible to pedestrians on a daily basis. While GVI is appropriate for areas with unobstructed views, such as parks and intersections, FFGVI is appropriate for urban streets where the field of view is constrained. GVI is less suitable for street-level analysis because of the significant variability in the left and right GVI values, which are influenced by the limited sightlines of street environments. In contrast, both FFGVI and SCGVI are human-centered metrics that more accurately reflect the greenery experienced by individuals traveling along the street. These metrics can also mitigate the effects of sampling point locations and large vehicles on GVI. Notably, SCGVI is capable of identifying boulevards where the tree canopy provides a shading effect, creating an enclosed streetscape that fosters a sense of comfort, positivity, and memorability for pedestrians.

Using the approach outlined in this study, the required volume of street view data is reduced to only a quarter of that needed for standard GVI assessments. This reduction in data volume significantly enhances the efficiency of both semantic segmentation and GVI computation. FFGVI and SCGVI offer valuable tools for the management and monitoring of urban street greenery, as well as for evaluating the effectiveness of urban greenery management practices. Additionally, the boulevards identified through SCGVI can be leveraged to optimize travel route recommendations.

Author Contributions

Conceptualization, Jin Zhu and Jinglong Du; methodology, Ziyue Cao; validation, Yue Zhang; formal analysis, Jinglong Du; investigation, Jin Zhu; resources, Jin Zhu; writing—original draft preparation, Jin Zhu; writing—review and editing, Yingjing Huang, Yuan Ding, and Jinglong Du; supervision, Jin Zhu. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jiangsu Province Industry-University-Research Cooperation Program, grant number BY20221316, National Natural Science Foundation of China (No. 42471488), and National Key R&D Program of China (No. 2021YFE0112300).

Data Availability Statement

OSM data: https://www.openstreetmap.org (accessed on 9 December 2024); street view imagery: https://map.baidu.com/ (accessed on 9 December 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

GVI	Green View Index
FFGVI	Front-Facing Green View Index
SCGVI	Street Canyon Green View Index

References

1. Wolf, K.L. Business District Streetscapes, Trees, and Consumer Response. J. For. 2005, 103, 396–400.
2. Houlden, V.; Weich, S.; Porto de Albuquerque, J.; Jarvis, S.; Rees, K. The Relationship between Greenspace and the Mental Wellbeing of Adults: A Systematic Review. PLoS ONE 2018, 13, e0203000.
3. Ki, D.; Lee, S. Analyzing the Effects of Green View Index of Neighborhood Streets on Walking Time Using Google Street View and Deep Learning. Landsc. Urban Plan. 2021, 205, 103920.
4. Helbich, M. Using Deep Learning to Examine Street View Green and Blue Spaces and Their Associations with Geriatric Depression in Beijing, China. Environ. Int. 2019, 126, 107–117.
5. Aoki, Y. Evaluation Methods for Landscapes with Greenery. Landsc. Res. 1991, 16, 3–6.
6. Bargh, J.A.; Chen, M.; Burrows, L. Automaticity of Social Behavior: Direct Effects of Trait Construct and Stereotype Activation on Action. J. Personal. Soc. Psychol. 1996, 71, 230.
7. Seiferling, I. Green Streets − Quantifying and Mapping Urban Trees with Street-Level Imagery and Computer Vision. Landsc. Urban Plan. 2017, 165, 93–101.
8. Biljecki, F.; Zhao, T.; Liang, X.; Hou, Y. Sensitivity of Measuring the Urban Form and Greenery Using Street-Level Imagery: A Comparative Study of Approaches and Visual Perspectives. Int. J. Appl. Earth Obs. Geoinf. 2023, 122, 103385.
9. Kim, J.H.; Lee, S.; Hipp, J.R.; Ki, D. Decoding Urban Landscapes: Google Street View and Measurement Sensitivity. Comput. Environ. Urban Syst. 2021, 88, 101626.
10. Ye, Y.; Richards, D.; Lu, Y.; Song, X.; Zhuang, Y.; Zeng, W.; Zhong, T. Measuring Daily Accessed Street Greenery: A Human-Scale Approach for Informing Better Urban Planning Practices. Landsc. Urban Plan. 2019, 191, 103434.
11. Law, S.; Seresinhe, C.I.; Shen, Y.; Gutierrez-Roig, M. Street-Frontage-Net: Urban Image Classification Using Deep Convolutional Neural Networks. Int. J. Geogr. Inf. Sci. 2020, 34, 681–707.
12. Han, Y.; Zhong, T.; Yeh, A.G.O.; Zhong, X.; Chen, M.; Lü, G. Mapping Seasonal Changes of Street Greenery Using Multi-Temporal Street-View Images. Sustain. Cities Soc. 2023, 92, 104498.
13. Zhang, J.; Hu, A. Analyzing Green View Index and Green View Index Best Path Using Google Street View and Deep Learning. J. Comput. Des. Eng. 2022, 9, 2010–2023.
14. Xu, J. Understanding the Nonlinear Effects of the Street Canyon Characteristics on Human Perceptions with Street View Images. Ecol. Indic. 2023, 154, 110756.
15. Zhu, J.; Gong, Y.; Liu, C.; Du, J.; Song, C.; Chen, J.; Pei, T. Assessing the Effects of Subjective and Objective Measures on Housing Prices with Street View Imagery: A Case Study of Suzhou. Land 2023, 12, 2095.
16. Zhu, J.; Liu, C.; Yu, C.; Du, J. An Assessment of How Street View Imagery and Remote-Sensing Data of Green and Blue Spaces Can Explain Variations in Housing Prices: A Case Study in Suzhou, China. Remote Sens. Lett. 2024, 15, 1209–1217.
17. Yang, J.; Zhao, L.; McBride, J.; Gong, P. Can You See Green? Assessing the Visibility of Urban Forests in Cities. Landsc. Urban Plan. 2009, 91, 97–104.
18. Ewing, R.; Handy, S. Measuring the Unmeasurable: Urban Design Qualities Related to Walkability. J. Urban Des. 2009, 14, 65–84.
19. Biljecki, F.; Ito, K. Street View Imagery in Urban Analytics and GIS: A Review. Landsc. Urban Plan. 2021, 215, 104217.
20. Fan, Z.; Zhang, F.; Loo, B.P.Y.; Ratti, C. Urban Visual Intelligence: Uncovering Hidden City Profiles with Street View Images. Proc. Natl. Acad. Sci. USA 2023, 120, e2220417120.
21. Wu, C.; Du, Y.; Li, S.; Liu, P.; Ye, X. Does Visual Contact with Green Space Impact Housing Prices? An Integrated Approach of Machine Learning and Hedonic Modeling Based on the Perception of Green Space. Land Use Policy 2022, 115, 106048.
22. Rzotkiewicz, A.; Pearson, A.L.; Dougherty, B.V.; Shortridge, A.; Wilson, N. Systematic Review of the Use of Google Street View in Health Research: Major Themes, Strengths, Weaknesses and Possibilities for Future Research. Health Place 2018, 52, 240–246.
23. Li, X.; Zhang, C.; Li, W.; Ricard, R.; Meng, Q.; Zhang, W. Assessing Street-Level Urban Greenery Using Google Street View and a Modified Green View Index. Urban For. Urban Green. 2015, 14, 675–685.
24. Xia, Y.; Yabuki, N.; Fukuda, T. Development of a System for Assessing the Quality of Urban Street-Level Greenery Using Street View Images and Deep Learning. Urban For. Urban Green. 2021, 59, 126995.
25. Kumakoshi, Y.; Chan, S.Y.; Koizumi, H.; Li, X.; Yoshimura, Y. Standardized Green View Index and Quantification of Different Metrics of Urban Green Vegetation. Sustainability 2020, 12, 7434.
26. Rui, J. Measuring Streetscape Perceptions from Driveways and Sidewalks to Inform Pedestrian-Oriented Street Renewal in Düsseldorf. Cities 2023, 141, 104472.
27. Gong, F.-Y.; Zeng, Z.-C.; Zhang, F.; Li, X.; Ng, E.; Norford, L.K. Mapping Sky, Tree, and Building View Factors of Street Canyons in a High-Density Urban Environment. Build. Environ. 2018, 134, 155–167.
28. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Computer Vision–ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 833–851. ISBN 978-3-030-01233-5.
29. Zhang, H.; Li, F.; Xu, H.; Huang, S.; Liu, S.; Ni, L.M.; Zhang, L. MP-Former: Mask-Piloted Transformer for Image Segmentation. arXiv 2023, arXiv:2303.07336.
30. Wang, L.; Hou, C.; Zhang, Y.; He, J. Measuring Solar Radiation and Spatio-Temporal Distribution in Different Street Network Direction through Solar Trajectories and Street View Images. Int. J. Appl. Earth Obs. Geoinf. 2024, 132, 104058.
31. Cheng, B.; Misra, I.; Schwing, A.G.; Kirillov, A.; Girdhar, R. Masked-Attention Mask Transformer for Universal Image Segmentation. arXiv 2022, arXiv:2112.01527.
32. Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
33. Hu, C.-B.; Zhang, F.; Gong, F.-Y.; Ratti, C.; Li, X. Classification and Mapping of Urban Canyon Geometry Using Google Street View Images and Deep Multitask Learning. Build. Environ. 2020, 167, 106424.
34. Wang, Y.; Wu, Y.; Sun, Q.; Hu, C.; Liu, H.; Chen, C.; Xiao, P. Tree Failure Assessment of London Plane (Platanus acerifolia (Aiton) Willd.) Street Trees in Nanjing City. Forests 2023, 14, 1696.
35. Li, X. A Novel Method for Predicting and Mapping the Occurrence of Sun Glare Using Google Street View. Transp. Res. Part C Emerg. Technol. 2019, 106, 132–144.
36. Li, X.; Ratti, C. Mapping the Spatio-Temporal Distribution of Solar Radiation within Street Canyons of Boston Using Google Street View Panoramas and Building Height Model. Landsc. Urban Plan. 2019, 191, 103387.
37. Deng, M.; Yang, W.; Chen, C.; Wu, Z.; Liu, Y.; Xiang, C. Street-Level Solar Radiation Mapping and Patterns Profiling Using Baidu Street View Images. Sustain. Cities Soc. 2021, 75, 103289.
38. Aikoh, T.; Homma, R.; Abe, Y. Comparing Conventional Manual Measurement of the Green View Index with Modern Automatic Methods Using Google Street View and Semantic Segmentation. Urban For. Urban Green. 2023, 80, 127845.
39. Sánchez, I.A.V.; Labib, S.M. Accessing Eye-Level Greenness Visibility from Open-Source Street View Images: A Methodological Development and Implementation in Multi-City and Multi-Country Contexts. Sustain. Cities Soc. 2024, 103, 105262.
40. Ki, D.; Chen, Z.; Lee, S.; Lieu, S. A Novel Walkability Index Using Google Street View and Deep Learning. Sustain. Cities Soc. 2023, 99, 104896.
41. Hosseini, M.; Miranda, F.; Lin, J.; Silva, C.T. CitySurfaces: City-Scale Semantic Segmentation of Sidewalk Materials. Sustain. Cities Soc. 2022, 79, 103630.
42. Li, X.; Zhang, C.; Li, W. Building Block Level Urban Land-Use Information Retrieval Based on Google Street View Images. GIScience Remote Sens. 2017, 54, 819–835.
43. Novack, T.; Vorbeck, L.; Lorei, H.; Zipf, A. Towards Detecting Building Facades with Graffiti Artwork Based on Street View Images. ISPRS Int. J. Geo-Inf. 2020, 9, 98.
44. Hou, Y.; Quintana, M.; Khomiakov, M.; Yap, W.; Ouyang, J.; Ito, K.; Wang, Z.; Zhao, T.; Biljecki, F. Global Streetscapes—A Comprehensive Dataset of 10 Million Street-Level Images across 688 Cities for Urban Science and Analytics. ISPRS J. Photogramm. Remote Sens. 2024, 215, 216–238.
45. Ki, D.; Park, K.; Chen, Z. Bridging the Gap between Pedestrian and Street Views for Human-Centric Environment Measurement: A GIS-Based 3D Virtual Environment. Landsc. Urban Plan. 2023, 240, 104873.

Figure 1. (a) GVI is derived from street-view imagery captured in four cardinal directions—north, east, west, and south. The white and blue circle symbolizes the observer’s viewpoint, while the green fan-shaped sectors delineate the corresponding fields of view for each direction. (b) In typical pedestrian behavior, individuals tend to orient their line of sight forward along the direction of the road as they traverse it.

Figure 2. The city of Nanjing and one randomly selected area showing the sampling points of BSV images.

Figure 3. The research framework.

Figure 4. (a) The heading, pitch, and fov parameters in the Baidu map API. (b) The heading value is set to the azimuth to obtain the front-facing street view image.

Figure 5. (a) The azimuth of one point on a road. (b) The atan2 function.

Figure 6. The semantic segmentation results of 4 sample images (a–d) using MP-Former. The (1), (2), and (3) rows represent the original image, the semantic segmentation result, and the semantic segmentation result overlaid on the original image, respectively.

Figure 7. (a) FFGVI is calculated from the front-facing street view image (green box). (b) SCGVI is calculated from the upper half (red box) of the front-facing street view image. (c) GVI is the average of the GVI values of the four images taken in four directions (north, east, south, and west).

Figure 8. Histogram and the correlation between different GVIs. The first column is GVI, the second column is FFGVI, and the third column is SCGVI.

Figure 9. (a–c) Three GVIs of three sampling points; the values in parentheses are the GVI of the image in each direction. (The non-English terms on the traffic signage are the names of the road.)

Figure 10. Three GVIs for different street canyon geometries. (a) deep street canyon (H/W > 1), (b) shallow street canyon (H/W < 0.5), (c) general intersection, (d) intersection under viaduct, (e) non-intersection under viaduct, and (f) on viaduct.

Figure 11. (a) The four orientation GVIs parallel plot for road intersections. (b) The four orientation GVIs parallel plot for road points.

Figure 12. The four orientation GVIs and GVI for 20 road points on Hongwu Road.

Figure 13. Ten boulevards with SCGVI > 90 in Nanjing and their street view images.

Figure 14. The sampling points on the same road, with the same sampling intervals but different locations. (a) There are no trees in the right-facing image. (b) Trees can be seen in the right-facing image.

Figure 15. GVI is susceptible to occlusion by large vehicles: a bus in the left-facing view substantially lowers GVI, while FFGVI is less affected.

Table 1. The statistics of the four orientation GVIs and GVI for 20 road points on Hongwu Road.

Statistics	Front GVI (%)	Back GVI (%)	Left GVI (%)	Right GVI (%)	GVI (%)
Mean	26.206	24.296	19.766	22.695	23.241
Std	9.093	7.358	10.064	12.379	7.250
CV	0.347	0.303	0.509	0.545	0.312

Table 2. The names, lengths, and SCGVI values of ten boulevards in Nanjing.

ID	Road Name	Chinese Name	SCGVI	Length (m)
➀	Bo ai east road	博爱东路	97.69	589
➁	Jin jiang road	锦江路	97.45	2469
➂	He hai road	河海路	97.38	663
➃	Zhong hua road	中华路	97.23	2774
➄	Mian hua di road	棉花堤路	97.20	980
➅	Huan ling road	环陵路	96.86	2439
➆	Hu bin road	湖滨路	96.01	1174
➇	Huan hu road	环湖路	95.42	8285
➈	Bo ai west road	博爱西路	95.33	490
➉	Weng zhong road	翁仲路	94.64	931


© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
