Article

Coupling Hyperbolic GCN with Graph Generation for Spatial Community Detection and Dynamic Evolution Analysis

1
School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
2
Hunan Geospatial Information Engineering and Technology Research Center, Changsha 410018, China
3
School of Geography and Environment, Jiangxi Normal University, Nanchang 330022, China
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2024, 13(7), 248; https://doi.org/10.3390/ijgi13070248
Submission received: 2 April 2024 / Revised: 7 July 2024 / Accepted: 8 July 2024 / Published: 10 July 2024

Abstract

Spatial community detection divides geographic space into sub-regions based on spatial interactions, reflecting the regional spatial structure that emerges from human mobility. In recent years, it has attracted extensive research in geographic information science. However, mining community structures and their evolutionary patterns from spatial interaction data remains challenging. Most existing methods represent spatial interaction networks in Euclidean space, which introduces significant distortion when modeling these networks; moreover, because spatial community detection has no ground truth, both detecting and evaluating communities are difficult. Furthermore, most methods ignore the dynamics of spatial interaction networks, so the dynamic evolution of spatial communities has not been discussed in depth. Therefore, this study proposes a framework for community detection and evolutionary analysis of spatial interaction networks. Specifically, we construct a spatial interaction network based on network science theory, where geographic units serve as nodes and interaction relationships serve as edges. To fully learn the structural features of the spatial interaction network, we introduce a hyperbolic graph convolution module in the community detection phase to learn the spatial and non-spatial attributes of the network and obtain vector representations of the nodes, which are then optimized with a graph generation model to produce the final community detection results. Considering the dynamics of spatial interactions, we analyze the evolution of the spatial communities over time. Finally, using taxi trajectory data as an example, we conduct experiments within the fifth ring road of Beijing. The empirical results validate the community detection capabilities of the proposed method, which can effectively describe the dynamic spatial structure of cities based on human mobility and provide an effective analytical method for urban spatial planning.

1. Introduction

The accelerated development of urbanization has led to an expansion in the scale of cities, accompanied by increases in population and traffic flows. Cities are the primary location for human mobility, and the mismatch between sharply rising travel demand and lagging urban planning and construction tends to produce a series of urban problems [1]. The study of urban spatial structures assists planners and policy makers in determining the spatial scale of cities and improving the rationality of urban spatial layouts. It is also a key means of solving the ever-increasing urban problems.
The theory of urban spatial structure arose from the early Industrial Revolution, and early studies on urban spatial structure mainly focused on the first-order geographical phenomenon of the static distribution of urban spatial structure. The metrics of urban spatial structure are usually based on static measurement indicators, which are mainly based on traditional data sources such as population censuses, land surveys, or statistical data [2], to directly describe and classify the morphological, organizational, and distributional characteristics of urban space. The description of spatial structure is usually a static, global spatial feature. However, urban geographic regions are not isolated, and regions are connected to each other through spatial interaction [3,4,5]. Therefore, the study of urban spatial structure is not only limited to a static view of the spatial layout, but also needs to explore new dynamic perspectives.
With the rapid development of mobile terminals and sensor technology, a vast amount of high-precision geospatial big data with temporal and spatial labels have been generated, providing new data sources and technical means for the study of urban spatial structures [6,7,8]. In the context of the big data era, the application of big data technology has significantly improved the precision and depth of urban spatial structure research. Spatiotemporal flow data recording human mobility are widely used in the study of urban geographical spatial structure divisions. Some researchers believe that the structure of urban space can be reflected through spatial interactions, such as population movement and traffic flow [9,10]. Therefore, the research on urban spatial structure has begun to shift from the traditional “place space” to “flow space”. The research paradigm of urban dynamic spatial structure proposed from the perspective of spatial interaction has become one of the important research directions. Many scholars have studied urban spatial structure by detecting spatial communities and have developed a series of spatial community detection methods.
However, most existing research on spatial community detection simply extends methods from the field of network science, which limits their applicability to spatial communities. Specifically, detecting spatial communities differs from detecting communities in social networks. Firstly, community detection methods should take the real-world setting of spatial interaction networks into account and fully exploit the characteristics of the network. Relevant theoretical research has shown that hyperbolic space provides a more suitable geometric framework for modeling and analyzing complex real-world networks [11,12]. From the perspective of actual data, we visualized the spatial interaction network constructed from Monday’s taxi data (see Section 4.1 for data introduction) on the Poincaré disk (Figure 1a) and statistically analyzed the distribution of network node degrees (Figure 1b). Mapping the spatial interaction network into hyperbolic space by means of the Poincaré disk model clearly distinguishes the hierarchical structure of the network and reveals scale-free behavior, which are characteristics that traditional Euclidean geometric frameworks struggle to fully describe [13,14]. Secondly, it has been shown that graph generation models can help to complete community detection tasks by defining and optimizing the community affiliation of network nodes [15]. With the development of deep learning in recent years, unsupervised deep learning models that take a graph generation model as the optimization objective have shown advantages [16,17]. In addition, spatial interaction networks are a class of dynamic network [18]. We therefore also trace the evolution of interactions between urban areas by exploring spatial communities over successive time periods, as a way of gaining a deeper understanding of the dynamic organization of urban areas.
Based on the above analysis, a framework that couples hyperbolic graph convolution and graph generation for spatial community detection and its dynamic evolution identification is proposed. The method contains two main parts: the detection of spatial communities and the identification of their evolution. In the first phase, we extend the traditional complex network approach for first-order urban cognition to second-order cognition by learning spatial interactions and computing unique vector representations through a hyperbolic graph convolution module, projecting the urban information into mathematical space. We utilize a graph generation model to provide optimization targets, thereby obtaining spatial community detection results. In the second phase, we adopt a method for discovering the community evolution in social networks, which uses a comparative strategy based on the importance of node quantity and similarity to identify and differentiate types of community evolution. Finally, we provide a theoretical foundation for optimizing the dynamic spatial structure of a city by conducting experimental studies and analyses within the Beijing fifth ring road, including the central urban area.
The main contributions of this study are as follows:
(1)
We construct a framework based on hyperbolic graph convolution and graph generation for spatial community detection and its dynamic evolution identification. It is suitable for spatial interaction networks.
(2)
For spatial community detection, we use the hyperbolic graph convolution module to learn the structural features of spatial interaction networks, and optimize the node representation vectors output from the hyperbolic graph convolution module in an unsupervised way through the idea of graph generation, so as to obtain the community affiliation of nodes to complete the task of spatial community detection.
(3)
Since most methods ignore the dynamics of spatial interaction networks, in order to track the evolution of spatial communities, we use the group evolution discovery (GED) [19] algorithm to identify the type of evolution of the spatial community.
The remaining sections of this paper are organized as follows. Section 2 reviews the related work. Section 3 introduces the conceptual framework of the spatial community detection and evolutionary identification proposed in this study. Here, we provide a detailed overview of all components of the framework, describe their formal representations, and outline the computational methods for quantifying dynamic processes. Section 4 presents the results of our proposed method and discusses its effectiveness and advantages. Moreover, considering the apparent scale dependence of geospatial data, we conduct spatial community detection experiments under grids of different scales to further evaluate the proposed method. Section 5 is a discussion. Section 6 summarizes the conclusions of this paper and future work.

2. Related Work

Community detection methods have evolved from the field of social networks, and since both spatial interaction networks and social networks belong to complex networks, early methods for detecting spatial communities directly utilized community detection methods from the field of social networks (hereafter referred to as community detection) for related research. Subsequently, researchers have gradually developed methods for community detection in the field of geography, as well as methods based on deep learning. Therefore, the following mainly reviews the methods and work of these three branches.

2.1. Community Detection

Community detection is a fundamental task in network analysis, aiming to identify tightly connected subgroups within a network. Early community detection methods relied on explicit rules and optimization criteria. Initially, research was mainly based on graph partitioning theory, with the basic idea being to divide the network into several subgraphs of closely connected nodes according to partitioning criteria. Representative methods include the KL algorithm [20] and spectral bisection [21], which are highly interpretable but sensitive to resolution parameters. As research into complex networks progressed, community detection methods based on statistical inference theory gradually developed, with the basic idea being to use prior knowledge and probabilistic graphical models [22] to construct and optimize statistical models, thereby inferring the community structures in networks. For instance, Karrer et al. [23] used the stochastic block model to identify community structures by simulating random connections within the network. Airoldi et al. [24] developed a mixed-membership stochastic block model, enabling a low-dimensional representation of the network structure and subsequent community detection. These methods offer a probabilistic interpretation of community detection results, which helps in assessing the reliability and uncertainty of the results; however, they involve complex mathematical models and computational processes, leading to high computational costs and making them unsuitable for ultra-large networks. Since community detection reveals how network nodes cluster into communities, some studies have also applied clustering algorithms to community detection [25], giving rise to clustering-based methods. Representative methods are the Girvan–Newman (GN) algorithm based on hierarchical clustering [26] and the FastQ algorithm based on agglomerative clustering [27], which can be applied to a variety of network types and adapted to different community shapes and sizes; however, they rely on preset parameters (e.g., the number of clusters), the selection of which requires domain knowledge. Concurrently with clustering-based community detection research, optimization-based community detection also emerged [28], with the basic idea being to find community partitions that maximize modularity through optimization algorithms. Representative methods include the Louvain algorithm [29] and the Leiden algorithm [30].
Although directly applying community detection methods from the field of network science can identify some valuable spatial interaction patterns embedded within spatial networks, ignoring the spatial relationships between nodes (such as spatial topological relationships and spatial distances) often results in the discovered community structure primarily reflecting the influence of spatial proximity. This limitation prevents the discovery of community structures caused by other potential factors, thereby restricting the ability to interpret community structures in terms of spatial interactions [31].

2.2. Spatial Community Detection

A spatial community is a collection of geographical units that are closely connected by spatial interactions, forming clustered patterns with certain structures and functions. Compared to most non-spatial networks, spatial interaction networks are embedded in space and have more nonlinear characteristics [32], which poses significant challenges for community detection in geographical spaces. Researchers have explored ways of incorporating spatial relationship constraints into community detection methods, mainly through two approaches: (1) Considering the constraints of spatial relationships when defining the objective function. For example, Expert et al. [31] and Gao et al. [33] measured the impact of spatial distance on the connection probability between nodes using the gravity model and modified the modularity function accordingly. Chen et al. [34] considered the decay of node weights with spatial distance to modify the modularity function. Incorporating spatial relationships into the objective function requires additional assumptions (such as distance decay) and parameter settings, which increase the complexity of practical applications, and unreasonable assumptions may cause the mined communities to deviate from reality. Therefore, in recent years, some scholars have tried another approach: (2) Directly adding spatial relationship constraints between nodes in community detection algorithms. For instance, Guo et al. [35] added spatial proximity constraints in the process of modularity optimization and used a tabu search strategy to discover community structures from taxi trajectory data. Fan et al. [36] added spatial distance constraints in the process of searching for community structures and provided an approximate algorithm to search for community structures in social media check-in data. Wan et al. [37] extended the density-based community detection method by incorporating spatial distance constraints when estimating local density. Chen et al. [38] detected communities with spatial correlation by using signed spectral clustering to capture the relevance of spatial networks.
In conclusion, community detection methods that incorporate spatial relationship constraints focus on introducing spatial relationship constraints into existing community detection methods to identify spatial communities. However, in most real-world scenarios, network data lack node label information and prior knowledge about communities, placing community detection methods within the realm of unsupervised tasks. This direct approach of capturing information from connections may lead to suboptimal community detection results [39].

2.3. Community Detection Based on Deep Learning

With the development of computer and information technology, deep learning techniques have begun to be applied in the research of community detection methods. Since complex networks are organized as graph-structured data, methods based on Graph Convolutional Networks (GCNs) [40] have become a mainstream approach for community detection. These methods effectively preserve and utilize the topological structure of the network by aggregating the neighborhood information of nodes to capture node representations for community detection at a global level. In the early stages, semi-supervised learning strategies were mainly used. For example, Jin et al. [41] proposed MRFasGCN, which adds the Markov Random Field model as a new convolutional layer in the framework of a GCN to solve the problem of semi-supervised community detection in attribute networks with semantic information. Bhattacharya et al. [42] proposed CommunityGCN as a semi-supervised node classification model, which combines the concept of message passing for node classification with the architecture of semi-supervised graph neural networks to achieve community partitioning. To address community detection in real-world scenarios without labeled data, clustering-based unsupervised learning strategies have been developed. For example, Sun et al. [43] proposed a network embedding model for node clustering to learn the network embeddings of node clusters in attributed graphs, applying a clustering loss to complete the clustering task and thereby achieve community detection. Liu et al. [44] proposed a community detection method based on the community perspective and a GCN, combining representation learning and clustering through a Bernoulli–Poisson model to more accurately explore potential community structures. Liang et al. [45] proposed the Region2vec method, which takes a network with node attributes as the input to a GCN to generate node embeddings and uses clustering algorithms to divide communities. Tsitsulin et al. [46] introduced deep modularity networks (DMoNs), an unsupervised pooling method inspired by the modularity measure of clustering quality, and showed how it can recover the challenging clustering structures of real-world attributed graphs. However, clustering-based unsupervised learning strategies typically have a fixed pattern recognition capability, which can easily lead to locally optimal solutions in community detection results. Therefore, when implementing unsupervised community detection, it is necessary to consider new methods that adapt to complex network data.
In summary, compared to the earlier non-deep learning methods, community detection methods based on deep learning tend to focus on two aspects for method research and improvement: network feature learning and model optimization objectives. From the perspective of network feature learning, the key focus is on the topology and attributes of the network. Starting from the local topological connections of network nodes and continuing to the integration of global topological connection relationships, as well as combining the attribute information of network nodes, these model methods have continuously improved the ability to represent network features. However, most studies usually compute network embeddings in Euclidean space to learn low-dimensional representations of the network, without fully considering the hierarchical structure and scale-free nature of spatial interaction networks [11,12,13,14]. Secondly, from the perspective of model optimization objectives, the key focus is on how model methods construct optimization objectives to complete the community detection task. Nevertheless, most research tends to adopt clustering algorithms or modularity optimization methods to obtain communities, failing to fully utilize the structural information of the network data. Studies have shown that methods based on graph probability generation are built on a consensus that the nodes in a common community are more likely to be connected with each other than nodes distributed in other communities. Therefore, it is natural to model community detection using a probabilistic framework [47,48].

3. Methodology

3.1. Overview

3.1.1. Research Framework

In this study, a framework for spatial community detection and evolutionary identification is proposed (Figure 2), which consists of two parts. The first part is spatial community detection, which comprises three steps. Firstly, geographical spatial knowledge is introduced to construct spatial interaction networks based on human mobility (Figure 2a). Secondly, the constructed spatial interaction networks are fed into a hyperbolic graph convolution module for embedding (Figure 2b). Thirdly, based on the graph generation concept, we construct the optimization objective function for community detection, which recovers membership information from the known spatial interaction network and thereby completes the community detection task. From the output community affiliation matrix, the column with the largest probability value in each row is chosen as the community to which the node belongs. The second part is the evolutionary identification of spatial communities (Figure 2d), which is mainly inspired by a robust community evolution identification method from social networks (the GED algorithm [19]) and is used to identify and analyze evolutionary types.

3.1.2. Detailed Description

This framework is used to infer the spatial community structures of urban areas and their evolutionary types from spatial interaction networks. Spatial community detection consists of three parts. Initially, we extract the spatial and non-spatial information inherent to spatial interaction data to construct spatial interaction networks. Connections between nodes in the network exhibit distance decay in spatial interaction, meaning that the closer two nodes are, the higher the likelihood of a connection between them. We therefore incorporate the influence of geographical distance on the network’s connectivity by using the spatial coordinates of the geographical units as attributes of the network nodes. Subsequently, in order to fully learn the structural features of the spatial interaction network, a hyperbolic graph convolution module is introduced, which embeds the urban spatial units to obtain vector representations of their spatial features, so that the similarity of urban spatial units is reflected in vector space. Finally, we employ the concept of graph generation models to optimize the node representation vectors produced by the hyperbolic graph convolution module and output the community affiliation matrix of the network nodes, thereby completing the task of community detection. For community evolution identification, we measure the similarity between communities based on the joint variation in the number of community nodes and their importance during evolution, and then determine the type of community evolution based on this similarity.
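As a rough illustration of a similarity that combines node quantity and node importance, the following sketch computes an inclusion score in the style of the GED method [19]: the share of a community's nodes that reappear in a later community, multiplied by the share of its total importance carried by those nodes. The function name, the `importance_a` dictionary, and the example values are hypothetical; how this study assigns node importance is not specified in this part of the text, so any centrality-like score is assumed here.

```python
def inclusion(group_a, group_b, importance_a):
    """GED-style inclusion of group_a in group_b.

    group_a, group_b: iterables of node ids (communities at two time steps).
    importance_a: dict mapping each node of group_a to a non-negative
                  importance score (hypothetical; any centrality could be used).
    """
    a, b = set(group_a), set(group_b)
    if not a:
        return 0.0
    total_importance = sum(importance_a[x] for x in a)
    if total_importance == 0:
        return 0.0
    shared = a & b
    # Quantity term: fraction of group_a's nodes that also appear in group_b.
    quantity = len(shared) / len(a)
    # Quality term: fraction of group_a's importance carried by shared nodes.
    quality = sum(importance_a[x] for x in shared) / total_importance
    return quantity * quality


# Toy example with made-up communities and importance scores.
community_t1 = {1, 2, 3, 4}
community_t2 = {3, 4, 5}
importance_t1 = {1: 1.0, 2: 1.0, 3: 3.0, 4: 3.0}
score = inclusion(community_t1, community_t2, importance_t1)
```

In this toy case half of the nodes persist but they carry three quarters of the importance, so the score is higher than a pure node-count overlap weighted by unimportant nodes would give. Comparing the inclusion in both directions against thresholds is how GED-style methods separate evolution types.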

3.2. Construction of the Spatial Interaction Networks

The spatial interaction network is a type of dynamic network, which, in this study, is organized in the form of snapshot graphs. That is, given a time interval $\Delta$, a day can be divided into $t = 24/\Delta$ time snapshots. Therefore, the dynamic network can be represented as an ordered set $\{G_1, G_2, \ldots, G_t\}$, where each snapshot $G_t = (V_t, E_t, X_t)$ is uniquely determined by the set of nodes $V_t$, the set of edges $E_t$, and the node attributes $X_t$.
Specifically, we use a given time interval (for example, $\Delta = 2$ h) to divide the trajectory data with spatiotemporal information and aggregate them into geographic units within the study area, organizing interactions between these units in the form of time snapshots (Figure 3a). Within each snapshot, geographic units are used as nodes, the interactions between geographic units as edges, and the geographic coordinates of the units as node attributes, thereby constructing the spatial interaction networks (Figure 3b). This network can be abstracted as an attributed graph $G_t = (V_t, E_t, X_t)$, where $V_t = \{1, \ldots, N\}$ contains $N$ geographic units; $E_t = \{(i, j) \in V \times V : A_{ij}\}$ includes edges between any two geographic units; and $A_{ij}$ represents the weight of the edge between nodes $i$ and $j$. The weights of the edges between all nodes form the adjacency matrix $A_t$; moreover, the features of the nodes can be represented as $X_t \in \mathbb{R}^{N \times D}$, where $D$ denotes the dimensionality of the node features. Here, we treat the geographical coordinates of the nodes as node attributes.
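The snapshot construction can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the trip record format `(hour_of_day, origin_unit, destination_unit)`, the function name `build_snapshots`, the choice to treat interactions as undirected, and the decision to skip trips that start and end in the same unit are all assumptions made for the example.

```python
def build_snapshots(trips, n_units, delta_hours=2):
    """Aggregate trips into one weighted adjacency matrix per time snapshot.

    trips: iterable of (hour_of_day, origin_unit, destination_unit), with
           hour_of_day in [0, 24) and unit ids in 0..n_units-1 (assumed format).
    Returns a list of 24 / delta_hours matrices (nested lists of counts).
    """
    if 24 % delta_hours != 0:
        raise ValueError("delta_hours must divide 24")
    n_snapshots = 24 // delta_hours
    snapshots = [
        [[0] * n_units for _ in range(n_units)] for _ in range(n_snapshots)
    ]
    for hour, origin, dest in trips:
        if origin == dest:
            continue  # assumption: trips within one unit are not interactions
        k = int(hour) // delta_hours  # index of the snapshot containing the trip
        # Assumption: interactions are undirected, so both entries are updated.
        snapshots[k][origin][dest] += 1
        snapshots[k][dest][origin] += 1
    return snapshots


# Toy example: four geographic units and four made-up trips.
toy_trips = [(0.5, 0, 1), (1.9, 1, 0), (7.0, 2, 3), (23.99, 0, 2)]
toy_snapshots = build_snapshots(toy_trips, n_units=4, delta_hours=2)
# Node attributes X_t would be the unit centroid coordinates, e.g. one
# (x, y) pair per unit, shared by all snapshots.
```

With $\Delta = 2$ h the example yields 12 matrices, one adjacency matrix $A_t$ per snapshot, and each entry counts the trips observed between two units in that window.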

3.3. Spatial Community Detection

3.3.1. Network Embedding Based on Hyperbolic Graph Convolution

In this section, we describe the specific operations for embedding the spatial interaction network into low-dimensional vectors using the hyperbolic graph convolution module. The node representation vectors obtained after the network embedding are set as the initial values of the node community affiliation matrix.
To fully capture the attributes of the nodes in the spatial interaction network, the neighborhood structure of the nodes, and the features of the edges, this study employs a hyperbolic graph convolution module [49] for embedding to obtain a reduced-dimensional representation of the spatial interaction network. In this manner, a refined characterization of network nodes abstracted from geographic units is achieved (Figure 4). Specifically, the multiple spatial interaction networks constructed in Section 3.2 are sequentially fed into the hyperbolic graph convolution for embedding in time order, thereby capturing the structure and attribute information of the spatial interaction network. This generates potential representation vectors for geographic units, which are then used as inputs for downstream tasks.
For the specific task of this study, the description is as follows. Firstly, the mathematical form of the spatial interaction network $G_t = (V_t, E_t, X_t)$ embedded through the hyperbolic graph convolution module is:
$$f: \left(V_t, E_t, (X_{t,i}^{E})_{i \in V}\right) \rightarrow X_{t,i}^{l,H} \in \mathbb{R}^{|V| \times d} \tag{1}$$
where $E$ represents the Euclidean space and $H$ represents the hyperbolic space, $(X_{t,i}^{E})_{i \in V}$ represents the Euclidean features of the $i$-th node input at the $t$-th time, $X_{t,i}^{l,H}$ is the vector representation of the $i$-th node of the spatial interaction network obtained after embedding through $l$ layers of hyperbolic graph convolution modules, and $d$ represents the dimension of the node feature.
Specifically, the embedding of the spatial interaction network through the hyperbolic graph convolution module can be described as follows: given a spatial interaction network $G_t = (V_t, E_t, X_t)$ and input Euclidean features $(X_{t,i}^{E})_{i \in V}$, they are input into $l$ layers of hyperbolic graph convolution to obtain the feature representation of the nodes in hyperbolic space.
First, we employ exponential mapping to map the Euclidean input features to the hyperbolic space, expressed as:
$$X_{t,i}^{H} = \exp_{o}^{K}\left(0, X_{t,i}^{E}\right) = \left(\sqrt{K}\cosh\left(\frac{\|X_{t,i}^{E}\|_2}{\sqrt{K}}\right),\ \sqrt{K}\sinh\left(\frac{\|X_{t,i}^{E}\|_2}{\sqrt{K}}\right)\frac{X_{t,i}^{E}}{\|X_{t,i}^{E}\|_2}\right) \tag{2}$$
where $X_{t,i}^{E}$ represents the input features of the $i$-th node of the spatial interaction network at the $t$-th time in Euclidean space; $K$ represents the curvature of the hyperbolic space, which is computable; $\cosh$ and $\sinh$, respectively, denote the hyperbolic cosine and hyperbolic sine functions; $\exp_{o}^{K}(\cdot)$ denotes the exponential mapping in hyperbolic space with curvature $K$, with the reference point $o$ being the origin of the tangent plane in hyperbolic space; and $X_{t,i}^{H}$ represents the features of the $i$-th node at the t-th time in hyperbolic space obtained after applying $\exp_{o}^{K}(\cdot)$.
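The exponential map at the origin can be checked numerically with a short sketch. It assumes the hyperboloid convention used by HGCN-style models, in which the first (time-like) coordinate is $\sqrt{K}\cosh(\cdot)$ and every mapped point $p$ satisfies the Minkowski constraint $\langle p, p \rangle = -K$. The function names are illustrative, and the inputs are assumed to be already rescaled to modest magnitudes, since raw geographic coordinates would make $\cosh$ and $\sinh$ very large.

```python
import math


def exp_map_origin(x_euclid, K=1.0):
    """Map a Euclidean feature vector onto the hyperboloid via the origin.

    Returns a list with one extra leading (time-like) coordinate.
    """
    sqrt_k = math.sqrt(K)
    norm = math.sqrt(sum(v * v for v in x_euclid))
    if norm < 1e-12:
        # The zero vector maps to the hyperboloid origin (sqrt(K), 0, ..., 0).
        return [sqrt_k] + [0.0] * len(x_euclid)
    time_coord = sqrt_k * math.cosh(norm / sqrt_k)
    scale = sqrt_k * math.sinh(norm / sqrt_k) / norm
    return [time_coord] + [scale * v for v in x_euclid]


def minkowski_inner(u, v):
    """Minkowski inner product: minus the time-like term plus the rest."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))


# Toy feature vector standing in for rescaled unit coordinates.
p = exp_map_origin([0.3, -0.4], K=1.0)
on_manifold = abs(minkowski_inner(p, p) + 1.0) < 1e-9
```

The check `on_manifold` confirms that the mapped point lies on the hyperboloid, which follows from $\cosh^2 - \sinh^2 = 1$.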
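Then, the mathematical expression for message passing in each hyperbolic graph convolution (HGCN) layer is:
$$h_{t,i}^{l,H} = \left(W^{l} \otimes^{K_{l-1},H} X_{t,i}^{l-1,H}\right) \oplus^{K_{l-1}} b^{l} \tag{3}$$
$$y_{t,i}^{l,H} = \mathrm{AGG}^{K_{l-1}}\left(h_{t,i}^{l,H}\right)_{i} \tag{4}$$
$$X_{t,i}^{l,H} = \sigma^{\otimes^{K_{l-1},K_{l}}}\left(y_{t,i}^{l,H}\right) \tag{5}$$
where $h_{t,i}^{l,H}$ represents the features of the $i$-th node at the $t$-th time obtained after the hyperbolic feature transformation, $y_{t,i}^{l,H}$ denotes the features of the $i$-th node at the $t$-th time after the attention-based neighbor aggregation operation $\mathrm{AGG}^{K_{l-1}}(\cdot)$, and $X_{t,i}^{l,H}$ represents the final hyperbolic embedding features. Following HGCN [49], $W^{l}$ and $b^{l}$ are the weight matrix and bias of layer $l$, $\otimes$ and $\oplus$ denote hyperbolic matrix multiplication and bias addition, and $\sigma$ is a nonlinear activation that maps between the curvatures $K_{l-1}$ and $K_{l}$. Generally, two HGCN layers yield suitable embeddings, so $l$ is set to 2.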
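To make the three steps of a layer concrete (feature transformation, neighbor aggregation, activation), the sketch below implements a deliberately simplified variant in pure Python. It is not the exact HGCN operator set: every step is carried out in the tangent space at the origin, the bias is added in that tangent space rather than by parallel transport, and the attention-based aggregation is replaced by an adjacency-weighted mean with a unit self-loop. The names `exp0`, `log0`, and `hgcn_layer` and all toy values are assumptions for illustration.

```python
import math


def exp0(v, K):
    """Exponential map at the origin of the hyperboloid (curvature -1/K)."""
    s = math.sqrt(K)
    n = math.sqrt(sum(a * a for a in v))
    if n < 1e-12:
        return [s] + [0.0] * len(v)
    c = s * math.sinh(n / s) / n
    return [s * math.cosh(n / s)] + [c * a for a in v]


def log0(p, K):
    """Logarithmic map at the origin: hyperboloid point -> tangent vector."""
    s = math.sqrt(K)
    xs = p[1:]
    n = math.sqrt(sum(a * a for a in xs))
    if n < 1e-12:
        return [0.0] * len(xs)
    d = s * math.acosh(max(p[0] / s, 1.0))  # geodesic distance to the origin
    return [d * a / n for a in xs]


def hgcn_layer(points, adj, W, b, K_in=1.0, K_out=1.0):
    """One simplified hyperbolic graph convolution layer.

    points: list of hyperboloid points (curvature parameter K_in).
    adj:    weighted adjacency matrix as nested lists.
    W, b:   weight matrix (out_dim x in_dim) and bias (out_dim).
    """
    out_dim = len(W)
    # 1) Feature transformation in the tangent space at the origin.
    h = []
    for p in points:
        u = log0(p, K_in)
        h.append([sum(W[r][c] * u[c] for c in range(len(u))) + b[r]
                  for r in range(out_dim)])
    # 2) Aggregation: adjacency-weighted mean over neighbors plus a self-loop
    #    (a stand-in for attention-based aggregation).
    y = []
    for i, row in enumerate(adj):
        weights = [float(w) for w in row]
        weights[i] += 1.0
        total = sum(weights)
        y.append([sum(weights[j] * h[j][k] for j in range(len(h))) / total
                  for k in range(out_dim)])
    # 3) Activation (ReLU) in the tangent space, then map back with K_out.
    return [exp0([max(0.0, a) for a in v], K_out) for v in y]


# Toy example: three nodes with 2-D features and an identity transform.
toy_points = [exp0([0.1, 0.2], 1.0), exp0([-0.3, 0.4], 1.0), exp0([0.5, -0.1], 1.0)]
toy_adj = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
toy_out = hgcn_layer(toy_points, toy_adj, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
                     K_in=1.0, K_out=2.0)
```

Stacking two such calls, with the output curvature of the first used as the input curvature of the second, mirrors the choice $l = 2$; a trainable version would learn `W`, `b`, and the curvatures by gradient descent.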

3.3.2. Optimize Community Affiliation Matrix Based on Graph Generation Model

The hyperbolic graph convolution module introduced in Section 3.3.1 is capable of capturing the complex relationships in spatial interaction networks and obtaining the hyperbolic embedding vectors for each node. The downstream task of spatial community detection is aimed at real-world scenarios and is expected to be accomplished through an unsupervised community detection framework. Inspired by the concept of probabilistic generative models [48,50], a connection can be established between the spatial community detection results and the original network to achieve spatial community detection (Figure 5). The underlying logic is: assuming that the probability of forming edges between nodes within spatial communities is higher than that between nodes outside spatial communities, the question arises of how to optimize the spatial community detection results to ensure their accuracy. By reversing this thinking, we assume that, based on the achieved spatial community division, the spatial interaction network is generated by inferring from the results. If the generated spatial interaction network is as close as possible to the original network, it will be considered as the final result of the spatial community detection. Therefore, the problem of spatial community detection can be considered within a probabilistic inference framework. That is, drawing on the idea of probabilistic generative modeling, the original network can be used to guide the embedding of the spatial interaction network by the hyperbolic graph convolution module in Section 3.3.1, and eventually, the community affiliation of each node can be obtained, thus realizing spatial community detection.
Specifically, suppose that the community set of the spatial interaction network is denoted as $C$ ($C$ is a hyperparameter; it is given an initial value, and the final number of communities is determined by model training) and that there are two nodes, $i$ and $j$, with affiliation strength vectors $F_i$ and $F_j$, which represent the membership probability vectors of node $i$ and node $j$ in their respective communities. The probability $P_{i,j}$ of generating an edge between nodes $i$ and $j$ in the community affiliation graph model is given by:
$$P_{i,j} = 1 - \exp\left(-F_i \cdot F_j^{T}\right) \tag{6}$$
Therefore, once the community affiliation of each geographic unit is given, the probability of a connection between any two units can be calculated using Equation (6), thereby generating the spatial interaction network. The community detection task can be understood as the inverse process: continuously optimizing the community affiliation matrix of the geographic units until the original spatial interaction network is reconstructed with the highest probability. At this point, the affiliation matrix serves as the final output. In summary, the task is to find the $F$ that maximizes $p(G \mid F)$:
$$p(G \mid F) = \prod_{(i,j) \in E} P_{i,j} \prod_{(i,j) \notin E} \left(1 - P_{i,j}\right) = \prod_{(i,j) \in E} \left(1 - \exp\left(-F_i \cdot F_j^{T}\right)\right) \prod_{(i,j) \notin E} \exp\left(-F_i \cdot F_j^{T}\right) \tag{7}$$
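As a minimal sketch of Equation (6), the following Python snippet computes the edge probability from two affiliation strength vectors. The function name `edge_probability` and the affiliation values are illustrative assumptions, not values from this study; the snippet only shows that nodes sharing a community receive a higher connection probability than nodes that share none.

```python
import math

def edge_probability(f_i, f_j):
    """Equation (6): P_ij = 1 - exp(-F_i . F_j^T) for non-negative affiliation vectors."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    return 1.0 - math.exp(-dot)

# Hypothetical affiliation strengths of three geographic units over three communities.
F = {
    "a": [2.0, 0.0, 0.0],
    "b": [1.5, 0.1, 0.0],
    "c": [0.0, 0.0, 1.8],
}

p_ab = edge_probability(F["a"], F["b"])  # both units load on the first community
p_ac = edge_probability(F["a"], F["c"])  # the units share no community
```

Because the dot product of two non-negative vectors is zero only when they share no community, `p_ac` is exactly zero here, while `p_ab` approaches one as the shared affiliation strengths grow.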

3.3.3. The Overall Structure of Spatial Community Detection Models

This study combines the HGCN module described in Section 3.3.1 and the graph generation model described in Section 3.3.2 to solve the optimization problem of the node community affiliation matrix F in Equation (7), thereby accomplishing the task of spatial community detection. The constructed spatial community detection model can be expressed as:
$$F = \mathrm{HGCN}_{\alpha}(A_t, X_t) \tag{8}$$
where $F$ is defined as the output of the HGCN, $\alpha$ represents the parameters of the model, and $A_t$ and $X_t$ denote the adjacency matrix and the node features of the spatial interaction network, respectively.
The objective of model optimization is to better delineate communities and accomplish the task of community detection. Inspired by the probabilistic generative model, our fundamental assumption is that if a pair of nodes share a community, they are more likely to be connected in the network. Based on this premise, a non-negative initial value is assigned to each node–community pair, representing the node’s membership in the community. Then, the probability of an edge between a pair of nodes in the network is modeled as a function of their shared community membership. This establishes a direct connection between the nodes’ community memberships and the probability of edges in the original network. Because the network structure is known, this function can be used as the optimization objective, thereby achieving node–community partitioning. The optimization objective of the model is to maximize $p(G \mid F)$. Taking the logarithm of Equation (7), we obtain the loss function of the model optimization process:
$$L(F) = \sum_{(i,j) \in E} \log\left(1 - \exp\left(-F_i \cdot F_j^{T}\right)\right) - \sum_{(i,j) \notin E} F_i \cdot F_j^{T} \tag{9}$$
Thus, the optimization objective of the HGCN is:
$$\alpha^{*} = \arg\max_{\alpha} L(F) = \arg\max_{\alpha} L\left(\mathrm{HGCN}_{\alpha}(A_t, X_t)\right) \tag{10}$$
During the training process, the loss function continually constrains the HGCN module, optimizing the process of extracting network features by the HGCN. Consequently, the optimal membership matrix of communities is obtained, leading to the detection results of spatial communities. From the output community affiliation matrix, the column with the largest probability value in each row is chosen as the community to which the node belongs.
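The objective in Equation (9) and the final row-wise assignment can be sketched as follows. This is an illustration under stated assumptions rather than the training code of this study: the affiliation matrices are hand-written instead of being produced by the HGCN, the function names are hypothetical, and the small constant `EPS` is added here only to avoid taking the logarithm of zero.

```python
import math

EPS = 1e-10  # assumption: numerical guard, not part of Equation (9)

def log_likelihood(F, edges):
    """Equation (9): edge terms log(1 - exp(-F_i.F_j)) minus non-edge terms F_i.F_j."""
    edge_set = {frozenset(e) for e in edges}
    total = 0.0
    for i in range(len(F)):
        for j in range(i + 1, len(F)):
            dot = sum(a * b for a, b in zip(F[i], F[j]))
            if frozenset((i, j)) in edge_set:
                total += math.log(1.0 - math.exp(-dot) + EPS)
            else:
                total -= dot
    return total

def assign_communities(F):
    """Pick, for each node (row), the community (column) with the largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in F]

# Toy network: nodes 0-1 are connected, and nodes 2-3 are connected.
edges = [(0, 1), (2, 3)]
F_good = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]  # matches the edges
F_bad = [[2.0, 0.0], [0.0, 2.0], [2.0, 0.0], [0.0, 2.0]]   # contradicts the edges

ll_good = log_likelihood(F_good, edges)
ll_bad = log_likelihood(F_bad, edges)
labels = assign_communities(F_good)
```

An affiliation matrix that places connected nodes in the same community scores a higher value of the objective than one that separates them, which is the signal used to guide the embedding.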

3.3.4. Evaluation of Spatial Community Detection Models

As community detection is an optimization task that falls under unsupervised learning, the evaluation of community detection methods generally relies on unsupervised learning metrics. This paper measures the effectiveness of the model through the intrinsic structure of the communities, their density, their clustering quality, and other aspects. The specific evaluation metrics used are modularity, average density, average conductance, and average clustering coefficient, all of which are internal evaluation metrics and are described below:
(1) Modularity Metric
The Modularity (Q) metric is used to measure the tightness of connections within communities relative to the connections between communities. A modularity score above 0.3 is generally considered to indicate a reasonable quality of community detection. The specific calculation formula is as follows:
$$Q = \frac{1}{2m} \sum_{i,j} \left(A_{ij} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j) \tag{11}$$
where $m$ is the sum of the weights of all edges in the network, $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_i$ and $k_j$ are the degrees of nodes $i$ and $j$, respectively, and $\delta(c_i, c_j)$ is an indicator function that equals 1 if nodes $i$ and $j$ belong to the same community and 0 otherwise.
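The modularity formula can be checked on a small example. The sketch below is an assumption-laden illustration: the toy network (two triangles joined by one edge) and the function name `modularity` are hypothetical, and the degree of a node is taken as the sum of its edge weights.

```python
def modularity(A, labels):
    """Modularity Q for a symmetric weighted adjacency matrix A and node labels."""
    n = len(A)
    degree = [sum(row) for row in A]
    two_m = sum(degree)  # each undirected edge is counted twice, so this is 2m
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:  # delta(c_i, c_j) = 1
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

q_split = modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Splitting the network into its two triangles gives Q = 5/14 (about 0.357), above the 0.3 threshold mentioned above, whereas placing all nodes in a single community gives Q = 0.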
(2) Average Density Metric
The Average Density (AD) metric is used to measure the average density of the edges within all detected communities. A higher average density value indicates that the connections between nodes within the communities are closer. The specific calculation formula is as follows:
$$\mathrm{AD}(C_1, \ldots, C_K) = \frac{\sum_i \phi(C_i)\,|C_i|}{\sum_i |C_i|} \tag{12}$$
where $\phi(C_i)$ represents the density of a single community $C_i$, which is calculated by dividing the number of existing edges within the community by the number of possible edges that could exist within the community, and $|C_i|$ represents the size of the $i$th community, which is the number of nodes in that community.
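A size-weighted average density can be sketched as below. The toy network, the function names, and the convention that a single-node community has density 0 are assumptions made for this illustration.

```python
from itertools import combinations

def community_density(A, members):
    """Existing internal edges divided by possible internal edges."""
    n = len(members)
    if n < 2:
        return 0.0  # assumption: a singleton community has density 0
    internal = sum(1 for u, v in combinations(members, 2) if A[u][v] > 0)
    return internal / (n * (n - 1) / 2)

def average_density(A, communities):
    """Average of the community densities, weighted by community size."""
    total_nodes = sum(len(c) for c in communities)
    weighted = sum(community_density(A, c) * len(c) for c in communities)
    return weighted / total_nodes

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

ad_triangles = average_density(A, [[0, 1, 2], [3, 4, 5]])
ad_uneven = average_density(A, [[0, 1, 2, 3], [4, 5]])
```

The partition into two complete triangles reaches the maximum value of 1, while the uneven partition scores lower because the four-node community contains only four of its six possible edges.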
(3) Average Conductance Metric
The Average Conductance (AC) metric is used to measure the connectivity of nodes within a network and can be used to assess the compactness of communities. A low conductance indicates a higher proportion of internal edges within the community, suggesting a more tightly knit community structure. The specific calculation formulas are as follows:
$$\mathrm{Conductance}(C) = \frac{\mathrm{cut}(C, \bar{C})}{2 \times \mathrm{vol}(C) + \mathrm{cut}(C, \bar{C})} \tag{13}$$
$$\mathrm{AC}(C_1, \ldots, C_K) = \frac{\sum_i \mathrm{Conductance}(C_i)\,|C_i|}{\sum_i |C_i|} \tag{14}$$
where Equation (13) gives the conductance of a single community and Equation (14) gives the average conductance. $C_1, \ldots, C_K$ are all the communities detected in the network, $|C_i|$ represents the size of the $i$th community, $C$ represents the subgraph formed by a community, $\bar{C}$ represents the complement of $C$, $\mathrm{cut}(C, \bar{C})$ represents the number of edges between the community $C$ and its complement $\bar{C}$, which is the number of external connections of the community, and $\mathrm{vol}(C)$ represents the sum of the edges within the community $C$, which is the total number of internal connections within the community.
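Equations (13) and (14) can be sketched as follows, taking vol(C) as the number of internal edges, as defined above. The toy network and the function names are hypothetical, and returning 0 for a community with no edges at all is a convention assumed here.

```python
def conductance(A, members):
    """Equation (13): cut / (2 * internal edges + cut) for one community."""
    inside = set(members)
    internal_twice = 0  # every internal edge is seen from both endpoints
    cut = 0
    for i in inside:
        for j, weight in enumerate(A[i]):
            if weight > 0:
                if j in inside:
                    internal_twice += 1
                else:
                    cut += 1
    denominator = internal_twice + cut  # equals 2 * vol(C) + cut
    return cut / denominator if denominator else 0.0

def average_conductance(A, communities):
    """Equation (14): conductance averaged with community sizes as weights."""
    total_nodes = sum(len(c) for c in communities)
    weighted = sum(conductance(A, c) * len(c) for c in communities)
    return weighted / total_nodes

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

ac_triangles = average_conductance(A, [[0, 1, 2], [3, 4, 5]])
ac_uneven = average_conductance(A, [[0, 1, 2, 3], [4, 5]])
```

Each triangle has three internal edges and one external edge, so its conductance is 1/7; the uneven partition cuts two edges and therefore has a higher (worse) average conductance.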
(4) Average Clustering Coefficient Metric
The Average Clustering Coefficient (ACC) metric is used to measure the degree to which the nodes within detected communities form triangle relationships. A high clustering coefficient indicates that the interconnections among the nodes within the community are relatively tight. The specific calculation formulas are as follows:
$$\mathrm{ClustCoef}(C_i) = \frac{3 \times \text{triangles in } C_i}{\text{triplets in } C_i} \tag{15}$$
$$\mathrm{AvgClustCoef}(C_1, \ldots, C_K) = \frac{\sum_i \mathrm{ClustCoef}(C_i)\,|C_i|}{\sum_i |C_i|} \tag{16}$$
where Equation (15) gives the clustering coefficient of a single community and Equation (16) gives the average clustering coefficient. $C_1, \ldots, C_K$ are all the communities detected in the network, $|C_i|$ represents the size of the $i$th community, “triplets” refers to combinations of any three nodes within the community $C_i$, and “triangles” refers to the node combinations among these that actually form a triangle relationship.
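The clustering coefficient can be sketched as below. One interpretation is assumed and should be read as such: “triplets” are counted as connected triplets (three nodes joined by at least two edges), as in the standard transitivity measure, which keeps the coefficient between 0 and 1. The toy network and the function names are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(A, members):
    """3 * triangles / connected triplets inside one community."""
    triangles = sum(
        1 for a, b, c in combinations(members, 3)
        if A[a][b] and A[b][c] and A[a][c]
    )
    triplets = 0
    for v in members:
        k = sum(1 for u in members if u != v and A[v][u])  # internal degree
        triplets += k * (k - 1) // 2  # two-edge paths centred on v
    return 3 * triangles / triplets if triplets else 0.0

def average_clustering(A, communities):
    """Clustering coefficients averaged with community sizes as weights."""
    total_nodes = sum(len(c) for c in communities)
    weighted = sum(clustering_coefficient(A, c) * len(c) for c in communities)
    return weighted / total_nodes

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

acc_triangles = average_clustering(A, [[0, 1, 2], [3, 4, 5]])
acc_uneven = average_clustering(A, [[0, 1, 2, 3], [4, 5]])
```

A community that is a complete triangle has a coefficient of 1, whereas the four-node community in the uneven partition contains one triangle among five connected triplets and scores 0.6.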

3.4. Identification of the Evolution of Spatial Communities

Spatial interaction networks are inherently dynamic, with the interactions between nodes continually changing. Therefore, it is necessary to study the evolution of spatial communities. Drawing on relevant definitions of community evolution in social networks, the types of community evolution mainly include birth, growth, split, merge, shrinkage, death, and continuity. Here, we employ a method for discovering community evolution in social networks (the GED algorithm) [19], which weighs both node quantity and similarity to distinguish types of community evolution. Specifically, we compare the spatial communities in adjacent time snapshots and introduce a metric to characterize their evolution. This can be mathematically described as follows: suppose that the spatial communities in adjacent time snapshots at times $t-1$ and $t$ are $C_i^{t-1}$ and $C_j^{t}$, respectively. The metrics measuring the similarity between adjacent temporal communities are defined as $I(C_i^{t-1}, C_j^{t})$ and $I(C_j^{t}, C_i^{t-1})$:
$$I\left(C_i^{t-1}, C_j^{t}\right) = \frac{\left|C_i^{t-1} \cap C_j^{t}\right|}{\left|C_i^{t-1}\right|} \cdot \frac{\sum_{x \in C_i^{t-1} \cap C_j^{t}} \mathrm{SP}_{C_i^{t-1}}(x)}{\sum_{x \in C_i^{t-1}} \mathrm{SP}_{C_i^{t-1}}(x)} \tag{17}$$
$$I\left(C_j^{t}, C_i^{t-1}\right) = \frac{\left|C_i^{t-1} \cap C_j^{t}\right|}{\left|C_j^{t}\right|} \cdot \frac{\sum_{x \in C_i^{t-1} \cap C_j^{t}} \mathrm{SP}_{C_j^{t}}(x)}{\sum_{x \in C_j^{t}} \mathrm{SP}_{C_j^{t}}(x)} \tag{18}$$
where $|C_i^{t-1}|$ denotes the number of network nodes in community $C_i^{t-1}$, $|C_i^{t-1} \cap C_j^{t}|$ represents the number of overlapping network nodes between the two communities, and $\mathrm{SP}_{C_i^{t-1}}(x)$ represents the importance indicator of the community nodes, which can be calculated using metrics such as node betweenness or degree centrality to measure the importance of node $x$ in community $C_i^{t-1}$. In this study, we choose the degree centrality of nodes for calculation.
Furthermore, two hyperparameters, $\delta_1$ and $\delta_2$, are set to determine the types of evolution. Schematic diagrams of the community evolution types and the criteria for determining the community evolution types are shown in Table 1.
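The two measures in Equations (17) and (18) share one form, so a single function can compute both by swapping its arguments. The sketch below is illustrative only: the community memberships and importance scores are hypothetical, and the importance of each node is passed in as a precomputed dictionary instead of being derived from the network.

```python
def inclusion(c_from, c_to, importance):
    """Share of c_from found in c_to, weighted by node importance in c_from."""
    total = sum(importance[x] for x in c_from)
    if not c_from or total == 0:
        return 0.0
    shared = c_from & c_to
    quantity = len(shared) / len(c_from)
    quality = sum(importance[x] for x in shared) / total
    return quantity * quality

# Hypothetical communities in two adjacent snapshots (node identifiers).
c_prev = {1, 2, 3, 4}
c_next = {2, 3, 4, 5, 6}

# Hypothetical importance scores (for example, degree within each community).
sp_prev = {1: 1, 2: 3, 3: 2, 4: 2}
sp_next = {2: 3, 3: 2, 4: 2, 5: 1, 6: 2}

i_prev_in_next = inclusion(c_prev, c_next, sp_prev)  # form of Equation (17)
i_next_in_prev = inclusion(c_next, c_prev, sp_next)  # form of Equation (18)
```

Here three of the four earlier nodes persist and they carry 7/8 of the earlier importance, giving 0.75 × 0.875 = 0.65625, while the reverse measure is 0.6 × 0.7 = 0.42. Comparing such values against the thresholds is what separates the evolution types.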

4. Case Study and Results

4.1. Study Area and Data Description

The case study was conducted in the central urban area of Beijing within the fifth ring road, as shown in Figure 6. This area is bounded by the fifth ring road expressway and has a total area of approximately 668.72 km². It encompasses the core functional area of Beijing and parts of the surrounding districts, including Dongcheng, Xicheng, Haidian, Shijingshan, Chaoyang, and Fengtai. The area is characterized by the highest population density and most active economic activity in Beijing.
This study selected taxi trip data from Beijing for the period from 13 March 2017 (Monday) to 19 March 2017 (Sunday) for experimental validation. The data were anonymized to protect user privacy. From the trajectory data, origin–destination (OD) pairs were extracted to form trip flows. Table 2 lists the fields of the taxi order data together with example records. To obtain accurate and usable data, the data were cleaned as follows: ① the pick-up and drop-off times were converted from UTC to Beijing time to obtain accurate time information; ② records with excessively short time intervals or abnormal speeds were filtered out; and ③ the Mercator projection coordinates in zone 50N under the WGS 1984 coordinate system were calculated from the latitude and longitude of the pick-up and drop-off points.
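The first two cleaning steps can be sketched as follows. Everything specific in this snippet is an assumption made for illustration: the timestamps are taken to be Unix epoch seconds, and the thresholds `MIN_DURATION_S` and `MAX_SPEED_KMH` are hypothetical values, since the text does not state the ones actually used.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # Beijing time is UTC+8 without daylight saving
MIN_DURATION_S = 60     # hypothetical threshold for "excessively short" trips
MAX_SPEED_KMH = 120.0   # hypothetical threshold for "abnormal" speeds

def to_beijing(epoch_seconds):
    """Convert a UTC epoch timestamp to a timezone-aware Beijing datetime."""
    utc_time = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc_time.astimezone(BEIJING)

def keep_trip(pickup_epoch, dropoff_epoch, distance_km):
    """Return True if the trip passes the duration and speed filters."""
    duration_s = dropoff_epoch - pickup_epoch
    if duration_s < MIN_DURATION_S:
        return False
    speed_kmh = distance_km / (duration_s / 3600.0)
    return speed_kmh <= MAX_SPEED_KMH

midnight_utc = to_beijing(0)           # 1970-01-01 00:00 UTC
too_short = keep_trip(0, 30, 1.0)      # 30-second trip
plausible = keep_trip(0, 600, 5.0)     # 5 km in 10 minutes
too_fast = keep_trip(0, 600, 50.0)     # 50 km in 10 minutes
```

The projection step is not shown because it depends on a coordinate transformation library.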
The study area was divided into 584 traffic analysis zones. Additionally, considering that this study mainly focuses on the clustering patterns of the second-order spatial interactions of spatially extensive data, which exhibit scale dependence [51] under different aggregation units, we supplemented experiments of spatial community detection using the proposed method under different-scale regular grid networks to further evaluate our approach. The study area was divided into 2790, 727, 196, and 90 research units based on regular grid networks with sizes of 0.5 km, 1 km, 2 km, and 3 km, respectively.
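Aggregation to a regular grid amounts to integer division of projected coordinates by the cell size. The sketch below assumes coordinates in metres and a grid origin at the lower-left corner of the study area; the function name and the sample points are hypothetical.

```python
import math

def grid_cell(x, y, x_min, y_min, cell_size_m):
    """Return the (row, column) of the regular grid cell containing point (x, y)."""
    col = math.floor((x - x_min) / cell_size_m)
    row = math.floor((y - y_min) / cell_size_m)
    return row, col

# Two hypothetical pick-up points, in metres relative to the grid origin.
p1 = (1250.0, 600.0)
p2 = (1750.0, 900.0)

cells_500m = [grid_cell(x, y, 0.0, 0.0, 500.0) for x, y in (p1, p2)]
cells_1km = [grid_cell(x, y, 0.0, 0.0, 1000.0) for x, y in (p1, p2)]
```

The two points fall into different 0.5 km cells but into the same 1 km cell, which is how a coarser aggregation unit merges flows and can change the detected community structure.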

4.2. Experimental Setup for Spatial Community Detection

To verify the effectiveness and advantages of the proposed method, we chose the Leiden method and a GCN-based method for comparison. We chose the Leiden method because it is currently recognized as a state-of-the-art (SOTA) method and is comparatively stable; comparison with it reflects, to some extent, the effectiveness of the proposed method. The GCN-based method is a deep learning method suited to spatial cluster detection and is more versatile than the spatial community detection methods reviewed in Section 2.2; comparison with it reflects the advantage of introducing hyperbolic space in the proposed method.
Specifically, this study conducts both quantitative and qualitative analyses. From a quantitative perspective, this study uses four metrics (modularity, average density, average conductance, and average clustering coefficient) to evaluate the communities detected by the proposed method. From a qualitative perspective, the analyses include the following: (1) To verify the effectiveness of the proposed method, it is compared with the Leiden algorithm [30] and tested over five weekdays; to demonstrate its advantages, an ablation study compares it with the GCN-based method. (2) To investigate the scale dependence of the data, experiments are conducted with regular grids of 0.5 km, 1 km, 2 km, and 3 km.
As described in Section 3.3.2, C is a hyperparameter that sets the initial number of communities output by the spatial community detection model (the number of communities ultimately output is determined by the input network data, up to a maximum of C). Using the spatial interaction network constructed from Monday’s taxi data, we varied the initial number of communities from 10 to 35 in steps of 5, repeated each experiment three times, and recorded the mean and standard deviation of the modularity (Table 3). The results were better when C was set to 30, and none of the final outputs exceeded 30 communities as the value of C increased. Thus, we set the hyperparameter C of the model to 30.

4.3. Results of Spatial Community Detection

4.3.1. The Effectiveness of The Proposed Method

First, we took traffic analysis zones (TAZs) as the research units and processed the taxi trajectory data within the fifth ring road of Beijing on 13 March 2017 (Monday). We aggregated these data to construct a network of spatial interaction. The network consisted of 584 TAZ units as nodes, and the edges represent the interaction volume between the research units. Based on the start timestamps, we split all the trip data into 12 time snapshots separated by two hours.
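The construction of the snapshot networks can be sketched as below. The record layout `(pickup_hour, origin_zone, destination_zone)`, the zone identifiers, and the choice to count directed origin–destination flows are assumptions for illustration; snapshots are indexed from 0 here.

```python
from collections import defaultdict

def snapshot_index(hour):
    """Map an hour of day (0-23) to one of 12 two-hour snapshots (0-11)."""
    return hour // 2

def build_snapshots(trips):
    """Aggregate trips into one weighted edge dictionary per snapshot."""
    networks = [defaultdict(int) for _ in range(12)]
    for hour, origin, destination in trips:
        networks[snapshot_index(hour)][(origin, destination)] += 1
    return networks

# Hypothetical trips between traffic analysis zones.
trips = [
    (6, "z1", "z2"),
    (7, "z1", "z2"),
    (8, "z2", "z3"),
    (23, "z3", "z1"),
]

networks = build_snapshots(trips)
morning_flow = networks[3][("z1", "z2")]  # 06:00-08:00 snapshot
```

Each dictionary maps a zone pair to its interaction volume, which can then be converted to the adjacency matrix used as model input.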
To better illustrate the training effects of the proposed method, the loss function during training is visualized (Figure 7). Clearly, the loss of the proposed method converged during training, and it can also be seen that the modularity gradually increased and then stabilized.
Firstly, for the quantitative evaluation, all communities detected during the four time snapshots, including both the morning and evening peak hours on Mondays, were selected for assessment. We conducted three repeated experiments and reported the mean and standard deviation of each metric. Table 4 shows the comparison of the Modularity metric, Table 5 shows the comparison of the Average Density (AD) metric, Table 6 shows the comparison of the Average Conductance (AC) metric, and Table 7 shows the comparison of the Average Clustering Coefficient (ACC) metric.
As shown in Table 4, the modularity of the communities detected by the proposed method and by the comparison methods was always higher than 0.3, and the modularity of the proposed method was slightly higher than that of the comparison methods. This indicates that the proposed method can effectively distinguish the community structures in the spatial interaction network. As shown in Table 5, the average density of the communities detected by the proposed method was slightly higher than that of the Leiden method and the GCN-based method, suggesting that the communities detected by the proposed method were, on average, more closely connected internally, which is generally considered desirable in community detection. As shown in Table 6, the average conductance of the communities detected by the proposed method and by the GCN-based method was higher than that of the Leiden method, indicating that the communities detected by the Leiden method had fewer connections between communities, which is an advantage of that method. Compared with the GCN-based method, however, the communities detected by the proposed method had fewer inter-community connections, implying that they were more separable. As shown in Table 7, the average clustering coefficient of the communities detected by the proposed method was higher than that of the Leiden method and the GCN-based method, indicating a higher degree of aggregation among the nodes within its communities.
In summary, the communities detected by the proposed method exhibited a good performance on all four evaluation metrics. Considering these metrics together, it can be concluded that the proposed method achieved a satisfactory level of rationality in detecting spatial community structures.
Secondly, for the qualitative analysis, we selected four time snapshots covering two special time periods: the morning rush hour and the evening rush hour. We compared our method with the widely recognized Leiden method, and the community detection results of both methods are shown in Figure 8. The Leiden method generally detected fewer communities, with larger community sizes and more continuous structures [30]. In contrast, the communities detected by our method tended to be smaller.
Meanwhile, we further analyzed the community detection results of our method by overlaying them on a geographic base map and highlighting three core communities. As shown in Figure 9a, the communities detected by our method primarily included scenic spots such as the Summer Palace, Fragrant Hills Park, and West Lake in Community 4; educational institutions such as Tsinghua University, Peking University, and Renmin University in Community 6; and areas with transportation facilities such as Beijing West Railway Station, Beijing South Railway Station, and Beijing Fengtai Railway Station in Community 5. By contrast, the communities detected by the Leiden method failed to differentiate functional areas effectively. The Leiden method is based on modularity optimization and therefore suffers from the resolution limit problem: modularity is sensitive to community size, so it may overestimate the existence of large communities and ignore small ones. The proposed method instead optimizes the community results against the original network structure, which overcomes the resolution limit associated with modularity and detects a more fine-grained community structure.
Since the proposed community detection method in this paper utilizes hyperbolic graph convolutional modules for embedding in the network embedding part, to verify the advantage of introducing hyperbolic space in our method, we conducted ablation experiments using models based on Graph Convolutional Networks (GCNs) as the backbone network. The GCN-based community detection method employs Euclidean embeddings, which require a large number of dimensions to capture complex relationships. In contrast, our method integrates hyperbolic geometry into the network embedding module to handle complex networks, particularly those in spatial interaction networks. Specifically, we achieved this by substituting Euclidean space with hyperbolic space. We applied the t-SNE algorithm to reduce dimensionality and visualize the features obtained from both the GCN embedding and hyperbolic graph convolutional embedding, as shown in Figure 10.
To verify the superiority of the community detection results, under the same conditions, we further conducted community detection using a model based on Graph Convolutional Networks (GCNs) as the backbone network. We selected four time periods, including both the morning and evening rush hours, and compared the results with our method. The community detection results are shown in Figure 11.
Based on the feature visualization after the t-SNE dimensionality reduction (Figure 10), we found that the embedded features of our method exhibited better between-cluster separability and within-cluster cohesion after hyperbolic space was introduced. Thus, introducing hyperbolic space can improve the separability of communities. In addition, we found that the features obtained from the hyperbolic graph convolutional embedding approximated a circular distribution.
Comparing the community detection results of our method and the ablation experiments based on GCN modules (Figure 11), we found that the proposed method detected more fine-grained communities. For example, during the time period from 6 to 8 a.m., community C1 identified by the GCN-based method corresponded to community C2 and community C12 identified by the proposed method; during the time period from 8 to 10 a.m., community C7 identified by the GCN-based method corresponded to community C4 and community C6 identified by the proposed method; during the time period from 4 to 6 p.m., community C5 identified by the GCN-based method corresponded to community C3 and community C10 identified by the proposed method; and in the time period from 6 to 8 p.m., community C2 identified by the GCN-based method corresponded to community C4 and communities C10 and C14 identified by the proposed method.
Since most residents have the same travel purpose during weekday commuting times, the community structures at these times should be similar across weekdays. To further validate the stability of the community detection results of the proposed method, we used this as a reference and conducted experiments on snapshots containing weekday commute times. Figure 12 shows the community detection results; comparing the communities vertically within each column shows that the locations of the larger communities were basically the same. The results show that our method could consistently detect relatively stable community structures, indirectly confirming the stability of our model.
In addition, we conducted experiments using data from the same time snapshots on weekends, and the community detection results obtained are shown in Figure 13. Comparing the community detection results between weekdays and weekends, it is evident that the community structure on weekends differed significantly from that on weekdays, especially during Sunday mornings from 8 to 10 a.m., where the community structure appeared fragmented, indicating a diversified travel pattern during this time period.
Overall, our comparative analysis between weekdays and weekends confirms the differences in travel patterns between these two time periods.

4.3.2. Exploring the Scale Dependence of Spatial Interaction Data

Considering the scale dependence of the spatial interactions in different aggregation units, we conducted experiments using multiscale regular grids. Based on the size of the study area, we divided the area into 0.5 km, 1 km, 2 km, and 3 km regular grids and performed community detection using our method. We visualized the community detection results under the different grid sizes for four time snapshots, including both morning and evening rush hours (Figure 14). When a 0.5 km grid was used as the study unit, the detected community structure was sparse and scattered. When grids of 1 km, 2 km, and 3 km were used as research units, the numbers of spatial communities detected were 24, 17, and 7, respectively, during 6–8 a.m.; 22, 13, and 7 during 8–10 a.m.; 26, 13, and 8 during 4–6 p.m.; and 23, 13, and 8 during 6–8 p.m. Comparing the results of the different aggregation units shows that, as the area of the research unit increased, the number of detected communities decreased, and that aggregation units of different scales led to different results.

4.4. Identifying the Evolving Patterns of Spatial Communities

The evolution of spatial communities over time was captured by comparing the similarities of communities in consecutive time snapshots, describing the dynamic patterns of urban spatial structures. Following the method described in Section 3.4, we determined the evolution types of communities by comparing the calculated inclusion indexes and community sizes. Given that the recommended range for the hyperparameters $\delta_1$ and $\delta_2$ is [0, 1], we set both to 0.7 based on the specific experimental data. The statistics of the community evolution types obtained are presented in Table 8. To visually represent the overall evolution trend of spatial communities within a day, we visualize the flow of communities across the 12 time snapshots (Figure 15).
Based on these observations, the number of evolution types identified during the morning peak hours (S4–5 and S5–6), evening peak hours (S8–9 and S9–10), and nighttime (S10–11 and S11–12) was lower than that in other time intervals, indicating a more singular travel purpose during these times. In addition, among the identified community evolution types, birth and death (dissolution) events occurred more frequently across the time snapshots, while continuity events occurred less frequently, reflecting rapid changes in the spatial structure of the city. Generally, as spatial interactions dynamically altered the spatial organization of urban areas, the spatiotemporal characteristics of spatial interaction networks were validated and further elucidated during network evolution. This underscores the suitability of networked communities with spatiotemporal properties for studying the spatial structure of dynamic cities.
To further investigate the evolution of communities, we selected communities covering various types of evolution for visualization (Figure 16). The thumbnail in the top left corner represents the geographical location of the community. Figure 16a illustrates the “growth” event of a community, indicating that, during the period of an increased travel volume in the morning peak hours, community C1 expanded, connecting some areas of Dongcheng District, Xicheng District, and parts of Fengtai District. Figure 16b demonstrates the “continuity” event of a community, showing that, during the period from 12 a.m. to 4 p.m. when the travel volume remained relatively constant, the structure of community C3 remained stable. Figure 16c depicts the “death” event of a community, indicating that, during the period from midnight to 4 a.m. when the travel volume gradually decreased, community C4 underwent dissolution. Figure 16d displays the “split” event of a community, showing that, during the late stages of the evening peak hours when the travel volume increased again, community C5 split into communities C6 and C7. Figure 16e illustrates the “shrinkage” event of a community, indicating that, during the early stages of the evening peak hours, the size of community C8 decreased and travel purposes became more uniform. Figure 16f presents the “merge” event of communities, where communities C10, C11, and C12 merged into community C13, indicating increased interactions between communities, leading to connection formation. Figure 16g demonstrates the “birth” event of a community, showing that, during the early stages of the morning peak hours, residents began to travel, resulting in the emergence of a new community C14.
From the perspective of practical application, the spatial communities detected from the taxi travel data reflect the areas of the city where people travel relatively frequently. To further analyze the practical value of these results, we selected the community “growth” event in Figure 16a as a specific example and visualized the spatial distributions of community C1 and community C2, as shown in Figure 17. With the proposed method, we found that community C1 detected in time snapshot S4 evolved into community C2 in time snapshot S5, corresponding to the real-life morning rush hour commuting scenario. In this scenario, the significant increase in the number of people traveling can easily cause traffic congestion, which reduces travel efficiency. This analysis provides a deeper understanding of the dynamic organization of urban space. Therefore, as a policy recommendation, we suggest that the spatial distribution of communities and their evolution types be used to assist the dynamic scheduling and optimization of urban resources in order to alleviate traffic pressure and improve urban travel efficiency.

5. Discussion

In this study, a framework for spatial community detection and evolution identification based on hyperbolic graph convolution and graph generation was proposed, consisting of two main parts: spatial community detection and spatial community evolution identification. To validate the effectiveness and advantages of the community detection method, we conducted extensive experiments. First, we performed comparative experiments using four quantitative indicators: modularity, average density, average conductance, and average clustering coefficient. The results demonstrated that the community structures detected by our method performed well on these indicators. Second, to further test the effectiveness of the method, we conducted three comparative analyses: (1) we compared the proposed method with the Leiden community detection method and analyzed the commonalities and differences in collective travel patterns on weekdays and weekends, and the results confirmed the effectiveness of our method; (2) we conducted ablation experiments comparing the proposed method with a community detection method based on Graph Convolutional Networks (GCNs), and the results demonstrated the advantages of embedding in hyperbolic space; and (3) we performed community detection under multi-scale regular grid networks, and the results confirmed the scale dependence of spatial interaction data. Furthermore, we analyzed the dynamic evolution of spatial communities over time and selected spatial communities covering seven types of evolution for further analysis based on collective travel patterns. These specific evolutionary events can reveal significant changes in the dynamic organization of urban spatial structures. In summary, the proposed method considers both the structural features and the dynamics of the spatial interaction network, making it suitable for detecting community structures in geographic spaces.

6. Conclusions

This study proposed a framework for spatial community detection and evolution identification that couples hyperbolic graph convolution with graph generation, and applied it to identify the dynamic organizational structure of urban space and its evolution. The strengths of this study are twofold. First, the framework better captures the non-Euclidean structure and spatial heterogeneity in spatial interaction networks, which improves the accuracy and refinement of spatial community detection. Second, by introducing the time dimension into the analysis of spatial communities, it provides a new tool for understanding the dynamic evolution of spatial community structure. However, the proposed method has a higher time complexity than traditional community detection algorithms.
Future research can be improved in the following aspects: (1) The consideration of more geographic features. In addition to the geographic spatial location features and interaction volume used in our study, future research can incorporate geographic semantic features, such as Points of Interest (POI) or land use data. These data can help to differentiate the types of spatial communities more precisely. (2) The integration of multi-source spatial interaction data for research: since the urban space is a complex dynamic system, single-source flow data only reflect one aspect of urban spatial interactions. Future research can conduct community detection studies on networks constructed from multi-source spatial interaction data. By discovering the intrinsic correlations among multi-source flow data, it can achieve the tight coupling between multi-source flow data [52], thus more comprehensively exploring the spatial interactions between urban spatial units and urban spatial communities.

Author Contributions

Conceptualization, Qiu Yang, Rong Gui, Huimin Liu and Jianbo Tang; data curation, Qiu Yang; methodology, Huimin Liu, Qiu Yang and Rong Gui; visualization, Qiu Yang, Huimin Liu and Xuexi Yang; writing—original draft, Huimin Liu and Qiu Yang; writing—review and editing, Huimin Liu, Qiu Yang, Xuexi Yang, Rong Gui and Min Deng. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by grants from the Natural Science Foundation of Hunan Province (2022JJ40585), the National Natural Science Foundation of China (42271485 and 42171459), the Frontier Cross Research Project of Central South University (2023QYJC002), the Hunan Province Natural Resources Science and Technology Project (20230121XX), and the Jiangxi Province “Double Thousand Plan” the third batch of short-term projects to introduce innovative leading talents (jxsq2020102062).

Data Availability Statement

Data available on request due to restrictions.

Acknowledgments

This work was carried out in part using computing resources at the High-Performance Computing Platform of Central South University.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Guan, X.; Wei, H.; Lu, S.; Dai, Q.; Su, H. Assessment on the urbanization strategy in China: Achievements, challenges and reflections. Habitat Int. 2018, 71, 97–109.
  2. Xiao, X.; Xie, X.; Li, J.; Xie, X.; Xue, B. Urban spatial structural change and transformation in the new era. Bull. Chin. Acad. Sci. 2023, 38, 1118–1129.
  3. Griffith, D.A.; Jones, K.G. Explorations into the relationship between spatial structure and spatial interaction. Environ. Plan. A 1980, 12, 187–201.
  4. Liu, Y.; Yao, X.; Gong, Y.; Kang, C.; Shi, X.; Wang, F.; Wang, J.; Zhang, Y.; Zhao, P.; Zhu, D.; et al. Analytical methods and applications of spatial interactions in the era of big data. Acta Geogr. Sin. 2020, 75, 1523.
  5. Pei, T.; Shu, H.; Guo, S.H.; Song, C.; Chen, J.; Liu, Y.; Wang, X. The Concept and Classification of Spatial Patterns of Geographical Flow. J. Geo-Inf. Sci. 2020, 22, 30–40.
  6. Yang, X.; Fang, Z. Recent progress in studying human mobility and urban spatial structure based on mobile location big data. Prog. Geogr. 2018, 37, 880–889.
  7. Liu, Y.; Liu, Q.; Deng, M.; Shi, Y. Recent advance and challenge in geospatial big data mining. Acta Geod. Cartogr. Sin. 2022, 51, 1544.
  8. Barbosa, H.; Barthelemy, M.; Ghoshal, G.; James, C.R.; Lenormand, M.; Louail, T.; Menezes, R.; Ramasco, J.E.J.; Simini, F.; Tomasini, M. Human mobility: Models and applications. Phys. Rep. 2018, 734, 1–74.
  9. Liu, Y.; Zhan, Z.; Zhu, D.; Chai, Y.; Ma, X.; Wu, L. Incorporating Multi-Source Big Geo-Data to Sense Spatial Heterogeneity Patterns in an Urban Space. Geomat. Inf. Sci. Wuhan Univ. 2018, 43, 327–335.
  10. Tu, W.; Cao, J.; Gao, Q.; Cao, R.; Fang, Z.; Yue, Y.; Li, Q. Sensing Urban Dynamics by Fusing Multi-Sourced Spatiotemporal Big Data. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 1875–1883.
  11. Muscoloni, A.; Thomas, J.M.; Ciucci, S.; Bianconi, G.; Cannistraci, C.V. Machine learning meets complex networks via coalescent embedding in the hyperbolic space. Nat. Commun. 2017, 8, 1615.
  12. Jankowski, R.; Allard, A.; Boguñá, M.; Serrano, M.Á. The D-Mercator method for the multidimensional hyperbolic embedding of real networks. Nat. Commun. 2023, 14, 7585.
  13. Boguna, M.; Bonamassa, I.; De Domenico, M.; Havlin, S.; Krioukov, D.; Serrano, M.A.N. Network geometry. Nat. Rev. Phys. 2021, 3, 114–135.
  14. Ye, D.; Jiang, H.; Jiang, Y.; Wang, Q.; Hu, Y. Community preserving mapping for network hyperbolic embedding. Knowl.-Based Syst. 2022, 246, 108699.
  15. Cao, J.; Wang, Y.; Bu, Z.; Wang, Y.; Tao, H.; Zhu, G. Compactness preserving community computation via a network generative process. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 6, 1044–1056.
  16. Jin, D.; Yu, Z.; Jiao, P.; Pan, S.; He, D.; Wu, J.; Philip, S.Y.; Zhang, W. A survey of community detection approaches: From statistical modeling to deep learning. IEEE Trans. Knowl. Data Eng. 2021, 35, 1149–1170.
  17. Su, X.; Xue, S.; Liu, F.; Wu, J.; Yang, J.; Zhou, C.; Hu, W.; Paris, C.; Nepal, S.; Jin, D.; et al. A comprehensive survey on community detection with deep learning. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 4682–4702.
  18. Rossetti, G.; Cazabet, R.E.M. Community discovery in dynamic networks: A survey. ACM Comput. Surv. (CSUR) 2018, 51, 1–37.
  19. Bródka, P.; Saganowski, S.; Kazienko, P. GED: The method for group evolution discovery in social networks. Soc. Netw. Analys. Min. 2013, 3, 1–14.
  20. Kernighan, B.W.; Lin, S. An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 1970, 49, 291–307.
  21. Barnes, E.R. An algorithm for partitioning the nodes of a graph. SIAM J. Algebr. Discret. Methods 1982, 3, 541–550.
  22. Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques; MIT Press: Cambridge, MA, USA, 2009.
  23. Karrer, B.; Newman, M.E. Stochastic blockmodels and community structure in networks. Phys. Rev. E 2011, 83, 16107.
  24. Airoldi, E.M.; Blei, D.; Fienberg, S.; Xing, E. Mixed membership stochastic blockmodels. Adv. Neural Inf. Process. Syst. 2008, 9, 1981–2014.
  25. Malliaros, F.D.; Vazirgiannis, M. Clustering and community detection in directed networks: A survey. Phys. Rep. 2013, 533, 95–142.
  26. Newman, M.E. Fast algorithm for detecting community structure in networks. Phys. Rev. E 2004, 69, 66133.
  27. Clauset, A.; Newman, M.E.; Moore, C. Finding community structure in very large networks. Phys. Rev. E 2004, 70, 66111.
  28. Abduljabbar, D.A.; Hashim, S.Z.M.; Sallehuddin, R. Nature-inspired optimization algorithms for community detection in complex networks: A review and future trends. Telecommun. Syst. 2020, 74, 225–252.
  29. Blondel, V.D.; Guillaume, J.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008.
  30. Traag, V.A.; Waltman, L.; Van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233.
  31. Expert, P.; Evans, T.S.; Blondel, V.D.; Lambiotte, R. Uncovering space-independent communities in spatial networks. Proc. Natl. Acad. Sci. USA 2011, 108, 7663–7668.
  32. Griffith, D.A. Spatial structure and spatial interaction: A review. Environ. Plan. A 1976, 8, 731–740.
  33. Gao, S.; Liu, Y.; Wang, Y.; Ma, X. Discovering spatial interaction communities from mobile phone data. Trans. GIS 2013, 17, 463–481.
  34. Chen, Y.; Xu, J.; Xu, M. Finding community structure in spatially constrained complex networks. Geogr. Inf. Syst. 2015, 29, 889–911.
  35. Guo, D.; Jin, H.; Gao, P.; Zhu, X. Detecting spatial community structure in movements. Geogr. Inf. Syst. 2018, 32, 1326–1347.
  36. Fang, Y.; Wang, Z.; Cheng, R.; Li, X.; Luo, S.; Hu, J.; Chen, X. On spatial-aware community search. IEEE Trans. Knowl. Data Eng. 2018, 31, 783–798.
  37. Wan, Y.; Liu, Y. DASSCAN: A density and adjacency expansion-based spatial structural community detection algorithm for networks. ISPRS Int. J. Geo-Inf. 2018, 7, 159.
  38. Chen, Y.; Baker, J.W. Community detection in spatial correlation graphs: Application to non-stationary ground motion modeling. Comput. Geosci. 2021, 154, 104779.
  39. Fortunato, S.; Newman, M.E. 20 years of network community detection. Nat. Phys. 2022, 18, 848–850.
  40. Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 2019, 6, 11.
  41. Jin, D.; Liu, Z.; Li, W.; He, D.; Zhang, W. Graph Convolutional Networks Meet Markov Random Fields: Semi-Supervised Community Detection in Attribute Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 152–159.
  42. Bhattacharya, R.; Nagwani, N.K.; Tripathi, S. CommunityGCN: Community detection using node classification with graph convolution network. Data Technol. Appl. 2023, 57, 580–604.
  43. Sun, H.; He, F.; Huang, J.; Sun, Y.; Li, Y.; Wang, C.; He, L.; Sun, Z.; Jia, X. Network embedding for community detection in attributed networks. ACM Trans. Knowl. Discov. Data (TKDD) 2020, 14, 1–25.
  44. Liu, H.; Wei, J.; Xu, T. Community detection based on community perspective and graph convolutional network. Expert Syst. Appl. 2023, 231, 120748.
  45. Liang, Y.; Zhu, J.; Ye, W.; Gao, S. Region2vec: Community Detection on Spatial Networks Using Graph Embedding with Node Attributes and Spatial Interactions. In Proceedings of the 30th International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 1–4 November 2022; pp. 1–4.
  46. Tsitsulin, A.; Palowitch, J.; Perozzi, B.; Müller, E. Graph clustering with graph neural networks. J. Mach. Learn. Res. 2023, 24, 1–21.
  47. Jin, D.; Wang, X.; He, D.; Dang, J.; Zhang, W. Robust detection of link communities with summary description in social networks. IEEE Trans. Knowl. Data Eng. 2019, 33, 2737–2749.
  48. Yang, J.; Leskovec, J. Community-Affiliation Graph Model for Overlapping Network Community Detection. In Proceedings of the IEEE 12th International Conference on Data Mining, Brussels, Belgium, 10–13 December 2012; pp. 1170–1175.
  49. Chami, I.; Ying, Z.; Ré, C.; Leskovec, J. Hyperbolic graph convolutional neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 4869–4880.
  50. Shchur, O.; Günnemann, S. Overlapping community detection with graph neural networks. arXiv 2019, arXiv:1909.12201.
  51. Wong, D.W. The modifiable areal unit problem (MAUP). In WorldMinds: Geographical Perspectives on 100 Problems: Commemorating the 100th Anniversary of the Association of American Geographers 1904–2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 571–575.
  52. Shen, J.; Zong, H.; Chen, M. Identifying city communities in China by fusing multisource flow data. Int. J. Digit. Earth 2023, 16, 4247–4264.
Figure 1. Visualization of network structure and node degree statistics. (a) Visualization of the network structure on the Poincaré disk. The red, green, and blue nodes represent network nodes with degrees ranging from high to low, respectively, and the black lines represent the network edges. (b) Visualization of network node degree statistics.
Figure 2. Framework of our proposed community detection method: It mainly includes: (a) construction of spatial interaction networks, (b) hyperbolic graph embedding, (c) community detection based on graph generation theory, and (d) community evolution identification.
Figure 3. Illustration of constructing spatial interaction networks. (a) Construction of spatial interaction networks. (b) Abstraction of spatial interaction networks at adjacent time snapshots.
Figure 4. Abstract form of network embedding with hyperbolic graph convolution module.
Figure 5. The connection between spatial community detection results and the original network.
Figure 6. Map of the study area: (a) The geographical location of the study area and its extent. (b) The specific descriptions within the study area.
Figure 7. The variation in the loss function and modularity during the training process. (a) Network data organized by Monday’s time snapshot S4 and (b) network data organized by Monday’s time snapshot S9.
Figure 8. Community detection results. Different colors represent detected community categories. The first row represents the community detection results of the Leiden method, and the second row represents the community detection results of our proposed method.
Figure 9. Community detection results. (a) Community detection results of our proposed method and (b) community detection results of the Leiden method.
Figure 10. t-SNE dimensionality reduction visualization. Different colors represent different communities. The first row visualizes the embedding features based on the GCN method, while the second row visualizes the embedding features of our method.
Figure 11. Community detection results. Different colors represent detected community categories. The first row represents the community detection results based on GCN, while the second row represents the community detection results of our method.
Figure 12. Visualization of community detection results in four time snapshots on weekdays.
Figure 13. Visualization of community detection results for four time snapshots on weekends.
Figure 14. Community detection results. Colors denote different community types. Rows 1–4 depict community detection for grid sizes of 0.5 km, 1 km, 2 km, and 3 km, respectively.
Figure 15. Evolution of spatial community over time in different time snapshots. A color block on each time snapshot represents a community, and the length of the block represents the number of communities, so that the evolution and development trend of communities can be tracked by following the flow of data.
Figure 16. Visualization of community evolution. (a) “growth” evolution event; (b) “continuity” evolution event; (c) “death” evolution event; (d) “split” evolution event; (e) “shrinkage” evolution event; (f) “merge” evolution event; and (g) “birth” evolution event.
Figure 17. Visualization of the spatial distribution of community: (a) the spatial distribution of community C1 and (b) the spatial distribution of community C2.
Table 1. The schematic diagram of community evolution types and discriminatory criteria for evolution types.

| Types | Judgmental Condition |
| --- | --- |
| Birth | For $C_j^{t}$ in $t$ and each community $C_i^{t-1}$ in $t-1$: $I(C_i^{t-1}, C_j^{t}) < \delta_1$, $I(C_j^{t}, C_i^{t-1}) < \delta_2$. |
| Growth | Condition 1: $I(C_i^{t-1}, C_j^{t}) \geq \delta_1$ and $I(C_j^{t}, C_i^{t-1}) \geq \delta_2$ and $\lvert C_i^{t-1} \rvert \leq \lvert C_j^{t} \rvert$, or $I(C_i^{t-1}, C_j^{t}) \geq \delta_1$ and $I(C_j^{t}, C_i^{t-1}) < \delta_2$ and $\lvert C_i^{t-1} \rvert \leq \lvert C_j^{t} \rvert$; Condition 2: the matched communities are one-to-one. |
| Split | Condition 1: $I(C_i^{t-1}, C_j^{t}) < \delta_1$ and $I(C_j^{t}, C_i^{t-1}) \geq \delta_2$ and $\lvert C_i^{t-1} \rvert \geq \lvert C_j^{t} \rvert$; Condition 2: the matched communities are one-to-many. |
| Merge | Condition 1: $I(C_i^{t-1}, C_j^{t}) \geq \delta_1$ and $I(C_j^{t}, C_i^{t-1}) < \delta_2$ and $\lvert C_i^{t-1} \rvert \leq \lvert C_j^{t} \rvert$; Condition 2: the matched communities are many-to-one. |
| Shrinkage | Condition 1: $I(C_i^{t-1}, C_j^{t}) \geq \delta_1$ and $I(C_j^{t}, C_i^{t-1}) \geq \delta_2$ and $\lvert C_i^{t-1} \rvert \geq \lvert C_j^{t} \rvert$, or $I(C_i^{t-1}, C_j^{t}) < \delta_1$ and $I(C_j^{t}, C_i^{t-1}) \geq \delta_2$ and $\lvert C_i^{t-1} \rvert \geq \lvert C_j^{t} \rvert$; Condition 2: the matched communities are one-to-one. |
| Death | For $C_i^{t-1}$ in $t-1$ and each community $C_j^{t}$ in $t$: $I(C_i^{t-1}, C_j^{t}) < \delta_1$, $I(C_j^{t}, C_i^{t-1}) < \delta_2$. |
| Continuity | $I(C_i^{t-1}, C_j^{t}) \geq \delta_1$ and $I(C_j^{t}, C_i^{t-1}) \geq \delta_2$ and $\lvert C_i^{t-1} \rvert = \lvert C_j^{t} \rvert$. |
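
The pairwise criteria in Table 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the inclusion measure is simplified to the share of common members, the thresholds default to a hypothetical 0.5, the missing comparison signs are read as non-strict inequalities, and the names `inclusion` and `classify_pair` are invented for the example. Birth and death are not pairwise events; they correspond to a community for which every pair returns `None`.

```python
from typing import Optional, Set


def inclusion(a: Set[int], b: Set[int]) -> float:
    """Simplified inclusion of a in b: share of a's members also in b (assumption)."""
    return len(a & b) / len(a) if a else 0.0


def classify_pair(prev: Set[int], curr: Set[int],
                  delta1: float = 0.5, delta2: float = 0.5,
                  one_to_one: bool = True) -> Optional[str]:
    """Label the event linking a community at t-1 (prev) to one at t (curr).

    one_to_one states whether the match is one-to-one (True) or part of a
    one-to-many / many-to-one match (False). Returns None when no rule applies.
    """
    fwd = inclusion(prev, curr) >= delta1  # I(C_i^{t-1}, C_j^t) >= delta_1
    bwd = inclusion(curr, prev) >= delta2  # I(C_j^t, C_i^{t-1}) >= delta_2
    if not fwd and not bwd:
        return None  # unmatched pair: candidate for birth or death
    if fwd and bwd and len(prev) == len(curr):
        return "continuity"
    if fwd and len(prev) <= len(curr):
        if one_to_one:
            return "growth"
        return "merge" if not bwd else None
    if bwd and len(prev) >= len(curr):
        if one_to_one:
            return "shrinkage"
        return "split" if not fwd else None
    return None


# Hypothetical communities, given as sets of grid-cell identifiers.
print(classify_pair({1, 2, 3, 4}, {1, 2, 3, 4, 5, 6}))                 # growth
print(classify_pair({1, 2, 3, 4}, {1, 2, 3, 9}))                       # continuity
print(classify_pair(set(range(1, 9)), {1, 2, 3}, one_to_one=False))    # split
print(classify_pair({1, 2, 3}, set(range(1, 9)), one_to_one=False))    # merge
```

In a full pipeline the `one_to_one` flag would be derived by counting how many communities in the adjacent snapshot satisfy the inclusion conditions for a given community.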
Table 2. The original taxi trip data example.

| Field | Field Description | Data Example |
| --- | --- | --- |
| Ot | Pick-up time | 1,489,352,929 |
| Olat | Pick-up latitude | 40.00288 |
| Olon | Pick-up longitude | 116.39449 |
| Dt | Drop-off time | 1,489,355,684 |
| Dlat | Drop-off latitude | 40.07928 |
| Dlon | Drop-off longitude | 116.58233 |
| Deltat | Ride duration | 2755 |
| Owday | Start date | 1 |
| Ohour | Start time | 5 |
| Dwday | End date | 1 |
| Dhour | End time | 5 |
| Distance | Ride distance | 18.136932727754598 |
| Speed | Travel speed | 23.699803201421616 |
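
The derived fields in Table 2 can be checked against the raw fields of the example record. The sketch below is a reading of the table, not the authors' preprocessing code. It assumes that `Ot` and `Dt` are Unix timestamps in seconds, that the day and hour fields use Beijing local time (UTC+8) with Monday numbered 1, that `Distance` is in kilometres, and that `Speed` is in kilometres per hour. The haversine helper and the Earth radius are illustrative choices.

```python
import math
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # assumption: local time is UTC+8

# Example record from Table 2 (thousands separators removed from the timestamps).
trip = {
    "Ot": 1489352929, "Olat": 40.00288, "Olon": 116.39449,
    "Dt": 1489355684, "Dlat": 40.07928, "Dlon": 116.58233,
    "Distance": 18.136932727754598,
}


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0088):
    """Great-circle distance between two points on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


deltat = trip["Dt"] - trip["Ot"]             # ride duration in seconds
speed = trip["Distance"] / (deltat / 3600)   # km/h if Distance is in km

pickup = datetime.fromtimestamp(trip["Ot"], tz=BEIJING)
owday, ohour = pickup.isoweekday(), pickup.hour  # Monday = 1 under isoweekday()

straight_line = haversine_km(trip["Olat"], trip["Olon"], trip["Dlat"], trip["Dlon"])

print("Deltat:", deltat)
print("Speed:", speed)
print("Owday, Ohour:", owday, ohour)
print("Straight-line distance (km):", straight_line)
```

Under these assumptions the duration, weekday, and hour reproduce the tabulated values, the speed equals the distance divided by the duration, and the tabulated distance is close to the straight-line distance between the pick-up and drop-off points rather than a routed path length.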
Table 3. Determination of the hyperparameter C.

| C | 10 | 15 | 20 | 25 | 30 | 35 |
| --- | --- | --- | --- | --- | --- | --- |
| Modularity | 0.307 ± 0.015 | 0.330 ± 0.016 | 0.340 ± 0.010 | 0.352 ± 0.010 | 0.364 ± 0.005 | 0.294 ± 0.021 |
Table 4. Modularity metric comparison across time snapshots.

| Method | S4 | S5 | S9 | S10 |
| --- | --- | --- | --- | --- |
| Leiden | 0.322 ± 0.006 | 0.344 ± 0.002 | 0.338 ± 0.004 | 0.337 ± 0.004 |
| GCN-based | 0.306 ± 0.004 | 0.350 ± 0.004 | 0.340 ± 0.004 | 0.353 ± 0.003 |
| Ours | 0.330 ± 0.008 | 0.364 ± 0.005 | 0.346 ± 0.005 | 0.361 ± 0.003 |
Table 5. AD metric comparison across time snapshots.

| Method | S4 | S5 | S9 | S10 |
| --- | --- | --- | --- | --- |
| Leiden | 0.092 ± 0.000 | 0.146 ± 0.002 | 0.144 ± 0.004 | 0.134 ± 0.005 |
| GCN-based | 0.096 ± 0.002 | 0.180 ± 0.003 | 0.200 ± 0.001 | 0.203 ± 0.004 |
| Ours | 0.112 ± 0.001 | 0.173 ± 0.003 | 0.205 ± 0.002 | 0.215 ± 0.003 |
Table 6. AC metric comparison across time snapshots.

| Method | S4 | S5 | S9 | S10 |
| --- | --- | --- | --- | --- |
| Leiden | 0.404 ± 0.008 | 0.375 ± 0.018 | 0.393 ± 0.007 | 0.375 ± 0.035 |
| GCN-based | 0.507 ± 0.006 | 0.528 ± 0.010 | 0.559 ± 0.008 | 0.574 ± 0.020 |
| Ours | 0.480 ± 0.004 | 0.441 ± 0.009 | 0.537 ± 0.010 | 0.560 ± 0.021 |
Table 7. ACC metric comparison across time snapshots.

| Method | S4 | S5 | S9 | S10 |
| --- | --- | --- | --- | --- |
| Leiden | 0.004 ± 0.000 | 0.012 ± 0.001 | 0.012 ± 0.002 | 0.009 ± 0.002 |
| GCN-based | 0.004 ± 0.002 | 0.017 ± 0.002 | 0.024 ± 0.000 | 0.022 ± 0.002 |
| Ours | 0.006 ± 0.001 | 0.022 ± 0.001 | 0.028 ± 0.001 | 0.032 ± 0.003 |
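Table 8. Evolution types and numbers of spatial communities between adjacent time snapshots.

| Types | S1–2 | S2–3 | S3–4 | S4–5 | S5–6 | S6–7 | S7–8 | S8–9 | S9–10 | S10–11 | S11–12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Birth | 10 | 10 | 11 | 6 | 8 | 11 | 10 | 2 | 4 | 5 | 2 |
| Growth | 1 | / | 1 | 3 | 1 | 2 | 3 | 2 | 3 | 1 | / |
| Split | / | / | / | / | 2 | / | / | / | / | 2 | / |
| Merge | / | / | / | / | / | 2 | / | 3 | / | / | 2 |
| Shrinkage | / | / | / | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 1 |
| Death | 12 | 13 | 3 | 6 | 7 | 7 | 10 | 10 | 1 | 3 | 6 |
| Continuity | / | / | / | / | / | / | 5 | / | / | / | / |
| Count | 23 | 23 | 15 | 16 | 20 | 23 | 30 | 18 | 10 | 12 | 11 |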