Article

Graph-Indexed kNN Query Optimization on Road Network

Faculty of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
*
Authors to whom correspondence should be addressed.
Electronics 2023, 12(21), 4536; https://doi.org/10.3390/electronics12214536
Submission received: 7 October 2023 / Revised: 28 October 2023 / Accepted: 1 November 2023 / Published: 3 November 2023
(This article belongs to the Special Issue Data Privacy in IoT Networks)

Abstract

The nearest neighbors query problem on road networks constitutes a crucial aspect of location-oriented services and has useful practical implications; e.g., it can locate the k nearest hotels. However, researchers who study road networks still encounter obstacles due to the inherent limitations of existing methods with respect to object mobility. Popular methods employ indexes that store intermediate results to improve query-time efficiency, but such indexes are often accompanied by high construction time and space costs. To balance the costs of time and space, a lightweight flow graph index is proposed to reduce the quantity of candidate nodes, and with this index the results of a kNN query can be obtained efficiently. Experiments on real road networks confirm the efficiency and accuracy of our optimized algorithm.

1. Introduction

The top-k nearest neighbors (kNN) query in a road network is an essential problem in location-based services. The kNN search is applied in various practical contexts. For example, when using hospitality reservation services such as Tongcheng-Elong, it is crucial to show a range of venues near the user’s location. Similarly, in vehicle-for-hire platforms like Didi Chuxing, it is advantageous to offer a variety of vehicles within a short distance of the user.
Given a road network G(V,E), a set of candidate objects M, and a query vertex q, the goal is to find the k objects in M that are closest to q. Figure 1 shows an example road network G in which a moving object m_1 lies on the edge from v_6 to v_3, with dist(v_3, v_6) = 2. Assuming all vertices are candidates and given a query vertex v_3, the four nearest neighbors of v_3 are v_3, v_6, v_7, and v_10, with shortest network distances to v_3 of 0, 2, 3, and 4, respectively; the three nearest moving objects are then m_1, m_2, and m_5.
The kNN query problem on road networks has seen various research results. The simplest and most direct solution is to use Dijkstra’s algorithm [1]; i.e., given a query vertex q, search for vertices in order of non-decreasing distance to q, ending when k objects are found. Although this method is simple, its time cost is high, especially when the dataset is relatively large. The state-of-the-art algorithm for the kNN problem on road networks is TEN-Query [2], which uses the TEN-Index, an index based on tree decomposition: the graph is mapped into a tree structure, and the kNN results for each vertex are precomputed and stored in the index. However, the construction of the index is limited by the value of k, and the algorithm cannot answer queries when k exceeds the value used at construction time. We design an index that breaks this limit by storing all candidate vertices in the local range of each vertex, and we construct the underlying tree decomposition with a new average-weight-based generation algorithm.
The main contributions of this paper are as follows:
(1)
Conventional minimum-degree-based tree decomposition is refined with an average-weight principle, and we propose a flow graph index generated from the graph-decomposed tree, which effectively preserves distance information within the road network;
(2)
Based on the flow graph index, a kNN query algorithm is proposed that significantly reduces the retrieval space of candidate nodes, thereby enhancing the performance of kNN queries on road networks;
(3)
Experiments on real road networks confirm the efficiency and accuracy of the proposed algorithm.
The remainder of this paper is organized as follows. Section 2 reviews the related literature. Section 3 defines the problem. Section 4 explains the index-construction methodology and the query algorithm. Section 5 presents our experimental results. Section 6 concludes the paper.

2. Related Work

The kNN query can be based on Euclidean space or a transportation network. We concentrate on kNN queries within road networks, which can be in either a static or dynamic environment.
The most elementary solution for kNN queries is Dijkstra’s algorithm. Given a query point q, the algorithm searches for vertices in non-decreasing order of distance from the remaining vertices, stopping when k objects are found. Although effective, this is time-consuming, especially in large networks and when candidate objects are far away.
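For reference, a minimal sketch of this Dijkstra-based baseline is given below in Java (the language of our implementation, Section 5.1); the adjacency-list representation and the isCandidate array are illustrative assumptions rather than part of any surveyed system.

```java
import java.util.*;

/** Minimal Dijkstra-based kNN baseline: expand vertices in non-decreasing
 *  distance from q and stop once k candidate objects have been reached. */
public class DijkstraKnn {
    // adj.get(u) = list of {neighbor, weight} pairs (illustrative representation)
    static List<long[]> knn(List<List<long[]>> adj, boolean[] isCandidate, int q, int k) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[q] = 0;
        PriorityQueue<long[]> heap =                       // min-heap on tentative distance
                new PriorityQueue<>(Comparator.comparingLong((long[] a) -> a[1]));
        heap.add(new long[]{q, 0});
        boolean[] settled = new boolean[adj.size()];
        List<long[]> result = new ArrayList<>();           // pairs {vertex, distance}
        while (!heap.isEmpty() && result.size() < k) {
            long[] top = heap.poll();
            int u = (int) top[0];
            if (settled[u]) continue;                      // skip outdated heap entries
            settled[u] = true;
            if (isCandidate[u]) result.add(new long[]{u, dist[u]});
            for (long[] e : adj.get(u)) {                  // relax outgoing edges
                int v = (int) e[0];
                long w = e[1];
                if (dist[u] != Long.MAX_VALUE && dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    heap.add(new long[]{v, dist[v]});
                }
            }
        }
        return result;                                     // k nearest candidates (or fewer)
    }
}
```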
The grid-based index is a typical index structure for road network queries. D. He et al. [3] proposed a grid-based kNN index, GLAD, which divides the road network into 2^x × 2^x grid cells according to vertex latitude and longitude and maintains a list of contained objects for each cell. To answer a kNN query, GLAD searches for objects starting from the cell containing the query vertex and updates the results by exploring adjacent cells.
Dong Tianyang et al. [4] introduced direction-aware kNN (DAKNN) for moving objects on road networks, which employs an R-tree and a simple grid as fundamental structures to index the static road network and moving objects, respectively. An azimuth represents the moving direction of a target in the road network, and a local network extension method is used to quickly judge the direction of the moving target.
Graph partition-based indices are also used in kNN queries. Bilong Shen et al. [5] proposed a V-tree index to support efficient kNN searches and dynamic updates for moving objects. They iteratively divide the road network into sub-networks, build a tree structure on top of these, and associate a moving object with the nearest vertex in the V-tree. When updating the position of an object, only the tree nodes on the path from the corresponding leaf node to the root must be updated.
Zhong R et al. [6] proposed a balanced search tree index, G-tree, recursively dividing a road network into sub-networks, where each G-tree node corresponds to a sub-network. They introduced a best-first search algorithm on road networks, and an assembly-based approach to efficiently compute the minimum distance from a G-tree node to a query location. Li Z et al. [7] proposed G*-tree, an improved kNN index structure, establishing shortcuts between selected leaf nodes to address the low query efficiency of G-tree.
Dan T et al. [8] proposed LG-tree, which partitions large graphs into subgraphs and indexes these using balanced trees. LG-tree constructs a hierarchical structure based on Dynamic Index Filtering (DIF) and boundary vertices, using stage-based dynamic programming to ensure the efficiency of the shortest distance search.
Bao JL [9] introduced the index structures used in nearest neighbor queries, discussed current challenges, and explored future research directions for nearest neighbor query techniques in road networks. Li L et al. [10] addressed the problem of continuously reporting alternative paths for users traveling along a given path by accessing vertices in ascending order of maximum depth value, so as to improve computational efficiency and enhance accuracy. Yuan-Ko Huang et al. [11] addressed the kNN skyline in dynamic road networks, conducting queries using three data structures: the object attribute control matrix (OADM), the road distance ranked list (RDSL), and the skyline object extension tree (SOET). R. Halin [12] introduced the concept of tree decomposition in 1976, expediting graph problem calculations by mapping graphs into tree structures. Dian Ouyang et al. [13] built on this with H2H-Query, a method to efficiently calculate the shortest distance between two vertices on a road network: the shortest distance from each vertex to each of its ancestor nodes is stored, enabling the shortest network distance between any two points to be calculated in O(w) time. They subsequently proposed a tree-decomposition-based index, TEN-Index, which precalculates the shortest distance from each vertex to its ancestor vertices and stores each vertex’s local kNN, which reduces query times and enhances kNN performance.
There are other kNN indices and algorithms. Hyung-Ju et al. [14] proposed an efficient continuous search algorithm, UNICONS, using Dijkstra’s shortest path algorithm and a precalculated NN list to enhance NN query performance. Maytham Safar [15] used network distance rather than Euclidean distance as the shortest path metric. Guang Zhong Sun [16] suggested a straightforward, effective precomputation technique for kNN calculations on extensive road networks, selecting a suitable representative node set R (a subset of V) from road network G(V,E) and precomputing the distance value for any pair in R.
In response to the natural extension of kNN queries for multiple moving objects, i.e., Aggregate k Nearest Neighbor ( A k NN) queries, Abeywickrama T [17] proposed the Compressed Object Landmark Tree (COLT) data structure, which implements efficient hierarchical graph traversal and performs various aggregation functions effectively. Bareche I et al. [18] proposed a Velocity Spatiotemporal (VeST) indexing approach for continuous queries, mainly Continuous K-Nearest Neighbor ( C K NN) and continuous range queries using Apache Spark. The proposed structure is based on a selective velocity-partition method, which reduces the update cost and improves the response time and query accuracy. Ying Ju Chen et al. [19] proposed the SQUARE algorithm, constructing a network index resembling the decoupling model while using coupling concepts to maintain kNN information relative to hotspot areas in the network index. Hotspots represent areas where frequent queries are found in history. Aye Thida Hlaing et al. [20] proposed a fast R k NN search algorithm using Simple Materialized Path View (SMPV), employing an Incremental Euclidean Restriction (IER) strategy for swift kNN queries. Luo et al. [21] proposed an index-based kNN query solution, Toain, based on a shrinkage hierarchy, and utilizing a shortcut-based index, SCOB, to accelerate query processing. Chuanwen Li et al. [22] proposed a kNN query algorithm using an index structure based on a lazy update strategy, capitalizing on both CPU and GPU resources. The region being queried is first identified, and the GPU is used to update the index for that region. The index is queried using the GPU and a candidate result set is generated, which is refined by the CPU to obtain the final result.

3. Problem Definitions

We study the kNN query problem on road networks based on tree decomposition, with symbols as defined in Table 1.
The road network is modeled as a weighted undirected graph G(V,E), where V is a vertex set and E is an edge set. For an edge e(v_1, v_2), the weight w(e) or w(v_1, v_2) represents the length of the edge. A path p is a vertex sequence p = (v_1, v_2, ..., v_k), where (v_i, v_{i+1}) ∈ E for any 1 ≤ i < k, and the weight of p is expressed as w(p) = Σ_{i=1}^{k−1} w(v_i, v_{i+1}). Given two vertices v_1 and v_2, the shortest path between them is a path p from v_1 to v_2 whose weight w(p) is minimal. In this paper, dist(v_1, v_2) denotes the length of the shortest path from v_1 to v_2.
We study the road network kNN query in a static environment, i.e., when both the query point and the candidate objects are static. For the dynamic road network kNN query, a moving candidate object can be mapped, with an offset, onto a neighboring vertex; thus, the query can be transformed into one in a static environment. Specifically, given a query point q and a moving object m, assume that q is located on the edge e_1 = (v_s, v_e) at distance d_q from the vertex v_e, and that m is located on the edge e_2 = (v_s′, v_e′) at distance d_m from the vertex v_s′. Then, the distance between q and m is expressed as dist(q, m) = dist(v_e, v_s′) + d_q + d_m. This paper defines the kNN query problem as follows.
Definition 1 
(kNN Query). Given a road network G(V,E), a query point q, an integer k, and a set of candidate vertices M (|M| > k), the kNN query aims to calculate a set R that satisfies:
1. |R| = k;
2. R ⊆ M;
3. ∀v ∈ R, ∀v′ ∈ M ∖ R, dist(q, v) ≤ dist(q, v′).
In the road network shown in Figure 2, there are ten vertices and the distance between v_2 and v_3 is 6. Given the query point v_3 with k = 3, where the dotted-line vertices (v_1, v_5, v_6, v_8, and v_10) are the candidate vertices, the 3NN of v_3 among the candidates is R = {v_6, v_10, v_8}, and their shortest network distances to v_3 are 2, 4, and 5, respectively.
Notably, because edge weights may be identical, the kNN of a vertex is not necessarily unique. This does not compromise the validity of the algorithm presented in this paper.

4. Index and Algorithm

We present the solution to the kNN problem on road networks, using an amalgamated index structure, Average Weight Tree Decomposition with Ancestor Table and Children Table (AWTDAC).

4.1. Average Weight Tree Decomposition Index Construction

Definition 2 
(Tree Node). For each vertex v in the road network G(V,E), its corresponding tree node is denoted by X(v). X(v) represents a subset of the road network vertex set V, comprising vertex v and its neighboring vertices. In this paper, v is referred to as the principal vertex of X(v), is drawn in gray, and is symbolized by X(v).grayId. The neighboring vertices of v are considered the adjacent vertices of X(v).
Taking the road network shown in Figure 1 as an example, Figure 3 depicts the tree node X ( v 3 ) corresponding to vertex v 3 , where v 3 serves as the principal vertex, and  v 1 , v 4 , and  v 8 constitute adjacent vertices. The calculation process of the specific values of the adjacent vertices of v 3 is based on the average weight tree decomposition process. The result comes from the constructed decomposition tree.
Tree decomposition [12] accelerates the computation of numerous graph problems by mapping graphs onto tree structures and is defined as follows.
Definition 3 
(Tree Decomposition). Given a graph G ( V , E ) , its tree decomposition is denoted T G , and is composed of several tree nodes with the set of all tree nodes represented as Λ. T G satisfies the following conditions:
(1) ⋃_{X(v) ∈ Λ} X(v) = V.
(2) For each edge e(v_1, v_2) ∈ E in G, there must exist a tree node X(v_3) ∈ Λ satisfying v_1 ∈ X(v_3) and v_2 ∈ X(v_3).
(3) For each vertex v in G, all tree nodes X(v′) containing v (i.e., v ∈ X(v′)) constitute a connected subtree of T_G.
Definition 4 
(Tree Height and Tree Width). Given a tree decomposition T G , this manuscript stipulates that the root node’s depth is 1, while the depth of other tree nodes is the depth of their parent node plus 1. Based on this, the tree height of T G is defined as the maximum depth of all tree nodes, denoted by h. The treewidth of T G is defined as the maximum number of adjacent vertices among all tree nodes, represented by w; that is,
w = max_{X(v) ∈ Λ} |X(v)| − 1
Definition 5 
(Elimination Order). Given a graph G ( V , E ) , this manuscript sequentially removes vertices in G following a specific rule. For each currently deleted vertex v, v . deleteOrder is utilized to record its removal order, referred to as the elimination order of vertex v.
For instance, considering the road network G ( V , E ) depicted in Figure 1, vertex v 6 is the first to be removed; hence, v 6 . deleteOrder = 1 .
We briefly describe the generation process of tree decomposition. For any road network G ( V , E ) , the traditional tree decomposition generation algorithm is based on the principle of minimum degree (i.e., the vertex with the smallest degree in graph G) and it is divided into two stages.
Stage One: Vertex Removal. Select vertex v with the smallest degree in G each time, and use v and its neighbors as tree nodes X ( v ) . For any pair of neighbors ( v 1 , v 2 ) of v, if no edge exists between v 1 and v 2 , add an edge e ( v 1 , v 2 ) , whose weight w ( v 1 , v 2 ) = w ( v 1 , v ) + w ( v , v 2 ) ; if there is an edge e ( v 1 , v 2 ) and w ( v 1 , v 2 ) > w ( v 1 , v ) + w ( v , v 2 ) , then update w ( v 1 , v 2 ) to w ( v 1 , v 2 ) = w ( v 1 , v ) + w ( v , v 2 ) . Finally, delete v from G. Repeat this step until all vertices in G have been removed and record the elimination order of each vertex.
Stage Two: Tree Decomposition Construction. For each tree node X ( v ) generated in Stage One, select the vertex v with the smallest elimination order among the adjacent vertices of X ( v ) , designating X ( v ) as the parent node of X ( v ) . Repeat this step until all tree nodes coalesce into a tree.
Figure 4 exhibits a tree decomposition created using the minimum degree principle for the road network G illustrated in Figure 1. The gray vertices are the principal vertices; for example, v_1 is the principal vertex of its tree node, and v_2 and v_9 constitute its adjacent vertices. In this paper, a tree decomposition is instead constructed by the average weight principle. Both the average weight principle and the minimum degree principle are important concepts in graph algorithms, each having advantages for different types of problems: the average weight principle typically provides better results on weighted graph problems because it considers edge weights rather than degrees alone, while the minimum degree principle is more suitable for specific problems such as connectivity issues. The choice of principle therefore depends on the problem being addressed and the characteristics of the data.
The proposed average weight tree decomposition builds upon the minimum degree principle. The deletion of vertices no longer adheres to the minimum degree principle; instead, a priority S(v)(α, β, γ) is defined, where S(v).α is the average weight of vertex v, i.e., the sum of the edge weights incident to v divided by its degree, S(v).β is the number of edges that must be added after deleting vertex v, and S(v).γ is the ID of vertex v. The vertices of G are sorted by α; if the α values are equal, sorting is based on β, and then on γ. The vertex with the smallest priority S(v)(α, β, γ) is removed first, and the other operations remain unaltered. Figure 5 shows a tree decomposition generated based on the average weight principle for the road network G in Figure 1.
Using Figure 5 as an example, the tree height of T G is h = 6, and the tree width is w = 3, as the tree nodes within T G contain a maximum of three adjacent vertices. For example, v 3 serves as the principal vertex, and  v 1 , v 4 , and  v 8 constitute adjacent vertices. In the road network, utilizing tree decomposition is reasonable, since the tree width and height are significantly less than the total number of vertices in the road network.
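To make the deletion priority concrete, the helper below sketches how S(v)(α, β, γ) could be evaluated; the nested-map adjacency representation and all identifiers are our own illustrative choices, not the paper’s pseudocode.

```java
import java.util.*;

/** Sketch of the deletion priority S(v)(α, β, γ): α is the average weight of
 *  v's incident edges, β the number of edges contraction would add, γ the
 *  vertex id. The nested-map adjacency g.get(v).get(u) = w(v,u) is assumed. */
public class AvgWeightPriority {
    /** Returns the vertex of G with the smallest (α, β, γ) priority. */
    static int pickNext(Map<Integer, Map<Integer, Long>> g) {
        int best = -1;
        double bestAlpha = 0;
        long bestBeta = 0;
        for (int v : g.keySet()) {
            Map<Integer, Long> nb = g.get(v);
            double alpha = nb.isEmpty() ? 0
                    : nb.values().stream().mapToLong(Long::longValue).sum() / (double) nb.size();
            long beta = 0;                                   // fill-in edges needed after deleting v
            List<Integer> ns = new ArrayList<>(nb.keySet());
            for (int i = 0; i < ns.size(); i++)
                for (int j = i + 1; j < ns.size(); j++)
                    if (!g.get(ns.get(i)).containsKey(ns.get(j))) beta++;
            if (best == -1 || alpha < bestAlpha
                    || (alpha == bestAlpha && beta < bestBeta)
                    || (alpha == bestAlpha && beta == bestBeta && v < best)) {
                best = v; bestAlpha = alpha; bestBeta = beta;
            }
        }
        return best;
    }
}
```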
An average-weight-based tree decomposition algorithm is presented as Algorithm 1, where lines 2–9 represent vertex deletion, and lines 10–13 represent the tree decomposition phase. Upon the algorithm’s completion, all tree nodes form a connected tree decomposition. The time complexity of Algorithm 1 is O(n(w^2 + log n)), where n is the number of vertices in G, and w is the tree width of the tree decomposition T_G. The space complexity is O(nw).
Algorithm 1: Average Weight Tree Decomposition Construction Algorithm ( G ( V , E ) ).
Electronics 12 04536 i001
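Because Algorithm 1 appears above only as a figure, the following Java sketch illustrates its two stages under the assumptions of the previous snippet (it reuses AvgWeightPriority.pickNext); the TreeNode layout is a simplification for exposition, not the exact pseudocode.

```java
import java.util.*;

/** Sketch of Algorithm 1: Stage One removes vertices in priority order and
 *  contracts edges between their neighbors; Stage Two links each tree node to
 *  the adjacent vertex with the smallest elimination order. */
public class AvgWeightTreeDecomposition {
    static class TreeNode {
        int grayId;                         // principal vertex v of X(v)
        Map<Integer, Long> adjacent;        // adjacent vertices of X(v) at deletion time
        int parent = -1;                    // principal vertex of the parent tree node
    }

    static Map<Integer, TreeNode> build(Map<Integer, Map<Integer, Long>> g) {
        Map<Integer, TreeNode> nodes = new HashMap<>();
        Map<Integer, Integer> deleteOrder = new HashMap<>();
        int order = 1;
        while (!g.isEmpty()) {
            int v = AvgWeightPriority.pickNext(g);          // priority from the sketch above
            TreeNode x = new TreeNode();
            x.grayId = v;
            x.adjacent = new HashMap<>(g.get(v));
            nodes.put(v, x);
            deleteOrder.put(v, order++);
            List<Integer> nbrs = new ArrayList<>(g.get(v).keySet());
            for (int i = 0; i < nbrs.size(); i++)           // connect every neighbor pair through v
                for (int j = i + 1; j < nbrs.size(); j++) {
                    int a = nbrs.get(i), b = nbrs.get(j);
                    long through = g.get(v).get(a) + g.get(v).get(b);
                    if (through < g.get(a).getOrDefault(b, Long.MAX_VALUE)) {
                        g.get(a).put(b, through);
                        g.get(b).put(a, through);
                    }
                }
            for (int u : nbrs) g.get(u).remove(v);          // delete v from G
            g.remove(v);
        }
        for (TreeNode x : nodes.values()) {                 // Stage Two: choose parents
            int best = -1;
            for (int u : x.adjacent.keySet())
                if (best == -1 || deleteOrder.get(u) < deleteOrder.get(best)) best = u;
            x.parent = best;                                // stays -1 only for the root
        }
        return nodes;
    }
}
```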
    The calculated tree decomposition has the following properties [12].
Proposition 1. 
For a graph G and its corresponding tree decomposition T_G, for any two distinct vertices v ≠ v′ in G, X(v) ≠ X(v′).
Proposition 2. 
Considering a graph G and a tree decomposition T_G, for any tree node X(v) ∈ Λ and any vertex v′ ∈ X(v) ∖ {v}, X(v′) is an ancestor of X(v) in the tree decomposition T_G.
Definition 6 
(Ancestor). With the tree decomposition T_G of G, for any tree node X(v) ∈ Λ, the ancestors of X(v) are expressed as the set X(v).anc, and the ancestor set of the root node is specified in this paper as ∅. Utilizing X(v).parent to represent the parent node of X(v), the ancestor set is X(v).anc = {X(v).parent.grayId} ∪ X(v).parent.anc.
Definition 7 
(Children). Given the tree decomposition T_G of G, for any tree node X(v), the children of X(v) are symbolized as the set X(v).child. This paper stipulates that the child set of a leaf node is ∅ and employs X(v).childSet to represent the principal vertex set of X(v)’s child nodes; the child set of a non-leaf node X(v) is
X(v).child = ⋃_{c ∈ X(v).childSet} ({c} ∪ X(c).child)
Upon obtaining a tree decomposition, the shortest network distance between two vertices can be computed in O(w) time using the H2H-Index [13], where w is the tree width. This is currently the state-of-the-art technique for calculating the shortest network distance between vertices in road networks.
For any pair of vertices q and v with nearest common ancestor X, the shortest network distance between them is given by the following equation, as described in Dian Ouyang et al. [13]:
dist(q, v) = min_{s ∈ X} { dist(q, s) + dist(s, v) }
We introduce the proposed kNN index. Given a query point q, we traverse the ancestors and children of each vertex. For any other vertex v, there are three possible scenarios:
  • If v ∈ X(q).anc, then dist(q, v) has been pre-calculated, and dist(q, v) can be obtained in O(1) time.
  • If v ∈ X(q).child, likewise, dist(q, v) can be obtained in O(1) time.
  • If v ∉ X(q).child and v ∉ X(q).anc, then, according to Equation (3), the shortest path between v and q must pass through a vertex u in their nearest common ancestor node. Hence, dist(q, v) = dist(q, u) + dist(u, v). Both dist(q, u) and dist(u, v) can be obtained in O(1) time, allowing dist(q, v) to be retrieved in O(1) time.
In summary, for any other vertex v, this paper can obtain dist ( q , v ) in O ( 1 ) time.
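The three cases can be summarized by the hedged sketch below, which assumes each vertex stores its ancestor list and a table of precomputed distances to its ancestors and children; the field names are illustrative and do not reproduce the exact H2H-Index layout.

```java
import java.util.*;

/** Sketch of the three-case distance retrieval described above. */
public class TreeDistance {
    Map<Integer, List<Integer>> ancestors;        // v -> ancestors, ordered root .. parent
    Map<Integer, Set<Integer>>  nodeVertices;     // v -> vertex set of tree node X(v)
    Map<Integer, Map<Integer, Long>> distTo;      // v -> precomputed distances (ancestors and children)

    long dist(int q, int v) {
        // Cases 1 and 2: v lies in X(q).anc or X(q).child, so the distance is stored.
        Long direct = distTo.get(q).get(v);
        if (direct != null) return direct;
        // Case 3: the shortest path passes through a vertex u of the nearest
        // common ancestor node X, so dist(q, v) = min over u of dist(q, u) + dist(u, v).
        int lca = nearestCommonAncestor(q, v);
        long best = Long.MAX_VALUE;
        for (int u : nodeVertices.get(lca)) {
            Long d1 = distTo.get(q).get(u), d2 = distTo.get(v).get(u);
            if (d1 != null && d2 != null) best = Math.min(best, d1 + d2);
        }
        return best;
    }

    /** Principal vertex of the lowest tree node that is an ancestor of both X(q) and X(v). */
    int nearestCommonAncestor(int q, int v) {
        Set<Integer> ancQ = new HashSet<>(ancestors.get(q));
        ancQ.add(q);
        List<Integer> ancV = ancestors.get(v);
        for (int i = ancV.size() - 1; i >= 0; i--)    // scan upward from v's parent
            if (ancQ.contains(ancV.get(i))) return ancV.get(i);
        return ancV.isEmpty() ? q : ancV.get(0);      // fall back to the root
    }
}
```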
Definition 8 
(Node extension list). Given a tree decomposition T_G of G, for any tree node X(v) ∈ Λ, the node extension list of X(v) is denoted as the set X(v).extList, where X(v).extList = (X(v) ∖ {v}) ∪ X(v).child.
To enhance the algorithm’s efficiency, this paper arranges the vertices in X ( v ) ’s node extension list in ascending order based on their road network distance to vertex v.
For instance, in the tree decomposition T G , illustrated in Figure 5, X ( v 3 ) . extList = { ( v 6 , 2 ) , ( v 7 , 3 ) , ( v 10 , 4 ) , ( v 4 , 5 ) , ( v 8 , 5 ) , ( v 9 , 5 ) , ( v 2 , 6 ) , ( v 1 , 9 ) } , with respective road network distances to v 3 of 2, 3, 4, 5, 5, 5, 6, and 9.
We were inspired by Sun et al., who put forward the idea of a flow graph to serialize the processing of graphs [23]. According to the T_G structure of the average weight tree decomposition in Figure 5, if two vertices are in a parent–child relationship, the edge between them is called a tree edge; otherwise, it is a non-tree edge. A tree edge is directed from the parent vertex to the child vertex, and a non-tree edge is directed from a deleted vertex to its adjacent vertices. In Figure 6, solid lines indicate tree edges and dashed lines indicate non-tree edges. It can be seen that v_3 is the parent vertex of v_2, v_7, v_8, and v_9, and the distances from these child vertices to v_3 are 6, 5, 5, and 5, respectively. The adjacent vertices of v_3 are v_4, v_1, and v_8.
Definition 9 
(Flow Graph). A flow graph is a directed labeled graph F(V_F, E_F, L_F), where V_F represents a set of vertices, E_F is a set of edges, and L_F is a labeling function assigning one or more labels to the nodes in V_F. In this paper, the labels in L_F express the distance between two vertices, whether a vertex holds moving objects from M, and the number of such moving objects.
Relying on average weight tree decomposition, we maintain a node extension list for each tree node, resulting in the kNN flow graph index structure depicted in Figure 6, whose tree edges and non-tree edges are directed as defined above.
The space complexity of the kNN flow graph index is O(n^2), where n is the total number of vertices in the road network. The flow graph index structure completely preserves the structure of the tree decomposition, and it also stores the weights between vertices and the information about the moving objects on the vertices.
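A possible in-memory layout of one flow graph node is sketched below; all field names are illustrative and merely mirror Definitions 6–9 rather than any concrete implementation.

```java
import java.util.*;

/** Illustrative layout of one node in the kNN flow graph index. */
public class FlowGraphNode {
    int grayId;                                   // principal vertex v of X(v)
    int parent = -1;                              // head of the tree edge pointing to this node
    List<Integer> childSet = new ArrayList<>();   // tails of the outgoing tree edges
    // Non-tree edges: deleted vertex v -> its adjacent vertices, labelled with distances.
    Map<Integer, Long> adjacentDist = new HashMap<>();
    // Node extension list (Definition 8), kept sorted by distance to grayId.
    List<long[]> extList = new ArrayList<>();     // entries {vertex, dist(grayId, vertex)}
    // Labels L_F for moving objects mapped onto this vertex.
    boolean isCandidate = false;                  // does the vertex currently host objects?
    int movingObjectCount = 0;                    // number of moving objects on this vertex
}
```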
Theorem 1 verifies the node completeness between the road network and the flow graph index.
Theorem 1. 
Given a node v, if there exists a neighbor node v′ of v, then the tree node X(v′) is a passing node on the path from the root node r to v in the graph-decomposed tree Λ rooted at node r.
Proof of Theorem 1. 
Considering a node v and its neighbor nodes N(v), node v can be abstracted as a tree node X(v) satisfying N(v) = X(v) ∖ {v}. If there exists a tree node X(v′) such that X(v′) is the parent of X(v), then v′ ∈ X(v) ∖ {v}, and v′ is connected with all the other neighbor nodes X(v) ∖ {v} ∖ {v′} after v is deleted. The same holds if there exists a tree node X(v′′) such that X(v′′) is the parent of X(v′), in which case v′′ ∈ (X(v) ∖ {v} ∖ {v′}) ∪ (X(v′) ∖ {v′}). Similarly, all tree nodes abstracted from the neighbor nodes of v can be found on a path from the root node to v.    □
In addition, a filtering rule can be defined to prune redundant nodes during kNN searching on the road network, as shown in Lemma 1.
Lemma 1. 
Given a query node q and a data node v, if there exists a minimum common ancestor v′ of q and v, then the shortest path distance dist(q, v) is equal to the sum of the shortest path distances dist(q, v′) and dist(v′, v).
Proof of Lemma 1. 
If there exists another node v′′ satisfying the condition that the shortest path distance dist(q, v) equals the sum of dist(q, v′′) and dist(v′′, v), then there must exist one or more across-path edges connecting the node v′′. However, it is impossible for an across-path edge to exist in a graph-decomposed tree according to Theorem 1, because any node v and its neighbor nodes are found only on the path from the root node r to v.    □
We present the index construction and update strategy. For each tree node in average weight tree decomposition, it is only necessary to compute its node extension list. The construction algorithm for the node extension list is shown as Algorithm 2, where lines 1–3 calculate the shortest network distance from adjacent vertices in X ( v ) to v, adding it to the node extension list; lines 4–6 compute the child vertices of X ( v ) and incorporate them into the node extension list; and line 7 sorts the node extension list. The time complexity of Algorithm 2 is O ( n w ) , where n is the total number of vertices in the road network, and w is the tree width of T G .
Algorithm 2: Node Expansion List Construction Algorithm ( X ( v ) )
Electronics 12 04536 i002
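Since Algorithm 2 is shown only as a figure, the sketch below illustrates the same three steps (adjacent vertices, child subtree vertices, sort), reusing the illustrative FlowGraphNode and TreeDistance sketches from earlier; it is not a transcription of the published pseudocode.

```java
import java.util.*;

/** Sketch of Algorithm 2: build X(v).extList from the adjacent vertices of X(v)
 *  and the vertices of its child subtrees, then sort by road-network distance. */
public class ExtListBuilder {
    static void buildExtList(FlowGraphNode x, Map<Integer, FlowGraphNode> nodes, TreeDistance td) {
        x.extList.clear();
        // Lines 1-3: adjacent vertices of X(v) with their shortest distances to v.
        for (int u : x.adjacentDist.keySet())
            x.extList.add(new long[]{u, td.dist(x.grayId, u)});
        // Lines 4-6: every vertex in the child subtrees of X(v).
        Deque<Integer> stack = new ArrayDeque<>(x.childSet);
        while (!stack.isEmpty()) {
            int c = stack.pop();
            x.extList.add(new long[]{c, td.dist(x.grayId, c)});
            stack.addAll(nodes.get(c).childSet);
        }
        // Line 7: ascending order of distance to the principal vertex v.
        x.extList.sort(Comparator.comparingLong((long[] a) -> a[1]));
    }
}
```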
Regarding index updates, each vertex v has a flag bit, v.isCandidate, indicating its status as a candidate point; it is set to false when v ceases to be a candidate point. Consequently, index updates have time complexity O(1).

4.2. Query-Processing Method Based on kNN Index

We examine the specifics of the query algorithm AWTDN.
Definition 10 
(Ancestor Relationship). Given any two vertices v 1 and v 2 , within the average weight tree decomposition T G , if  v 1 X ( v 2 ) . child or v 2 X ( v 1 ) . child , then v 1 and v 2 are said to possess an ancestral relationship.
Definition 11 
(Current Shortest Path Network Distance). Given vertices v 1 and v 2 with no ancestral relationship, and common ancestor X ( v 3 ) , the current shortest network distance from vertex v 1 to vertex v 2 via vertex v 3 is represented as curDist ( ( v 1 , v 2 ) | v 3 ) , that is, curDist ( ( v 1 , v 2 ) | v 3 ) = dist ( v 1 , v 3 ) + dist ( v 3 , v 2 ) . Since X ( v 3 ) is the common ancestor of X ( v 1 ) , X ( v 2 ) , dist ( v 1 , v 3 ) and dist ( v 3 , v 2 ) can be found in X ( v 3 ) . child without calculation.
To enhance algorithm efficiency, this paper introduces a threshold for pruning, initially set to infinity and updated once the result set satisfies |R| ≥ k.
To facilitate the explanation of the kNN query problem, for a given query point q, we use a triplet (v_1, v_2, s) to denote that the shortest network distance from v_1 to the query point q via v_2 is s, with the intermediate point v_2 termed the base point.
The vertex expansion strategy of AWTDN has three steps.
Step 1: Insert the query point into the min-heap, process the first vertex p of the heap, and if p is a candidate point, add it to the result set R.
Step 2: Determine the relationship between p and the base point. (1) If p is a child of the base point, add the next vertex of p to the min-heap. (2) If p is an ancestor of the base point, add both the next vertex of p and the next vertex of the base point’s node extension list to the min-heap. (3) If the distance between vertex v and the query point in the heap is updated, add the next vertex of vertex v to the heap.
Step 3: Process the heap’s leading vertex, repeat the first two steps, and when the current road network distance from the heap’s leading vertex to the query point is no less than the threshold, the algorithm concludes.
Consider the 3NN (q = v_3, k = 3) of the query vertex v_3 as an example. In the first step, v_3 is added to the min-heap and the first vertex p of the heap is processed; the vertex p = v_3 is handled first. As v_3 is not a candidate point, it is not processed. In the second step, since it has no ancestral or child relationship with the base point, the next entry (v_6, 2) of v_3’s node extension list is added to the min-heap. Correspondingly, in the third step, the heap’s first vertex, v_6, is processed.
As mentioned earlier, this paper uses curDist ( ( q , v ) | p ) = dist ( q , p ) + dist ( p , v ) to denote the current shortest path network distance from vertex q to vertex v when passing through point p. Evidently, curDist ( ( q , v ) | p ) is the upper bound of dist ( q , v ) . Since dist ( q , p ) and dist ( p , v ) are pre-calculated, curDist ( ( q , v ) | p ) can be retrieved in constant time.
We present a lemma that supports the development of our query algorithm and two theorems that guide it and demonstrate its correctness.
Lemma 2. 
For any given vertices q and v, if v is a kNN of q, there must exist a vertex p X ( q ) . extList { q } such that v satisfies the following: (i) v X ( p ) . child ; (ii) dist ( q , v ) = dist ( q , p ) + dist ( p , v ) .
As an illustration, consider Figure 6. Assume all vertices are candidate points, and given q = v 3 and v = v 5 . For  { v 3 } X ( v 3 ) . extList , we have S = { v 3 , v 1 , v 4 , v 8 } . There exists a vertex p = v 8 in S such that the shortest path from v 3 to v 5 passes through vertex v 8 .
Based on Lemma 2, we propose our query algorithm. For a given query point q, we scan each vertex p in { q } X ( q ) . extList . We first determine whether p is a candidate point. If so, we compare it with the last vertex in result set R, and update R accordingly. We then assess the relationship between p and the base point and process it accordingly.
Theorem 2. 
For a query vertex q, a vertex v, and their common ancestor p, if curDist((q, v) | p) ≥ threshold, then vertex v cannot be a kNN of q.
Proof of Theorem 2. 
Given a query vertex q, assume that there is a vertex v and a common ancestor p of q and v satisfying curDist((q, v) | p) ≥ threshold, and that v is a kNN of q. Let R be the kNN set of q. It is easy to see that when threshold ≠ +∞ we have |R| ≥ k, and for any point v′ in R, dist(q, v′) ≤ threshold; therefore dist(q, v′) ≤ threshold ≤ curDist((q, v) | p). From the definition of kNN, v is then not a kNN of q, which contradicts the hypothesis. The proof is completed.    □
Based on Theorem 2 and the ascending-order property of the node expansion list, when curDist((q, v) | p) ≥ threshold, the vertex v and the vertices after v in the node expansion list cannot be kNNs of q, so we can exit the calculation early.
Theorem 3. 
Assume that in {q} ∪ X(q).extList, the algorithm is currently processing vertex p, and that it terminates when dist(q, p) ≥ threshold. At this point, for any vertex v in the result set R, curDist((q, v) | v′) = dist(q, v) holds, where v′ is any common ancestor of q and v.
Proof of Theorem 3. 
Let v be any vertex in the result set R. Given the termination condition, for any vertex p_after processed after p (based on the scanning sequence in {q} ∪ X(q).extList), the ascending-order property of the node expansion lists implies that, if v ∈ X(p_after).child, then dist(q, p_after) + dist(p_after, v) ≥ threshold. Since curDist((q, v) | v′) ≤ threshold for the common ancestor v′ of q and v through which v was reached, no base point processed later can improve this value, and hence curDist((q, v) | v′) = dist(q, v); the proof is complete.    □
The pseudocode of AWTDN is shown as Algorithm 3. For each vertex p in { q } X ( q ) . extList , if the shortest network distance from p to q is not less than the threshold (line 5 of Algorithm 3), the algorithm terminates because no more kNNs of q will appear. Then, we evaluate p; if p is a candidate point, we update the result set R and the threshold.
Algorithm 3: kNN Searching Algorithm based on Graph-Decomposed Tree.
Electronics 12 04536 i003
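As Algorithm 3 is shown only as a figure, the sketch below illustrates the search in simplified form: base points in {q} ∪ X(q).extList are scanned in ascending distance order (Lemma 2) and pruned with the Theorem 2 threshold. The min-heap bookkeeping of Algorithm 3 is replaced here by a distance multiset for readability, and tie handling is omitted, so treat it as an illustration under those assumptions, not as the exact algorithm.

```java
import java.util.*;

/** Simplified sketch of the kNN search on the flow graph index. */
public class AwtdnQuery {
    static Map<Integer, Long> knn(int q, int k,
                                  Map<Integer, FlowGraphNode> nodes, Set<Integer> candidates) {
        Map<Integer, Long> best = new HashMap<>();          // candidate vertex -> best curDist seen
        TreeMap<Long, Integer> distances = new TreeMap<>(); // multiset of values stored in `best`
        long[] threshold = {Long.MAX_VALUE};

        List<long[]> basePoints = new ArrayList<>();
        basePoints.add(new long[]{q, 0});                   // q itself, then its extension list
        basePoints.addAll(nodes.get(q).extList);

        for (long[] base : basePoints) {
            int p = (int) base[0];
            long dqp = base[1];
            if (dqp >= threshold[0]) break;                 // Theorem 2: no further kNN can appear
            consider(p, dqp, k, candidates, best, distances, threshold);
            for (long[] e : nodes.get(p).extList) {         // vertices reached via base point p
                long cur = dqp + e[1];                      // curDist((q, v) | p), Definition 11
                if (cur >= threshold[0]) break;             // extList is ascending: prune the rest
                consider((int) e[0], cur, k, candidates, best, distances, threshold);
            }
        }
        List<Map.Entry<Integer, Long>> sorted = new ArrayList<>(best.entrySet());
        sorted.sort(Map.Entry.comparingByValue());          // keep only the k nearest candidates
        Map<Integer, Long> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> e : sorted.subList(0, Math.min(k, sorted.size())))
            result.put(e.getKey(), e.getValue());
        return result;
    }

    private static void consider(int v, long d, int k, Set<Integer> candidates,
                                 Map<Integer, Long> best, TreeMap<Long, Integer> distances,
                                 long[] threshold) {
        if (!candidates.contains(v)) return;
        Long old = best.get(v);
        if (old != null && old <= d) return;                // keep only the smaller distance
        if (old != null) removeOne(distances, old);
        best.put(v, d);
        distances.merge(d, 1, Integer::sum);
        if (best.size() >= k) threshold[0] = kthSmallest(distances, k);
    }

    private static void removeOne(TreeMap<Long, Integer> ms, long key) {
        if (ms.merge(key, -1, Integer::sum) == 0) ms.remove(key);
    }

    private static long kthSmallest(TreeMap<Long, Integer> ms, int k) {
        int seen = 0;
        for (Map.Entry<Long, Integer> e : ms.entrySet()) {
            seen += e.getValue();
            if (seen >= k) return e.getKey();
        }
        return Long.MAX_VALUE;
    }
}
```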
The space complexity of the graph-decomposed tree is O(Λ) = |V|(|V| + 1)/2 + (|V| − 1) in the worst case, where each node may be connected to all other nodes and the number of tree edges is |V| − 1. The time complexity of the graph-decomposed kNN searching algorithm is O(q) = |V| − 1 for a query node q, because the nearest nodes can be found in the tree node X(q) of the graph-decomposed tree Λ.
For example, given a query point q = v_3 and k = 3, suppose the candidate set is M = {v_1, v_5, v_6, v_8, v_10} (i.e., the dotted-line vertices in Figure 2), and we query the 3NN of v_3. Table 2 shows the process; the meaning of the triplet representation is as described above. In rows 3–5 of Table 2, the relationship between v_6, v_7, v_10, v_9, v_2 and the base point v_3 is as follows: all are children of v_3, so we add the subsequent vertices of v_3’s node extension list to the heap. In line 6, v_4 is an ancestor of the base point v_3, so we must add the subsequent vertices of the node extension lists of both v_3 and v_4 to the heap; note that at this time the first vertex of v_4’s node extension list is v_1, with dist(v_4, v_1) = 14 and dist(v_3, v_4) = 9, and dist(v_3, v_4) + dist(v_4, v_1) > dist(v_3, v_1). Hence, the distance from v_3 to v_1 through v_4 is greater than the shortest distance from v_1 to v_3 itself, so v_1 is not processed, and we add the follow-up vertex v_5 of v_4 to the heap. In the last line, the top vertex of the min-heap is v_2, with label (v_2, v_3, 6); i.e., the shortest distance from v_2 to the query point via v_3 is 6, which is greater than the current threshold of 5. So the algorithm ends and returns R = {(v_6, 2), (v_10, 4), (v_8, 5)}, indicating that the 3NN of v_3 are v_6, v_10, and v_8, with respective shortest distances to the query point of 2, 4, and 5.
This example helps illustrate the efficiency of AWTDN in handling kNN queries. The algorithm can use the precomputed shortest distances and a threshold value to prune unnecessary computations, improving overall performance.
By building the tree decomposition on average edge weights instead of on the minimum degree, we can more accurately capture the inherent structure of the road network, because average weight tree decomposition better preserves the distance information of the road network. As confirmed by our experiments, the algorithm consequently visits far fewer vertices for a given query point q, which increases computational efficiency.

4.3. kNN-Query-Processing Method in a Dynamic Environment

We focus on the kNN query problem in a dynamic road network environment, where the candidate points are moving objects. Specifically, given a query point q, we must find the k moving objects that are closest. Since the positions of moving objects change with time, we use the offset method to map them to the nearest road network vertex and then convert the query from a dynamic environment to a static environment.
Considering the directionality of moving objects, we model the road network as a directed weighted graph G ( V , E ) .
Definition 12 
(Moving Object). Each moving object on the road network is represented by m = ⟨(v_i, v_j), offset, t⟩; that is, at time t, moving object m is on edge (v_i, v_j), at distance offset from vertex v_j.
For instance, referring to Figure 7, if the current time is t, moving object m_13 can be represented as m_13 = ⟨(v_3, v_2), 2, t⟩, meaning that m_13 is on the edge between v_3 and v_2 (whose length is 5) and its offset from vertex v_2 is 2. The algorithm maps each query point to the starting vertex of its edge together with an offset. For example, if q_1 is located on edge (v_2, v_1), it is mapped to the starting vertex v_2 with an offset of 1. Therefore, the distance between a moving object m = ⟨(v_i, v_j), offset_1, t⟩ and a query point q = ⟨(v_s, v_t), offset_2, t⟩ can be represented as offset_1 + dist(v_j, v_s) + offset_2.
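A small sketch of Definition 12 and of the offset-based distance computation follows; MovingObject and its fields are illustrative names, and TreeDistance is the distance sketch given earlier.

```java
/** Sketch of Definition 12: a moving object m = <(vi, vj), offset, t> is mapped
 *  to the end vertex vj of its edge, a query point to the start vertex of its
 *  edge, and dist(q, m) = offset(m) + dist(vj, vs) + offset(q). */
public class MovingObject {
    int edgeStart, edgeEnd;   // the edge (vi, vj) the object is on
    long offset;              // remaining distance to the end vertex vj
    long timestamp;           // time t of the last position update

    /** Distance from a query point already mapped to the start vertex of its edge. */
    static long distance(MovingObject m, int queryStartVertex, long queryOffset, TreeDistance td) {
        // offset_1 + dist(v_j, v_s) + offset_2, as in the formula above
        return m.offset + td.dist(m.edgeEnd, queryStartVertex) + queryOffset;
    }
}
```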
The location information of each moving object m is updated regularly. We use a moving object table to store this location information at a given moment. Figure 8 shows the location information of the moving objects on a road network at times t and t + 1 , where v s t a r t and v e n d are the respective start and end vertices of edge ( v s t a r t , v e n d ) ; M is the set of moving objects on this edge; offset contains the offsets of the moving objects to the end vertex v e n d , corresponding one-to-one with the moving objects in M; and M N is the number of moving objects in M.
Taking Figure 8 as an example, given query point q_1 and assuming that all moving objects are candidate objects, when we initiate a kNN query at time t, we first map the query point q_1 to the starting vertex of its edge with an offset, i.e., q_1 = ⟨(v_2, v_1), 1, t⟩. Subsequently, all moving objects are offset to the vertices in their end direction, which become the candidate vertices. For example, moving object m_13 can be represented as m_13 = ⟨(v_3, v_2), 2, t⟩; i.e., m_13 is on the edge (v_3, v_2) with an offset of 2 from the vertex v_2. Similarly, each moving object is mapped to a vertex on the road network with an offset, and the final result is as shown in the moving object list at time t. In this way, we can use the AWTDN algorithm for kNN queries. However, the kNN set R = {v_1, v_2, ..., v_K} returned by AWTDN is not the true result; we must restore the vertices in R to moving objects on the road network. Taking v_1 as an example, MN = 1, i.e., there is only one moving object on v_1, namely m_13. The final distance from m_13 to the query point q_1 is dist(m_13, q_1) = dist(v_2, v_1) + q_1.offset + m_13.offset = 0 + 1 + 2 = 3. The actual kNN result set is obtained after restoring all vertices to moving objects in turn. It is worth noting that some vertices have more than one moving object, such as vertex v_2, which holds moving objects m_11 and m_13. Therefore, it is necessary to further screen the moving objects to obtain the final k results.
At time t + 1, as shown, due to the changes in the positions of moving objects, the vertices to which moving objects are offset may also change. For example, due to the change in the position of m_13, its offset vertex changes from v_2 at time t to v_1 at time t + 1. At this point, we need to set v_2.isCandidate to false and v_1.isCandidate to true. We take Figure 9 as an example to explain the kNN query in a dynamic environment.
The starting vertex of query point q in Figure 9 is v_3. At time t, there are five moving objects, m_1, m_2, m_3, m_4, and m_5; m_1 is on the edge (v_3, v_6), and the distance between v_3 and v_6 is 2. We map each object to the terminating vertex of its edge with an offset. Table 3 shows the moving object list at time t after mapping: Offset = 1 means the distance from m_1 to v_6 is 1, and MN = 1 means the number of moving objects on the edge (v_3, v_6) is 1. Then, for the vertices after mapping, we use AWTDN to obtain the result set R and restore its vertices to the real moving objects according to the moving object list, so as to obtain the final kNN result. The solution process is shown in Table 4. In the dynamic query, we must adjust the kNN result set obtained by AWTDN. Suppose that the 4NN of query point q is {v_6, v_7, v_10, v_4}. If a vertex v_8 satisfies dist(v_8, q) = dist(v_4, q), then we also keep v_8; i.e., the extended 4NN of q becomes {v_6, v_7, v_10, v_4, v_8}. This ensures that no real moving object is missed.
A moving object table stores the moving objects’ location information at specific time points, including information such as the starting and ending vertices of the edge on which the object is located, the set of moving objects on that edge, the offset of each object from the ending vertex, and the number of moving objects on the edge. This allows the transformation of the problem from dynamic to static, allowing the use of conventional static kNN algorithms.
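One row of such a moving object table, and the O(1) re-marking of candidate vertices between snapshots, could look like the sketch below; the Row fields simply mirror Figure 8’s columns and are not a concrete implementation.

```java
import java.util.*;

/** Sketch of one row of the moving object table (Figure 8) and of the
 *  candidate-flag refresh when objects move between snapshots. */
public class MovingObjectTable {
    static class Row {
        int vStart, vEnd;                         // edge (vStart, vEnd)
        List<String> objects = new ArrayList<>(); // moving objects M on this edge
        List<Long> offsets = new ArrayList<>();   // offsets to vEnd, aligned with `objects`
        int count() { return objects.size(); }    // MN
    }

    /** Re-mark candidate vertices after the snapshot at time t+1. */
    static void refreshCandidates(Collection<Row> table, Map<Integer, FlowGraphNode> nodes) {
        for (FlowGraphNode x : nodes.values()) {  // clear the previous snapshot
            x.isCandidate = false;
            x.movingObjectCount = 0;
        }
        for (Row r : table) {                     // each object is offset toward vEnd
            FlowGraphNode x = nodes.get(r.vEnd);
            if (r.count() > 0) {
                x.isCandidate = true;             // vEnd now hosts candidate objects
                x.movingObjectCount += r.count();
            }
        }
    }
}
```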

5. Experimental Analysis

5.1. Experimental Parameters

We implemented the AWTDN algorithm in the Java language and compared it with TEN-Query [2] and G*Tree [7]. The experimental environment comprised an AMD Ryzen 9 5900X 12-Core CPU at 3.69 GHz, 64 GB of memory, a 4 TB hard drive, and Windows 10.
Dataset: The dataset came from real-world road networks provided by Urban Road Network [24]. Six sets were selected for experiments, as shown in Table 5, where h and w are the respective height and width of average weight tree decomposition.
Parameter Configuration: For each dataset, we randomly generated candidate objects and controlled their numbers via object density θ , defined as the ratio of the numbers of candidate objects and road network vertices. The configurations of θ - and k-values can be found in Table 6, with default values written in boldface.

5.2. Impact of Dataset Size on Algorithm Query Time

We compared the query times of AWTDN, G*Query, and TEN-Query for different dataset sizes, where the object density θ and k take default values. We randomly generated 1000 vertices as query points, using the mean of all query times as the ultimate query time. The results are shown in Figure 10.
It is evident that both AWTDN and TEN-Query are significantly more efficient than G*Query, and AWTDN is the fastest. This is attributed to the avoidance of ineffective computations by applying thresholds. We compared the performance of the algorithms under varying k-values.

5.3. Impact of Changing k-Values on Algorithm Query Time

We compared the query efficiencies of the algorithms across six datasets by altering k-values, as shown in Figure 11.
Figure 11 evaluates the query performance under different k-values. As the k-value increases, our algorithm exhibits a more stable, near-linear trend than the G*Query and TEN-Query algorithms. The flow graph index abstracts all nodes of the graph as tree nodes, and the size of a tree node depends only on the number of its neighbors; therefore, the index does not depend on the chosen k-value, and it transcends the k-value limitation by storing all potential candidate nodes.

5.4. Impact of Changing Object Density θ on Algorithm Query Time

We compared the query efficiencies of the three algorithms across six datasets by adjusting θ values, with results as shown in Figure 12.
It is apparent that, irrespective of changes in the θ value, the performance of AWTDN across all six datasets is markedly superior to that of TEN-Query and G*Query, thereby confirming the universal applicability of AWTDN.

5.5. Comparison of Accessed Vertex Numbers across Different Datasets

We compared the number of vertices accessed by AWTDN, TEN-Query, and G*Query across six datasets. For each dataset, we generated 1000 random query points and took the average total number of accessed vertices as the final count. Both k and θ took default values. The comparison results are shown in Figure 13.
The results reveal that AWTDN accesses significantly fewer vertices than TEN-Query and G*Query, which is attributable to the efficient pruning capacity of the threshold.

5.6. Comparison of Index Construction Time and Space across Different Datasets

We compared the index construction time and space of AWTDAC, TEN-Index, and G*Query across datasets of varying sizes, where the object density θ takes the default value of 0.3. Since the construction of TEN-Index depends on k, to accommodate queries of all k-values, we set k to 100. It is worth mentioning that AWTDAC does not rely on k and thus can satisfy kNN queries of any k-value. We set the parameters of G*Tree according to the original text. The comparison results are depicted in Figure 14 and Figure 15.
AWTDAC incurs higher construction time and space costs than TEN-Index, because TEN-Index only stores the kNN results within the contracted neighbor range, which is also why TEN-Index depends on the k-value. In contrast, AWTDAC stores all vertices within the subtree scope, thus breaking the k-value limitation and enabling kNN queries for any k-value. Although G*Tree has the smallest space cost, it has the largest time cost: G*Tree employs Dijkstra’s algorithm to calculate shortest network distances, while AWTDAC and TEN-Index use H2H-Query to measure road network distances, which is significantly more efficient than Dijkstra’s algorithm.

6. Conclusions

kNN queries on road networks are studied in this paper. First, a flow graph index generated from a graph-decomposed tree is proposed to store intermediate results; rules derived from the flow graph index reduce the quantity of candidate nodes. Second, a kNN query algorithm is used to compute the final results on this flow graph index. Finally, the effectiveness and efficiency of the proposed algorithm are confirmed through experiments.

Author Contributions

Conceptualization, W.J.; methodology, M.B.; software, F.W.; validation, X.W.; formal analysis, G.L.; data curation, B.N.; writing—original draft preparation, F.W.; writing—review and editing, W.J.; supervision, G.L.; funding acquisition, G.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grants No. 61976032 and No. 62002039, and the general scientific research project of Liaoning (No. LJKZ0063).

Data Availability Statement

The data used to support this study are available from the corresponding author upon request.

Acknowledgments

I would like to extend my sincere gratitude to Yangjie Zhou from Rizhao Institute of Metrology for his invaluable assistance throughout the process of writing this paper. Zhou’s expertise and advice played a significant role in enhancing the quality of the paper. I am deeply appreciative of the time and effort generously provided by Zhou, which greatly facilitated the completion of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dijkstra, E.W. A note on two problems in connexion with graphs. In Edsger Wybe Dijkstra: His Life, Work, and Legacy; ACM Digital Library: New York, NY, USA, 2022; pp. 287–290. [Google Scholar]
  2. Ouyang, D.; Wen, D.; Qin, L.; Chang, L.; Zhang, Y.; Lin, X. Progressive top-k nearest neighbors search in large road networks. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Portland, OR, USA, 14–19 June 2020; pp. 1781–1795. [Google Scholar]
  3. He, D.; Wang, S.; Zhou, X.; Cheng, R. An efficient framework for correctness-aware kNN queries on road networks. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; pp. 1298–1309. [Google Scholar]
  4. Tianyang, D.; Lulu, Y.; Qiang, C.; Bin, C.; Jing, F. Direction-aware KNN queries for moving objects in a road network. World Wide Web 2019, 22, 1765–1797. [Google Scholar] [CrossRef]
  5. Shen, B.; Zhao, Y.; Li, G.; Zheng, W.; Qin, Y.; Yuan, B.; Rao, Y. V-tree: Efficient knn search on moving objects with road-network constraints. In Proceedings of the 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, CA, USA, 19–22 April 2017; pp. 609–620. [Google Scholar]
  6. Zhong, R.; Li, G.; Tan, K.L.; Zhou, L.; Gong, Z. G-tree: An efficient and scalable index for spatial search on road networks. IEEE Trans. Knowl. Data Eng. 2015, 27, 2175–2189. [Google Scholar] [CrossRef]
  7. Li, Z.; Chen, L.; Wang, Y. G*-tree: An efficient spatial index on road networks. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; pp. 268–279. [Google Scholar]
  8. Dan, T.; Luo, C.; Li, Y.; Guan, Z.; Meng, X. LG-Tree: An Efficient Labeled Index for Shortest Distance Search on Massive Road Networks. IEEE Trans. Intell. Transp. Syst. 2022, 22, 23721–23735. [Google Scholar] [CrossRef]
  9. Bao, J.L.; Wang, B.; Yang, X.C.; Zhu, H.J. Nearest neighbor query in road networks. J. Softw. 2018, 29, 642–662. [Google Scholar]
  10. Li, L.; Cheema, M.A.; Ali, M.E. Continuously monitoring alternative shortest paths on road networks. In VLDB Endowment, 3rd ed.; Association for Computing Machinery: New York, NY, USA, 2020; pp. 2243–2255. [Google Scholar]
  11. Huang, Y.-K.; Lee, C.-P.; Tsai, C.-Y. Evaluating KNN-skyline queries in dynamic road networks. In Proceedings of the 2018 27th Wireless and Optical Communication Conference (WOCC), Hualien, Taiwan, 30 April–1 May 2018; pp. 1–2. [Google Scholar]
  12. Halin, R. S-functions for graphs. J. Geom. 1976, 8, 171–186. [Google Scholar] [CrossRef]
  13. Ouyang, D.; Wen, D.; Qin, L.; Chang, L.; Lin, X.; Zhang, Y. When hierarchy meets 2-hop-labeling: Efficient shortest distance and path queries on road networks. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD ’18), Houston, TX, USA, 10–15 June 2018; pp. 1–25. [Google Scholar]
  14. Cho, H.J.; Chung, C.W. An efficient and scalable approach to CNN queries in a road network. In Proceedings of the International Conference on VLDB, Trondheim, Norway, 30 August–2 September 2005; pp. 865–876. [Google Scholar]
  15. Safar, M. Enhanced Continuous KNN Queries Using PINE on Road Networks. In Proceedings of the 2006 1st International Conference on Digital Information Management, Bangalore, India, 6 December 2007; pp. 248–256. [Google Scholar]
  16. Fu, Q.; Sun, G.; Zhang, Z. An efficient precomputation technique for approximation distance query in road networks. In Proceedings of the 2013 IEEE 14th International Conference on Mobile Data Management, Milan, Italy, 3–6 June 2013; pp. 131–135. [Google Scholar]
  17. Abeywickrama, T.; Cheema, M.A.; Storandt, S. Hierarchical Graph Traversal for Aggregate k Nearest Neighbors Search in Road Networks. In Proceedings of the International Conference on Automated Planning and Scheduling, Nancy, France, 14–19 June 2020; pp. 2–10. [Google Scholar]
  18. Bareche, I.; Xia, Y. A Distributed Hybrid Indexing for Continuous KNN Query Processing over Moving Objects. ISPRS Int. J. Geo-Inf. 2022, 11, 264. [Google Scholar] [CrossRef]
  19. Chen, Y.J.; Chuang, K.T.; Chen, M.S. Coupling or decoupling for KNN search on road networks? a hybrid framework on user query patterns. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, New York, NY, USA, 24–28 October 2011; pp. 795–804. [Google Scholar]
  20. Hlaing, A.T.; Htoo, H.; Ohsawa, Y. Efficient Reverse kNN Query Algorithm on Road Network Distances Using Partitioned Subgraphs. In Advances in Conceptual Modeling: ER 2014 Workshops, ENMO, MoBiD, MReBA, QMMQ, SeCoGIS, WISM, and ER Demos, Atlanta, GA, USA, 27–29 October 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 212–217. [Google Scholar]
  21. Luo, S.; Kao, B.; Li, G.; Hu, J.; Cheng, R.; Zheng, Y. Toain: A throughput optimizing adaptive index for answering dynamic kNN queries on road networks. Proc. Vldb Endow. 2018, 11, 594–606. [Google Scholar] [CrossRef]
  22. Li, C.; Gu, Y.; Qi, J.; He, J.; Deng, Q.; Yu, G. A GPU Accelerated Update Efficient Index for kNN Queries in Road Networks. In Proceedings of the 2018 IEEE 34th International Conference on Data Engineering (ICDE), Paris, France, 16–19 April 2018; pp. 881–892. [Google Scholar]
  23. Sun, Y.; Li, G.; Du, J.; Ning, B.; Chen, H. A subgraph matching algorithm based on subgraph index for knowledge graph. Front. Comput. Sci. 2020, 16, 163606. [Google Scholar] [CrossRef]
  24. Urban Road Network Data. Figshare. Dataset. Available online: https://figshare.com/articles/dataset/Urban_Road_Network_Data/2061897/1 (accessed on 30 September 2023).
Figure 1. The example of a road network G.
Figure 2. Example of kNN query of road network.
Figure 3. Example of tree node X(v_3).
Figure 4. Tree decomposition T_G based on minimum degree.
Figure 5. Tree decomposition T_G based on average weight.
Figure 6. Example of kNN Flow Graph Index.
Figure 7. Example of kNN Flow Graph Index.
Figure 8. Example of moving object list.
Figure 9. kNN query in dynamic environment.
Figure 10. Impact of Dataset Size on Algorithm Query Time.
Figure 11. Impact of Changing k-values on Algorithm Query Time.
Figure 12. Impact of Changing Object Density θ on Algorithm Query Time.
Figure 13. Comparison of the number of visited vertices.
Figure 14. Comparison of index construction time.
Figure 15. Comparison of index construction space.
Table 1. Symbol Definitions.
X(v).extList: the node expansion list of node X(v), sorted in ascending order of road network distance to the principal vertex v.
v.isCandidate: whether vertex v is a candidate vertex.
dist(v_1, v_2): the true shortest road network distance from vertex v_1 to vertex v_2.
curDist((v_1, v_2) | p): the current shortest road network distance from vertex v_1 to vertex v_2 via vertex p.
R: the kNN set of query point q, sorted in ascending order of road network distance from q.
root: the root of the kNN index.
threshold: the pruning threshold.
M: the candidate vertex set.
Table 2. kNN solution process of v_3 (q = v_3, k = 3).
Currently Processed Vertex | Candidate Point | Result Set | Relationship with Base Point | Extended Vertex Label | Min-Heap | Threshold
(v_3, -, 0) | No | - | - | (v_6, v_3, 2) | (v_6, v_3, 2) | +∞
(v_6, v_3, 2) | Yes | (v_6, 2) | Child | (v_7, v_3, 3) | (v_7, v_3, 3) | +∞
(v_7, v_3, 3) | No | (v_6, 2) | Child | (v_10, v_3, 4) | (v_10, v_3, 4) | +∞
(v_10, v_3, 4) | Yes | (v_6, 2), (v_10, 4) | Child | (v_4, v_3, 5) | (v_4, v_3, 5) | +∞
(v_4, v_3, 5) | No | (v_6, 2), (v_10, 4) | Ancestor | (v_8, v_3, 5), (v_1, v_4, 14) | (v_8, v_3, 5), (v_1, v_4, 14) | +∞
(v_8, v_3, 5) | Yes | (v_6, 2), (v_10, 4), (v_8, 5) | Ancestor | (v_9, v_3, 5), (v_1, v_8, 14), (v_5, v_8, 17) | (v_9, v_3, 5), (v_1, v_8, 14), (v_5, v_8, 17) | +∞
(v_9, v_3, 5) | No | (v_6, 2), (v_10, 4), (v_8, 5) | Child | (v_2, v_3, 6), (v_1, v_9, 14), (v_5, v_9, 17) | (v_2, v_3, 6), (v_1, v_9, 14), (v_5, v_9, 17) | 5
Table 3. List of moving objects at time t.
v_end | M | v_start | Offset | MN
v_6 | m_1 | v_3 | 1 | 1
v_10 | m_2 | v_3 | 2 | 1
v_5 | m_3 | v_8 | 3 | 1
v_1 | m_4 | v_9 | 2 | 1
v_8 | m_5 | v_7 | 2 | 1
Table 4. kNN solution process of q.
Object | Mapped Vertex | Currently Processed Vertex | Min-Heap | Threshold
- | v_3 | (v_3, -, 0) | (v_6, v_3, 2) | +∞
m_1 | v_6 | (v_6, v_3, 2) | (v_7, v_3, 3) | +∞
m_1 | v_7 | (v_7, v_3, 3) | (v_10, v_3, 4) | +∞
m_1, m_2 | v_10 | (v_10, v_3, 4) | (v_4, v_3, 5) | +∞
m_1, m_2 | v_4 | (v_4, v_3, 5) | (v_8, v_3, 5), (v_1, v_4, 14) | +∞
m_1, m_2, m_5 | v_8 | (v_8, v_3, 5) | (v_9, v_3, 5), (v_1, v_8, 14), (v_5, v_8, 17) | +∞
m_1, m_2, m_5 | v_9 | (v_9, v_3, 5) | (v_2, v_3, 6), (v_1, v_9, 14), (v_5, v_9, 17) | 5
Table 5. Statistics of road network data.
Dataset | Vertices |V| | Edges |E| | Tree height h | Tree width w
Quanzhou (QZ) | 5672 | 7521 | 210 | 79
Dalian (DL) | 13,605 | 17,984 | 209 | 65
Pune (PN) | 28,649 | 36,925 | 359 | 144
Baghdad (BD) | 60,108 | 88,876 | 560 | 201
Tehran (TR) | 110,580 | 147,339 | 721 | 356
Bangkok (BK) | 154,352 | 187,798 | 806 | 357
Table 6. Parameter Settings.
Parameter | Description | Values
k | Quantity of nearest neighbors | 10, 20, 50, 100
θ | Object density | 0.2, 0.3, 0.5, 0.7
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jiang, W.; Li, G.; Bai, M.; Ning, B.; Wang, X.; Wei, F. Graph-Indexed kNN Query Optimization on Road Network. Electronics 2023, 12, 4536. https://doi.org/10.3390/electronics12214536

