Article

GenTrajRec: A Graph-Enhanced Trajectory Recovery Model Based on Signaling Data

State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing 100191, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(13), 5934; https://doi.org/10.3390/app14135934
Submission received: 3 April 2024 / Revised: 24 June 2024 / Accepted: 27 June 2024 / Published: 8 July 2024
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)

Abstract

Signaling data are records of the interactions between users' mobile phones and their nearest cellular stations. They can provide long-term, continuous-time location data for large numbers of citizens and therefore have great potential in intelligent transportation, smart cities, and urban sensing. However, raw signaling data suffer from two problems: (1) Low positioning accuracy. Since signaling data only describe the interaction between the user and the mobile base station, they can only recover users' approximate geographical locations. (2) Poor data quality. Due to the limitations of mobile signals, user signaling may be missing or drifting. To address these issues, we propose a graph-enhanced trajectory recovery network, GenTrajRec, to recover precise trajectories from signaling data. GenTrajRec encodes signaling data through spatiotemporal encoders and enhances the traveling semantics by constructing a signaling transition graph. By fusing spatiotemporal information with deep traveling semantics, GenTrajRec can tackle the challenge of poor data quality and recover precise trajectories from raw signaling data. Extensive experiments have been conducted on two real-world datasets, Mobile Signaling and Geolife, and the results confirm the effectiveness of our approach: the positioning accuracy improves from 315 m per point to 82 m per point for signaling data using our network.

1. Introduction

Trajectory data are movers' continuous positioning data in space. They can reflect the traveling patterns of the movers, and thus allow further inference of other characteristics of them, so they play an essential role in intelligent transportation. Traditional trajectory data are usually constructed from GPS, which has high positioning accuracy and a stable sampling frequency. However, due to high cost and privacy concerns, these kinds of trajectory data are usually sparse, which makes it hard to build large-scale trajectory datasets. With the development of communication technology, signaling data have received more and more attention due to their large user base. Signaling data are the user's interaction data with base stations: the interactions in a period can be assembled into a continuous base station sequence, which reflects the user's trajectory in space. Due to the limitations of communication technology, signaling trajectories have the following two defects. (1) Low positioning accuracy. The signaling trajectory is essentially a mobile base station sequence, which cannot show the user's original trajectory. As shown in Figure 1a, it is difficult to determine whether the user's true trajectory is $T_1$ or $T_2$ based on the signaling sequence $(B_1, B_2, B_4)$. (2) Poor data quality. In the interaction process between users and base stations, problems such as redundancy, ping-pong, and missing base stations may occur. As shown in Figure 1b, the signaling sequence $(B_1, B_2, B_3, B_4)$ may be generated on the user's original trajectory $T_1$, resulting in redundant segments, shown as the dotted line in the trajectories. According to statistical data, in urban areas the positioning accuracy of base stations is 300 to 500 m, while in suburban areas, where the density of base stations is lower, it is 500 to 2000 m, which cannot meet the requirements for precise positioning.
Despite these defects, a number of novel studies focusing on signaling data have emerged in recent years, trying to capture the valuable information in them. Yao [1] observed that users' travel modes and temporal characteristics in signaling data are important, and constructed an XGBoost-based classifier and an LSTM-based model. Wang [2] pointed out that signaling data are non-Euclidean structured data and introduced GAN [3] to capture the graph information hidden in them. However, most similar research only applies signaling data to area-level or grid-level tasks rather than fine-grained trajectory recovery. Existing research only utilizes naive approaches such as HMM [4] and clustering models [5] to recover the original trajectory, which cannot fully exploit the spatiotemporal information in the signaling data.
Based on the above work, three challenges remain in the task of trajectory recovery from signaling data. (1) Existing research on signaling data mainly focuses on coarse-grained urban sensing tasks; how to design a fine-grained trajectory recovery framework remains an open problem. (2) Existing methods usually regard trajectory recovery as a sequential problem and use only the coordinates and time information of the trajectories, which makes it difficult to capture the spatial dependence between base stations. (3) Signaling data have complicated spatiotemporal features that reflect users' moving patterns through the locations of the cell towers, the times users spend within the detection range of the cell towers, and the unique IDs of the cell towers. How to fully utilize this rich information and fuse the multiple features well is still a huge challenge.
To tackle these challenges, in this paper we propose GenTrajRec, a novel graph-enhanced trajectory recovery framework dedicated to recovering precise trajectories from raw signaling data. To obtain fine-grained trajectories, we formulate a new real-position restoration problem that maps a coarse trajectory to an accurate location. To capture deep transition patterns, GenTrajRec first builds a trajectory transition graph module, which reconstructs the relations between base stations. To capture both the spatiotemporal features and the contextual information of a trajectory, GenTrajRec then utilizes a time encoder and a position encoder, which learn rich spatial and temporal patterns. To make full use of these features, we introduce a feature fusion module that aggregates spatiotemporal embeddings and transition graph embeddings. Finally, we adopt a transformer-based ST-layer to recover precise trajectories. Overall, our contributions can be summarized as follows.
  • We propose a novel framework for recovering precise trajectories from raw signaling data, which has great prospects in constructing large-scale trajectory datasets and solving the problem of data sparsity in most downstream traffic prediction tasks.
  • We introduce a trajectory transition graph module to capture deep transition patterns. Combining spatiotemporal features and transition pattern features, we design a transformer-based sequential model for trajectory reconstruction.
  • We conduct substantial experiments on two real-life datasets to evaluate the effectiveness of our proposed GenTrajRec. Experimental results demonstrate that GenTrajRec outperforms existing methods.

2. Related Work

2.1. Trajectory Prediction

Trajectory prediction has been quite popular in recent research, focusing on humans [6,7,8] and vehicles [9,10]. For instance, some researchers focus on a segment of a road or a single scene, while others focus on the movement of city dwellers. Traditional methods like the Kalman filter [11] and Monte Carlo methods [12] can be computationally intensive and cannot handle multimodal distributions. Machine learning methods like support vector machines [13] greatly improve positioning accuracy; however, they do not take the surroundings, such as traffic lights or maps, into consideration. Methods like DBN [14] can address this, but they cannot handle scenes such as lane changing and lane keeping.
Recently, deep learning methods have become more popular because of their performance. Some use reinforcement learning: for instance, Zhao [9] introduced a target-driven trajectory prediction framework and modeled the likelihood of a trajectory given a target. As graph neural networks gain popularity in representation learning, Mo [15] proposed an edge-enhanced graph attention network, a three-channel framework that can deal with the heterogeneity of target agents and traffic participants, and used graph neural networks to model the interactions among traffic individuals and between traffic individuals and their surroundings. However, all these methods focus on a small area, such as a road segment, and have not been applied at city scale.

2.2. Trajectory Recovery

Trajectory recovery, which aims to increase the sampling rate by recovering the missing points of a given trajectory, is another class of problem that helps uncover more useful information about human moving patterns. Some researchers focus on recovering human mobility and use graph neural networks as well as attention mechanisms to capture the spatial and temporal relationships in lengthy and sparse trajectories. Feng [16] proposed DeepMove to predict human mobility from long-range and sparse trajectories. Wang [17] combined a seq2seq model with a Kalman filter and proposed a deep hybrid trajectory recovery model, DHTR, to derive more accurate position estimates. Xia [18] proposed AttnMove, an attentional neural network-based model, to recover individual trajectories' unobserved locations at fine-grained spatial-temporal resolution. PeriodicMove [19] used a graph neural network per trajectory to capture users' transition patterns. However, these methods are user-based and cannot exploit other users' trajectories to predict an unknown user's trajectory. Other researchers focus on vehicle trajectories: Ren [20] proposed a map-constrained multi-task seq2seq trajectory recovery model that generates missing points and maps them to the road network to improve recovery accuracy. Chen [21] proposed a road-network-enhanced trajectory recovery model that combines road network structure, GPS trajectory representation, and a spatial-temporal transformer, using GAT [22] to aggregate information. However, these methods are not appropriate for human mobility patterns. To improve positioning accuracy, some researchers have presented novel trajectory representations. Yao [23] proposed TrajGAT to model hierarchical spatial structure while reducing the GPU memory usage of the transformer. Liang proposed TrajFormer [24] to extend transformers to trajectory classification.

2.3. Time-Series Sequential Modeling

Time-series forecasting is important in many areas, such as traffic flow prediction [25,26], electricity prediction, and weather prediction. Most methods fall into two broad categories: classical methods and machine learning methods. Classical methods such as ARIMA [27] rely heavily on the choice of parameters, and their performance lags behind machine learning methods. Recently, with the development of deep learning, many researchers have adopted RNN-based methods such as LSTM [28] and GRU [29], as well as transformer-based methods, to model the spatial-temporal features of time series; transformer-based methods dominate. Lim [30] proposed a temporal fusion transformer to improve the interpretability of multi-horizon time-series forecasting. Zhou [31] proposed Informer, which uses a ProbSparse self-attention mechanism to decrease the memory usage and time complexity of the transformer. Wu [32] proposed Autoformer, inspired by stochastic process theory, with an autocorrelation mechanism based on series periodicity. Zhou [33] proposed a frequency-enhanced decomposed transformer that achieves linear complexity in the sequence length. Other researchers argue that transformers may not be as effective as commonly believed for time-series forecasting. Zeng [34] proposed LTSF-Linear, which consists of only simple linear layers and reaches comparable results on multiple datasets. MLP-Mixer [35] and TSMixer [36] use only MLP-based models to achieve comparable performance on vision and time-series data. Because of their efficiency, more mixer-based methods are emerging. We summarize some recent time-series studies in Table 1.

3. Preliminaries

3.1. Transition Graph

The transition graph can be denoted as $G = (V, E)$, which models correlations between base stations. $V$ represents the base stations corresponding to distinct locations, and $E$ symbolizes the connections or pathways between these base stations.

3.2. SD Trajectory Sequence

We define user $u$'s $n$th SD trajectory sequence as $\Gamma_u^n = (P_u^n, y_u^n)$. $P_u^n$ is the $n$th coarse trajectory sequence associated with user $u$, which can be denoted as a sequence of records $p_{u,i}^n = \{(v_{u,i}^n, lat_{u,i}^n, lon_{u,i}^n, t_{u,i}^{n,(in)}, t_{u,i}^{n,(out)}) \mid 1 \le i \le 2k+1\}$. Here, $v_{u,i}^n$ and $(lat_{u,i}^n, lon_{u,i}^n)$ represent the ID and coordinates (latitude and longitude) of a base station, and $t_{u,i}^{n,(in)}$, $t_{u,i}^{n,(out)}$ represent user $u$'s times of entering and leaving the detection range of the base station. $y_u^n$ indicates the precise location corresponding to $P_u^n$, represented by the latitude and longitude $(lat_u^{o,n}, lon_u^{o,n})$ at a specific timestamp $t_u^{o,n}$, and $p_{u,k}^n$ refers to the coarse trajectory record that matches the precise location. Taking signaling data as an example, this means that the user's precise location is $(lat_u^{o,n}, lon_u^{o,n})$ at $t_u^{o,n}$ while this user is also detected by the base station at $(lat_{u,i}^n, lon_{u,i}^n)$; we take this record's previous $k$ records and next $k$ records, and thus obtain a coarse trajectory sequence mapped to a precise location.
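For concreteness, one training sample pairs a window of $2k+1$ coarse records with one precise location. A minimal sketch of this data structure (the class, field, and function names here are ours, not the paper's):

```python
from dataclasses import dataclass

@dataclass
class SignalingRecord:
    station_id: int    # v: base-station ID
    lat: float         # latitude of the base station
    lon: float         # longitude of the base station
    t_in: float        # time of entering the station's detection range
    t_out: float       # time of leaving the station's detection range

def build_sample(records, k, label):
    """Pair the 2k+1 coarse records centred on the matched record with
    the precise-location label (lat, lon, timestamp)."""
    assert len(records) == 2 * k + 1
    return records, label

# One toy sample: 5 records (k = 2) mapped to a precise OTT-style label.
recs = [SignalingRecord(i, 39.9, 116.3, float(i), float(i + 1)) for i in range(5)]
sample, label = build_sample(recs, 2, (39.91, 116.31, 2.5))
```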

3.3. Real Trajectory Recovery

The objective of our model is to recover the real locations of users from the coarse trajectory data. It can be formalized as
$$f(P_u^n, G) = \hat{Y}_u^n,$$
where $\hat{Y}_u^n = (\hat{x}_u^{o,n}, \hat{y}_u^{o,n}, t_u^n)$ denotes the recovered precise location of user $u$ at any target time $t_u^n$, with $t_{u,k}^{n,(in)} \le t_u^n \le t_{u,k}^{n,(out)}$, meaning the target time should lie between $t_{u,k}^{n,(in)}$ and $t_{u,k}^{n,(out)}$ for each of the user's SD trajectory sequences.

4. Methodology

The architecture of GenTrajRec is presented in Figure 2. It consists of spatiotemporal trajectory embedding, transition graph embedding, a feature fusion module, and an ST-layer.

4.1. Spatiotemporal Trajectory Embedding

The spatiotemporal trajectory embedding contains two parts: a position encoder, which obtains the embedding of the latitude and longitude features, and a time encoder, which handles uneven time intervals and obtains the time features of each coarse trajectory record.
(1) Position Encoder: This part was designed to obtain the embedding of the latitude and longitude of the trajectories. Following the definitions in the preliminary part, the formula of the representations in the position encoder is as follows.
$$\bar{x}_{u,i}^n, \bar{y}_{u,i}^n = \text{Normalized}(lat_{u,i}^n, lon_{u,i}^n),$$
$$x_{u,i}^{pos,n} = \text{MLP}(\bar{x}_{u,i}^n, \bar{y}_{u,i}^n),$$
where $\bar{x}_{u,i}^n, \bar{y}_{u,i}^n$ are the normalized latitude and longitude of each location in the trajectory sequence, and $x_{u,i}^{pos,n}$ is the positional embedding of this location. The embedding of the $n$th trajectory for user $u$ is defined as $x_u^{pos,n}$, and the position representation of the model is defined as $X_{loc}$.
(2) Time Encoder: Since the time interval between two consecutive signaling records can vary, we cannot ignore this problem. To tackle this problem, we design a time encoder to deal with the timestamps and the uneven time interval information in signaling records. For each timestamp in the trajectory segment p u , i n , the formulas are as follows.
$$t_i^u = \text{Extract}(t_{u,i}^{n,(in)}, t_{u,i}^{n,(out)}),$$
$$x_{time,i}^u = \cos(W_t t_i^u + b_t),$$
where Extract extracts the features of the timestamps (hour, minute, and second values). We use the cosine function with learnable parameters $W_t$ and $b_t$ on each record to obtain periodic time features. We use $X_{time}^u = (x_{time,1}^u, x_{time,2}^u, \ldots, x_{time,k}^u)$ as the time features of the trajectory for an individual $u$. The time features of the whole model are defined as $X_{time}$.
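The time encoder can be sketched as follows. We assume Extract returns the hour, minute, and second of the entry timestamp (the paper's Extract takes both timestamps; using only the first is our simplification), and $W_t$, $b_t$ are random stand-ins for the learned parameters:

```python
import time
import numpy as np

rng = np.random.default_rng(1)
d_time = 128
W_t = rng.normal(0, 0.1, (3, d_time))  # learnable in the real model
b_t = np.zeros(d_time)

def extract(t_in, t_out):
    # Hour/minute/second of the entry timestamp (UTC); a simplification
    # of the paper's Extract, which receives both timestamps.
    tm = time.gmtime(t_in)
    return np.array([tm.tm_hour, tm.tm_min, tm.tm_sec], dtype=float)

def time_encode(t_in, t_out):
    t = extract(t_in, t_out)
    return np.cos(t @ W_t + b_t)   # periodic time features in [-1, 1]

x_time = time_encode(1_700_000_000, 1_700_000_060)
```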

4.2. Transition Graph Embedding

We utilize the coarse trajectory transition graph to model the transition patterns between base stations. There are multiple trajectory-transition patterns, which means the edges can be constructed in multiple ways. Most researchers constructed the global graph based on geographical distances. However, this concept of direct spatial proximity does not account for the actual adjacency between two locations. Taking vehicles as an example, the presence of one-way streets might require a detour to reach the destination, meaning the two locations cannot be considered practically adjacent. Additionally, in real-world scenarios, their proximity relationship varies with different modes of travel. For instance, a distance of 1 km could be considered adjacent for cars, but for pedestrians, these two base stations might not be considered adjacent.
Motivated by these observations, we use transition patterns and the time interval between base stations to construct the transition graph. An edge is established between two base stations only if the time for a mobile phone user to travel from one to the other is less than a certain threshold; if the edge already exists, its weight is increased. We define all the trajectory sequences as $P_{traj} = (P_{sub_1}, P_{sub_2}, \ldots, P_{sub_n})$, where $P_{sub_i} = (s_{i,1}, s_{i,2}, \ldots, s_{i,k})$. Each $s_{i,j} = (l_{i,j}, t_{i,j})$ represents the location and time information (as defined in the preliminaries) of the coarse trajectory sequence at step $j$.
The graph construction algorithm is shown in Algorithm 1. We construct a directed graph $G = (V, E)$. $V$ is the node set of the graph, representing the coarse area indices; each node contains the latitude, longitude, and ID of a base station. $E$ is the edge set. In our work, we construct the graph by transition time. For instance, for user $u$'s trajectory sequence, suppose the time threshold is $\alpha$; for $p_{u,i}^n = (v_{u,i}^n, x_{u,i}^n, y_{u,i}^n, t_{u,i}^{n,(in)}, t_{u,i}^{n,(out)})$, if $|t_{u,i+1}^{n,(in)} - t_{u,i}^{n,(in)}| < \alpha$, we reckon there is an edge between $v_{u,i}^n$ and $v_{u,i+1}^n$. We construct the graph from all the trajectory sequences, and the number of such pairs is regarded as the edge weight. The graph is intended to reveal hidden transition patterns. Thus, we use Node2Vec [39] as the sampling method to generate positive samples, with randomly selected nodes as negative samples. To obtain the embedding of a whole trace, we denote a single positive random walk trace ID sequence as $(v_{pos,0}, v_{pos,1}, \ldots, v_{pos,k-1})$ and the negative random walk trace ID sequence as $(v_{neg,0}, v_{neg,1}, \ldots, v_{neg,k-1})$. Then, we apply a sliding window of size $w$ to obtain sub-traces, and the formula for the trace embedding is
$$r_{pos,0}, r_{pos,1}, \ldots, r_{pos,k-w} = \text{unfold}(v_{pos,0}, v_{pos,1}, v_{pos,2}, \ldots, v_{pos,k}),$$
where $r_{pos,i}$ is a sub-trace of size $w$. For each trace, the formula for embedding the trace is as follows:
$$w_{pos,i} = \text{Embedding}(r_{pos,i}),$$
and the embedding for the whole single trace is
$$x_{pos} = \frac{1}{k-w-1} \sum_{i=1}^{k-w-1} w_0 \cdot w_{pos,i},$$
where $w_{pos,i}$ is the embedding of the node $v_{pos,i} \in V$. Equation (8) gives the embedding of the whole trace; since $v_{pos,0}$ is the start of the trace, we want to obtain the whole representation of the sampled trace, and the procedure for the negative sample is the same. Once we obtain the embeddings of all the nodes, the embedding of the ID sequence of a real trajectory can be represented as
$$X_{seq} = \text{Embedding}((v_1^{ID}, v_2^{ID}, \ldots, v_k^{ID})),$$
where $X_{seq}$ represents the coarse trajectory area features of the trajectory sequence, $v_i^{ID}$ is the ID of the sequence at step $i$, and $k$ is the length of the trajectory sequence. We maintain the embedding of each node in the graph; this method can also be used for other downstream tasks. The node features output by this module are denoted as $X_{emb}$.
Algorithm 1: Graph construction algorithm
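The time-threshold construction described above can be sketched as follows (station IDs and the threshold value are illustrative; edge weights count repeated transitions):

```python
from collections import defaultdict

def build_transition_graph(trajectories, alpha):
    """Construct the directed, weighted transition graph.
    trajectories: list of lists of (station_id, t_in, t_out).
    An edge u -> v is added when |t_in(v) - t_in(u)| < alpha, the
    time-threshold test from the text; its weight counts how often
    the transition occurs across all trajectory sequences."""
    edges = defaultdict(int)
    for traj in trajectories:
        for (u, u_in, u_out), (v, v_in, v_out) in zip(traj, traj[1:]):
            if abs(v_in - u_in) < alpha:
                edges[(u, v)] += 1
    return dict(edges)

# Two toy trajectories: A->B is fast (kept, weight 2), B->C is slow (dropped).
g = build_transition_graph(
    [[("A", 0, 5), ("B", 10, 20), ("C", 200, 210)],
     [("A", 0, 4), ("B", 8, 15)]],
    alpha=60)
```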

4.3. Feature Fusion Module

To capture the spatial and temporal features of the coarse trajectory, we use a feature fusion module to obtain integrated representations of the data; the architecture of this module is presented in Figure 3.
To fuse the time features $X_{time} \in \mathbb{R}^{b \times c \times d_1}$ and the position features $X_{loc} \in \mathbb{R}^{b \times c \times d_2}$, the formulas of the fusion module are as follows:
$$X_{res} = W_2 \cdot \text{Feedforward}(X_{time} \,\|\, X_{loc}) + b_2,$$
$$X_{norm} = \text{LayerNorm}(X_{res}),$$
$$X_{mid} = \text{Dropout}(\phi(X_{norm})),$$
$$X_{fused} = X_{res} + \text{Dropout}(W_3 \cdot X_{mid} + b_3).$$
In these formulas, $W_2$, $b_2$, $W_3$, $b_3$ are learnable parameters. We use LayerNorm to capture the local embedding of the spatial-temporal features, while $X_{res}$ carries the original global features of the network; these two parts help smooth the training process, and the residual connection preserves information across layers. $\phi$ is the activation function of the model, for which we use ReLU [40]. This module is used both to fuse the outputs of the position encoder and time encoder, and to fuse the result with the output of the graph embedding module.
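A numpy sketch of the four fusion equations above, treating Feedforward as a single linear map, omitting dropout (identity at inference), and using random stand-ins for the learned weights:

```python
import numpy as np

rng = np.random.default_rng(2)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

class FeatureFusion:
    """Concat -> feedforward -> LayerNorm -> activation -> residual."""
    def __init__(self, d1, d2, d):
        self.W1 = rng.normal(0, 0.1, (d1 + d2, d))  # the "Feedforward" map
        self.W2 = rng.normal(0, 0.1, (d, d)); self.b2 = np.zeros(d)
        self.W3 = rng.normal(0, 0.1, (d, d)); self.b3 = np.zeros(d)
    def __call__(self, x_time, x_loc):
        # X_res = W2 . Feedforward(X_time || X_loc) + b2
        x_res = np.concatenate([x_time, x_loc], -1) @ self.W1 @ self.W2 + self.b2
        x_norm = layer_norm(x_res)               # X_norm
        x_mid = np.maximum(x_norm, 0.0)          # ReLU; dropout skipped
        return x_res + (x_mid @ self.W3 + self.b3)   # X_fused (residual)

fuse = FeatureFusion(d1=128, d2=256, d=256)
out = fuse(np.ones(128), np.ones(256))
```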

4.4. ST-Layer and Trajectory Recovery Layer

The ST-layer is used to obtain the spatial-temporal features from the output of the last feature fusion. In our model, we use the transformer encoder [41] to obtain the spatial-temporal features of the trajectory and the target.
To achieve this, we first construct the target embedding of the model, based on the position encoder, time encoder, and feature fusion. Suppose we want to recover the user's real position at time $t_{pre}$, which lies within the middle record of the trajectory; the features of $t_{pre}$ serve as the target time features. For each user $u$, the ID and coordinates of the middle (the $k$th) record of the coarse trajectory are $(v_{u,k}^n, lat_{u,k}^n, lon_{u,k}^n)$. To construct the target embedding of user $u$'s $n$th coarse trajectory, the formulas are
$$T_{pre} = \text{Extract}(t_{pre}),$$
$$x_{time,k}^u = \cos(W_{pre} T_{pre}^u + b_{pre}),$$
$$x_{pos,k}^u = \text{PosEncoder}(lat_{u,k}^n, lon_{u,k}^n),$$
$$X_{target,k}^u = \text{FeatureFusion}(\text{FeatureFusion}(x_{pos,k}^u, x_{time,k}^u), \text{Embedding}(v_{u,k}^n)),$$
where $T_{pre}^u$ contains the features of the timestamp, $W_{pre}$ and $b_{pre}$ are learnable parameters, and $X_{target,k}^u$ is the target embedding of user $u$'s coarse trajectory when recovering the real position at time $t_{pre}$. FeatureFusion denotes the feature fusion module's function. We use $X_{target}$ to represent this kind of feature. The formula for the ST-layer is as follows:
$$X_{raw} = \text{Projector}(X_{fused} \,\|\, X_{target}),$$
$$Z_{st} = \text{LayerNorm}(X_{raw} + \text{MultiHead}(X_{raw})),$$
$$X_{st} = \text{LayerNorm}(Z_{st} + \text{FeedForward}(Z_{st})),$$
where $X_{st}$, the output of the ST-layer, contains the spatial-temporal features of all the trajectories, and $Z_{st}$ is the hidden state.
Then, we use the trajectory recovery layer to obtain the recovered point. Supposing the output of the ST-layer for user $u$ when recovering the location at time $t$ is $X_{st,u}^t = (x_{st,u}^{t_i}) \,|\, i \le k$, the formulation of this layer is as follows:
$$(x_{u,lat}^t, y_{u,long}^t) = W_4 (x_{st,u}^{t_0} \,\|\, x_{st,u}^{t_1} \,\|\, \cdots \,\|\, x_{st,u}^{t_k}) + b_4,$$
where $x_{st,u}^{t_i}$ is the representation at time step $i$, and $W_4$ and $b_4$ are trainable parameters.
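A compact numpy sketch of the ST-layer's attention step and the recovery layer's concatenate-and-project step, with single-head attention standing in for MultiHead and random weights in place of learned ones:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    e = np.exp(z - z.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def self_attention(X):
    # Single-head scaled dot-product attention, a stand-in for MultiHead.
    scores = softmax(X @ X.T / np.sqrt(X.shape[-1]))
    return scores @ X

def recover_position(X_st, W4, b4):
    """Trajectory recovery layer: concatenate the per-step representations
    and project to (lat, lon) with one linear map."""
    return np.concatenate(list(X_st)) @ W4 + b4

k, d = 5, 32
X = rng.normal(size=(k, d))            # fused per-step features
Z = self_attention(X)                  # attention sublayer output
W4 = rng.normal(0, 0.1, (k * d, 2)); b4 = np.zeros(2)
pred = recover_position(Z, W4, b4)     # recovered (lat, lon)
```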

4.5. Optimization

We now present the loss function of the model. Since we want our model to be general, we combine supervised and self-supervised objectives into one loss. For the supervised part, we use MSELoss; for the unsupervised part, the loss is defined below. Suppose there are $M$ test samples with labels $Y_{real} = ((y_{1,lat}, y_{1,long}), (y_{2,lat}, y_{2,long}), \ldots, (y_{M,lat}, y_{M,long}))$, the latitudes and longitudes of the precise (real) locations of the users, and predicted locations $Y_{pred} = ((\hat{y}_{1,lat}, \hat{y}_{1,long}), (\hat{y}_{2,lat}, \hat{y}_{2,long}), \ldots, (\hat{y}_{M,lat}, \hat{y}_{M,long}))$. The loss is:
$$\mathcal{L} = \beta \mathcal{L}_{rmse} + \mathcal{L}_{pos} + \mathcal{L}_{neg},$$
$$\mathcal{L}_{rmse} = \frac{1}{M} \sum_{i=1}^{M} \left[ (y_{i,lat} - \hat{y}_{i,lat})^2 + (y_{i,long} - \hat{y}_{i,long})^2 \right],$$
$$\mathcal{L}_{pos} = -\log(\sigma(x_{pos})),$$
$$\mathcal{L}_{neg} = -\log(1 - \sigma(x_{neg})).$$
Here, $\sigma$ is the sigmoid function. In this way, the loss combines supervised and unsupervised learning, and $\beta$ controls the weight of the $\mathcal{L}_{rmse}$ term.
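The combined objective can be sketched as follows; the negative signs on the sampling terms follow our reading of the standard negative-sampling loss, and the scalar scores stand in for the trace-embedding similarities:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def combined_loss(y_real, y_pred, x_pos, x_neg, beta=1.0):
    """Supervised MSE over predicted coordinates plus the
    negative-sampling terms from the graph-embedding branch.
    x_pos / x_neg: scalar scores of a positive / negative sample."""
    diff = y_real - y_pred
    l_mse = np.mean((diff ** 2).sum(axis=1))     # the L_rmse term
    l_pos = -np.log(sigmoid(x_pos))              # positive-sample term
    l_neg = -np.log(1.0 - sigmoid(x_neg))        # negative-sample term
    return beta * l_mse + l_pos + l_neg

# One sample whose predicted longitude is slightly off.
loss = combined_loss(
    np.array([[39.90, 116.40]]), np.array([[39.90, 116.43]]),
    x_pos=2.0, x_neg=-2.0)
```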

5. Experiments

5.1. Datasets

We conduct our experiments on two datasets. Mobile Signaling Data (MSD): a private dataset collected by China Mobile Communications Corporation that contains two parts. The first part is raw mobile signaling data capturing the coarse moving patterns of mobile phone users in Beijing; its positioning accuracy is quite low, averaging more than 300 m. The other part is OTT (over-the-top) data, generated when mobile phone users use the Internet. OTT data contain precise locations of the users but are sparse compared with signaling data; they are used as labels in this paper. Each user has more than 200 signaling records per day but only about 15 OTT records per day. We chose the period from 4 March 2023 to 10 March 2023 and selected mobile phone users in Beijing. Each signaling record contains the station ID, station location, arrival timestamp, and departure timestamp.
Geolife [42] was collected in the Geolife project by Microsoft Research Asia from 182 users between April 2007 and August 2012. Each record contains a user ID, GPS location, and timestamp. Since most of the records are in Beijing, we use only the trajectories located there; the dataset contains about 16,000 long trajectories in Beijing. In this work, we mainly focus on the area within the Fifth Ring Road of Beijing, since over 70 percent of the Beijing trajectories fall within it. The dataset information is presented in Table 2.
Preprocessing: To obtain coarse trajectories and map them to precise locations, in the MSD dataset we use the station ID and the station's location as the coarse location information, and match each OTT record to the signaling records whose time range and radius range contain it; we take the detection range of base stations to be 500 m, and the matched signaling record must cover the OTT record's time. Once an OTT record is matched to a signaling record, we take that signaling record's previous $m$ records and next $m$ records to form the whole coarse trajectory sequence, and use the OTT record's location as the precise location of the sequence, which serves as the sample's label. For the Geolife dataset, we partition the geographical space into grid cells of equal size (500 m in width and height) and use the center of each grid cell as the coarse position, which can be regarded as a virtual base station. All points in the same grid cell are treated as the same location, and the trajectory sequence is constructed from the changes of grid cell. To construct a single record $p_{u,i}^n = (v_{u,i}^n, lat_{u,i}^n, lon_{u,i}^n, t_{u,i}^{n,(in)}, t_{u,i}^{n,(out)})$, $t_{u,i}^{n,(in)}$ is the timestamp of the first GPS point in the grid cell, while $t_{u,i}^{n,(out)}$ is the timestamp of the first GPS point in the next grid cell; for the last record, $t_{u,i}^{n,(out)}$ is the timestamp of the last GPS point. In this way, we construct "virtual" signaling records. We then follow the same procedure as for the MSD dataset and use the coordinates of the GPS points in the matched grid cell as labels.
For both datasets, following the definitions in the preliminaries, we filter out sequences with either of the following features to ensure that the time is consecutive: (1) $\exists i,\ t_{u,i}^{n,(in)} = t_{u,i}^{n,(out)}$; (2) $\exists i,\ t_{u,i}^{n,(out)} \neq t_{u,i+1}^{n,(in)}$. We then construct the datasets by pairing the coarse trajectories with their matched precise locations, yielding the sample dataset for the experiments.
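The grid partitioning used for Geolife can be sketched as follows; the origin and the degrees-per-cell constant are illustrative (roughly 500 m in latitude), since the paper does not give the exact projection:

```python
def to_grid(lat, lon, origin=(39.75, 116.15), cell_deg=0.005):
    """Map a GPS point to its grid cell and the cell centre (the
    "virtual base station"). cell_deg approximates 500 m in latitude
    degrees; a real implementation would project to metres first."""
    i = int((lat - origin[0]) / cell_deg)
    j = int((lon - origin[1]) / cell_deg)
    centre = (origin[0] + (i + 0.5) * cell_deg,
              origin[1] + (j + 0.5) * cell_deg)
    return (i, j), centre

cell, centre = to_grid(39.907, 116.397)
```

Consecutive GPS points falling in the same cell collapse to one "virtual" signaling record, and a new record starts when the cell index changes.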

5.2. Experiment Settings

We conduct our experiments on an A100 Tensor Core GPU for MSD and an Nvidia 3090 for Geolife, each with 64 GB of memory. Training runs for 600 epochs with a learning-rate decay every 100 epochs. The window size $m$ is 10 for the signaling dataset and 5 for Geolife. As for other hyper-parameters, the output dimension of the position encoder is 256, the output dimension of the time encoder is 128, and the coarse area embedding dimension is 256 for the signaling dataset; for the Geolife dataset, these three dimensions are 64, 32, and 64. We sort each user's complete trajectory record by time and take 80% of the trajectories as the training set, 10% as the validation set, and the remaining 10% as the test set, using grids as "virtual cell towers" and the latitude and longitude of the grid centers as the approximate positions of these cell towers.

5.3. Baselines

To demonstrate the effectiveness of the model, we compare it with several representative baselines. Since no existing baseline addresses exactly our task, we selected related models and adapted them to our needs. We compare GenTrajRec with deep learning methods as well as traditional machine learning methods. The first two baselines are statistical baselines.
  • STD: A coarse method that uses the matched cell tower's position as the reconstructed position.
  • STD3: This method uses the mean location of the three most recent cell towers as the predicted position.
The next two baselines are machine learning methods.
  • LinearRegression: A traditional machine learning method. We fit a linear model on all the coordinates and timestamps in the raw coarse trajectory to reconstruct the precise position.
  • XGBoost: A gradient-boosted tree method that performs well in many areas. We use the same features as for LinearRegression to reconstruct the precise position.
The next four baselines are basic deep learning methods that are popular across many tasks; we modified them to handle the fused features.
  • MLP: A simple feed-forward neural network.
  • LSTM: An RNN architecture that is widely used for time-series problems.
  • BiLSTM: An extension of LSTM that processes the input sequence in both the forward and backward directions.
  • Transformer: Recently applied to time-series forecasting, this model uses the attention mechanism to adaptively weight each time step.
Finally, we include state-of-the-art methods related to our work, modified to fit our dataset and task.
  • DeepMove [17]: A seq2seq model for the trajectory recovery problem that uses an attention mechanism to aggregate information from multiple time steps.
  • TSMixer [36]: Originally proposed for time-series prediction; we regard the trajectory sequence as a special kind of time series and modified the model to fit our task.
  • TrajFormer [24]: Originally used for a classification task; we adopted its trajectory representation and applied it to our regression task.

5.4. Metrics

All distance calculations in this paper are based on geographic distance. For any two latitude–longitude pairs ( x 2 , y 2 ) and ( x 1 , y 1 ) , where x 1 and x 2 are latitudes and y 1 and y 2 are longitudes (in radians), the geographic distance is calculated using the following formula.
S = 2 · arcsin( √( sin²((x2 − x1)/2) + cos(x1) · cos(x2) · sin²((y2 − y1)/2) ) ) × 6,371,393.
The unit of the calculation here is meters. The geographic distance calculated using the above formula is represented as “geodesic”. For each sample, the reconstructed location is ( x ^ i , y ^ i ) , while the real location is ( x i , y i ) . Our model is evaluated using the following indicators:
  • RMSE: The equation of this part is
    RMSE = √( (1/m) · Σ_{i=1}^{m} geodesic((x_i, y_i), (x̂_i, ŷ_i))² ).
  • MAE: The equation of this part is
    MAE = (1/m) · Σ_{i=1}^{m} geodesic((x_i, y_i), (x̂_i, ŷ_i)).
  • Rate150: The proportion of reconstructed positions whose geographic distance from the actual position is within 150 m. We consider a reconstructed position within 150 m of the real position to be precise. The equation for Rate150 is
    Rate150 = N( geodesic((x_i, y_i), (x̂_i, ŷ_i)) < 150 m ) / m.
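As a concrete sketch, the geodesic distance and the three metrics can be implemented as follows. Coordinates are assumed to be given in degrees and converted to radians internally; the function names are ours, not the paper's.

```python
import math

R_EARTH = 6_371_393.0  # Earth radius in metres, as in the paper

def geodesic(p, q):
    """Haversine distance in metres between p = (lat, lon) and q = (lat, lon),
    with coordinates in degrees."""
    x1, y1 = map(math.radians, p)
    x2, y2 = map(math.radians, q)
    h = math.sin((x2 - x1) / 2) ** 2 + \
        math.cos(x1) * math.cos(x2) * math.sin((y2 - y1) / 2) ** 2
    return 2 * R_EARTH * math.asin(math.sqrt(h))

def rmse(true_pts, pred_pts):
    """Root mean squared geodesic error over paired point lists."""
    errs = [geodesic(t, p) ** 2 for t, p in zip(true_pts, pred_pts)]
    return math.sqrt(sum(errs) / len(errs))

def mae(true_pts, pred_pts):
    """Mean absolute geodesic error."""
    errs = [geodesic(t, p) for t, p in zip(true_pts, pred_pts)]
    return sum(errs) / len(errs)

def rate150(true_pts, pred_pts, threshold=150.0):
    """Fraction of predictions within `threshold` metres of the truth."""
    hits = sum(geodesic(t, p) < threshold for t, p in zip(true_pts, pred_pts))
    return hits / len(true_pts)
```

For example, one degree of longitude at the equator is roughly 111 km under this formula, far above the 150 m threshold, while a 0.001° offset (about 111 m) still counts as a precise reconstruction.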

5.5. Experiment Results

We first compare our model with all the baselines above. For a fair comparison, all the deep learning baselines were re-implemented with our graph embedding module added, and we compare these improved baselines with our model. Here, we present the results on the two datasets.
The results in Table 3 show that our model achieves state-of-the-art performance on all benchmarks, and several other observations can be made from this table.
Firstly, deep learning methods outperform traditional machine learning methods on both datasets, and the STD result shows that the positioning accuracy of the raw signaling data is relatively low. Secondly, bidirectional models perform better than unidirectional ones, which indicates the importance of the spatial-temporal features and shows that both past and future mobility patterns help locate the exact positions of users. Thirdly, transformer-based models outperform MLP-based and LSTM-based models, showing that the self-attention mechanism substantially improves performance and that transformer-based models can be applied to more time-series scenarios. Moreover, TSMixer performs better than transformer-based methods.
We visualize our results in more detail by showing how the reconstruction rate changes as the distance threshold increases, comparing our model with some of the baselines on the MSD dataset. To keep the figure readable, we visualize only a subset of the baselines.
The results in Figure 4a show that our model improves performance by increasing the ratio of single-location errors under 100 m. Figure 4b, in which samples are grouped by the distance between the cell tower and the real location, shows that this distance may not be the main factor affecting our model's performance, since increasing the distance between the matched cell tower and the real position does not drastically degrade the results.
To conclude, our model achieves the best results compared with both traditional and deep learning methods, which verifies its effectiveness in capturing the spatial-temporal features of coarse trajectory sequences, and opens the possibility of using coarse-grained trajectories to rebuild or generate fine-grained trajectories for any given period.

5.6. Ablation Study

To analyze the effectiveness of the proposed components, we conduct an ablation study, removing them one at a time. Specifically, we remove the self-supervised graph embedding part, the feature fusion part, and both together. The feature fusion part is removed by replacing it with a single MLP layer, while the self-supervised graph embedding part is removed by dropping the graph embedding module and using only the MSE loss during optimization. The results are shown in Table 4.
According to the results, our model outperforms all the ablations, which illustrates the importance of each proposed component. Performance drops sharply when we remove the self-supervised graph embedding part, since this module generates the area embeddings, and removing it also removes the unsupervised learning signal. We also find that the feature fusion part helps, since it preserves the raw information while also representing the change of the spatial-temporal features. We further visualize the reconstruction results of each ablation in more detail, showing the reconstruction rate as the distance threshold increases from 100 m to 500 m in steps of 100 m; the results on MSD are shown in Figure 5.
According to the results in Figure 5a, our model performs well in every distance range. We also find that the graph embedding part contributes more than the feature fusion part within the 100 m range. A possible reason is that the graph embedding part helps locate the position within the coarse area and then predict a likely location inside it, improving positioning accuracy, while the feature fusion part tends to retain the original position information and helps accuracy at longer ranges, as shown in Figure 5b.
To further demonstrate the effectiveness of the graph embedding part, we also remove it from all the baselines whose original versions did not contain it, and report the performance of GenTrajRec in Table 5.
The results in Table 5 show that removing the graph embedding part degrades the performance of the baselines. Even without the graph embedding part, our GenTrajRec still achieves the best performance among all the baselines. The results also imply that both the cell tower IDs and their latitudes and longitudes help with the position restoration task, and that the graph embedding module could serve further downstream tasks.

Case Analysis

To show that our model can handle both short and long signaling sequences, we visualize several cases and analyze the error distribution of our model in detail. The analysis has three parts: the first two concern restoring a single point, and the last concerns recovering a long trajectory.
The results in Figure 6 and Figure 7 show that the MLP-based model tends to place the recovered position on the trajectory line, while the other models can capture the offset between the real place and the base station. Moreover, these figures also show that our model can locate the real position even when a user is merely wandering among several places. Each signaling trajectory is annotated with timestamps, and the black point carries the timestamp of the ground-truth position.
We also apply our model to a long sequence, using the midpoint of the timestamps of each pair of neighboring signaling records as the predicted timestamp. The blue part is the signaling sequence of a single mobile phone user, while the red part is the recovered trajectory.
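The midpoint-timestamp scheme for long-sequence recovery can be sketched as follows (a hypothetical helper; signaling records are assumed to be (t_in, t_out) pairs):

```python
def midpoint_timestamps(records):
    """Given consecutive signaling records as (t_in, t_out) pairs,
    return one predicted timestamp per gap between neighboring records,
    taken as the midpoint between the end of one record and the start
    of the next."""
    stamps = []
    for cur, nxt in zip(records, records[1:]):
        stamps.append((cur[1] + nxt[0]) / 2)  # middle of t_out and next t_in
    return stamps
```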
As the results show, our model can be used for long-sequence recovery. The first two pictures in Figure 8 show that our model can handle city-scale moving patterns, which may be related to vehicle travel. The third shows movement between several fixed locations; despite some small offsets, our model still captures the moving patterns in the signaling data and generates a plausible long trajectory.

6. Discussion

Our GenTrajRec model achieves state-of-the-art performance and provides a novel way of utilizing mobile signaling data. It has been adopted by China Mobile Communications Corporation in their project "Key Technology Project for Spatiotemporal Big Data Products" to improve the positioning accuracy of their signaling data. We use a feature fusion part to combine the raw and normalized features and obtain better representations. Moreover, unlike most work that considers only the trajectory sequence and builds the graph from trajectory positions, we construct the graph using the time intervals between consecutive positions in the trajectories to handle uneven time intervals, which greatly improves the performance of the model. However, some problems remain. As shown in the third picture of Figure 8, our model can still capture the transition patterns of mobile phone users, but the predicted positions are not fixed. Possible reasons are as follows: (1) the cell towers' embeddings are area embeddings, which may affect the predicted position of the user; (2) human mobility is inherently uncertain, and any position within a certain area could be a reasonable prediction. Furthermore, our work could employ more sophisticated modules to model the global and local features of the base stations. All of these issues are directions for future work.

7. Conclusions

In this paper, we propose a novel framework, GenTrajRec, for recovering precise trajectories from raw signaling data. We introduce a trajectory transition graph module to capture deep transition patterns. The time encoder and position encoder are implemented to capture the spatial-temporal patterns. To make full use of these features, we develop a feature fusion module to aggregate spatial-temporal embeddings and transition graph embeddings. Finally, we adopt a transformer-based ST-layer to recover precise trajectories. Extensive experiments on two real-life datasets show the effectiveness of the proposed framework. In future work, we plan to extend our framework by incorporating more features, such as POI information and road network information.

Author Contributions

Methodology, H.H. and M.L.; writing—original draft preparation, H.H., H.X. and Z.X.; writing—review and editing, H.H., H.X. and Z.X.; supervision, Y.X. and T.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62394332) and the Fundamental Research Funds for the Central Universities (No. YWF-23-L-1203).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Geolife dataset is available on the Internet and can be found here: https://www.microsoft.com/en-us/download/details.aspx?id=52367 (accessed on 15 June 2024). The MSD data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yao, L.; Bao, J.; Ding, F.; Zhang, N.; Tong, E. Research on traffic flow forecast based on cellular signaling data. In Proceedings of the 2021 IEEE International Conference on Smart Internet of Things (SmartIoT), Jeju, Republic of Korea, 13–15 August 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 193–199. [Google Scholar]
  2. Wang, Z.; Hu, J.; Min, G.; Zhao, Z.; Chang, Z.; Wang, Z. Spatial-temporal cellular traffic prediction for 5G and beyond: A graph neural networks-based approach. IEEE Trans. Ind. Inform. 2022, 19, 5722–5731. [Google Scholar] [CrossRef]
  3. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative adversarial networks: An overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  4. Qi, H.; Shen, Y.; Yin, B. Intelligent trajectory inference through cellular signaling data. IEEE Trans. Cogn. Commun. Netw. 2019, 6, 586–596. [Google Scholar] [CrossRef]
  5. Bonnetain, L.; Furno, A.; El Faouzi, N.E.; Fiore, M.; Stanica, R.; Smoreda, Z.; Ziemlicki, C. TRANSIT: Fine-grained human mobility trajectory inference at scale with mobile network signaling data. Transp. Res. Part Emerg. Technol. 2021, 130, 103257. [Google Scholar] [CrossRef]
  6. Alahi, A.; Goel, K.; Ramanathan, V.; Robicquet, A.; Fei-Fei, L.; Savarese, S. Social lstm: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 961–971. [Google Scholar]
  7. Gupta, A.; Johnson, J.; Fei-Fei, L.; Savarese, S.; Alahi, A. Social gan: Socially acceptable trajectories with generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2255–2264. [Google Scholar]
  8. Zhang, G.; Yu, Z.; Jin, D.; Li, Y. Physics-infused machine learning for crowd simulation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 2439–2449. [Google Scholar]
  9. Zhao, H.; Gao, J.; Lan, T.; Sun, C.; Sapp, B.; Varadarajan, B.; Shen, Y.; Shen, Y.; Chai, Y.; Schmid, C.; et al. Tnt: Target-driven trajectory prediction. In Proceedings of the Conference on Robot Learning, PMLR, London, UK, 8–11 November 2021; pp. 895–904. [Google Scholar]
  10. Chai, Y.; Sapp, B.; Bansal, M.; Anguelov, D. Multipath: Multiple probabilistic anchor trajectory hypotheses for behavior prediction. arXiv 2019, arXiv:1910.05449. [Google Scholar]
  11. Welch, G.; Bishop, G. An Introduction to the Kalman Filter; University of North Carolina at Chapel Hill Department of Computer Science: Chapel Hill, NC, USA, 1995. [Google Scholar]
  12. Hammersley, J. Monte Carlo Methods; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  13. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  14. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef] [PubMed]
  15. Mo, X.; Huang, Z.; Xing, Y.; Lv, C. Multi-agent trajectory prediction with heterogeneous edge-enhanced graph attention network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 9554–9567. [Google Scholar] [CrossRef]
  16. Feng, J.; Li, Y.; Zhang, C.; Sun, F.; Meng, F.; Guo, A.; Jin, D. Deepmove: Predicting human mobility with attentional recurrent networks. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 1459–1468. [Google Scholar]
  17. Wang, J.; Wu, N.; Lu, X.; Zhao, W.X.; Feng, K. Deep trajectory recovery with fine-grained calibration using kalman filter. IEEE Trans. Knowl. Data Eng. 2019, 33, 921–934. [Google Scholar] [CrossRef]
  18. Xia, T.; Qi, Y.; Feng, J.; Xu, F.; Sun, F.; Guo, D.; Li, Y. Attnmove: History enhanced trajectory recovery via attentional network. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 4494–4502. [Google Scholar]
  19. Sun, H.; Yang, C.; Deng, L.; Zhou, F.; Huang, F.; Zheng, K. Periodicmove: Shift-aware human mobility recovery with graph neural network. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Virtual Event, 1–5 November 2021; pp. 1734–1743. [Google Scholar]
  20. Ren, H.; Ruan, S.; Li, Y.; Bao, J.; Meng, C.; Li, R.; Zheng, Y. Mtrajrec: Map-constrained trajectory recovery via seq2seq multi-task learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 1410–1419. [Google Scholar]
  21. Chen, Y.; Zhang, H.; Sun, W.; Zheng, B. Rntrajrec: Road network enhanced trajectory recovery with spatial-temporal transformer. In Proceedings of the 2023 IEEE 39th International Conference on Data Engineering (ICDE), Anaheim, CA, USA, 3–7 April 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 829–842. [Google Scholar]
  22. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
  23. Yao, D.; Hu, H.; Du, L.; Cong, G.; Han, S.; Bi, J. Trajgat: A graph-based long-term dependency modeling approach for trajectory similarity computation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 2275–2285. [Google Scholar]
  24. Liang, Y.; Ouyang, K.; Wang, Y.; Liu, X.; Chen, H.; Zhang, J.; Zheng, Y.; Zimmermann, R. TrajFormer: Efficient trajectory classification with transformers. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 1229–1237. [Google Scholar]
  25. Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Chang, X.; Zhang, C. Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; pp. 753–763. [Google Scholar]
  26. Xu, Y.; Han, L.; Zhu, T.; Sun, L.; Du, B.; Lv, W. Generic dynamic graph convolutional network for traffic flow forecasting. Inf. Fusion 2023, 100, 101946. [Google Scholar] [CrossRef]
  27. Ariyo, A.A.; Adewumi, A.O.; Ayo, C.K. Stock price prediction using the ARIMA model. In Proceedings of the 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation, Cambridge, UK, 26–28 March 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 106–112. [Google Scholar]
  28. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
  29. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
  30. Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2021, 37, 1748–1764. [Google Scholar] [CrossRef]
  31. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 11106–11115. [Google Scholar]
  32. Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv. Neural Inf. Process. Syst. 2021, 34, 22419–22430. [Google Scholar]
  33. Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 27268–27286. [Google Scholar]
  34. Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2023; Volume 37, pp. 11121–11128. [Google Scholar]
  35. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. Mlp-mixer: An all-mlp architecture for vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272. [Google Scholar]
  36. Chen, S.A.; Li, C.L.; Yoder, N.; Arik, S.O.; Pfister, T. Tsmixer: An all-mlp architecture for time series forecasting. arXiv 2023, arXiv:2303.06053. [Google Scholar]
  37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  38. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.c. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar]
  39. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864. [Google Scholar]
  40. Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323. [Google Scholar]
  41. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  42. Zheng, Y.; Xie, X.; Ma, W.Y. GeoLife: A collaborative social networking service among user, location and trajectory. IEEE Data Eng. Bull. 2010, 33, 32–39. [Google Scholar]
Figure 1. Signaling data. (a) shows that the base stations cannot represent the real position of the users. These base stations still have a distance from the real trajectory. (b) shows that there could be redundant segments in the trajectories.
Figure 2. An overview of the model.
Figure 3. The structure of the feature fusion part; this part can be used for any two related features. The right figure is the MLP structure we are using in this work.
Figure 4. Results for different distances between the matched cell towers and the real positions.
Figure 5. Detailed ablation study on MSD dataset.
Figure 6. The recovering cases for some of the moving results.
Figure 7. The recovering cases for some of the staying results.
Figure 8. Results for long sequence.
Table 1. Studies about time sequential modeling.
Studies for Time-Series Sequential Modeling | Publications
RNN-based methods | [29,37,38]
Transformer-based methods | [30,31,32,33]
Other methods | [34,35,36]
Table 2. Dataset introduction.
Dataset | Duration | City | User Number | Cell ID Number
MSD | 7 days | Beijing | 120000 | 361293
Geolife | 5 years | Mainly Beijing | 182 | 3781
Table 3. The result of GenTrajRec and the baselines.
Method | MSD RMSE | MSD MAE | MSD Rate150 | Geolife RMSE | Geolife MAE | Geolife Rate150
STD | 404.488 | 315.976 | 32.308 | 213.438 | 201.535 | 23.707
STD3 | 399.023 | 303.457 | 34.54 | 261.967 | 197.308 | 37.723
LR | 364.962 | 281.36 | 37.008 | 210.91 | 192.55 | 32.376
XGBoost | 339.354 | 209.193 | 50.105 | 157.056 | 137.918 | 58.678
MLP | 238.35 | 178.61 | 56.31 | 259.49 | 190.25 | 46.12
LSTM | 216.49 | 154.97 | 64.83 | 181.17 | 140.85 | 61.36
BiLSTM | 219.18 | 149.24 | 69.48 | 173.43 | 138.51 | 61.6
DeepMove | 192.35 | 135.42 | 72.49 | 171.98 | 135.71 | 63.5
Transformer | 160.87 | 102.89 | 83.33 | 112.01 | 83.32 | 86.69
TSMixer | 158.4 | 99.54 | 83.3 | 108.45 | 80.09 | 87.32
TrajFormer | 153.9 | 92.53 | 85.52 | 109.87 | 79.85 | 87.45
GenTrajRec | 144.58 | 82.3 | 87.85 | 103.45 | 77.12 | 88.36
Table 4. The result of the ablation.
Method | MSD RMSE | MSD MAE | MSD Rate150 | Geolife RMSE | Geolife MAE | Geolife Rate150
w/o feature fusion | 160.87 | 102.89 | 83.33 | 112.01 | 83.32 | 86.89
w/o Graph | 161.28 | 103.73 | 81.26 | 150.24 | 124.62 | 67.89
w/o both | 167.32 | 107.92 | 80.52 | 161.56 | 139.06 | 59.57
GenTrajRec | 144.58 | 82.3 | 87.85 | 103.45 | 77.12 | 88.36
Table 5. Results without graph embedding.
Method | MSD RMSE | MSD MAE | MSD Rate150 | Geolife RMSE | Geolife MAE | Geolife Rate150
MLP | 253.44 | 190.43 | 53.03 | 266.02 | 194.04 | 44.69
LSTM | 219.67 | 156.15 | 64.82 | 183.98 | 155.88 | 53.02
BiLSTM | 219.18 | 149.24 | 69.48 | 171.98 | 145.46 | 57.37
Transformer | 170.989 | 114.869 | 79.485 | 161.56 | 139.06 | 59.57
TSMixer | 204.63 | 145.48 | 66.54 | 155.62 | 136.07 | 60.34
GenTrajRec | 161.28 | 103.73 | 81.26 | 150.24 | 124.62 | 67.89
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Huang, H.; Xie, H.; Xu, Z.; Liu, M.; Xu, Y.; Zhu, T. GenTrajRec: A Graph-Enhanced Trajectory Recovery Model Based on Signaling Data. Appl. Sci. 2024, 14, 5934. https://doi.org/10.3390/app14135934
