Spatio-Temporal Unequal Interval Correlation-Aware Self-Attention Network for Next POI Recommendation

Li, Zheng; Huang, Xueyuan; Liu, Chun; Yang, Wei

doi:10.3390/ijgi11110543


by Zheng Li ^1,2,3, Xueyuan Huang ^1, Chun Liu ^1,2,3,* and Wei Yang ^1,2,3

^1 College of Computer and Information Engineering, Henan University, Kaifeng 475004, China

^2 Henan Engineering Laboratory of Spatial Information Processing, Henan University, Kaifeng 475004, China

^3 Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng 475004, China

^* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2022, 11(11), 543; https://doi.org/10.3390/ijgi11110543

Submission received: 30 August 2022 / Revised: 20 October 2022 / Accepted: 24 October 2022 / Published: 29 October 2022


Abstract

As a core task of location-based social networks (LBSNs), next point-of-interest (POI) recommendation predicts the POI a user is likely to visit next from the context information in users’ historical check-in trajectories. Spatio-temporal contextual information is known to play an important role in analyzing users’ check-in behaviors. Moreover, the information between POIs provides a non-trivial correlation for modeling users’ visiting preferences. However, the impact of such correlation information, and of the spatio-temporal unequal interval information between POIs, on users’ selection of the next POI is rarely considered. We therefore propose a spatio-temporal unequal interval correlation-aware self-attention network (STUIC-SAN) model for next POI recommendation. Specifically, we first use linear regression to obtain the spatio-temporal unequal interval correlation between any two POIs in users’ check-in sequences. We then design a spatio-temporal unequal interval correlation-aware self-attention mechanism that comprehensively captures users’ personalized spatio-temporal unequal interval correlation preferences by incorporating multiple factors: POI information, spatio-temporal unequal interval correlation information between POIs, and the absolute positional information of the corresponding POIs. On this basis, we perform next POI recommendation. Finally, we conduct a comprehensive performance evaluation on large-scale real-world datasets from two popular LBSNs, Foursquare and Gowalla. Experimental results on both datasets indicate that the proposed STUIC-SAN outperforms state-of-the-art next POI recommendation approaches on two commonly used evaluation metrics.

Keywords:

next POI recommendation; spatio-temporal unequal interval correlation; self-attention; linear regression

1. Introduction

With the popularity of mobile smart devices and the development of LBSNs such as Foursquare, Gowalla and Yelp, users can find preferred POIs through mobile devices and location-based services, publish their check-in information on social network platforms, and share their experience after visiting POIs. POI recommendation helps users quickly find POIs that interest them and improves the user experience; it also helps POI providers understand users’ preferences and improve service quality in a targeted manner. POI recommendation has therefore attracted wide attention from researchers. Because users’ visits to POIs are strongly time-dependent, the recommendation list changes with the corresponding check-in information. As an extension of general POI recommendation [1], next POI recommendation infers the POIs that users will visit at the next moment from their historical check-in trajectories, providing useful assistance to users and merchants. Next POI recommendation has therefore become a research hotspot in both academia and industry [2].

In next POI recommendation, most studies focus on exploiting the mobility sequence patterns hidden in users’ historical check-in trajectories. Early sequential POI recommendation methods were mainly based on Markov chains [3]. Experiments show that such methods perform well in sparse scenarios but fail to capture complex sequence features. Later, researchers tried to improve the accuracy of POI recommendation by using recurrent neural networks (RNNs) [4] with memory mechanisms. To alleviate the vanishing gradient problem of traditional RNNs, sequential POI recommendation methods based on long short-term memory (LSTM) [5] and the gated recurrent unit (GRU) [6] were proposed in succession, and these capture users’ long-term dependencies to some extent. In recent years, inspired by the Transformer model from machine translation, researchers have applied the self-attention mechanism to sequential POI recommendation [7]. Experiments show that these self-attention-based methods perform significantly better than those based on Markov chains and RNNs.

Among next POI recommendation methods, the factors considered most often are the temporal factor [8,9] and the spatial factor [10,11]. Accordingly, researchers have proposed various next POI recommendation methods based on the spatio-temporal context to improve recommendation performance. Although these methods have shown encouraging results, they still have two major limitations:

(1) The effect of spatio-temporal unequal interval information between any two POIs on users’ selection of the next POI is ignored. Most studies model users’ mobility sequence patterns according to the temporal order of check-ins in users’ historical check-in trajectories. They either assume an equal temporal/spatial (distance) interval between consecutive check-in activities [12], or consider the spatio-temporal information between POIs visited only in consecutive check-in activities [13] or only in non-consecutive check-in activities [14]; in each case they ignore the impact of the spatio-temporal unequal interval between POIs visited in any two check-in activities, whether consecutive or not. Studies have shown that different temporal intervals between POIs have different impacts on the selection of the next POI [15]. Moreover, different users have different maximum tolerances for temporal and spatial intervals between POIs [10]. In fact, even if users’ historical check-in trajectories are the same, the times at which they visit the corresponding POIs may differ [16]. The results observed from Figure 1 support that users have different preferences for spatio-temporal intervals when selecting next POIs, and these preferences affect which POI a user selects next.

(2) The spatio-temporal unequal interval correlation is not considered when recommending the next POI. Most POI recommendation methods based on the spatio-temporal context only analyze the spatial correlation between POIs visited in consecutive or non-consecutive check-in activities [17,18], and lack a correlation analysis of the impact of the temporal interval on the spatial interval between any two POIs in the user check-in sequence. Furthermore, as shown in Figure 1, User1 prefers to spend a longer time on visiting POIs with larger spatial intervals, while User2 does the opposite. We therefore consider that the greater the temporal interval between a user’s POI visits, the more likely the user is to visit POIs with larger spatial intervals, and vice versa.

To address the issues mentioned above, we propose a spatio-temporal unequal interval correlation-aware self-attention network (STUIC-SAN) model for next POI recommendation. The model integrates POI information, spatio-temporal unequal interval correlation information between POIs, and the absolute positional information of the corresponding POIs in users’ check-in sequences; learns users’ personalized spatio-temporal unequal interval correlation preference features; and then makes next POI recommendations. Extensive experiments on two public datasets, Foursquare and Gowalla, verify the effectiveness of STUIC-SAN. Our primary contributions are summarized as follows.

(1) We propose a novel approach for next POI recommendation, named STUIC-SAN, which significantly improves the performance of the recommendation.

(2) We mine the correlation between spatio-temporal unequal intervals in depth, and develop an embedding method that takes into account POI information, spatio-temporal unequal interval correlation information between POIs, and the absolute positional information of the corresponding POIs. These factors are treated as the relationship between any two POIs so as to comprehensively model users’ spatio-temporal unequal interval correlation preferences.

(3) We design a correlation-aware self-attention network model to learn users’ personalized spatio-temporal unequal interval correlation preference features, which can automatically measure the relevance of various inputs mentioned in contribution (2) to the model at each step and then adjust the attention weights for the inputs accordingly.

(4) To verify the effectiveness of the proposed model, we conduct extensive experiments based on two real-world datasets. Experimental results show that the proposed STUIC-SAN outperforms the state-of-the-art next POI recommendation approaches.

The remainder of this paper is organized as follows. Section 2 reviews the works related to next POI recommendation. Section 3 gives the preliminaries and dataset analysis. We introduce the details of the proposed STUIC-SAN model in Section 4. Next, we compare our proposed model with existing next POI recommendation models, and analyze the experimental results as well as the threats to validity in Section 5. Finally, we conclude this paper and outline the future work in Section 6.

2. Related Work

2.1. Sequential Next POI Recommendation

Most of the historical check-in data of users in LBSNs are presented in the form of sequences. Sequential modeling of users’ check-in information can therefore capture the sequence features of users’ check-in locations and improve the performance of next POI recommendation. The main task of next POI recommendation is to recommend the next POI based on users’ historical check-in activities. The methods for sequential next POI recommendation can be roughly divided into the following two categories.

(1) Next POI recommendation based on Markov chain. In the early stage, most methods used the first-order Markov chain [19,20] to recommend next POI, only considering users’ last check-in activity, and the acquisition of users’ visiting preferences was not comprehensive. Therefore, He et al. [21] adopted a high-order Markov chain to consider the recent check-in activities so as to obtain relatively complete users’ visiting preferences, and then make the next POI recommendation. However, the next-POI-recommendation methods mentioned above cannot obtain the long-term visiting preferences of users, and the recommendation performance is not ideal when the dataset is sparse.

(2) Next POI recommendation based on deep learning. RNNs are very effective for processing sequence data with rich contextual information, so most sequential next POI recommendation methods are based on the RNN and its variants. For example, Lu et al. [22] proposed a consecutive POI recommendation approach based on latent factors, which combines the sequential patterns of POIs and users’ visiting preferences for next POI recommendation. However, because of the vanishing gradient problem of traditional RNNs, LSTM, an extension of the RNN, was introduced to describe users’ long-term visiting preferences effectively. For instance, Cui et al. [13] used context to model non-consecutive POIs, obtained the time effect in the long-term module based on the LSTM model, and constructed four short-term sequences in the short-term module to obtain the influence of different factors so as to carry out next POI recommendation. Compared with the Markov chain model, the RNN and its variants effectively alleviate the problem of long-term dependence, but they depend entirely on serial computation and parallelize poorly.

Meanwhile, the Transformer supports parallel computation and captures global information, which effectively addresses the above problems. Researchers have therefore applied the self-attention mechanism to sequential recommendation and achieved good results [23,24]. For example, Liu et al. [6] proposed a category-aware GRU model based on the self-attention mechanism, which selectively uses self-attention to focus on the relevant historical check-in trajectory information and then conducts POI recommendation. Huang et al. [25] proposed a deep attention network model based on social awareness for next POI recommendation, which leveraged the self-attention mechanism to model sequential and social influences in a unified way. Lim et al. [26] proposed a novel explore–exploit model that leveraged random walks as a masked self-attention option to use spatial–temporal-preference graph structures and find new higher-order POI neighbors during exploration as candidate POIs. The above research demonstrates that using the Transformer’s self-attention mechanism for next POI recommendation is more effective than traditional RNN models.

However, most of the above methods only retain the sequential information between POIs, or directly set the temporal interval or spatial interval between POIs visited in consecutive or non-consecutive check-in activities to the same value, ignoring the effect of spatio-temporal unequal interval correlation between the corresponding POIs on users’ selection of next POI, and lacking the correlation analysis of the impact of the temporal unequal interval between any two POIs on the spatial interval. In our research, we adopt linear regression to analyze the correlation effect of the temporal unequal interval between any two POIs on the spatial interval, and learn the effect of spatio-temporal unequal interval correlation on the selection of next POI by the self-attention mechanism, consequently improving the performance of next POI recommendation.

2.2. Next POI Recommendation Based on Spatio-Temporal Context

In LBSNs, the spatio-temporal context in users’ check-in sequences can accurately describe users’ visiting preferences, and it has thus attracted extensive attention from researchers. For example, Ding et al. [27] used the temporal and spatial sequence characteristics of users’ check-ins to model users’ temporal feature preferences, divided users’ check-in timestamps into time periods, and simulated users’ preferences for POIs in a specific time period, but the model ignored the impact of users’ long-term preferences. Doan et al. [28] proposed a deep long-short-term RNN model with a memory attention mechanism, which captured sequence features and spatio-temporal features in its learned representation, and then modeled users’ preferences for the next POI. Zhao et al. [29] proposed spatio-temporal gated networks to model the personalized sequential patterns of users’ long-term and short-term preferences for next POI recommendation. Considering that not all of a user’s historical check-in activities have the same impact on the check-in behavior at the next moment, Huang et al. [30] proposed a spatio-temporal long- and short-term memory network to model the spatio-temporal context information, and combined it with an attention-based LSTM model to selectively utilize the spatio-temporal context so as to conduct next POI recommendation.

In addition, some studies have combined spatio-temporal context information with users’ social information for next POI recommendation. For example, Davtalab et al. [31] proposed a social spatio-temporal probability matrix factorization model, which used POI similarity and user similarity, integrated social influence factors, spatial influence factors and POI category influence factors in similarity modeling, and used similarity modeling to obtain users’ preferred POIs. Dai et al. [32] integrated social information on the basis of the spatio-temporal influence of users’ historical check-in sequences, and modeled users’ personalized sequential patterns through joint embedding and an LSTM-based spatio-temporal neural network, thereby recommending personalized POIs for users. Furthermore, researchers have also used the spatial and temporal context to alleviate the high sparsity of users’ check-in data. For example, Xi et al. [33] proposed a Bi-STDDP model, which integrates bidirectional spatio-temporal dependencies and users’ dynamic preferences to identify missing check-in locations that users have visited at a specific time. Ma et al. [34] utilized the LSTM neural network and kernel density estimation to model users’ visiting preferences, integrating spatio-temporal context information, location information and category factors, and recommended next POIs to users.

The methods mentioned above analyze user-related context information from different dimensions, improving the recommendation performance to some extent. However, they lack in-depth mining of the spatio-temporal information between any two POIs in users’ check-in sequences, and ignore the impact of spatio-temporal unequal interval information and the spatio-temporal unequal interval correlation between any two POIs on users’ selection of the next POI. In contrast, our work takes the spatio-temporal unequal interval correlation information as the relationship between any two POIs, and designs a spatio-temporal unequal interval correlation-aware self-attention network model. The model combines POI information, spatio-temporal unequal interval correlation information between POIs, and the absolute positional information of the corresponding POIs in users’ check-in sequences, selectively fuses the more relevant historical check-in activities, and models users’ personalized spatio-temporal unequal interval correlation preferences to enhance the performance of next POI recommendation.

3. Preliminaries and Data Analysis

In this section, we first provide a formulation of our next POI recommendation task and list key notations used in this paper in Table 1, and then analyze the properties of the datasets.

3.1. Preliminaries

In this section, we give some definitions of terms. Let $U = \{u_{1}, u_{2}, \dots, u_{|U|}\}$ denote the set of users and $L = \{l_{1}, l_{2}, \dots, l_{|L|}\}$ denote the set of POIs, where $|U|$ is the total number of users and $|L|$ is the total number of POIs that all users visited.

Definition 1. POI. In LBSNs, POIs represent specific spatial locations that contain geographical information in real life, such as a restaurant, a shop, a cafe, and so on.

Definition 2. Check-in activity. A check-in activity $r_{e}$ of user $u_{i}$ is represented as a triple $r_{e} = (l_{e}, t_{e}, g_{e})$, which means that user $u_{i}$ visited POI $l_{e}$ at time $t_{e}$, where $g_{e} = (lon_{e}, lat_{e})$ gives the longitude and latitude of $l_{e}$.

Definition 3. Historical check-in trajectory. The historical check-in trajectory of user $u_{i}$ is represented by $\mathrm{tra}(u_{i}) = \{r_{1}, r_{2}, \dots, r_{m}\}$, where $m$ is the number of check-in activities in $\mathrm{tra}(u_{i})$.

Definition 4. Check-in sequence. We convert the historical check-in trajectory $\mathrm{tra}(u_{i}) = \{r_{1}, r_{2}, \dots, r_{m}\}$ of user $u_{i}$ into a fixed-length check-in sequence $\mathrm{seq}(u_{i}) = \{r_{1}, r_{2}, \dots, r_{n}\}$, where $n$ is the maximum sequence length.

Definition 5. Temporal sequence and spatial sequence. From the user check-in sequence $\mathrm{seq}(u_{i}) = \{r_{1}, r_{2}, \dots, r_{n}\}$, the corresponding temporal sequence $t(u_{i}) = \{t_{1}, t_{2}, \dots, t_{n}\}$ and spatial sequence $g(u_{i}) = \{g_{1}, g_{2}, \dots, g_{n}\}$ are generated.

Definition 6. Spatio-temporal unequal interval correlation-aware next POI recommendation task. Given the user check-in sequence $\mathrm{seq}(u_{i}) = \{r_{1}, r_{2}, \dots, r_{n}\} \in \mathrm{tra}(u_{i})$, the goal of next POI recommendation is to predict the POIs that user $u_{i}$ is most likely to visit at a specific time.
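Definitions 2 to 5 can be illustrated with a small data-structure sketch. It is only one possible reading of the definitions: the names `CheckIn` and `to_sequences` are hypothetical, and keeping the most recent n check-ins is an assumption, since the definitions do not say how a trajectory is reduced to length n or how shorter trajectories are padded (padding is omitted here).

```python
from typing import List, NamedTuple, Tuple


class CheckIn(NamedTuple):
    """One check-in activity r_e = (l_e, t_e, g_e) from Definition 2."""
    poi: int                   # l_e, POI identifier
    time: float                # t_e, e.g. a Unix timestamp
    gps: Tuple[float, float]   # g_e = (lon_e, lat_e)


def to_sequences(trajectory: List[CheckIn], n: int):
    """Return (seq, t, g) for one user.

    Assumption: the fixed-length sequence keeps the n most recent
    check-ins in chronological order; no padding is applied.
    """
    seq = sorted(trajectory, key=lambda r: r.time)[-n:]
    t = [r.time for r in seq]   # temporal sequence t(u_i)
    g = [r.gps for r in seq]    # spatial sequence g(u_i)
    return seq, t, g


# Hypothetical trajectory with three check-ins, truncated to n = 2
tra = [
    CheckIn(poi=7, time=100.0, gps=(114.30, 34.80)),
    CheckIn(poi=3, time=50.0, gps=(114.35, 34.79)),
    CheckIn(poi=9, time=200.0, gps=(114.31, 34.82)),
]
seq, t, g = to_sequences(tra, n=2)
```

Here `t` and `g` are the inputs from which the temporal and spatial interval matrices of Section 4.1 are built.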

3.2. Preliminary Analysis

When modeling user’s visiting preference for spatio-temporal unequal interval correlation-aware POI recommendation, we consider two important factors: spatio-temporal unequal interval information and spatio-temporal unequal interval correlation between any two POIs. It is well recognized that the spatio-temporal interval between POIs is unequal, so here we focus on analyzing the correlation factor on two public LBSN datasets, i.e., Foursquare and Gowalla.

The results observed from Figure 2 and Figure 3 support that the correlation factor we considered has a certain impact on the users’ selection of the next POI. Note that here, we mainly illustrate the analysis between POIs in consecutive check-in activities; the analysis in non-consecutive check-in activities is similar, so we will not give further analysis due to space limitation.

Specifically, as shown in Figure 2, by analyzing users’ check-in data from Foursquare (Figure 2a) and Gowalla (Figure 2b), we find that users tend to visit POIs with relatively small spatial intervals (such as near the office or home) on weekdays because free time is limited, while they may prefer to visit POIs with relatively large spatial intervals on weekends because free time is abundant. Figure 2(1) shows the statistics on the spatial intervals between POIs during weekdays for the two datasets. Most users in the two datasets (61.95%, 57.61%) have a spatial interval of no more than 25 km between POIs. Among them, 32.71% and 36.98% of users visit POIs within 5 km, indicating that the spatial interval between the POIs users visit on weekdays is relatively small. Similarly, Figure 2(2) shows the statistics on the spatial interval between POIs during weekends. Most users in the two datasets (61.14%, 60.53%) have a spatial interval of more than 25 km between POIs. Among them, 43.22% and 56.38% of users visit POIs exceeding 25 km, indicating that users are more likely to visit POIs with relatively large spatial intervals on weekends. Furthermore, Figure 2(3) compares the spatial interval between POIs on weekdays with that on weekends: more than 59% of users in the two datasets visit POIs on weekends with a larger spatial interval than on weekdays. We therefore consider that the temporal interval between POIs has a certain impact on the spatial interval in users’ selection of the next POI.

We further analyze the spatio-temporal unequal interval correlation between POIs on Foursquare, measuring the correlation between the temporal interval and the spatial interval between POIs with the Pearson correlation coefficient. Figure 3(1)–(4) show the results for four users randomly selected from Foursquare. As expected, the temporal interval and spatial interval between POIs are approximately positively correlated: when the temporal interval between a user’s POI visits is small, the spatial interval between the POIs is also relatively small, and vice versa. That is, the spatial interval for users to select the next POI is limited by the temporal interval. Therefore, the spatio-temporal interval between POIs has a certain correlation, and such a correlation has a certain impact on the user’s selection of the next POI. These results demonstrate that the spatio-temporal interval correlation between POIs plays an important role in capturing the personalized spatio-temporal interval correlation preferences of users, which motivates us to use such a correlation for next POI recommendation.
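The per-user correlation check described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function name `pearson` and the sample intervals are hypothetical, and the intervals are assumed to be taken between consecutive check-ins of a single user.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(vx * vy)


# Hypothetical intervals between consecutive check-ins of one user
time_gaps_h = [1.0, 2.0, 4.0, 8.0]    # temporal intervals, hours
dist_gaps_km = [0.5, 1.5, 3.0, 9.0]   # spatial intervals, kilometres
r = pearson(time_gaps_h, dist_gaps_km)
```

A value of `r` close to 1 corresponds to the approximately positive correlation reported for the sampled Foursquare users.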

4. The Proposed Method

In this section, we describe the general architecture of STUIC-SAN (note that the temporal and spatial intervals in our model are unequal by default). As shown in Figure 4, the STUIC-SAN model consists of four modules: (1) the preference modeling layer, which constructs users’ spatio-temporal unequal interval matrices from the spatio-temporal interval correlation information between any two POIs in users’ check-in sequences so as to comprehensively model users’ personalized spatio-temporal interval correlation preferences; (2) the preference embedding layer, which learns dense representations of POI information, spatio-temporal interval correlation information between POIs, and the absolute positional information of the corresponding POIs in the user check-in sequence; (3) the Transformer model, which aggregates the relatively important POIs in the user check-in sequence and assigns a different weight to each POI to update the representations of the corresponding POIs; and (4) next POI recommendation, which calculates the preference score of the next POI by querying the updated POI representation of a specific user at time t. Candidate POIs are then sorted by their preference scores, and the top-k ranked POIs are recommended.

4.1. User Personalized Spatio-Temporal Unequal Interval Correlation Preference Modeling Layer

Accurately obtaining spatio-temporal unequal interval correlation information between POIs is critical for next POI recommendation. Considering the impact of spatio-temporal unequal interval correlation information on users’ check-in behaviors, we model the spatio-temporal unequal interval correlation information between any two POIs as a relationship between the corresponding POIs. On this basis, users’ personalized temporal unequal interval matrices are constructed based on the corresponding temporal sequences generated by users’ check-in sequences. Subsequently, users are classified by the comparison results of the average temporal interval between POIs from each user check-in sequence, and the average temporal interval between POIs from all users’ check-in sequences. Then, we use linear regression to obtain the spatio-temporal unequal interval correlation of different kinds of users, and calculate the maximum spatial interval of each user to construct the corresponding user personalized spatial unequal interval matrix. Next, we describe the process of obtaining users’ personalized spatio-temporal unequal interval matrices in detail.

4.1.1. Construction of Users Personalized Temporal Unequal Interval Matrices

This subsection constructs users’ personalized temporal unequal interval matrices from the temporal sequences generated from users’ check-in sequences. For each user check-in sequence, we adopt a method similar to that proposed in [15] and apply the same processing to the timestamp of each POI. That is, we divide the temporal interval between any two POIs by the minimum nonzero temporal interval so as to scale it down proportionally. In addition, because the temporal interval between two POIs may be very large, a clip operation is applied to all scaled temporal intervals to better model the personalized temporal unequal interval between POIs.

Specifically, we generate the corresponding temporal sequence $t(u_{i}) = \{t_{1}, t_{2}, \dots, t_{n}\}$ from the check-in sequence of user $u_{i}$, and then calculate the temporal interval between any two POIs, which is represented by $r_{ij}^{t_{u_{i}}} = \lvert t_{i} - t_{j} \rvert$, with $r_{ij}^{t_{u_{i}}} \in R_{u_{i}}^{t}$, where $R_{u_{i}}^{t}$ represents the set of temporal intervals in the check-in sequence of user $u_{i}$, and $r_{min}^{t_{u_{i}}} = \mathrm{Min}(R_{u_{i}}^{t})$ denotes the minimum temporal interval of user $u_{i}$. Then, each element in set $R_{u_{i}}^{t}$ is scaled down in equal proportion by Formula (1).

$$
r_{ij}^{t_{u_{i}}} = \left\lfloor \frac{\lvert t_{i} - t_{j} \rvert}{r_{min}^{t_{u_{i}}}} \right\rfloor \tag{1}
$$

Therefore, the personalized temporal unequal interval matrix $\Delta^{t_{u_{i}}} \in N^{n \times n}$ of user $u_{i}$ is expressed as

$$
\Delta^{t_{u_{i}}} = \begin{bmatrix} r_{11}^{t_{u_{i}}} & \dots & r_{1n}^{t_{u_{i}}} \\ \vdots & \ddots & \vdots \\ r_{n1}^{t_{u_{i}}} & \dots & r_{nn}^{t_{u_{i}}} \end{bmatrix} \tag{2}
$$

Note that the elements on the main diagonal in matrix $\Delta^{t_{u_{i}}}$ are all 0.

As mentioned above, the temporal interval between two POIs may be too large. We therefore set a maximum threshold $k^{t}$ for matrix $\Delta^{t_{u_{i}}}$ and adjust each element of the matrix to $r_{ij}^{t_{u_{i}}} = \mathrm{Min}(k^{t}, r_{ij}^{t_{u_{i}}})$. The matrix $\Delta^{t_{u_{i}}}$ is thus further expressed as $\Delta_{clipped}^{t_{u_{i}}} = \mathrm{clip}(\Delta^{t_{u_{i}}})$, where $\mathrm{clip}(\cdot)$ indicates that each element of the matrix is clipped at the maximum threshold $k^{t}$.
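The construction in this subsection (pairwise intervals, scaling by the minimum nonzero interval as in Formula (1), then clipping at $k^{t}$) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name `temporal_interval_matrix` is hypothetical, timestamps are assumed to be integers such as Unix seconds, and returning an all-zero matrix when every timestamp is identical is an assumption, since the text does not cover that case.

```python
def temporal_interval_matrix(timestamps, k_t):
    """Clipped personalized temporal interval matrix for one user.

    timestamps: integer check-in times t_1..t_n (e.g. Unix seconds)
    k_t: maximum threshold applied after scaling
    """
    n = len(timestamps)
    raw = [[abs(ti - tj) for tj in timestamps] for ti in timestamps]
    nonzero = [v for row in raw for v in row if v > 0]
    if not nonzero:
        # Assumption: all timestamps equal, so every interval is 0
        return [[0] * n for _ in range(n)]
    r_min = min(nonzero)  # minimum temporal interval, excluding 0
    # Formula (1): floor(|t_i - t_j| / r_min), then clip at k_t
    return [[min(k_t, v // r_min) for v in row] for row in raw]


# Hypothetical check-in times in seconds and a threshold of 100
delta_t = temporal_interval_matrix([0, 600, 1800, 90000], k_t=100)
```

In this example the smallest gap is 600 s, so the first row becomes `[0, 1, 3, 100]`: the 90000 s gap scales to 150 and is clipped to the threshold.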

4.1.2. Construction of Users Personalized Spatial Unequal Interval Matrices

This subsection obtains users’ personalized spatial unequal interval matrices according to the spatio-temporal interval correlation between any two POIs. Given the correlation between the temporal and spatial intervals of POIs shown in Figure 2 and Figure 3, we consider that the spatial interval between the POIs a user visits is affected by the corresponding temporal interval. We therefore classify users by comparing the average temporal interval between POIs in each user’s check-in sequence with the average temporal interval between POIs in all users’ check-in sequences, and use linear regression to obtain the spatio-temporal interval correlation of each kind of user. Then, we calculate the maximum spatial interval of each user as the maximum spatial span in that user’s personalized spatial unequal interval matrix. On this basis, users’ personalized spatial unequal interval matrices are constructed.

Specifically, we obtain the corresponding spatial sequence $g(u_{i}) = \{g_{1}, g_{2}, \dots, g_{n}\}$ generated from the check-in sequence of user $u_{i}$, and then calculate the spatial interval between any two POIs, which is represented by $r_{ij}^{s_{u_{i}}} = \mathrm{Haversine}(GPS_{i}, GPS_{j})$, with $r_{ij}^{s_{u_{i}}} \in R_{u_{i}}^{s}$, where $R_{u_{i}}^{s}$ denotes the set of spatial intervals in the check-in sequence of user $u_{i}$. Therefore, the personalized spatial unequal interval matrix $\Delta^{s_{u_{i}}} \in N^{n \times n}$ of user $u_{i}$ is expressed as

$$
\Delta^{s_{u_{i}}} = \begin{bmatrix} r_{11}^{s_{u_{i}}} & \dots & r_{1n}^{s_{u_{i}}} \\ \vdots & \ddots & \vdots \\ r_{n1}^{s_{u_{i}}} & \dots & r_{nn}^{s_{u_{i}}} \end{bmatrix} \tag{3}
$$
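The pairwise spatial intervals of Formula (3) can be sketched with the standard haversine great-circle distance. The sketch makes several assumptions that the text does not state: distances are in kilometres, the Earth radius is taken as 6371 km, coordinates are (lon, lat) pairs in degrees as in Definition 2, and values are left as floats because no rounding rule is given here. The names `haversine_km` and `spatial_interval_matrix` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; not given in the text


def haversine_km(g_a, g_b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon_a, lat_a = g_a
    lon_b, lat_b = g_b
    phi_a = math.radians(lat_a)
    phi_b = math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lam = math.radians(lon_b - lon_a)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lam / 2) ** 2)
    # min() guards against rounding pushing h slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def spatial_interval_matrix(coords):
    """Pairwise spatial intervals r_ij^s for one user's spatial sequence."""
    return [[haversine_km(a, b) for b in coords] for a in coords]


# Hypothetical spatial sequence of three POIs, as (lon, lat)
delta_s = spatial_interval_matrix([(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)])
```

As with the temporal matrix, the main diagonal is 0 and the matrix is symmetric.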

Next, we classify users according to the average temporal interval between POIs from each user check-in sequence and the average temporal interval between POIs from all users’ check-in sequences. Among them, the average temporal interval of each user is denoted by

r_{m e a n}^{t_{u_{i}}} = M e a n (R_{u_{i}}^{t})

, and the average temporal interval of all users is represented by

r_{m e a n}^{t} = M e a n (\sum_{i = 1}^{U} R_{u_{i}}^{t})

. Then, we compare the average temporal interval of user

u_{i}

with the average temporal interval of all users to classify users. Furthermore, we count the number of users when

r_{m e a n}^{t_{u_{i}}} > r_{m e a n}^{t}

,

r_{m e a n}^{t_{u_{i}}} = r_{m e a n}^{t}

,

r_{m e a n}^{t_{u_{i}}} < r_{m e a n}^{t}

, respectively, and find that there are almost no users with

r_{m e a n}^{t_{u_{i}}} = r_{m e a n}^{t}

from two datasets. Therefore, according to the corresponding comparison results, users are divided into two categories, as shown in Formula (4).

\begin{cases} R_{u_{i}}^{t} \in S_{s m a l l}^{t}, \; R_{u_{i}}^{s} \in S_{s m a l l}^{s}, & r_{m e a n}^{t_{u_{i}}} < r_{m e a n}^{t} \\ R_{u_{i}}^{t} \in S_{l a r g e}^{t}, \; R_{u_{i}}^{s} \in S_{l a r g e}^{s}, & r_{m e a n}^{t_{u_{i}}} > r_{m e a n}^{t} \end{cases}

(4)

where

S_{s m a l l}^{t}

and

S_{s m a l l}^{s}

represent the sets of the temporal interval and spatial interval of users with

r_{m e a n}^{t_{u_{i}}} < r_{m e a n}^{t}

, respectively. Correspondingly,

S_{l a r g e}^{t}

and

S_{l a r g e}^{s}

denote the sets of the temporal interval and spatial interval of users with

r_{m e a n}^{t_{u_{i}}} > r_{m e a n}^{t}

, respectively.
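
The classification in Formula (4) can be sketched as below. The function name and the dictionary layout are hypothetical, the global average is taken here over the pooled intervals of all users (one reading of the definition above), and skipping users whose average equals the global average is an assumption motivated by the observation that such users almost never occur.

```python
from statistics import mean


def split_users_by_mean_interval(temporal_intervals, spatial_intervals):
    """Both arguments map a user id to that user's list of intervals."""
    # Global average over the pooled temporal intervals of all users (assumed reading)
    global_mean = mean(v for values in temporal_intervals.values() for v in values)
    # Each group holds (temporal interval set, spatial interval set)
    groups = {"small": ([], []), "large": ([], [])}
    for user, t_values in temporal_intervals.items():
        user_mean = mean(t_values)
        if user_mean == global_mean:
            continue  # assumption: the (rare) equal case is left unclassified
        key = "small" if user_mean < global_mean else "large"
        groups[key][0].extend(t_values)
        groups[key][1].extend(spatial_intervals[user])
    return global_mean, groups
```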

Based on the above classification of users, we use the linear regression method [35] to obtain the spatio-temporal interval correlation between any two POIs from the corresponding check-in sequences of two kinds of users, and adopt Formula (5) to optimize the core objective.

(w^{*}, b^{*}) = \arg\min_{(w, b)} \sum_{i = 1}^{j} {(w S_{i}^{t} + b - S_{i}^{s})}^{2}

(5)

where j represents the number of elements in set

S^{t}

or

S^{s}

.

S_{i}^{t} \in S^{t}

represents an element in the temporal interval set of a category of users. Similarly,

S_{i}^{s} \in S^{s}

denotes an element in the spatial interval set of a category of users. w and b represent the slope and intercept of the linear regression equation, respectively. Minimizing the squared error in Formula (5) yields the closed-form solution shown in Formulas (6) and (7).

w = \frac{\sum_{i = 1}^{j} S_{i}^{s} (S_{i}^{t} - \bar{S^{t}})}{\sum_{i = 1}^{j} {(S_{i}^{t})}^{2} - \frac{1}{j} {(\sum_{i = 1}^{j} S_{i}^{t})}^{2}}

(6)

b = \frac{1}{j} \sum_{i = 1}^{j} (S_{i}^{s} - w S_{i}^{t})

(7)

where

\bar{S^{t}}

denotes the mean of the temporal intervals in set

S^{t}

.
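
The closed-form least-squares solution for the slope and intercept can be sketched as follows. The function name is hypothetical, and the denominator is written in the standard form (sum of squared temporal intervals minus the squared sum of temporal intervals divided by j).

```python
def fit_line(t_values, s_values):
    """Least-squares slope w and intercept b for s ~ w * t + b."""
    j = len(t_values)
    t_mean = sum(t_values) / j
    numerator = sum(s * (t - t_mean) for t, s in zip(t_values, s_values))
    denominator = sum(t * t for t in t_values) - (sum(t_values) ** 2) / j
    w = numerator / denominator  # undefined if all temporal intervals are equal
    b = sum(s - w * t for t, s in zip(t_values, s_values)) / j
    return w, b
```

For example, the points (0, 1), (1, 3), (2, 5), (3, 7) lie exactly on s = 2t + 1, so the fit recovers w = 2 and b = 1.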

For each category of users, we obtain the corresponding values of w and b, denoted by

w_{s m a l l}

,

b_{s m a l l}

,

w_{l a r g e}

,

b_{l a r g e}

, respectively. Considering the differences in users’ temporal interval preferences for visiting POIs, we further calculate the maximum spatial interval of each user in a more fine-grained manner from the maximum temporal interval of that user, as shown in Formula (8).

\{\begin{matrix} R e g (k_{s m a l l}^{s}) = w_{s m a l l} r_{m a x}^{t_{u_{i}}} + b_{s m a l l}, r_{m e a n}^{t_{u_{i}}} < r_{m e a n}^{t} \\ R e g (k_{l a r g e}^{s}) = w_{l a r g e} r_{m a x}^{t_{u_{i}}} + b_{l a r g e}, r_{m e a n}^{t_{u_{i}}} > r_{m e a n}^{t} \end{matrix}

(8)

where

R e g (\cdot)

indicates the linear regression operation,

k_{s m a l l}^{s}

represents the maximum spatial interval of matrix

Δ^{s_{u_{i}}}

of each user with

r_{m e a n}^{t_{u_{i}}} < r_{m e a n}^{t}

;

k_{l a r g e}^{s}

represents the maximum spatial interval of matrix

Δ^{s_{u_{i}}}

of each user with

r_{m e a n}^{t_{u_{i}}} > r_{m e a n}^{t}

; and

r_{m a x}^{t_{u_{i}}} = M a x (R_{u_{i}}^{t})

represents the maximum temporal interval between any two POIs from the check-in sequence of user

u_{i}

. Subsequently, each element in matrix

Δ^{s_{u_{i}}}

is denoted as

r_{i j}^{s_{u_{i}}} = M i n (k^{s}, r_{i j}^{s_{u_{i}}})

. Therefore, the personalized spatial unequal interval matrix of user

u_{i}

is further represented as

Δ_{c l i p p e d}^{s_{u_{i}}} = c l i p (Δ^{s_{u_{i}}})

.
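
A minimal sketch of Formula (8) and the subsequent clip operation is given below. The function names are hypothetical, and truncating the regressed maximum to an integer is an assumption made so that the clipped values can later serve as embedding indices.

```python
def max_spatial_interval(w, b, r_max_t):
    """Regressed maximum spatial interval k^s from a user's maximum temporal interval."""
    return int(w * r_max_t + b)  # int() truncation is an assumption


def clip_matrix(matrix, k_s):
    """Replace every entry r_ij by min(k^s, r_ij)."""
    return [[min(k_s, value) for value in row] for row in matrix]
```

Here `w` and `b` would be the pair fitted for the category (small or large average temporal interval) that the user belongs to.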

4.2. Embedding Layer Fusing Spatio-Temporal Unequal Interval Correlation Preference

The embedding layer is used to encode the POIs information, spatio-temporal interval correlation information between POIs, and absolute positional information of corresponding POIs in each user check-in sequence as latent representations. Firstly, we create an embedding matrix

M^{L} \in R^{|L| \times d}

for POIs, where

|L|

represents the number of POIs and d represents the latent dimension. Then, for the historical check-in trajectory of user

u_{i}

, we use a constant zero vector as the embedding for padding items, and truncate the user check-in trajectory to its first n check-in activities or pad it to length n. For the POIs in these n check-in activities, the embedding look-up operation retrieves the corresponding n POI embeddings and stacks them together to generate an embedding matrix

E^{L} \in R^{n \times d}

as shown in Formula (9).

E^{L} = [\begin{matrix} I_{1} \\ I_{2} \\ ⋮ \\ I_{n} \end{matrix}]

(9)

where

I_{i} \in R^{d}

denotes the embedded representation of the POI visited in the i-th check-in activity from the user check-in sequence.
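
The look-up in Formula (9) can be illustrated with a small sketch. Reserving id 0 for padding, padding on the left, and the random initialization are assumptions; a trained model would learn the table entries.

```python
import random

PAD_ID = 0  # assumed id reserved for padding items


def make_poi_table(num_pois, d, seed=0):
    """Embedding table with one row per POI id plus a zero row for padding."""
    rng = random.Random(seed)
    table = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(num_pois + 1)]
    table[PAD_ID] = [0.0] * d  # constant zero vector for padding
    return table


def fix_length(poi_ids, n):
    """Keep the first n check-ins, or left-pad shorter sequences to length n."""
    kept = poi_ids[:n]
    return [PAD_ID] * (n - len(kept)) + kept


def lookup(table, poi_ids):
    """Stack the embeddings of the given POI ids into an n x d matrix."""
    return [table[i] for i in poi_ids]
```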

Since the self-attention mechanism cannot directly obtain the positions of POIs in the user check-in sequence, we use two different learnable positional embedding matrices

M_{K}^{P} \in R^{n \times d}

and

M_{V}^{P} \in R^{n \times d}

, which represent the keys and values in the self-attention mechanism, respectively. This method is more suitable for the self-attention mechanism without requiring additional linear transformations [7]. After the retrieval operation, we obtain the absolute positional embedding matrices

E_{K}^{P} \in R^{n \times d}

,

E_{V}^{P} \in R^{n \times d}

of the user check-in sequence, as shown in Formula (10).

\begin{matrix} E_{K}^{P} = [\begin{matrix} p_{1}^{K} \\ p_{2}^{K} \\ ⋮ \\ p_{n}^{K} \end{matrix}] & E_{V}^{P} = [\begin{matrix} p_{1}^{V} \\ p_{2}^{V} \\ ⋮ \\ p_{n}^{V} \end{matrix}] \end{matrix}

(10)

Similar to the absolute positional embedding, we perform the same operations for temporal interval embedding and spatial interval embedding. Specifically, we use word embedding technology to encode temporal intervals and create two temporal interval embedding matrices

M_{K}^{T} \in R^{k^{t} \times d}

,

M_{V}^{T} \in R^{k^{t} \times d}

. Similarly, we obtain the spatial interval embedding matrices

M_{K}^{S} \in R^{k^{s} \times d}

,

M_{V}^{S} \in R^{k^{s} \times d}

. Then, after retrieving the clipped temporal interval matrix

Δ_{c l i p p e d}^{t_{u_{i}}}

and spatial interval matrix

Δ_{c l i p p e d}^{s_{u_{i}}}

, we obtain the corresponding temporal interval embedding matrices

E_{K}^{T} \in R^{n \times n \times d}

,

E_{V}^{T} \in R^{n \times n \times d}

as well as the spatial interval embedding matrices

E_{K}^{S} \in R^{n \times n \times d}

,

E_{V}^{S} \in R^{n \times n \times d}

of the user check-in sequence, as shown in Formulas (11) and (12).

E_{K}^{T} = [\begin{matrix} r_{11}^{t_{K}} & \dots & r_{1 n}^{t_{K}} \\ ⋮ & ⋱ & ⋮ \\ r_{n 1}^{t_{K}} & \dots & r_{n n}^{t_{K}} \end{matrix}] E_{V}^{T} = [\begin{matrix} r_{11}^{t_{V}} & \dots & r_{1 n}^{t_{V}} \\ ⋮ & ⋱ & ⋮ \\ r_{n 1}^{t_{V}} & \dots & r_{n n}^{t_{V}} \end{matrix}]

(11)

E_{K}^{S} = [\begin{matrix} r_{11}^{s_{K}} & \dots & r_{1 n}^{s_{K}} \\ ⋮ & ⋱ & ⋮ \\ r_{n 1}^{s_{K}} & \dots & r_{n n}^{s_{K}} \end{matrix}] E_{V}^{S} = [\begin{matrix} r_{11}^{s_{V}} & \dots & r_{1 n}^{s_{V}} \\ ⋮ & ⋱ & ⋮ \\ r_{n 1}^{s_{V}} & \dots & r_{n n}^{s_{V}} \end{matrix}]

(12)
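
The retrieval behind Formulas (11) and (12) maps each clipped interval to a d-dimensional vector, giving an n x n x d tensor. The sketch below is illustrative only: the function names are hypothetical, and allocating k + 1 rows so that the clipped value k itself is a valid index is an assumption.

```python
import random


def make_interval_table(k, d, seed=0):
    """One d-dimensional vector per clipped interval value 0..k (k + 1 rows, assumed)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.01) for _ in range(d)] for _ in range(k + 1)]


def embed_intervals(table, clipped_matrix):
    """Turn an n x n matrix of clipped intervals into an n x n x d tensor."""
    return [[table[value] for value in row] for row in clipped_matrix]
```

Separate tables would be created for keys and values, and for temporal and spatial intervals, matching the four matrices defined above.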

4.3. Spatio-Temporal Unequal Interval Correlation-Aware Transformer Model

In this section, we elaborate on the spatio-temporal unequal interval correlation-aware Transformer model. It takes the POIs information, the spatio-temporal interval correlation information between POIs, and the absolute positional information of corresponding POIs in each user check-in sequence as the correlation feature between any two POIs. The features of POIs in each user check-in sequence are fused and updated by the spatio-temporal unequal interval correlation-aware self-attention network, followed by a point-wise feed-forward network whose fully connected layers improve the generalization ability of the model.

4.3.1. Spatio-Temporal Unequal Interval Correlation-Aware Self-Attention Network

Inspired by the self-attention mechanism, we propose an extended model of the self-attention mechanism. The model considers the spatio-temporal interval between any two POIs as the relationship between the corresponding POIs. The output of each step of the model can aggregate POIs related to the current step, and adaptively give different weights to each POI to update the representation of each POI.

Specifically, given the POIs embedding matrix

E^{L} = (I_{1}, I_{2}, \dots, I_{n})

, where

I_{i} \in R^{d}

, after the self-attention network, it outputs a new sequence

NS = (n s_{1}, n s_{2}, \dots, n s_{n})

, where

n s_{i} \in R^{d}

, to ensure that each element in the new sequence not only contains its own information, but also takes into account the impact of all other POIs from the user check-in sequence on the current step. The i-th item

n s_{i}

of the output sequence is computed as a weighted sum of the linearly transformed POIs embedding, temporal interval embedding, and spatial interval embedding between POIs, as well as the absolute positional embedding of corresponding POIs in the user check-in sequence as shown in Formula (13).

n s_{i} = \sum_{j = 1}^{n} a_{i j} (I_{j} W^{V} + r_{i j}^{t_{V}} + r_{i j}^{s_{V}} + p_{j}^{V})

(13)

where n represents the maximum sequence length inputted,

I_{j}

represents the embedding of

POI_{j}

,

W^{V}

represents the projection matrix of the corresponding values in POIs embedding matrix.

r_{i j}^{t_{V}}

and

r_{i j}^{s_{V}}

represent the temporal interval embedding and spatial interval embedding between POIs, respectively, and

p_{j}^{V}

represents the absolute positional embedding of corresponding POIs in the user check-in sequence.

a_{i j}

represents the weight coefficient, which is calculated by the soft-max function shown in Formula (14).

a_{i j} = \frac{\exp (e_{i j})}{\sum_{r = 1}^{n} \exp (e_{i r})}

(14)

where

e_{i j}

represents the attention score, which is calculated by comprehensively considering the POIs information, spatio-temporal interval information between POIs, and absolute positional information of corresponding POIs in the user check-in sequence in Formula (15).

e_{i j} = \frac{I_{i} W^{Q} {(I_{j} W^{K} + r_{i j}^{t_{K}} + r_{i j}^{s_{K}} + p_{j}^{K})}^{T}}{\sqrt{d}}

(15)

where

W^{Q} \in R^{d \times d}

and

W^{K} \in R^{d \times d}

represent projection matrices of the corresponding queries and keys in the POIs embedding matrices of the user check-in sequence, respectively. The scale factor

\frac{1}{\sqrt{d}}

prevents the inner product from becoming too large, which may cause vanishing gradients after the soft-max function.
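
Formulas (13) to (15) can be sketched with plain Python lists as follows. This is a single-head illustration under stated assumptions, not the authors' TensorFlow code: the helper names are hypothetical, and no causality or padding mask is applied because none is described in this subsection.

```python
import math


def vec_mat(v, m):
    """Row vector (length d) times a d x d matrix."""
    return [sum(v[k] * m[k][c] for k in range(len(v))) for c in range(len(m[0]))]


def add(*vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interval_aware_attention(E, WQ, WK, WV, TK, TV, SK, SV, PK, PV):
    """E: n x d POI embeddings; TK/TV/SK/SV: n x n x d interval embeddings;
    PK/PV: n x d positional embeddings; WQ/WK/WV: d x d projections."""
    n, d = len(E), len(E[0])
    output = []
    for i in range(n):
        query = vec_mat(E[i], WQ)
        keys = [add(vec_mat(E[j], WK), TK[i][j], SK[i][j], PK[j]) for j in range(n)]
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]  # Formula (15)
        weights = softmax(scores)  # Formula (14)
        values = [add(vec_mat(E[j], WV), TV[i][j], SV[i][j], PV[j]) for j in range(n)]
        output.append([sum(weights[j] * values[j][c] for j in range(n)) for c in range(d)])  # Formula (13)
    return output
```

Because the interval embeddings enter both the keys and the values, two POIs with identical embeddings can still receive different weights when their temporal or spatial intervals to the current step differ.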

4.3.2. Point-Wise Feed-Forward Network

As described in Section 4.3.1, the spatio-temporal unequal interval correlation-aware self-attention network uses a linear combination-based method to fuse the POIs information, temporal interval information and spatial interval information between POIs, as well as the absolute positional information of corresponding POIs in the user check-in sequence. Inspired by the idea in [14], we apply two linear transformations with ReLU as the activation function after each spatio-temporal unequal interval correlation-aware self-attention network, consequently making our model nonlinear.

FFN (n s_{i}) = M a x (0, n s_{i} W_{1} + b_{1}) W_{2} + b_{2}

(16)

where

W_{1}, W_{2} \in R^{d \times d}

are weight matrices, and

b_{1}, b_{2} \in R^{d}

are bias terms.

After stacking the self-attention networks and feed-forward networks, problems, such as model overfitting, vanishing gradients, and excessive training time, may occur. Therefore, inspired by reference [36], we adopt layer normalization, dropout regularization and residual connections to solve these problems as shown in Formula (17).

N S_{i} = n s_{i} + D p (FFN (L N (n s_{i})))

(17)

where

L N (n s_{i})

is calculated by Formula (18).

L N (n s_{i}) = α ⊙ \frac{n s_{i} - μ}{\sqrt{σ^{2} + ε}} + β

(18)

where ⊙ represents the element-wise product.

α

and

β

denote the scale factor and the bias term, respectively.

μ

and

σ^{2}

represent the mean and variance of

n s_{i}

, respectively, while

ε

prevents invalid calculations when the variance is 0.
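
Formulas (16) to (18) can be combined into one small sketch. The function names are hypothetical, the value of ε is an assumption, and dropout is treated as the identity, which corresponds to inference time.

```python
import math


def layer_norm(x, alpha, beta, eps=1e-8):
    """Formula (18): normalize x by its mean and variance, then scale and shift."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [a * (v - mu) / math.sqrt(var + eps) + b for v, a, b in zip(x, alpha, beta)]


def ffn(x, W1, b1, W2, b2):
    """Formula (16): two linear maps with a ReLU in between."""
    hidden = [max(0.0, sum(x[k] * W1[k][c] for k in range(len(x))) + b1[c])
              for c in range(len(b1))]
    return [sum(hidden[k] * W2[k][c] for k in range(len(hidden))) + b2[c]
            for c in range(len(b2))]


def block_output(x, alpha, beta, W1, b1, W2, b2):
    """Formula (17) with dropout omitted (identity at inference time)."""
    transformed = ffn(layer_norm(x, alpha, beta), W1, b1, W2, b2)
    return [xi + ti for xi, ti in zip(x, transformed)]
```

The residual term keeps the original representation available to later layers even when the feed-forward output is small.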

4.4. Next POI Recommendation

This section computes preference scores for candidate next POIs from the user preference representations produced by the spatio-temporal unequal interval correlation-aware Transformer model. After stacking N self-attention blocks, we obtain the combined representation of POIs information, spatio-temporal interval information between POIs, and absolute positional information of corresponding POIs in each user check-in sequence. To recommend the next POI to a user, we use Formula (19) to calculate the user preference score for POI

l_{i}

, sort the candidate POIs according to the corresponding preference scores, then recommend a list of POIs with higher preference scores to the user.

s c o_{l_{i}, t} = N S_{t} M_{l_{i}}^{L}

(19)

where

N S_{t}

represents the combined representation, given the first t check-in activities in the user check-in sequence, of the embeddings of the visited POIs, the spatio-temporal interval embeddings between these POIs and the POI visited at step t+1, and the absolute positional embeddings of the corresponding check-in sequence.

M_{l_{i}}^{L}

is the embedding of POI

l_{i}

.
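
The scoring and ranking step of Formula (19) amounts to an inner product followed by a sort, as in the sketch below. The function name and the `exclude` argument (for example, to skip a padding id) are hypothetical.

```python
def recommend_top_k(state, poi_table, k, exclude=()):
    """Score every POI by the inner product with the sequence state and return the top k ids."""
    scores = {
        poi: sum(a * b for a, b in zip(state, embedding))
        for poi, embedding in enumerate(poi_table)
        if poi not in exclude
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```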

4.5. Model Optimization

This section describes how the proposed model is optimized. According to the user historical check-in trajectory

t r a (u_{i}) = \{r_{1}, r_{2}, \dots, r_{m}\}

, we generate a fixed-length check-in sequence

s e q (u_{i}) = \{r_{1}, r_{2}, \dots, r_{n}\}

, and further generate the corresponding temporal sequence

t (u_{i}) = \{t_{1}, t_{2}, \dots, t_{n}\}

as well as the spatial sequence

g (u_{i}) = \{g_{1}, g_{2}, \dots, g_{n}\}

, and define

l = \{l_{1}, l_{2}, \dots, l_{k}\}

as the expected output of the model. Since the interaction information between users and POIs is implicit data, we cannot directly optimize the preference scores of candidate POIs. Moreover, the output of our model is a list of ranked POIs. Therefore, we adopt a negative sampling method to optimize the ranking of candidate POIs. Specifically, for each expected positive output

l_{i}

, a negative sample

l_{i}^{'} \notin s e q (u_{i})

is randomly selected and used to generate a pairwise training sample

T S = \{(s e q (u_{i}), t (u_{i}), g (u_{i}), l_{i}, l_{i}^{'})\}

. We map the output scores of the model into the range (0, 1) with the sigmoid function σ, and use binary cross-entropy as the loss function in Formula (20).

- \sum_{s e q (u_{i})} \sum_{t \in [1, \dots, n]} [\log (σ (s c o_{l_{i}, t})) + \log (1 - σ (s c o_{l_{i}^{'}, t}))] + λ {∥Θ∥}_{F}^{2}

(20)

where

Θ = \{M^{L}, M_{K}^{P}, M_{V}^{P}, M_{K}^{T}, M_{V}^{T}, M_{K}^{S}, M_{V}^{S}\}

is the set of embedding matrices,

{∥\cdot∥}_{F}

denotes the Frobenius norm, and

λ

is the regularization parameter.
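
The objective in Formula (20) can be sketched as follows, assuming that σ denotes the logistic sigmoid. The function names are hypothetical, and the identity log(1 - σ(x)) = log σ(-x) is used for numerical stability.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bce_loss(pos_scores, neg_scores, params=(), lam=0.0):
    """Binary cross-entropy over (positive, negative) score pairs plus an L2 penalty.

    params is an iterable of embedding matrices (lists of rows) standing in for Θ.
    """
    data_term = -sum(log_sigmoid(p) + log_sigmoid(-n) for p, n in zip(pos_scores, neg_scores))
    penalty = lam * sum(v * v for matrix in params for row in matrix for v in row)
    return data_term + penalty
```

The loss decreases as positive POIs receive higher scores and sampled negatives receive lower ones.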

Then we use the Adam optimizer [37] to optimize our model. Since each training sample

T S = \{(s e q (u_{i}), t (u_{i}), g (u_{i}), l_{i}, l_{i}^{'})\}

can be constructed independently, we train with mini-batches to improve the training efficiency.

5. Experiments

In this section, we conduct experiments to evaluate the effectiveness of the proposed model STUIC-SAN on two real-world datasets by attempting to answer the following four research questions.

RQ1. Does our approach outperform existing methods in the next-POI-recommendation task?

RQ2. Do personalized spatio-temporal interval information and spatio-temporal interval correlation affect the performance of the model recommendation?

RQ3. How do the parameters of the model, such as the latent dimension, the maximum sequence length, the maximum temporal interval and the maximum spatial interval, affect the recommendation performance?

RQ4. What is the impact of different spatio-temporal interval correlation processing on recommendation performance?

5.1. Experimental Setup

5.1.1. Data Collection and Preprocessing

We evaluated the proposed model on two publicly available LBSN datasets [38], Foursquare and Gowalla, with densities of 0.13% and 0.22%, respectively. The Foursquare dataset contains users’ check-in data from April 2012 to September 2013, while the Gowalla dataset contains users’ check-in data from February 2009 to October 2010. Each check-in activity in both datasets is a five-tuple consisting of the user ID, POI ID, POI latitude, POI longitude, and the corresponding visiting timestamp.

For the two datasets, we first remove inactive users who have checked in at fewer than 5 POIs, and remove cold-start POIs visited by fewer than 5 users, as such records are too sparse to be meaningful. We further summarize the statistics of the preprocessed datasets in Table 2. Next, we sort the check-in activities of each user by their visiting timestamps. So that each user’s sorted sequence starts from timestamp 0, we subtract the user’s smallest timestamp from the timestamp of every check-in activity. The n check-in activities in each user check-in sequence are divided into three parts, namely, the training set, validation set and testing set. The training set contains

n - 3

samples: the first

n^{'} \in [1, n - 3]

check-in activities form the input sequence and the next visited POI, i.e., the one at position

n^{'} + 1 \in [2, n - 2]

, is the label; the validation set uses the first

n - 2

check-in activities as the input sequence and the (

n - 1

)-th visited POI as the label; the testing set uses the first

n - 1

check-in activities as the input sequence and the n-th visited POI as the label. This split respects causality: no future data are used when predicting a check-in.
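
The split described above can be sketched for a single user as follows. The function name is hypothetical, and the sequence is assumed to be already sorted by timestamp and to contain at least four check-ins.

```python
def split_user_sequence(checkins):
    """Return (training samples, validation sample, testing sample) for one user.

    Each sample is an (input prefix, label) pair; the label always follows the prefix.
    """
    n = len(checkins)
    # Prefixes of length 1..n-3, each labelled with the next visited POI
    train = [(checkins[:m], checkins[m]) for m in range(1, n - 2)]
    valid = (checkins[:n - 2], checkins[n - 2])  # label is the (n-1)-th POI
    test = (checkins[:n - 1], checkins[n - 1])   # label is the n-th POI
    return train, valid, test
```

For a sequence of five check-ins this yields two training samples, consistent with the count of n - 3.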

5.1.2. Evaluation Metrics

In order to evaluate the recommendation performance, we adopt two commonly used evaluation metrics [25,39]: NDCG@k and Recall@k, where k is the number of recommended POIs. NDCG@k considers the position of the ground-truth POIs and assigns greater weights to the POIs at higher positions. Recall@k is the ratio of true positive samples among the recommended POIs to all positive samples. In our setting, NDCG@k indicates whether the POIs that users actually check in at rank near the top of the corresponding recommendation lists, and Recall@k indicates whether the POIs that users actually visit appear among the top-k recommended POIs. These metrics are computed as follows:

N D C G @ k = \frac{1}{|U|} \sum_{u \in U} \frac{D C G_{u} @ k}{I D C G_{u} @ k}

(21)

D C G_{u} @ k = \sum_{i = 1}^{k} \frac{2^{r_{i}} - 1}{\log_{2} (i + 1)}

(22)

where

r_{i}

is the graded relevance of the POI at position i. We use simple binary relevance in our work, namely,

r_{i} = 1

if the POI at position i of the recommended list is the one actually visited by the user, and 0 otherwise.

I D C G_{u} @ k

denotes the maximum

D C G_{u} @ k

in an ideal ranking.

R e c a l l @ k = \frac{1}{|U|} \sum_{u \in U} \frac{\sum_{i = 1}^{k} r_{i}}{|V_{u}|}

(23)

where

|V_{u}|

denotes the number of positive POIs in the list recommended to user u. Here,

|V_{u}| = 1

.
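
With binary relevance and a single ground-truth POI per user, the ideal DCG equals 1, so Formulas (21) to (23) reduce to the short computation sketched below. The function name is hypothetical.

```python
import math


def ndcg_and_recall_at_k(ranked_lists, targets, k):
    """Average NDCG@k and Recall@k when each user has exactly one ground-truth POI."""
    ndcg = recall = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            position = top_k.index(target) + 1        # 1-based rank of the hit
            ndcg += 1.0 / math.log2(position + 1)     # (2^1 - 1) / log2(i + 1), IDCG = 1
            recall += 1.0
    users = len(targets)
    return ndcg / users, recall / users
```

A hit at the first position contributes 1 to both metrics, while a hit further down the list contributes 1 to Recall@k but less than 1 to NDCG@k.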

5.1.3. Baseline Approaches

For RQ1, we compare STUIC-SAN with the following representative baseline approaches.

UCF [40]: a collaborative filtering method based on the user-POI matrix, which makes recommendations according to the correlation between POIs.

FPMC [41]: this method combines the Markov chain and matrix factorization methods, which can simultaneously capture temporal information and user long-term preference information, and then perform POI recommendation.

ST-RNN [42]: this model extends RNN to integrate the temporal context and spatial information in a recurrent neural network for next POI recommendation.

ARNN [43]: this model leverages semantic information, spatial information and user visiting information to build a knowledge graph, obtains POI neighbors through a random walk based on the meta path, and adopts LSTM to model sequence regularity to improve the recommendation performance.

LSTPM [44]: a new method for modeling users’ long-term and short-term preferences, which uses LSTM to model users’ long-term preferences and geographical relationship between POIs visited by users recently, and then makes the next POI recommendation.

TiSASRec [15]: a method based on the self-attention mechanism that explores the impact of different temporal intervals on the prediction of next item, and makes a recommendation in combination with the absolute positional information of items and the temporal interval between items.

Table 3 summarizes the approaches considering different factors in our experiments. In general, these methods can be divided into four categories: first, traditional collaborative filtering recommendation methods, e.g., UCF; second, Markov-chain-based methods, such as FPMC; third, methods based on RNN, such as ST-RNN, ARNN, and LSTPM; and fourth, methods based on the self-attention mechanism, such as TiSASRec.

5.1.4. Configurations

All experiments were conducted on a PC with a 2.90 GHz Intel(R) Core(TM) i7 CPU and 16 GB RAM, running Microsoft Windows 10 (64-bit). The code used in our experiments was written in Python 3.6, and we used TensorFlow 1.2.0 as the machine learning framework. We stacked a total of two spatio-temporal unequal interval correlation-aware self-attention networks and fine-tuned the hyper-parameters on the validation set. The latent dimension d was set to 50 for both datasets. For each target POI, the number of negative samples was set to 100. We used the Adam optimizer with default betas; the initial learning rate was set to 0.001, and the dropout rate was set to 0.2 to avoid overfitting. The batch size was set to 200. The settings of other hyper-parameters are shown in Table 4. The source code of the proposed model is publicly available at https://github.com/huang-0724/STUIC-SAN.git (accessed on 28 August 2022).

5.2. Results and Discussions

5.2.1. Comparison of Recommendation Performance

For RQ1, we compare the effectiveness of seven methods with tuned parameters on two datasets. The mean and standard deviation of NDCG@k and Recall@k of all methods are reported in Table 5 and Table 6. The numbers shown in bold in Table 5 and Table 6 represent the best performance of each column in the corresponding tables.

From the results shown in Table 5 and Table 6, we can make the following observations.

First, our proposed model, STUIC-SAN, outperforms the other models on both metrics and both datasets. Specifically, compared with the best baseline method, STUIC-SAN improves Recall@10 on the Foursquare dataset by about 6.57% and NDCG@10 by more than 6.87% in relative terms. The performance gains on the Gowalla dataset are similarly high. These results demonstrate the competitiveness of our model.

Second, methods considering temporal influence work better than those without temporal influence. Obviously, FPMC performs better than UCF. The performance improvements of FPMC may be due to using the Markov chain model, which incorporates the temporal factor to model users’ check-in sequences, showing good performance compared with the traditional collaborative filtering algorithm.

In addition, methods utilizing spatial influence generally perform better than those without spatial influence. ST-RNN, ARNN, and LSTPM all perform better than FPMC. ST-RNN improves greatly over FPMC since it uses an RNN to model the spatio-temporal context, which works better than methods based on the Markov chain and matrix factorization. Compared with ST-RNN, ARNN integrates the semantic information of POIs on top of the spatio-temporal context to obtain more related POIs and expand the set of candidate POIs. LSTPM proposes a spatially extended module to obtain users’ short-term preferences by making full use of the spatial relationship between non-consecutive POIs. Therefore, ARNN and LSTPM achieve a further improvement over ST-RNN.

Third, the performance increase can also be attributed to the deep mining of spatio-temporal information and the self-attention mechanism. Taking the experimental results from the Foursquare dataset as an example, it can be seen that the performance of TiSASRec is better than that of the methods based on the RNN model, such as ST-RNN, ARNN and LSTPM. The reason is that RNN-based methods usually use short trajectories after slicing rather than long trajectories with long-term periodic information of the visiting POIs. Therefore, in view of the shortcomings of the RNN model itself, it is difficult for RNN-based methods to capture the exact impact of the POI visited in each check-in activity from the corresponding user historical check-in trajectory on the next POI selection, while the self-attention mechanism can effectively model users’ long-term preferences. Therefore, TiSASRec can accurately obtain long-term dependencies of users on item interactions by using the self-attention mechanism, and selectively combine the information of relevant item interactions. Furthermore, TiSASRec deeply mines temporal information and models the temporal interval information as users’ visiting preferences to improve the recommendation performance.

Lastly, the performance of our proposed STUIC-SAN model is significantly better than TiSASRec on both datasets mainly because it comprehensively considers the spatio-temporal unequal interval information between any two POIs. Moreover, the experimental results demonstrate that spatio-temporal unequal interval information between any two POIs helps to better capture spatio-temporal information to accurately infer the spatio-temporal unequal interval correlation preferences of users so as to improve the performance of next POI recommendation.

5.2.2. Effectiveness of Different Components

For RQ2, in order to analyze the impacts of the different modules in our model on the performance of next POI recommendation, we conduct ablation experiments in this section.

We investigate the effectiveness of different components on performance, including sequence influence in users’ check-in trajectories, as well as influences of temporal unequal interval, spatial unequal interval and spatio-temporal unequal interval correlation. Moreover, to further validate the benefits brought by each component, we construct the following variants of STUIC-SAN.

STUIC-SE: The model only considers the sequence influence in users’ historical check-in trajectories. In other words, the model only contains the absolute positional information of corresponding POIs, and assumes that there is an equal temporal/spatial interval between POIs in consecutive check-in activities.

STUIC-TE: The model considers the influence of temporal unequal interval only. So, we redefine Formulas (13) and (15) as follows:

n s_{i} = \sum_{j = 1}^{n} a_{i j} (I_{j} W^{V} + r_{i j}^{t_{V}} + p_{j}^{V})

(24)

e_{i j} = \frac{I_{i} W^{Q} {(I_{j} W^{K} + r_{i j}^{t_{K}} + p_{j}^{K})}^{T}}{\sqrt{d}}

(25)

STUIC-SP: This model integrates the spatial unequal interval information on the basis of STUIC-TE model, without considering spatio-temporal unequal interval correlation information between any two POIs. Therefore, we set a unified maximum spatial interval

k^{s}

to perform the corresponding clip operation.

The characteristics of the variant models are shown in Table 7. We take the NDCG@k metric to illustrate the effectiveness of three newly designed components on two datasets, as shown in Table 8. Note that the results on Recall@k are similar to those on NDCG@k, so we analyzed the effects of different components on NDCG@k due to space limitation.

It can be seen from Table 8 that, among the three variants of STUIC-SAN, STUIC-SE suffers the largest performance drop on both datasets. This is because STUIC-SE does not consider the spatio-temporal unequal interval correlation factors, which are particularly important for next POI recommendation, so users’ personalized spatio-temporal unequal interval correlation preferences cannot be accurately obtained. The significant NDCG@k drop verifies the positive contribution of the spatio-temporal unequal interval information and spatio-temporal unequal interval correlation information integrated into our STUIC-SAN model.

In addition, STUIC-TE performs better than STUIC-SE because it obtains the temporal unequal intervals between any two POIs from users’ check-in sequences and uses the self-attention mechanism to model users’ temporal unequal interval preferences for selecting the next POI, which is more reliable than considering the sequence information alone, as STUIC-SE does.

Moreover, the performance of STUIC-SP is better than STUIC-TE by integrating the spatial unequal interval information between POIs. The reason is that in the next POI recommendation task, the user selection of the next POI will be affected by the spatial distance between the current POI and the next POI. Therefore, the effective integration of the spatial unequal interval information between POIs is beneficial for improving the performance of the recommendation.

Overall, our proposed STUIC-SAN performs better than STUIC-SE, STUIC-TE and STUIC-SP, which demonstrates that fully considering the spatio-temporal unequal interval information and spatio-temporal unequal interval correlation information between POIs helps to better capture users’ personalized spatio-temporal unequal interval correlation preferences and thus improves the performance of next POI recommendation. Note that the components do not conflict with each other and can be utilized to collaboratively learn users’ personalized spatio-temporal unequal interval correlation preferences.

5.3. Sensitivity Analysis of Parameters

For RQ3, we analyze the effects of different model parameters on the performance of STUIC-SAN in this section. Here, we focus on four critical parameters, namely, the latent dimension d, the maximum sequence length n, the maximum temporal interval

k^{t}

and the maximum spatial interval

k^{s}

. Next, we analyze the effects of four parameters on NDCG@k. Note that the results on Recall@k are similar to those on NDCG@k.

5.3.1. Effect of Latent Dimension

In this subsection, we study how sensitive STUIC-SAN is to the latent dimension d, while keeping other hyper-parameters unchanged. As shown in Figure 5, the performance first grows dramatically as d increases, then improves relatively slowly. This is because d controls the model complexity: a model with a small d lacks the capacity to describe the datasets, while a model with a large d is more complex than the datasets require. Thus, we set the latent dimension to 50 in this paper.

5.3.2. Effect of Maximum Sequence Length

To illustrate the effects of the maximum sequence length n, we vary n from 10 to 100 while keeping other hyper-parameters unchanged. Figure 6 shows the results. As n increases, we can see that both curves grow slowly and then gradually flatten on the two datasets. The reason is that when the sequence length is too large, many meaningless POIs are utilized to train the representation vector of the target POI, while when the sequence length is too small, it does not accurately depict the spatio-temporal context. Therefore, we choose the maximum sequence length that can obtain the best performance on two datasets as the default settings in Section 5.1.4.

5.3.3. Effect of Maximum Temporal Interval

In order to demonstrate the effects of the maximum temporal interval

k^{t}

, we set values of the maximum temporal intervals of two datasets as 512, 1024, 2048, 4096, 8192, 12,000, while keeping other hyper-parameters unchanged. As shown in Figure 7, as the maximum temporal interval increases, both curves show different trends on two datasets. This is because different maximum temporal intervals have different effects on the recommendation performance. It can be seen that the maximum temporal interval that can obtain the best performance on Foursquare is 4096, while that of Gowalla is 8192. Therefore, we choose such values that can obtain the best performance as the corresponding maximum temporal intervals on two datasets, respectively.

5.3.4. Effect of Maximum Spatial Interval

In this subsection, we further demonstrate the effect of the maximum spatial interval $k^{s}$. Because the spatial unequal intervals of users differ between the two datasets, we test different values of the maximum spatial interval on each dataset while keeping the other hyper-parameters unchanged. Figure 8 depicts the performance results on the two datasets. As with the maximum temporal interval, we select the values that achieve the best performance and set the maximum spatial interval to 5000 on Foursquare and 15,000 on Gowalla.
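
A comparable sketch for the spatial side is given below. It assumes that the spatial interval between two check-ins is the great-circle (haversine) distance in kilometres between their coordinates and that distances above a cap are clipped to the cap. Both the distance measure and the function name `clipped_spatial_intervals` are illustrative assumptions rather than the construction defined in Section 4.1.2.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def clipped_spatial_intervals(lat_deg, lon_deg, k_s):
    """Pairwise haversine distances (km) between check-in coordinates,
    with every distance larger than k_s replaced by k_s."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point excursions outside [0, 1].
    dist_km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    return np.minimum(dist_km, k_s)


if __name__ == "__main__":
    # Equator points at longitudes 0, 1 and 180 degrees.
    print(clipped_spatial_intervals([0, 0, 0], [0, 1, 180], k_s=15_000))
```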

5.4. Effect of Different Spatio-Temporal Interval Correlation Processing

For RQ4, we discuss the impact of different spatio-temporal interval correlation processing on the recommendation in this section. We compare the following three methods.

STUIC-SP: the maximum temporal interval and the maximum spatial interval are set to a unified value without considering the spatio-temporal interval correlation information between POIs, as described in Section 5.2.2.

STUIC-LE: users are classified by using the length of each user check-in sequence and the average sequence length of all users from each dataset so as to obtain the linear regression equation of the spatio-temporal unequal interval correlation with different sequence lengths.

STUIC-SAN: the spatio-temporal unequal interval processing we adopt, as described in Section 4.1.2.

We illustrate the effectiveness of the three methods mentioned above on the two datasets in Figure 9. Among the three methods, STUIC-SP performs worst on both datasets, while STUIC-SAN performs best. This is because STUIC-SP ignores the spatio-temporal interval correlation information between POIs and sets the maximum temporal interval in all users’ personalized temporal unequal interval matrices to a unified value. It performs a similar operation for the maximum spatial interval. Thus, it cannot adequately capture users’ personalized spatio-temporal interval correlation preferences. STUIC-LE and STUIC-SAN both build on STUIC-SP by considering the correlation between the temporal and spatial intervals. STUIC-LE divides users according to the length of their check-in sequences, whereas STUIC-SAN divides users according to the average temporal interval between any two POIs. Each method then obtains the spatio-temporal interval correlation between any two POIs from the check-in sequences of each category of users through the linear regression method. Compared with STUIC-LE, STUIC-SAN can more directly simulate the correlation between the temporal interval and the spatial interval of users visiting POIs so as to better conduct next POI recommendation.
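
The linear regression step shared by STUIC-LE and STUIC-SAN can be sketched as an ordinary least-squares fit of spatial intervals against temporal intervals for one group of users. The function name `fit_interval_regression`, the use of `numpy.polyfit`, and the toy numbers are assumptions for illustration; the paper’s own grouping of users and regression equations are not reproduced here.

```python
import numpy as np


def fit_interval_regression(temporal, spatial):
    """Fit spatial ≈ slope * temporal + intercept by least squares and
    return (slope, intercept, Pearson correlation coefficient)."""
    t = np.asarray(temporal, dtype=float)
    s = np.asarray(spatial, dtype=float)
    if t.shape != s.shape or t.size < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    if t.max() == t.min():
        raise ValueError("temporal intervals must not all be identical")
    slope, intercept = np.polyfit(t, s, deg=1)  # highest power first
    r = np.corrcoef(t, s)[0, 1]
    return float(slope), float(intercept), float(r)


if __name__ == "__main__":
    # Toy data only: temporal intervals in minutes, spatial intervals in km.
    t = [10, 20, 30, 40]
    s = [7.0, 12.0, 17.0, 22.0]
    print(fit_interval_regression(t, s))
```

Fitting one line per user group, rather than a single line for all users, is what allows the regression to reflect that different groups cover different distances in the same amount of time.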

5.5. Threats to Validity

In this section, we discuss some potential threats to the validity of our study. Threats to the effectiveness of our research include three aspects: data selection, experimental setup, and the selection of auxiliary information.

Data selection bias is one of the most common threats to validity. In the next POI recommendation task, we need to simulate users’ sequential visiting patterns from their check-in sequences so as to model users’ personalized visiting preferences. Therefore, we remove inactive users and unpopular POIs. In addition, to facilitate the processing of users’ historical check-in trajectories, we set the historical check-in trajectories of all users to a fixed length (see the analysis in Section 5.3.2). We also leverage the negative sampling method commonly used in recommendation systems to improve the performance and efficiency of recommendation. Moreover, we conduct experiments with different maximum temporal intervals and maximum spatial intervals between any two POIs of users from the two datasets so as to select appropriate temporal and spatial intervals for building users’ personalized preference representations. Consequently, the recommendation performance of our model will decrease without appropriate maximum temporal and spatial intervals (see the analysis in Section 5.3.3 and Section 5.3.4).

In our experiments, we trained the different types of baseline methods with their default hyper-parameter settings. The baseline approaches based on deep neural networks also rely on several implicit tricks, e.g., fine-tuning. Therefore, we cannot ensure that these methods achieve the performance reported in their original papers on the two datasets.

In our model, we mainly mine the spatio-temporal unequal interval correlation between any two POIs from users’ check-in sequences and then leverage the self-attention mechanism to perform next POI recommendation. However, the datasets commonly used in next POI recommendation do not record which vehicle users took along their historical check-in trajectories, and we cannot guarantee that users travel the same distance per unit of time. Therefore, we assume that each user from both datasets visits the POIs in the corresponding check-in trajectory by the same or similar means of transportation.

In addition, we cannot obtain users’ check-out times from public datasets. If the check-in devices that users adopt could apply more sensitive indoor localization technology [45], we could obtain the duration spent at each POI from the corresponding check-in and check-out times. This would let us simulate users’ preferences for temporal unequal intervals when visiting POIs more accurately and consequently improve the accuracy of next POI recommendation.

In our model, users’ visiting preferences are simulated based on the users’ historical check-in sequence, so all next POIs recommended to users are those that users have already visited rather than new POIs. In future work, we can enrich the candidate pool for users to select next POI and improve the diversity of recommendation by integrating the similarity information between POIs. For example, we can recommend POIs from the same category based on semantic similarity information between POIs, or recommend adjacent POIs according to the spatial similarity information between POIs.

With the rapid development of LBSNs, users generate a large amount of check-in data every day. The denser the user–POI interaction matrix, the more accurately we can simulate users’ visiting preferences and thus provide more accurate recommendations. However, our STUIC-SAN model builds temporal and spatial unequal interval matrices for each user to achieve personalized recommendation, so on large datasets [46] the running time of the model will increase to some extent. In future work, we will attempt to incorporate lightweight models, e.g., LightMove [47], into our method to improve model efficiency without reducing the accuracy of the recommendations.
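
A back-of-the-envelope count suggests why per-user interval matrices become costly as datasets grow. The sketch assumes that each user has one dense n × n temporal matrix and one dense n × n spatial matrix, with n = 100 (Table 4) and the user counts from Table 2; the 4-byte entry size and the function name `interval_matrix_entries` are assumptions, and the paper itself reports no memory figures.

```python
def interval_matrix_entries(num_users: int, n: int, matrices_per_user: int = 2) -> int:
    """Total number of entries when every user has `matrices_per_user`
    dense n x n interval matrices (temporal and spatial by default)."""
    return num_users * matrices_per_user * n * n


BYTES_PER_ENTRY = 4  # assumption: intervals stored as 32-bit integers

if __name__ == "__main__":
    for name, users in (("Foursquare", 7642), ("Gowalla", 5628)):
        entries = interval_matrix_entries(users, n=100)
        megabytes = entries * BYTES_PER_ENTRY / 1e6
        print(f"{name}: {entries:,} entries, about {megabytes:.0f} MB")
```

The count grows linearly with the number of users and quadratically with n, which is consistent with the scalability concern raised above.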

6. Conclusions and Future Work

In recent years, next POI recommendation has attracted increasing attention in the fields of LBSNs and recommendation systems. In this paper, we propose a spatio-temporal unequal interval correlation-aware self-attention network (STUIC-SAN) to improve the performance of next POI recommendation. More specifically, STUIC-SAN uses the linear regression method to analyze the effect of the temporal unequal interval on the spatial unequal interval between any two POIs from users’ check-in sequences. It then learns the effect of the spatio-temporal unequal interval correlation on users’ selection of the next POI through the self-attention mechanism, so as to better model users’ personalized spatio-temporal unequal interval correlation preferences and thereby improve the performance of next POI recommendation. In addition, we conducted experiments on two publicly available datasets, namely Foursquare and Gowalla, to verify the effectiveness of STUIC-SAN. The experimental results show that STUIC-SAN outperforms the state-of-the-art methods on two commonly used metrics, namely NDCG@k and Recall@k.

For future work, we will further enrich and optimize STUIC-SAN by considering more information, such as POI neighbor information, category information, and user check-in frequency, to model users’ personalized visiting preferences more accurately and thereby improve next POI recommendation. Moreover, we will try to combine STUIC-SAN with a lightweight model so that the efficiency of recommendation is improved without sacrificing accuracy.

Author Contributions

Zheng Li: Writing—review and editing, Supervision, Funding acquisition; Xueyuan Huang: Conceptualization, Methodology, Software, Writing—original draft, Writing—review and editing; Chun Liu and Wei Yang: Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

The work described in this paper is partially supported by the National Natural Science Foundation of China (No. 61402150, 61806074); The Key Technologies R & D Program of Henan (No. 182102410063); Key Scientific Research Project Plan of Colleges and Universities in Henan Province (No. 23A520016).

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here (accessed on 28 August 2022): https://github.com/YijunSu/LBSN_Dataset.

Conflicts of Interest

The authors declare no conflict of interest.

References

Gao, H.; Tang, J.; Hu, X.; Liu, H. Exploring temporal effects for location recommendation on location-based social networks. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013; pp. 93–100.
Chen, K.; Yang, H.; Lyu, M.; King, I. Where You Like to Go Next: Successive Point-of-Interest Recommendation. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China, 5–9 August 2013; pp. 2605–2611.
He, J.; Li, X.; Liao, L.; Song, D. Inferring a Personalized Next Point-of-Interest Recommendation Model with Latent Behavior Patterns. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 137–143.
Xia, B.; Li, Y.; Li, Q.; Li, T. Attention-based recurrent neural network for location recommendation. In Proceedings of the 12th International Conference on Intelligent Systems and Knowledge Engineering, Nanjing, China, 24–26 November 2017; pp. 1–6.
Wu, Y.; Li, K.; Zhao, G. Long-and short-term preference learning for next POI recommendation. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2021; pp. 2301–2304.
Liu, Y.; Pei, A.; Wang, F.; Yang, Y.; Zhang, X.; Wang, H.; Dai, H.; Qi, L.; Ma, R. An attention-based category-aware GRU model for the next POI recommendation. Int. J. Intell. Syst. 2021, 36, 3174–3189.
Wang, C.; McAuley, J. Self-Attentive Sequential Recommendation. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; pp. 197–206.
Wang, X.; Liu, Y.; Zhou, X.; Leng, Z.; Wang, X. Long- and Short-Term Preference Modeling Based on Multi-Level Attention for Next POI Recommendation. ISPRS Int. J. Geo-Inf. 2022, 11, 323.
Ali, M.; Raf, D.; ACres, F. A Joint Two-Phase Time-Sensitive Regularized Collaborative Ranking Model for Point of Interest Recommendation. IEEE Trans. Knowl. Data Eng. 2019, 32, 1050–1063.
Zhang, Y.; Liu, G.; Liu, A.; Zhang, Y.; Li, Z.; Zhang, X.; Li, Q. Personalized Geographical Influence Modeling for POI Recommendation. IEEE Intell. Syst. 2020, 35, 18–27.
Zou, Z.; He, X.; Zhu, A.X. An Automatic Annotation Method for Discovering Semantic Information of Geographical Locations from Location-Based Social Networks. ISPRS Int. J. Geo-Inf. 2019, 8, 487.
Zhao, K.; Zhang, Y.; Yin, H.; Wang, J.; Zheng, K.; Zhou, X.; Xing, C. Discovering Subsequence Patterns for Next POI Recommendation. In Proceedings of the 29th International Joint Conference on Artificial Intelligence, Yokohama, Japan, 7–15 January 2020; pp. 3216–3222.
Lu, Y.; Huang, J. GLR: A graph-based latent representation model for successive POI recommendation. Future Gener. Comput. Syst. 2020, 102, 230–244.
Cui, Q.; Zhang, Y.; Wang, J. CANS-Net: Context-Aware Non-Successive Modeling Network for Next Point-of-Interest Recommendation. arXiv 2021, arXiv:2104.02262.
Li, J.; Wang, Y.; McAuley, J. Time Interval Aware Self-Attention for Sequential Recommendation. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; pp. 322–330.
Yang, G.; Cai, Y.; Reddy, C.K. Spatio-Temporal Check-in Time Prediction with Recurrent Neural Network based Survival Analysis. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 9–19 July 2018; pp. 2976–2983.
Zhao, P.; Luo, A.; Liu, Y.; Xu, J.; Li, Z.; Zhuang, F.; Sheng, V.S.; Zhou, X. Where to Go Next: A Spatio-Temporal Gated Network for Next POI Recommendation. IEEE Trans. Knowl. Data Eng. 2022, 34, 2512–2524.
Luo, Y.; Liu, Q.; Liu, Z. STAN: Spatio-Temporal Attention Network for Next Location Recommendation. In Proceedings of the Web Conference 2021 (WWW ’21), Ljubljana, Slovenia, 19–23 April 2021; pp. 2177–2185.
He, Q.; Jiang, D.; Liao, Z.; Hoi, S.C.H.; Chang, K.; Lim, E.P.; Li, H. Web Query Recommendation via Sequential Query Prediction. In Proceedings of the 2009 IEEE 25th International Conference on Data Engineering, Shanghai, China, 29 March–2 April 2009; pp. 1443–1454.
Feng, S.; Li, X.; Zeng, Y.; Cong, G.; Chee, Y.M.; Yuan, Q. Personalized Ranking Metric Embedding for Next New POI Recommendation. In Proceedings of the 24th International Conference on Artificial Intelligence, Phuket Island, Thailand, 26–27 July 2015; pp. 2069–2075.
He, R.; Fang, C.; Wang, Z.; McAuley, J. Vista: A Visually, Socially, and Temporally-aware Model for Artistic Recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, 15–19 September 2016; pp. 309–316.
Lu, Y.; Shih, W.; Gau, H.; Chung, K.; Huang, J. On successive point-of-interest recommendation. World Wide Web 2018, 22, 1151–1173.
Li, R.; Shen, Y.; Zhu, Y. Next Point-of-Interest Recommendation with Temporal and Multi-level Context Attention. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; pp. 1110–1115.
Liu, Y.; Wu, A. POI Recommendation Method Using Deep Learning in Location-Based Social Networks. Wirel. Commun. Mob. Comput. 2021, 2021, 9120864.
Huang, L.; Ma, Y.; Liu, Y.; He, K. DAN-SNR. ACM Trans. Internet Technol. (TOIT) 2021, 21, 1–27.
Lim, N.; Hooi, B.; Ng, S.K.; Wang, X.; Goh, Y.L.; Weng, R.; Varadarajan, J. STP-UDGAT: Spatial-Temporal-Preference User Dimensional Graph Attention Network for Next POI Recommendation. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 845–854.
Ding, R.; Chen, Z.; Li, X. Spatial-Temporal Distance Metric Embedding for Time-Specific POI Recommendation. IEEE Access 2018, 6, 67035–67045.
Doan, K.D.; Yang, G.; Reddy, C.K. An Attentive Spatio-Temporal Neural Model for Successive Point of Interest Recommendation. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Macau, China, 14–17 April 2019; Volume 11441.
Pengpeng, Z.; Haifeng, Z.; Yanchi, L.; Jiajie, X.; Xiaofang, Z. Where to Go Next: A Spatio-Temporal Gated Network for Next POI Recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 5877–5884.
Huang, L.; Ma, Y.; Wang, S.; Liu, Y. An Attention-Based Spatiotemporal LSTM Network for Next POI Recommendation. IEEE Trans. Serv. Comput. 2021, 14, 1585–1597.
Davtalab, M.; Alesheikh, A.A. A POI recommendation approach integrating social spatio-temporal information into probabilistic matrix factorization. Knowl. Inf. Syst. 2021, 63, 65–85.
Dai, S.; Yu, Y.; Fan, H.; Dong, J. Spatio-Temporal Representation Learning with Social Tie for Personalized POI Recommendation. Data Sci. Eng. 2022, 7, 1–13.
Xi, D.; Zhuang, F.; Liu, Y.; Gu, J.; Xiong, H.; He, Q. Modelling of Bi-Directional Spatio-Temporal Dependence and Users’ Dynamic Preferences for Missing POI Check-In Identification. In Proceedings of the National Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 5458–5465.
Ma, Y.; Gan, M. Exploring multiple spatio-temporal information for point-of-interest recommendation. Soft Comput. 2020, 24, 18733–18747.
Jain, G.; Mishra, N.; Sharma, S.K. CRLRM: Category Based Recommendation Using Linear Regression Model. In Proceedings of the 3rd International Conference on Advances in Computing and Communications, Cochin, India, 29–31 August 2013; pp. 17–20.
Vaswani, A.; Shazeer, N.M.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Los Angeles, CA, USA, 4–9 December 2017; pp. 6000–6010.
Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference for Learning Representations, Scottsdale, AZ, USA, 2–4 May 2015.
Su, Y.; Li, X.; Liu, B.; Zha, D.; Xiang, J.; Tang, W.; Gao, N. FGCRec: Fine-Grained Geographical Characteristics Modeling for Point-of-Interest Recommendation. In Proceedings of the 2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182.
Sarwar, B.M.; Karypis, G.; Konstan, J.A.; Riedl, J. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, Hong Kong, China, 1–5 May 2001; pp. 285–295.
Rendle, S.; Freudenthaler, C.; Schmidt-Thieme, L. Factorizing personalized Markov chains for next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 811–820.
Liu, Q.; Wu, S.; Wang, L.; Tan, T. Predicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 194–200.
Guo, Q.; Sun, Z.; Zhang, J.; Theng, Y.L. An Attentional Recurrent Neural Network for Personalized Next Location Recommendation. In Proceedings of the AAAI-20 Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 83–90.
Sun, K.; Qian, T.; Chen, T.; Liang, Y.; Nguyen, Q.V.H.; Yin, H. Where to Go Next: Modeling Long- and Short-Term User Preferences for Point-of-Interest Recommendation. In Proceedings of the AAAI-20 Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 214–221.
Tekler, Z.D.; Low, R.; Gunay, B.; Andersen, R.K.; Blessing, L. A scalable Bluetooth Low Energy approach to identify occupancy patterns and profiles in office spaces. Build. Environ. 2020, 171, 106681–106693.
Low, R.; Tekler, Z.D.; Cheah, L. An End-to-end Point of Interest (POI) Conflation Framework. ISPRS Int. J. Geo-Inf. 2021, 10, 779.
Jeon, J.; Kang, S.; Jo, M.; Cho, S.; Park, N.; Kim, S.; Song, C. LightMove: A Lightweight Next-POI Recommendation for Taxicab Rooftop Advertising. In Proceedings of the 30th ACM International Conference on Information and Knowledge Management, Virtual, 1–5 November 2021; pp. 3857–3866.

Figure 1. Illustration of spatio-temporal unequal interval relationship existing in users’ check-in trajectories.

Figure 2. Illustration of spatio-temporal unequal interval relationship existing in users’ check-in data.

Figure 3. Illustrations of spatio-temporal unequal interval correlation in check-in sequences of different users from Foursquare.

Figure 4. Illustration of STUIC-SAN model.

Figure 5. Effect of the latent dimension d on recommendation performance.

Figure 6. Effect of maximum sequence length n on recommendation performance.

Figure 7. Effect of maximum temporal interval $k^{t}$ on recommendation performance.

Figure 8. Effect of maximum spatial interval $k^{s}$ on recommendation performance.

Figure 9. Effect of different spatio-temporal interval correlation processing on recommendation performance.

Table 1. Key notations used in this paper.

Notation	Description
U, L	User, POI set
$r_{e}$	A check-in activity of user $u_{i}$
$tra(u_{i})$	Historical check-in trajectory of user $u_{i}$
$seq(u_{i})$	Fixed length sub-sequence of $tra(u_{i})$
$t(u_{i})$	Temporal sequence corresponding to $seq(u_{i})$
$g(u_{i})$	Spatial sequence corresponding to $seq(u_{i})$
n	Maximum sequence length
d	Latent dimension
$R_{u_{i}}^{t}$	Temporal intervals set of user $u_{i}$
$R_{u_{i}}^{s}$	Spatial intervals set of user $u_{i}$
$Δ^{t_{u_{i}}}$	User personalized temporal unequal interval matrix
$Δ^{s_{u_{i}}}$	User personalized spatial unequal interval matrix
$M^{L}$	Matrix of POIs embedding
$M_{K}^{P}$ , $M_{V}^{P}$	Embedding matrix of position for key and value
$M_{K}^{T}$ , $M_{V}^{T}$	Embedding matrix of temporal unequal interval for key and value
$M_{K}^{S}$ , $M_{V}^{S}$	Embedding matrix of spatial unequal interval for key and value
$NS_{t}$	Output of model at time step t

Table 2. Statistics of two datasets.

	Foursquare	Gowalla
Number of users	7642	5628
Number of POIs	28,484	31,803
Number of check-ins	512,523	620,683

Table 3. The approaches used in our experiments.

Property	UCF	FPMC	ST-RNN	ARNN	LSTPM	TiSASRec	STUIC-SAN
SE	√	√	√	√	√	√	√
TE	×	√	√	√	√	√	√
SP	×	×	√	√	√	×	√
ST	×	×	×	×	×	×	√
SA	×	×	×	×	×	√	√

SE, TE, SP, ST and SA represent whether the given approach considers the sequential influence, temporal information, spatial information, spatio-temporal unequal interval correlation information and self-attention mechanism, respectively.

Table 4. Hyper-parameter settings on two datasets.

Hyper-Parameter	Foursquare	Gowalla
maximum sequence length	100	100
maximum temporal interval (min)	4096	8192
maximum spatial interval (km)	5000	15,000
regularization	0.00005	0.00005

Table 5. Comparison of recommendation performance on Foursquare (mean ± std).

Foursquare	NDCG@5	NDCG@10	Recall@5	Recall@10
UCF	0.0651 ± 0.0042	0.1260 ± 0.0131	0.1838 ± 0.0096	0.2248 ± 0.0049
FPMC	0.1942 ± 0.0094	0.2439 ± 0.0041	0.3278 ± 0.0055	0.3623 ± 0.0018
ST-RNN	0.4027 ± 0.0152	0.4423 ± 0.0046	0.5594 ± 0.0134	0.5846 ± 0.0035
ARNN	0.4987 ± 0.0090	0.5593 ± 0.0050	0.6527 ± 0.0079	0.7085 ± 0.0062
LSTPM	0.5338 ± 0.0110	0.5945 ± 0.0053	0.6887 ± 0.0088	0.7620 ± 0.0055
TiSASRec	0.6471 ± 0.0059	0.6816 ± 0.0040	0.8008 ± 0.0073	0.8398 ± 0.0057
STUIC-SAN	0.7158 ± 0.0027	0.7282 ± 0.0093	0.8320 ± 0.0022	0.8950 ± 0.0037

Table 6. Comparison of recommendation performance on Gowalla (mean ± std).

Gowalla	NDCG@5	NDCG@10	Recall@5	Recall@10
UCF	0.0937 ± 0.0063	0.1419 ± 0.0017	0.2396 ± 0.0087	0.2602 ± 0.0046
FPMC	0.2266 ± 0.0034	0.2982 ± 0.0124	0.3498 ± 0.0021	0.4055 ± 0.0101
ST-RNN	0.4170 ± 0.0085	0.4853 ± 0.0044	0.5743 ± 0.0138	0.6218 ± 0.0057
ARNN	0.5125 ± 0.0057	0.5868 ± 0.0110	0.6727 ± 0.0033	0.7584 ± 0.0068
LSTPM	0.5440 ± 0.0108	0.6218 ± 0.0108	0.7022 ± 0.0131	0.7928 ± 0.0073
TiSASRec	0.6327 ± 0.0062	0.7039 ± 0.0015	0.7959 ± 0.0061	0.8488 ± 0.0037
STUIC-SAN	0.7216 ± 0.0056	0.7407 ± 0.0072	0.8544 ± 0.0043	0.9266 ± 0.0030

Table 7. Comparison of the variant models of STUIC-SAN.

Variants	Sequence Influence	Temporal Unequal Interval Influence	Spatial Unequal Interval Influence	Spatio-Temporal Unequal Interval Correlation Influence
STUIC-SE	√	×	×	×
STUIC-TE	√	√	×	×
STUIC-SP	√	√	√	×
STUIC-SAN	√	√	√	√

Table 8. Effectiveness of different components.

	Foursquare		Gowalla
Variants	NDCG@5	NDCG@10	NDCG@5	NDCG@10
STUIC-SE	0.6249	0.6601	0.6371	0.6898
STUIC-TE	0.6468	0.6823	0.6399	0.7037
STUIC-SP	0.6890	0.7019	0.7021	0.7294
STUIC-SAN	0.7158	0.7282	0.7216	0.7417

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

