Article

Differential Privacy-Based Spatial-Temporal Trajectory Clustering Scheme for LBSNs

1
College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China
2
School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, China
*
Authors to whom correspondence should be addressed.
Electronics 2023, 12(18), 3767; https://doi.org/10.3390/electronics12183767
Submission received: 29 July 2023 / Revised: 30 August 2023 / Accepted: 4 September 2023 / Published: 6 September 2023

Abstract

Location privacy preservation for location-based social networks (LBSNs) has attracted a great deal of attention. Existing location privacy protection methods suffer from information leakage and low data availability, and are therefore no longer suitable for today's diverse and personalized location-based services. To address these issues, we propose a differential privacy-based spatial-temporal trajectory clustering (DP-STTC) scheme, which transforms the existing location privacy protection mechanism into a spatial-temporal trajectory protection mechanism by adjusting the privacy parameters. The trajectories are then clustered to uncover users with similar trajectory characteristics. Finally, experiments are conducted on two real datasets. The experimental results show that the DP-STTC scheme achieves better accuracy in trajectory clustering while also protecting user privacy.

1. Introduction

Location-based social networks (LBSNs) have grown rapidly in recent years owing to the widespread use of mobile devices and the rapid development of location technology. According to surveys, approximately 80% of the applications people use in their daily lives are related to location-based services, such as entertainment services, navigation services, and services provided by wearable devices [1,2,3].
However, while LBSNs offer users a positive experience, they frequently request user location information and generate user behavioral trajectory data. In order to achieve potential economic value, LBS service providers offer the collected user trajectory data to corresponding research institutions for data mining and analysis, so as to achieve accurate recommendations and personalized services. When trajectory data are released, there may be major privacy leaks because they contain a lot of sensitive information and spatial-temporal information [4,5,6]. Attackers can not only obtain users’ geographical activity location data through the original trajectory dataset, but also deduce their interests, habits, health status, social relationships, and home address through big data analysis, which may lead to serious consequences.
In LBSNs, mobile clients generate mobile trajectory sequences by connecting the location data accessed by users in a chronological order. The majority of current location privacy protection systems ignore spatial-temporal correlation in favor of protecting specific places or trajectories. Based on our observation, the method of generating virtual location protection proposed in articles [7,8], and the geographical indistinguishability method presented in article [9] solely address spatial information between locations and disregard the protection of time information. Similarly, the location obfuscation protection method suggested in article [10] solely takes spatial information into account, whereas article [11] exclusively considers time information. This makes it easy for attackers to infer user behavior patterns and fails to protect sensitive information about user spatial-temporal activities. These mechanisms are no longer capable of meeting the future development needs of diverse and personalized location services. To address this problem, a differential privacy-based spatial-temporal trajectory clustering (DP-STTC) scheme is proposed in this paper. First, the mobile client processes each user’s initial location. The location privacy protection based on spatial-temporal activity is realized by adjusting the privacy parameters. Then, the disturbed trajectory sequences are generated according to the time characteristic of locations. Finally, the disturbed trajectory sequences are uploaded by the mobile client to the server, aiming to obtain clustering outcomes from the server side.
The significant contributions brought forward by this paper are as follows:
(1) We constructed a differential privacy-based spatial-temporal trajectory protection framework. The spatial-temporal activity privacy protection mechanism was proposed to achieve the protection of users’ spatial-temporal activity privacy;
(2) We proposed a trajectory clustering technology that groups trajectories into different clusters by considering the semantic distance between trajectories;
(3) We conducted an extensive experimental study to verify the functions and performance of the proposed DP-STTC scheme on two real datasets. The experimental results show that our DP-STTC scheme can cluster the trajectories privately.

2. Background and Related Work

2.1. Trajectory Clustering

Trajectory clustering is the process of grouping trajectory objects into different clusters based on how similar their trajectories are. Based on the clustering model, clustering algorithms can be divided into four categories: partitional clustering [12], density clustering [13], grid clustering, and hierarchical clustering [14]. Lee et al. [15] introduced the idea of trajectory segment similarity for the first time and provided a framework for trajectory clustering based on grouping. The trajectory is divided into multiple trajectory segments using the local characteristics of the trajectory, and similar trajectory segments are clustered to find common sub-trajectories. Cheng et al. [16] improved on this work [15] by proposing a clustering method based on the temporal and spatial similarity between sub-trajectories. First, each spatial sub-distance is defined and calculated to obtain the spatial similarity measure. Then, a temporal similarity formula based on the LCSS model is proposed to account for the time factor. The trajectory similarity assessment methodology of Wang et al. [17] accounts for the shape characteristics of trajectories in addition to their temporal and spatial components. Qiao et al. [18] outlined a semantic trajectory clustering approach that combines the similarity matrix with the k-NN algorithm. Xu et al. [19] proposed a trajectory clustering algorithm that considers semantic and geographical distances; using this approach, the trajectories are organized into various clusters based on the potential distance. Shaham et al. [20] proposed a machine-learning-based approach for anonymizing trajectory data, which clusters trajectories with an enhanced version of the k-means algorithm while safeguarding against sensitive data leakage.

2.2. Privacy Preserving of Trajectories

Existing methods for trajectory privacy protection fall mainly into three categories: fake trajectories, trajectory generalization, and trajectory suppression. The fake trajectory technique [21] generates fake trajectories based on original user trajectories and publishes them to the server. The basic principle of the trajectory generalization technique [22] is similar to that of the location generalization technique, and the commonly used method is k-anonymity. To accomplish trajectory privacy protection, the trajectory suppression approach [23] directly eliminates or conceals some sensitive locations in the trajectory. The privacy-preserving approaches discussed above require prior knowledge of the background information that the attacker possesses and do not offer quantitative privacy guarantees.
The privacy-preserving model named differential privacy was proposed by Dwork et al. [24] in 2006 and addresses both shortcomings of traditional privacy-preserving models: it has a strict mathematical definition and does not need to take the attacker’s prior information into account. The geographic indistinguishability mechanism was proposed by Andrés et al. [9]. To increase time efficiency, Hua et al. [25] suggested a location generalization approach based on differential privacy utilizing a K-means clustering algorithm. In order to satisfy differential privacy and safeguard potential social relationships between two user trajectories, Ou et al. [26] presented an N-body Laplacian architecture. Yang et al. [27] presented a quadtree indexing technique for perturbing trajectory data through location generalization and local differential privacy strategies. The approach considers the correlation between neighboring spatial and temporal nodes, while safeguarding the privacy of the user’s trajectory. Zheng et al. [28] presented a semantics-aware privacy-preserving technique for sharing location trajectory data. This method simultaneously safeguards both user data privacy and semantic privacy, achieving an improved balance between privacy and utility. Wu et al. [29] proposed a trajectory-related privacy protection mechanism that satisfies differential privacy. This method considers the trajectory association problem between multiple users and protects the trajectory correlation between them.

2.3. Spatial-Temporal Data Privacy Preservation

Trajectories exhibit high temporal and spatial correlation, and users’ daily trajectories also usually show regular changes. It is simple for attackers to infer behavioral activity patterns if they have certain background information. Hence, safeguarding users’ spatial-temporal data is crucial. A predictive perturbation method based on the correlation between different locations was proposed by Chatzikokolakis et al. [30]. The technique predicts a new location based on the previous one, enabling continuous sharing of spatial-temporal data while protecting privacy. Xiao et al. [31] introduced the concept of δ -Location Set and suggested a novel location perturbation method, PIM, to assure the availability of data by lowering the magnitude of added noise. This was carried out according to the location correlation of spatial-temporal data. The idea of sequence indistinguishability was developed by Wang and Xu [32], who also put forward a plan for distributing correlated time series data based on differential privacy. This scheme prevents attackers from distinguishing between the noisy sequence and the original sequence. The ConTPL technique was suggested by Cao et al. [33] after they examined the privacy-leaking issue of conventional differential privacy under temporal correlation. Ghane et al. [34] introduced a novel trajectory generation algorithm called TGM. This algorithm retains both the spatial and temporal information of the trajectories generated within the dataset, while also ensuring the provision of a differential privacy guarantee.

3. Overview of the DP-STTC Scheme

3.1. Problem Definition

Definition 1
($\epsilon$-differential privacy [35]). The randomized algorithm $F$ satisfies $\epsilon$-differential privacy if, and only if, any possible output outcome $Y$ on adjacent datasets $D_1$ and $D_2$ satisfies:
$$\Pr[F(D_1) \in Y] \le \exp(\epsilon) \cdot \Pr[F(D_2) \in Y],$$
where $D_1$ and $D_2$ are adjacent datasets, which differ from each other by at most one record.
According to Definition 1, the global sensitivity of $F$ can be computed as:
$$\Delta F = \max_{D_1, D_2} \lVert F(D_1) - F(D_2) \rVert_1,$$
where $\lVert F(D_1) - F(D_2) \rVert_1$ is the first-order norm distance between $F(D_1)$ and $F(D_2)$.
Definition 2
($\delta$-Location Set [31]). Let $p_t^-[i] = \Pr(L_t = s_i \mid L_{t-1}^*, \ldots, L_1^*)$ be the prior probability of the user’s location at timestamp $t$. The $\delta$-Location Set (denoted as $\Delta X_t$) is a set containing the minimum number of locations that have a prior probability sum of no less than $1 - \delta$. The specific formula is as follows:
$$\Delta X_t = \min \left\{ s_i \;\middle|\; \sum_{s_i} p_t^-[i] \ge 1 - \delta \right\},$$
where $p_t^-[i]$ represents the $i$-th element in $p_t^-$, while $L$ and $L^*$ represent the actual location and the disturbed location, respectively. For example, if $p_t^- = [0.1, 0.5, 0.3, 0.02, 0.03, 0.05]$ corresponding to $[s_1, s_2, s_3, s_4, s_5, s_6]$, then $\Delta X_t = \{s_2, s_3\}$ when $\delta = 0.2$, and $\Delta X_t = \{s_2, s_3, s_1\}$ when $\delta = 0.1$.
According to the formula of the $\delta$-Location Set, the size of the location set $\Delta X_t$ decreases as the parameter $\delta$ increases: a larger $\delta$ yields a smaller $\Delta X_t$, and a smaller $\delta$ yields a larger $\Delta X_t$.
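The construction in Definition 2 can be illustrated with a short sketch. It assumes a greedy selection (take locations in decreasing order of prior probability until the cumulative mass reaches $1 - \delta$), which yields a minimum-size set; the function name `delta_location_set` and the tolerance `tol` are illustrative choices, not part of the original scheme.

```python
def delta_location_set(prior, delta, tol=1e-12):
    """Return indices of a minimum-size set whose prior mass is >= 1 - delta.

    prior: list of prior probabilities, one per candidate location s_1..s_m.
    delta: the delta parameter of the delta-Location Set.
    tol:   small tolerance (an assumption) to absorb floating-point rounding.
    """
    # Visit locations from most to least probable.
    order = sorted(range(len(prior)), key=lambda i: prior[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        if mass >= 1.0 - delta - tol:
            break
        chosen.append(i)
        mass += prior[i]
    return chosen


# Example from Definition 2 (indices are 0-based, so index 1 is s_2).
prior = [0.1, 0.5, 0.3, 0.02, 0.03, 0.05]
set_02 = delta_location_set(prior, 0.2)   # s_2, s_3
set_01 = delta_location_set(prior, 0.1)   # s_2, s_3, s_1
```

With the prior from Definition 2, the sketch reproduces the two sets given in the text, and it makes the monotonic relationship visible: lowering $\delta$ can only add locations to the set.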
Definition 3
(location points). Each trajectory location point $L$ is a four-tuple $(lon, lat, time, type)$, where $lon$ and $lat$ denote the longitude and latitude of the location, $time$ denotes the time of passing through the location point, and $type$ denotes the semantic description information corresponding to the location.
Definition 4
(trajectory sequence). The trajectory sequence $T$ is generated by linking the location data accessed by the user in chronological order throughout the day, and can be expressed as:
$$T = L_1 \rightarrow L_2 \rightarrow \cdots \rightarrow L_n$$
Spatiotemporal activity: a specific pattern of real-world behavior that a participant wishes to keep private, for example, “go to the gym every weekend” or “commute between work and home every 8 a.m. and 6 p.m.”. Let $S = \{s_1, s_2, \ldots, s_m\}$ stand for all possible spatial domains, where $m$ denotes the domain’s size and $s_i$ denotes a location within the domain. $L_t = s_i$ indicates that the participant is in location $s_i$ at time $t$. In the spatiotemporal activity protection phase, only the location of the location point with respect to time is considered. The exact definition is given below.
Definition 5
(spatiotemporal activity). A spatiotemporal activity consists of a single pair $(loc, time)$ or of multiple such pairs connected by the Boolean logic operators $AND$, $OR$, and $NOT$.
Spatiotemporal activities defined using Boolean expressions enable users to customize their privacy needs in a variety of real-world activities, enabling personalized privacy protection. Spatiotemporal activities often consist of sensitive areas that people need to protect in their daily lives. If a user is present in a sensitive area during a certain time period, it is referred to as a PRESENCE EVENT. If a user visits multiple sensitive areas in consecutive time periods, it is referred to as a PATTERN EVENT. The following are the precise definitions.
Definition 6
(PRESENCE EVENT [36]). If the user appears in the specified spatial region $S = \{s_1, s_2, \ldots, s_m\}$ at least once during the provided time period $T = \{t_1, t_2, \ldots, t_n\}$, the activity is labeled as $PRESENCE(S, T)$. For example, if $S = \{s_1, s_2\}$ and $T = \{1, 2\}$, then the PRESENCE EVENT can be expressed as $(L_1 = s_1) \lor (L_1 = s_2) \lor (L_2 = s_1) \lor (L_2 = s_2)$.
Definition 7
(PATTERN EVENT [36]). If the user appears successively in the specified spatial region $S = \{s_1, s_2, \ldots, s_m\}$ during the provided time period $T = \{t_1, t_2, \ldots, t_n\}$, the activity is labeled as $PATTERN(S, T)$. For example, if $S = \{s_1, s_2\}$ and $T = \{1, 2\}$, then the PATTERN EVENT can be expressed as $((L_1 = s_1) \lor (L_1 = s_2)) \land ((L_2 = s_1) \lor (L_2 = s_2))$.
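The two event types can be read as simple Boolean predicates over a trajectory. The sketch below is only an illustration under stated assumptions: a trajectory is represented as a dictionary from timestamp to location identifier, PATTERN is interpreted as "inside the region at every timestamp of the period", and the function names `presence` and `pattern` are hypothetical.

```python
def presence(trajectory, region, period):
    """PRESENCE(S, T): the user is inside `region` at least once during `period`.

    trajectory: dict mapping timestamp -> location id (assumed representation).
    region:     set of location ids (the sensitive region S).
    period:     iterable of timestamps (the time period T).
    """
    return any(trajectory.get(t) in region for t in period)


def pattern(trajectory, region, period):
    """PATTERN(S, T): the user is inside `region` at every timestamp in `period`."""
    return all(trajectory.get(t) in region for t in period)


# With S = {s1, s2} and T = {1, 2}, as in the examples of Definitions 6 and 7:
region, period = {"s1", "s2"}, [1, 2]
visit_once = {1: "s1", 2: "s3"}    # inside S only at t = 1
stay_inside = {1: "s2", 2: "s1"}   # inside S at t = 1 and t = 2
```

Here `visit_once` triggers the PRESENCE EVENT but not the PATTERN EVENT, while `stay_inside` triggers both.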
Definition 8
($\epsilon$-spatiotemporal activity privacy). According to the definition of differential privacy, spatiotemporal activity privacy can be defined as follows: a mechanism satisfies $\epsilon$-spatiotemporal activity privacy if, and only if, any observation $L_1^*, L_2^*, \ldots, L_t^*$ given at any moment $t \in \{1, 2, \ldots, T\}$ satisfies:
$$\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT) \le e^{\epsilon} \Pr(L_1^*, L_2^*, \ldots, L_t^* \mid\, !EVENT),$$
where $EVENT$ is the temporal activity template that the user needs to protect, $!EVENT$ denotes the negation of the activity, and $\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT)$ denotes the probability of publishing the locations $L_1^*, L_2^*, \ldots, L_t^*$ given the temporal activity template.

3.2. Scheme Design

The architecture of the DP-STTC primarily comprises the mobile client and server. The workflow of the DP-STTC is divided into four stages: acquiring a spatiotemporal activity template, protecting spatiotemporal activity, disturbed trajectory generation, and trajectory clustering. The comprehensive elaborations of the four stages are given as follows:
(1) Acquiring a spatiotemporal activity template
First, users customize the privacy information requiring protection on the LBSN server. The LBSN server then generates a personalized spatiotemporal activity template based on the users’ privacy preferences. The template encompasses one or more spatiotemporal activities that users specify for protection, such as their daily commute between home and the company at 8 a.m. and 5 p.m., or their weekly hospital visits every Monday.
(2) Protecting spatiotemporal activity
This stage protects the user’s spatiotemporal activity privacy. The fundamental concept is to transform the existing location privacy protection mechanism into a spatial-temporal activity privacy protection mechanism by adjusting the privacy level. The actual location data collected from the user’s mobile client serve as the original dataset. First, the location privacy protection mechanism processes the user’s actual location to generate a perturbed location. Then, the personalized spatial-temporal activity template is downloaded from the LBSN server, and the perturbed location is quantified by combining the observation sequence and the spatiotemporal activity template to determine whether $\epsilon$-spatiotemporal activity privacy is satisfied. If it is, the perturbed location can be published directly. Otherwise, the privacy parameters of the location privacy protection mechanism are adjusted and the location is perturbed again until the condition is satisfied.
(3) Disturbed trajectory generation
During this stage, the mobile client connects the perturbed locations chronologically to create a perturbed trajectory sequence, which is then uploaded to the LBSN server. This process enhances the protection of user privacy.
(4) Trajectory clustering
This stage is completed on the LBSN server. After the mobile client uploads the perturbed trajectory sequence, the server calculates the semantic distance between trajectories and groups the perturbed trajectory sequences into different clusters.
As depicted in Figure 1, in the mobile client, the user’s actual location data are first processed for spatial-temporal activity protection. This procedure encompasses several steps: processing the user’s actual location data using the location privacy protection mechanism, quantifying the result against the template downloaded from the server, and generating perturbed locations. Subsequently, the perturbed locations are connected to form the perturbed trajectory, which is then uploaded to the LBSN server. On the server side, user-defined information is utilized to create a protection template, which is then transmitted to the mobile client. The perturbed trajectory data uploaded from the mobile client are clustered on the server, and the resulting clusters are output. This process safeguards the user’s trajectory privacy and prevents attackers from acquiring the user’s genuine trajectory details both during the upload process and on the server side.

3.3. Attack Hypothesis

In general, the attacker’s goal is to steal the real location information of the targeted user. The attacker may be an untrusted third-party server, a participating user, or an external attacker, and may have rich background knowledge such as the user’s historical movement trajectories and movement preferences. Combined with the time information of each trajectory point, this knowledge allows the attacker to infer the user’s home address, physical health condition, personal hobbies, social relationships, and other sensitive personal information.

4. Models and Algorithms

4.1. Mobile Inference Models and δ -Location Set

Protecting spatial-temporal activities first requires a location privacy protection mechanism that provides general protection for users’ location privacy against unknown risks. Many existing location privacy-preserving algorithms do not consider the temporal correlation of mobile users’ locations. Spatial-temporal data, however, are highly correlated in both space and time and are vulnerable to various inference attacks; therefore, the $\delta$-Location Set is chosen as the location privacy protection mechanism. The $\delta$-Location Set is based on a mobile inference model that infers the set of all probable locations of the user at each moment by exploiting the temporal correlation between the user’s successive locations. The key idea of the $\delta$-Location Set is to hide the user’s true location among the probable locations in the set, where any pair of locations in the set is indistinguishable.
The hidden Markov model (HMM) contains multiple mutually independent hidden states (true locations) and explicit states (perturbed locations), and its three important parameters are the initial state probability, the transfer matrix, and the emission matrix. We use a vector, $p_t$, to denote the probability distribution of a user’s location. Formally, $p_t[i] = \Pr(L_t = s_i)$. Given the probability vector $p_{t-1}$, the probability at timestamp $t$ becomes $p_t = p_{t-1} M$, where the matrix $M$ represents the probability that the user moves from one location to another. If, given an actual location $L_t$, the mechanism releases a perturbed location $L_t^*$, the emission probability can be expressed as $\Pr(L_t^* \mid L_t = s_i)$. The mobile inference model is obtained through HMM inference, where $p_t^-$ denotes the prior probability of the location and $p_t^+$ denotes the posterior probability of the location. At any moment $t$, the prior probability can be obtained from the posterior probability at moment $t-1$, i.e., $p_t^- = p_{t-1}^+ M$. According to Bayes’ rule, given a perturbed location $L_t^*$, the posterior probability can be obtained by calculating:
$$p_t^+[i] = \Pr(L_t = s_i \mid L_t^*) = \frac{\Pr(L_t^* \mid L_t = s_i)\, p_t^-[i]}{\sum_j \Pr(L_t^* \mid L_t = s_j)\, p_t^-[j]}$$
At any moment, this derivation process can be efficiently implemented through the forward–backward algorithm of the HMM.
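One step of this inference (prediction with the transfer matrix, then the Bayes update) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy two-location numbers are assumptions, and matrices are plain nested lists to keep the example dependency-free.

```python
def predict_prior(posterior_prev, M):
    """Prior at time t: p_t^- = p_{t-1}^+ M (row vector times transfer matrix)."""
    n = len(M)
    return [sum(posterior_prev[i] * M[i][j] for i in range(n)) for j in range(n)]


def update_posterior(prior, emission):
    """Posterior at time t via Bayes' rule.

    emission[i] = Pr(L_t^* = observed | L_t = s_i) for the released location.
    """
    joint = [e * p for e, p in zip(emission, prior)]
    total = sum(joint)
    if total == 0:
        raise ValueError("observed location has zero probability under the prior")
    return [x / total for x in joint]


# Toy example with two locations (all numbers are illustrative assumptions).
M = [[0.7, 0.3],
     [0.4, 0.6]]
prior = predict_prior([1.0, 0.0], M)            # user was surely at s_1 before
posterior = update_posterior(prior, [0.9, 0.2])  # likelihood of the release
```

The posterior computed at time $t$ becomes the input to `predict_prior` at time $t+1$, which is how the temporal correlation between successive locations is carried forward.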

4.2. Spatial-Temporal Activity Privacy Protection Mechanism

An improved location privacy protection model is proposed, in which the location privacy protection mechanism (LPPM) is converted into a spatial-temporal activity privacy protection mechanism by adjusting its privacy level. Figure 2 depicts the protection structure, which primarily consists of two components: the quantification mechanism and the location privacy protection mechanism. First, the LPPM perturbs the real location at each time point into a perturbed location $L_t^*$, which is passed to the quantification mechanism. The quantification mechanism then checks whether $L_t^*$ satisfies Definition 8 by combining the observation sequence with the EVENT template of the spatial-temporal activities. If the condition is satisfied, the perturbed location can be published directly; if not, the privacy parameters of the location privacy protection mechanism need to be adjusted to generate a new perturbed location.
The Laplace mechanism is used to add the noise for disturbed location generation. In the DP-STTC scheme, the location privacy protection is performed in the time domain. For the real time t i , the disturbed time t i * can be computed as:
$$t_i^* = t_i + Lap\left(\frac{\Delta F}{\epsilon_i}\right),$$
where $\frac{\Delta F}{\epsilon_i}$ is the scale parameter of the Laplace distribution and $\epsilon_i$ is the privacy budget, which may differ between users. The privacy budget can therefore satisfy different demands for privacy protection, which supports the custom privacy preferences of the target user.
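The time perturbation above can be sketched in a few lines. The example assumes a zero-mean Laplace distribution with scale $\Delta F / \epsilon_i$ and draws it as the difference of two independent exponential variables, a standard sampling identity; the function names are illustrative and do not come from the paper.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_time(t, sensitivity, epsilon, rng=random):
    """Return t* = t + Lap(sensitivity / epsilon)."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return t + laplace_noise(sensitivity / epsilon, rng)


# A smaller privacy budget gives a larger noise scale, hence stronger perturbation.
rng = random.Random(0)
strict = perturb_time(480.0, sensitivity=1.0, epsilon=0.1, rng=rng)
loose = perturb_time(480.0, sensitivity=1.0, epsilon=10.0, rng=rng)
```

Passing a seeded `random.Random` instance keeps experiments reproducible, while the default uses the module-level generator.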
To ensure user privacy and data effectiveness, we prove that the Laplace noise used in the DP-STTC scheme satisfies differential privacy.
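Proof
(Laplace noise satisfies $\epsilon_i$-personalized local differential privacy). In the Laplace mechanism, we let $\lambda = \frac{\Delta F}{\epsilon_i}$. For any $t$, and any two numerical values $t_i, t_i^* \in D$, the following inequality holds:
$$\frac{Lap(t \mid t_i)}{Lap(t \mid t_i^*)} = \frac{Lap(t - t_i)}{Lap(t - t_i^*)} = e^{\frac{|t - t_i^*| - |t - t_i|}{\lambda}} \le e^{\frac{|t_i - t_i^*|}{\lambda}} \le e^{\epsilon_i},$$
where the first inequality follows from the triangle inequality and the second from $|t_i - t_i^*| \le \Delta F$.
Therefore, the Laplace noise satisfies $\epsilon_i$-personalized local differential privacy. □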
Building on previous research [36], an improved spatial-temporal activity privacy protection algorithm is proposed in this paper. The privacy parameters can be adjusted according to a user-defined template. If the generated disturbed location does not satisfy $\epsilon$-spatiotemporal activity privacy, the disturbed location is regenerated according to the user's demand. Algorithm 1 illustrates the procedure of improved spatial-temporal activity privacy protection.
Algorithm 1. Improved Spatial-Temporal Activity Privacy Protection Algorithm
Input: Actual location $L_t$, $\epsilon$, LPPM, $M$, $EVENT$
Output: Disturbed location $L_t^*$
1: if $t \in \{1, 2, \ldots, T\}$ then
2:   for $t$ in $\{1, 2, \ldots, T\}$ do
3:     generate disturbed location $L_t^*$ from real location with LPPM;
4:     while $\epsilon$-spatiotemporal activity privacy is not satisfied do
5:       adjust privacy parameters and generate $L_t^*$;
6:     end while
7:     release $L_t^*$;
8:   end for
9: else
10:   generate disturbed location $L_t^*$ from real location with LPPM;
11:   release $L_t^*$;
12: end if
As shown in Algorithm 1, line 3 generates the disturbed location corresponding to the real location using the LPPM. The LPPM works as follows: the Markov model is first used to generate the $\delta$-location set, which hides the real location among the probable locations. Then, the Laplace mechanism is applied to perturb the $\delta$-location set. Finally, the perturbed location $L_t^*$ is generated from the perturbed $\delta$-location set. Line 5 adjusts the privacy parameters when the perturbed location fails to satisfy $\epsilon$-spatiotemporal activity privacy. The adjustment strategy is an exponential decay of the privacy parameters, since smaller values correspond to stronger protection of location privacy and reduced privacy leakage. In DP-STTC, the attenuation rate is set to $\frac{1}{2}$. A smaller attenuation rate speeds up convergence but may introduce excessive perturbation, whereas a larger rate better preserves data availability.
For example, the given privacy parameter is α . First, the perturbed location is generated using the LPPM. Then, if it satisfies the ϵ -spatiotemporal activity privacy, it is directly disclosed. However, if it fails to satisfy the demand, the privacy budget α is halved. A new perturbed location is generated. The entire process is then reiterated. From Algorithm 1, we can see that the spatial-temporal activity privacy protection and the location privacy protection are orthogonal. Location privacy protection can provide users with general protection against unknown risks. The privacy protection of spatial-temporal activities can provide users with protection for spatial-temporal activities. When the real location needs to be protected, the used privacy protection mechanism should satisfy ϵ -differential privacy, which is defined based on the concept of differential privacy. Similar to the definition that precludes differentiation between any two adjacent datasets, when the perturbed location that satisfies the ϵ -spatiotemporal activity privacy is published, potential attackers are rendered incapable of distinguishing the authenticity of the spatial-temporal activity. This comprehensive process maintains the principles of differential privacy.
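The perturb-check-halve loop described above can be sketched as follows. This is a hedged illustration rather than the authors' code: `lppm` and `satisfies_privacy` are hypothetical callables standing in for the location privacy protection mechanism and the quantification step, and `max_rounds` is an added safeguard that does not appear in Algorithm 1.

```python
def protect_location(true_location, lppm, satisfies_privacy, alpha,
                     decay=0.5, max_rounds=50):
    """Perturb a location, shrinking the privacy parameter until the check passes.

    lppm(true_location, alpha)          -> perturbed location (hypothetical LPPM).
    satisfies_privacy(candidate, alpha) -> True if epsilon-spatiotemporal
                                           activity privacy holds (hypothetical).
    decay: attenuation rate of the privacy parameter (1/2 in DP-STTC).
    """
    for _ in range(max_rounds):
        candidate = lppm(true_location, alpha)
        if satisfies_privacy(candidate, alpha):
            return candidate, alpha       # release the perturbed location
        alpha *= decay                    # stronger protection on the next try
    raise RuntimeError("privacy condition not met within max_rounds")


# Toy stand-ins: an identity "mechanism" and a check that passes once alpha <= 0.25.
released, final_alpha = protect_location(
    "s3",
    lppm=lambda loc, alpha: loc,
    satisfies_privacy=lambda candidate, alpha: alpha <= 0.25,
    alpha=1.0,
)
```

In the toy run the parameter is halved twice (1.0, 0.5, 0.25) before the location is released, mirroring the example in the text.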

4.3. Spatial-Temporal Activity Privacy Protection Mechanism

In this paper, we only discuss two types of spatial-temporal activity templates: $PRESENCE$ and $PATTERN$. We assume that the template is defined over continuous time, with $start$ and $end$ denoting the beginning and end of the activity, respectively. Quantification of $\epsilon$-spatiotemporal activity privacy determines whether the perturbed location $L_t^*$ generated through the location privacy protection mechanism satisfies the inequality $\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT) \le e^{\epsilon} \Pr(L_1^*, L_2^*, \ldots, L_t^* \mid\, !EVENT)$. Given that the input at moment $t$ is the user’s real location $L_t$ and the output is the emission matrix of the location privacy protection mechanism for the disturbed location $L_t^*$, quantifying $\epsilon$-spatiotemporal activity privacy is equivalent to computing the maximum of the ratio $\frac{\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT)}{\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid\, !EVENT)}$ over all $L_1^*, L_2^*, \ldots, L_t^*$. If the correlation between $L_1^*, L_2^*, \ldots, L_t^*$ and the actual locations is unspecified, calculating this ratio directly from the emission matrix is challenging, because the relationship between the disturbed location $L_t^*$ and EVENT cannot be determined and the tuples in the EVENT may not be independent.
If the observed $L_1^*, L_2^*, \ldots, L_t^*$ and the prior probability $P$ are provided, $\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT)$ can be computed using Equation (6).
$$\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT) = \frac{\Pr(L_1^*, L_2^*, \ldots, L_t^*, EVENT)}{\Pr(EVENT)},$$
where $\Pr(EVENT)$ is the prior probability of the activity and $\Pr(L_1^*, L_2^*, \ldots, L_t^*, EVENT)$ is the joint probability of the observations and the activity. Equation (6) can be derived from the conditional probability formula.
The quantification above assumes that the prior probability $P$ and the observations $L_1^*, L_2^*, \ldots, L_t^*$ are given, which is a limitation. Next, arbitrary prior probabilities are considered, so that the guarantee holds regardless of the attacker’s prior knowledge of the user.
By treating the prior probability $P$ as a variable, the $\epsilon$-spatiotemporal activity privacy quantification problem can be transformed into the problem of computing the maximum value of
$$\frac{\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid EVENT)}{\Pr(L_1^*, L_2^*, \ldots, L_t^* \mid\, !EVENT)} - e^{\epsilon},$$
and requiring that this maximum never exceed 0. According to the research of Cao et al. [37], in order to ensure that $\epsilon$-spatiotemporal activity privacy is fulfilled at any time, the published disturbed location $L_i^*$ needs to satisfy the following inequalities:
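$$P\,[1,0]\left((e^{\epsilon} - 1)\,a^{T}b - e^{\epsilon}\,a^{T}c\right)[1,0]^{T}P^{T} + P\,[1,0]\,b^{T} \le 0,$$
$$P\,[1,0]\left((e^{\epsilon} - 1)\,a^{T}b + a^{T}c\right)[1,0]^{T}P^{T} - P\,[1,0]\,e^{\epsilon}\,b^{T} \le 0,$$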
where
$$a = \prod_{i=1}^{k-1} M_i\,[0,1]^{T}.$$
For $t \le k$:
$$b^{T} = P_{L_1^*} \prod_{i=2}^{t}\left(M_{i-1} P_{L_i^*}\right) \prod_{i=t}^{k-1} M_i\,[0,1]^{T}, \qquad c^{T} = P_{L_1^*} \prod_{i=2}^{t}\left(M_{i-1} P_{L_i^*}\right)[1,1]^{T}.$$
For $t > k$:
$$b^{T} = P_{L_1^*} \prod_{i=2}^{t}\left(M_{i-1} P_{L_i^*}\right)[1,1] \prod_{i=t-1}^{k}\left(P_{L_{i+1}^*} M_i^{T}\right)[0,1]^{T},$$
$$c^{T} = P_{L_1^*} \prod_{i=2}^{k}\left(M_{i-1} P_{L_i^*}\right)[1,1] \prod_{i=t-1}^{k}\left(P_{L_{i+1}^*} M_i^{T}\right)[1,1]^{T},$$
where $a$ is the prior probability of the activity, $b$ is the joint probability of the activity, $c$ is the emission probability given observation $L_t^*$, and $k$ is the last location of the trajectory sequence.
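The structure of the two inequalities can be checked with a short scalar derivation. The symbols $A$, $B$, and $C$ below are introduced here purely for illustration and are assumptions: $A = \Pr(EVENT)$, $B = \Pr(L_1^*, \ldots, L_t^*, EVENT)$, and $C = \Pr(L_1^*, \ldots, L_t^*)$, which play the roles that the text assigns to $a$, $b$, and $c$ once the prior $P$ is applied. With $\Pr(L_1^*, \ldots, L_t^* \mid EVENT) = B/A$ and $\Pr(L_1^*, \ldots, L_t^* \mid\, !EVENT) = (C - B)/(1 - A)$:

$$\frac{B}{A} \le e^{\epsilon}\,\frac{C - B}{1 - A} \;\Longleftrightarrow\; B(1 - A) - e^{\epsilon}A(C - B) \le 0 \;\Longleftrightarrow\; (e^{\epsilon} - 1)AB - e^{\epsilon}AC + B \le 0,$$

$$\frac{C - B}{1 - A} \le e^{\epsilon}\,\frac{B}{A} \;\Longleftrightarrow\; A(C - B) - e^{\epsilon}B(1 - A) \le 0 \;\Longleftrightarrow\; (e^{\epsilon} - 1)AB + AC - e^{\epsilon}B \le 0.$$

The first line has the same pattern of terms as the first matrix inequality (a quadratic part in $ab$ and $ac$ plus a linear part in $b$), and the second line matches the second inequality, which bounds the ratio in the opposite direction.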
The maximum value problem can be solved using the quadratic programming approach, because the quadratic term can be written in the symmetric form $P A P^{T} = \frac{1}{2} P \left(A + A^{T}\right) P^{T}$, where $A$ is a matrix.

4.4. Disturbed Trajectory Generation

This process generates a sequence of perturbed trajectories by connecting the location points after perturbation in the mobile client in the chronological order of the day. Algorithm 2 displays the perturbed trajectory-generating algorithm.
Algorithm 2. Disturbed Trajectory Generation
Input: Disturbed location $L_t$
Output: Disturbed trajectory $T$
1: Initialize $T$; Location collection $L$
2: add $L_t$ into Location collection $L$;
3: sort $L$ in chronological order throughout the day;
4: for $L_t$ in $L$ do
5:   add $L_t$ into $T$;
6: end for
7: Return $T$
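The steps of Algorithm 2 can be sketched in a few lines of Python. This is an illustration only; the record layout (seconds since midnight paired with a grid-cell id) and the function name are assumptions, not part of the original scheme.

```python
from typing import List, Tuple

# Hypothetical record layout: (seconds since midnight, grid-cell id).
PerturbedLocation = Tuple[int, int]


def build_disturbed_trajectory(
    locations: List[PerturbedLocation],
) -> List[PerturbedLocation]:
    """Return the perturbed locations ordered by time of day."""
    # sorted() is stable, so locations that share a timestamp keep
    # the order in which they were collected.
    return sorted(locations, key=lambda loc: loc[0])


# Three perturbed locations collected out of order during one day.
day = [(61200, 17), (28800, 203), (43200, 88)]
trajectory = build_disturbed_trajectory(day)
```

Sorting a copy rather than the input list leaves the collected locations untouched, which is convenient if the client needs them again.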

4.5. Trajectory Clustering

After receiving the perturbed trajectory sequences uploaded by the mobile client, the LBSN server estimates the distance between each pair of trajectories by taking their semantic distance into account, and then groups the trajectories into several clusters.
According to our previous research [19], the semantic distance between trajectory sequences can be calculated using the following formula:
$$SEM\left(T_i^*, T_j^*\right) = \frac{\left|T_i^* \cap T_j^*\right|}{\left|T_i^*\right| + \left|T_j^*\right|},$$
where $|T^*|$ represents the number of trajectory sequence segments, and $|T_i^* \cap T_j^*|$ represents the number of common trajectory sequence segments between $T_i^*$ and $T_j^*$. If two trajectory sequences have more common sub-sequences in semantic space, their semantic distance is reduced.
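To make the ratio concrete, the sketch below computes it for two short semantic trajectories. It assumes that a "segment" is a pair of consecutive semantic labels and that repeated segments count once; both choices, along with the function names and sample labels, are illustrative assumptions rather than the definition used in [19].

```python
from typing import List, Set, Tuple


def segments(labels: List[str]) -> Set[Tuple[str, str]]:
    """Split a semantic trajectory into its set of consecutive label pairs."""
    return set(zip(labels, labels[1:]))


def sem_ratio(traj_i: List[str], traj_j: List[str]) -> float:
    """Shared segments divided by the summed segment counts, as in the formula."""
    seg_i, seg_j = segments(traj_i), segments(traj_j)
    total = len(seg_i) + len(seg_j)
    if total == 0:
        return 0.0  # neither trajectory contains a segment
    return len(seg_i & seg_j) / total


a = ["home", "cafe", "office", "gym"]
b = ["home", "cafe", "office", "bar"]
ratio = sem_ratio(a, b)  # 2 shared segments out of 3 + 3
```

The sketch returns the ratio exactly as written: it reaches 0.5 for identical segment sets and 0 when nothing is shared. How that value is oriented and thresholded in a clustering step is left to the caller.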
Algorithm 3 depicts the trajectory clustering process.
Algorithm 3. Trajectory Clustering
Input: Trajectory sequences set $T = \{T_1, T_2, \ldots, T_n\}$
Output: Trajectory clusters set $C = \{C_1, C_2, \ldots, C_k\}$
1: Define $len$ as the length of a trajectory sequence and $l$ as the number of common sub-sequences between two trajectories
2: $l = 0$;
3: for $i$ in $\{1, 2, \ldots, n\}$ do
4:   for $j$ in $\{1, 2, \ldots, n-1\}$ do
5:     $m = \min\left(len(T_i), len(T_j)\right)$;
6:     for $j$ in $\{1, 2, \ldots, m-1\}$ do
7:       $l = l + 1$;
8:     end for
9:     $SEM = \frac{l}{len(T_i) + len(T_j)}$
10:    if $i == 1$ then
11:      Put $T_1$ into $C_1$;
12:    else if $\min_i SEM(T_i, C_k) \le \theta_t$ then
13:      Put $T_i$ into $C_k$;
14:    else if $\min_i SEM(T_i, C_k) > \theta_t$ then
15:      Create $C_{k+1}$, Put $T_i$ into $C_{k+1}$;
16:    end if
17:   end for
18: end for
19: Return $C$
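The assignment rule in Algorithm 3 (the first trajectory seeds a cluster; each later trajectory joins its nearest cluster if the distance is within the threshold, otherwise it starts a new one) can be sketched as follows. This is not the authors' code. The distance is passed in as a function, the trajectory-to-cluster distance is assumed to be the distance to the nearest member, and `jaccard_distance` is a hypothetical stand-in used only to make the example run.

```python
from typing import Callable, List, Sequence


def cluster_trajectories(
    trajectories: Sequence[Sequence[str]],
    distance: Callable[[Sequence[str], Sequence[str]], float],
    theta: float,
) -> List[List[int]]:
    """Greedy threshold clustering; clusters are lists of trajectory indices."""
    clusters: List[List[int]] = []
    for idx, traj in enumerate(trajectories):
        best_cluster, best_dist = None, float("inf")
        for cluster in clusters:
            # Assumed trajectory-to-cluster distance: nearest member (single link).
            d = min(distance(traj, trajectories[m]) for m in cluster)
            if d < best_dist:
                best_cluster, best_dist = cluster, d
        if best_cluster is not None and best_dist <= theta:
            best_cluster.append(idx)  # close enough: join the nearest cluster
        else:
            clusters.append([idx])  # first trajectory, or too far from all clusters
    return clusters


def jaccard_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical stand-in distance over label sets; not the paper's SEM."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0


trajs = [
    ["home", "cafe", "office"],
    ["home", "cafe", "gym"],
    ["park", "museum", "bar"],
]
result = cluster_trajectories(trajs, jaccard_distance, theta=0.6)
```

Because the rule is greedy, the result depends on the order in which trajectories arrive; a different order can produce different clusters for the same threshold.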

5. Performance Evaluation

5.1. Experimental Setup

In this study, MATLAB is used to analyze the location data and evaluate the effectiveness of the DP-STTC scheme. The initial user trajectory data are obtained from the GPS trajectory dataset of the GeoLife project at Microsoft Research Asia [38]. First, because the experiment considers only location data within Beijing, the data points with longitudes ranging from 115.4 to 117.6 and latitudes ranging from 39.4 to 41.1 are selected. Second, this area is divided into 20 × 20 location units of equal size to create the template for computing disturbed locations. Finally, the Beijing POI dataset is used to tag the semantic information of each location according to the TF-IDF model and our previous study [39]. This dataset is categorized into 20 service types and covers the location data of the majority of Beijing’s points of interest. Figure 3 illustrates the generated locations and their corresponding semantic tags for a specific user. The user has visited seven different types of semantic locations within the trajectory sequence, and this information can be used for trajectory clustering.
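The filtering and gridding steps can be illustrated with a small helper that maps a GPS point to one of the 20 × 20 location units. The bounding box and grid size come from the setup above; the row-major cell numbering and the function name are assumptions made for this sketch.

```python
LON_MIN, LON_MAX = 115.4, 117.6
LAT_MIN, LAT_MAX = 39.4, 41.1
GRID = 20  # 20 x 20 location units


def to_cell(lon, lat):
    """Map a point to a cell id in 0..399, or None if outside the study area."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        return None
    # Clamp so that points on the upper boundary fall into the last row/column.
    col = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * GRID), GRID - 1)
    row = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * GRID), GRID - 1)
    return row * GRID + col  # row-major numbering (an assumption)


inside = to_cell(116.4, 39.9)
outside = to_cell(120.0, 40.0)
```

Points outside the bounding box return `None`, which mirrors the step of discarding data that does not lie within the selected Beijing area.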
The training and testing sets were created from the GPS trajectory dataset. The training set includes 80% of the location data and is used to construct a global prior, taken as the average of the individual prior probabilities of all users visiting the area. The obtained average is then used in the post-mapping mechanism to obtain the location with the optimal service quality loss. The testing set includes the remaining 20% of the location data and is used to evaluate the mechanisms: a user-specific prior is constructed for at least 20 users, and the service quality loss of the mechanism is measured when users use their own prior.
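One way to read "the average of the individual prior probabilities" is to build an empirical visit distribution per user and then average those distributions cell by cell. The sketch below follows that reading; the data layout (a list of visited cell ids per user), the function names, and the toy values are assumptions.

```python
from collections import Counter
from typing import Dict, List


def user_prior(visits: List[int], n_cells: int) -> List[float]:
    """Empirical visit distribution of one user over n_cells grid cells."""
    counts = Counter(visits)
    total = len(visits)
    return [counts[c] / total for c in range(n_cells)]


def global_prior(visits_by_user: Dict[str, List[int]], n_cells: int) -> List[float]:
    """Average of the per-user priors, one probability per cell."""
    priors = [user_prior(v, n_cells) for v in visits_by_user.values() if v]
    if not priors:
        raise ValueError("at least one user with visits is required")
    return [sum(p[c] for p in priors) / len(priors) for c in range(n_cells)]


# Toy training data over 4 cells; real data would use the 400 grid cells.
training = {"u1": [0, 0, 1, 2], "u2": [2, 2, 3, 3]}
prior = global_prior(training, n_cells=4)
```

Averaging per-user distributions gives every user equal weight regardless of how many points they contributed, which differs from pooling all points into a single histogram.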
Because the experimental methods for $PATTERN$ and $PRESENCE$ events are the same, only the $PRESENCE$ event is used in the simulation experiments in this section. The event is specified by the parameters $S$ and $T$. For example, $PRESENCE(S = 1{:}10,\ T = 4{:}8)$ indicates that the target user has visited the region $\{s_1, s_2, \ldots, s_{10}\}$ during the time $\{4, 5, \ldots, 8\}$.
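A PRESENCE event of this kind can be evaluated against a true trajectory with a simple membership test. The sketch assumes the event holds when the user is in any of the listed regions at some timestamp in the window, and it represents a trajectory as a mapping from timestamp to region index; both are illustrative assumptions.

```python
from typing import Dict, Iterable


def presence(
    trajectory: Dict[int, int],
    regions: Iterable[int],
    times: Iterable[int],
) -> bool:
    """True if the user is in any of `regions` at some timestamp in `times`."""
    region_set = set(regions)
    # Timestamps with no recorded location yield None, which never matches.
    return any(trajectory.get(t) in region_set for t in times)


# Hypothetical trajectory: timestamp -> region index.
traj = {1: 42, 2: 37, 3: 15, 4: 12, 5: 7, 6: 30}
event_holds = presence(traj, regions=range(1, 11), times=range(4, 9))
event_early = presence(traj, regions=range(1, 11), times=range(1, 4))
```

Here the user is in region 7 at time 5, so the event over S = 1:10 and T = 4:8 holds, while the same regions over times 1 to 3 do not.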

5.2. Evaluation Metrics

The evaluation metrics for trajectory clustering include precision, recall, and the F1-score. Because the GeoLife dataset offers no way to group users into distinct clusters, we use location data with semantic tagging, as established in our previous research [40,41]. The experiments compare trajectory clustering performance against ILP [18], BU [42], and SP-tree [43], and privacy protection performance against DP-LTOD [19], N-gram [44], and NPT [45].
The ILP, BU, and SP-tree schemes are used to compare trajectory clustering performance with our proposed DP-STTC scheme. The ILP scheme mainly clusters the trajectories according to the labeled location information. The BU scheme selects suitable travel partners for users based on the trajectory sequences they upload. The SP-tree scheme identifies users with similar mobile behaviors and clusters them into the same community. Furthermore, the DP-LTOD, N-gram, and NPT schemes are used to compare privacy protection performance with our proposed DP-STTC scheme. The DP-LTOD scheme mainly protects user privacy by considering the generalized trajectory segment. The N-gram scheme generates anonymous trajectory sequences using variable-length N-grams. The NPT scheme generates generalized trajectory sequences using noisy prefix trees. Precision, recall, and F1-score are used to evaluate trajectory clustering performance, and relative error is used to evaluate privacy protection performance.
(1) Precision
The precision of a trajectory clustering scheme can be computed as:
$$Precision@k = \frac{|A \cap B|}{k},$$
where $B$ stands for the set of clustered trajectories in the testing set and $A$ stands for the set of clusters in the training set. Parameter $k$ denotes the number of clusters.
(2) Recall
The recall of a trajectory clustering scheme can be calculated using the following formula:
$$Recall@k = \frac{|A \cap B|}{|B|},$$
where $B$ denotes the set of positive clusters in the training set and $|B|$ is the number of positive clusters.
(3) F1-Score
To comprehensively evaluate the stability of trajectory clustering schemes, the F1-score is calculated as follows:
$$F1(A, B) = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}.$$
(4) Relative Error
The usefulness of a counting query $Q$ over the set of obfuscated trajectories $T^*$ and the set of original trajectories $T$ is assessed using the relative error (RE), which is calculated as:
$$RE = \frac{\left|Q(T^*) - Q(T)\right|}{\max\{Q(T), s\}},$$
where $s$ denotes the sanity bound, which mitigates the influence of queries with negligibly small selectivity.
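The four metrics above translate directly into code. The sketch below treats $A$ and $B$ as Python sets of item identifiers; the function names and the sample values are hypothetical and are not taken from the experiments.

```python
from typing import Set


def precision_at_k(a: Set[str], b: Set[str], k: int) -> float:
    """|A intersect B| divided by the number of clusters k."""
    return len(a & b) / k


def recall_at_k(a: Set[str], b: Set[str]) -> float:
    """|A intersect B| divided by |B|."""
    return len(a & b) / len(b)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_error(q_obfuscated: float, q_original: float, sanity_bound: float) -> float:
    """|Q(T*) - Q(T)| divided by max(Q(T), s)."""
    return abs(q_obfuscated - q_original) / max(q_original, sanity_bound)


# Illustrative values only.
A = {"c1", "c2", "c3", "c4"}
B = {"c2", "c3", "c5"}
p = precision_at_k(A, B, k=4)
r = recall_at_k(A, B)
f1 = f1_score(p, r)
re = relative_error(90, 100, sanity_bound=10)
```

The sanity bound only matters when the true count is small: for a query whose original answer is 2, a bound of 10 caps the denominator at 10 so that a small absolute error is not reported as a very large relative error.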

5.3. Experimental Results and Performance Analysis

The proposed DP-STTC scheme is compared with different schemes using evaluation criteria in the following domains: the impact of percentages of training data, the impact of the number of discovered clusters, the impact of the number of positive clusters, and the impact of the privacy budget.
(1) Effect of percentages of training data
As a precaution to safeguard users’ personal privacy on LBSNs, certain sensitive location data within the trajectory sequence remain unpublished until users upload authentic trajectory data. Consequently, the LBSN server receives limited data. Figure 4 compares the precision, recall, and F1-score of the trajectory clustering schemes in terms of the proportion of training data to the total data. Figure 4a–c shows that the proposed DP-STTC scheme outperforms the other three schemes. To determine travel companions, the BU scheme uses density-based clustering, which takes into account factors such as size, distance, duration, and density. However, if the trajectory data uploaded by the user are sparse, the BU scheme cannot determine suitable partners for the target user, which degrades its performance. The SP-tree scheme uses a continuous probability tree to build the user model and mine trajectory clusters based on mobile behavior. However, because it considers only the user’s geographic location sequence when building the user model, the SP-tree scheme cannot cluster trajectories effectively when the trajectory data are sparse or partial. The DP-STTC scheme takes into account both the semantic type of each location and the geographic distance between location sequences. By mining user movement patterns, it classifies trajectories into distinct clusters in which users have similar interests or preferences. Since the DP-STTC and ILP schemes both take the semantic information of locations into account, they retain good trajectory clustering performance under data sparsity. However, the ILP scheme does not consider user movement patterns, so it cannot cluster users with similar characteristics well, resulting in lower precision and recall than those of the DP-STTC scheme.
(2) Effect of the number of discovered clusters
In trajectory clustering, the number of clusters found reflects the granularity of the clustering scheme. Coarse-grained tasks find users with similar movement behaviors. Fine-grained tasks can further determine users with similar interests or preferences (e.g., behavior patterns, lifestyle habits, etc.). For LBSNs, it is of great significance to determine potential friends with similar interests or preferences through fine-grained trajectory clustering methods [46]. Figure 5 compares the precision, recall, and F1-score of the trajectory clustering schemes in terms of the number of clusters found. As depicted in Figure 5a–c, the precision of the DP-STTC, ILP, BU, and SP-tree schemes gradually declines while the recall steadily increases as the number of discovered clusters grows. However, the proposed DP-STTC scheme performs better overall than the other three schemes. The reason is that the DP-STTC scheme mines users’ interests or preferences from the perspective of semantic space, which is better suited to fine-grained trajectory clustering tasks. Although the ILP scheme also considers the semantic information of locations, it does not consider user movement patterns, resulting in lower clustering accuracy than that of the DP-STTC scheme. The experimental findings also show that the SP-tree scheme outperforms the BU scheme. The primary reason is that the SP-tree scheme uses continuous probability trees to construct user models. These trees can mine the frequent activity patterns of users in geographical space to identify users with similar mobile behaviors.
(3) Effect of the number of positive clusters
In trajectory clustering, positive clusters are clusters that already exist in the training set; that is, we already know that a certain number of users belong to several different communities before conducting trajectory clustering. Positive clusters help improve the precision of trajectory clustering but reduce the recall, because the server produces higher false negative rates as more positive clusters are included in the training set. Figure 6 compares the precision, recall, and F1-score of the trajectory clustering schemes in terms of the number of positive clusters. As depicted in Figure 6a–c, as the number of positive clusters increases, the precision of the DP-STTC, ILP, BU, and SP-tree schemes gradually increases, while the recall gradually decreases. The proposed DP-STTC scheme performs better overall than the other three schemes. As discussed above, the DP-STTC scheme considers both the geographic and semantic information of locations to mine users with similar interests or preferences and cluster them. Compared with the ILP scheme, the BU, SP-tree, and DP-STTC schemes consider more factors to achieve better accuracy. Additionally, Figure 6c shows that the F1-score of the DP-STTC scheme does not fluctuate significantly as the number of positive clusters rises, demonstrating that the DP-STTC scheme has higher stability.
(4) Effect of the privacy budget
To validate the privacy performance of the DP-STTC, DP-LTOD, N-gram, and NPT schemes, this study evaluates the four privacy protection schemes under various privacy budgets ϵ. Figure 7 compares the precision, recall, F1-score, and average relative error of the trajectory privacy protection schemes in terms of the privacy budget ϵ. As ϵ increases, the trajectory clustering performance of the DP-STTC, DP-LTOD, N-gram, and NPT schemes continues to improve. The fundamental reason is that the privacy protection strength decreases with increasing ϵ; in other words, as less noise is added, the clustering performance improves. Moreover, under the same privacy protection strength, the DP-STTC scheme has better trajectory clustering performance than the other three schemes. The main reason is that the DP-STTC scheme establishes the user’s spatial-temporal template by considering the factors of time and space, and better mines the user’s preference characteristics to provide personalized privacy protection. Although the DP-LTOD scheme considers the semantic information of locations, it does not consider the influence of the time factor on user behavior preference. The N-gram scheme uses the variable-length N-gram model to extract sensitive information in the trajectory sequence, and then adaptively adds noise to each datum in the sequence. The NPT scheme constructs a prefix tree model for the initial trajectory sequence and subsequently introduces noise for each node within the prefix tree. Because the N-gram and NPT schemes do not take into account the semantic information of locations or the geographic distance between trajectory segments, the generated trajectory sequence differs from the original trajectory sequence in both geographic space and semantic space. By using the user movement pattern template, the proposed DP-STTC scheme ensures that the generalized trajectory sequence is similar to the original trajectory sequence, allowing the LBSN server to precisely mine users with similar interests or preferences from the generalized trajectory sequence at the trajectory clustering stage. In particular, Figure 7d shows that the DP-STTC scheme achieves a lower average relative error as the privacy budget ϵ increases.

6. Conclusions and Future Work

This paper addresses the issue of personalized trajectory privacy protection within the trajectory clustering process in LBSNs. We introduced a differential privacy-based spatial-temporal trajectory clustering (DP-STTC) scheme. First, the scheme transforms the existing location privacy protection mechanism into a spatial-temporal trajectory privacy protection mechanism by adjusting the privacy parameters while considering the user’s privacy preferences during the quantization process. Subsequently, we incorporated semantic distance during the trajectory clustering stage. Finally, we validated the proposed DP-STTC scheme using two real datasets. The results show that the scheme improves user privacy protection and trajectory clustering accuracy compared with previous approaches. Looking ahead, we plan to explore flexible and customizable location privacy protection schemes that mitigate the risk of privacy leakage while clustering user trajectories more effectively.

Author Contributions

Conceptualization, L.Z.; methodology, L.Z.; software, T.L. and J.M. (Jinqiao Mu); validation, L.Z.; formal analysis, Z.C.; investigation, J.M. (Jingzhe Mu); resources, J.Z.; data curation, T.L.; writing—original draft preparation, L.Z. and T.L.; writing—review and editing, L.Z. and T.L.; visualization, Z.C.; supervision, J.Z.; project administration, J.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC) under grant no. 61902361, in part by the Henan Key Research Project of Higher Education Institutions under grant no. 22B520046, the Key Research and Development Special Project of Henan Province (221111210500), and the Key Technologies R&D Program of Henan Province (232102211053, 222102210170).

Data Availability Statement

The data used to support the findings of the study are available within the article.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Yadav, V.K.; Andola, N.; Verma, S.; Venkatesan, S. P2LBS: Privacy Provisioning in Location-Based Services. IEEE Trans. Serv. Comput. 2023, 16, 466–477.
  2. Jiang, H.B.; Li, J.; Zhao, P.; Zeng, F.Z.; Xiao, Z.; Klyengar, A. Location privacy-preserving mechanisms in location-based services: A comprehensive survey. ACM Comput. Surv. CSUR 2021, 54, 1–36.
  3. Saia, R.; Podda, A.S.; Pompianu, L.; Reforgiato Recupero, D.; Fenu, G. A blockchain-based distributed paradigm to secure localization services. Sensors 2021, 21, 6814.
  4. Liu, B.; Zhou, W.; Zhu, T.; Gao, L.; Xiang, Y. Location Privacy and Its Applications: A Systematic Study. IEEE Access 2018, 6, 17606–17624.
  5. Kim, T.H.; Goyat, R.; Rai, M.K.; Kumar, G.; Thomas, R. A novel trust evaluation process for secure localization using a decentralized blockchain in wireless sensor networks. IEEE Access 2019, 7, 184133–184144.
  6. Shi, X.F.; Tong, F.; Zhang, W.A.; Yu, L. Resilient privacy-preserving distributed localization against dishonest nodes in Internet of Things. IEEE Internet Things J. 2020, 7, 9214–9223.
  7. Do, H.J.; Jeong, Y.-S.; Choi, H.-J.; Kwangjo, K. Another dummy generation technique in location-based services. In Proceedings of the 2016 International Conference on Big Data and Smart Computing, Hong Kong, China, 18–20 January 2016; pp. 532–538.
  8. Hara, T.; Suzuki, A.; Iwata, M.; Arase, Y.; Xie, X. Dummy-Based User Location Anonymization Under Real-World Constraints. IEEE Access 2016, 4, 673–687.
  9. Andrés, M.E.; Bordenabe, N.E.; Chatzikokolakis, K.; Palamidessi, C. Geo-Indistinguishability: Differential Privacy for Location-Based Systems. In Proceedings of the ACM SIGSAC Conference on Computer & Communications Security, Berlin, Germany, 4–8 November 2013; pp. 901–914.
  10. Ardagna, C.A.; Cremonini, M.; Vimercati, D.C.; Samarati, P. An Obfuscation-Based Approach for Protecting Location Privacy. IEEE Trans. Dependable Secur. Comput. 2011, 8, 13–27.
  11. Hwang, R.-H.; Hsueh, Y.-L.; Chung, H.-W. A Novel Time-Obfuscated Algorithm for Trajectory Privacy Protection. IEEE Trans. Serv. Comput. 2014, 7, 126–139.
  12. Zhu, E.; Ma, R. An Effective Partitional Clustering Algorithm Based on New Clustering Validity Index. Appl. Soft Comput. 2018, 71, 608–621.
  13. Chen, J.; Yu, P.S. A Domain Adaptive Density Clustering Algorithm for Data with Varying Density Distribution. IEEE Trans. Knowl. Data Eng. 2021, 33, 2310–2321.
  14. Jafarzadegan, M.; Safi-Esfahani, F.; Beheshti, Z. Combining Hierarchical Clustering Approaches Using the PCA Method. Expert Syst. Appl. 2019, 137, 1–10.
  15. Lee, J.-G.; Han, J.; Whang, K.-Y. Trajectory Clustering: A Partition-and-Group Framework. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, 11–14 June 2007; pp. 593–604.
  16. Cheng, Z.; Jiang, L.; Liu, D.; Zheng, Z. Density Based Spatio-Temporal Trajectory Clustering Algorithm. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 3358–3361.
  17. Wang, C.; Yang, J.; Zhang, J.-P. Privacy Preserving Algorithm Based on Trajectory Location and Shape Similarity. J. Commun. 2015, 36, 144–157.
  18. Qiao, D.; Liang, Y.; Ma, C.; Zhang, H. Semantic Trajectory Clustering via Improved Label Propagation with Core Structure. IEEE Sens. J. 2022, 22, 639–650.
  19. Xu, C.; Zhu, L.; Liu, Y.; Guan, J.; Yu, S. DP-LTOD: Differential Privacy Latent Trajectory Community Discovering Services over Location-Based Social Networks. IEEE Trans. Serv. Comput. 2021, 14, 1068–1083.
  20. Shaham, S.; Ding, M.; Liu, B.; Dang, S.; Lin, Z.; Li, J. Privacy Preserving Location Data Disturbed: A Machine Learning Approach. IEEE Trans. Knowl. Data Eng. 2021, 33, 3270–3283.
  21. Kido, H.; Yanagisawa, Y.; Satoh, T. Protection of Location Privacy Using Dummies for Location-Based Services. In Proceedings of the 21st International Conference on Data Engineering Workshops (ICDEW’05), Tokyo, Japan, 3–4 April 2005; p. 1248.
  22. Gao, S.; Ma, J.; Shi, W.; Zhan, G.; Sun, C. TrPF: A Trajectory Privacy-Preserving Framework for Participatory Sensing. IEEE Trans. Inf. Forensics Secur. 2013, 8, 874–887.
  23. Gruteser, M.; Liu, X. Protecting Privacy in Continuous Location-Tracking Applications. IEEE Secur. Priv. 2004, 2, 28–34.
  24. Dwork, C. Differential Privacy: A Survey of Results. In Theory and Applications of Models of Computation: Proceedings of the 5th International Conference, TAMC 2008, Xi’an, China, 25–29 April 2008; Springer: Berlin/Heidelberg, Germany, 2008; Volume 4978, pp. 1–19.
  25. Hua, J.; Gao, Y.; Zhong, S. Differentially Private Publication of General Time-Serial Trajectory Data. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China, 26 April–1 May 2015; pp. 549–557.
  26. Ou, L.; Qin, Z.; Liao, S.; Hong, Y.; Jia, X. Releasing Correlated Trajectories: Towards High Utility and Optimal Differential Privacy. IEEE Trans. Dependable Secur. Comput. 2020, 17, 1109–1123.
  27. Yang, Z.; Wang, R.; Wu, D.; Wang, H.; Song, H.; Ma, X. Local Trajectory Privacy Protection in 5G Enabled Industrial Intelligent Logistics. IEEE Trans. Ind. Inform. 2022, 18, 2868–2876.
  28. Zheng, Z.; Li, Z.; Jiang, H.; Zhang, L.Y.; Tu, D. Semantic-Aware Privacy-Preserving Online Location Trajectory Data Sharing. IEEE Trans. Inf. Forensics Secur. 2022, 17, 2256–2271.
  29. Wu, L.; Qin, C.; Xu, Z.; Guan, Y.; Lu, R. TCPP: Achieving Privacy-Preserving Trajectory Correlation with Differential Privacy. IEEE Trans. Inf. Forensics Secur. 2023, 18, 4006–4020.
  30. Chatzikokolakis, K.; Palamidessi, C.; Stronati, M. A Predictive Differentially-Private Mechanism for Mobility Traces. In Privacy Enhancing Technologies: Proceedings of the 14th International Symposium, PETS 2014, Amsterdam, The Netherlands, 16–18 July 2014; Springer: Cham, Switzerland, 2014; pp. 21–41.
  31. Xiao, Y.; Xiong, L. Protecting Locations with Differential Privacy under Temporal Correlations. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 1298–1309.
  32. Wang, H.; Xu, Z. CTS-DP: Disturbed Correlated Time-Series Data via Differential Privacy. Knowl.-Based Syst. 2017, 122, 167–179.
  33. Cao, Y.; Yoshikawa, M.; Xiao, Y.; Xiong, L. Quantifying Differential Privacy in Continuous Data Release Under Temporal Correlations. IEEE Trans. Knowl. Data Eng. 2019, 31, 1281–1295.
  34. Ghane, S.; Kulik, L.; Ramamohanarao, K. TGM: A Generative Mechanism for Disturbed Trajectories with Differential Privacy. IEEE Internet Things J. 2020, 7, 2611–2621.
  35. Dwork, C. Differential Privacy. In Automata, Languages and Programming: Proceedings of the 33rd International Colloquium, ICALP 2006, Venice, Italy, 10–14 July 2006; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2006; Volume 4052, pp. 1–12.
  36. Cao, Y.; Xiao, Y.; Xiong, L.; Bai, L. PriSTE: From Location Privacy to Spatiotemporal Event Privacy. In Proceedings of the IEEE 35th International Conference on Data Engineering, Macao, China, 8–11 April 2019; pp. 1606–1609.
  37. Cao, Y.; Xiao, Y.X.; Xiong, L.; Bai, L.Q.; Yoshikawa, M. Protecting Spatiotemporal Event Privacy in Continuous Location-Based Services. IEEE Trans. Knowl. Data Eng. 2021, 33, 3141–3154.
  38. Zheng, Y.; Xie, X.; Ma, W. GeoLife: A Collaborative Social Networking Service among User, location and trajectory. IEEE Data Eng. Bull. 2010, 33, 32–40.
  39. Zhu, L.; Xu, C.; Guan, J.; Zhang, H. SEM-PPA: A semantical pattern and preference-aware service mining method for personalized point of interest recommendation. J. Netw. Comput. Appl. 2017, 82, 35–46.
  40. Zhu, L.; Liu, X.; Jing, Z.; Yu, L.; Cai, Z.; Zhang, J. Knowledge-Driven Location Privacy Preserving Scheme for Location-Based Social Networks. Electronics 2023, 12, 70.
  41. Zhu, L.; Xie, H.; Liu, Y.; Guan, J.; Liu, Y.; Xiong, Y. PTPP: Preference-Aware Trajectory Privacy-Preserving over Location-Based Social Networks. J. Inf. Sci. Eng. 2018, 34, 803–820.
  42. Tang, L.-A.; Zheng, Y.; Yuan, J.; Han, J.; Leung, A.; Hung, C.-C.; Peng, W.-C. On discovery of traveling companions from streaming trajectories. In Proceedings of the IEEE 28th International Conference on Data Engineering, Arlington, VA, USA, 1–5 April 2012; pp. 186–197.
  43. Leskovec, J.; Lang, K.; Mahoney, M. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th International World Wide Web Conference, Raleigh, NC, USA, 26–30 April 2010; pp. 631–640.
  44. Chen, R.; Acs, G.; Castelluccia, C. Differentially private sequential data publication via variable-length n-grams. In Proceedings of the ACM Conference on Computer and Communications Security, Raleigh, NC, USA, 16–18 October 2012; pp. 638–649.
  45. Chen, R.; Fung, B.; Desai, B.; Sossou, N.M. Differentially private transit data publication: A case study on the montreal transportation system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 213–221.
  46. Wang, Z.; Zhang, D.; Zhou, X.; Yang, D.; Yu, Z.; Yu, Z. Discovering and Profiling Overlapping Communities in Location-Based Social Networks. IEEE Trans. Syst. Man Cybern. Syst. 2014, 44, 499–509.
Figure 1. The workflow of DP-STTC scheme.
Figure 2. Spatial-temporal activity privacy protection framework.
Figure 3. Locations and semantic tagging of one user.
Figure 4. Effect of percentages of training data.
Figure 5. Effect of the number of the discovered clusters.
Figure 6. Effect of the number of positive clusters.
Figure 7. Effect of the privacy budget.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhu, L.; Lei, T.; Mu, J.; Mu, J.; Cai, Z.; Zhang, J. Differential Privacy-Based Spatial-Temporal Trajectory Clustering Scheme for LBSNs. Electronics 2023, 12, 3767. https://doi.org/10.3390/electronics12183767
