1. Introduction
With the rapid development of new energy technologies and the growing popularity of electric vehicles (EVs), vehicle-to-grid (V2G) charging networks are improving steadily [1]. Due to the limitations of onboard battery technology, EVs need to find charging posts frequently. When EVs interact with charging posts, data centers collect EV charging location data, which can support data services for third-party research institutions and companies [2]. For example, charging service providers and grid companies can use these data to optimize the placement of charging stations.
The location information collected through V2G management and services may cause privacy leakage. The charging locations frequently visited by an EV can be linked to the user's residence, workplace, and points of interest. These sensitive locations can easily expose a user's home address, health condition, personal preferences, social connections, and other private information [3]. Under a background knowledge attack [4], the user bears a significant risk of personal privacy leakage [5]. Therefore, handling these location data from a privacy protection perspective is crucial for the development of V2G charging networks.
In recent years, researchers have proposed EV location privacy protection methods that use pseudonym techniques to hide the user's actual ID [6,7]. The user provides the exact location to the grid control center to obtain charging services, so that unauthorized parties cannot determine the user's real identity. By using partially restricted blind signature techniques when EVs communicate with charging stations, only the smart grid server can map a pseudonym to the actual vehicle identity. Moreover, the pseudonym changes when the EV moves from one charging station to another. However, the above identity-anonymity-based protection mechanisms have two main problems:
Poor flexibility in sharing data among multiple entities for analysis, due to the reliance on trusted third parties and the resulting key escrow problem.
Insufficient privacy protection: trusted third parties are themselves vulnerable to attacks and cannot resist background knowledge attacks.
Another protection method hides part of the location information by scrambling or otherwise obfuscating the user's location data. Existing location privacy protection schemes fall into the following categories: encryption mechanisms, caching strategies, anonymization techniques, and differential privacy. Anonymization protects location privacy using generalization theory [8]. The user's actual location is sent to the charging service provider along with the locations of other users to generate an anonymous region that conceals the user's actual location. However, the level of privacy protection is weakened by the high-speed autonomous mobility of vehicle nodes. Encryption-based privacy protection schemes [9] process the user's location information through a one-way irreversible encryption function, ensuring service availability without revealing the user's identity and location. However, their high overhead and computational complexity make data sharing and data mining difficult. Solutions based on caching mechanisms [10] reduce the possibility of privacy leakage by reducing the number of interactions between users and charging service providers. However, these caching strategies rarely account for users' complex needs. Thus, traditional location privacy protection methods do not provide sufficient privacy guarantees for in-vehicle networks.
Differential privacy does not depend on any background knowledge of the attacker, offers a strong privacy guarantee, and can cut off the possibility of user privacy disclosure at the data source [11]. Dwork et al. [12] first proposed the concept of differential privacy in 2006. Its main idea is to add interference noise to the published original data, generating perturbed data that protects the potentially private user information within. It has therefore been increasingly introduced into the Internet of Vehicles in recent years to protect the location privacy of onboard users [13]. Depending on whether the third-party data aggregation server is trusted, differential privacy can be divided into centralized differential privacy and local differential privacy. Centralized differential privacy assumes that the third party is trusted: each user sends his or her real data to the data aggregation server, which then processes the data with perturbation algorithms that satisfy differential privacy. However, not all third parties are trustworthy.
We propose a personalized charging station location privacy protection scheme (PPVC) based on sensitive location information to address the above issues. The scheme can meet the user’s privacy needs for personalized charging station selection while protecting the user’s location privacy. The research content and contributions of this paper are as follows:
We select the route with the highest utility from the navigation routes and build a utility model based on multiattribute decision theory. The route utility and privacy effects are described by a normalized decision matrix, and then privacy preferences are added to this model to build a multiattribute utility function to quantify the utility of the routes and select the highest-utility route for the user.
We propose a personalized privacy budget allocation algorithm to satisfy users' personalized privacy settings. First, using distance as an indicator, we assign an appropriate privacy budget to the user and determine the range of false locations the user can accept. The scheme thus meets the user's charging service needs while satisfying their personalized privacy needs.
We evaluate PPVC and prove its privacy and utility using relevant theorems. Meanwhile, based on a real dataset, experimental simulations compare the performance of PPVC with the existing Shift Route [14] and ATGD [15] methods. PPVC improves the charging service quality by 25% and 8%, respectively, while the accuracy of the charging station selection is only slightly affected. PPVC not only guarantees privacy protection during the charging request process of EV users but also meets users' personalized privacy needs and provides higher service quality.
The rest of this paper is organized as follows. Section 2 reviews related work on location privacy protection. Section 3 describes in detail the system model used in this paper. Section 4 presents the specifics of our scheme. Section 5 details the personalized privacy budget allocation algorithm. Comparative experiments, together with the corresponding analysis of the experimental results of our approach and the comparison approaches, are presented in Section 6. Section 7 gives the conclusion and future work.
2. Related Work
Existing location privacy protection schemes fall into the following categories: encryption mechanisms, caching strategies, anonymization techniques, and differential privacy. Anonymization techniques use generalization theory for location privacy protection [8]. Specifically, the user's actual location is combined with the locations of other users into an anonymous region, which is sent to the charging service provider. However, the level of privacy protection is weakened by the vehicle nodes' high-speed autonomous mobility, so the anonymization scheme cannot be applied directly in V2G. The encryption-based privacy protection scheme [9] processes the user's location information through a one-way irreversible encryption function so that the user's identity and location cannot be disclosed while service availability is ensured. However, encryption schemes are costly and computationally complex, making data sharing and mining challenging to implement. Schemes based on caching mechanisms [16] reduce the possibility of privacy leakage by reducing the number of interactions between users and charging service providers. However, these caching strategies rarely account for users' complex needs. It is thus clear that traditional location privacy protection methods cannot provide adequate privacy guarantees for V2G networks.
Yin et al. [17], targeting the high dispersion and low density of location data, established a multilevel location information tree model that balances utility and privacy and added noise to the access frequency of selected data using the Laplace mechanism. Xiong et al. [18] proposed a randomized differential privacy method for location datasets, using private location clustering to narrow the random field that hides the user's exact location. Andrés et al. [19] proposed geo-indistinguishability, which provides a formal model for provable location privacy protection. Jiang et al. [20] applied noise to the sensitive locations frequently visited along the driving track, thereby protecting users' location privacy.
Some obfuscation-based schemes focus on location privacy in road networks. The Geo-I Satisfying Map Index scheme (GEM) evaluates the level of privacy protection and data utility of traditional Geo-I in road networks [21]. GEM treats the connections of a road network as obfuscation candidates and, based on Geo-I, directly obfuscates the driver's location with a connection. The scheme in [22] discretizes the road network into intervals of equal length, and the authors use the path distance between two intervals to measure the indistinguishability of Geo-I. However, setting equal-length intervals in a road network is challenging: if the intervals are made short to accommodate short roads, computational consumption increases [23], while the correlation between connectivity and privacy can be lost if an interval contains several short roads.
However, these traditional differential privacy protection methods provide the same level of privacy protection for all of a user's charging service request locations and cannot answer sensitive location queries in a personalized manner. Personalized differential privacy protection schemes have therefore been proposed. Li et al. [24] proposed a personalized differential privacy protection method for repeated queries, which generates a new privacy protection specification based on the querying user's privileges and the number of identical queries. Li et al. [25] proposed a personalized privacy protection scheme for sensitive ranges, using a map storage algorithm to facilitate the storage of 2D local maps and reduce the storage cost. Existing personalized differential privacy protection schemes divide privacy into different levels.
Various schemes protect drivers' sensitive locations at different levels for personalization. The personalization scheme in [26] measures privacy requirements by individual attributes such as access duration, frequency, and regularity, and formulates an incomplete-information game to balance service quality and privacy protection. Zhong et al. [27] investigated privacy requirements based on movement regularity for personalized pseudonym exchange. The scheme in [28] measures privacy requirements using intimacy, which specifies the density of community edges in a social network; differential privacy and generative adversarial networks are used to add noise to the raw data. The scheme in [29] specifies that the personal privacy requirement of a location is negatively correlated with the number of hops to the sensitive location. The algorithm in [30] coordinates semantic privacy and location privacy based on drivers' requirements, measured by the relationships between drivers; a game-theoretic model is then constructed to protect location and differential privacy based on social distance. The scheme in [31] designs the privacy requirement to be negatively correlated with the Euclidean distance between the current location and the last inferred location and uses it to compute the obfuscation privacy budget. This design reduces the exposure probability but requires real-time computation of privacy requirements.
4. System Overview
4.1. Problem Statement
In the electric vehicle charging scenario shown in Figure 1, the vehicle user is located at point A, the destination is point E, and the sensitive locations set by the user are F, H, and I. Sensitive locations are defined by each user according to their own requirements, so different users will set different sensitive locations.
In this case, the navigation system plans four routes from the user to the charging station. The user must choose the most efficient of these recommended routes as the driving route. On that route, service request location updates are periodically submitted to the charging service system for service queries. However, such frequent queries can leak sensitive information, so the privacy of the user's charging service request locations A, B, C, D, and E needs to be protected. The distances between these query locations and their nearest sensitive locations F, H, and I vary, so different query locations should receive different levels of privacy protection. However, existing location privacy protection cannot answer sensitive attribute queries in a personalized, differentially private way [24,25], which still results in location privacy leakage.
4.2. Personalized Location Privacy Protection
This paper proposes a differential privacy-based protection scheme for personalized charging locations. The scheme protects the privacy of charging service requests at different locations on the optimal driving route according to the user's privacy requirements. The personalized location privacy protection scheme protects the user's location information before the charging service request, as shown in Figure 2.
The user sets their service requirements, including the sensitive location points, the privacy level ε, the target points of interest, and the acceptable error distance between the false location and the actual location. The navigation system recommends m driving routes to the target charging station based on these requirements. From the recommended routes, the user builds a multiattribute route selection benefit decision matrix. Based on this matrix, a multiattribute route selection benefit function is established using a weight assignment algorithm grounded in information entropy theory; it jointly considers the distance from the user's starting point to the charging station on each recommended route and the sum of the distances from each request location to its nearest sensitive location. The benefit value of each recommended route is calculated with this function, and a ranking algorithm determines the most efficient route as the user's driving route.
The radius R of the sensitive circle is calculated for each sensitive location point based on the user's previously set acceptable error distance. Based on the selected driving route, the privacy budget is then assigned according to the user's individual needs. Each charging service request location outside the sensitive circles is assigned a privacy budget according to its share of the sensitive distance. The remaining privacy budget is split equally among the charging service request locations inside the sensitive circles. Noise calibrated to the allocated privacy budget is added to each charging service request location to generate a false location. The false location is sent to the charging service provider for the charging service request, protecting the user's actual location. The charging service provider returns the service information results based on the information submitted by the user.
4.3. Multiattribute Decision Model Based on Information Entropy
PPVC establishes a multiattribute decision model based on information entropy, which integrates the influence of two attributes on route selection, namely the total route length and the Euclidean distance between service request locations and sensitive locations, to obtain the relatively optimal road as the user's driving route. In practice, the driving route chosen by the user is determined mainly by two factors: routes with low driving cost are preferred, and routes with less privacy leakage are preferred. Therefore, we measure the cost of the k-th route by the total length of the k-th recommended driving route from the user's starting point to the charging station: the smaller the total length, the lower the cost and the more likely the route is chosen. The sum of the Euclidean distances between all service request locations on the k-th recommended route and their nearest sensitive locations measures the privacy leakage on that route: the larger this distance, the smaller the privacy leakage and the more likely the route is chosen by the user.
According to multiattribute decision theory [30,31], an attribute whose value is positively proportional to the likelihood of a solution being chosen is called a benefit attribute; conversely, an attribute whose value is inversely proportional to that likelihood is called a cost attribute. Of the two attributes above, the sum D of the Euclidean distances between all service request locations on the k-th recommended driving route and their nearest sensitive locations is the benefit attribute, while the distance from the starting place to the charging station on the k-th recommended route is the cost attribute. The symbols involved are shown in Table 1.
Cost Attribute: The total length from the starting place to the charging station on the k-th recommended driving route is denoted $L_k$ (the sum of the lengths of its consecutive road segments).
Benefit Attribute: The sum of the Euclidean distances between all service request location points on the k-th recommended driving route and their nearest sensitive locations is
$$D_k = \sum_{i=1}^{n} \sum_{j=1}^{N} a_{ij}\, d_{ij}^{k},$$
where $d_{ij}^{k}$ denotes the distance from the $i$-th service location to the $j$-th sensitive location on the $k$-th route; $a_{ij} = 1$ means the $i$-th service location selects the $j$-th sensitive location as its nearest sensitive location; $a_{ij} = 0$ means other cases; $N$ is the total number of sensitive locations set by the user; and $m$ is the total number of routes recommended by the navigation system to the user.
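As an illustration, the two attributes can be computed as follows. This is a minimal sketch assuming routes are lists of planar (x, y) points and distances are Euclidean; the function names and coordinates are our own, not from the paper:

```python
import math

def route_length(route):
    """Cost attribute: total length of a route given as consecutive (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

def benefit_distance(route, sensitive):
    """Benefit attribute: sum over service request points of the distance to the
    nearest sensitive location (the indicator a_ij selects the closest one)."""
    return sum(min(math.dist(p, s) for s in sensitive) for p in route)

route = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]   # hypothetical service request points
sensitive = [(3.0, 6.0), (7.0, 0.0)]           # hypothetical sensitive locations
L_k = route_length(route)                      # cost attribute of this route
D_k = benefit_distance(route, sensitive)       # benefit attribute of this route
```
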
4.4. Benefit Attributes
We consider the influence of two attributes on route selection: the sum of distances $D_k$ between all service request location points on the $k$-th recommended driving route and their nearest sensitive locations is the benefit attribute, the total length $L_k$ of the user's driving route from the source to the target point of interest on the $k$-th route is the cost attribute, and the set of attributes affecting the user's choice of road is $\{L_k, D_k\}$. Since the navigation system recommends $m$ driving routes for the user, a multiattribute benefit decision matrix is established: the recommended driving routes are treated as $m$ solutions, each containing the two critical attributes, so the $m$ solutions form an $m \times 2$ multiattribute decision matrix of the form
$$X = (x_{kj})_{m \times 2} = \begin{pmatrix} x_{11} & x_{12} \\ \vdots & \vdots \\ x_{m1} & x_{m2} \end{pmatrix}, \qquad x_{k1} = L_k,\; x_{k2} = D_k.$$
In the multiattribute decision matrix, the meanings and scales of the attributes differ and are not commensurable, yet each attribute's value affects the user's final decision. Therefore, to make the decision matrix meet the user's requirements, a standardization approach eliminates the differences between the attributes. The normalization of the two attributes affecting path selection can be expressed as
$$r_{k1} = \frac{\max_k x_{k1} - x_{k1}}{\max_k x_{k1} - \min_k x_{k1}}, \qquad r_{k2} = \frac{x_{k2} - \min_k x_{k2}}{\max_k x_{k2} - \min_k x_{k2}},$$
where $x_{k1}$ is the cost attribute; $x_{k2}$ is the benefit attribute; $m$ is the total number of paths planned by the navigation for the user; and $k = 1, \dots, m$. After this dimensionless processing, the normalization matrix $R = (r_{kj})_{m \times 2}$ is obtained, and $r_{kj}$ is called the normalized attribute value of the $k$-th scenario for the $j$-th attribute. Obviously, the larger the $r_{kj}$ value, the better.
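The normalization step can be sketched as follows, assuming the common min-max form in which cost attributes are reversed so that larger normalized values are always better (names are ours):

```python
def normalize(matrix, is_benefit):
    """Column-wise min-max normalization of an m x n decision matrix.
    is_benefit[j] is True for benefit attributes (larger raw value is better)
    and False for cost attributes (smaller raw value is better)."""
    cols = list(zip(*matrix))
    out = []
    for j, col in enumerate(cols):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0                 # avoid division by zero
        if is_benefit[j]:
            out.append([(x - lo) / span for x in col])
        else:
            out.append([(hi - x) / span for x in col])
    return [list(row) for row in zip(*out)]

# rows: candidate routes; columns: (L_k cost, D_k benefit) -- illustrative values
X = [[12.0, 3.5], [9.0, 2.0], [15.0, 5.0]]
R = normalize(X, is_benefit=[False, True])
```
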
The benefit value of each recommended driving route is calculated from the multiattribute route selection benefit function, and a ranking algorithm determines the most efficient route as the driving route. The multiattribute route selection benefit function is built from the user's cost and benefit attributes and can be expressed as
$$U_k = w_1 r_{k1} + w_2 r_{k2},$$
where $w_1$ denotes the weight assigned to the user's cost attribute using information entropy theory and $r_{k1}$ denotes the normalized cost attribute of the $k$-th recommended driving route; $w_2$ denotes the weight assigned to the user's benefit attribute and $r_{k2}$ denotes the normalized benefit attribute of the $k$-th recommended driving route; and $w_1 + w_2 = 1$. Once the weight values of the two attributes are determined, the utility value of each route can be calculated to determine which driving route the user chooses, and the privacy budget for that driving route can then be assigned.
4.5. Weight Assignment for Benefit Attributes
The choice of attribute weights directly affects the decision results, and the information entropy method determines the weights of the decision matrix. For the decision matrix $R = (r_{kj})_{m \times 2}$, the evaluation of solution $k$ for attribute $j$ is defined as
$$p_{kj} = \frac{r_{kj}}{\sum_{k=1}^{m} r_{kj}}.$$
Bringing $p_{kj}$ into the information entropy formula, the entropy value of attribute $j$ is
$$E_j = -\frac{1}{\ln m} \sum_{k=1}^{m} p_{kj} \ln p_{kj},$$
and the degree of information deviation is defined as $d_j = 1 - E_j$. In the decision-making process, if the user has particular preferences for specific attributes, preference values $\lambda_j$ can be introduced to adjust the weights, and the resulting weights can be expressed as
$$w_j = \frac{\lambda_j d_j}{\sum_{j} \lambda_j d_j}.$$
The above equation satisfies $0 \le w_j \le 1$ and $\sum_j w_j = 1$. Once the attribute weights are determined, the utility value of each path can be calculated to determine which path the user chooses, after which the privacy budget can be allocated. Algorithm 1 summarizes the information entropy-based multiattribute path benefit algorithm.
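A sketch of the entropy weighting with optional preference values; the 0·ln 0 term is treated as 0 (a standard convention), and the names are ours:

```python
import math

def entropy_weights(R, preference=None):
    """Information-entropy weights for a normalized m x n decision matrix R.
    The deviation d_j = 1 - E_j drives the weight; optional preference values
    lambda_j let the user emphasize particular attributes."""
    m, n = len(R), len(R[0])
    lam = preference or [1.0] * n
    raw = []
    for j in range(n):
        col = [R[k][j] for k in range(m)]
        total = sum(col) or 1.0
        p = [x / total for x in col]
        # normalized Shannon entropy; p*ln(p) -> 0 as p -> 0
        E = -sum(x * math.log(x) for x in p if x > 0.0) / math.log(m)
        raw.append(lam[j] * (1.0 - E))
    s = sum(raw) or 1.0
    return [w / s for w in raw]

R = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
w = entropy_weights(R)                 # symmetric columns -> equal weights
```
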
Algorithm 1 Multiattribute-based path benefit algorithm
Input: The set of m routes recommended by the navigation system for the user to reach the destination.
Output: The route with the highest value of the benefit function.
1: Calculate the length of each driving route.
2: Calculate the sum of the distances between each service request location point and its nearest sensitive location point on each route.
3: Apply Equations (11) and (12) to normalize the decision matrix.
4: Calculate the weight value of each attribute.
5: Establish the benefit function.
6: U_max ← 0
7: for k = 1 to m do
8:    Compute the benefit function value U_k.
9:    if U_k > U_max then
10:       U_max ← U_k
11:       k* ← k
12:    end if
13: end for
14: return the route k* with the maximum benefit function value
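The loop above amounts to scoring each route with the weighted sum of its normalized attributes and keeping the maximum; a compact sketch with illustrative weights:

```python
def best_route(R, w):
    """Score each route with U_k = sum_j w_j * r_kj and return the index of
    the highest-utility route along with all scores."""
    scores = [sum(wj * rkj for wj, rkj in zip(w, row)) for row in R]
    return max(range(len(scores)), key=scores.__getitem__), scores

R = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]   # normalized decision matrix
k_best, U = best_route(R, [0.3, 0.7])      # illustrative attribute weights
```
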
5. Personalized Privacy Budget Allocation Algorithm
The existing privacy mechanisms do not consider users' individual privacy needs and cannot allocate appropriate privacy budgets to users according to their privacy needs at different locations. Hence, privacy protection is excessive for some locations and insufficient for others. We therefore propose the personalized privacy budget allocation algorithm of PPVC to meet users' personalized service needs with reasonable privacy protection.
To meet users' differing privacy needs, the algorithm quantifies the personalized privacy budget allocation model using each location's share of the sensitive distance. To implement this, a sensitive area must be set around each sensitive location point. If the user's actual location is outside the sensitive circle, the actual location is far from the sensitive location and the user's privacy is less likely to be compromised; in that case the privacy budget is allocated according to the basic scheme, determined mainly by the Euclidean distance between the current location and the sensitive location.
Our proposed privacy allocation method achieves personalized privacy protection: the privacy budget adapts to the privacy needs at different locations. Under purely distance-proportional allocation, however, the privacy budget value approaches 0 when the user's charging service request location passes right by a sensitive location; the noise added to the user's location data then tends to infinity, and the user receives an invalid service. For this particular case, we propose a privacy budget allocation scheme inside the sensitive circle: the remaining privacy budget is distributed evenly over all charging service request locations within the sensitive circle.
The privacy budget allocation model in this subsection is divided into the following parts: determination of the radius R of the sensitive circle, allocation of the privacy budget inside the sensitive area, and allocation of the privacy budget outside the sensitive area.
5.1. Radius of the Sensitive Circle
The radius of the sensitive circle corresponding to each sensitive position point is calculated using the planar Laplace noise mechanism. Suppose the current actual position of the user is $l = (x, y)$, and that noise added with the planar Laplace mechanism generates the false position $l' = (x', y')$. The distortion distance between the actual position and the false position of the user can be expressed as
$$d(l, l') = \sqrt{(x - x')^2 + (y - y')^2}.$$
According to geographic indistinguishability, when the planar Laplace mechanism is used to add noise to the actual location, the noise radius $r$ satisfies the following relationship [37]:
$$r = -\frac{1}{\epsilon}\left(W_{-1}\!\left(\frac{p - 1}{e}\right) + 1\right).$$
The user sets their acceptable error value between the actual and false positions to $\Delta$. If the generated false position is to satisfy the user's requirements, the relationship $d(l, l') \le \Delta$ must hold. Based on Equations (15) and (16), we have the following relationship:
$$R = -\frac{D \cdot M}{\epsilon \cdot \Delta}, \qquad M = W_{-1}\!\left(\frac{p - 1}{e}\right) + 1,$$
where $R$ denotes the radius of the sensitive circle; $D$ denotes the sum of the distances of all service locations to their nearest sensitive locations on the entire driving route; $\epsilon$ denotes the total privacy budget selected by the user; $\Delta$ denotes the distance error threshold between the current service location and the generated false location, which is set by the user; $W_{-1}$ is the Lambert function; $M$ is related to the value of $p$ [37]; and $p$ denotes a random number generated in $[0, 1)$. When the distance between the actual location and the sensitive location is less than $R$, the privacy budget allocation within the sensitive region is applied; otherwise, the privacy budget allocation outside the sensitive region is applied.
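The planar Laplace mechanism of geo-indistinguishability draws a uniform angle and a radius from the radial CDF C(r) = 1 − (1 + εr)e^(−εr). The sketch below inverts that CDF numerically by bisection rather than through the closed-form Lambert-W expression used in the text; the helper names are ours:

```python
import math, random

def planar_laplace_radius(eps, p):
    """Quantile of the planar-Laplace radial CDF C(r) = 1 - (1 + eps*r)*exp(-eps*r)
    for p in [0, 1), found by bisection."""
    def cdf(r):
        return 1.0 - (1.0 + eps * r) * math.exp(-eps * r)
    lo, hi = 0.0, 1.0
    while cdf(hi) < p:            # grow the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(80):           # bisect down to high precision
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def planar_laplace_noise(eps, rng=random):
    """One planar-Laplace offset (dx, dy): uniform angle, CDF-inverted radius."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    r = planar_laplace_radius(eps, rng.random())
    return r * math.cos(theta), r * math.sin(theta)
```

A smaller ε yields larger sampled radii, i.e. more noise for locations assigned a smaller privacy budget.
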
5.2. Privacy Budget Allocation
5.2.1. Budget Allocation for Privacy Outside Sensitive Areas
When the user’s actual location is outside the set sensitive area
, it means that the actual location is far away from the sensitive location. The user’s privacy is less likely to be compromised. In this case, the privacy budget allocation scheme can perform the allocation process according to the original scheme. This allocation scheme is mainly determined by the Euclidean distance between the current charging service request location and the sensitive location, as shown in
Figure 1. For locations
A, C, D, and
E outside the sensitive circle, the allocation model is shown below:
where
denotes the privacy budget assigned to the
i service location outside the sensitive circle;
denotes the total privacy budget selected by the user for privacy protection on the driving route;
means
selects
as the nearest sensitive location, and
means otherwise;
denotes the distance from the
i service location to the
j sensitive location;
n denotes the total number of service location points; and
N denotes the total number of sensitive location points set by the user.
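Under these definitions, the outside-circle allocation is a proportional split of the total budget by each location's nearest-sensitive distance. A minimal sketch with hypothetical coordinates (names are ours):

```python
import math

def allocate_outside(eps_total, service_pts, sensitive_pts, radius):
    """Give each service request location whose nearest-sensitive distance is at
    least `radius` a budget share proportional to that distance; locations inside
    a sensitive circle are handled separately (equal split of the remainder)."""
    nearest = [min(math.dist(p, s) for s in sensitive_pts) for p in service_pts]
    D = sum(nearest)                       # route-wide sum of sensitive distances
    return {i: eps_total * d / D for i, d in enumerate(nearest) if d >= radius}

service = [(0.0, 0.0), (5.0, 0.0)]         # hypothetical request locations
sensitive = [(0.0, 1.0)]                   # hypothetical sensitive location
budgets = allocate_outside(1.0, service, sensitive, radius=2.0)
```
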
When the location of the user’s charging service request happens to pass through a sensitive location point, the privacy budget value is close to 0 if this allocation method is used, which means that the amount of noise added to the user’s location data tends to be infinite, and the user receives an invalid service. Therefore, we propose a privacy budget allocation method in sensitive areas for this particular case.
5.2.2. Privacy Budget Allocation in Sensitive Areas
Suppose the user’s actual location is inside the sensitive area set by the user. In that case, it means that the location of the user’s charging service request is close to the sensitive location. The possibility of privacy leakage is very high.
Figure 1 shows the sensitive area, with
F as the center and
R as the radius. When a charging service request is made at this location, the user’s privacy budget is allocated as the sum of the total privacy budgets at the location minus the privacy budgets of all points outside the sensitive circle. Then, the remaining privacy budget is evenly distributed to each charging service request location. If the entire route contains
n points inside the sensitive circle, the privacy budget allocated to each charging service request location is
where
denotes the total privacy budget set by the user on the driving route;
denotes the privacy budget assigned to the
i location outside the sensitive circle on the current driving route; and
n denotes the total number of locations within the sensitive circle on the current driving route. Algorithm 2 outlines the implementation of the PPVC algorithm for personalized privacy budget assignment.
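Putting both cases together, a sketch of the full budget assignment: proportional shares outside the circles, the leftover split equally inside. Names and coordinates are illustrative:

```python
import math

def allocate_budgets(eps_total, service_pts, sensitive_pts, radius):
    """PPVC-style allocation sketch: locations outside every sensitive circle get
    a distance-proportional share of eps_total; the remaining budget is divided
    equally among the locations inside a circle, so the shares sum to eps_total."""
    nearest = [min(math.dist(p, s) for s in sensitive_pts) for p in service_pts]
    D = sum(nearest)
    budgets = {i: eps_total * d / D for i, d in enumerate(nearest) if d >= radius}
    inside = [i for i in range(len(service_pts)) if i not in budgets]
    if inside:
        share = (eps_total - sum(budgets.values())) / len(inside)
        for i in inside:
            budgets[i] = share
    return budgets

service = [(0.0, 0.0), (5.0, 0.0), (9.0, 3.0)]
sensitive = [(0.0, 1.0)]
budgets = allocate_budgets(1.0, service, sensitive, radius=2.0)
```
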
Algorithm 2 PPVC algorithm
Input: The total privacy budget, the service request locations on the chosen route, the sensitive locations, and the error threshold.
Output: Privacy budget allocation results for the service request location points.
1: Calculate the radius R of each sensitive circle.
2: for each service request location do
3:    Determine whether the service request location is outside the sensitive circle.
4:    if it is outside the sensitive circle then
5:       Allocate its privacy budget using the outside-circle allocation model.
6:       Add the allocated budget to the running total for locations outside the circles.
7:    end if
8: end for
9: Distribute the remaining privacy budget equally among the service request locations inside the sensitive circles.
5.3. Utility Analysis
Theorem 1 ((α, δ)-utility). Let $D$ be a transactional dataset, and let $\tilde{D}$ be the result of the personalized location privacy protection scheme after protecting $D$. If the following relation holds, then the scheme satisfies $(\alpha, \delta)$-utility:
$$\Pr\left(\left|Q(\tilde{D}) - Q(D)\right| \le \alpha\right) \ge \delta.$$
Proof. Assume that the spatial extent query $Q$ covers $n$ items in the output domain, and that the exact query result of $Q$ on the dataset is $Q(D) = \sum_{i=1}^{n} x_i$, where $x_i$ denotes the items covered by the query $Q$. The query result on the noisy dataset $\tilde{D}$ is denoted as $Q(\tilde{D}) = \sum_{i=1}^{n} (x_i + \eta_i)$, where $\eta_i$ denotes the added noise. According to the definition of $(\alpha, \delta)$-utility, it is necessary to prove that $\Pr(|Q(\tilde{D}) - Q(D)| \le \alpha) \ge \delta$. We have
$$\left|Q(\tilde{D}) - Q(D)\right| = \left|\sum_{i=1}^{n} \eta_i\right|,$$
where each $\eta_i$ is noise added to the original data that satisfies the planar Laplace distribution. If $\left|\sum_{i=1}^{n} \eta_i\right| > \alpha$, the event is denoted as a FAILURE, whose probability of occurrence satisfies
$$\Pr(\mathrm{FAILURE}) = \Pr\left(\left|\sum_{i=1}^{n} \eta_i\right| > \alpha\right) \le 1 - \delta.$$
Thus, the likelihood of a successful event satisfies
$$\Pr\left(\left|Q(\tilde{D}) - Q(D)\right| \le \alpha\right) = 1 - \Pr(\mathrm{FAILURE}) \ge \delta,$$
and therefore the scheme satisfies the definition of data utility. □
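The (α, δ)-utility claim can also be probed empirically: add independent Laplace noise to each of the n items covered by a range query and estimate the probability that the total error stays within α. A Monte Carlo sketch (ordinary one-dimensional Laplace noise is used here for simplicity, not the planar mechanism of the proof; names are ours):

```python
import math, random

def laplace(b, rng):
    """Laplace(0, b) sample as the difference of two exponentials."""
    return b * (math.log(1.0 - rng.random()) - math.log(1.0 - rng.random()))

def estimate_delta(n_items, eps, alpha, trials=20000, seed=7):
    """Estimate Pr(|Q(D') - Q(D)| <= alpha) when each of the n_items covered by
    the query carries independent Laplace(1/eps) noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        err = sum(laplace(1.0 / eps, rng) for _ in range(n_items))
        if abs(err) <= alpha:
            hits += 1
    return hits / trials
```

As expected, the estimated δ grows with α and shrinks as the query covers more noisy items.
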