Article

Efficient and Verifiable Range Query Scheme for Encrypted Geographical Information in Untrusted Cloud Environments

1 School of Computer and Big Data Science, Jiujiang University, No. 551, Qianjin East Road, Jiujiang 332000, China
2 China Gridcom Co., Ltd., Shenzhen 518031, China
3 School of Modern Information Technology, Zhejiang Polytechnic University of Mechanical and Electrical Engineering, Hangzhou 310053, China
4 School of Computer Science and Information Technology, Daqing Normal University, Daqing 163111, China
5 Jiujiang Key Laboratory of Cyberspace and Information Security, Jiujiang University, No. 551, Qianjin East Road, Jiujiang 332000, China
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2024, 13(8), 281; https://doi.org/10.3390/ijgi13080281
Submission received: 15 May 2024 / Revised: 28 July 2024 / Accepted: 8 August 2024 / Published: 11 August 2024
(This article belongs to the Topic Recent Advances in Security, Privacy, and Trust)

Abstract

With the rapid development of geo-positioning technologies, location-based services have become increasingly widespread. In the field of location-based services, range queries on geographical data have emerged as an important research topic, attracting significant attention from academia and industry. In many applications, data owners choose to outsource their geographical data and range query tasks to cloud servers to alleviate the burden of local data storage and computation. However, this outsourcing presents many security challenges. Adversaries may analyze outsourced geographical data and query requests to obtain private information, and untrusted cloud servers may selectively query only a portion of the outsourced data to conserve computational resources, return incorrect search results to data users, or even illegally modify the outsourced geographical data. To address these security concerns and provide reliable services to data owners and data users, this paper proposes an efficient and verifiable range query scheme (EVRQ) for encrypted geographical information in untrusted cloud environments. EVRQ is constructed based on a map region tree, 0–1 encoding, a hash function, a Bloom filter, and a cryptographic multiset accumulator. Extensive experimental evaluations demonstrate the efficiency of EVRQ, and a comprehensive analysis confirms its security.

1. Introduction

In recent years, due to the rapid development of the Internet of Things (IoT) and precise positioning technologies (such as the Global Positioning System, GPS), Location-based Services (LBS) and the corresponding applications have become prevalent and widely used [1,2,3]. In the traditional model of LBS, the data owners, who are the location service providers, store the geographical data locally. Data users send query requests to the data owners, who process these queries over the geographical data locally and then return the query results to the data users. However, as cloud computing becomes increasingly mature, an increasing number of data owners are choosing to outsource their location data to cloud servers. This allows them to take full advantage of various benefits provided by cloud computing, such as improved reliability, enhanced flexibility, accelerated deployment, and so on [4,5]. In this new cloud-based model of LBS, the data owners outsource their geographical data to the cloud servers, and the data users send query requests to the cloud servers. The cloud servers execute the query requests from the data users and then return the query results to them.
However, in this new cloud-based model of LBS, the geographical data stored in the cloud server are not directly controlled by the data owners, and the query requests from data users are processed by the cloud server. As a result, both the geographical data in the cloud server and the query requests of data users are vulnerable to various security threats [6]. These threats have the potential to pose significant risks to people’s safety and property. For example, attackers may try to steal and analyze the geographical data stored in cloud servers and then infer private information, such as users’ movements, home addresses, and other personal information. Untrusted cloud servers may selectively query a portion of the geographical data to save computational resources. Untrusted cloud servers may also provide incorrect query results to mislead data users, and then gather information about the wrong location for specific purposes. In some cases, untrusted cloud servers may even illegally modify the outsourced geographical data.
To address the security and privacy concerns of geographical information in an untrusted cloud environment, a straightforward and effective approach is to encrypt the geographical data prior to outsourcing them to the cloud server [7]. Traditional encryption methods (such as block encryption) are effective in securing geographical data stored in cloud servers. However, they neither support range queries over ciphertexts nor provide verification for the correctness of query results or the integrity of the query process. Therefore, in the cloud-based model of LBS, simultaneously ensuring the security of geographical data in the cloud server, preserving the query privacy of data users, enabling efficient and effective data retrieval, and providing verification for the correctness of query results and the integrity of the query process has become a challenging task [8,9].
Currently, methods available for addressing the aforementioned issues include order-preserving encryption methods [10,11,12,13,14,15], bucketization methods [16,17,18,19,20,21,22], and public key encryption methods [23,24,25,26]. Order-preserving encryption methods can support efficient query on the ciphertexts of geographical data. However, these methods have a security drawback as they leak the order information between ciphertexts. Then, attackers can infer the plaintexts underlying ciphertexts by using the leaked order information. To protect the order information between ciphertexts, bucketization methods encrypt and retrieve data within a bucket as a whole. Consequently, bucketization methods do not support accurate ciphertext queries. Public key-based encryption methods can provide strict security and accurate query results. However, due to the computational complexity of the public key encryption methods, the efficiency is often compromised. Therefore, using public key encryption methods in LBS can adversely affect users’ experience. For example, in a high-speed moving vehicle, untimely query results from the cloud server may cause data users to miss the nearest destination.
In this paper, we propose an efficient and verifiable range query scheme (EVRQ) for encrypted geographical information in untrusted cloud environments. In our scheme, we propose a method for dividing map regions (MRs) level by level according to the distribution of points of interest (POIs). In the EVRQ scheme, different MRs at the same level contain the same (or similar) number of POIs, and attackers are unable to distinguish between different MRs at the same level according to the density distribution of POIs. The data owner constructs a map region tree (MRT). In the MRT, the nodes at the same level are associated with the MRs, which are also at the same level. Then, the data owner builds an MRT index by using the MRT, 0–1 encoding, hash function, Bloom filter, and cryptographic multiset accumulator techniques. Next, the data owner encrypts POIs. Finally, the data owner outsources the ciphertexts of POIs and MRT index to the cloud server. The data user processes their queried range by using 0–1 encoding and hash function techniques to generate a query token and sends the query token to the cloud server. The cloud server compares the query token with the nodes of the MRT index in a top–down manner and then returns the retrieved ciphertexts of POIs to the data user as the query results. The data user can also request relevant information from the cloud server to verify the correctness of the query results and the integrity of the query process. The contributions of this paper are as follows:
  • We propose a division method for MRs and then build an MRT index. In the MRT index, POIs are evenly distributed. Attackers (including untrusted cloud servers) cannot infer the query privacy of the data user by observing the data user’s query process and analyzing the density distribution of POIs.
  • We propose an efficient range query scheme on the MRT index, which can efficiently verify the correctness of the query results and the entirety of the query process by combining Bloom filter, hash calculation, and cryptographic multiset accumulator techniques.
  • We conducted extensive experiments to evaluate the performance of EVRQ. The experimental results demonstrate the efficiency of the EVRQ scheme. Additionally, we provide a detailed analysis of the correctness and the security of the EVRQ scheme.

2. Related Work

Order-preserving encryption (OPE) methods [10,11] support efficient comparison between ciphertexts because the order information is preserved within the ciphertexts. Boldyreva et al. introduced a weaker security definition known as random order-preserving function (ROPF) security, which requires that the ciphertext cannot be distinguished from the function value generated by a randomly generated incremental function. Moreover, they developed the BCLO scheme, the first OPE scheme that can be proven to be secure. Subsequently, there have been ongoing efforts [12,13,14,15] aimed at improving the efficiency and security of the BCLO scheme. The majority of existing OPE methods only consider comparisons between ciphertexts of single-dimensional data. As geographical data usually have multiple dimensions, these OPE methods are not suitable for processing geographical data. Recently, Zhan et al. [27] proposed an OPE method to support efficient range queries over the ciphertexts of multi-dimensional data. However, as the order information is not protected in these OPE methods, attackers can analyze the order information of the ciphertexts and then infer the underlying plaintexts [7]. Thus, OPE methods offer only limited security.
Bucketization methods [16,17] have been proposed to protect the order information of the ciphertexts and support efficient queries over ciphertexts. In a bucketization method, all the data are partitioned into multiple buckets. Then, all the data within each bucket are encrypted as a unit. During the query process, if there are any data in a bucket that match the query request, all the ciphertexts within that bucket will be returned as the query results. Many researchers have proposed different approaches to optimize bucketization methods. Lee et al. [19] sorted all the buckets in an ordered manner to improve the search efficiency over buckets. Wang et al. [18] organized all the buckets together using an R-tree and applied the asymmetric scalar product preserving encryption (ASPE) [28] to encrypt the nodes of the R-tree. As ASPE ensures the security of the information stored in the R-tree nodes and supports ciphertext comparisons, Wang’s method can support efficient ciphertext retrieval. Since then, many bucketization methods [20,21,22] have been proposed to solve the issue of ciphertext query in cloud servers. However, the bucketization methods have a common drawback: if a bucket contains any data that satisfy a query request, all the ciphertexts within that bucket will be returned as the query results. Therefore, the query results are not accurate.
Public key-based encryption methods [23,24,25,26] have been proposed to support accurate queries over ciphertexts. Shi et al. [24] organized all the multi-dimensional data by utilizing several interval trees. Both the multi-dimensional data and the multi-dimensional queried ranges are identified by the node identifications (IDs) in these interval trees. Thus, these node IDs are embedded in the ciphertexts of multi-dimensional data and the query tokens of queried ranges. During the query process, if the IDs in the query token of a queried range match the IDs in the ciphertexts of multi-dimensional data, the ciphertexts can be correctly retrieved. Lu et al. [25] proposed a method to support range queries over ciphertexts. Their method was constructed by using a B+tree and the encryption method in [29]. Owing to the high computation overhead of the encryption method in [29], Lu’s method [25] is not very efficient. Wu et al. [30] proposed a method to support range query and verification over the ciphertexts of multi-dimensional data. In a recent work, Mei et al. [26] proposed a method to support range query and access control on the ciphertexts of multi-dimensional data. As many of the existing public key-based encryption methods are based on computations over elliptic curves, these methods are not very efficient.

3. Preliminaries

The EVRQ scheme is constructed by using several techniques. The 0–1 encoding enables the conversion of comparative operations between numbers into set operations. A cryptographic multiset accumulator is utilized to verify the intersection of two sets. The Bloom filter efficiently performs set operations with a high probability of accuracy. The details of these technologies are as follows.

3.1. 0–1 Encoding

The 0–1 encoding [31] converts a positive integer $n$ to a $t$-length binary string, denoted by $n_t n_{t-1} \cdots n_1$. Then, 0–1 encoding generates two sets. One set is $S_n^0 = \{ n_t n_{t-1} \cdots n_{i+1} 1 \mid n_i = 0, 1 \le i \le t \}$, and the other set is $S_n^1 = \{ n_t n_{t-1} \cdots n_i \mid n_i = 1, 1 \le i \le t \}$. For two positive integers $n$ and $n'$, 0–1 encoding can be used to determine the ordering relationship between $n$ and $n'$. Specifically, $S_n^1 \cap S_{n'}^0 \neq \emptyset \Leftrightarrow n > n'$ and $S_n^1 \cap S_{n'}^0 = \emptyset \Leftrightarrow n \le n'$. For example, 3 and 6 are two positive integers. The 4-length binary strings of 3 and 6 are $(0011)_2$ and $(0110)_2$, respectively. The 0–1 encodings of 3 and 6 are $S_3^0 = \{1, 01\}$, $S_3^1 = \{001, 0011\}$, $S_6^0 = \{1, 0111\}$, and $S_6^1 = \{01, 011\}$. As $S_6^1 \cap S_3^0 = \{01\} \neq \emptyset$, $6 > 3$. Also, as $S_3^1 \cap S_6^0 = \emptyset$, $3 \le 6$; since the two values differ, $3 < 6$.
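The encoding and the comparison rule can be illustrated with a minimal Python sketch. The function names (`to_bits`, `zero_encoding`, `one_encoding`, `greater_than`) are illustrative choices and do not come from the paper; the sketch only assumes the set definitions given above.

```python
from typing import Set


def to_bits(n: int, t: int) -> str:
    """Return the t-length binary string n_t ... n_1 (most significant bit first)."""
    if not 0 <= n < 2 ** t:
        raise ValueError("n does not fit in t bits")
    return format(n, f"0{t}b")


def zero_encoding(bits: str) -> Set[str]:
    """S^0: for every 0 bit, the prefix before that bit followed by a 1."""
    return {bits[:i] + "1" for i, b in enumerate(bits) if b == "0"}


def one_encoding(bits: str) -> Set[str]:
    """S^1: for every 1 bit, the prefix up to and including that bit."""
    return {bits[: i + 1] for i, b in enumerate(bits) if b == "1"}


def greater_than(n: int, m: int, t: int) -> bool:
    """n > m exactly when S_n^1 and S_m^0 share at least one element."""
    return bool(one_encoding(to_bits(n, t)) & zero_encoding(to_bits(m, t)))


# The worked example from the text: 3 = (0011)_2 and 6 = (0110)_2.
print(greater_than(6, 3, 4))  # True
print(greater_than(3, 6, 4))  # False
```

The comparison is reduced to a set intersection, which is what later allows it to be carried out on hashed set elements instead of plaintext prefixes.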

3.2. Cryptographic Multiset Accumulator

A cryptographic multiset accumulator [32] is widely used to prove set disjointness. It contains the following four algorithms.
$KeyGen(1^{\lambda}) \rightarrow (sk_{ACC}, pk_{ACC})$. It takes a security parameter $\lambda$ as input and outputs a secret key $sk_{ACC}$ and a public key $pk_{ACC}$.
$Setup(X, pk_{ACC}, sk_{ACC}) \rightarrow acc(X)$. It takes a multiset $X$, the public key $pk_{ACC}$, and the secret key $sk_{ACC}$ as inputs and outputs the accumulative value $acc(X)$.
$ProveDisjoint(X_1, X_2, pk_{ACC}) \rightarrow \Pi$. It takes two multisets $X_1$, $X_2$ ($X_1 \cap X_2 = \emptyset$) and the public key $pk_{ACC}$ as inputs. It outputs a proof $\Pi$.
$VerifyDisjoint(acc(X_1), acc(X_2), \Pi, pk_{ACC})$. It takes the accumulative values $acc(X_1)$, $acc(X_2)$, a proof $\Pi$, and the public key $pk_{ACC}$ as inputs. It outputs 1 if and only if $X_1 \cap X_2 = \emptyset$.

3.3. Bloom Filter

A Bloom filter [33] offers a probabilistic but very efficient way to determine whether an element belongs to a set. A Bloom filter contains (1) a set $E = \{e_1, e_2, \ldots, e_n\}$ ($n$ is a positive integer); (2) a hash function $hash$ and $w$ different hash keys $k_1, k_2, \ldots, k_w$ ($w$ is a positive integer); and (3) a bit array $A$ of $t$ bits (each bit is initialized to 0). For each element $e_i \in E$ ($i = 1, 2, \ldots, n$), the Bloom filter computes the hash values $hash(e_i, k_1), hash(e_i, k_2), \ldots, hash(e_i, k_w)$ and then sets the values at the positions $hash(e_i, k_1), hash(e_i, k_2), \ldots, hash(e_i, k_w)$ of $A$ to 1. To test whether an element $e$ belongs to the set $E$, the Bloom filter calculates the hash values $hash(e, k_1), hash(e, k_2), \ldots, hash(e, k_w)$. If all the values at the positions $hash(e, k_1), hash(e, k_2), \ldots, hash(e, k_w)$ of $A$ are 1, $e$ is considered to be a member of $E$ (with a small probability of a false positive). Otherwise, $e$ is not in $E$.
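A minimal Python sketch of such a keyed Bloom filter follows. The paper does not fix a concrete hash function here, so the use of HMAC-SHA256 reduced modulo the array length is an assumption made for illustration, as are the class and method names.

```python
import hashlib
import hmac
from typing import Iterator, List


class BloomFilter:
    """Bit array A of num_bits bits, addressed by w keyed hash values."""

    def __init__(self, num_bits: int, hash_keys: List[bytes]) -> None:
        self.num_bits = num_bits
        self.hash_keys = hash_keys      # k_1, ..., k_w
        self.bits = [0] * num_bits      # every bit starts at 0

    def _positions(self, element: bytes) -> Iterator[int]:
        # Assumption: hash(e, k_i) is HMAC-SHA256 mapped into [0, num_bits).
        for key in self.hash_keys:
            digest = hmac.new(key, element, hashlib.sha256).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, element: bytes) -> None:
        for pos in self._positions(element):
            self.bits[pos] = 1

    def might_contain(self, element: bytes) -> bool:
        # True may be a false positive; False is always definitive.
        return all(self.bits[pos] for pos in self._positions(element))


bf = BloomFilter(num_bits=256, hash_keys=[b"k1", b"k2", b"k3"])
for item in (b"001", b"0011"):
    bf.add(item)
print(bf.might_contain(b"001"))  # True
```

Inserted elements are always reported as present, so the filter never produces false negatives; only the "present" answer is probabilistic.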

4. System Model

Figure 1 shows the system model of EVRQ. Firstly, the data owner (DO) generates multiple map regions (MRs) according to the locations of points of interest (POIs) in a map. Then, the DO encrypts all the information of POIs and builds a map region tree (MRT) index by using the MRs, 0–1 encoding, cryptographic multiset accumulator, and Bloom filter techniques. Next, the DO outsources the ciphertexts of POIs and the MRT index to the cloud server (CS) and distributes secret keys to the data user (DU). The DU generates a query token for their queried range and sends the query token to the CS. After receiving the query token, the CS performs range query on the MRT index. Finally, the CS returns the retrieved ciphertexts of POIs to the DU as the query results. According to the DU’s requirements, they can verify the correctness of query results and the entirety of the query process.
In EVRQ, the CS is untrusted. That is, the CS may not faithfully follow the protocols and procedures designated by the DO and the DU. Specifically, the CS may (1) be curious about the geographical data and the index that it stores, (2) selectively query only a portion of the geographical data to save computational resources, (3) intentionally include some geographical data that do not meet the query condition in the query results and return them to the DU, and (4) even illegally modify the geographical data and the index stored in the CS. The DO and the DU are considered trustworthy.
Definition 1.
Correctness. Suppose that $s_1, s_2, \ldots, s_n$ represent $n$ different POIs and $Enc(s_1), Enc(s_2), \ldots, Enc(s_n)$ represent the ciphertexts of $s_1, s_2, \ldots, s_n$, respectively, and these ciphertexts are the query results retrieved from the MRT index when querying with the range $Q$. The EVRQ scheme is correct if these ciphertexts of POIs $Enc(s_1), Enc(s_2), \ldots, Enc(s_n)$ are stored within the MRs of the leaf nodes in the MRT index and these MRs intersect with the queried range $Q$.
Definition 2.
Security. Suppose $F(d) = count(d)$ is a leakage function, where $d$ is a binary string and $count(d)$ returns the total number of 1s in $d$. If no adversary can access any information beyond what is revealed by $F$, the EVRQ scheme is considered secure.
Definition 3.
Structure Indistinguishability. For any two MRs, if and only if these two MRs contain the same number of POIs, their corresponding MR trees and MRT indexes have the same structure.

5. Construction of Our Scheme EVRQ

In this section, we first describe MR and MRT in our scheme EVRQ. Then, we introduce MR encoding. Finally, we present the construction of the EVRQ scheme in detail.

5.1. Map Region (MR) and Map Region Tree (MRT)

Suppose there are $n$ POIs which are distributed in an MR. These POIs are denoted by $s_1, s_2, \ldots, s_n$, respectively. The DO divides this MR into multiple MRs. Each MR is associated with a level number, and the MRs with the same level number contain the same or a similar number of POIs. The generation process of MRs is as follows.
Firstly, the DO uses the notation $M_1^1$ to denote the whole MR. $M_1^1$ is the first-level MR, which covers all the POIs in the whole MR. The superscript of $M_1^1$ represents the level of $M_1^1$ and the subscript represents the order of $M_1^1$ within its level.
Secondly, the DO divides $M_1^1$ into two parts by a vertical line. The left part is denoted by $M_{1\mathrm{left}}^{1}$, and the right part is denoted by $M_{1\mathrm{right}}^{1}$. If the total number of POIs in $M_1^1$ is even, then the DO sets the total number of POIs in $M_{1\mathrm{left}}^{1}$ to be equal to that in $M_{1\mathrm{right}}^{1}$. If the total number of POIs in $M_1^1$ is odd, then the DO sets the total number of POIs in $M_{1\mathrm{left}}^{1}$ to be one more than that in $M_{1\mathrm{right}}^{1}$.
Thirdly, the DO divides $M_{1\mathrm{left}}^{1}$ into two parts by a horizontal line. The upper part is denoted by $M_{1\mathrm{leftup}}^{1}$, and the lower part is denoted by $M_{1\mathrm{leftlow}}^{1}$. If the total number of POIs in $M_{1\mathrm{left}}^{1}$ is even, then the DO sets the total number of POIs in $M_{1\mathrm{leftup}}^{1}$ to be equal to that in $M_{1\mathrm{leftlow}}^{1}$. If the total number of POIs in $M_{1\mathrm{left}}^{1}$ is odd, then the DO sets the total number of POIs in $M_{1\mathrm{leftup}}^{1}$ to be one more than that in $M_{1\mathrm{leftlow}}^{1}$. The DO also divides $M_{1\mathrm{right}}^{1}$ into two parts by another horizontal line. The upper part is denoted by $M_{1\mathrm{rightup}}^{1}$, and the lower part is denoted by $M_{1\mathrm{rightlow}}^{1}$. In the same way, if the total number of POIs in $M_{1\mathrm{right}}^{1}$ is even, then the DO sets the total number of POIs in $M_{1\mathrm{rightup}}^{1}$ to be equal to that in $M_{1\mathrm{rightlow}}^{1}$. If the total number of POIs in $M_{1\mathrm{right}}^{1}$ is odd, then the DO sets the total number of POIs in $M_{1\mathrm{rightlow}}^{1}$ to be one more than that in $M_{1\mathrm{rightup}}^{1}$.
According to the method of MR division above, $M_1^1$ is divided into four MRs: $M_{1\mathrm{leftlow}}^{1}$, $M_{1\mathrm{leftup}}^{1}$, $M_{1\mathrm{rightlow}}^{1}$, and $M_{1\mathrm{rightup}}^{1}$. For the sake of convenience, these four MRs are denoted by $M_1^2$, $M_2^2$, $M_3^2$, and $M_4^2$, respectively. The superscript 2 represents the level of these MRs and the subscripts 1, 2, 3, and 4 represent the order of these MRs within level 2.
Finally, the DO divides $M_1^2$, $M_2^2$, $M_3^2$, and $M_4^2$ iteratively until each MR contains only a single POI.
According to the method of the MR division above, it is easy to observe two facts: (1) most MRs within the same level contain an equal (or a similar) number of POIs. Specifically, the difference in the number of POIs contained in different MRs is at most one; (2) when different MRs contain an equal (or a similar) number of POIs, an attacker cannot effectively distinguish these MRs according to the number of POIs they contain. Therefore, an attacker (including the untrusted CS) cannot infer the query privacy of the DU by observing the DU’s query process on different MRs and analyzing the density distribution of POIs in different MRs.
In the following Example 1, we demonstrate our method for MR division.
Example 1.
Figure 2 (1) shows the first-level MR, denoted by $M_1^1$, which contains 17 gas stations (17 POIs). Firstly, the DO divides $M_1^1$ into two parts, $M_{1\mathrm{left}}^{1}$ and $M_{1\mathrm{right}}^{1}$, by a vertical line. $M_{1\mathrm{left}}^{1}$ contains 9 gas stations ($S_1, S_2, S_3, S_4, S_5, S_6, S_7, S_8, S_{13}$), while $M_{1\mathrm{right}}^{1}$ contains 8 gas stations ($S_9, S_{10}, S_{11}, S_{12}, S_{14}, S_{15}, S_{16}, S_{17}$). Then, the DO further divides $M_{1\mathrm{left}}^{1}$ into two parts, $M_1^2$ and $M_3^2$, by a horizontal line. Similarly, the DO divides $M_{1\mathrm{right}}^{1}$ into two parts, $M_2^2$ and $M_4^2$, by another horizontal line. As shown in Figure 2 (2), $M_1^2$ contains 4 gas stations ($S_1, S_2, S_4, S_5$), $M_2^2$ contains 4 gas stations ($S_9, S_{10}, S_{11}, S_{12}$), $M_3^2$ contains 5 gas stations ($S_3, S_6, S_7, S_8, S_{13}$), and $M_4^2$ contains 4 gas stations ($S_{14}, S_{15}, S_{16}, S_{17}$). Then, the DO divides these second-level MRs ($M_1^2$, $M_2^2$, $M_3^2$, and $M_4^2$) iteratively. As shown in Figure 2 (3), the DO obtains the third-level MRs: $M_1^3, M_2^3, \ldots, M_{16}^3$. Each third-level MR (except for $M_{11}^3$) contains only one gas station. Thus, the DO stops further dividing these third-level MRs. As $M_{11}^3$ contains 2 gas stations ($S_7, S_8$), the DO continues to divide $M_{11}^3$. As shown in Figure 2 (4), the DO obtains the fourth-level MRs $M_1^4$ (which contains only the gas station $S_7$) and $M_2^4$ (which contains only the gas station $S_8$). As $M_1^4$ and $M_2^4$ each contain only one gas station, the DO stops dividing them.
To construct an MRT, the DO firstly constructs a quadtree and processes the quadtree by using the MRs. In the quadtree, the DO associates the first-level node (i.e., the root node) with the MR $M_1^1$ that contains all the POIs. Secondly, the DO divides $M_1^1$ into four MRs ($M_1^2$, $M_2^2$, $M_3^2$, $M_4^2$) that contain the same (or a similar) number of POIs. The DO associates the second-level nodes (i.e., the child nodes of the root node) with the MRs $M_1^2$, $M_2^2$, $M_3^2$, $M_4^2$, respectively. Then, the DO continues to divide $M_1^2$, $M_2^2$, $M_3^2$, $M_4^2$ into sixteen MRs $M_1^3, M_2^3, \ldots, M_{16}^3$ that contain the same (or a similar) number of POIs, and then associates the third-level nodes of the quadtree with these MRs ($M_1^3, M_2^3, \ldots, M_{16}^3$). By using the same method, the DO processes the quadtree in a top–down manner. Finally, the DO associates each leaf node of the MRT with an MR that contains only one POI. After processing each node of the quadtree, the DO obtains the MRT. Note that, in our scheme, there is no intersection between the POIs contained in different MRs that are associated with the same level nodes in the MRT.
Figure 3 shows the MRT that is constructed by using the MRs in Figure 2. As shown in Figure 3, each node is associated with an MR ($M_1^1$, $M_1^2$, etc.), and each leaf node is associated with an MR that contains only one POI ($S_1$, $S_2$, etc.). By observing the MRT, it is easy to observe the following: (1) As the depth difference between leaf nodes is at most one, the MRT is a balanced tree. Therefore, the MRT supports fast traversal. (2) As the MRs of the same level contain an equal (or similar) number of POIs, the structures of the corresponding subtrees are the same (or similar). Thus, it is difficult to effectively distinguish different MRs of the same level and their corresponding subtrees. This feature ensures that the density distribution information of POIs remains protected. Therefore, an attacker (including the untrusted CS) cannot infer the DU’s query privacy by observing the DU’s query process on the MRT and analyzing the density distribution information of POIs on a map.
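The level-by-level division and the resulting tree can be sketched in Python as follows. This is an illustrative sketch only: the names `MRNode`, `split_in_two`, `build_mrt`, and `leaves` are hypothetical, the POI coordinates are synthetic, and the rule that the first half of every split receives the extra POI when the count is odd is a simplifying assumption rather than the paper's exact tie rule.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class MRNode:
    pois: List[Point]                 # POIs covered by this map region
    level: int                        # level number of the region
    children: List["MRNode"] = field(default_factory=list)


def split_in_two(pois: List[Point], axis: int) -> Tuple[List[Point], List[Point]]:
    """Split by a line on the given axis; the first part gets the extra POI if odd."""
    ordered = sorted(pois, key=lambda p: p[axis])
    cut = (len(ordered) + 1) // 2
    return ordered[:cut], ordered[cut:]


def build_mrt(pois: List[Point], level: int = 1) -> MRNode:
    """Divide a region into up to four sub-regions until each holds one POI."""
    node = MRNode(pois=list(pois), level=level)
    if len(pois) <= 1:
        return node
    left, right = split_in_two(node.pois, axis=0)      # vertical line
    for half in (left, right):
        low, up = split_in_two(half, axis=1)           # horizontal line
        for part in (low, up):
            if part:                                   # skip empty regions
                node.children.append(build_mrt(part, level + 1))
    return node


def leaves(node: MRNode) -> List[MRNode]:
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# 17 synthetic POIs with distinct coordinates, mirroring the size of Example 1.
pois = [((7 * i) % 17, (11 * i) % 17) for i in range(17)]
root = build_mrt(pois)
print(sorted(len(child.pois) for child in root.children))  # [4, 4, 4, 5]
```

With 17 POIs the sketch yields second-level regions of sizes 4, 4, 4, and 5, sixteen third-level regions, and two fourth-level leaves, which matches the region counts in Example 1 even though the synthetic coordinates differ from Figure 2.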

5.2. Map Region (MR) Encoding

As shown in Figure 4, an MR $M$ can be represented by the minimum point $p_{\min} = (x_{\min}, y_{\min}) \in M$ and the maximum point $p_{\max} = (x_{\max}, y_{\max}) \in M$. Therefore, the encoding of $M$ can be represented by the encodings of $x_{\min}$, $y_{\min}$, $x_{\max}$, and $y_{\max}$. The encoding process of $x_{\min}$, $y_{\min}$, $x_{\max}$, and $y_{\max}$ is as follows.
Firstly, the DO encodes $x_{\min}$, $y_{\min}$, $x_{\max}$, and $y_{\max}$ to $t$-length binary strings ($t$ is a large-enough positive integer), denoted by $B(x_{\min})$, $B(y_{\min})$, $B(x_{\max})$, and $B(y_{\max})$, respectively.
Secondly, the DO generates random numbers $r_{x_{\min}}$, $r_{y_{\min}}$, $r_{x_{\max}}$, and $r_{y_{\max}}$ for $x_{\min}$, $y_{\min}$, $x_{\max}$, and $y_{\max}$, respectively, and encodes these random numbers to $t$-length binary strings, denoted by $B(r_{x_{\min}})$, $B(r_{y_{\min}})$, $B(r_{x_{\max}})$, and $B(r_{y_{\max}})$, respectively. These random numbers are used to randomize the boundary information $x_{\min}$, $y_{\min}$, $x_{\max}$, and $y_{\max}$ of the MR while ensuring comparability between them.
Thirdly, the DO concatenates the above binary strings to obtain $B(x_{\min}) \| B(r_{x_{\min}})$, $B(y_{\min}) \| B(r_{y_{\min}})$, $B(x_{\max}) \| B(r_{x_{\max}})$, and $B(y_{\max}) \| B(r_{y_{\max}})$, where $\|$ denotes the concatenation operation. Note that this processing still preserves the ordering relationship between the data, because the boundary value occupies the high-order bits of each concatenated string. For example, as $x_{\max} > x_{\min}$ and $y_{\max} > y_{\min}$, we have $B(x_{\max}) \| B(r_{x_{\max}}) > B(x_{\min}) \| B(r_{x_{\min}})$ and $B(y_{\max}) \| B(r_{y_{\max}}) > B(y_{\min}) \| B(r_{y_{\min}})$.
Finally, the DO calculates the 0-encodings of $B(x_{\min}) \| B(r_{x_{\min}})$ and $B(y_{\min}) \| B(r_{y_{\min}})$, denoted by $S_{x_{\min}}^0$ and $S_{y_{\min}}^0$, respectively. The DO calculates the 1-encodings of $B(x_{\max}) \| B(r_{x_{\max}})$ and $B(y_{\max}) \| B(r_{y_{\max}})$, denoted by $S_{x_{\max}}^1$ and $S_{y_{\max}}^1$, respectively. The DO uses $S_{x_{\min}}^0$, $S_{y_{\min}}^0$, $S_{x_{\max}}^1$, and $S_{y_{\max}}^1$ as the MR encoding of $M$ in Figure 4.
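The order-preserving effect of appending a random suffix can be checked with a short Python sketch. The helper names and the sample values are illustrative assumptions; the sketch covers only the randomization step, not the subsequent 0–1 encoding.

```python
import secrets


def to_bits(value: int, t: int) -> str:
    """Return the t-length binary string B(value)."""
    if not 0 <= value < 2 ** t:
        raise ValueError("value does not fit in t bits")
    return format(value, f"0{t}b")


def randomized_boundary(value: int, t: int) -> str:
    """Return B(value) || B(r) for a fresh random t-bit number r."""
    r = secrets.randbelow(2 ** t)
    return to_bits(value, t) + to_bits(r, t)


t = 8
x_min, x_max = 37, 120            # hypothetical boundary values
low = randomized_boundary(x_min, t)
high = randomized_boundary(x_max, t)

# The boundary value fills the high-order bits, so the random suffix
# cannot reverse the order of two different boundary values.
print(int(high, 2) > int(low, 2))  # True
```

Two equal boundary values, by contrast, are ordered by their random suffixes, which is why the same boundary does not always produce the same encoding.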

5.3. Efficient and Verifiable Range Query Scheme (EVRQ)

5.3.1. System Setup

Firstly, the DO randomly chooses a number as the key $k$. Then, the DO sets a security parameter $\lambda$ and executes the algorithm $KeyGen(1^{\lambda})$ in [32], which outputs a secret key $sk_{ACC}$ and a public key $pk_{ACC}$. Next, the DO sets four tagging parameters $tag_0$, $tag_1$, $tag_2$, and $tag_3$ and selects $w$ hash keys $k_1, k_2, \ldots, k_w$ used for the hash function in a Bloom filter. Finally, the DO keeps $sk_{ACC}$ secret, sends the tagging parameters $tag_0$, $tag_1$, $tag_2$, and $tag_3$, the hash keys $k_1, k_2, \ldots, k_w$, and the key $k$ to the DU, and publishes $pk_{ACC}$.

5.3.2. Map Region Tree (MRT) Index Construction

Firstly, we present the method for processing a node in the MRT. Then, we introduce the method for constructing an MRT index.
1. Node Processing in MRT. To facilitate explanation, we assume that $N$ is a node in the MRT, and $N$ is associated with an MR $M$, which is represented by $S_{x_{\min}}^0$, $S_{y_{\min}}^0$, $S_{x_{\max}}^1$, and $S_{y_{\max}}^1$ (see Section 5.2).
The DO converts $S_{x_{\min}}^0$ to $\bar{S}_{x_{\min}}^0 = \{ hash(a, k) \| tag_0 \}$ by using the key $k$ and a secure hash function (such as one from the SHA family), where $a \in S_{x_{\min}}^0$ and $\|$ represents the concatenation operation. Similarly, the DO converts $S_{y_{\min}}^0$, $S_{x_{\max}}^1$, and $S_{y_{\max}}^1$ to $\bar{S}_{y_{\min}}^0 = \{ hash(a, k) \| tag_1 \}$ (where $a \in S_{y_{\min}}^0$), $\bar{S}_{x_{\max}}^1 = \{ hash(a, k) \| tag_2 \}$ (where $a \in S_{x_{\max}}^1$), and $\bar{S}_{y_{\max}}^1 = \{ hash(a, k) \| tag_3 \}$ (where $a \in S_{y_{\max}}^1$), respectively. During the conversion process, the tagging parameters $tag_0$, $tag_1$, $tag_2$, and $tag_3$ are appended to each element in $\bar{S}_{x_{\min}}^0$, $\bar{S}_{y_{\min}}^0$, $\bar{S}_{x_{\max}}^1$, and $\bar{S}_{y_{\max}}^1$, respectively. This ensures that $\bar{S}_{x_{\min}}^0$, $\bar{S}_{y_{\min}}^0$, $\bar{S}_{x_{\max}}^1$, and $\bar{S}_{y_{\max}}^1$ are pairwise disjoint.
The DO computes the union set $S_M = \bar{S}_{x_{\min}}^0 \cup \bar{S}_{y_{\min}}^0 \cup \bar{S}_{x_{\max}}^1 \cup \bar{S}_{y_{\max}}^1$ and executes the algorithm $Setup(S_M, pk_{ACC}, sk_{ACC})$ in [32]. This algorithm outputs the accumulated value of all elements in $S_M$, denoted by $acc(S_M)$.
The DO selects an $l$-length binary string, denoted by $BF_M$ (each bit of $BF_M$ is initialized to 0). For each element $a \in S_M$, the DO calculates the hash values $hash_{BF}(a, k_1), hash_{BF}(a, k_2), \ldots, hash_{BF}(a, k_w)$ and sets the corresponding positions in $BF_M$ to 1.
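The node processing steps can be sketched in Python as follows. Several choices here are assumptions for illustration: HMAC-SHA256 stands in for both $hash$ and $hash_{BF}$, the tags are single bytes, the function names are hypothetical, and the accumulator value $acc(S_M)$ is omitted because it requires the construction in [32].

```python
import hashlib
import hmac
from typing import Dict, List, Set

FIELDS = ["x_min0", "y_min0", "x_max1", "y_max1"]   # order matches tag_0..tag_3


def convert(encoding: Set[str], key: bytes, tag: bytes) -> Set[bytes]:
    """Map each element a to hash(a, k) || tag."""
    return {hmac.new(key, a.encode(), hashlib.sha256).digest() + tag
            for a in encoding}


def build_s_m(mr_encoding: Dict[str, Set[str]], key: bytes,
              tags: List[bytes]) -> Set[bytes]:
    """Union of the four converted sets; the tags keep them disjoint."""
    s_m: Set[bytes] = set()
    for name, tag in zip(FIELDS, tags):
        s_m |= convert(mr_encoding[name], key, tag)
    return s_m


def build_bf(s_m: Set[bytes], hash_keys: List[bytes], l: int) -> List[int]:
    """l-bit Bloom filter over S_M using w keyed hashes."""
    bf = [0] * l
    for a in s_m:
        for k_i in hash_keys:
            digest = hmac.new(k_i, a, hashlib.sha256).digest()
            bf[int.from_bytes(digest, "big") % l] = 1
    return bf


# Toy input: the same two prefixes appear in all four encodings on purpose.
toy = {name: {"1", "01"} for name in FIELDS}
tags = [b"\x00", b"\x01", b"\x02", b"\x03"]
s_m = build_s_m(toy, key=b"secret-k", tags=tags)
bf_m = build_bf(s_m, hash_keys=[b"k1", b"k2"], l=512)
print(len(s_m))  # 8
```

Although the four toy encodings contain identical prefixes, the union holds eight distinct elements, which shows how the tags prevent elements of different boundary sets from colliding.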
2. MRT index Construction. As the CS is considered to be untrusted in our scheme EVRQ, the CS may selectively query only a portion of the data to save computational resources, and even illegally modify the data and the index stored in the CS. To prevent such malicious operations, the DO needs to generate hash values for each node in an MRT index. By using the hash values of the MRT index, the DU can verify that the CS has traversed the entire MRT index and has not made any illegal modifications to the geographical data and the MRT index stored in the CS. The generation of hash values in the MRT index is as follows.
Suppose that $N$ is a leaf node in the MRT, and $N$ is associated with an MR $M$ which contains only one POI, denoted by $s$. Firstly, the DO uses a secure encryption method, such as AES [34], to encrypt $s$ and securely distributes the decryption key to the DU. The ciphertext of $s$ is denoted by $Enc(s)$. Secondly, the DO calculates $S_M$, $acc(S_M)$, and $BF_M$ (see Section 5.3.2). Then, the DO computes the hash value of $N$, which is $h_N = hash_{node}(Enc(s) \oplus S_M \oplus acc(S_M) \oplus BF_M)$ ($\oplus$ represents the XOR operation). Finally, the DO stores $Enc(s)$, $S_M$, $acc(S_M)$, $BF_M$, and $h_N$ in the leaf node $N$.
Suppose that $N$ is an internal node in the MRT, and $N$ is associated with an MR $M$. There are four child nodes $N_1$, $N_2$, $N_3$, and $N_4$ of $N$. The hash values of $N_1$, $N_2$, $N_3$, and $N_4$ are $h_{N_1}$, $h_{N_2}$, $h_{N_3}$, and $h_{N_4}$, respectively. Firstly, the DO calculates $S_M$, $acc(S_M)$, and $BF_M$ (see Section 5.3.2). Then, the DO computes $h_N^{child} = h_{N_1} \oplus h_{N_2} \oplus h_{N_3} \oplus h_{N_4}$ and computes $h_N = hash_{node}(h_N^{child} \oplus S_M \oplus acc(S_M) \oplus BF_M)$. Finally, the DO stores $h_N^{child}$, $acc(S_M)$, $BF_M$, and $h_N$ in the internal node $N$.
Note that some internal nodes may not have four child nodes. In such cases, the DO only needs to use the hash values of all the child nodes of the internal node to compute the hash value of the internal node. For example, as shown in Figure 3, the internal node $N_A$ has only two child nodes (leaf nodes) $N_B$ and $N_C$. $N_A$ is associated with MR $M_{11}^3$. In this case, the DO computes the hash values $h_{N_A}^{child} = h_{N_B} \oplus h_{N_C}$ and $h_{N_A} = hash_{node}(h_{N_A}^{child} \oplus S_{M_{11}^3} \oplus acc(S_{M_{11}^3}) \oplus BF_{M_{11}^3})$. Then, the DO stores $h_{N_A}^{child}$, $S_{M_{11}^3}$, $acc(S_{M_{11}^3})$, $BF_{M_{11}^3}$, and $h_{N_A}$ in the internal node $N_A$.
After the DO adds the aforementioned information to each MRT node, the MRT index is constructed.
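The node hash generation can be sketched as below. The text does not specify how operands of different lengths are aligned before the XOR, so this sketch assumes that each field is first reduced to a 32-byte SHA-256 digest; that choice, the sorted serialization of $S_M$, and the helper names (`digest`, `xor_all`, `leaf_hash`, `internal_hash`) are illustrative assumptions rather than part of the scheme.

```python
import hashlib
from functools import reduce


def digest(data: bytes) -> bytes:
    """hash_node, instantiated here with SHA-256 (an assumption)."""
    return hashlib.sha256(data).digest()


def xor_all(parts):
    """XOR a non-empty list of equal-length byte strings."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), parts)


def serialize_set(s_m) -> bytes:
    """Deterministic encoding of S_M (sorted concatenation, an assumption)."""
    return b"".join(sorted(s_m))


def leaf_hash(enc_s: bytes, s_m, acc: bytes, bf: bytes) -> bytes:
    """h_N = hash_node(Enc(s) XOR S_M XOR acc(S_M) XOR BF_M) for a leaf node."""
    fields = [digest(enc_s), digest(serialize_set(s_m)), digest(acc), digest(bf)]
    return digest(xor_all(fields))


def internal_hash(child_hashes, s_m, acc: bytes, bf: bytes):
    """Return (h_N^child, h_N) for an internal node with any number of children."""
    h_child = xor_all(list(child_hashes))
    fields = [h_child, digest(serialize_set(s_m)), digest(acc), digest(bf)]
    return h_child, digest(xor_all(fields))
```

Since `internal_hash` accepts a list of child hashes of any length, the same function covers internal nodes with fewer than four children, such as $N_A$ in the example above.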

5.3.3. Query Token Generation

Firstly, for a queried range $Q$, the DU represents $Q$ by using the minimum point $q_{\min} = (u_{\min}, v_{\min}) \in Q$ and the maximum point $q_{\max} = (u_{\max}, v_{\max}) \in Q$. The DU converts $u_{\min}$, $v_{\min}$, $u_{\max}$, and $v_{\max}$ to $t$-length binary strings, denoted by $B(u_{\min})$, $B(v_{\min})$, $B(u_{\max})$, and $B(v_{\max})$, respectively. The DU then randomly chooses numbers $r_{u_{\min}}$, $r_{v_{\min}}$, $r_{u_{\max}}$, and $r_{v_{\max}}$ for $u_{\min}$, $v_{\min}$, $u_{\max}$, and $v_{\max}$, respectively, and also converts these random numbers to $t$-length binary strings, denoted by $B(r_{u_{\min}})$, $B(r_{v_{\min}})$, $B(r_{u_{\max}})$, and $B(r_{v_{\max}})$, respectively. Then, the DU calculates the 0-encodings of $B(u_{\min}) \,\|\, B(r_{u_{\min}})$ and $B(v_{\min}) \,\|\, B(r_{v_{\min}})$, denoted by $S_{u_{\min}}^{0}$ and $S_{v_{\min}}^{0}$, respectively. The DU calculates the 1-encodings of $B(u_{\max}) \,\|\, B(r_{u_{\max}})$ and $B(v_{\max}) \,\|\, B(r_{v_{\max}})$, denoted by $S_{u_{\max}}^{1}$ and $S_{v_{\max}}^{1}$, respectively. Thus, the encoding of $Q$ is represented by $S_{u_{\min}}^{0}$, $S_{v_{\min}}^{0}$, $S_{u_{\max}}^{1}$, and $S_{v_{\max}}^{1}$.
Next, to determine whether the queried range $Q$ intersects with an MR $M$, the CS should determine whether $\bar{S}_{u_{\min}}^{0}$ intersects with $\bar{S}_{x_{\max}}^{1}$, $\bar{S}_{v_{\min}}^{0}$ intersects with $\bar{S}_{y_{\max}}^{1}$, $\bar{S}_{u_{\max}}^{1}$ intersects with $\bar{S}_{x_{\min}}^{0}$, and $\bar{S}_{v_{\max}}^{1}$ intersects with $\bar{S}_{y_{\min}}^{0}$ (see Section 5.3.4). Thus, the DU calculates $\bar{S}_{u_{\min}}^{0} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_2\}$ (where $a \in S_{u_{\min}}^{0}$), $\bar{S}_{v_{\min}}^{0} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_3\}$ (where $a \in S_{v_{\min}}^{0}$), $\bar{S}_{u_{\max}}^{1} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_0\}$ (where $a \in S_{u_{\max}}^{1}$), and $\bar{S}_{v_{\max}}^{1} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_1\}$ (where $a \in S_{v_{\max}}^{1}$).
Finally, the DU executes the algorithm $\mathit{Setup}(\bar{S}_{u_{\min}}^{0}, pk_{ACC}, sk_{ACC})$ in [32], which outputs the accumulated value of all elements in $\bar{S}_{u_{\min}}^{0}$, denoted by $\mathit{acc}(\bar{S}_{u_{\min}}^{0})$. Using the same method, the DU obtains the accumulated values of all elements in $\bar{S}_{v_{\min}}^{0}$, $\bar{S}_{u_{\max}}^{1}$, and $\bar{S}_{v_{\max}}^{1}$, denoted by $\mathit{acc}(\bar{S}_{v_{\min}}^{0})$, $\mathit{acc}(\bar{S}_{u_{\max}}^{1})$, and $\mathit{acc}(\bar{S}_{v_{\max}}^{1})$, respectively. The DU sends $\bar{S}_{u_{\min}}^{0}$, $\bar{S}_{v_{\min}}^{0}$, $\bar{S}_{u_{\max}}^{1}$, $\bar{S}_{v_{\max}}^{1}$, and their corresponding accumulated values $\mathit{acc}(\bar{S}_{u_{\min}}^{0})$, $\mathit{acc}(\bar{S}_{v_{\min}}^{0})$, $\mathit{acc}(\bar{S}_{u_{\max}}^{1})$, and $\mathit{acc}(\bar{S}_{v_{\max}}^{1})$ as the query token to the CS.
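A minimal sketch of the encoding and tagging part of token generation is given below. It assumes the standard 0–1 encoding (for a bit string written most significant bit first, the 0-encoding collects, for every 0 bit, the preceding prefix followed by a 1, and the 1-encoding collects every prefix that ends in a 1), uses HMAC-SHA256 as a stand-in for $\mathit{hash}(a, k)$, and omits the accumulator values. All function names and the dictionary layout of the token are hypothetical.

```python
import hashlib
import hmac
import secrets


def to_bits(value: int, t: int) -> str:
    """B(value): t-length binary string, most significant bit first (value < 2**t)."""
    return format(value, f"0{t}b")


def zero_encoding(bits: str):
    """0-encoding: for every 0 bit, the prefix before it followed by '1'."""
    return {bits[:i] + "1" for i, b in enumerate(bits) if b == "0"}


def one_encoding(bits: str):
    """1-encoding: every prefix that ends in a 1 bit."""
    return {bits[: i + 1] for i, b in enumerate(bits) if b == "1"}


def blind(value: int, t: int) -> str:
    """B(value) || B(r) for a fresh random t-bit number r."""
    return to_bits(value, t) + to_bits(secrets.randbits(t), t)


def hash_and_tag(encoding, key: bytes, tag: bytes):
    """Map each element a to hash(a, k) || tag (HMAC-SHA256 is an assumption)."""
    return {hmac.new(key, a.encode(), hashlib.sha256).digest() + tag for a in encoding}


def make_token(u_min, v_min, u_max, v_max, t, key):
    """Tagged sets of the query token; tags mirror the sets they are compared with."""
    return {
        "u_min": hash_and_tag(zero_encoding(blind(u_min, t)), key, b"\x02"),  # vs x_max
        "v_min": hash_and_tag(zero_encoding(blind(v_min, t)), key, b"\x03"),  # vs y_max
        "u_max": hash_and_tag(one_encoding(blind(u_max, t)), key, b"\x00"),   # vs x_min
        "v_max": hash_and_tag(one_encoding(blind(v_max, t)), key, b"\x01"),   # vs y_min
    }
```

The property that makes the encoding useful is that $x > y$ holds exactly when the 1-encoding of $x$ and the 0-encoding of $y$ share an element, so a numeric comparison becomes a set intersection test.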

5.3.4. Range Query

In this section, we first transform the operation of determining the positional relationship between $Q$ and $M$ into the operation of comparing two values in $Q$ and $M$. Then, we utilize the 0–1 encoding technique to convert the operation of comparing the values into the operation of determining whether two different sets intersect. We add the tags $\mathit{tag}_0$, $\mathit{tag}_1$, $\mathit{tag}_2$, and $\mathit{tag}_3$ to the elements in these sets. This allows the CS to find the matching sets and then correctly perform the set intersection operations.
As shown in Figure 4, the MR $M$ of a node $N$ can be represented by the minimum point $p_{\min} = (x_{\min}, y_{\min})$ and the maximum point $p_{\max} = (x_{\max}, y_{\max})$. Similarly, the queried range $Q$ can be represented by the minimum point $q_{\min} = (u_{\min}, v_{\min})$ and the maximum point $q_{\max} = (u_{\max}, v_{\max})$. As shown in Figure 5 (1)–(4), if $Q \cap M = \emptyset$, there are four possible positional relationships between $Q$ and $M$: Figure 5 (1) shows that $Q$ is positioned to the right of $M$, which indicates $x_{\max} < u_{\min}$; Figure 5 (2) shows that $Q$ is positioned to the left of $M$, which indicates $u_{\max} < x_{\min}$; Figure 5 (3) shows that $Q$ is positioned above $M$, which indicates $v_{\max} < y_{\min}$; Figure 5 (4) shows that $Q$ is positioned below $M$, which indicates $y_{\max} < v_{\min}$.
According to 0–1 encoding, if $x_{\max} < u_{\min}$, then $S_{x_{\max}}^{1} \cap S_{u_{\min}}^{0} = \emptyset$. Note that $\bar{S}_{x_{\max}}^{1} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_2\}$ (where $a \in S_{x_{\max}}^{1}$) and $\bar{S}_{u_{\min}}^{0} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_2\}$ (where $a \in S_{u_{\min}}^{0}$). Thus, if $x_{\max} < u_{\min}$, then $\bar{S}_{x_{\max}}^{1} \cap \bar{S}_{u_{\min}}^{0} = \emptyset$. Note that $\bar{S}_{x_{\min}}^{0} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_0\}$ (where $a \in S_{x_{\min}}^{0}$), $\bar{S}_{y_{\min}}^{0} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_1\}$ (where $a \in S_{y_{\min}}^{0}$), and $\bar{S}_{y_{\max}}^{1} = \{\mathit{hash}(a, k) \,\|\, \mathit{tag}_3\}$ (where $a \in S_{y_{\max}}^{1}$). As the tagging parameters are different, no element in $\bar{S}_{x_{\min}}^{0}$, $\bar{S}_{y_{\min}}^{0}$, or $\bar{S}_{y_{\max}}^{1}$ is the same as any element in $\bar{S}_{u_{\min}}^{0}$. It follows that $\bar{S}_{x_{\min}}^{0} \cap \bar{S}_{u_{\min}}^{0} = \emptyset$, $\bar{S}_{y_{\min}}^{0} \cap \bar{S}_{u_{\min}}^{0} = \emptyset$, and $\bar{S}_{y_{\max}}^{1} \cap \bar{S}_{u_{\min}}^{0} = \emptyset$. Thus, we conclude that if $x_{\max} < u_{\min}$, then $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, where $S_M = \bar{S}_{x_{\min}}^{0} \cup \bar{S}_{y_{\min}}^{0} \cup \bar{S}_{x_{\max}}^{1} \cup \bar{S}_{y_{\max}}^{1}$. According to the above analysis, we have the following conclusions.
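  • $x_{\max} < u_{\min} \Rightarrow S_{u_{\min}}^{0} \cap S_{x_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{u_{\min}}^{0} \cap \bar{S}_{x_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$ (corresponds to the positional relationship in Figure 5 (1));
  • $u_{\max} < x_{\min} \Rightarrow S_{u_{\max}}^{1} \cap S_{x_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{u_{\max}}^{1} \cap \bar{S}_{x_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$ (corresponds to the positional relationship in Figure 5 (2));
  • $v_{\max} < y_{\min} \Rightarrow S_{v_{\max}}^{1} \cap S_{y_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{v_{\max}}^{1} \cap \bar{S}_{y_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$ (corresponds to the positional relationship in Figure 5 (3));
  • $y_{\max} < v_{\min} \Rightarrow S_{v_{\min}}^{0} \cap S_{y_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{v_{\min}}^{0} \cap \bar{S}_{y_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$ (corresponds to the positional relationship in Figure 5 (4)), where $S_M = \bar{S}_{x_{\min}}^{0} \cup \bar{S}_{y_{\min}}^{0} \cup \bar{S}_{x_{\max}}^{1} \cup \bar{S}_{y_{\max}}^{1}$.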
Therefore, if the CS determines that $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, $\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$, or $\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$, then $Q \cap M = \emptyset$.
The CS extracts $\bar{S}_{u_{\min}}^{0}$, $\bar{S}_{v_{\min}}^{0}$, $\bar{S}_{u_{\max}}^{1}$, and $\bar{S}_{v_{\max}}^{1}$ from the query token of $Q$ and extracts $S_M$ from the node $N$ in the MRT index. If $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, $\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$, or $\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$, then the CS determines $Q \cap M = \emptyset$ and stops querying the descendant nodes of $N$ in the MRT index. Otherwise, the CS continues querying the descendant nodes of $N$ in the MRT index. The CS performs this check from the root node down to the leaf nodes of the MRT index. Finally, it retrieves the encrypted POIs in the leaf nodes that meet the query request of the DU. These retrieved ciphertexts of POIs are then returned to the DU as the query results.
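As a small worked illustration of the first implication, take 4-bit values and ignore the random suffixes and the hashing and tagging steps (both simplifications are assumptions made only for this example). With the standard 0–1 encoding, let $x_{\max} = 3$ and $u_{\min} = 5$:

$$
\begin{aligned}
B(3) = 0011 &: \quad S_{0011}^{1} = \{001,\ 0011\}, \quad S_{0011}^{0} = \{1,\ 01\},\\
B(5) = 0101 &: \quad S_{0101}^{1} = \{01,\ 0101\}, \quad S_{0101}^{0} = \{1,\ 011\},\\
S_{0011}^{1} \cap S_{0101}^{0} &= \emptyset \quad (\text{consistent with } 3 < 5),\\
S_{0101}^{1} \cap S_{0011}^{0} &= \{01\} \neq \emptyset \quad (\text{consistent with } 5 > 3).
\end{aligned}
$$

The first intersection is the one tested for Figure 5 (1): it is empty because $x_{\max} < u_{\min}$, so the MR lies entirely to the left of the queried range in this toy case.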
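The top–down pruning performed by the CS can be sketched as follows. To keep the example deterministic and short, it leaves out the random suffixes, the keyed hash, and the accumulator, and tags plain encoding strings instead; these simplifications, the `Node` class, and all function names are assumptions for illustration. In this simplified form, a query boundary that is exactly equal to an MR boundary is treated as non-overlapping.

```python
from dataclasses import dataclass, field


def to_bits(value: int, t: int) -> str:
    return format(value, f"0{t}b")


def zero_encoding(bits: str):
    return {bits[:i] + "1" for i, b in enumerate(bits) if b == "0"}


def one_encoding(bits: str):
    return {bits[: i + 1] for i, b in enumerate(bits) if b == "1"}


def tagged(encoding, tag: int):
    """Stand-in for hash(a, k) || tag: the keyed hash is omitted in this sketch."""
    return {f"{a}|{tag}" for a in encoding}


def region_set(x_min, y_min, x_max, y_max, t):
    """S_M: union of the four tagged encodings of an MR."""
    return (tagged(zero_encoding(to_bits(x_min, t)), 0)
            | tagged(zero_encoding(to_bits(y_min, t)), 1)
            | tagged(one_encoding(to_bits(x_max, t)), 2)
            | tagged(one_encoding(to_bits(y_max, t)), 3))


def query_token(u_min, v_min, u_max, v_max, t):
    """Four tagged sets of Q; each tag matches the MR set it is compared with."""
    return [tagged(zero_encoding(to_bits(u_min, t)), 2),
            tagged(zero_encoding(to_bits(v_min, t)), 3),
            tagged(one_encoding(to_bits(u_max, t)), 0),
            tagged(one_encoding(to_bits(v_max, t)), 1)]


@dataclass
class Node:
    s_m: set
    children: list = field(default_factory=list)
    enc_poi: bytes = b""  # ciphertext stored in a leaf


def is_pruned(token, s_m) -> bool:
    """Q and M do not intersect if any token set is disjoint from S_M."""
    return any(part.isdisjoint(s_m) for part in token)


def range_query(node: Node, token):
    """Traverse from the root, skipping subtrees whose MR misses the query."""
    if is_pruned(token, node.s_m):
        return []
    if not node.children:
        return [node.enc_poi]
    return [ct for child in node.children for ct in range_query(child, token)]
```

For instance, with `t = 4`, a root MR covering `(1, 1)` to `(12, 12)` and two leaves covering `(1, 1)`–`(3, 3)` and `(9, 9)`–`(12, 12)`, the query `(2, 2)`–`(5, 5)` descends only into the first leaf, because the second leaf fails the $u_{\max}$ versus $x_{\min}$ test.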

5.3.5. Verification of Correctness and Entirety in the Query Process

To verify the correctness and entirety in the query process, the DU can request relevant information from the CS.
Suppose $N$ is a node in the MRT index, which is associated with an MR $M$. In the query process, if the CS determines $Q \cap M = \emptyset$, the DU can request the CS to provide a proof as follows.
The CS extracts the binary string $BF_M$ from the node $N$ and sends it to the DU. Upon receiving $BF_M$, the DU takes each element $a$ of $\bar{S}_{u_{\min}}^{0}$ in the query token and calculates the hash values $\mathit{hash}_{BF}(a, k_1)$, $\mathit{hash}_{BF}(a, k_2)$, …, $\mathit{hash}_{BF}(a, k_w)$. Then, the DU checks whether the bits of $BF_M$ at all of these positions are 1. If no element $a$ passes this check, then $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, and the DU can verify $Q \cap M = \emptyset$. Similarly, the DU can use $BF_M$ to determine whether $\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, or $\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$, and then verify $Q \cap M = \emptyset$. Otherwise, $BF_M$ fails to verify $Q \cap M = \emptyset$. This could be due to false positive errors in the Bloom filter, although the probability of such false positives is very low [33]. In that case, the DU requests further information from the CS to verify $Q \cap M = \emptyset$. The CS extracts $\bar{S}_{u_{\min}}^{0}$ from the query token and $S_M$ from the node $N$. Then, the CS executes the algorithm $\mathit{ProveDisjoint}(\bar{S}_{u_{\min}}^{0}, S_M, pk)$ in [32] and obtains the proof $\pi_{u_{\min}}$ that represents the relationship between $\bar{S}_{u_{\min}}^{0}$ and $S_M$. Similarly, the CS can obtain the proof $\pi_{v_{\min}}$ that represents the relationship between $\bar{S}_{v_{\min}}^{0}$ and $S_M$, the proof $\pi_{u_{\max}}$ that represents the relationship between $\bar{S}_{u_{\max}}^{1}$ and $S_M$, and the proof $\pi_{v_{\max}}$ that represents the relationship between $\bar{S}_{v_{\max}}^{1}$ and $S_M$. The CS sends $\pi_{u_{\min}}$, $\pi_{v_{\min}}$, $\pi_{u_{\max}}$, $\pi_{v_{\max}}$, and $\mathit{acc}(S_M)$ to the DU. The DU executes the algorithm $\mathit{VerifyDisjoint}(\mathit{acc}(\bar{S}_{u_{\min}}^{0}), \mathit{acc}(S_M), \pi_{u_{\min}}, pk)$ in [32] to verify $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$. Similarly, the DU executes the algorithm $\mathit{VerifyDisjoint}$ to verify $\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, and $\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$. If any of the equations $\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$, $\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, or $\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$ is satisfied, then $Q \cap M = \emptyset$. Finally, the DU can determine whether the CS has correctly executed the query process.
Note that a Bloom filter is adopted in the above verification process. The Bloom filter provides fast verification and relatively high accuracy when proving that $Q \cap M = \emptyset$. However, because the Bloom filter has a small probability of false positive errors, a cryptographic multiset accumulator is also adopted in our scheme EVRQ. By combining the Bloom filter and the cryptographic multiset accumulator, EVRQ ensures the efficiency of the verification and the correctness of the query process.
Suppose (1) $\mathit{Enc}(s_1)$ is the ciphertext of the POI $s_1$ queried by the DU; (2) $s_1$ is located in the MR $M_1$ that is associated with the leaf node $N_1$; (3) $N_1$, $N_2$, $N_3$, and $N_4$ are sibling nodes; (4) $N$ is the parent node of $N_1$, $N_2$, $N_3$, and $N_4$; (5) the POIs of $N_2$, $N_3$, and $N_4$ are not covered by the queried range $Q$. The DU requests $\mathit{Enc}(s_1)$, $S_{M_1}$, $\mathit{acc}(S_{M_1})$, $BF_{M_1}$, and $h_{N_1}$ of the node $N_1$ from the CS. By verifying whether $h_{N_1}$ is equal to $\mathit{hash}_{node}(\mathit{Enc}(s_1) \oplus S_{M_1} \oplus \mathit{acc}(S_{M_1}) \oplus BF_{M_1})$, the DU can determine whether the CS has illegally modified the information of $N_1$ or provided incorrect information to the DU. Then, the DU requests the hash values $h_{N_2}$, $h_{N_3}$, and $h_{N_4}$ stored in $N_2$, $N_3$, and $N_4$, respectively, and $S_M$, $\mathit{acc}(S_M)$, and $BF_M$ stored in $N$. Next, the DU calculates the hash value $h_N = \mathit{hash}_{node}(h_{N_1} \oplus h_{N_2} \oplus h_{N_3} \oplus h_{N_4} \oplus S_M \oplus \mathit{acc}(S_M) \oplus BF_M)$. Applying this method iteratively in a bottom–up manner, the DU calculates the hash values of the ancestor nodes of $N_1$. Finally, the DU tests whether the computed hash value of the root node is equal to the hash value stored in the root node of the MRT index. If they are equal, it indicates that the CS has honestly provided all the information and has executed the range query over the entire MRT index.
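The Bloom filter part of this check can be sketched as below. HMAC-SHA256 reduced modulo the filter length stands in for $\mathit{hash}_{BF}$, and the function names are hypothetical; the accumulator fallback ($\mathit{ProveDisjoint}$ and $\mathit{VerifyDisjoint}$ from [32]) is not implemented here and is only signalled by a `False` return value.

```python
import hashlib
import hmac


def bloom_positions(element: bytes, bf_keys, length: int):
    """Positions hash_BF(a, k_1), ..., hash_BF(a, k_w), reduced modulo the length."""
    return [int.from_bytes(hmac.new(k, element, hashlib.sha256).digest(), "big") % length
            for k in bf_keys]


def build_bloom(elements, bf_keys, length: int):
    """Filter BF_M as built by the DO over the elements of S_M."""
    bf = [0] * length
    for a in elements:
        for pos in bloom_positions(a, bf_keys, length):
            bf[pos] = 1
    return bf


def maybe_member(bf, element: bytes, bf_keys) -> bool:
    """True if all w positions are 1 (the element may be in S_M, or a false positive)."""
    return all(bf[p] == 1 for p in bloom_positions(element, bf_keys, len(bf)))


def bloom_proves_disjoint(bf, token_part, bf_keys) -> bool:
    """True only if no element of the token set passes the membership test."""
    return not any(maybe_member(bf, a, bf_keys) for a in token_part)


def verify_no_overlap(bf, token, bf_keys) -> bool:
    """True if some token set is provably disjoint from S_M, which confirms that
    Q and M do not intersect. False means the Bloom filter is inconclusive and
    the DU would request accumulator proofs instead."""
    return any(bloom_proves_disjoint(bf, part, bf_keys) for part in token)
```

A Bloom filter never produces false negatives, so a `True` result from `bloom_proves_disjoint` is conclusive, while a `False` result may be caused by a false positive and therefore triggers the slower accumulator-based check.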
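For orientation, the false positive probability mentioned above can be estimated with the standard Bloom filter approximation. Here $l$ is the filter length and $w$ the number of hash functions, as in the construction of $BF$, while $n = |S_M|$ is notation introduced only for this estimate, and the hash functions are assumed to behave independently and uniformly:

$$
p \approx \left(1 - e^{-wn/l}\right)^{w}.
$$

For example, with the hypothetical parameters $l = 1024$, $w = 3$, and $n = 40$, the estimate gives $p \approx (1 - e^{-0.117})^{3} \approx 1.4 \times 10^{-3}$ per tested element, so the accumulator-based check would be needed only occasionally.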
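The bottom–up recomputation can be sketched as follows. As the text does not fix how fields of different lengths are XORed, the sketch assumes that every field is first reduced to a SHA-256 digest; this choice, the sorted serialization of $S_M$, and the names `node_hash` and `verify_path` are illustrative assumptions.

```python
import hashlib
from functools import reduce


def digest(data: bytes) -> bytes:
    """hash_node, instantiated here with SHA-256 (an assumption)."""
    return hashlib.sha256(data).digest()


def xor_all(parts):
    """XOR a non-empty list of equal-length byte strings."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), parts)


def node_hash(first: bytes, s_m, acc: bytes, bf: bytes) -> bytes:
    """hash_node(first XOR S_M XOR acc(S_M) XOR BF_M), where `first` is the
    digest of Enc(s) for a leaf or the XOR of the child hashes otherwise."""
    fields = [first, digest(b"".join(sorted(s_m))), digest(acc), digest(bf)]
    return digest(xor_all(fields))


def verify_path(enc_s: bytes, leaf_info, path, root_hash: bytes) -> bool:
    """Recompute hashes from a returned leaf up to the root.

    leaf_info: (S_M, acc(S_M), BF_M) of the leaf.
    path: one (sibling_hashes, S_M, acc(S_M), BF_M) tuple per ancestor,
          ordered from the parent of the leaf up to the root.
    """
    h = node_hash(digest(enc_s), *leaf_info)
    for sibling_hashes, s_m, acc, bf in path:
        h_child = xor_all([h] + list(sibling_hashes))
        h = node_hash(h_child, s_m, acc, bf)
    return h == root_hash
```

If the CS alters the returned ciphertext or any field along the path, the recomputed value no longer matches the root hash, which is what allows the DU to detect the modification.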

6. Experiments

In our experiments, we compared our scheme EVRQ with the MDOPE scheme [27]. Our scheme EVRQ is constructed based on a map region tree, 0–1 encoding, hash function, Bloom filter, and cryptographic multiset accumulator. The computations involving bilinear groups in a cryptographic multiset accumulator [32] were implemented using the Java Pairing Based Cryptography Library 2.0.0 [35]. The MDOPE scheme is constructed based on the network data structure, prefix encoding, and Bloom filter. Our experiments were conducted on a Windows 10 computer equipped with an AMD Ryzen 5 2500U CPU and 8 GB of Random Access Memory (RAM). We used the control variables method to test the efficiency of EVRQ and MDOPE on two properties: the volume of the experimental data and the scale of the queried range. As the EVRQ scheme is not affected by the distribution density of geographical data, to ensure a fair and comprehensive comparison between the EVRQ scheme and the MDOPE scheme, we generated data sets of different sizes in which geographical data are randomly and evenly distributed. These data sets consisted of $2^{6}$, $2^{8}$, $2^{10}$, $2^{12}$, and $2^{14}$ distinct geographical data items, respectively. The geographical data in these data sets were evenly distributed within the map regions $[1, 2^{3}] \times [1, 2^{3}]$, $[1, 2^{4}] \times [1, 2^{4}]$, $[1, 2^{5}] \times [1, 2^{5}]$, $[1, 2^{6}] \times [1, 2^{6}]$, and $[1, 2^{7}] \times [1, 2^{7}]$, respectively. In the subsequent experiments, we executed each algorithm hundreds of times and calculated the average runtime.

6.1. Index Construction

As shown in Figure 6, as the number of geographical data items grows exponentially from $2^{6}$ to $2^{14}$, both the EVRQ scheme and the MDOPE scheme require larger indexes to accommodate these data, leading to an exponential increase in the number of index nodes. Consequently, the index generation time of both schemes also increases exponentially. In the EVRQ scheme, it is necessary to generate additional information for each index node to verify the correctness and integrity of the query process. The additional information includes the hash value of each index node and the cryptographic accumulator value of the map region of each index node. The hash values must be generated from the leaf nodes to the root node in a bottom–up manner, and the cryptographic accumulator value involves complex elliptic curve operations. Consequently, index generation in the EVRQ scheme is relatively slow. In contrast, the MDOPE scheme does not support the verification of query process correctness and integrity. Therefore, the MDOPE scheme only needs to compute and store binary strings generated by a Bloom filter that supports ciphertext comparison operations. As a result, the MDOPE scheme is faster than the EVRQ scheme in terms of index generation.
Note that in cloud-based LBS applications, the index generation process is completed prior to serving the data user and only needs to be generated once for permanent usage. Therefore, the index generation process does not have any negative impact on the data user’s experience.

6.2. Token Generation

In our experiments, the queried ranges were randomly generated in the ranges of $[1, 2^{3}] \times [1, 2^{3}]$, $[1, 2^{4}] \times [1, 2^{4}]$, $[1, 2^{5}] \times [1, 2^{5}]$, $[1, 2^{6}] \times [1, 2^{6}]$, and $[1, 2^{7}] \times [1, 2^{7}]$, respectively. Thus, the average length of the encodings that represent these queried ranges increases from 3 bits to 7 bits. As shown in Figure 7, query token generation in the EVRQ scheme remains very efficient as the average length of encodings increases. However, in the MDOPE scheme, query token generation is very slow, and its time increases exponentially. In the EVRQ scheme, generating a query token only requires the binary strings and hash values of two points of the queried range, which makes it very efficient. In contrast, in the MDOPE scheme, a queried range is represented by a large number of binary strings, which then need to be merged pairwise into a small number of combined binary strings. Due to the enormous number of binary strings that need to be merged, query token generation becomes highly inefficient. Additionally, as the queried range expands in size, the number of binary strings to be merged grows exponentially, leading to an exponential increase in the query token generation time of the MDOPE scheme.

6.3. Range Query

As shown in Figure 8, the time for a range query increases with the height of the index. This is because, as the height of the index increases, the number of index nodes that need to be compared with the query token also grows. In the EVRQ scheme, a range query is implemented by determining whether several sets of hash values intersect. In the MDOPE scheme, a range query is implemented by checking whether several sets of hash values match a binary string generated by a Bloom filter. Both of these operations, used to determine whether a queried range matches an index node, are highly efficient. As a result, both the EVRQ and MDOPE schemes exhibit highly efficient query performance.

6.4. Verification

As the MDOPE scheme does not support verifying the correctness and integrity of the query process, we only conducted tests on the EVRQ scheme. Figure 9 demonstrates the efficiency of the correctness verification for the query process. As the index height increases, the size of the index also correspondingly increases. However, since the average size of the randomly generated query ranges also increases, the number of index nodes that need to be verified does not change significantly. This results in the consistent efficiency of the correctness verification for the query process. In the correctness verification of the query process, the Bloom filter is very efficient and exhibits a high accuracy rate. It efficiently verifies the majority of MRT index nodes that do not intersect with the queried range. In rare cases where the Bloom filter experiences false positive errors, a cryptographic accumulator is used to verify a small number of MRT index nodes that do not intersect with the queried range. Therefore, the EVRQ scheme ensures both the efficiency and accuracy of the correctness verification for the query process.
Figure 10 shows the efficiency of the integrity verification for the query process. In order to verify the integrity of the query process, a data user needs to calculate the hash values of the leaf nodes that hold the retrieved query results. Then, in a bottom–up manner, the data user calculates the hash values of the ancestor nodes, finally computing the hash value of the root node and comparing it with the hash value stored in the root node. When the index height increases, the data user needs to calculate the hash values of more nodes. Consequently, the time required for verifying the integrity of the query process also increases. However, due to the high efficiency of hash calculations, verifying the integrity of the query process remains highly efficient.

7. Correctness and Security Analysis of EVRQ

In this section, we first analyze the correctness of the EVRQ scheme. Then, we analyze the information leakage of the EVRQ scheme. Finally, we analyze the indistinguishability of the MRT index.
Theorem 1.
Our method EVRQ complies with the correctness of Definition 1.
Proof. 
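Suppose that an MR $M$ is represented by the minimum point $p_{\min} = (x_{\min}, y_{\min}) \in M$ and the maximum point $p_{\max} = (x_{\max}, y_{\max}) \in M$, and a queried range $Q$ is represented by the minimum point $q_{\min} = (u_{\min}, v_{\min}) \in Q$ and the maximum point $q_{\max} = (u_{\max}, v_{\max}) \in Q$. According to the 0–1 encoding technique, $(x_{\max} < u_{\min}) \lor (u_{\max} < x_{\min}) \lor (v_{\max} < y_{\min}) \lor (y_{\max} < v_{\min}) = \mathit{true}$ means $(S_{u_{\min}}^{0} \cap S_{x_{\max}}^{1} = \emptyset) \lor (S_{u_{\max}}^{1} \cap S_{x_{\min}}^{0} = \emptyset) \lor (S_{v_{\max}}^{1} \cap S_{y_{\min}}^{0} = \emptyset) \lor (S_{v_{\min}}^{0} \cap S_{y_{\max}}^{1} = \emptyset) = \mathit{true}$. As $(x_{\max} < u_{\min}) \lor (u_{\max} < x_{\min}) \lor (v_{\max} < y_{\min}) \lor (y_{\max} < v_{\min}) = \mathit{true}$ means $Q \cap M = \emptyset$ (see Section 5.3.4), $(S_{u_{\min}}^{0} \cap S_{x_{\max}}^{1} = \emptyset) \lor (S_{u_{\max}}^{1} \cap S_{x_{\min}}^{0} = \emptyset) \lor (S_{v_{\max}}^{1} \cap S_{y_{\min}}^{0} = \emptyset) \lor (S_{v_{\min}}^{0} \cap S_{y_{\max}}^{1} = \emptyset) = \mathit{true}$ means $Q \cap M = \emptyset$. Section 5.3.4 also gives four formulas: $S_{u_{\min}}^{0} \cap S_{x_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{u_{\min}}^{0} \cap \bar{S}_{x_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset$, $S_{u_{\max}}^{1} \cap S_{x_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{u_{\max}}^{1} \cap \bar{S}_{x_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset$, $S_{v_{\max}}^{1} \cap S_{y_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{v_{\max}}^{1} \cap \bar{S}_{y_{\min}}^{0} = \emptyset \Rightarrow \bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset$, and $S_{v_{\min}}^{0} \cap S_{y_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{v_{\min}}^{0} \cap \bar{S}_{y_{\max}}^{1} = \emptyset \Rightarrow \bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset$. Thus, if $(\bar{S}_{u_{\min}}^{0} \cap S_M = \emptyset) \lor (\bar{S}_{u_{\max}}^{1} \cap S_M = \emptyset) \lor (\bar{S}_{v_{\max}}^{1} \cap S_M = \emptyset) \lor (\bar{S}_{v_{\min}}^{0} \cap S_M = \emptyset) = \mathit{true}$, then $Q \cap M = \emptyset$. Otherwise, $Q \cap M \neq \emptyset$. □
Therefore, by utilizing the query token of $Q$ to exclude all MRs that do not intersect with $Q$, the CS can determine the remaining MRs that intersect with $Q$. In a top–down manner, the CS eventually finds the leaf nodes of the MRT index whose MRs intersect with $Q$. The CS returns the ciphertexts of POIs in these leaf nodes to the DU as query results. Thus, our scheme EVRQ complies with the correctness defined in Definition 1.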
Theorem 2.
Our scheme EVRQ adheres to the security in Definition 2.
Proof. 
The POIs are encrypted using a secure encryption scheme, and the security of the geographical data is guaranteed by the encryption scheme. Suppose that $d$ represents the information of an MR $M$ or a queried range $Q$. Then, $d$ is a value in the set $\{x_{\min}, y_{\min}, x_{\max}, y_{\max}, u_{\min}, v_{\min}, u_{\max}, v_{\max}\}$. In our scheme EVRQ, firstly, $d$ is converted to a binary string $B(d)$. Then, a random number $r_d$ is chosen and converted to a binary string $B(r_d)$. Next, the 0–1 encoding of $B(d) \,\|\, B(r_d)$ is computed, which is denoted by $S_{B(d) \| B(r_d)}^{0}$ (or $S_{B(d) \| B(r_d)}^{1}$). Finally, each element of $S_{B(d) \| B(r_d)}^{0}$ (or $S_{B(d) \| B(r_d)}^{1}$) is processed by a hash function and then padded with a $\mathit{tag}$, which yields the set $\{\mathit{hash}(a, k) \,\|\, \mathit{tag}\}$, where $a \in S_{B(d) \| B(r_d)}^{0}$ (or $a \in S_{B(d) \| B(r_d)}^{1}$). According to the 0–1 encoding technique, the total number of elements in this set is equal to the total number of 0s (or 1s) in $B(d) \,\|\, B(r_d) \,\|\, \mathit{tag}$. Consequently, adversaries possess knowledge solely of the leakage function $F(d) = \mathit{count}(d)$. Hence, the EVRQ scheme adheres to the security in Definition 2. □
Note that there are several ways to mitigate such information leakage. For example, the DO can increase the length of $B(r_d)$ or adjust the total number of 0s (or 1s) in $B(r_d)$ to control the total number of 0s (or 1s) in $B(d) \,\|\, B(r_d)$. The DO can also add some artificial pseudo-hash values to reduce the risk of such information leakage.
Theorem 3.
Our scheme EVRQ adheres to the security in Definition 3.
Proof. 
In the MRT index, each node is associated with an MR. As the MRs of the same-level nodes contain an equal (or similar) number of POIs, the structures of the corresponding subtrees are the same (or similar). Thus, it is difficult to effectively distinguish different MRs of the same-level nodes and the corresponding subtrees. This feature ensures the security of the density distribution of POIs. Therefore, an attacker (including the untrusted CS) cannot infer the DU’s query privacy by observing the DU’s query process on the MRT index and analyzing the density distribution information of POIs on a map. □
If two different MRs contain a similar number of POIs (e.g., one MR contains one more POI than the other), then, due to the secure processing of the MRT index, attackers can only distinguish between these two MRs based on the number of POIs they contain; attackers cannot obtain any other useful information. Additionally, the DO can make these two MRs indistinguishable by adding a fake POI to the MR with fewer POIs. As a result, attackers are unable to differentiate between these two MRs.

8. Discussion

In this paper, we propose the EVRQ scheme. Compared to order-preserving encryption methods, EVRQ protects the order information between ciphertexts by transforming the comparison of ciphertexts into set intersection operations. In contrast to bucketization methods, EVRQ supports accurate geographic information queries. The accuracy of query results is crucial in LBS applications, and EVRQ enables the data user to accurately locate the desired POIs. Compared to public key-based methods, EVRQ avoids computationally expensive calculations on bilinear groups during the query process, resulting in a higher efficiency and timely delivery of query results to the data user, thereby providing a better user experience. Additionally, EVRQ partitions the map in a reasonable manner, preventing attackers from analyzing the query privacy of the data user based on the density distribution of POIs and analyzing map information using the index structure. EVRQ also combines the Bloom filter and cryptographic multiset accumulator, making the verification process more efficient.

9. Conclusions

In this paper, we mainly consider several types of malicious behaviors of the untrusted cloud server and propose a range query scheme EVRQ. In EVRQ, the data owner builds an MRT index. In the MRT index, as the MRs of the same level contain an equal (or similar) number of POIs, the corresponding subtrees in the MRT are the same (or similar), which effectively protects the density distribution of POIs. Therefore, attackers (including the untrusted cloud server) cannot infer the data user’s query privacy by observing the data user’s query process on the MRT and analyzing the density distribution of POIs on a map. In the MRT index, the Bloom filter and cryptographic multiset accumulator techniques are adopted to efficiently and effectively verify the correctness of the query process in the cloud server. Specifically, the data user can achieve efficient verification in the majority of cases by using a Bloom filter and achieve effective verification in rare cases (when false positive errors occur in the Bloom filter) by using a cryptographic multiset accumulator. Additionally, to ensure that the cloud server has traversed the entire MRT index when executing a range query and has not made any malicious modifications to the outsourced data, the MRT index incorporates hash techniques. Thus, the proposed scheme EVRQ can effectively achieve secure range queries for geographical information in an untrusted cloud environment.

Author Contributions

Methodology, Zhuolin Mei; Software, Zhuolin Mei, Shimao Yao, Shunli Zhang, Haibin Wang, Hongbo Li and Jiaoli Shi; Validation, Zhuolin Mei; Writing—original draft, Zhuolin Mei, Jing Zeng, Caicai Zhang and Shimao Yao; Writing—review and editing, Zhuolin Mei, Jing Zeng, Caicai Zhang, Shimao Yao, Shunli Zhang, Haibin Wang and Hongbo Li; Project administration, Jing Zeng and Caicai Zhang; Funding acquisition, Zhuolin Mei and Caicai Zhang. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Nos. 61962029, 62062045, 62262033, 62341206); the Jiangxi Provincial Natural Science Foundation of China (No. 20232BAB202007); the Zhejiang Province Visiting Engineer Cooperation Project (No. FG2023061); the Heilongjiang Province Natural Science Foundation of China (No. LH2022G001).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank the editor and anonymous reviewers for their suggestions.

Conflicts of Interest

Author Jing Zeng was employed by the company China Gridcom Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Zeng, J.; Yang, L.T.; Lin, M.; Ning, H.; Ma, J. A survey: Cyber-physical-social systems and their system-level design methodology. Future Gener. Comput. Syst. 2020, 105, 1028–1042.
  2. Bajaj, C.; Upadhyay, D.K.; Kumar, S.; Kanaujia, B.K. GPS-Integrated RFID Antenna with AMC Backing for IoT-Based Sensing and Tracking Applications. IEEE Trans. Antennas Propag. 2023, 72, 1929–1934.
  3. Cheng, L.; Chen, J.; Liu, Y.; Zhu, Q. Variable Projection Algorithm for GPS Positioning in Multipath Environments Based on Aitken Acceleration Method. IEEE Trans. Ind. Inform. 2024, 20, 6404–6412.
  4. Feng, J.; Yang, L.T.; Zhang, R.; Qiang, W.; Chen, J. Privacy preserving high-order bi-lanczos in cloud–fog computing for industrial applications. IEEE Trans. Ind. Inform. 2020, 18, 7009–7018.
  5. Wu, Z.; Xuan, S.; Xie, J.; Lin, C.; Lu, C. How to ensure the confidentiality of electronic medical records on the cloud: A technical perspective. Comput. Biol. Med. 2022, 147, 105726.
  6. Yao, S.; Dayot, R.V.J.; Ra, I.H.; Xu, L.; Mei, Z.; Shi, J. An identity-based proxy re-encryption scheme with single-hop conditional delegation and multi-hop ciphertext evolution for secure cloud data sharing. IEEE Trans. Inf. Forensics Secur. 2023, 18, 3833–3848.
  7. Mei, Z.; Zhu, H.; Cui, Z.; Wu, Z.; Peng, G.; Wu, B.; Zhang, C. Executing multi-dimensional range query efficiently and flexibly over outsourced ciphertexts in the cloud. Inf. Sci. 2018, 432, 79–96.
  8. Feng, J.; Yang, L.T.; Zhu, Q.; Choo, K.K.R. Privacy-preserving tensor decomposition over encrypted data in a federated cloud environment. IEEE Trans. Dependable Secur. Comput. 2018, 17, 857–868.
  9. Meng, Q.; Weng, J.; Miao, Y.; Chen, K.; Shen, Z.; Wang, F.; Li, Z. Verifiable spatial range query over encrypted cloud data in VANET. IEEE Trans. Veh. Technol. 2021, 70, 12342–12357.
  10. Boldyreva, A.; Chenette, N.; Lee, Y.; O’Neill, A. Order-preserving symmetric encryption. In Proceedings of the Advances in Cryptology-EUROCRYPT 2009: 28th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Cologne, Germany, 26–30 April 2009; Proceedings 28. Springer: Berlin/Heidelberg, Germany, 2009; pp. 224–241.
  11. Boldyreva, A.; Chenette, N.; O’Neill, A. Order-preserving encryption revisited: Improved security analysis and alternative solutions. In Proceedings of the Advances in Cryptology–CRYPTO 2011: 31st Annual Cryptology Conference, Santa Barbara, CA, USA, 14–18 August 2011; Proceedings 31. Springer: Berlin/Heidelberg, Germany, 2011; pp. 578–595.
  12. Xiao, L.; Yen, I.L. Security analysis for order preserving encryption schemes. In Proceedings of the 2012 46th Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, 21–23 March 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1–6.
  13. Krendelev, S.F.; Yakovlev, M.; Usoltseva, M. Order-preserving encryption schemes based on arithmetic coding and matrices. In Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, Warsaw, Poland, 7–10 September 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 891–899.
  14. Teranishi, I.; Yung, M.; Malkin, T. Order-preserving encryption secure beyond one-wayness. In Proceedings of the Advances in Cryptology–ASIACRYPT 2014: 20th International Conference on the Theory and Application of Cryptology and Information Security, Kaohsiung, Taiwan, 7–11 December 2014; Proceedings, Part II 20. pp. 42–61.
  15. Dyer, J.; Dyer, M.; Xu, J. Order-preserving encryption using approximate integer common divisors. In Proceedings of the Data Privacy Management, Cryptocurrencies and Blockchain Technology: ESORICS 2017 International Workshops, DPM 2017 and CBT 2017, Oslo, Norway, 14–15 September 2017; Springer: Cham, Switzerland, 2017; pp. 257–274.
  16. Hore, B.; Mehrotra, S.; Tsudik, G. A privacy-preserving index for range queries. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, ON, Canada, 31 August–3 September 2004; Volume 30, pp. 720–731.
  17. Hore, B.; Mehrotra, S.; Canim, M.; Kantarcioglu, M. Secure multidimensional range queries over outsourced data. VLDB J. 2012, 21, 333–358.
  18. Wang, P.; Ravishankar, C.V. Secure and efficient range queries on outsourced databases using Rp-trees. In Proceedings of the 2013 IEEE 29th International Conference on Data Engineering (ICDE), Brisbane, QLD, Australia, 8–12 April 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 314–325.
  19. Lee, Y. Secure ordered bucketization. IEEE Trans. Dependable Secur. Comput. 2014, 11, 292–303.
  20. Handa, R.; Rama Krishna, C.; Aggarwal, N. An efficient approach for secure information retrieval on cloud. J. Intell. Fuzzy Syst. 2018, 34, 1345–1353.
  21. Handa, R.; Rama Krishna, C.; Aggarwal, N. Keyword binning-based efficient search on encrypted cloud data. Arab. J. Sci. Eng. 2019, 44, 3559–3584.
  22. Lin, W.; Wang, K.; Zhang, Z.; Fu, A.W.; Wong, R.C.W.; Long, C.; Miao, C. Towards secure and efficient equality conjunction search over outsourced databases. IEEE Trans. Cloud Comput. 2020, 10, 1445–1461.
  23. Boneh, D.; Waters, B. Conjunctive, subset, and range queries on encrypted data. In Proceedings of the Theory of Cryptography: 4th Theory of Cryptography Conference, TCC 2007, Amsterdam, The Netherlands, 21–24 February 2007; Proceedings 4. Springer: Berlin/Heidelberg, Germany, 2007; pp. 535–554.
  24. Shi, E.; Bethencourt, J.; Chan, T.H.; Song, D.; Perrig, A. Multi-dimensional range query over encrypted data. In Proceedings of the 2007 IEEE Symposium on Security and Privacy (SP’07), Berkeley, CA, USA, 20–23 May 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 350–364.
  25. Lu, Y. Privacy-Preserving Logarithmic-Time Search on Encrypted Data in Cloud. In Proceedings of the NDSS, San Diego, CA, USA, 5–8 February 2012.
  26. Mei, Z.; Yu, J.; Zhang, C.; Wu, B.; Yao, S.; Shi, J.; Wu, Z. Secure multi-dimensional data retrieval with access control and range query in the cloud. Inf. Syst. 2024, 122, 102343.
  27. Zhan, Y.; Shen, D.; Duan, P.; Zhang, B.; Hong, Z.; Wang, B. MDOPE: Efficient multi-dimensional data order preserving encryption scheme. Inf. Sci. 2022, 595, 334–343. [Google Scholar] [CrossRef]
  28. Wong, W.K.; Cheung, D.W.L.; Kao, B.; Mamoulis, N. Secure kNN computation on encrypted databases. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 29 June–2 July 2009; pp. 139–152. [Google Scholar]
  29. Shen, E.; Shi, E.; Waters, B. Predicate privacy in encryption systems. In Proceedings of the Theory of Cryptography Conference, San Francisco, CA, USA, 15–17 March 2009; pp. 457–473. [Google Scholar]
  30. Wu, S.; Li, Q.; Li, G.; Yuan, D.; Yuan, X.; Wang, C. ServeDB: Secure, verifiable, and efficient range queries on outsourced database. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 626–637. [Google Scholar]
  31. Lin, H.Y.; Tzeng, W.G. An efficient solution to the millionaires’ problem based on homomorphic encryption. In Proceedings of the Applied Cryptography and Network Security: Third International Conference, ACNS 2005, New York, NY, USA, 7–10 June 2005; Proceedings 3. Springer: Berlin/Heidelberg, Germany, 2005; pp. 456–466. [Google Scholar]
  32. Xu, C.; Zhang, C.; Xu, J. vChain: Enabling verifiable boolean range queries over blockchain databases. In Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2019; pp. 141–158. [Google Scholar]
  33. Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 1970, 13, 422–426. [Google Scholar] [CrossRef]
  34. AES Proposal: Rijndael. 1999. Available online: https://www.cs.miami.edu/home/burt/learning/Csc688.012/rijndael/rijndael_doc_V2.pdf (accessed on 7 August 2024).
  35. The Java Pairing Based Cryptography Library (jPBC). Available online: http://gas.dia.unisa.it/projects/jpbc/index.html (accessed on 7 August 2024).
Figure 1. System model.
Figure 2. Map region division.
Figure 3. Map region tree.
Figure 4. Map region M and its two points.
Figure 5. Four possible positional relationships between Q and M such that Q ∩ M = ∅.
Figure 6. The time of index construction.
Figure 7. The time of token generation.
Figure 8. The time of range query.
Figure 9. The time for verifying correctness.
Figure 10. The time for verifying integrity.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mei, Z.; Zeng, J.; Zhang, C.; Yao, S.; Zhang, S.; Wang, H.; Li, H.; Shi, J. Efficient and Verifiable Range Query Scheme for Encrypted Geographical Information in Untrusted Cloud Environments. ISPRS Int. J. Geo-Inf. 2024, 13, 281. https://doi.org/10.3390/ijgi13080281