Attribute-Based Access Control Meets Blockchain-Enabled Searchable Encryption: A Flexible and Privacy-Preserving Framework for Multi-User Search

Han, Jiujiang; Li, Ziyuan; Liu, Jian; Wang, Huimei; Xian, Ming; Zhang, Yuxiang; Chen, Yu

doi:10.3390/electronics11162536

by Jiujiang Han ¹,†, Ziyuan Li ²,†, Jian Liu ¹,*, Huimei Wang ¹, Ming Xian ¹, Yuxiang Zhang ¹ and Yu Chen ¹

¹ College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

² Unit 31438, Shenyang 110000, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Electronics 2022, 11(16), 2536; https://doi.org/10.3390/electronics11162536

Submission received: 7 July 2022 / Revised: 31 July 2022 / Accepted: 11 August 2022 / Published: 13 August 2022

(This article belongs to the Section Computer Science & Engineering)


Abstract: Searchable encryption enables users to enjoy search services while protecting the security and privacy of their outsourced data. Blockchain-enabled searchable encryption moves the computing processes that are executed on the server to a decentralized and transparent blockchain system, which eliminates the potential threat of malicious servers invading data. Although some recent blockchain-enabled searchable encryption schemes allow users to search freely and verify search results, these schemes are inefficient and costly. Motivated by this, we propose an improved scheme that supports fine-grained access control and flexible searchable encryption. In our framework, the data owner uploads ciphertext documents and symmetric keys to a cloud database and an optional KMS, respectively, and manages the access control process and the searchable encryption process through smart contracts. Finally, an experimental comparison conducted on a private Ethereum network demonstrates the superiority of our scheme.

Keywords: smart contract; blockchain; searchable encryption; attribute-based access control

1. Introduction

With the rapid development of the Internet, the era of big data has arrived. As more and more data are generated in daily life, cloud storage services such as Amazon Storage Service [1] and Tencent micro cloud [2] in China have emerged. However, as cloud storage applications have grown, users have found that once data are outsourced to the cloud, they can no longer control the data directly. This presents a huge challenge for protecting users’ privacy and security. A common solution is to encrypt data before uploading, but this raises the problem of how to query the ciphertext. The simplest method is to download and decrypt all documents and then query them. However, downloading redundant and unnecessary documents wastes network bandwidth, and the decryption and query process incurs a large computational overhead. This approach is clearly too cumbersome and expensive. Since the cloud server usually has computing power, users would prefer to search via the server. Unfortunately, the cloud server is usually “honest-but-curious”: if the server decrypts the data in order to search, user privacy is exposed to the server, which may seriously damage the security of the user’s data. To solve these problems, SE (searchable encryption) technology was developed.

Searchable encryption is a mode of data sharing between servers and clients. In order to protect the privacy of users in complex and changeable scenarios, many improved algorithms and schemes with different functions have been derived. One of the most typical schemes is the searchable encryption scheme for TB level data [3]. We will generate more and more data in our daily lives, and we will rely more on cloud storage to solve data storage problems. Therefore, we need an efficient and reliable searchable encryption scheme to protect our data privacy.

At the same time, facing the complex network environment, an honest-but-curious server may also be a potentially malicious server. In April 2020, the inventor of PanDownload was arrested. They cracked the client program of Baidu Online Disk and provided an improved third-party cloud disk software [4]. Because the software they provided has faster download bandwidth than the original software, it attracted a huge number of customers in a short period of time. However, the software could disclose documents and pictures without clients’ consent, representing a serious infringement. After the announcement of their arrest, netizens have issued strong doubts about whether a cloud service with absolute privacy-preserving capabilities really exists. Therefore, we cannot be certain that a cloud server we use will keep our information and data protected and private.

Considering the threat of malicious servers, most of the existing verifiable search schemes focus on detecting cheating without providing effective countermeasures (e.g., punishing cheaters). Therefore, a reliable and fair mechanism for implementing search is needed. Blockchain-based services can effectively solve this problem. Blockchain is a decentralized platform without the role of a server, in which each client is a node of a peer-to-peer network. When a node wishes to process a transaction, its behavior must be confirmed by the other nodes (i.e., through consensus) before it takes effect. Compared with the pyramid-like C/S structure, the P2P network of blockchain can provide users with more sound and practical services.

Based on the outstanding performance of blockchain in data security and tamper resistance, many related works [5,6,7,8,9] have emerged in recent years. These works give several typical cases in which a traditional searchable encryption scheme can run effectively in a severe threat environment. However, these schemes do not perform well in multi-user settings. For example, if the DU (data user) wants to search, they need to communicate with the DO (data owner), who must remain continuously online, and this requires a large amount of physical storage space. Moreover, these schemes do not optimize the storage and computing resources of the blockchain, which greatly increases the communication and computing overhead and reduces efficiency. To address these issues, we construct a freer, more practical, and fine-grained access control/searchable encryption scheme for DU and DO. Our contributions are as follows:

We implement an attribute-based access control method based on smart contracts. DO sets up an access policy matrix through LSSS (Linear Secret-Sharing Schemes) and multiplies it by a secret-sharing column vector constructed from the secret value s. The result is then delivered to the blockchain as an access control vector. To search, DU first calls the ACC (access control contract) with their own attributes; only a DU who meets the access policy can obtain an effective verification code. In the next step, DU uses the verification code to search and obtain the corresponding documents;
We implement an improved searchable encryption method based on smart contracts. Compared with previous work, our scheme allows DO to go offline freely after indexing. In addition, DU can customize the search token, and searching requires fewer costly resources than similar work [6]. After the DU obtains the document label, it downloads the corresponding ciphertext from the cloud database and simultaneously hands over the contract-calling transaction to a trusted KMS (Key Management Server) for verification. After the KMS verifies the transaction, it returns the symmetric key of the ciphertext document through a secure channel, so that the DU can finally perform the decryption;
We conduct many experiments and compare the results with previous work. The performance data confirm the superiority of our scheme. As far as we know, our scheme is the most flexible among current similar work.

It is worth noting that the cloud in our scheme should be a database with only a data management function, and not equipped with a computing function. To avoid ambiguity, we will use the term “cloud database” instead of “cloud server” in later sections.

The rest of the paper is organized as follows. The related work is discussed in Section 2. We introduce the background of the main system components in Section 3, including smart contracts, attribute-based access control, and searchable encryption. Before delving into the description, we give an overview of the system and explain the notations, algorithms, and design goals of our scheme in Section 4. We present the access control process and the searchable encryption process of our scheme and provide a security analysis in Section 5. We report the experiments in Section 6 and compare our scheme with similar work to show its superiority. Finally, we conclude and discuss future work in Section 7.

2. Related Work

Nowadays, increasing numbers of people are choosing to store data in a storage server to reduce pressure on local data storage. However, due to privacy and security considerations, encrypted documents are usually uploaded, which presents challenges to document management. If a data owner sends a request to the server to download a previously uploaded document, how can the server search for the corresponding document among massive ciphertext documents? To answer this question, Song et al. [10] proposed the first practical searchable encryption scheme named SWP in 2000. This scheme uses a special two-layer encryption structure to construct encrypted data, which enables sequential scanning to search ciphertext. The core idea is to encrypt each word separately and then embed a hash value with a special format into the ciphertext. In order to search, the server can extract the hash value through a trapdoor submitted by the user and check whether the value has this special format.

With the development of searchable encryption, it has now been expanded into a mature search application that can meet complex needs. Since the work of Song et al. [10], much research has emerged to implement SE schemes for specific scenarios, and until now, SE has been fully developed. There are specific improvements in the following five aspects:

Keyword search under asymmetric encryption [11,12];
Support dynamic update of ciphertext documents stored on the server [13,14,15];
Expand supports for query formats to meet broader search query needs (include multi-keyword search and fuzzy keyword search) [16,17,18];
Support search service between thousands and millions of records [3];
Ensure that the server always calculates and returns the search results faithfully in the face of an honest-but-curious server [19];

However, the previous schemes are all based on the same foundation. That is, the server is honest-but-curious. In today’s complex and changeable network environment, servers constantly face attacks and threats from all sides. When the server is not 100% secure and reliable, almost all previous searchable encryption schemes will collapse. Therefore, a secure and reliable searchable encryption scheme is urgently needed.

In recent years, blockchain technology has appeared in people’s vision. Its P2P network environment and open transaction mode attract the interest of researchers. Blockchain 2.0 is the era of smart contracts represented by Ethereum, meaning decentralized applications can be established. It expands the simple mode of Bitcoin and applies blockchain technology from financial to non-financial fields [20], covering all aspects of human life so that individuals can build trust and realize information sharing in daily life without relying on trusted third parties or institutions, such as medical health and intellectual property. As an emerging technology with great potential, blockchain provides a programmable environment for smart contracts. Most of the existing blockchain application frameworks are based on the design and development of smart contracts. Smart contracts have been widely used in blockchain.

Many researchers have proposed combining blockchain with traditional services to ensure that processes are executed correctly in accordance with the protocol under completely transparent conditions. Li et al. [21] proposed a crowdsourcing system based on Ethereum smart contracts in 2019. They delegated all the logical operations in the crowdsourcing system, such as the requester releasing tasks, the workers registering and selecting tasks, and the CSP evaluating tasks, to the smart contract. The significance of their work lay in realizing the first fully trusted crowdsourcing system with support for auditing the whole process, and they also released it on the Ethereum test network Ropsten for all users to test. Cai et al. [22] proposed an oracle protocol based on peer-to-peer prediction and designed a decentralized prediction mechanism based on blockchain. Compared with the ASTRAEA protocol, their oracle protocol can effectively prevent Sybil attacks. Blockchain systems are also used for privacy preservation in data mining [23,24] and for data sharing [25,26,27]. Niu et al. [26] proposed a searchable attribute-based encryption scheme. They stored the encrypted EHR and keywords on the cloud storage server and the blockchain, respectively. Through CP-ABE, they ensured that the ciphertext on the cloud storage server could only be decrypted by a specific cluster of users. However, this scheme requires the searcher to execute complex bilinear-mapping calculations on the blockchain, which is a difficult task for the blockchain. For instance, performing this operation on Ethereum would consume a large amount of resources. Furthermore, they did not explain the methodology in their paper. Mahmood et al. [24] proposed a security model that uses federated learning and supports blockchain. This model uses the incentive mechanism of Ethereum and the Proof of Work (PoW) consensus algorithm to promote secure cooperation between decentralized nodes. Wang et al. [28] proposed an asynchronous federated learning system based on permissioned blockchains. They integrated the learned model into the blockchain and performed two-order aggregation calculations, which can effectively alleviate the synchronous federated learning algorithm.

The flourishing of blockchain has attracted global attention. In the field of searchable encryption, some works have attempted to transfer the calculation tasks of the traditional cloud server to smart contracts, while the server retains only the most basic database functions. By virtue of the openness of smart contracts, the potential threat of malicious servers can be eliminated.

Li et al. [9] proposed a theoretical SSE framework based on blockchain in 2017. Their framework only contained three entities, namely DO, DU, and smart contract. DU and DO can submit tasks to smart contracts in the form of transactions to complete. In their work, DO first processed every single keyword to generate the corresponding label and assembled labels and document identifiers into dictionary-type data. After all keywords were processed, the dictionary-type data finally generated would be uploaded to the blockchain. When DU wishes to search, they can request a trapdoor with a certain keyword to realize it. However, their scheme is only suitable for lightweight documents. For large-scale documents, although they proposed dividing documents into blocks and uploading them with multiple transactions, their experiments did not show the performance of this part very well. Blockchain is not inherently suitable for storing documents. Furthermore, in the smart contract, the storage operation requires costly resources. They did not explain their work from a practical point of view.

Hu et al. [5] proposed the first sound SE scheme based on blockchain in 2018. They demonstrated the interaction between a single DO and smart contracts from the perspective of both theory and practice. As the first well-implemented SE scheme based on blockchain, it provided a research approach for later researchers. However, their scheme still has a flaw: in a multi-user setting, if DU wants to use the search service, they need to submit a deposit to the contract, and DO then performs the search for DU after confirming the deposit information. This multi-user setting is built on a business basis. As the deployer of the contract, DO might violate the agreement. Moreover, when DU requests a search, DO must be online; otherwise, the trapdoor for the corresponding keywords cannot be built. These problems are not addressed in their work.

On the basis of the former work, Cai et al. [8] conducted an in-depth study on search availability in the multi-user setting. They proposed a t-time-locked payment to ensure the fairness of search services. Their scheme partially solved the business trust issue between DO and DU in [5]. Chen et al. [7] proposed a searchable encryption scheme for EHR (electronic health record). Their scheme is similar to [5], but does not support dynamic updates. Jiang et al. [6] proposed a stealth authorization access control scheme based on [5]. In their scheme, DO would create a trapdoor for DU in advance, then encrypt it with the ECC public key corresponding to DU, and finally pack the encrypted authorization content into transactions. However, the size of the encrypted authorization content will become multiplied, which means that there will be more gas overhead when uploading to the blockchain. For these works that combine searchable encryption with blockchain, we summarize the characteristics of each work in Table 1.

In general, in the previous work related to the combination of searchable encryption and blockchain, in the multi-user case, DO needs to receive the search request of DU online, and the cost of blockchain is high, which is not flexible and practical. Our work focuses on providing an improved access control scheme using smart contracts, building on the work of blockchain and searchable encryption. In this scheme, DO does not need to receive DU requests online, and DU can achieve fine-grained and practical searchable encryption in a multi-user setting. This will significantly reduce the cost of blockchain, and is more flexible and practical than previous work. We also summarize the characteristics of our scheme in Table 1. Before the detailed introduction of the framework, we will briefly recap some system components in Section 3.

3. System Components

3.1. Ethereum and Smart Contracts

Every entity on Ethereum takes the form of an account, either an EOA (externally owned account) or a CA (contract account). The main body of a CA is bytecode compiled from a smart contract programming language such as Solidity. Each account has a unique address, and if an EOA wants to call a smart contract, it must create a transaction and send it to the CA. Transactions in the transaction pool are packaged into blocks by an entity called the miner, and the CA then parses the input parameters in the transaction to perform complex logical operations. When a new block is successfully mined and appended to the blockchain, the corresponding miner is rewarded with cryptocurrency (Ether). This incentive mechanism motivates miners and, at the same time, serves EOAs that wish to trade or call contracts. In the script code of the smart contract, each operation requires a pre-defined amount of gas, so the sender needs to declare “gasLimit” and “gasPrice” in advance; otherwise, excessively complex calculations would incur unaffordable Ether costs. When the sender’s balance is not enough to pay “gasLimit*gasPrice”, the script operations that have been performed are rolled back and the transaction is ultimately determined to be invalid.
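To make the “gasLimit*gasPrice” bound concrete, the following is a minimal sketch of the arithmetic described above. The names `Tx`, `max_fee_wei`, and `execute`, and the three outcome strings, are illustrative assumptions and not part of any Ethereum client; real clients apply further rules that this toy model ignores.

```python
from dataclasses import dataclass

WEI_PER_GWEI = 10**9


@dataclass
class Tx:
    gas_limit: int      # maximum gas units the sender allows for this call
    gas_price_wei: int  # price the sender offers per gas unit, in wei


def max_fee_wei(tx: Tx) -> int:
    """Upper bound the sender must be able to pay: gasLimit * gasPrice."""
    return tx.gas_limit * tx.gas_price_wei


def execute(tx: Tx, balance_wei: int, gas_needed: int) -> str:
    """Toy outcome model (an assumption, not client behavior in full)."""
    if balance_wei < max_fee_wei(tx):
        return "rejected"   # sender cannot cover gasLimit * gasPrice
    if gas_needed > tx.gas_limit:
        return "reverted"   # gas runs out: performed operations are rolled back
    return "ok"


tx = Tx(gas_limit=100_000, gas_price_wei=20 * WEI_PER_GWEI)
print(max_fee_wei(tx), execute(tx, 10**18, 60_000))
```

The sketch shows why contract designers try to keep on-chain computation and storage small: every extra operation raises the gas needed and therefore the Ether that must be available up front.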

In order to achieve the consensus mechanism, Ethereum requires that all clients must follow up on every update on the blockchain, and only clients that record the latest state of the blockchain can work properly. This over-rigid mechanism ensures that any information on Ethereum is immutable and auditable, but at the same time, this feature also puts forward higher demands on the storage capacity and network bandwidth of each client. It is an unfortunate reality that the blockchain platform cannot store a large amount of data like a database. Therefore, considering these characteristics, Ethereum can act as a trusted base that is suitable for correctness and reliability but not for storage or privacy.

3.2. Searchable Encryption

In the initial scheme, to save the local resource overhead, Alice chooses to deliver documents to a data server controlled by Bob, and asks Bob to provide data retrieval services so that Alice can search for the desired document by keyword. However, since it is difficult to ensure that Bob is a trusted service provider, Alice dares not directly send the document to Bob in plaintext. To overcome this dilemma, SE (searchable encryption) appeared at that time and required the following:

Alice encrypts the document and uploads it to the server;
Alice needs to construct a trapdoor with a specific structure for the keyword to retrieve, and keyword information cannot be obtained from the trapdoor;
The server searches through the trapdoor and returns the corresponding ciphertext document. The server can only know at most that the document contains a certain keyword, but the overall information of the document is unknown;
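The three requirements above can be illustrated with a deliberately simplified sketch. It is not the SWP construction of Song et al.; it assumes HMAC-SHA256 behaves as a pseudorandom function and uses it to derive a keyword trapdoor, and the function names `trapdoor`, `build_index`, and `server_search` are hypothetical. Document encryption itself is omitted, so only opaque document identifiers appear.

```python
import hashlib
import hmac


def trapdoor(key: bytes, keyword: str) -> bytes:
    """Keyed hash of the keyword; the keyword is not recoverable without `key`
    (assuming HMAC-SHA256 is a PRF)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, docs: dict[str, set[str]]) -> dict[bytes, set[str]]:
    """Alice's side: map each keyword trapdoor to the ids of matching documents."""
    index: dict[bytes, set[str]] = {}
    for doc_id, keywords in docs.items():
        for w in keywords:
            index.setdefault(trapdoor(key, w), set()).add(doc_id)
    return index


def server_search(index: dict[bytes, set[str]], td: bytes) -> set[str]:
    """Bob's side: look up the trapdoor; the plaintext keyword is never seen."""
    return index.get(td, set())


key = b"k" * 32  # Alice's secret key (fixed here only for reproducibility)
index = build_index(key, {"d1": {"eye", "scan"}, "d2": {"eye"}})
print(sorted(server_search(index, trapdoor(key, "eye"))))
```

In this toy version the server still learns which documents share a keyword, which corresponds to the leakage the third requirement allows.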

The scheme of Song et al. [10] provided strong support for the future development of cloud computing. However, there are still many problems to be solved. The most serious disadvantage of their scheme is that it must use fixed-size words, which is incompatible with the existing file encryption standards. Furthermore, it must use its specific two-layer encryption method that can only be used for text-type data, but not for other data, such as compressed data.

In fact, there have been many subsequent improvements to it. Cash et al. [3] proposed an effective scheme in 2014 that supports searching tens of millions of ciphertext records. It is worth mentioning that the searchable encryption scheme we designed in this paper is based on it. Their scheme links the ciphertext document with the identifier string by constructing keyword-id mapping pairs, and finally uses the search result id to ask the database for the ciphertext document. For a database with N record/keyword pairs, their basic scheme produces an encrypted index of optimal size $O(N)$ and processes a search with r results in optimal $O(r)$ time. Ethereum is not suitable for data storage, and redundant computation incurs additional cost. Because their scheme transfers the storage pressure to the database and achieves minimum storage, we adopted it as the basic part of the SEC.

In our scheme, the document will be encrypted by a symmetric cryptosystem (e.g., DES algorithm or AES algorithm), then the ciphertext and key will be uploaded to the cloud database and KMS, respectively. In some previous works [6,7], they did not elaborate on key management in-depth, but the fact is that it is necessary to discuss and explain key management. In the pioneering work [5], since the search and decryption are only performed by DO, there is no need to set up a KMS. However, if DU is responsible for these tasks, an additional trusted KMS has to be set up to manage keys and monitor the transactions on the blockchain. When a DU requests a key for decryption, after confirming the authenticity of the user’s transaction, KMS will return the key corresponding to the ciphertext document to the DU.

3.3. Attribute-Based Access Control

Access control technology grants a subject permission to access an object according to an access policy and thereby effectively controls what the subject is permitted to do. On this basis, ABAC (attribute-based access control) was proposed in 2002 [29]. It realizes fine-grained access control in complex information systems and supports the dynamic expansion of large-scale user populations, which makes it an ideal access control scheme for an open network environment (e.g., the Bitcoin network or the Ethereum network).

In practice, each entity can be effectively distinguished by a combination of characteristics (e.g., Alice’s attributes can be [N.Y. City, Banker, Female]). These characteristics are known as attributes. Utilizing attribute sets to formulate access policies can give the access control system flexibility and scalability in an open network environment.

The most outstanding advantage of ABAC lies in its powerful mathematical expression ability. In our paper, DO can input the access policy into the ACC (access control contract) in the form of a vector $\vec{\mu}$, and the ACC supports adding and deleting policies. In order to enjoy the search service, DU needs to submit the user’s attributes to the ACC first, and the smart contract will execute the calculation and return a verification code. DU cannot know whether the verification code is valid before calling the SEC (searchable encryption contract) with it. After obtaining the verification code, the DU can enter the next search stage through the keyword and verification code.

To simplify the calculation in the contract and reduce the amount of gas cost, we need a simple and easy-to-use policy description and analysis tool. The scheme proposed by Lewko et al. [30] meets our needs. In their ABAC scheme, boolean formulas representing attributes are converted into an LSSS (Linear Secret-Sharing Schemes) matrix. The general approach includes:

They consider the boolean formula as an access tree, where interior nodes are AND and OR gates and the leaf nodes correspond to attributes;
They first label the root node of the tree with the vector $v = (1)$ (a vector of length 1), then go down the levels of the tree, labeling each node with a vector determined by the vector assigned to its parent node;
They pad the vector of each leaf node with zeros to length n (n is the length of the longest leaf vector). A formula with l attributes has l leaf nodes (i.e., l vectors), and these vectors are finally combined into the LSSS matrix $M_{l \times n}$.

There are different labeling rules for AND gates and OR gates. If the parent node is an OR gate labeled by the vector $v$, then we also label its children by $v$. If the parent node is an AND gate labeled by the vector $v$, then we label its left child by the vector $v \| 1$ and its right child by the vector $(0, \dots, 0, -1)$, whose length is the same as that of the left child.

After completing the labeling of each leaf node, we can construct the corresponding LSSS matrix. For example, consider the access policy formula $ATT$ = ($W_{1}$ or $W_{2}$) and $W_{3}$ and $W_{4}$; the LSSS matrix can be constructed as shown in Figure 1.
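The labeling rules can be sketched as a short recursive procedure. This is an illustrative implementation under assumptions: the tree is binary, the example formula is parsed left-nested as ((W1 or W2) and W3) and W4, a global counter tracks the current vector length as in the Lewko et al. conversion, and the names `Leaf`, `Gate`, and `lsss_matrix` are hypothetical. The resulting matrix is one valid LSSS matrix for the formula and is not claimed to be identical to the one in Figure 1.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    attr: str


@dataclass
class Gate:
    op: str          # "and" or "or"
    left: "Node"
    right: "Node"


Node = Union[Leaf, Gate]


def lsss_matrix(root: Node):
    """Return (rows, labels); row i of the matrix belongs to attribute labels[i]."""
    rows, labels = [], []
    counter = 1  # current vector length, incremented at every AND gate

    def visit(node, v):
        nonlocal counter
        if isinstance(node, Leaf):
            rows.append(list(v))
            labels.append(node.attr)
        elif node.op == "or":
            visit(node.left, v)    # OR: both children inherit v
            visit(node.right, v)
        else:
            padded = v + [0] * (counter - len(v))
            left_v = padded + [1]              # v || 1
            right_v = [0] * counter + [-1]     # (0, ..., 0, -1)
            counter += 1
            visit(node.left, left_v)
            visit(node.right, right_v)

    visit(root, [1])
    n = max(len(r) for r in rows)
    return [r + [0] * (n - len(r)) for r in rows], labels  # pad leaves to length n


policy = Gate("and", Gate("and", Gate("or", Leaf("W1"), Leaf("W2")), Leaf("W3")), Leaf("W4"))
M, labels = lsss_matrix(policy)
for name, row in zip(labels, M):
    print(name, row)
```

For this parse, the rows of W1, W3, and W4 sum to (1, 0, 0), which is exactly the condition an authorized attribute set must satisfy.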

The LSSS matrix $M$ contains the information about the access policy, which will provide a component for constructing the access policy vector $\vec{\mu}$. Before this, we first introduce the secret reconstruction property of LSSS in Definition 1.

Definition 1. The secret-reconstruction-based LSSS matrix $M_{l \times n}$.

Let $p$ be a prime, $s \in Z_{p}$ be a secret over a set of parties $P$, $M_{l \times n}$ be the LSSS matrix, and $\rho$ be a mapping function which labels each row of $M_{l \times n}$ with a party in $P$. With a column vector $\vec{z} = (s, z_{2}, z_{3}, \dots, z_{n})^{T}$, where $z_{2}, z_{3}, \dots, z_{n}$ are random values in $Z_{p}$, $M\vec{z}$ is the vector formed by the $l$ shares of the secret $s$, and $\mu_{i} = (M\vec{z})_{i}$ is the share belonging to the party $\rho(i)$. The pair $(M, \rho)$ can be used to represent an LSSS structure.

The LSSS structure has the linear reconstruction property. Specifically, if $S$ is an authorized set for the LSSS structure $(M, \rho)$, there exist constants $\{\theta_{i} \in Z_{p}\}_{i \in I}$ that satisfy $\sum_{i \in I} \theta_{i} M_{i} = (1, 0, \dots, 0)$, where $M_{i}$ is the $i$-th row of $M$ and $I = \{i \mid \rho(i) \in S,\ i \in [1, l]\}$. To reconstruct the secret, we can calculate $\sum_{i \in I} \theta_{i} \mu_{i} = \sum_{i \in I} \theta_{i} (M\vec{z})_{i} = (1, 0, \dots, 0) \cdot \vec{z} = s$. However, if $S$ is not an authorized set, another value will be recovered.
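A small numeric sketch may help to check the reconstruction identity. The matrix `M` below is a hypothetical LSSS matrix for the formula (W1 or W2) and W3 and W4 (not necessarily the one in Figure 1), the prime `P` is chosen only for illustration, and the names `shares`, `reconstruct`, and `random_z` are assumptions. Row indices are 0-based in the code, whereas the text counts rows from 1.

```python
import secrets

P = 2**61 - 1  # a prime modulus picked for illustration (assumption)

# Hypothetical LSSS matrix; rows are mapped by rho to W1, W2, W3, W4.
M = [
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, -1],
    [0, -1, 0],
]


def random_z(s: int, n: int, p: int = P) -> list[int]:
    """z = (s, z_2, ..., z_n) with z_2..z_n drawn uniformly from Z_p."""
    return [s % p] + [secrets.randbelow(p) for _ in range(n - 1)]


def shares(matrix, z, p: int = P) -> list[int]:
    """mu_i = (M z)_i mod p, one share per matrix row."""
    return [sum(m * zj for m, zj in zip(row, z)) % p for row in matrix]


def reconstruct(mu, theta: dict[int, int], p: int = P) -> int:
    """sum over i in I of theta_i * mu_i mod p; theta maps row index -> theta_i."""
    return sum(t * mu[i] for i, t in theta.items()) % p


s = 123456789
mu = shares(M, [s, 11, 22])           # fixed z_2, z_3 so the run is reproducible
print(reconstruct(mu, {0: 1, 2: 1, 3: 1}) == s)  # rows W1, W3, W4 sum to (1, 0, 0)
print(reconstruct(mu, {0: 1, 2: 1}) == s)        # {W1, W3} is not authorized
```

With the authorized rows the randomness $z_{2}, z_{3}$ cancels and $s$ is recovered; with the unauthorized rows the result is $s + z_{2}$, which differs from $s$ whenever $z_{2} \neq 0$.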

In consideration of gas-saving and flexible verification, we use the access policy vector $\vec{\mu}$ as the input of DO to call ACC and set the access policy. The secret $s$ will be used to construct the verification code and in the searchable encryption process, which will be presented in detail in Section 5.

4. Problem Formulation

In this section, we show the general picture of notations and algorithms that will be used in our scheme, and expound on the system overview and design goals.

4.1. System Overview

With the popularity of cloud computing, more and more people would like to deliver data to the cloud to ease the pressure on local storage. To protect the privacy and implement search services, searchable encryption has emerged. However, earlier SE schemes were built on the premise of an honest-but-curious server. Given that the server is always facing ubiquitous attacks, this premise does not appear to be always satisfied. Therefore, decentralized blockchain technology has been nominated to overcome this dilemma, and the calculating operations on the server are handed over to smart contracts to complete, ensuring the correctness and validity of the search results. Compared with the previous work, our scheme sets an additional smart contract for realizing attribute verification and implements a fine-grained access control/search encryption scheme.

The schematic diagram of our system is shown in Figure 2. The framework proposed in this paper mainly consists of five components: Blockchain, Cloud Database or IPFS, KMS, DO and DU.

Blockchain: As the basis of the whole framework, blockchain manages access policies and keyword indexes of encrypted documents through smart contracts.

Cloud Database or IPFS: The Cloud Database or IPFS is responsible for storing the original encrypted files.

KMS: The KMS is responsible for managing the secret key corresponding to file decryption.

DO: DO is responsible for managing information such as encrypted files and access policies. DO first uploads the encrypted file to the cloud database or IPFS and deploys the smart contract on the blockchain. Then, DO constructs the keyword index and access policy and uploads them to the blockchain through the smart contract, realizing the access control and searchable encryption of files.

DU: As the data user, DU calls the AC contract to verify its attributes and then calls the SE contract to query and obtain the encrypted file. DU obtains the symmetric key from KMS through transaction information verification. DU decrypts the encrypted file with a symmetric key to obtain a plaintext file.

Our scheme is inspired by the following scenario: Tom, who is a patient in Seattle, has a particular eye disease. He encrypts his electronic health record and uploads it to an electronic medical data server. At the same time, he also wants to share his record with ophthalmologists of the top three hospitals $hos_{A}$, $hos_{B}$, and $hos_{C}$ in New York City but does not want any other person or institution to obtain his personal information. Moreover, he does not consider that the hidden malicious server might deliver the wrong record or incomplete record to the ophthalmologists. In this scenario, Tom not only limits the sharing objects of his EHR, but also requires avoiding operations on a centralized server.

Regarding the elimination of the malicious server’s impact on the ciphertext search process, there have been some great works transplanting the traditional searchable encryption scheme to the blockchain [5,7], but none of these works have considered giving flexible and controllable access rights to DUs. Therefore, for the first time, we give fine-grained access control capabilities to DU based on this previous research.

We set up an access control contract (ACC) on Ethereum to verify the attributes of DUs. In the scenario we mentioned, Tom would first perform symmetric encryption and upload the EHR ciphertext and the symmetric key to the cloud database and the KMS, respectively. Then he calls ACC and authorizes the attribute formula $[N.Y.City \text{ and } (hos_{A} \text{ or } hos_{B} \text{ or } hos_{C}) \text{ and } ophthalmologist]$, encrypts the keywords in the EHR to obtain dictionary-type data, and uploads the data to the blockchain by calling the searchable encryption contract (SEC). When a DU wants to query by keywords, they must first call ACC, encapsulate their attribute information in the transaction and upload it to receive a verification code. Then, they call the SEC with the verification code and keyword parameters. The search retrieves the correct result only when the verification code and the keyword parameters meet the conditions preset by the contract. After the DU obtains the identifier of the corresponding document, they download the ciphertext document from the cloud database or IPFS and, in the meantime, deliver the transaction information to the KMS for inspection. After confirming that the transaction information is correct, the KMS sends the corresponding symmetric key via a secure channel (e.g., SSH).

The framework of this paper is based on Ethereum. Ethereum uses a unit called gas to measure the computing or storage resources needed to perform a task, for example, deploying a smart contract or calling a function. Gas in Ethereum has monetary value. Generally speaking, the more complex the task, the more gas is consumed, and the more money is needed. In this paper’s framework, deploying contracts and calling ABI interfaces require the consumption of gas. Gas cost is the key indicator to measure the main implementation cost of this scheme. It should be noted that since Ethereum is a transparent platform, it cannot be ruled out that other clients will attempt to reproduce the operation through the transaction information already recorded on the blockchain. Therefore, we adopt cryptography techniques to achieve the greatest degree of data protection. The specific method is described below.
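The relationship between gas, gas price and monetary cost described above can be made concrete with a small calculation. The following is a minimal sketch, not the accounting used in our experiments: the function names are hypothetical, and the gas price and exchange rate are inputs that the caller must supply, since both fluctuate over time.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def tx_fee_ether(gas_used: int, gas_price_gwei: Decimal) -> Decimal:
    """Fee in Ether: gas used multiplied by the gas price (given in gwei)."""
    fee_wei = Decimal(gas_used) * gas_price_gwei * WEI_PER_GWEI
    return fee_wei / WEI_PER_ETHER


def tx_fee_usd(gas_used: int, gas_price_gwei: Decimal, usd_per_ether: Decimal) -> Decimal:
    """Fee in USD for an assumed (caller-supplied) Ether/USD exchange rate."""
    return tx_fee_ether(gas_used, gas_price_gwei) * usd_per_ether


# Illustrative values only: a plain 21,000-gas transfer at 10 gwei and 2000 USD/ETH.
fee_eth = tx_fee_ether(21_000, Decimal("10"))
fee_usd = tx_fee_usd(21_000, Decimal("10"), Decimal("2000"))
```

Using `Decimal` rather than floating point avoids rounding surprises when the amounts are later summed over many transactions.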

4.2. Notations

In Figure 2, we give a formal description of our scheme $Π$. The standard notations used in this paper are shown in Table 2. We also introduce some common symbols. Let $H : \{0, 1\}^{*} \to \{0, 1\}^{h}$, $I : \{0, 1\}^{*} \to \{0, 1\}^{λ}$, $F : \{0, 1\}^{λ} \times \{0, 1\}^{*} \to \{0, 1\}^{λ}$ and $G : \{0, 1\}^{λ} \times \{0, 1\}^{λ} \to \{0, 1\}^{*}$ be four PRFs (pseudo-random functions), $∥$ be the concatenation symbol, $⌊ \cdot ⌋$ be the floor function, $| \cdot |$ denote the length or size of a variable, and $⊥$ mean that no data are present.

H and I are used in the AC process. DO converts the plaintext of the target attribute set into h-bit binary pseudorandom strings by H to ensure that others cannot learn the specific attribute elements contained in the target attribute set. In our experiment in Section 6, we set h to 80, since converting the attribute plaintext into an 80-bit binary string can satisfy almost all application scenarios. For example, in a medical scenario, the attributes of the DUs may include hospital names, positions, departments, and other related terminology, and the size of all attribute sets would be far less than $2^{80}$. Moreover, if our scheme is used in a small enterprise, the attribute set may be even smaller. In that case, we can choose a smaller h, such as $h = 64$, to reduce the matching overhead and gas cost of ACC.
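The conversion of an attribute into an h-bit string can be sketched as follows. This is only an illustration under stated assumptions: the function name `attribute_tag` is hypothetical, and SHA-256 from the Python standard library stands in for the hash used on-chain (Python's `hashlib.sha3_256` is not identical to Ethereum's keccak256). The truncation keeps the last h bits of the digest.

```python
import hashlib


def attribute_tag(attribute: str, h: int = 80) -> str:
    """Map an attribute name to an h-bit binary string.

    Assumption: SHA-256 is a stand-in for the on-chain hash; the last
    (least significant) h bits of the digest are kept.
    """
    if not 0 < h <= 256:
        raise ValueError("h must be between 1 and 256 bits")
    digest = hashlib.sha256(attribute.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    truncated = value & ((1 << h) - 1)  # keep only the last h bits
    return format(truncated, f"0{h}b")


tag_80 = attribute_tag("ophthalmologist")        # default h = 80
tag_64 = attribute_tag("ophthalmologist", h=64)  # shorter tag, cheaper matching
```

Because both tags are taken from the low-order end of the same digest, the 64-bit tag is simply the tail of the 80-bit tag. A shorter h lowers storage and comparison cost but also shrinks the space an attacker has to search.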

In addition, we denote by $μ$ the access policy vector obtained as the matrix product of the access policy matrix $M_{l \times n}$ and the secret-sharing column vector ${\vec{z}}_{n}$, and by ${att}_{l}$ the attribute expression list composed of all attributes in the access policy. Since each row in $M_{l \times n}$ represents an attribute in the attribute set, when DO calls the ACC to upload parameters, it needs to ensure that the elements of $μ$ and ${att}_{l}$ correspond one to one. Note that each element in ${att}_{l}$ is a binary string of length h. This representation ensures that the target attribute set is known only to the DO, which not only preserves privacy on the blockchain but also prevents a malicious DU from brute-forcing a valid verification code.

F and G are used in the SE process to construct keyword-identifier mapping pairs into dictionary data $γ$. For dictionary-type data on Ethereum, we define the Get function to obtain a specified data item from the dictionary. For example, given a dictionary $γ$ and an input label l, $Get (γ, l)$ outputs the item corresponding to the label l in the dictionary data $γ$.

We represent the database as $DB = \{(id_{i}, KW_{i})\}_{i = 1}^{d}$, which is a list of identifier-keyword pairs where $id_{i} \in \{0, 1\}^{l}$ and $KW_{i} \subseteq \{0, 1\}^{*}$. The whole keyword set in DB can be expressed as $KW = \cup_{i = 1}^{d} KW_{i}$. The set of documents containing a specific keyword $kw \in KW$ is denoted by $DB (kw) = \{id_{i} \mid kw \in KW_{i}\}$.
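The definitions of $DB$, $KW$ and $DB(kw)$ correspond to a plain inverted index. A minimal sketch, with hypothetical identifiers and keywords and before any encryption is applied:

```python
from collections import defaultdict


def build_inverted_index(db):
    """Map each keyword kw to DB(kw), the set of ids whose keyword set contains kw."""
    index = defaultdict(set)
    for doc_id, keywords in db:
        for kw in keywords:
            index[kw].add(doc_id)
    return dict(index)


# DB as a list of (id_i, KW_i) pairs; the values are illustrative only.
db = [
    ("id1", {"eye", "retina"}),
    ("id2", {"eye"}),
]
index = build_inverted_index(db)
all_keywords = set(index)  # KW, the union of all KW_i
```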

4.3. Algorithm Synopsis

Our scheme includes two processes, AC and SE. They are roughly composed of the following algorithms:

For the AC process:

$(μ, {att}_{l}) \leftarrow GetAccessPolicy (ATT, s)$ : Run locally by the DO. It takes a secret value s and an access policy formula $ATT$ as input and outputs $μ$ and an attribute expression list ${att}_{l}$.
$AddAccessPolicy (μ, {att}_{l}, {ID}_{AC})$ : Run by the DO, which calls ACC to set the access policy by uploading the parameters to Ethereum. ${ID}_{AC}$ is the unique identifier of each added access policy; with it, DO can flexibly delete or query a previously added policy.
$DelAccessPolicy ({ID}_{AC})$ : Run by the DO, which calls ACC to delete the ${att}_{l}$ and $μ$ that correspond to the ${ID}_{AC}$ of the access policy.
$(μ, {att}_{l}) \leftarrow QueryAccessPolicy ({ID}_{AC})$ : Run by the DO, which calls ACC to query the ${att}_{l}$ and $μ$ that correspond to the ${ID}_{AC}$ of the access policy.
$(VC, timestamp) \leftarrow AttributesVerification ({ATT}_{DU})$ : Run by the DU, which calls ACC and uploads its own attributes ${ATT}_{DU}$ to the blockchain as input. The ACC first compares the attributes possessed by the DU with the ${att}_{l}$ uploaded by the DO. If they satisfy the matching conditions (i.e., the access policy), the ACC successfully reconstructs s and calculates the verification code $VC = I (s ∥ msg.sender ∥ timestamp)$, where $msg.sender$ is the address of the DU and $timestamp$ is a parameter generated by the smart contract. Finally, ACC writes $VC$ and $timestamp$ to its state, which is publicly known. Note that if ${ATT}_{DU}$ does not satisfy the matching conditions, a $VC$ is still output. However, the first component used to generate this $VC$ is then not s but some other number, so this $VC$ is invalid in $Search$.
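The behaviour of $AttributesVerification$ can be sketched off-chain as follows. This is a simplified illustration, not the Solidity contract: SHA-256 stands in for I, the `|` delimiter used to encode the concatenation and the modulus `Q` are assumptions, and the attribute tags are shortened to single letters for readability.

```python
import hashlib

Q = 2**127 - 1  # assumed modulus for share arithmetic


def prf_I(data: bytes) -> str:
    """Stand-in for I; SHA-256 is an assumption, not the on-chain function."""
    return hashlib.sha256(data).hexdigest()


def attributes_verification(mu, att, att_du, sender, timestamp, q=Q):
    """Accumulate the shares of matching attributes, then derive VC.

    A VC is always returned; it is only valid if the accumulator equals s.
    """
    acc = 0
    for submitted in att_du:              # loop over the DU's attributes
        for mu_i, att_i in zip(mu, att):  # loop over the policy entries
            if submitted == att_i:
                acc = (acc + mu_i) % q
    payload = f"{acc}|{sender}|{timestamp}".encode("utf-8")
    return prf_I(payload), timestamp


# Toy policy whose three shares add up to s = 5 (mod Q).
mu = [12, 4, Q - 11]
att = ["a", "b", "c"]
vc_ok, ts = attributes_verification(mu, att, ["a", "b", "c"], "0xabc", 100)
vc_bad, _ = attributes_verification(mu, att, ["a", "c"], "0xabc", 100)
```

The nested loops mirror the two-layer matching of the contract, which is why the cost grows with the product of the two list lengths.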

For the SE process:

$(C_{DB}) \leftarrow Enc (DB, sk)$ : DO executes symmetric encryption (e.g., the DES or AES algorithm) on each document in the DB before executing other operations. Owing to the high efficiency of symmetric encryption, DO can encrypt each document with a different key $sk$. Then, DO uploads the ciphertext documents $C_{DB}$ and their corresponding keys to the cloud database and the trusted KMS, respectively.
$(K, K_{A}, K_{D}) \leftarrow GenerateKey (\{0, 1\}^{λ})$ : Run by the DO to sample three keys. It takes a security parameter $λ$ as input and outputs three keys for Search, Add and Del. Unlike previous schemes, the keys generated by this algorithm are publicly available.
$(EDB) \leftarrow SetupDB (K, KW, s)$ : Run locally by the DO. It takes the key K, all keywords $KW$, and the secret s as input and outputs the corresponding index $EDB$. Then DO calls SEC to upload the index to the blockchain. Figure 4 shows in detail how $EDB$ is constructed.
$(R) \leftarrow Search (VC, timestamp, K, K_{A}, K_{D}, kw)$ : Run by the DU, which calls SEC and uploads $(VC, timestamp, K, K_{A}, K_{D}, kw)$ to the blockchain as input. In our scheme, SEC is deployed by the DO. Before deployment, DO sets s as an internal parameter of the contract. Since a contract written in Solidity is eventually compiled into binary bytecode, the value of s is known only to the DO. The contract first checks whether $I (s ∥ msg.sender ∥ timestamp) = VC$. If so, it proceeds to the keyword matching search process and outputs the identifier list R of the relevant encrypted documents; otherwise, it directly outputs $R = ⊥$.
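The verification-code check at the start of $Search$ can be illustrated with the sketch below. It deliberately omits the keys $K, K_{A}, K_{D}$ and the encrypted dictionary: a plaintext keyword index stands in for $EDB$, SHA-256 stands in for I, and the `|` delimiter is an assumed encoding of the concatenation. `None` plays the role of $⊥$.

```python
import hashlib
import hmac


def verification_code(s, sender, timestamp) -> str:
    """Recompute I(s || msg.sender || timestamp) with the assumed encoding."""
    return hashlib.sha256(f"{s}|{sender}|{timestamp}".encode("utf-8")).hexdigest()


def search(vc, timestamp, sender, kw, secret_s, index):
    """Return the identifier list for kw only if vc matches; otherwise None."""
    expected = verification_code(secret_s, sender, timestamp)
    if not hmac.compare_digest(expected, vc):
        return None  # R = ⊥: invalid verification code
    return sorted(index.get(kw, ()))


index = {"eye": {"id2", "id1"}}
vc = verification_code(42, "0xabc", 100)
hit = search(vc, 100, "0xabc", "eye", 42, index)
wrong_sender = search(vc, 100, "0xdef", "eye", 42, index)
```

Binding the code to the caller's address means that a verification code observed in the public contract state cannot simply be replayed from another account.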

In addition, Add and Del, respectively, represent adding/deleting dictionary-type data stored on Ethereum. The detailed pseudocode of these two algorithms is shown in Figure 5.

4.4. Design Goals

Since our scheme builds on previous work, we adopt design goals similar to those of [5,7,8,31]: fairness, soundness and confidentiality, with their content updated from previous work. Fairness is introduced as a design goal so that each participant, especially in a multi-user setting, is treated fairly and motivated to follow the correct computation. Soundness means that a malicious user who does not perform operations correctly is detected and receives no reward. At the same time, the confidentiality of data files and query keywords should be protected against adversaries.

Fairness: Previous schemes [5,7,8] do not support free search by multiple DUs. They all rely on a mortgaged deposit so that the DO performs the search process (or builds trapdoors) on behalf of the DU. Since our scheme needs to support highly flexible privacy-preserving search, its fairness requirements are more demanding:

In $AddAccessPolicy$, the DO first delivers the specific access policy information to the ACC. To conduct a subsequent search, the DUs must first execute $AttributesVerification$. The contract returns a unique verification code to each DU, and this verification code is an h-length pseudorandom bit string generated from the secret s by a one-way pseudorandom function. Even if all users of Ethereum can see the verification code returned to the DU, they cannot learn the secret s set by the DO.
All DUs can construct a searchtoken from the keys $K, K_{A}, K_{D}$ and the keyword $kw$, and pass it together with the verification code $VC$ as parameters to call the SEC. However, only DUs that satisfy the access policy receive the identifier list R corresponding to the keyword $kw$. DUs who do not satisfy the access policy obtain only an empty identifier list R. Whether the DU satisfies the access policy is known only after they call SEC and obtain the result. If a malicious DU repeatedly calls the SEC to brute-force the verification, they cannot construct a verification code that verifies successfully and will only succeed in wasting gas.

In summary, fairness means that each party can call ACC and SEC, but obtains results only when meeting the preset conditions. Furthermore, because every operation on Ethereum costs gas, the contract deployer (i.e., DO) can control gasprice to adjust the operation expenses. When gasprice is not cheap, each transaction will deliver a considerable amount of Ether to the deployer, which hinders any party attempting a brute-force attack. Therefore, fairness guarantees that each party involved is incentivized to perform correct operations.

Soundness: Every operation of Ethereum requires gas, and its nature as a business platform just meets our needs. Participants who violate the agreement will only waste a lot of gas, which is a painful price. At the same time, with the help of transparency in smart contracts, we no longer need to worry about threats such as collusion.

Confidentiality: The confidentiality of ciphertext is guaranteed by the symmetric encryption algorithm. As long as the key length is sufficient, deciphering the ciphertext is computationally infeasible. In addition, when newly added documents are history-independent [3], it is difficult for any PPT adversary to learn whether a newly added document contains a keyword that has been searched before. Furthermore, with the help of technologies such as differential privacy, ciphertext documents can be properly obfuscated, which raises further obstacles for the adversary.

Since most of our SE processes are based on previous work for which sufficient security analysis exists, we focus on the security requirements of the AC process in this paper. That is to say, any DU that satisfies the access policy can obtain a valid verification code, while any DU that does not satisfy it can hardly brute-force a valid verification code without obtaining more information.

5. Scheme Construction and Security Analysis

In this section, we introduce our scheme $Π$ based on the Ethereum platform, consisting of $Π_{AC}$ and $Π_{SE}$.

5.1. $Π_{A C}$ : Fine-Grained Access Control Based on Verification Codes

Attribute-based access control has concise and flexible mathematical expression capabilities, and there are already many mature works for reference. The mathematical construction in our scheme $Π_{AC}$ is based on [30] and has been adapted to the subsequent SE process. In $Π_{AC}$, DO first constructs the access policy vector $μ$ and the attribute expression list $att$ locally, then packages these two parameters into a transaction and hands it over to the blockchain. The miners on the blockchain package new transactions into a block, and a qualified block is chained according to the PoW consensus mechanism. In $Π_{AC}$, the primary operations are centered on access policies, so the computational complexity mainly depends on the complexity of the attribute-based access policy, which in turn depends on the number n of DU attributes. Therefore, the computational complexity of $AddAccessPolicy$, $DelAccessPolicy$ and $QueryAccessPolicy$ is $O (n)$. The computational complexity of $AttributesVerification$ is $O (n^{2})$ because of its two-layer loop matching, which will be improved in our future work.

As the contract deployer, DO can add and delete access policies by calling the ACC. To verify a DU, only the upload of the attribute set $ATT_{DU}$ is required, which is then matched against the attribute expression list $att$ submitted by the DO. In this step, we set the initial value $A = 0$. Each attribute submitted by DU is matched against ${att}_{i}$. If it meets the matching requirements, we let $A = A + μ_{i}$. If the attributes of DU (i.e., $ATT_{DU}$) belong to the verification set S, the contract finally obtains $A = s$ and constructs the verification code from A, DU’s address and the block timestamp. The construction of $Π_{AC}$ is presented in detail in Figure 3.

In previous schemes, there were few studies on the access control of DU. In the most similar work [6], DU encrypts the transaction through the ECC cryptosystem to achieve stealth authorization. However, this scheme has two flaws:

It requires a huge amount of gas, which is unaffordable for DU;
The authorized DU can only search through the searchtoken established by DO in advance, which is extremely inconvenient.

In our scheme, DU requires only a small amount of gas to obtain a corresponding verification code. Each DU with a verification code can search freely during the SE process, which, to our knowledge, has not been reported in the literature. Owing to the introduction of VC, the design of $Π_{SE}$ differs somewhat from the pioneering scheme [5], as introduced later.

5.2. $Π_{S E}$ : Reliable Searchable Encryption with High Flexibility

Our scheme is inspired by Hu et al. [5]. In their work, DU realizes keyword searches through a fairness mechanism named $Π_{fair}$, which requires the user to submit a certain deposit \$deposit to the smart contract in advance. Once the DO finds the deposit information in the new transaction, they construct a searchtoken to execute the search process. Only when \$deposit is greater than the search execution cost \$cost and the service fee \$offer is paid to the DO will the smart contract execute the search and return the result to the DU. However, this scheme lacks flexibility: DU can execute a search only if DO is online. Moreover, DO can falsify the amount of \$offer to extract more Ether from \$deposit. Although Cai et al. [8] proposed a reliable and fair SE scheme based on [5], it is still rigid and inconvenient for DU to perform a search.

From the perspective of practicability rather than profitability, our scheme $Π_{SE}$ is more flexible and workable. DU can freely construct a searchtoken and pass it together with the verification code as parameters to call SEC. We also pack multiple plaintext identifiers together when constructing dictionary-type data, as reported in [3,5]. In order to protect confidentiality, the length of the data bits we encrypt each time cannot exceed the security parameter $λ$. So if the length of an identifier is $len$, the number p of identifiers packed together satisfies $p \leq λ / len$. In addition, since the search process needs the output of the AC process, we modified the original algorithm. Figure 4 shows our search scheme in detail.
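The bound $p \leq λ / len$ can be illustrated with a short packing routine. This is a sketch under assumptions: identifiers are modelled as non-negative integers of at most `id_len_bits` bits, the function names are hypothetical, and the values of λ and the identifier length used below are examples rather than the parameters of our experiments.

```python
def ids_per_block(lam_bits: int, id_len_bits: int) -> int:
    """Largest p with p * len <= lambda, i.e. floor(lambda / len)."""
    if id_len_bits <= 0 or id_len_bits > lam_bits:
        raise ValueError("identifier length must be in 1..lambda bits")
    return lam_bits // id_len_bits


def pack_identifiers(ids, id_len_bits: int, lam_bits: int):
    """Pack up to p identifiers into each lambda-bit block (as integers)."""
    p = ids_per_block(lam_bits, id_len_bits)
    blocks = []
    for start in range(0, len(ids), p):
        block = 0
        for doc_id in ids[start:start + p]:
            if doc_id < 0 or doc_id >> id_len_bits:
                raise ValueError("identifier does not fit in id_len_bits")
            block = (block << id_len_bits) | doc_id
        blocks.append(block)
    return blocks


p_example = ids_per_block(256, 32)                               # assumed lambda = 256, len = 32
blocks = pack_identifiers([1, 2, 3], id_len_bits=8, lam_bits=16)  # two ids per block
```

Each resulting block fits in λ bits and can therefore be masked with a single PRF output of that length.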

As with the original work [3,5], our scheme also supports dynamic updates. The $Add$ and $Del$ algorithms are almost the same as before; only the initialization of c has been adjusted. The specific $Add$ and $Del$ algorithms are presented in Figure 5.

We made the same adjustment of c in $Search$. The reason for this adjustment is to take multiple-DO scenarios into account, since each client on Ethereum can choose to be a DO if they wish to upload some data. We initialize c to s in the SE process $Π_{SE}$. When a client becomes a DO and uploads documents that contain exactly the same keyword-id pairs as a previous document, completely different dictionary-type data are constructed as long as the secret value s is different. Moreover, we can even select the keys $K, K_{A}, K_{D}$ once and use them throughout the SE process to reduce the pressure caused by key management.
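The effect of initializing the counter c to s can be illustrated as follows. The exact label derivation is given in Figure 4; the sketch below only assumes a common counter-based pattern in which a per-keyword key is derived with F and each label is F applied to the running counter. F is instantiated with HMAC-SHA256, as in our experiments, while the function name, the `b"1"` prefix and the key value are hypothetical.

```python
import hashlib
import hmac


def F(key: bytes, msg: bytes) -> bytes:
    """PRF F instantiated with HMAC-SHA256."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def labels_for_keyword(K: bytes, kw: str, s: int, count: int):
    """Derive `count` dictionary labels for kw, with the counter starting at s.

    Assumed pattern: k1 = F(K, 1 || kw), label_i = F(k1, s + i).
    """
    k1 = F(K, b"1" + kw.encode("utf-8"))
    return [F(k1, str(s + i).encode("utf-8")).hex() for i in range(count)]


K = b"\x00" * 32  # shared, publicly known key (illustrative value)
labels_do1 = labels_for_keyword(K, "eye", s=1000, count=3)
labels_do2 = labels_for_keyword(K, "eye", s=2000, count=3)
```

Even with the same key and the same keyword, the two DOs obtain different labels because their counters start from different secrets.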

Through the two schemes $Π_{AC}$ and $Π_{SE}$, a flexible and usable SE has been realized. Our scheme focuses on the practicability of SE on Ethereum, with less consideration for profitability. Because the search cost differs across application scenarios, it is hard to capture profitability in a common solution. Therefore, we do not provide a similar fairness mechanism to support the DU searches.

After obtaining the ciphertext document corresponding to the keyword, the DU needs to request a symmetric key from the KMS to decrypt the document. The KMS can be an externally owned account (EOA) on Ethereum that retains the entire blockchain. Once the DU proves to the KMS by signature that they are the submitter of the transaction, and the KMS verifies that the proof is correct, the KMS sends the decryption key to the DU via the secure channel. The key request can also be performed off-chain.

In $Π_{SE}$, the main operations all revolve around the files uploaded by DO, so the computational complexity mainly depends on the number of files n. Therefore, the computational complexity of $SetupDB$, $Search$, $Add$ and $Del$ is $O (n)$. Regarding obtaining the key, there is no specific description in previous similar work. It should be noted that the KMS is a server entrusted by DO to store keys, and its existence is crucial for DU to finally obtain the document plaintext. DO can choose to play the role of KMS to simplify the workflow, but this requires that DO be online and respond promptly to the DU’s key requests. If DO does not want to deliver the keys to a KMS, or does not trust other organizations, having DO act as the KMS is a good alternative.

Our scheme can be combined with encryption technology to achieve a greater degree of security, such as using the ABE technology [26] to verify the attributes of the DU when performing ciphertext decryption, which ensures that the attribute set declared by the DU during AC process is real.

5.3. Security Analysis

To illustrate the security of our scheme, we present the security analysis one by one through the design goals proposed before. It should be noted that our security analysis is based on the normal operation of the Ethereum network and the correct execution of smart contracts.

Fairness: Our scheme focuses on the efficient and flexible search capabilities assigned to DUs. Through attribute-based access control, DUs that meet the access policy can enjoy the search service without having to mortgage a deposit for each search. The guarantee of fairness comes from the auditability and authenticity of the execution of $AttributesVerification$ on ACC. A DU needs to request a verification code and search with its own attribute set. For a malicious DU, it cannot be ruled out that they will repeatedly declare different attribute sets to obtain multiple verification codes. However, the records of calling the contract are disclosed to every blockchain client. When other clients find that one address repeatedly calls ACC within a short time, it is easy to determine that the address belongs to a malicious DU. The openness of Ethereum exposes all actions, and any transaction information is permanently stored on Ethereum. Therefore, there is no need to consider collusion between DU and smart contracts. The potential server threats that appeared in past centralized searchable encryption systems no longer exist.

In addition, we can make some provisions outside the blockchain. For example, we can stipulate a maximum number of ACC calls within one day. If someone violates the agreement, their account is added to the contract deployer’s blacklist.
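Such an off-chain provision could be monitored with a simple sliding-window counter over the public call records. The sketch below is only one possible realization: the class name, the limit and the window length are hypothetical, and the timestamps are assumed to arrive in non-decreasing order, as block timestamps do.

```python
from collections import defaultdict, deque


class CallMonitor:
    """Blacklist an address that calls ACC more than max_calls times per window."""

    def __init__(self, max_calls: int, window_seconds: int = 86_400):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.blacklist = set()
        self._calls = defaultdict(deque)

    def record_call(self, sender: str, timestamp: int) -> bool:
        """Record one ACC call; return True if the sender is blacklisted."""
        if sender in self.blacklist:
            return True
        calls = self._calls[sender]
        calls.append(timestamp)
        # Drop calls that fell out of the sliding window.
        while calls and calls[0] <= timestamp - self.window_seconds:
            calls.popleft()
        if len(calls) > self.max_calls:
            self.blacklist.add(sender)
            return True
        return False


monitor = CallMonitor(max_calls=2, window_seconds=100)
results = [monitor.record_call("0xabc", t) for t in (0, 10, 20)]
```

Because the call records are public, any client can run such a monitor; only the decision about what to do with the blacklist remains with the contract deployer.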

Soundness: When our scheme operates normally, to obtain the verification code and enjoy the search, the DU should spend a certain amount of gas to call ACC, and the smart contract ensures that the returned ciphertext meets the needs of the DU. The consensus nature of Ethereum ensures the correct operation of every step. DO can also regulate the gasprice of the contract. If a malicious DU attempts a brute-force attack without any other information, they will only waste Ether.

Confidentiality: In the traditional scheme, the server will perform the search on the ciphertext. When the server becomes untrustworthy, it is difficult to guarantee that it will return the available results. Therefore, in our scheme, the operation performed by the server is transferred to the decentralized blockchain system, so that we can ensure the integrity and authenticity of the query and the returned results.

For the access control scheme, we give each DU a verification code based on attribute verification. To prevent someone from monitoring the contract state on Ethereum, we use the timestamp as a component to generate the verification code to ensure that each verification code is different.

To prove confidentiality, we also follow the real-ideal simulation paradigm. We mainly prove the confidentiality of $Π_{AC}$; for the confidentiality of $Π_{SE}$, we refer to the pioneering work [5,7]. In order to focus on our own contribution, we do not describe it further.

We first define the stateful leakage function of $Π_{AC}$ as $L (μ, att, ATT_{DU}) = \{s, VC\}$, and let $negl (\cdot)$ be a negligible function. Then, we say that $Π_{AC}$ is $L$-secure against non-adaptive attacks if for any probabilistic polynomial-time (PPT) adversary A, there exists a PPT simulator S such that:

$| Pr [Real_{A}^{Π_{AC}} (λ) = 1] - Pr [Ideal_{A, S}^{Π_{AC}} (λ) = 1] | \leq negl (λ)$

Theorem 1. If H and I are pseudo-random functions and $| att |$ is not too small, then $Π_{AC}$ is $L$-secure against non-adaptive attacks.

For general SSE (Symmetric Searchable Encryption), non-adaptive indistinguishability is equivalent to non-adaptive semantic security. It is therefore sufficient to prove that for all PPT adversaries A, there is a PPT simulator S such that the advantage in distinguishing the outputs of $Real_{A}^{Π_{AC}}$ and $Ideal_{A, S}^{Π_{AC}}$ is negligible.

To save gas, we could reduce the output length h of H, but this gives the adversary more advantage in distinguishing the attribute formula $μ$. However, in real-world scenarios, the attribute expression is a logical combination of multiple attributes (e.g., the number of attributes is 4 in Section 3.3). As $| att |$ increases, the difficulty for the adversary to reconstruct the secret s increases linearly with it. At this point, the adversary will prefer to distinguish VC. In $AttributesVerification$, we have $VC \leftarrow I (A ∥ address ∥ timestamp)$. If I is pseudo-random, then:

$Adv (A (VC)) = | Pr [AttributesVerification \to VC] - Pr [random \to VC^{'}] | \leq negl (λ)$

Therefore, we believe that $Π_{AC}$ is $L$-secure against non-adaptive attacks.

6. Experimental Evaluation

To demonstrate our scheme, we conducted experiments using Truffle and Ganache, local blockchain simulation tools for Ethereum. In order to reduce the waiting time in the simulation process and to collect experimental data more conveniently, we set the block time for mining to 0. In this way, once DU calls the contract, the contract responds immediately. The experiment was run on a PC equipped with an Intel Core i7-8750H processor at 2.20 GHz and 16 GB RAM. All operations were performed under Windows 10. We implement F and G using HMAC-SHA256, and implement H and I using keccak256, which is provided by Solidity. Since we need H to be a variable-length PRF, we take the last h bits of its output. In order to better compare the experimental results, we selected the same parameters as the pioneering work [5] in the SE process, which means we set p as 8, N as 4, and $Step$ as 47.

The program code of our scheme is written in Java, Python and Solidity. We summarize the detailed performance on Ganache as follows:

6.1. Performance of Access Control

We use Java to generate the LSSS matrix M and the access policy vector $μ$, and use Solidity to write the ACC that performs access control. To reflect how the gas cost varies with the number of attributes in access control, we increased the number of attributes from 3 to 20 and recorded the consumption of each function one by one. The gas cost for different functions is shown in Figure 6. Meanwhile, the theoretical Ether consumption is shown in Table 3.

Since more computational operations are executed for more attributes, the gas cost of these four functions increases with the number of attributes. The gas cost of $AddAccessPolicy$ increases most conspicuously, because storage operations are very expensive on Ethereum. This requires the DO to consider carefully before setting access control; otherwise, a lot of gas will be wasted. In general, adding an access policy with 20 attributes needs only 1,765,740 gas, about USD 13.2, which is economically acceptable.

We can observe an interesting phenomenon from Figure 6: the gas cost of $AddAccessPolicy$, $DelAccessPolicy$ and $QueryAccessPolicy$ increases linearly with the number of attributes, while the gas cost of $AttributesVerification$ grows one order faster (quadratically) than that of the other functions. This is due to the two nested $for$ loops in the algorithm; optimizing them can further reduce the gas cost and is left to our future work. To save gas, we can set the bit string length h, which is 80 in our experiment, to a smaller value. However, if we want to prevent a malicious DU from brute-forcing, we should increase h appropriately.

In particular, the exchange rate of Ether against USD in Table 3 is based on the market value of Ethereum on 12 March 2022. Given that the market value of Ethereum changes significantly over time, this exchange rate may fluctuate greatly at any time. We can further find from the table that when the number of attributes is 20, the gas of $AddAccessPolicy$ is significantly higher than that of $AttributesVerification$. Although the latter will eventually surpass the former when the number of attributes increases further, in reality 20 attributes can satisfy most access control scenarios. We can therefore draw a conclusion: the expenses needed for DO to add an access control policy are much higher than those needed for DU to verify attributes, which is fair and affordable for DU.

6.2. Performance of Searchable Encryption

For better performance comparison, we experimented with the same data source as [5] and selected increasing subsets of it as the original DB. To avoid misunderstanding, we emphasize that the DB in the scheme is a collection of all $id_{i}$ containing a specific keyword $kw_{i}$, rather than the original documents.

Compared with the stealth authorization access control scheme [6], our attribute-based access control scheme demonstrates better performance in practice. For example, in the $SetupDB$ phase, our scheme is almost as efficient as the pioneering work [5]. However, the stealth authorization scheme in [6] rewrites the data domain of the transaction into two parts, a stealth bag and authorization content, which means that the data they upload are far larger than the EDB. To illustrate the gas savings of our scheme, we compare the gas cost of [6] and our scheme in Table 4.

In the process of uploading EDB to Ethereum, our three $DB$s create 109, 166 and 211 transactions, respectively. Since transactions in Ganache can be quickly discovered and packaged by miners, the number of transactions determines the final setup time. However, in [6], DO needs to encrypt each transaction one by one using the ECC public key of the corresponding DU. As the number of transactions increases, the time overhead of encryption will far exceed the time overhead of communication in Ganache.

For $DB1$, after being encrypted into EDB, the data are divided into 109 transactions. Each transaction contains about 0.02 MB of dictionary-type data, and the average time for a transaction to be packed by miners and added to the blockchain is about 0.7 s, which gives roughly 109 × 0.7 s ≈ 76 s and is consistent with the 1.27 min setup time in Table 4. In [6], there is no description of how many transactions are needed to upload EDB. However, even for the smallest $DB4$, the data become much larger after being encrypted and combined into authorization content. Hence, since the ciphertext is larger than the plaintext, even when $DB4$ with only 5012 pairs is encrypted and uploaded to the blockchain, the required gas is far greater than that of our $DB3$ with 81,579 pairs. For $DB3$, we also take into account the gas cost of the $Π_{AC}$ scheme from the previous subsection. The total gas cost of the framework is then 73,183,103 wei, which is still far less than the gas required for $DB4$ to be uploaded to the blockchain. We adopt the gas price and exchange rate given in Table 3 to convert gas cost into USD cost, and find that our scheme is more reasonable and practical than [6] in terms of monetary cost.
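The arithmetic behind this comparison can be reproduced with a short script. The sketch below is only an illustration under stated assumptions: it treats the figures in Tables 3 and 4 as gas units, it assumes the framework total is the $DB3$ setup cost plus one call of each ACC function (this reading reproduces the reported 73,183,103 exactly), and the helper names `gas_to_ether`, `gas_to_usd`, and `gas_per_pair` are hypothetical.

```python
# Gas figures copied from Table 3 (ACC functions, 20 attributes).
ACC_GAS = {
    "AddAccessPolicy": 1_765_740,
    "DelAccessPolicy": 334_387,
    "QueryAccessPolicy": 120_944,
    "AttributesVerification": 750_336,
}

# (number of (kw, id) pairs, SetupDB gas) copied from Table 4.
SETUP = {
    "DB1": (44_254, 38_347_029),
    "DB3": (81_579, 70_211_696),
    "DB4": (5_012, 339_806_098),
}

GAS_PRICE_GWEI = 3        # from the caption of Table 3
USD_PER_ETHER = 2492.5    # from the caption of Table 3


def gas_to_ether(gas: int) -> float:
    """Convert gas units to Ether: 1 Gwei = 10**9 wei, 1 Ether = 10**18 wei."""
    return gas * GAS_PRICE_GWEI * 10**9 / 10**18


def gas_to_usd(gas: int) -> float:
    """Convert gas units to USD at the exchange rate assumed above."""
    return gas_to_ether(gas) * USD_PER_ETHER


def gas_per_pair(name: str) -> float:
    """Average SetupDB gas spent per (kw, id) pair for one database."""
    pairs, gas = SETUP[name]
    return gas / pairs


# Assumption: framework total = DB3 setup + one call of each ACC function.
framework_gas = SETUP["DB3"][1] + sum(ACC_GAS.values())

if __name__ == "__main__":
    print("framework gas:", framework_gas)
    print("framework cost (Ether):", gas_to_ether(framework_gas))
    print("framework cost (USD):", round(gas_to_usd(framework_gas), 2))
    print("DB4 setup cost (USD):", round(gas_to_usd(SETUP["DB4"][1]), 2))
    for name in SETUP:
        print(name, "gas per pair:", round(gas_per_pair(name), 1))
```

Under these assumptions, the per-pair view makes the gap easier to see than the totals: $DB4$ is much smaller than $DB3$ yet spends far more gas on each pair.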

In summary, both our scheme and that of Jiang et al. [6] adopt an access control mechanism, but the cost of establishing EDB with their stealth authorization is much higher than with ours.

For the search phase, our scheme spends slightly more time than [5] because of the additional computation needed to verify the VC. However, compared with similar work that has an access control mechanism [6], our scheme does not incur extra time overhead, and is even faster when the number of matching documents is small. Search time is evaluated from 10 to 55 matched documents. The details are shown in Figure 7.

The DB of [5] contains 100,762 $(kw, id)$ pairs and the EDB size is 5.4 MB. Note that we execute $Search$ 10 times on Ganache and report the average as the experimental result, while the experimental data of Hu et al. [5] and DB5 are taken directly from the respective papers. The number of transactions required for a search is 1. We find that the search time increases almost linearly with the number of matching documents, which is in line with the fact that each item in the dictionary $γ$ of EDB is matched one by one. Furthermore, the larger the dictionary, the more time the search takes.
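To illustrate why search cost grows with the number of matching documents, the following is a minimal sketch of a counter-based encrypted dictionary lookup in the style of Cash et al. [3]. It is not the $Π_{SE}$ construction of this paper: HMAC-SHA256 as the pseudo-random function, the key derivation in `token`, the 32-byte id padding, and all function names are assumptions made for the example.

```python
import hashlib
import hmac


def prf(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 used as a stand-in pseudo-random function (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def token(master_key: bytes, kw: str) -> tuple[bytes, bytes]:
    """Derive a label key and a masking key for one keyword (hypothetical layout)."""
    return (prf(master_key, b"label|" + kw.encode()),
            prf(master_key, b"mask|" + kw.encode()))


def build_gamma(master_key: bytes, db: dict[str, list[str]]) -> dict[bytes, bytes]:
    """Build the dictionary: one masked entry per (kw, id) pair."""
    gamma = {}
    for kw, ids in db.items():
        k_label, k_mask = token(master_key, kw)
        for c, doc_id in enumerate(ids):
            ctr = c.to_bytes(4, "big")
            padded = doc_id.encode().ljust(32, b"\0")  # ids of at most 32 bytes
            gamma[prf(k_label, ctr)] = xor(padded, prf(k_mask, ctr))
    return gamma


def search(gamma: dict[bytes, bytes], k_label: bytes, k_mask: bytes):
    """Probe counters 0, 1, 2, ... until a label is missing from the dictionary."""
    ids, probes = [], 0
    while True:
        ctr = len(ids).to_bytes(4, "big")
        probes += 1
        entry = gamma.get(prf(k_label, ctr))
        if entry is None:
            return ids, probes
        ids.append(xor(entry, prf(k_mask, ctr)).rstrip(b"\0").decode())


if __name__ == "__main__":
    key = b"k" * 32
    gamma = build_gamma(key, {"cloud": ["id1", "id7", "id9"], "chain": ["id2"]})
    print(search(gamma, *token(key, "cloud")))
```

In this sketch, a search performs one dictionary probe per matching document plus one final miss, so the number of probes is linear in the number of matches. On-chain costs that depend on the total dictionary size are not modelled here.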

7. Conclusions

In this paper, we aimed to bring fine-grained access control and flexible searchable encryption services to DUs, building on previous schemes. First, we relied on LSSS to allow the DO to construct a secret-sharing vector locally and to specify the range of target DUs by calling ACC. Before performing a search, a DU must call ACC to verify its identity. Only a DU that meets the access policy can obtain a valid verification code and perform the search. Second, we set up a more flexible searchable encryption scheme. Any DU can construct a searchtoken from keywords, but only a DU with a valid verification code can obtain the document ids returned by SEC. Since the blockchain records all transactions and offers strong auditability, we introduced a trusted KMS (or even the DO itself) to verify the transactions executed by a DU and to return the corresponding symmetric key for decryption.

In the future, we will try to combine our scheme with CP-ABE to further ensure the authenticity of DU attributes without excessively increasing the gas cost. Furthermore, the attribute verification algorithm in ACC will be optimized to achieve access control with a lower gas cost.

Author Contributions

J.H. and Z.L. contributed to writing the original draft, preparation, methodology, software, and validation of the proposed scheme; J.L. contributed to writing—review and editing and funding acquisition; H.W. contributed to supervision; M.X. contributed to project administration; Y.Z. contributed to conceptualization; Y.C. contributed to visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grant No. 61801489.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

1. Amazon Web Services. Cloud Storage Solutions for Free. Available online: https://aws.amazon.com/free/storage/ (accessed on 17 June 2022).
2. Tencent. Tencent Micro Cloud Intelligent Network Disk. Available online: https://www.weiyun.com/ (accessed on 17 June 2022).
3. Cash, D.; Jaeger, J.; Jarecki, S.; Jutla, C.S.; Krawczyk, H.; Rosu, M.C.; Steiner, M. Dynamic searchable encryption in very-large databases: Data structures and implementation. In Proceedings of the NDSS 2014, San Diego, CA, USA, 23–26 February 2014; Volume 14, pp. 23–26.
4. Leps. Baidu’s Response to Pandownload Developer’s Arrest: Actively Cooperate with the Police. Available online: https://www.rayradar.com/2020/04/16/baidus-response-to-pandownload-developers-arrest-actively-cooperate-with-the-police/ (accessed on 7 July 2020).
5. Hu, S.; Cai, C.; Wang, Q.; Wang, C.; Ren, K. Searching an Encrypted Cloud Meets Blockchain: A Decentralized, Reliable and Fair Realization. In Proceedings of the IEEE INFOCOM 2018—IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018; pp. 792–800.
6. Jiang, S.; Liu, J.; Wang, L.; Yoo, S.M. Verifiable Search Meets Blockchain: A Privacy-Preserving Framework for Outsourced Encrypted Data. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 21–23 May 2019; pp. 1–6.
7. Chen, L.; Lee, W.K.; Chang, C.C.; Choo, K.K.R.; Zhang, N. Blockchain based searchable encryption for electronic health record sharing. Future Gener. Comput. Syst. 2019, 95, 420–429.
8. Cai, C.; Weng, J.; Yuan, X.; Wang, C. Enabling Reliable Keyword Search in Encrypted Decentralized Storage with Fairness. IEEE Trans. Dependable Secur. Comput. 2018, 18, 131–144.
9. Li, H.; Zhang, F.; He, J.; Tian, H. A Searchable Symmetric Encryption Scheme using BlockChain. arXiv 2017, arXiv:1711.01030.
10. Song, D.X.; Wagner, D.; Perrig, A. Practical techniques for searches on encrypted data. In Proceedings of the 2000 IEEE Symposium on Security and Privacy, S&P 2000, Berkeley, CA, USA, 14–17 May 2000; pp. 44–55.
11. Boneh, D.; Di Crescenzo, G.; Ostrovsky, R.; Persiano, G. Public Key Encryption with Keyword Search. In Proceedings of the Advances in Cryptology—EUROCRYPT 2004, Interlaken, Switzerland, 2–6 May 2004; pp. 506–522.
12. Chen, R.; Mu, Y.; Yang, G.; Guo, F.; Wang, X. Dual-Server Public-Key Encryption with Keyword Search for Secure Cloud Storage. IEEE Trans. Inf. Forensics Secur. 2015, 11, 789–798.
13. Kamara, S.; Papamanthou, C. Parallel and Dynamic Searchable Symmetric Encryption. In Proceedings of the International Conference on Financial Cryptography and Data Security, Okinawa, Japan, 1–5 April 2013; pp. 258–274.
14. Yavuz, A.A.; Guajardo, J. Dynamic Searchable Symmetric Encryption with Minimal Leakage and Efficient Updates on Commodity Hardware. In Proceedings of the International Conference on Selected Areas in Cryptography, Sackville, NB, Canada, 12–14 August 2015; pp. 241–259.
15. Xia, Z.; Wang, X.; Sun, X.; Wang, Q. A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 340–352.
16. Cong, W.; Ren, K.; Yu, S.; Urs, K.M.R. Achieving usable and privacy-assured similarity search over outsourced cloud data. In Proceedings of the IEEE Infocom, Orlando, FL, USA, 25–30 March 2012; pp. 451–459.
17. Dong, Q.; Guan, Z.; Wu, L.; Chen, Z. Fuzzy Keyword Search over Encrypted Data in the Public Key Setting. In Proceedings of the Web-Age Information Management, Beidaihe, China, 14–16 June 2013; pp. 729–740.
18. Premasathian, N.; Choto, S. Searchable encryption schemes: With multiplication and simultaneous congruences. In Proceedings of the 2012 9th International Conference on Information Security and Cryptology (ISCISC), Tabriz, Iran, 13–14 September 2012; pp. 147–150.
19. Chai, Q.; Gong, G. Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers. In Proceedings of the 2012 IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 917–922.
20. Wood, G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Proj. Yellow Pap. 2014, 151, 1–32.
21. Li, M.; Weng, J.; Yang, A.; Lu, W.; Zhang, Y.; Hou, L.; Liu, J.-N.; Xiang, Y.; Deng, R. CrowdBC: A Blockchain-based Decentralized Framework for Crowdsourcing. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 1251–1266.
22. Cai, Y.; Fragkos, G.; Tsiropoulou, E.E.; Veneris, A. A truth-inducing sybil resistant decentralized blockchain oracle. In Proceedings of the 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), Paris, France, 28–30 September 2020; pp. 128–135.
23. Li, Z.; Liu, J.; Hao, J.; Wang, H.; Xian, M. CrowdSFL: A Secure Crowd Computing Framework Based on Blockchain and Federated Learning. Electronics 2020, 9, 773.
24. Mahmood, Z.; Jusas, V. Blockchain-Enabled: Multi-Layered Security Federated Learning Platform for Preserving Data Privacy. Electronics 2022, 11, 1624.
25. Ali, A.; Almaiah, M.A.; Hajjej, F.; Pasha, M.F.; Fang, O.H.; Khan, R.; Teo, J.; Zakarya, M. An Industrial IoT-Based Blockchain-Enabled Secure Searchable Encryption Approach for Healthcare Systems Using Neural Network. Sensors 2022, 22, 572.
26. Niu, S.; Chen, L.; Wang, J.; Yu, F. Electronic Health Record Sharing Scheme With Searchable Attribute-Based Encryption on Blockchain. IEEE Access 2020, 8, 7195–7204.
27. Barenji, A.V.; Montreuil, B. Open Logistics: Blockchain-Enabled Trusted Hyperconnected Logistics Platform. Sensors 2022, 22, 4699.
28. Wang, R.; Tsai, W.T. Asynchronous federated learning system based on permissioned blockchains. Sensors 2022, 22, 1672.
29. Bonatti, P.A.; Samarati, P. A uniform framework for regulating service access and information release on the Web. J. Press. Vessel Technol. 2002, 129, 52–57.
30. Lewko, A.; Waters, B. Decentralizing Attribute-Based Encryption. In Proceedings of the Advances in Cryptology—EUROCRYPT 2011, Tallinn, Estonia, 15–19 May 2011; pp. 568–588.
31. Kosba, A.; Miller, A.; Shi, E.; Wen, Z.; Papamanthou, C. Hawk: The Blockchain Model of Cryptography and Privacy-Preserving Smart Contracts. In Proceedings of the 2016 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 23–25 May 2016; pp. 839–858.

Figure 1. The construction of the LSSS matrix corresponding to the access policy formula $(W_{1} \text{ or } W_{2})$ and $W_{3}$ and $W_{4}$.

Figure 2. System overview.

Figure 3. $Π_{AC}$ construction.

Figure 4. Part I of $Π_{SE}$ construction.

Figure 5. Part II of $Π_{SE}$ construction.

Figure 6. The gas cost of calling different functions in ACC under different attribute values.

Figure 7. Comparison of search time.

Table 1. Characteristics of all similar work and ours.

Scheme	Demand for Data Owner	Usage of Data User	Gas Cost	Query Support	User-Defined Keyword Support	Flexibility in Multi-User Scenarios
Hu et al. [5]	On-line	After the deposit, DO searches and returns results to DU.	low	keyword	yes	bad
Cai et al. [8]	On-line	DU submits searchtoken to search and supports fair judgment.	medium	keyword	yes	general
Chen et al. [7]	On-line	DU submits query Q to DO, who returns the trapdoor used for the search.	low	boolean, range	yes	bad
Jiang et al. [6]	Off-line	DU can search based on stealth authorization.	high	keyword	no	general
Ours	Off-line	Attribute-based access control, DU searches freely.	low	keyword	yes	good

Table 2. The standard notations.

Notations	Definition
s	A secret value
$\vec{μ}$	An access policy vector
$M_{l \times n}$	An access policy matrix
${\vec{z}}_{n}$	A secret-sharing column vector
$γ$	A dictionary
$λ$	A security parameter
$KW$, $kw$	All keywords; a keyword
$H, I, F, G$	Pseudo-random functions
h	The bit length
$att_{l}$	The attribute expression list composed of all attributes in the access policy
$ATT$	The access policy formula
$ID_{AC}$	The unique identifier of the access policy added each time
$msg.sender$	The address of DU
$VC$	The verification code
K, $KA$, $KD$	Three secret keys for Search, Add and Del

Table 3. Gas cost in ACC. (gasPrice = 3 Gwei, 1 Ether = 2492.5 USD, Number of Attributes is 20).

Function	Gas Cost	Ether Cost	USD Cost
AddAccessPolicy	1,765,740	0.005297220	13.20067
DelAccessPolicy	334,387	0.001003161	2.499877
QueryAccessPolicy	120,944	0.000362832	0.904177
AttributesVerification	750,336	0.002251008	5.609512

Table 4. Performance comparison of $SetupDB$.

Scheme	DBs’ Name	$({kw}_{i}, {id}_{i})$ Pairs	Size of EDB	Setup Time	Gas Cost
Ours	DB1	44,254	2.12 MB	1.27 min	38,347,029 wei
	DB2	60,431	2.85 MB	2.19 min	51,925,577 wei
	DB3	81,579	3.43 MB	2.82 min	70,211,696 wei
Jiang et al. [6]	DB4	5012	0.32 MB	13.27 min	339,806,098 wei
	DB5	10,019	0.64 MB	33.49 min	679,281,587 wei
	DB6	19,987	1.28 MB	75.97 min	1,365,630,213 wei

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Han, J.; Li, Z.; Liu, J.; Wang, H.; Xian, M.; Zhang, Y.; Chen, Y. Attribute-Based Access Control Meets Blockchain-Enabled Searchable Encryption: A Flexible and Privacy-Preserving Framework for Multi-User Search. Electronics 2022, 11, 2536. https://doi.org/10.3390/electronics11162536
