Soft Set Decision and Cluster Percolation Method-Based Policy Clustering and Encryption Optimization for CP-ABE

Liu, Wei; Helil, Nurmamat

doi:10.3390/math12020259


by Wei Liu and Nurmamat Helil *

College of Mathematics and System Science, Xinjiang University, Urumqi 830046, China

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(2), 259; https://doi.org/10.3390/math12020259

Submission received: 15 December 2023 / Revised: 9 January 2024 / Accepted: 10 January 2024 / Published: 12 January 2024

(This article belongs to the Section Mathematics and Computer Science)


Abstract:

In ciphertext-policy attribute-based encryption, there might be different levels of overlapping in the access policies of different data objects outsourced by the same data owner. This paper proposes a soft set decision-making method and cluster percolation method-based policy clustering by using policy similarity for CP-ABE, aiming to merge the duplicated access policy pieces to reduce repeated computations during the encryption process of corresponding data objects. Firstly, the access policies are clustered using either the soft set decision-making or the cluster percolation method. Secondly, the access policies within the same cluster are integrated for further encryption of corresponding data objects as a whole, thereby preventing redundant computations during the encryption process and thus reducing computational overhead. Theoretical analysis and experimental results demonstrate the feasibility and effectiveness of the proposed approach in this paper.

Keywords: ciphertext-policy attribute-based encryption; shared sub-policy; soft set decision-making method; cluster percolation method

MSC: 68M25

1. Introduction

Attribute-based encryption (ABE) [1] has effectively addressed the issue of flexible access control for data in cloud storage. In the ciphertext-policy attribute-based encryption (CP-ABE) [2] scheme proposed by Bethencourt et al., data owners can define access policies flexibly. As a result, CP-ABE is particularly suitable for access control scenarios in cloud storage. Since most CP-ABE schemes are constructed based on elliptic curve bilinear groups, a significant number of bilinear pairings and exponentiation operations are involved in the encryption and decryption processes. These operations yield high computational costs for data owners.

In many cloud storage services that deploy CP-ABE-based access control, lightweight devices such as sensors, smartphones, and remote terminals act as data owners and consumers and must perform computation-intensive data encryption/decryption operations. Therefore, improving CP-ABE encryption/decryption efficiency is a non-trivial issue, and researchers have made substantial efforts to enhance it [3,4,5,6,7,8,9,10,11,12,13,14,15,16]. Among them, [3,4,5,6,7] focus on improving encryption efficiency. Zhou et al. [3] introduced an encryption outsourcing approach. They split the access policy into two pieces, and the encryption-related computations associated with the two pieces are carried out separately by the data owner and the encryption service provider; the data owner only needs to conduct computations related to a sub-tree with just one attribute, while the encryption service provider conducts most of the remaining computations. Li et al. [4] proposed an encryption outsourcing scheme based on MapReduce technology. This scheme divides the access tree into two subtrees from the root node. The encryption-related computations for the left subtree are performed using MapReduce; the data owner performs encryption-related computations for the right subtree. Luo et al. [5] proposed a fast encryption CP-ABE scheme for the Spark big data platform. They utilized parallel computing methods to improve the efficiency of encryption. Hohenberger et al. [6] introduced an online/offline CP-ABE and KP-ABE scheme. Their scheme divides the encryption process into online and offline stages. The computations of bilinear pairings for encryption are performed in the offline stage; the data owner only needs to modify the relevant attribute parameters during the online stage to obtain the corresponding ciphertext. This approach reduces the computational overhead of the online encryption. Leng et al. [7] presented an ABE scheme that supports encryption outsourcing. This scheme needs to construct the shared access policy as a matrix; it outsources its construction and most of the encryption work of corresponding data objects to a cloud encryption server. As a result, the data owner only needs to perform three exponentiation operations to complete the remaining encryption process.

Researchers have also made efforts to address the efficiency concerns in decryption [8,9,10,11,12,13,14,15,16]. Green et al. [8] introduced a CP-ABE scheme that outsources decryption operations to a third party. Subsequently, decryption optimization of CP-ABE has been considered in applications such as intelligent connected vehicles [9], smart healthcare [10], and the Internet of Things (IoT) [11]. The privacy-preserving CP-ABE scheme in [12] delegates certain operations to distributed proxy servers during the decryption stage, effectively reducing the computational cost for users. Zou et al. [13] introduced a fast decryption approach for CP-ABE. This approach incorporates Spark cluster and parallel computing techniques into the construction of the CP-ABE scheme, achieving rapid decryption. Li et al. [14] proposed a constant ciphertext length CP-ABE scheme that supports outsourcing encryption and decryption operations, significantly reducing communication costs. Zhang et al. [15] and Sheng [16], respectively, proposed fully outsourced CP-ABE schemes, enabling the outsourcing of key generation, encryption, and decryption. However, all of the mentioned approaches [3,4,5,6,7,8,9,10,11,12,13,14,15,16] address the issue of excessive computational overhead primarily by improving the algorithms’ efficiency.

To tackle the problem of increasing computational costs in encryption due to the complexity of access policies or the number of attributes, some researchers have aimed to enhance overall encryption efficiency by optimizing access policies and integrating similar ones to reduce redundant computations, rather than by optimizing the encryption and decryption algorithms for single data objects. Wang et al. [17] proposed the layered access structure to solve the secure sharing of hierarchical files. The files are encrypted as a whole with one integrated access structure. This scheme integrates different access trees into a single access tree, thus saving storage costs and reducing encryption overhead. Wang [18] proposed a policy compression approach based on a greedy algorithm. This approach introduces public attribute ciphertext and “sub-policy” and achieves a compact access policy for a given access tree. It reduces the ciphertext length and improves the computational efficiency of both encryption and decryption. In [19], the authors proposed integrating access trees of different data objects of the data owner, thus avoiding repeated computations in the encryption process and reducing the computational overhead. However, this scheme deals with the case of only one shared sub-policy among different access policies. In our previous study [20], we proposed merging multiple shared sub-policies among different access policies and encrypting corresponding data objects as a whole to further reduce the overall computational cost of encryption of these data objects.

Decision-making is selecting the optimal solution from several alternatives to achieve a specific goal. Decision-making based on soft sets and their extensions with uncertainty has been adopted in various fields, including health care, management, finance, and artificial intelligence. The choice value algorithm [21] and the comparison score algorithm [22] are the two primary soft-set theory methods utilized for decision-making problems. Liu et al. [23] proposed a decision model based on a fuzzy soft set and ideal solution approach, which provides an algorithm of “divide-and-conquer” for attributes of the soft set. In contrast to existing works based on the choice value-based approach and the comparison score-based approach, it generates the optimal ideal solution according to the distinct properties of each attribute. After that, it uses the weighted Hamming distance to calculate the similarity between each possible alternative object and the ideal object. The closest one to the ideal object will be the optimal choice.

There might be varying degrees of overlap or similarity in the access policies of different data objects that the same data owner is about to outsource. The above works did not consider globally grouping data objects of a data owner according to the similarity of access policies of these data objects. If we group all data objects of a data owner according to their policy similarity and then merge the identical parts of the policies of data objects in the same group, we can avoid repeated independent computation to some extent during the encryption process, thus improving overall encryption efficiency. That is our intention.

In this paper, we address the tree access structure CP-ABE. We need to clarify some concepts beforehand to describe our study’s object better. We call a subtree with only one branch node a primary sub-policy. When a primary sub-policy appears in multiple access trees, we call it a primary shared sub-policy of these access trees.

We support “AND” gate and “OR” gate tree access structures, with one limitation: the parent node of each primary shared sub-policy must be the root node of the whole access tree. In addition, if a data owner releases only a few data objects, or their access control policies overlap little, clustering might not be needed. If a data owner releases many data objects and there are significant overlaps among their access policies, our scheme will be useful.

Based on previous work [20], this paper employs soft set decision-making and the cluster percolation method (CPM) [24] to cluster the access trees. Similar primary shared sub-policies among access trees from the same cluster are merged to the utmost extent; then corresponding data objects are encrypted together, thereby eliminating repeated independent computation of encryption and enhancing the overall efficiency of encryption.

The main contributions of this paper are as follows:

For the case that the parent node of each primary shared sub-policy is the root node of the whole access tree, we use a soft set decision-making method to cluster and then integrate all the access trees from the same cluster optimally, thus minimizing the overall computational overhead of encryption for data owners.
For the case that the parent node of each primary shared sub-policy is the root node of the whole access tree, we use the CPM assisted by the soft set decision-making method to cluster and then integrate all the access trees from the same cluster optimally, thus minimizing the overall computational overhead of encryption for data owners.

Our approach is suitable for data outsourcing scenarios that utilize CP-ABE-based access control where the data objects released by the same data owner have tree access policies whose sub-policies overlap substantially. Similar to most CP-ABE schemes, our scheme is IND-CPA secure.

2. Basic Knowledge

2.1. Bilinear Mapping

Let $G_{1}$ and $G_{2}$ be two cyclic groups of prime order $p$, where the generator of $G_{1}$ is $g$. A mapping $e : G_{1} \times G_{1} \to G_{2}$ is considered bilinear if it fulfills the following three conditions.

Non-degeneracy: $e(g, g) \neq 1$;
Bilinearity: for all $u, v \in G_{1}$ and $a, b \in Z_{p}$, the equation $e(u^{a}, v^{b}) = e(u, v)^{a b}$ holds;
Computability: for all $x, y \in G_{1}$, the value of $e(x, y)$ can be computed in polynomial time.

2.2. Monotonic Access Structure

Let $\{P_{1}, P_{2}, \dots, P_{n}\}$ be the set of participants. A set $A \subseteq 2^{\{P_{1}, P_{2}, \dots, P_{n}\}}$ is monotone if, for all subsets $B$ and $C$, $B \in A$ and $B \subseteq C$ imply $C \in A$. A (monotonic) access structure $A$ is a (monotonic) set formed by non-empty subsets of $\{P_{1}, P_{2}, \dots, P_{n}\}$, i.e., $A \subseteq 2^{\{P_{1}, P_{2}, \dots, P_{n}\}} \setminus \{\emptyset\}$. The subsets in $A$ are called authorized sets, while the subsets not in $A$ are referred to as unauthorized sets.

2.3. Access Tree

Let $T$ denote an access tree composed of leaf nodes and non-leaf nodes. Each non-leaf node $x$ in $T$ is characterized by its child nodes and a threshold gate. Let $num_{x}$ represent the number of child nodes of node $x$, and $k_{x}$ be its threshold, where $0 < k_{x} \leq num_{x}$. When $k_{x} = t$, it represents a logic of “$t$ out of $num_{x}$”. When $k_{x} = 1$, it represents a logical “OR”, and when $k_{x} = num_{x}$, it represents a logical “AND”.

For each node $y$ in the access tree $T$, let $parent(y)$ denote the parent node of $y$. Each parent node assigns an index value to each of its child nodes. Define the function $index(y)$ to represent the index value of node $y$, where $1 \leq index(y) \leq num_{x}$ and $x = parent(y)$. The function $att(y)$ is defined to represent the attribute associated with the leaf node $y$ in $T$.
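The threshold semantics above can be illustrated with a minimal sketch. The class `Node`, the helpers `leaf`, `gate`, and `satisfied`, and the attribute names are hypothetical and serve only as an illustration; they are not part of the scheme itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A leaf carries an attribute; a non-leaf carries a threshold k_x."""
    attribute: Optional[str] = None
    threshold: int = 1
    children: List["Node"] = field(default_factory=list)


def leaf(attribute: str) -> Node:
    return Node(attribute=attribute)


def gate(threshold: int, *children: Node) -> Node:
    # Enforce 0 < k_x <= num_x from the definition above.
    if not 0 < threshold <= len(children):
        raise ValueError("threshold must satisfy 0 < k_x <= num_x")
    return Node(threshold=threshold, children=list(children))


def satisfied(node: Node, attributes: Set[str]) -> bool:
    """Recursively check whether the attribute set satisfies the subtree."""
    if not node.children:
        return node.attribute in attributes
    hits = sum(satisfied(child, attributes) for child in node.children)
    return hits >= node.threshold


# Hypothetical policy: (doctor OR nurse) AND cardiology.
# OR is a gate with k_x = 1; AND is a gate with k_x = num_x = 2.
policy = gate(2, gate(1, leaf("doctor"), leaf("nurse")), leaf("cardiology"))
```

In this sketch, `satisfied(policy, {"nurse", "cardiology"})` holds, whereas an attribute set without "cardiology" does not satisfy the root "AND" gate.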

2.4. Primary Shared Sub-Policy

Definition 1.

The same subtree that appears in multiple access trees is called a shared sub-policy of these access trees [19].

Definition 2.

If multiple access trees share the same subtree and this subtree contains only one branch node, it is called a primary shared sub-policy of these access trees.

2.5. Ideal Access Tree

Definition 3.

For a given set of access trees, the ideal access tree is the access tree that includes all the primary shared sub-policies that appear in these access trees.

We only require that the ideal access tree includes all of the primary shared sub-policies that appeared in certain access trees. There are no additional requirements regarding its structure or the remaining part of it. We use it to select the most similar access tree to it, which contains the highest number of leaf nodes in its primary shared sub-policies overall. This access tree is used as a basis for later clustering and integration.

2.6. Basic Sub-Policy

Definition 4.

For given access trees, if all of them contain a primary shared sub-policy, we call this primary shared sub-policy the basic sub-policy of these access trees.

2.7. Soft Set

Definition 5.

Let $(U, E)$ be a soft space, where $U$ is the initial universal set, $E$ is the parameter set, $P(U)$ is the power set of $U$, and $A \subseteq E$. Let $F : A \to P(U)$ be a mapping. The pair $(F, A)$ is a soft set over the universe $U$.

Example 1.

$U$ is the initial universal set, consisting of six houses: $U = \{h_{1}, h_{2}, h_{3}, h_{4}, h_{5}, h_{6}\}$. The parameter set $E$ is given as $E = \{e_{1}, e_{2}, e_{3}, e_{4}, e_{5}, e_{6}, e_{7}\}$, where $e_{1}$, $e_{2}$, $e_{3}$, $e_{4}$, $e_{5}$, $e_{6}$, and $e_{7}$, respectively, represent the attributes “expensive”, “good environment”, “wooden”, “beautiful”, “cheap”, “well-maintained”, and “in disrepair”. In this soft space $(U, E)$, a soft set describes which houses have attributes such as “expensive” or “beautiful”. Suppose someone is interested in purchasing a house and is concerned with the attributes $e_{1}$, $e_{2}$, $e_{3}$, $e_{4}$, and $e_{5}$. Their evaluation of the houses can be represented as a soft set $(F, A)$, where $A = \{e_{1}, e_{2}, e_{3}, e_{4}, e_{5}\}$. Let us assume that $F(e_{1}) = \{h_{2}, h_{4}\}$, $F(e_{2}) = \{h_{1}, h_{3}\}$, $F(e_{3}) = \{h_{3}, h_{4}, h_{5}\}$, $F(e_{4}) = \{h_{1}, h_{3}, h_{5}\}$, and $F(e_{5}) = \{h_{1}\}$. Then, $F(e_{1}) = \{h_{2}, h_{4}\}$ indicates that houses $h_{2}$ and $h_{4}$ are “expensive”, $F(e_{2}) = \{h_{1}, h_{3}\}$ indicates that houses $h_{1}$ and $h_{3}$ have a “good environment”, and similar interpretations can be made for the other attributes.
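As an illustration, the soft set of Example 1 can be stored as a mapping from parameters to subsets of $U$ and then unfolded into the 0/1 table used for decision-making. The names `U`, `F`, and `tabular` below are illustrative choices, and the tabular layout is only one possible representation.

```python
# Universe of houses and the mapping F : A -> P(U) from Example 1.
U = ["h1", "h2", "h3", "h4", "h5", "h6"]
F = {
    "e1": {"h2", "h4"},        # expensive
    "e2": {"h1", "h3"},        # good environment
    "e3": {"h3", "h4", "h5"},  # wooden
    "e4": {"h1", "h3", "h5"},  # beautiful
    "e5": {"h1"},              # cheap
}


def tabular(soft_set, universe):
    """Return {object: [0/1 per parameter]} with parameters in sorted order."""
    params = sorted(soft_set)
    return {obj: [int(obj in soft_set[p]) for p in params] for obj in universe}


table = tabular(F, U)
for house in U:
    print(house, table[house])
```

Each row of `table` is the binary vector of one house over $e_{1}, \dots, e_{5}$; a house that appears in no $F(e_{i})$, such as $h_{6}$, gets an all-zero row.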

2.8. Decision Function Based on Hamming Distance

Hamming distance describes the degree of difference between two elements of equal length.

Definition 6.

The normalized Hamming distance of dimension $n$ is a mapping $d_{H} : R^{n} \times R^{n} \to R$ that satisfies:

$$d_{H}(A, B) = \frac{1}{n} \sum_{i = 1}^{n} | a_{i} - b_{i} |$$

where $A = (a_{1}, a_{2}, \dots, a_{n})$ and $B = (b_{1}, b_{2}, \dots, b_{n})$.

Here, we treat each primary shared sub-policy as an attribute for decision-making. Let $U$ be composed of attributes (primary shared sub-policies), and $(F, A)$ be a soft set on the domain $U$. All attributes have equal importance. Let $u_{goal}$ represent the optimal solution. Thus, the decision problem becomes an optimization problem:

$$\min \{ d_{H}(u_{goal}, u_{i}) \mid u_{i} \in U, \ i = 1, 2, \dots, n \}$$

where $n$ is the number of elements in the initial universal set.

Definition 7.

Let $W = (w_{1}, w_{2}, \dots, w_{n})$ be the weights, with $\sum_{i = 1}^{n} w_{i} = 1$. The weighted Hamming distance $d_{WH}$ is a mapping $d_{WH} : R^{n} \times R^{n} \to R$ given by:

$$d_{WH}(A, B) = \sum_{i = 1}^{n} w_{i} | a_{i} - b_{i} |$$

where $A = (a_{1}, a_{2}, \dots, a_{n})$ and $B = (b_{1}, b_{2}, \dots, b_{n})$.
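Definitions 6 and 7 translate directly into code. The following is a minimal sketch; the function names `normalized_hamming` and `weighted_hamming` are illustrative and do not come from the original text.

```python
def normalized_hamming(a, b):
    """d_H(A, B) = (1/n) * sum |a_i - b_i| for equal-length vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def weighted_hamming(a, b, w):
    """d_WH(A, B) = sum w_i * |a_i - b_i|, with the weights summing to 1."""
    if not (len(a) == len(b) == len(w)):
        raise ValueError("vectors and weights must have equal length")
    return sum(wi * abs(x - y) for wi, x, y in zip(w, a, b))
```

With equal weights $w_{i} = 1/n$, the weighted distance reduces to the normalized one, so Definition 6 is the special case of Definition 7 in which all attributes are equally important.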

2.9. Primary Shared Sub-Policy Weight

Our proposal calculates the distance between each access tree and the ideal access tree using the weighted Hamming distance. The weights for each attribute (primary shared sub-policy) are calculated according to the following definitions:

Definition 8.

The impact factor of a primary shared sub-policy is defined as multiplying the number of access trees containing it by the number of leaf nodes within it.

Definition 9.

The maximum combinable number of a primary shared sub-policy is defined as subtracting its number of leaf nodes from its impact factor.

Definition 10.

The weight of a primary shared sub-policy is defined as dividing its maximum combinable number by the sum of the maximum combinable numbers of all the primary shared sub-policies.
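Definitions 8–10 can be written compactly as formulas. The symbols below are our own shorthand rather than notation from the original text: a primary shared sub-policy $s$ has $\ell_{s}$ leaf nodes and appears in $m_{s}$ access trees.

```latex
% Assumed notation: sub-policy s has \ell_s leaf nodes and appears in m_s access trees.
\begin{align*}
I_s &= m_s \, \ell_s                      && \text{(impact factor, Definition 8)} \\
C_s &= I_s - \ell_s = (m_s - 1)\, \ell_s  && \text{(maximum combinable number, Definition 9)} \\
w_s &= \frac{C_s}{\sum_{t} C_t}           && \text{(weight, Definition 10)}
\end{align*}
```

One plausible reading is that $C_{s}$ counts the leaf nodes that need no separate processing once the $m_{s}$ copies of $s$ are merged into one. By construction the weights sum to 1, as Definition 7 requires.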

2.10. Cluster Percolation Method

A clique is a set of nodes in an undirected graph in which any two nodes are connected by an edge, i.e., a complete subgraph. A maximal complete subgraph is a complete subgraph that is not contained in any larger complete subgraph. A complete subgraph with $k$ nodes is called a $k$-clique. We say two $k$-cliques are connected if they overlap in $k - 1$ nodes, as shown in Figure 1.

The set of all connected cliques with a size of k or greater than k forms a k-community, as shown in Figure 2.

The main idea of the CPM is first to find complete subgraphs and then utilize these complete subgraphs to find $k$-communities; the $k$ value of a community indicates that the community is composed of cliques of size $k$ or greater. After finding all $k$-cliques, an overlap matrix of these cliques can be constructed. In this symmetric matrix, each row (column) represents a clique, and the non-diagonal elements of the matrix represent the number of common nodes in two connected cliques. The diagonal elements represent the size of the clique. By setting non-diagonal elements less than $k - 1$ to 0 and diagonal elements less than $k$ to 0, with other elements set to 1, we obtain a $k$-clique adjacency matrix, where each connected part forms a $k$-community, as shown in Figure 3.
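The procedure above can be sketched in pure Python. This is an illustrative implementation under stated assumptions, not the CFinder code: maximal cliques are enumerated with the basic Bron–Kerbosch recursion, and cliques of size at least $k$ that overlap in at least $k - 1$ nodes are joined with a union-find structure, which plays the role of the thresholded overlap matrix. The function names and the toy edge list are hypothetical.

```python
from itertools import combinations


def adjacency(edges):
    """Build {node: set of neighbours} from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def k_clique_communities(adj, k):
    """Join cliques of size >= k that share at least k - 1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Off-diagonal entries of the overlap matrix that survive the threshold.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    # Largest community first, ties broken by node labels.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Toy graph: two triangles sharing an edge, a bridge, and a separate triangle.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]
communities = k_clique_communities(adjacency(EDGES), 3)
```

For the toy graph with $k = 3$, the triangles {1, 2, 3} and {2, 3, 4} share two nodes and merge into one community, while {5, 6, 7} stays separate because the bridge (4, 5) is only a 2-clique.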

In our second approach, we measure the similarity between two access trees based on the number of leaf nodes in their primary shared sub-policies. Using this similarity metric, we find k-cliques and combine multiple connected k-cliques to form the k-community. We select the k-community with the largest number of nodes and the maximum value of k as the pre-selected result for clustering. Finally, we refine the pre-selected result using the soft-set decision-making method.

3. Access Policy Clustering Method Based on Soft Set Decision-Making

3.1. Overview

In this paper, we first use a soft set decision-making method to select an optimal access tree closest to the ideal access tree, i.e., the one with the highest number of leaf nodes in all its primary shared sub-policies, from a given number of access trees. After that, we select the primary shared sub-policy with the maximum combinable number and use this primary shared sub-policy as the clustering criterion to put all access trees containing this primary shared sub-policy into one cluster; the rest of the access trees are repeatedly clustered using the same method.

The following example illustrates the clustering method of the access policy based on soft-set decision-making.

Example 2.

Consider an example of 20 access trees $T_{1}, T_{2}, \dots, T_{20}$ containing 10 primary shared sub-policies $T_{A}, T_{B}, \dots, T_{J}$. Each tree’s primary shared sub-policies are randomly given, as shown in Table 1. The numbers in the table indicate the number of leaf nodes of a primary shared sub-policy included in the access tree.

First, we count all primary sub-policies of each access tree and further determine all primary shared sub-policies according to whether they appear repeatedly in different access trees. Next, use all primary shared sub-policies to construct a vector. Each component represents if the corresponding primary shared sub-policy is included; if included, the component is set to 1; otherwise, it is set to 0.

According to Table 1, the primary shared sub-policies of the access tree $T_{1}$ can be represented by the vector (1,0,1,0,1,1,0,0,1,0). The other access trees are represented in the same way. For the ideal access tree containing all the primary shared sub-policies, the corresponding vector is (1,1,1,1,1,1,1,1,1,1).

$$(F, A) = \left[ \begin{array}{c|cccccccccc}
 & e_{1} & e_{2} & e_{3} & e_{4} & e_{5} & e_{6} & e_{7} & e_{8} & e_{9} & e_{10} \\
T_{goal} & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
T_{1} & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
T_{2} & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\
T_{3} & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 \\
T_{4} & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
T_{5} & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
T_{6} & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
T_{7} & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
T_{8} & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
T_{9} & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
T_{10} & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
T_{11} & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
T_{12} & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
T_{13} & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
T_{14} & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
T_{15} & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
T_{16} & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
T_{17} & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
T_{18} & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
T_{19} & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
T_{20} & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0
\end{array} \right] \tag{1}$$

As shown in Table 1, the numbers of leaf nodes of the primary shared sub-policies $T_{A}, T_{B}, \dots, T_{J}$ are 3, 2, 1, 3, 2, 1, 2, 2, 1, and 2, respectively. Five access trees include the primary shared sub-policy $T_{A}$, so the impact factor of $T_{A}$ is $3 \times 5 = 15$. Similarly, for the primary shared sub-policies $T_{B}, \dots, T_{J}$, the impact factors are 12, 7, 12, 10, 8, 6, 8, 5, and 10.

Next, calculate the maximum combinable number for each primary shared sub-policy. The maximum combinable number for the primary shared sub-policy $T_{A}$ is $15 - 3 = 12$. Similarly, for the primary shared sub-policies $T_{B}, \dots, T_{J}$, the maximum combinable numbers are 10, 6, 9, 8, 7, 4, 6, 4, and 8.

After that, calculate the weights for the primary shared sub-policies as follows: For $T_{A}$, the weight is $12/(12+10+6+9+8+7+4+6+4+8) = 12/74$. For $T_{B}, \dots, T_{J}$, the weights are 10/74, 6/74, 9/74, 8/74, 7/74, 4/74, 6/74, 4/74, and 8/74.

Using the soft set decision-making matrix (1), calculate the weighted Hamming distance between each access tree and the ideal access tree as follows:

$$\begin{array}{lll}
d_{WH}(T_{goal}, T_{1}) = 37/74, & d_{WH}(T_{goal}, T_{2}) = 40/74, & d_{WH}(T_{goal}, T_{3}) = 38/74, \\
d_{WH}(T_{goal}, T_{4}) = 54/74, & d_{WH}(T_{goal}, T_{5}) = 47/74, & d_{WH}(T_{goal}, T_{6}) = 65/74, \\
d_{WH}(T_{goal}, T_{7}) = 59/74, & d_{WH}(T_{goal}, T_{8}) = 61/74, & d_{WH}(T_{goal}, T_{9}) = 59/74, \\
d_{WH}(T_{goal}, T_{10}) = 58/74, & d_{WH}(T_{goal}, T_{11}) = 68/74, & d_{WH}(T_{goal}, T_{12}) = 48/74, \\
d_{WH}(T_{goal}, T_{13}) = 61/74, & d_{WH}(T_{goal}, T_{14}) = 54/74, & d_{WH}(T_{goal}, T_{15}) = 65/74, \\
d_{WH}(T_{goal}, T_{16}) = 58/74, & d_{WH}(T_{goal}, T_{17}) = 61/74, & d_{WH}(T_{goal}, T_{18}) = 52/74, \\
d_{WH}(T_{goal}, T_{19}) = 62/74, & d_{WH}(T_{goal}, T_{20}) = 43/74. &
\end{array}$$

Since $d_{WH}(T_{goal}, T_{1}) = 37/74$ is the smallest, tree $T_{1}$ is selected as the optimal tree.
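The calculation up to this point can be reproduced with a short script. The sketch below encodes the rows of matrix (1) as strings of sub-policy letters; the names `LEAVES`, `ROWS`, `weights`, and `distance_to_ideal` are illustrative, and treating a sub-policy as shared only when at least two trees contain it is our reading of Section 2.4.

```python
from fractions import Fraction

# Leaf counts of T_A..T_J as stated in Example 2.
LEAVES = dict(zip("ABCDEFGHIJ", [3, 2, 1, 3, 2, 1, 2, 2, 1, 2]))

# Rows of matrix (1): the sub-policies contained in T_1..T_20.
ROWS = ["ACEFI", "ACEJ", "BDFHI", "AE", "BCFI", "D", "FJ", "DG", "FJ", "BH",
        "C", "BEGI", "CF", "AJ", "D", "BC", "FH", "BGJ", "EI", "ACFH"]
TREES = {i + 1: set(row) for i, row in enumerate(ROWS)}


def weights(trees, leaves):
    """Definitions 8-10: maximum combinable numbers and normalized weights."""
    count = {s: sum(s in subs for subs in trees.values()) for s in leaves}
    # Impact factor minus leaf count, for sub-policies in at least two trees.
    combinable = {s: count[s] * leaves[s] - leaves[s]
                  for s in leaves if count[s] >= 2}
    total = sum(combinable.values())
    return combinable, {s: Fraction(c, total) for s, c in combinable.items()}


def distance_to_ideal(subs, w):
    """Weighted Hamming distance to the all-ones ideal vector."""
    return sum(weight for s, weight in w.items() if s not in subs)


combinable, w = weights(TREES, LEAVES)
dist = {t: distance_to_ideal(subs, w) for t, subs in TREES.items()}
best = min(sorted(dist), key=dist.get)
criterion = max(sorted(TREES[best]), key=lambda s: combinable.get(s, 0))
cluster = sorted(t for t, subs in TREES.items() if criterion in subs)
```

Exact fractions are used so that the distances can be compared with the values over 74 without rounding; note that `Fraction` prints values in lowest terms.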

The optimal tree $T_{1}$ contains the primary shared sub-policies $T_{A}$, $T_{C}$, $T_{E}$, $T_{F}$, and $T_{I}$. The largest maximum combinable number among them is 12, and the corresponding primary shared sub-policy is $T_{A}$. We choose it as the criterion for the first clustering. Access trees $T_{1}$, $T_{2}$, $T_{4}$, $T_{14}$, and $T_{20}$, which contain the primary shared sub-policy $T_{A}$, are grouped into the first cluster.

We repeat the same method from scratch for the remaining access trees. $T_{3}$ has the smallest weighted Hamming distance to the ideal access tree. Therefore, $T_{3}$ is chosen as the optimal tree for the second clustering. In $T_{3}$, the primary shared sub-policies are $T_{B}$, $T_{D}$, $T_{F}$, $T_{H}$, and $T_{I}$.

Among them, the largest maximum combinable number is 10, and the corresponding primary shared sub-policy is $T_{B}$. We choose it as the criterion for the second clustering. Access trees $T_{3}$, $T_{5}$, $T_{10}$, $T_{12}$, $T_{16}$, and $T_{18}$, which contain the primary shared sub-policy $T_{B}$, are grouped into the second cluster. The remaining trees are further clustered using the same method.

3.2. A General Approach to Access Tree Clustering Based on Soft Set Decision-Making

Based on Example 2, we provide a general soft set decision-making-based clustering method for access trees.

1. Find the primary shared sub-policies in the given access trees. Determine the current ideal access tree. For each access tree, count the number of primary shared sub-policies and the number of leaf nodes each primary shared sub-policy contains.
2. Calculate the impact factor, maximum combinable number, and weight for each primary shared sub-policy.
3. Calculate the weighted Hamming distance between each access tree and the ideal access tree.
4. Select the access tree with the smallest distance from the ideal access tree as the optimal tree. Select the primary shared sub-policy with the largest maximum combinable number in the optimal tree as the criterion for clustering. Put access trees containing this primary shared sub-policy into a cluster.
5. Repeat steps 1–4 for the remaining access trees until all access trees are clustered.
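The steps above can be sketched as a loop over the remaining access trees. The function name `cluster_by_soft_set` and the input encoding are illustrative. Two details are assumptions of this sketch rather than statements from the text: ties are broken by the smallest tree index, and trees left without any shared sub-policy end up in singleton clusters.

```python
from fractions import Fraction


def cluster_by_soft_set(trees, leaves):
    """trees: {index: set of sub-policy labels}; leaves: {label: leaf count}."""
    remaining = dict(trees)
    clusters = []
    while remaining:
        # Steps 1-2: maximum combinable number of each sub-policy that is
        # shared by at least two of the remaining trees.
        count = {s: sum(s in subs for subs in remaining.values()) for s in leaves}
        comb = {s: (count[s] - 1) * leaves[s] for s in leaves if count[s] >= 2}
        if not comb:
            # Assumption: nothing left to merge, so each tree stays alone.
            clusters.extend([name] for name in sorted(remaining))
            break
        total = sum(comb.values())
        # Step 3: weighted Hamming distance to the all-ones ideal tree.
        dist = {name: sum(Fraction(c, total) for s, c in comb.items() if s not in subs)
                for name, subs in remaining.items()}
        # Step 4: optimal tree, clustering criterion, resulting cluster.
        best = min(sorted(remaining), key=dist.get)
        candidates = [s for s in sorted(remaining[best]) if s in comb]
        criterion = max(candidates, key=comb.get)
        cluster = sorted(n for n, subs in remaining.items() if criterion in subs)
        clusters.append(cluster)
        # Step 5: continue with the trees that are not yet clustered.
        for name in cluster:
            del remaining[name]
    return clusters


# Data of Example 2: leaf counts of T_A..T_J and the rows of matrix (1).
LEAVES = dict(zip("ABCDEFGHIJ", [3, 2, 1, 3, 2, 1, 2, 2, 1, 2]))
ROWS = ["ACEFI", "ACEJ", "BDFHI", "AE", "BCFI", "D", "FJ", "DG", "FJ", "BH",
        "C", "BEGI", "CF", "AJ", "D", "BC", "FH", "BGJ", "EI", "ACFH"]
clusters = cluster_by_soft_set({i + 1: set(r) for i, r in enumerate(ROWS)}, LEAVES)
```

On the data of Example 2, the first two clusters returned by this sketch are the ones built around $T_{A}$ and $T_{B}$; the weights are recomputed in every round because the set of remaining trees changes.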

4. Access Policy Clustering Method Based on Cluster Percolation Method

4.1. Overview

Below, we introduce our second approach to access control policy clustering. We use the total number of leaf nodes in the primary shared sub-policies that two access trees have in common to measure the similarity between them. Based on this metric, and regarding access trees as nodes, we find all maximal complete subgraphs among the access trees and regard each maximal complete subgraph as a clique to construct the overlapping matrix. The overlapping matrix is then transformed into a community adjacency matrix, as described in Section 2.10. We select the $k$-community with the largest number of nodes (access trees) and the maximum value of $k$ as the pre-selected result for clustering. If all these access trees contain at least one common primary shared sub-policy, this pre-selected result becomes the final result. If no primary shared sub-policy is contained in all access trees of the pre-selected result, we choose the primary shared sub-policy with the largest maximum combinable number among the primary shared sub-policies that these access trees contain. Access trees that do not contain the selected primary shared sub-policy are eliminated from the pre-selected result, and the remaining access trees become the final result.

CFinder-2.0.6–1448 (https://www.cfinder.org/, accessed on 1 March 2023) is a software tool developed by Adamcsek et al. [25] for searching, visualizing, and analyzing the implicit group modules of graphs based on the CPM. It can find cliques of a specified size in the graph and construct larger communities from the nodes and edges shared among the cliques.

The following is an example to introduce our access policy clustering approach based on the CPM.

Example 3.

Suppose 20 primary shared sub-policies $T_{A}, T_{B}, \dots, T_{T}$ are randomly distributed in 50 access trees $T_{1}(AND), T_{2}(AND), \dots, T_{50}(OR)$, where $(AND)$ or $(OR)$ denotes the threshold gate of the root node of the access tree, as shown in Table 2.

We take the sum of the number of leaf nodes contained in all the identical primary shared sub-policies in two trees as the similarity of the two trees. The access trees $T_{1}(AND)$ and $T_{2}(AND)$ share a primary shared sub-policy $T_{D}$ that has three leaf nodes; therefore, the similarity between $T_{1}(AND)$ and $T_{2}(AND)$ is 3. There is no identical primary shared sub-policy in access tree $T_{1}(AND)$ and access tree $T_{12}(AND)$; thus, the similarity between access tree $T_{1}(AND)$ and access tree $T_{12}(AND)$ is 0. In this way, we obtain the similarity between every two access trees. The similarity between partial access trees is shown in Table 3.
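The similarity measure can be sketched as follows. Only the facts that $T_{1}$ and $T_{2}$ share $T_{D}$ with three leaf nodes and that $T_{1}$ and $T_{12}$ share nothing come from the text; the remaining tree contents and leaf counts below are hypothetical placeholders, since Table 2 is not reproduced here. Keeping an edge when the similarity is at least the threshold is likewise an assumption about how the threshold is applied.

```python
from itertools import combinations


def similarity(subs_a, subs_b, leaves):
    """Sum of leaf counts over the sub-policies that both trees contain."""
    return sum(leaves[s] for s in subs_a & subs_b)


def threshold_edges(trees, leaves, threshold):
    """Edges between trees whose similarity reaches the threshold."""
    return [(a, b) for a, b in combinations(sorted(trees), 2)
            if similarity(trees[a], trees[b], leaves) >= threshold]


# Hypothetical excerpt: only the T_D overlap of T_1 and T_2 is from the text.
LEAVES = {"D": 3, "N": 2, "J": 1}
TREES = {1: {"D"}, 2: {"D", "N"}, 12: {"J"}, 18: {"N", "J"}}

edges = threshold_edges(TREES, LEAVES, threshold=3)
```

The resulting edge list is the kind of input on which a CPM implementation then searches for cliques and communities.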

Subsequently, all the similarity data are imported into CFinder; we first set the similarity threshold to 3. Among the communities generated from the similarity data, the community with the largest $k$ value and the most nodes (access trees) is selected as the pre-selected result of the first clustering. The 17 access trees $T_{2}(AND)$, $T_{4}(OR)$, $T_{7}(AND)$, $T_{18}(AND)$, $T_{20}(OR)$, $T_{25}(AND)$, $T_{28}(OR)$, $T_{29}(AND)$, $T_{30}(AND)$, $T_{32}(AND)$, $T_{33}(AND)$, $T_{34}(OR)$, $T_{35}(AND)$, $T_{36}(OR)$, $T_{44}(AND)$, $T_{45}(OR)$, and $T_{50}(OR)$ form the pre-selected result. All of them contain the primary shared sub-policy $T_{N}$, so there is no need to exclude any access tree from the first cluster. Therefore, these access trees are the result of the first clustering, as shown in Table 4.

Similarly, the similarity data for the remaining 33 access trees are imported into CFinder, with the similarity threshold set to 3. The community with the largest $k$ value and the most nodes (access trees) consists of $T_{1}(AND)$, $T_{13}(AND)$, $T_{15}(AND)$, $T_{16}(OR)$, $T_{17}(OR)$, $T_{22}(OR)$, $T_{23}(OR)$, $T_{39}(AND)$, $T_{40}(OR)$, $T_{42}(AND)$, and $T_{47}(AND)$. Since these access trees do not have a primary shared sub-policy in common, we need to handle them further. We compute the maximum combinable number of the primary shared sub-policies contained in these trees.

The primary shared sub-policy $T_{O}$ has the largest maximum combinable number and is chosen as the criterion for further clustering. The access trees $T_{1}(AND)$ and $T_{22}(OR)$, which do not contain $T_{O}$, are excluded from the current pre-selected cluster; the remaining trees $T_{13}(AND)$, $T_{15}(AND)$, $T_{16}(OR)$, $T_{17}(OR)$, $T_{23}(OR)$, $T_{39}(AND)$, $T_{40}(OR)$, $T_{42}(AND)$, and $T_{47}(AND)$ are the final result of the second clustering. The primary shared sub-policies contained in these access trees are shown in Table 5.

Similarly, the similarity data for the remaining 24 access trees were imported into CFinder, and the similarity threshold was set to 2. The community with the largest k value and the most nodes (access trees) consists of

T_{6} (O R)

,

T_{8} (A N D)

,

T_{9} (O R)

,

T_{10} (O R)

,

T_{21} (O R)

,

T_{22} (O R)

,

T_{26} (A N D)

,

T_{31} (O R)

,

T_{41} (A N D)

,

T_{46} (O R)

. All of these access trees contain the primary shared sub-policy

T_{J}

, so no access tree needs to be excluded from the cluster. Therefore, the pre-selected access trees above are the result of the third clustering. These access trees contain primary shared sub-policies, as shown in Table 6.

The remaining 14 access trees can then be clustered in the same way; we omit the details here for brevity.

4.2. A General Approach to Access Tree Clustering with Cluster Percolation Method

Based on Example 3, a general method for clustering access trees based on CPM is given.

(1) For all primary shared sub-policies between every two access trees, calculate the sum of the number of leaf nodes contained in these primary shared sub-policies and take it as the similarity of the two trees.
(2) Set the similarity threshold and use the CPM to find all cliques and communities composed of access trees.
(3) Access trees from the community with the largest value of k and maximum nodes (access trees) are set to be the pre-selected result for clustering. If all these access trees contain at least one primary shared sub-policy, this pre-selected result becomes the final result. If there is no primary shared sub-policy contained in all access trees in the pre-selected result of the cluster, then we choose the primary shared sub-policy with the largest maximum combinable number among primary shared sub-policies that these access trees in the pre-selected result contain; other access trees that do not contain the primary shared sub-policy are eliminated from the pre-selected result, and the remaining access trees become the final result.
(4) Repeat steps (2) and (3) for the remaining access trees until all access trees are clustered.
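
The similarity computation and the pre-selection step above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CFinder implementation used in the paper: the tree names, sub-policy labels, and leaf counts are hypothetical, the comparison "similarity >= threshold" is an assumption (the text does not state whether the threshold is inclusive), and the brute-force clique search is only practical for small inputs.

```python
from itertools import combinations

def similarity(tree_a, tree_b, leaf_count):
    """Sum of leaf counts over the primary shared sub-policies both trees contain."""
    return sum(leaf_count[p] for p in tree_a & tree_b)

def k_clique_communities(nodes, edges, k):
    """Brute-force clique percolation: two k-cliques are adjacent if they share k-1 nodes."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = [frozenset(c) for c in combinations(sorted(nodes), k)
               if all(v in adj[u] for u, v in combinations(c, 2))]
    parent = {c: c for c in cliques}  # union-find over cliques

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(cliques, 2):
        if len(a & b) == k - 1:
            parent[find(a)] = find(b)
    groups = {}
    for c in cliques:
        groups.setdefault(find(c), set()).update(c)
    return list(groups.values())

def preselect(trees, leaf_count, threshold):
    """Return (k, community) for the largest k; ties are broken by community size."""
    names = sorted(trees)
    # Assumption: two trees are linked when their similarity reaches the threshold.
    edges = [(a, b) for a, b in combinations(names, 2)
             if similarity(trees[a], trees[b], leaf_count) >= threshold]
    for k in range(len(names), 1, -1):
        communities = k_clique_communities(names, edges, k)
        if communities:
            return k, max(communities, key=len)
    return None, set()

# Hypothetical toy data: each tree is the set of primary shared sub-policies it contains.
trees = {"T1": {"A", "B"}, "T2": {"A", "B", "C"}, "T3": {"A", "B"},
         "T4": {"C"}, "T5": {"D"}}
leaf_count = {"A": 2, "B": 1, "C": 2, "D": 3}
k, members = preselect(trees, leaf_count, threshold=3)
print(k, sorted(members))
```

In this toy input only T1, T2, and T3 reach similarity 3 with each other, so they form the single 3-clique community that becomes the pre-selected result.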

5. Amendment and Integration of Access Trees in the Same Cluster

5.1. Overview

After first-level clustering according to clustering methods provided in Section 3.2 or Section 4.2, access trees within the same cluster are further categorized into

A N D

-type and

O R

-type classes based on the threshold value of their root node. Next, based on the similarity of the access trees, we perform divisive clustering on the access trees in each of the two classes until all access trees in the same cluster have at least two primary shared sub-policies in common. Starting from the final round of clustering results, we integrate the access trees of the same class within the same cluster into a larger access tree for that cluster, and then integrate these larger access trees within the same class into still larger ones. This integration is repeated until two large access trees with

A N D

root and

O R

root are generated, respectively. Then, the two access trees are connected with basic sub-policies as crossing nodes.

5.2. Amendment of Access Trees within the Same Class

Before integrating access trees in the same class, we need to make some amendments to them. Here, for clarity, we first illustrate how to amend access trees using an example of four trees

T_{1}

,

T_{2}

,

T_{3}

and

T_{4}

. Suppose they are from the same class (

A N D

-type) and the same cluster of the first-level clustering. The symbols

T_{1} (o t h e r), T_{2} (o t h e r), T_{3} (o t h e r), T_{4} (o t h e r)

are used to denote the non-shared sub-policy part of access trees; the symbols

T_{A}, T_{B}, T_{C}

denote the primary shared sub-policy part.

The four access trees have basic sub-policy

T_{A}

and the primary shared sub-policy

T_{B}

in common. We add a new child node “

A N D

”, with the same gate type as the root node, under the root node and make it the parent node of the two primary shared sub-policies. The two primary shared sub-policies, together with their newly added parent node, form a new shared sub-policy for all access trees; we denote it

T_{A B} (A N D)

. We can observe that

T_{3}

and

T_{4}

have

T_{A B} (A N D)

and

T_{C}

in common. Therefore, we repeat the above operation; that is, we add the “

A N D

” node under the root node of

T_{3}

and

T_{4}

, making it the parent node of the two shared sub-policies

T_{A B} (A N D)

and

T_{C}

. The access trees before and after the amendment are shown in Figure 4. Access trees in the

O R

-type class can be amended in the same way; we omit the details here.
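
As a rough illustration of the amendment step, the sketch below regroups shared children of the root under a new child node with the same gate. The tuple representation `(gate, children)` and the helper name `amend` are hypothetical and not taken from the paper; the regrouping is only meaning-preserving for pure AND or pure OR roots, which is the case considered here.

```python
def amend(tree, shared):
    """Group the `shared` children of the root under a new child with the root's gate.

    `tree` is a (gate, children) pair; children are sub-policy labels or nested pairs.
    """
    gate, children = tree
    grouped = [c for c in children if c in shared]
    if len(grouped) < 2:
        return tree  # nothing to group
    rest = [c for c in children if c not in shared]
    return (gate, [(gate, grouped)] + rest)

# T_3 from the running example: root AND over T_A, T_B, T_C and its non-shared part.
t3 = ("AND", ["T_A", "T_B", "T_C", "T_3(other)"])

# First amendment: T_A and T_B are shared by all four trees and become T_AB(AND).
t_ab = ("AND", ["T_A", "T_B"])
step1 = amend(t3, ["T_A", "T_B"])

# Second amendment: T_AB(AND) and T_C are shared by T_3 and T_4.
step2 = amend(step1, [t_ab, "T_C"])
print(step2)
```

After both steps, the root of T_3 has two children: the nested shared part and the non-shared part.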

5.3. Case Study of Policy Integration

The following illustrates the amendment and integration of access trees within the first cluster in Example 3.

Example 4.

As part of the first-level clustering result, there are 17 trees in the first cluster in Example 3, and the primary shared sub-policies they contain are shown in Table 4.

These trees all contain the primary shared sub-policy

T_{N}

, where

T_{N}

is the basic sub-policy in this cluster. The access trees with the “

A N D

” root node are categorized into the

A N D

-type class; they are

T_{2} (A N D)

,

T_{7} (A N D)

,

T_{18} (A N D)

,

T_{25} (A N D)

,

T_{29} (A N D)

,

T_{30} (A N D)

,

T_{32} (A N D)

,

T_{33} (A N D)

,

T_{35} (A N D)

,

T_{44} (A N D)

. The access trees with the “

O R

” root node are categorized into the

O R

-type class; they are

T_{4} (O R)

,

T_{20} (O R)

,

T_{28} (O R)

,

T_{34} (O R)

,

T_{36} (O R)

,

T_{45} (O R)

,

T_{50} (O R)

. After first-level clustering and classifying, we now start second-level clustering. In the domain of

A N D

-type class, the maximum combinable numbers of these 20 primary shared sub-policies are recalculated. They are

3, 6, 4, 6, 2, 10, 3, 9, 0, 3, 4, 9, 6, 36, 4, 12, - 1, 4, 4, 6 .

Besides the basic sub-policy

T_{N}

,

T_{P}

has the largest maximum combinable number. The access trees

T_{2} (A N D)

,

T_{7} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

containing both

T_{P}

and

T_{N}

are put into a new cluster as a part of second-level clustering result, and their structure is illustrated in Figure 5.

Since these access trees also contain the primary shared sub-policy

T_{F}

,

T_{F}

and

T_{P}

can be integrated with the basic sub-policy

T_{N}

first. For the cluster of access trees

T_{2} (A N D)

,

T_{7} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

that were generated in the second-level clustering, we add a new child node “

A N D

”, with the same gate type as the root node, under the root node and make it the parent node of the three primary shared sub-policies. The three primary shared sub-policies, together with their newly added parent node, form a new shared sub-policy for these access trees; we denote it

T_{N P F} (A N D)

. We recompute the maximum combinable number of each primary shared sub-policy within the domain composed of these four access trees

T_{2} (A N D)

,

T_{7} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

, and they are

0; 0; - 2; 6; 0; 6; - 3; 3; - 4; 0; 0; 3; 0; 12; 4; 12; - 1; 4; 0; 0 .

Besides the primary shared sub-policies

T_{P}

,

T_{F}

, and basic sub-policy

T_{N}

, we find the primary shared sub-policy

T_{D}

that has the largest maximum combinable number out of the remaining 17 minimal sub-policies. We can observe that

T_{2} (A N D)

,

T_{25} (A N D)

and

T_{32} (A N D)

have

T_{N P F} (A N D)

and

T_{D}

in common. Therefore, we repeat the above operation, that is, we add the “

A N D

” node under the root node of

T_{2} (A N D)

,

T_{25} (A N D)

and

T_{32} (A N D)

, making it the parent node of the two shared sub-policies

T_{N P F} (A N D)

and

T_{D}

. According to the access tree amendment method provided in Section 5.2, we have the amendment results of access trees

T_{2} (A N D)

,

T_{7} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

, as illustrated in Figure 6; we also have the amendment results of access trees

T_{2} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

, as illustrated in Figure 7.

Then the identical parts

T_{(N P F - D)} (A N D)

of the three access trees

T_{2} (A N D)

,

T_{25} (A N D)

,

T_{32} (A N D)

are merged first to obtain a bigger access tree; then, the identical parts

T_{N P F} (A N D)

of the newly generated bigger access tree and

T_{7} (A N D)

are merged. Finally, we obtain the final integrated access structure shown in Figure 8.

In the domain of the

A N D

-type class, for the remaining 6 access trees

T_{18} (A N D)

,

T_{29} (A N D)

,

T_{30} (A N D)

,

T_{33} (A N D)

,

T_{35} (A N D)

,

T_{44} (A N D)

, we repeat the same second-level clustering method. The maximum combinable numbers of these 20 primary shared sub-policies are recalculated. They are

0; 4; 4; - 3; 0; 2; 3; 3; 0; 0; 0; 3; 3; 20; - 4; - 4; - 1; - 4; 0; 4 .

Besides the basic sub-policy

T_{N}

,

T_{C}

has the largest maximum combinable number. The access trees

T_{18} (A N D)

,

T_{29} (A N D)

,

T_{33} (A N D)

containing both

T_{N}

and

T_{C}

are categorized into second-level clustering results. Since these access trees contain the primary shared sub-policy

T_{C}

,

T_{C}

can be integrated with the basic sub-policy

T_{N}

first. For the cluster of access trees

T_{18} (A N D)

,

T_{29} (A N D)

,

T_{33} (A N D)

that were generated in the second-level clustering, we add a new child node “

A N D

”, with the same gate type as the root node, under the root node and make it the parent node of the two primary shared sub-policies. The two primary shared sub-policies, together with their newly added parent node, form a new shared sub-policy for these access trees; we denote it

T_{N C} (A N D)

. We recompute the maximum combinable number of each primary shared sub-policy within the domain composed of these three access trees

T_{18} (A N D)

,

T_{29} (A N D)

,

T_{33} (A N D)

, and they are

0; 2; 4; - 3; - 2; 0; 0; - 3; - 4; 0; 0; 0; - 3; 8; - 4; - 4; 1; - 4; - 4; 0 .

Besides the primary shared sub-policy

T_{C}

and basic sub-policy

T_{N}

, we find the primary shared sub-policy

T_{B}

that has the largest maximum combinable number out of the remaining 18 minimal sub-policies. We can observe that

T_{18} (A N D)

and

T_{29} (A N D)

have

T_{N C} (A N D)

and

T_{B}

in common. Therefore, we repeat the above operation, that is, we add the “

A N D

” node under the root node of

T_{18} (A N D)

and

T_{29} (A N D)

, making it the parent node of the two shared sub-policies

T_{N C} (A N D)

and

T_{B}

. We then repeat the amendment and integration steps described above. The access trees

T_{18} (A N D)

,

T_{29} (A N D)

and

T_{33} (A N D)

are connected into one access tree using the basic sub-policy as a cross-node. For the remaining 3 access trees,

T_{30} (A N D)

,

T_{35} (A N D)

,

T_{44} (A N D)

, we repeat the same method. After that, the basic sub-policy is used as a cross-node to integrate these ten access trees into one access tree. The access trees in the

O R

-type class are amended and integrated similarly. Finally, the

O R

-type access tree and

A N D

-type access tree are integrated into one access tree using the basic sub-policy as a cross-node.

5.4. A General Approach for Integration of Access Trees in the Same Cluster

Based on Example 4, a general approach for integrating access trees in the same cluster is given.

For first-level clustering results, access trees within the same cluster are classified into $A N D$ -type access trees and $O R$ -type access trees according to the threshold of the parent node of the primary shared sub-policy (the root node of the access tree).
Select the primary shared sub-policy, excluding the basic sub-policy, with the largest maximum combinable number in the same class as the criterion for the second-level clustering. Conduct the second-level clustering for the remaining access trees until all access trees in the same cluster have at least two primary shared sub-policies in common.
Amend and integrate the access trees in the clusters generated by the second-level clustering; connect the access trees of the same type into one large access tree using the basic sub-policy as a cross-node.
Connect the $A N D$ -type access tree and $O R$ -type access tree with the basic sub-policy to obtain an access tree.
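
The criterion selection in the second-level clustering can be checked against the numbers reported in Example 4. The sketch below is illustrative only: it assumes that the 20 maximum combinable numbers are listed in the order of the sub-policy labels A through T (an assumption that is consistent with T_N being the 14th entry with the value 36), and it takes the numbers as given rather than recomputing them. The function name `pick_criterion` is hypothetical.

```python
import string

def pick_criterion(numbers, exclude):
    """Return the label with the largest maximum combinable number outside `exclude`.

    Assumption: numbers[0] belongs to sub-policy A, numbers[1] to B, and so on.
    Ties are resolved in favour of the earliest label, which the paper does not specify.
    """
    labels = string.ascii_uppercase[:len(numbers)]
    candidates = {lab: n for lab, n in zip(labels, numbers) if lab not in exclude}
    return max(candidates, key=candidates.get)

# Maximum combinable numbers quoted in Example 4.
and_class = [3, 6, 4, 6, 2, 10, 3, 9, 0, 3, 4, 9, 6, 36, 4, 12, -1, 4, 4, 6]
four_trees = [0, 0, -2, 6, 0, 6, -3, 3, -4, 0, 0, 3, 0, 12, 4, 12, -1, 4, 0, 0]
three_trees = [0, 2, 4, -3, -2, 0, 0, -3, -4, 0, 0, 0, -3, 8, -4, -4, 1, -4, -4, 0]

print(pick_criterion(and_class, exclude={"N"}))             # criterion besides T_N
print(pick_criterion(four_trees, exclude={"N", "P", "F"}))  # criterion besides T_N, T_P, T_F
print(pick_criterion(three_trees, exclude={"N", "C"}))      # criterion besides T_N, T_C
```

Under the stated labeling assumption, the three calls select P, D, and B, matching the sub-policies chosen in the example. Where several sub-policies tie, a tie-breaking rule is needed that the text does not state.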

5.5. Secret Sharing for Integrated Access Trees

As a preparation for our CP-ABE scheme with policy integration, we use six access trees with different sub-policies, as shown in Figure 9, to illustrate how to set the secret value s (the constant term of the polynomial) for each node within the integrated access tree. According to the integration method described in Section 5.4, access trees within the same class are integrated first; then, we obtain two big access trees with the

A N D

root node and

O R

root node. The result of the integration is shown in Figure 10. These two access trees are finally integrated into one access tree using the basic sub-policy as a cross-node, as shown in Figure 11.

Starting from the root node, a polynomial

f_{t} (x)

needs to be generated for each node

N_{t}

in the integrated access tree; the degree of the polynomial equals the threshold value of the node minus one, i.e.,

d_{t} = k_{t} - 1

. As shown in Figure 11, starting from the root node

N_{0}

, for each node, the data owner generates a polynomial in the following way: first, choose a random number

s \in Z_{p}

such that

f_{0} (0) = s

. Then

s_{d_{0}}, s_{d_{0} - 1}, \dots, s_{1} \in Z_{p}

are randomly chosen to determine

f_{0} (x)

, that is

f_{0} (x) = s_{d_{0}} x^{d_{0}} + s_{d_{0} - 1} x^{d_{0} - 1} + \dots + s_{1} x + s

. Since

N^{(1)}

is an

A N D

node, the corresponding polynomial is made using the secret sharing scheme; details are as follows:

f_{1} (x) = a_{d_{1}} x^{d_{1}} + a_{d_{1} - 1} x^{d_{1} - 1} + \dots + a_{1} x + a_{0}

; set the index value of

N_{0}

to be 1, i.e.,

i n d e x (N_{0}) = 1

, then

s = f_{1} (1) = a_{d_{1}} + a_{d_{1} - 1} + \dots + a_{1} + a_{0}

; after randomly choosing the remaining

d_{1}

coefficients, we can determine

a_{0} = s - (a_{d_{1}} + a_{d_{1} - 1} + \dots + a_{1})

. Since

N^{(n)}

is an

A N D

node, assignments are made according to the secret sharing scheme such that

f_{n} (x) = n_{d_{n}} x^{d_{n}} + n_{d_{n} - 1} x^{d_{n} - 1} + \dots + n_{1} x + n_{0}

. Set the index value of

N_{0}

to be 1, i.e.,

i n d e x (N_{0}) = 1

, then

s = f_{n} (1) = n_{d_{n}} + n_{d_{n} - 1} + \dots + n_{1} + n_{0}

; after randomly choosing the remaining

d_{n}

coefficients, we can determine

n_{0} = s - (n_{d_{n}} + n_{d_{n} - 1} + \dots + n_{1})

. Since

N^{(2)}

is an

O R

node, assignments are made according to the secret sharing scheme such that

f_{2} (0) = f_{0} (0) = s

. For

N^{(1)}

with parents

N_{p}

and

N_{m}

, and

N^{(2)}

with parents

N_{q}

and

N_{s}

, there are

f_{m} (i n d e x (N^{(1)})) = f_{p} (i n d e x (N^{(1)})) = f_{1} (0) = a_{0}

(2)

f_{q} (i n d e x (N^{(2)})) = f_{s} (i n d e x (N^{(2)})) = f_{2} (0) = f_{0} (0) = s

(3)

Let

f_{m} (x) = m_{d_{m}} x^{d_{m}} + m_{d_{m} - 1} x^{d_{m} - 1} + \dots + m_{1} x + m_{0}

,

f_{p} (x) = p_{d_{p}} x^{d_{p}} + p_{d_{p} - 1} x^{d_{p} - 1} + \dots + p_{1} x + p_{0}

,

f_{q} (x) = q_{d_{q}} x^{d_{q}} + q_{d_{q} - 1} x^{d_{q} - 1} + \dots + q_{1} x + q_{0}

, and

f_{s} (x) = s_{d_{s}} x^{d_{s}} + s_{d_{s} - 1} x^{d_{s} - 1} + \dots + s_{1} x + s_{0}

. Set the index values of

N^{(1)}

in

f_{m} (x)

and

f_{p} (x)

, and of

N^{(2)}

in

f_{q} (x)

and

f_{s} (x)

, to 1, i.e.,

i n d e x (N^{(1)}) = i n d e x (N^{(2)}) = 1

; then

a_{0} = f_{m} (1) = f_{p} (1)

and

s = f_{q} (1) = f_{s} (1)

, i.e.,

a_{0} = p_{d_{p}} + p_{d_{p} - 1} + \dots + p_{1} + p_{0} = m_{d_{m}} + m_{d_{m} - 1} + \dots + m_{1} + m_{0}

and

s = q_{d_{q}} + q_{d_{q} - 1} + \dots + q_{1} + q_{0} = s_{d_{s}} + s_{d_{s} - 1} + \dots + s_{1} + s_{0}

. After randomly selecting the remaining

d_{m} + d_{p} + d_{q} + d_{s}

coefficients, the constant terms can be computed:

p_{0} = a_{0} - (p_{d_{p}} + p_{d_{p} - 1} + \dots + p_{1})

(4)

m_{0} = a_{0} - (m_{d_{m}} + m_{d_{m} - 1} + \dots + m_{1})

(5)

q_{0} = s - (q_{d_{q}} + q_{d_{q} - 1} + \dots + q_{1})

(6)

s_{0} = s - (s_{d_{s}} + s_{d_{s} - 1} + \dots + s_{1})

(7)

Thus,

f_{p} (x)

,

f_{m} (x)

,

f_{q} (x)

and

f_{s} (x)

are determined. For the parent nodes

N_{r}

and

N_{t}

of

N^{(m)}

, there are

f_{r} (i n d e x (N^{(m)})) = f_{t} (i n d e x (N^{(m)})) = f_{m} (0) = m_{0}

(8)

Let

f_{r} (x) = r_{d_{r}} x^{d_{r}} + r_{d_{r} - 1} x^{d_{r} - 1} + \dots + r_{1} x + r_{0}

,

f_{t} (x) = t_{d_{t}} x^{d_{t}} + t_{d_{t} - 1} x^{d_{t} - 1} + \dots + t_{1} x + t_{0}

. Set the index value of

N^{(m)}

in

f_{r} (x)

and

f_{t} (x)

to be 1, i.e.,

i n d e x (N^{(m)}) = 1

; then

m_{0} = f_{r} (1) = f_{t} (1)

, i.e.,

m_{0} = r_{d_{r}} + r_{d_{r} - 1} + \dots + r_{1} + r_{0} = t_{d_{t}} + t_{d_{t} - 1} + \dots + t_{1} + t_{0}

. After randomly selecting the remaining

d_{r} + d_{t}

coefficients, the constant terms can be computed:

r_{0} = m_{0} - (r_{d_{r}} + r_{d_{r} - 1} + \dots + r_{1})

(9)

t_{0} = m_{0} - (t_{d_{t}} + t_{d_{t} - 1} + \dots + t_{1})

(10)

Thus,

f_{r} (x)

,

f_{t} (x)

are determined. For the other nodes

N_{t}

, let

f_{t} (0) = f_{p a r e n t (N_{t})} (i n d e x (N_{t}))

, and choose the remaining

d_{t}

values to determine

f_{t} (x)

. Thus, the polynomial for each node in the integrated access tree can be determined.
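
The construction above, in which a shared node's secret fixes one point on each of its parents' polynomials, can be sketched as follows. This is a minimal sketch under stated assumptions: the modulus `P` is a stand-in prime rather than the order of the actual pairing group, the helper names `poly_through` and `evaluate` are hypothetical, and the degrees in the example are chosen arbitrarily.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime used here as a stand-in for the group order p

def poly_through(index, value, degree, p=P):
    """Random polynomial of the given degree over Z_p with f(index) = value.

    Coefficients are returned lowest degree first. The non-constant coefficients
    are chosen at random and the constant term is then solved for, as in
    Equations (4)-(7) when index = 1.
    """
    coeffs = [0] + [secrets.randbelow(p) for _ in range(degree)]
    partial = sum(c * pow(index, i, p) for i, c in enumerate(coeffs)) % p
    coeffs[0] = (value - partial) % p
    return coeffs

def evaluate(coeffs, x, p=P):
    """Evaluate the polynomial at x with Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

# The cross-node N_0 holds the secret s; its AND parent N^(1) satisfies f_1(1) = s.
s = secrets.randbelow(P)
f1 = poly_through(index=1, value=s, degree=1)
a0 = evaluate(f1, 0)

# N^(1) in turn has two parents N_p and N_m, both passing through (1, a0).
f_p = poly_through(index=1, value=a0, degree=2)
f_m = poly_through(index=1, value=a0, degree=1)
print(evaluate(f1, 1) == s, evaluate(f_p, 1) == evaluate(f_m, 1) == a0)
```

Because both parents are forced through the same point, the shared node's polynomial, and therefore every ciphertext component beneath it, is computed only once.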

6. Construction of CP-ABE Scheme with Integrated Access Trees

6.1. System Architecture

The system framework for this proposal is illustrated in Figure 12. The proposal involves four entities, as follows:

-: Cloud storage platform (CSP): responsible for storing the ciphertext uploaded by data owners; it is semi-trusted.
-: Attribute authority (AA): responsible for generating the system’s public key and master private key and generating private keys for data users based on their attributes; it is fully trusted.
-: Data owner (DA): responsible for specifying access policies and uploading ciphertexts embedded with access policies to the CSP.
-: Users: can download ciphertext from the CSP and successfully decrypt it when their attributes satisfy the access policies.

6.2. Algorithm Description

This CP-ABE scheme primarily consists of four algorithms: system establishment (

S e t u p

), key generation (

K e y G e n

), data encryption (

E n c r y p t

), and data decryption (

D e c r y p t

). The algorithm definitions are as follows:

(1)

S e t u p (1^{λ}) ⟶ (P K, M S K)

: The AA executes this algorithm. It takes the system’s security parameter

λ

as input and produces its public key

P K

and master key

M S K

as output.

(2)

K e y G e n (P K, M S K, S) ⟶ S K

: The AA executes this algorithm, generating the corresponding private key for data users based on their attribute set. It takes the system’s public key

P K

, master key

M S K

, and the attribute set S of the user as input and produces the user’s private key

S K

as output.

(3)

E n c r y p t (P K, {M_{i}}_{1 \leq i \leq n}, {T_{i}}_{1 \leq i \leq n}) ⟶ {C T_{i}}_{1 \leq i \leq n}

: The DA executes this algorithm, encrypting data

M_{i}

using the access tree

T_{i}

, where there are shared sub-policies among n access trees. The algorithm takes the system’s public key

P K

, data

{M_{i}}_{1 \leq i \leq n}

, and access trees

{T_{i}}_{1 \leq i \leq n}

as input, and produces ciphertext

{C T_{i}}_{1 \leq i \leq n}

as output.

(4)

D e c r y p t (P K, S K, C T_{i}) ⟶ (M_{i}), 1 \leq i \leq n

: The data users execute this algorithm, using their private key

S K

to decrypt ciphertext

C T_{i}

. The algorithm takes the system’s public key

P K

, user’s private key

S K

, and ciphertext

C T_{i}

as input and produces the corresponding plaintext data

M_{i}

as output.

6.3. Details of Algorithms

The CP-ABE scheme for multiple shared sub-policies proposed in this paper includes four algorithms: system setup (

S e t u p

), key generation (

K e y G e n

), data encryption (

E n c r y p t

), and data decryption (

D e c r y p t

). The detailed descriptions of each algorithm are as follows:

(1)

S e t u p

Two multiplicative cyclic groups are selected, denoted as

G_{1}

and

G_{2}

, both of prime order p. Let g be the generator of

G_{1}

. A bilinear map

e : G_{1} \times G_{1} \to G_{2}

is defined, and a hash function

H : {0, 1}^{*} ⟶ G_{1}

is chosen. The AA randomly selects

α

and

β

from

Z_{p}^{*}

and generates the system’s public key

P K

and master key

M S K

as follows:

P K = (G_{1}, g, g^{β}, e {(g, g)}^{α})

(11)

M S K = (β, g^{α})

(12)

Finally, the AA publishes

P K

to all data owners and users.

(2)

K e y G e n

Based on the user’s attribute set S, the AA generates the corresponding private key

S K

. The AA first randomly selects

r \in Z_{p}

, and then for each attribute i in the attribute set S, it randomly selects

r_{i} \in Z_{p}

. Finally, it computes the user’s private key

S K

as follows:

S K = (D = g^{\frac{α + r}{β}}, \forall i \in S : D_{i} = g^{r} \cdot H {(i)}^{r_{i}}, D_{i}^{'} = g^{r_{i}})

(13)

(3)

E n c r y p t

Here, we describe the algorithm using the example of 6 distinct shared sub-policy access trees, as depicted in Figure 9. First, the encryption algorithm constructs polynomials for all nodes of the access tree, following the secret sharing scheme described in Section 5.5. Second, it computes the ciphertext using the traditional CP-ABE encryption scheme [2]. Let

Y (T_{A}), Y (T_{B}), Y (T_{C}), Y (T_{D})

and

Y (T_{E})

be the sets of leaf nodes of primary shared sub-policies

A, B, C, D

, and E, respectively, and let

Y (T_{o t h e r_{i}}), i = 1, 2, \dots, 6

be the sets of leaf nodes of the non-shared parts

T_{o t h e r_{i}}, i = 1, 2, \dots, 6

, respectively. The encryption of data

M_{1}

,

M_{2}

,

M_{3}

,

M_{4}

,

M_{5}

and

M_{6}

is as follows:

\begin{matrix} C T_{1} = & (T_{1}, \tilde{C} = M_{1} \cdot e {(g, g)}^{α f_{n} (0)}, C = g^{β f_{n} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{o t h e r_{1}} \in Y (T_{o t h e r_{1}}) : C_{N_{o t h e r_{1}}} = g^{f_{o t h e r_{1}} (0)}, {C^{'}}_{N_{o t h e r_{1}}} = H {(a t t (N_{o t h e r_{1}}))}^{f_{o t h e r_{1}} (0)}) \end{matrix}

(14)

\begin{matrix} C T_{2} = & (T_{2}, \tilde{C} = M_{2} \cdot e {(g, g)}^{α f_{p} (0)}, C = g^{β f_{p} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{b} \in Y (T_{B}) : C_{N_{b}} = g^{f_{b} (0)}, {C^{'}}_{N_{b}} = H {(a t t (N_{b}))}^{f_{b} (0)}, \\ \forall N_{o t h e r_{2}} \in Y (T_{o t h e r_{2}}) : C_{N_{o t h e r_{2}}} = g^{f_{o t h e r_{2}} (0)}, {C^{'}}_{N_{o t h e r_{2}}} = H {(a t t (N_{o t h e r_{2}}))}^{f_{o t h e r_{2}} (0)}) \end{matrix}

(15)

\begin{matrix} C T_{3} = & (T_{3}, \tilde{C} = M_{3} \cdot e {(g, g)}^{α f_{t} (0)}, C = g^{β f_{t} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{b} \in Y (T_{B}) : C_{N_{b}} = g^{f_{b} (0)}, {C^{'}}_{N_{b}} = H {(a t t (N_{b}))}^{f_{b} (0)}, \\ \forall N_{c} \in Y (T_{C}) : C_{N_{c}} = g^{f_{c} (0)}, {C^{'}}_{N_{c}} = H {(a t t (N_{c}))}^{f_{c} (0)}, \\ \forall N_{o t h e r_{3}} \in Y (T_{o t h e r_{3}}) : C_{N_{o t h e r_{3}}} = g^{f_{o t h e r_{3}} (0)}, {C^{'}}_{N_{o t h e r_{3}}} = H {(a t t (N_{o t h e r_{3}}))}^{f_{o t h e r_{3}} (0)}) \end{matrix}

(16)

\begin{matrix} C T_{4} = & (T_{4}, \tilde{C} = M_{4} \cdot e {(g, g)}^{α f_{r} (0)}, C = g^{β f_{r} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{b} \in Y (T_{B}) : C_{N_{b}} = g^{f_{b} (0)}, {C^{'}}_{N_{b}} = H {(a t t (N_{b}))}^{f_{b} (0)}, \\ \forall N_{c} \in Y (T_{C}) : C_{N_{c}} = g^{f_{c} (0)}, {C^{'}}_{N_{c}} = H {(a t t (N_{c}))}^{f_{c} (0)}, \\ \forall N_{o t h e r_{4}} \in Y (T_{o t h e r_{4}}) : C_{N_{o t h e r_{4}}} = g^{f_{o t h e r_{4}} (0)}, {C^{'}}_{N_{o t h e r_{4}}} = H {(a t t (N_{o t h e r_{4}}))}^{f_{o t h e r_{4}} (0)}) \end{matrix}

(17)

\begin{matrix} C T_{5} = & (T_{5}, \tilde{C} = M_{5} \cdot e {(g, g)}^{α f_{q} (0)}, C = g^{β f_{q} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{d} \in Y (T_{D}) : C_{N_{d}} = g^{f_{d} (0)}, {C^{'}}_{N_{d}} = H {(a t t (N_{d}))}^{f_{d} (0)}, \\ \forall N_{e} \in Y (T_{E}) : C_{N_{e}} = g^{f_{e} (0)}, {C^{'}}_{N_{e}} = H {(a t t (N_{e}))}^{f_{e} (0)}, \\ \forall N_{o t h e r_{5}} \in Y (T_{o t h e r_{5}}) : C_{N_{o t h e r_{5}}} = g^{f_{o t h e r_{5}} (0)}, {C^{'}}_{N_{o t h e r_{5}}} = H {(a t t (N_{o t h e r_{5}}))}^{f_{o t h e r_{5}} (0)}) \end{matrix}

(18)

\begin{matrix} C T_{6} = & (T_{6}, \tilde{C} = M_{6} \cdot e {(g, g)}^{α f_{s} (0)}, C = g^{β f_{s} (0)}, \\ \forall N_{a} \in Y (T_{A}) : C_{N_{a}} = g^{f_{a} (0)}, {C^{'}}_{N_{a}} = H {(a t t (N_{a}))}^{f_{a} (0)}, \\ \forall N_{d} \in Y (T_{D}) : C_{N_{d}} = g^{f_{d} (0)}, {C^{'}}_{N_{d}} = H {(a t t (N_{d}))}^{f_{d} (0)}, \\ \forall N_{e} \in Y (T_{E}) : C_{N_{e}} = g^{f_{e} (0)}, {C^{'}}_{N_{e}} = H {(a t t (N_{e}))}^{f_{e} (0)}, \\ \forall N_{o t h e r_{6}} \in Y (T_{o t h e r_{6}}) : C_{N_{o t h e r_{6}}} = g^{f_{o t h e r_{6}} (0)}, {C^{'}}_{N_{o t h e r_{6}}} = H {(a t t (N_{o t h e r_{6}}))}^{f_{o t h e r_{6}} (0)}) \end{matrix}

(19)
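
To make the saving concrete, the following sketch counts leaf-level exponentiations (two per leaf node, one for each of the two leaf components) when the six ciphertexts are computed separately and when each shared sub-policy is computed once and reused. The composition of each access tree follows Equations (14)-(19); the leaf counts per sub-policy are hypothetical, since they depend on Figure 9, and the function name is an illustration only.

```python
def exponentiation_counts(policies, leaves):
    """Return (separate, shared) leaf-level exponentiation counts.

    separate: every ciphertext recomputes all of its leaf components.
    shared:   each distinct sub-policy part is computed once and reused.
    """
    separate = sum(2 * leaves[part] for parts in policies.values() for part in parts)
    distinct = {part for parts in policies.values() for part in parts}
    shared = sum(2 * leaves[part] for part in distinct)
    return separate, shared

# Which parts each ciphertext contains, read off Equations (14)-(19).
policies = {
    "CT1": ["A", "other1"],
    "CT2": ["A", "B", "other2"],
    "CT3": ["A", "B", "C", "other3"],
    "CT4": ["A", "B", "C", "other4"],
    "CT5": ["A", "D", "E", "other5"],
    "CT6": ["A", "D", "E", "other6"],
}
# Hypothetical leaf counts for each part.
leaves = {"A": 3, "B": 2, "C": 2, "D": 2, "E": 2,
          "other1": 1, "other2": 1, "other3": 1,
          "other4": 1, "other5": 1, "other6": 1}

separate, shared = exponentiation_counts(policies, leaves)
print(separate, shared)
```

The count ignores the per-ciphertext components that cannot be shared, so it illustrates the trend rather than the full cost model.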

Finally, the DA uploads

C T_{1}

,

C T_{2}

,

C T_{3}

,

C T_{4}

,

C T_{5}

and

C T_{6}

to the CSP.

(4)

D e c r y p t

Assuming a user with an attribute set S and private key

S K

intends to access data

M_{1}

, the user first downloads ciphertext

C T_{1}

from the CSP. Then, they initiate the decryption process using

S K

. The decryption process is described as a recursive algorithm as follows: Let

N_{j}

represent a node in

T_{1}

. If

N_{j}

is a leaf node and

a t t (N_{j}) = i

where

i \in S

, then the computation is as follows:

F_{j} = D e c r y p t N o d e (C T, S K, N_{j}) = \frac{e (D_{i}, C_{N_{j}})}{e (D_{i}^{'}, C_{N_{j}}^{'})} = e {(g, g)}^{r f_{j} (0)}

(20)

If

N_{j}

is a leaf node and

a t t (N_{j}) = i

where

i \notin S

, then let

F_{j} = ⊥

.

If

N_{j}

is a non-leaf node, for all child nodes

N_{z}

of

N_{j}

, let

Q_{j}

represent a set of any

k_{j}

nodes

N_{z}

, and for each

N_{z}

in the set, ensure that

F_{z} \neq ⊥

. If such a

Q_{j}

does not exist, then let

F_{j} = ⊥

. If

Q_{j}

exists, then the computation is as follows:

\begin{matrix} F_{j} & = \prod_{z \in Q_{j}} F_{z}^{Δ_{z, Q_{j}} (0)} = \prod_{z \in Q_{j}} e {(g, g)}^{r f_{z} (0) Δ_{z, Q_{j}} (0)} = \prod_{z \in Q_{j}} e {(g, g)}^{r f_{p a r e n t (N_{z})} (i n d e x (N_{z})) Δ_{z, Q_{j}} (0)} \\ = \prod_{z \in Q_{j}} e {(g, g)}^{r f_{j} (z) Δ_{z, Q_{j}} (0)} = e {(g, g)}^{r \sum_{z \in Q_{j}} f_{j} (z) Δ_{z, Q_{j}} (0)} = e {(g, g)}^{r f_{j} (0)} \end{matrix}

(21)

Here,

Δ_{z, Q_{j}} (x) = \prod_{u \in Q_{j}, u \neq z} \frac{x - u}{z - u}

is the Lagrange coefficient polynomial.
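
As a small numerical check of the Lagrange coefficient, the sketch below recovers f(0) from three shares of a quadratic polynomial. In the scheme the same combination is carried out in the exponent of e(g, g); here it is done directly in Z_p for readability. The modulus and the example polynomial are assumptions chosen for illustration.

```python
P = 2**127 - 1  # stand-in prime modulus (assumption)

def lagrange_at_zero(z, indices, p=P):
    """Delta_{z,Q}(0) = product over u in Q, u != z, of (0 - u) / (z - u) mod p."""
    num, den = 1, 1
    for u in indices:
        if u != z:
            num = num * (-u) % p
            den = den * (z - u) % p
    return num * pow(den, -1, p) % p  # modular inverse needs Python 3.8+

def f(x):
    """Example polynomial f(x) = 7 + 3x + 5x^2 with secret f(0) = 7."""
    return (7 + 3 * x + 5 * x * x) % P

indices = [1, 2, 3]
secret = sum(f(z) * lagrange_at_zero(z, indices) for z in indices) % P
print(secret)
```

With threshold 3, any three shares suffice, which corresponds to choosing any k_j children whose DecryptNode result is not ⊥.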

When the user’s attribute set S satisfies the access control policy

T_{1}

, the

D e c r y p t N o d e

function is called at the root node

N_{n}

of

T_{1}

as follows:

F_{n} = D e c r y p t N o d e (C T, S K, N_{n}) = e {(g, g)}^{r f_{n} (0)}

(22)

The final step of decryption to obtain the data

M_{1}

is accomplished through the following calculations:

\frac{\tilde{C} \cdot F_{n}}{e (C, D)} = \frac{M_{1} \cdot e {(g, g)}^{α f_{n} (0)} \cdot e {(g, g)}^{r f_{n} (0)}}{e (g^{β f_{n} (0)}, g^{\frac{α + r}{β}})} = M_{1}

(23)
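
The algebra of Equations (20)-(23) can be sanity-checked with a toy model that represents every group element by its discrete logarithm, so that a pairing becomes a product of exponents. This is emphatically not a secure or real implementation (it does not use JPBC or any pairing library, and all secrets are visible); it is only a sketch for checking that the exponents cancel. The modulus, the attribute names, and the two-leaf AND tree are assumptions.

```python
import secrets

P = 2**127 - 1  # stand-in prime for the group order p (assumption)

def rand():
    return secrets.randbelow(P - 1) + 1  # nonzero element of Z_p

def pair(a, b):
    """e(g^a, g^b) = e(g, g)^(ab): in exponent form, multiply the exponents."""
    return a * b % P

def run_once():
    attrs = ["x", "y"]  # hypothetical attributes; the policy is "x AND y"
    # Setup and KeyGen, Equations (11)-(13), in exponent form.
    alpha, beta, r = rand(), rand(), rand()
    h = {a: rand() for a in attrs}  # H(a) = g^(h[a])
    r_i = {a: rand() for a in attrs}
    D = (alpha + r) * pow(beta, -1, P) % P
    D_i = {a: (r + h[a] * r_i[a]) % P for a in attrs}
    D_i_prime = dict(r_i)
    # Encrypt: root polynomial f_n(x) = s + c1 * x, leaves at indices 1 and 2.
    s, c1 = rand(), rand()
    share = {"x": (s + c1 * 1) % P, "y": (s + c1 * 2) % P}
    m = rand()  # message M_1 as an exponent of e(g, g)
    C_tilde = (m + alpha * s) % P
    C = beta * s % P
    C_leaf = dict(share)
    C_leaf_prime = {a: h[a] * share[a] % P for a in attrs}
    # Decrypt, Equation (20): division in G_2 is subtraction of exponents.
    F = {a: (pair(D_i[a], C_leaf[a]) - pair(D_i_prime[a], C_leaf_prime[a])) % P
         for a in attrs}
    # Equation (21) with Q = {1, 2}: Delta_1(0) = 2 and Delta_2(0) = -1.
    F_root = (2 * F["x"] - F["y"]) % P
    # Equation (23): C_tilde * F_n / e(C, D).
    recovered = (C_tilde + F_root - pair(C, D)) % P
    return m, recovered

m, recovered = run_once()
print(m == recovered)
```

Each F value equals r times the leaf share, the Lagrange combination yields r times s, and the final quotient removes both the alpha and r terms, leaving the message.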

7. Security Analysis

7.1. Security Model

This paper uses a security game between the attacker

A

and the challenger

C

to describe the security model of the scheme. The specific process is as follows:

(1) Initialization Phase

Attacker

A

selects a challenge access structure

(T, T^{*})

and sends it to challenger

C

, where

T

and

T^{*}

share sub-policies.

(2) Setup Phase

Challenger

C

obtains the public key

P K

and sends it to attacker

A

.

(3) Key Query Phase 1

Attacker

A

initiates a private key query request to challenger

C

, and challenger

C

generates the corresponding attribute private key

S K_{i}

based on the attribute set

S_{i}

. However, none of the queried attribute sets may satisfy the access policies

T

and

T^{*}

.

(4) Challenge Phase

Attacker

A

sends two pairs of plaintexts of the same length,

(M_{0}, M_{0}^{*})

and

(M_{1}, M_{1}^{*})

, to challenger

C

. Challenger

C

randomly selects b from the set

{0, 1}

and encrypts

(M_{b}, M_{b}^{*})

using

T

and

T^{*}

to obtain ciphertexts

(C T, C T^{*})

, which are then sent to attacker

A

.

(5) Query Phase 2

Repeat the operations of the key query phase 1.

(6) Guess Phase

Attacker

A

outputs a guess

b^{'}

for b. If

b = b^{'}

, then attacker

A

wins this security game. The advantage of

A

in this game is defined as:

A d v_{A} = P r [b^{'} = b] - 1 / 2

7.2. Proof of Security

Definition 11.

If the CP-ABE scheme [2] is secure, then there does not exist an attacker

A

who can break the scheme proposed in this paper with a non-negligible advantage in polynomial time; that is, the scheme proposed in this paper is secure.

Proof.

The scheme is based on the CP-ABE scheme [2], which has been proven to be secure in the generic group and random oracle model. If attacker

A

can break the scheme proposed in this paper with a non-negligible advantage in polynomial time, then there exists a challenger

C

who can break the CP-ABE scheme with the same advantage.

Initialization phase: attacker

A

selects a challenge access structure

(T, T^{*})

and sends it to challenger

C

, where

T

and

T^{*}

share sub-policies.

Setup phase: challenger

C

obtains the public key

P K = (G_{1}, g, g^{β}, e {(g, g)}^{α})

from the CP-ABE scheme and sends

P K

to attacker

A

.

Key query phase 1: attacker

A

queries challenger

C

for the private key corresponding to the attribute set

S_{i}

. Challenger

C

selects two random values

r, r_{i} \in Z_{p}

and generates the corresponding private key

S K_{i} = (D = g^{\frac{α + r}{β}}, {D_{i} = g^{r} \cdot H {(i)}^{r_{i}}, D_{i}^{'} = g^{r_{i}}}_{\forall i \in A_{q_{1}}})

using the key generation algorithm and sends it to

A

. Attacker

A

can query

C

multiple times. Challenger

C

will continue to respond to attacker

A

’s queries. However, none of the queried attribute sets may satisfy the access policies

T

and

T^{*}

.

Challenge phase: attacker

A

sends two plaintext message pairs

{M_{0}, {M_{0}}^{*}}

and

{M_{1}, {M_{1}}^{*}}

to challenger

C

. Challenger

C

obtains

{M_{0}, {M_{0}}^{*}}

and

{M_{1}, {M_{1}}^{*}}

. It sends

(T, T^{*})

to the CP-ABE scheme, randomly chooses

b \in {0, 1}

, and encrypts using access policies

T

and

T^{*}

to generate the corresponding ciphertexts

C T

and

C T^{*}

. Then, challenger

C

sends the generated challenge ciphertexts

C T, C T^{*}

to attacker

A

.

C T, C T^{*}

are as follows:

\begin{matrix} C T = & (T, \tilde{C} = M_{b} \cdot e {(g, g)}^{α f_{p} (0)}, C = g^{β f_{p} (0)}, \\ \forall N_{ξ} \in Y (T_{A}) : C_{N_{ξ}} = g^{f_{ξ} (0)}, {C^{'}}_{N_{ξ}} = H {(a t t (N_{ξ}))}^{f_{ξ} (0)}, \\ \forall N_{ρ} \in Y (T_{o t h e r_{1}}) : C_{N_{ρ}} = g^{f_{ρ} (0)}, {C^{'}}_{N_{ρ}} = H {(a t t (N_{ρ}))}^{f_{ρ} (0)}) \end{matrix}

(24)

\begin{matrix} C T^{*} = & (T^{*}, \tilde{C} = {M_{b}}^{*} \cdot e {(g, g)}^{α f_{q} (0)}, C = g^{β f_{q} (0)}, \\ \forall N_{ξ} \in Y (T_{A}) : C_{N_{ξ}} = g^{f_{ξ} (0)}, {C^{'}}_{N_{ξ}} = H {(a t t (N_{ξ}))}^{f_{ξ} (0)}, \\ \forall N_{ρ} \in Y (T_{o t h e r_{1}}) : C_{N_{ρ}} = g^{f_{ρ} (0)}, {C^{'}}_{N_{ρ}} = H {(a t t (N_{ρ}))}^{f_{ρ} (0)}) \end{matrix}

(25)

Key query phase 2: attacker $A$ continues to initiate private key queries for different attribute sets. If the queried attribute set does not satisfy $T$ and $T^{*}$, challenger $C$ responds to the query in the same manner as in Phase 1. Otherwise, attacker $A$’s query requests are terminated.

Guessing phase: attacker $A$ outputs a guess $b^{'} \in \{0, 1\}$ for $b$. Attacker $A$ wins the game if $b = b^{'}$, and the advantage of attacker $A$ in winning the game is denoted as:

$$Adv_{A} = Pr[b = b^{'}] - 1/2$$
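To make the advantage definition concrete, the following minimal sketch estimates $Pr[b = b^{'}] - 1/2$ empirically for a toy guessing game. It is not the CP-ABE game itself: no encryption is performed, and the names `estimate_advantage`, `blind_adversary`, `leaky_adversary`, and the `leak` flag are hypothetical and introduced only for illustration. Under these assumptions, it shows that an adversary whose view is independent of $b$ has an advantage close to 0, whereas an adversary that learns $b$ has an advantage of 1/2.

```python
import random


def blind_adversary(view, rng):
    """Guesses b' without any information (view is always None)."""
    return rng.randrange(2)


def leaky_adversary(view, rng):
    """Outputs the leaked bit; only meaningful when the toy game leaks b."""
    return view


def estimate_advantage(adversary, trials=20000, leak=False, seed=0):
    """Monte Carlo estimate of Pr[b = b'] - 1/2 in a toy guessing game."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        b = rng.randrange(2)          # challenger's hidden bit
        view = b if leak else None    # a secure scheme should reveal nothing about b
        guess = adversary(view, rng)  # adversary's guess b'
        wins += int(guess == b)
    return wins / trials - 0.5


if __name__ == "__main__":
    print("blind adversary:", estimate_advantage(blind_adversary))
    print("leaky adversary:", estimate_advantage(leaky_adversary, leak=True))
```

A security reduction of the kind used in this section argues that any adversary against the proposed scheme behaves like the blind adversary unless the underlying CP-ABE scheme [2] can itself be broken.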

In summary, if attacker $A$ can win the above game against this scheme with a certain advantage, then there exists a challenger $C$ who can break the CP-ABE scheme with the same advantage. However, the CP-ABE scheme [2] has been proven secure, so the proposed scheme is secure. □

8. Performance Analysis

8.1. Encryption Computation Overhead Analysis

All experiments in this paper use the CP-ABE toolkit based on the JPBC (Java pairing-based cryptography) library. The simulation experiments were run on Windows 10 Home edition with an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz and 16 GB of RAM. All experimental results are averages over 10 runs. Let $T(m_{i})$ represent the computational cost of an exponentiation in group $G_{i}$ $(i = 1, 2)$, and let $T(pm_{2})$ represent the computational cost of a multiplication in group $G_{2}$.

In the CP-ABE scheme [2], the DA encrypts each data object separately. Computing $\tilde{C}$ in the ciphertext involves one exponentiation and one multiplication in group $G_{2}$, which costs $T(m_{2}) + T(pm_{2})$. Computing $C$ requires one exponentiation in group $G_{1}$, which costs $T(m_{1})$. For each leaf node, computing each of $C_{N_{y}}$ and $C^{'}_{N_{y}}$ also requires one exponentiation in group $G_{1}$, at a cost of $T(m_{1})$ each. Let $Y(T_{1})$ represent the number of attributes, i.e., leaf nodes, of the access tree. The total computational cost for all leaf nodes is therefore $2Y(T_{1})T(m_{1})$, and the total computational cost for encrypting a single data object is $(2Y(T_{1}) + 1)T(m_{1}) + T(m_{2}) + T(pm_{2})$. The running times of the three types of operations involved in the encryption algorithm are shown in Table 7.

It can be seen that the encryption computational overhead mainly comes from the exponentiations in group $G_{1}$ at the leaf nodes. If we reduce the number of leaf nodes involved in encryption operations, we reduce the overall encryption overhead.
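The cost expression above can be evaluated directly with the running times reported in Table 7. The following is a minimal sketch, not part of the original experiments: the function names `single_object_cost_ms` and `leaf_share` are hypothetical, and the model counts only the three operation types listed in Table 7, so it is an estimate rather than a measured encryption time.

```python
# Running times from Table 7, in milliseconds.
T_M1 = 15.30   # exponentiation in group G_1
T_M2 = 1.08    # exponentiation in group G_2
T_PM2 = 0.02   # multiplication in group G_2


def single_object_cost_ms(num_leaves: int) -> float:
    """Modelled cost (2Y + 1) T(m1) + T(m2) + T(pm2) for one data object."""
    if num_leaves < 0:
        raise ValueError("num_leaves must be non-negative")
    return (2 * num_leaves + 1) * T_M1 + T_M2 + T_PM2


def leaf_share(num_leaves: int) -> float:
    """Fraction of the modelled cost spent on the leaf-node terms 2Y T(m1)."""
    return (2 * num_leaves * T_M1) / single_object_cost_ms(num_leaves)


if __name__ == "__main__":
    for y in (5, 10, 20):
        print(y, round(single_object_cost_ms(y), 2), round(leaf_share(y), 3))
```

For an access tree with 10 leaf nodes, this model gives about 322.4 ms, of which roughly 95% is attributable to the leaf-node exponentiations in $G_{1}$.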

8.2. Comparison of Different Schemes

As shown in Table 8, the traditional CP-ABE scheme [2] does not take into account the occurrence of shared sub-policies. Li’s scheme [19] can only handle a single shared sub-policy. Our previous work [20] handles the case where there exists the same number of shared sub-policies among multiple access trees. Our scheme provides a global clustering method of access trees according to their policy similarity and then integrates access trees in the same cluster. In contrast to our previous work, in our approach, within the same cluster of access trees, we allow the number of primary shared sub-policies among access trees to differ.

For brevity, we use the six access trees illustrated in Figure 9 to compare the computational overheads of the different schemes. The example is representative: the six access trees have three layers, include both “AND” and “OR” root gates, and contain both equal and different numbers of primary shared sub-policies. In practice, if a data owner releases only a few data objects, or the access control policies overlap little, clustering might not be needed. If a data owner releases many data objects whose access policies overlap significantly, our scheme is useful. The computational overheads of the traditional CP-ABE scheme for encrypting the six access trees in Figure 9 are:

$T_{1}: [2Y(T_{other_{1}}) + 2Y(T_{A}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

$T_{2}: [2Y(T_{other_{2}}) + 2Y(T_{A}) + 2Y(T_{B}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

$T_{3}: [2Y(T_{other_{3}}) + 2Y(T_{A}) + 2Y(T_{B}) + 2Y(T_{C}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

$T_{4}: [2Y(T_{other_{4}}) + 2Y(T_{A}) + 2Y(T_{B}) + 2Y(T_{C}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

$T_{5}: [2Y(T_{other_{5}}) + 2Y(T_{A}) + 2Y(T_{D}) + 2Y(T_{E}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

$T_{6}: [2Y(T_{other_{6}}) + 2Y(T_{A}) + 2Y(T_{D}) + 2Y(T_{E}) + 1]T(m_{1}) + T(m_{2}) + T(pm_{2})$

The total computational overhead is

$$\begin{matrix} [12Y(T_{A}) + 6Y(T_{B}) + 4Y(T_{C}) + 4Y(T_{D}) + 4Y(T_{E}) \\ + 2Y(T_{other_{1}}) + 2Y(T_{other_{2}}) + 2Y(T_{other_{3}}) \\ + 2Y(T_{other_{4}}) + 2Y(T_{other_{5}}) + 2Y(T_{other_{6}}) \\ + 6]T(m_{1}) + 6T(m_{2}) + 6T(pm_{2}) \end{matrix}$$

(26)

In reference [19], the single shared sub-policy $T_{A}$ of all six access trees is merged into one, and the data objects corresponding to the six access trees are encrypted together, which reduces the computational overhead by $10Y(T_{A})T(m_{1})$ compared to the traditional CP-ABE scheme. The scheme in reference [20], which supports the integration of multiple shared sub-policies with the same numbers, reduces the computational overhead by $[10Y(T_{A}) + 4Y(T_{B}) + 2Y(T_{D}) + 2Y(T_{E})]T(m_{1})$ compared to the traditional CP-ABE scheme. Our scheme supports the integration of multiple shared sub-policies with different numbers, which reduces the computational overhead by $[10Y(T_{A}) + 4Y(T_{B}) + 2Y(T_{C}) + 2Y(T_{D}) + 2Y(T_{E})]T(m_{1})$ compared to the traditional CP-ABE scheme.
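The counts in these expressions can be checked with a short sketch. It assumes the membership of primary shared sub-policies implied by the per-tree expressions above ($T_{1}$ contains $T_{A}$; $T_{2}$ contains $T_{A}, T_{B}$; and so on), and it assumes that a merged sub-policy occurring in $k$ trees saves $2(k - 1)Y$ exponentiations in $G_{1}$, which reproduces the three savings stated in the text. The names `TREES`, `g1_exps_separate`, and `g1_exps_saved` are hypothetical, and only $G_{1}$ exponentiations are counted.

```python
from collections import Counter

# Primary shared sub-policies per access tree, read off the cost expressions above.
TREES = {
    "T1": ["A"],
    "T2": ["A", "B"],
    "T3": ["A", "B", "C"],
    "T4": ["A", "B", "C"],
    "T5": ["A", "D", "E"],
    "T6": ["A", "D", "E"],
}


def g1_exps_separate(trees, shared_leaves, other_leaves):
    """G_1 exponentiations when every tree is encrypted on its own."""
    total = 0
    for name, subs in trees.items():
        leaves = other_leaves[name] + sum(shared_leaves[s] for s in subs)
        total += 2 * leaves + 1
    return total


def g1_exps_saved(trees, shared_leaves, mergeable):
    """G_1 exponentiations saved when each mergeable sub-policy is computed once."""
    counts = Counter(s for subs in trees.values() for s in subs)
    return sum(2 * (counts[s] - 1) * shared_leaves[s]
               for s in mergeable if counts[s] > 0)


if __name__ == "__main__":
    i, j = 2, 5  # leaves per shared sub-policy, leaves per non-shared part
    shared = {s: i for s in "ABCDE"}
    other = {name: j for name in TREES}
    print("separate:", g1_exps_separate(TREES, shared, other))
    print("[19]:", g1_exps_saved(TREES, shared, ["A"]))
    print("[20]:", g1_exps_saved(TREES, shared, ["A", "B", "D", "E"]))
    print("ours:", g1_exps_saved(TREES, shared, ["A", "B", "C", "D", "E"]))
```

With $i = 2$ leaf nodes per shared sub-policy and $j = 5$ per non-shared part, the model gives 126 exponentiations for separate encryption and savings of 20, 36, and 40 exponentiations for [19], [20], and the proposed scheme, respectively.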

8.3. Simulation Experiment

8.3.1. Simulation Experiments with Different Scenarios

For clarity, we assume that the primary shared sub-policies have the same structure, i.e., they have the same number of leaf nodes $i = Y(T_{A}) = Y(T_{B}) = Y(T_{C}) = Y(T_{D}) = Y(T_{E}) = Y(T_{F})$; we also assume that the access structure of the non-shared part is the same, so that $j = Y(T_{other_{1}}) = Y(T_{other_{2}}) = Y(T_{other_{3}}) = Y(T_{other_{4}}) = Y(T_{other_{5}}) = Y(T_{other_{6}})$.

When the number of attributes $j$ of the non-shared sub-policy corresponding to each of the six access trees shown in Figure 9 is constant, the encryption computational overheads of the four schemes increase as the number of attributes $i$ in the primary shared sub-policies increases, as shown in Table 9. In this experiment, we take $j = 5$ and $i \in \{2, 3, 4, 5, 6\}$.

From Figure 13, it can be seen that the encryption computation overhead of all four schemes gradually increases with the number of attributes $i$ of the primary shared sub-policies. The encryption overhead of our scheme is smaller than that of the other three schemes.
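The comparison can be quantified from the values in Table 9. The following sketch only post-processes those reported numbers; the names `TABLE_9`, `I_VALUES`, and `relative_saving` are hypothetical.

```python
# Encryption overheads from Table 9 (seconds), for i = 2, 3, 4, 5, 6 and j = 5.
I_VALUES = [2, 3, 4, 5, 6]
TABLE_9 = {
    "[2]": [2.966, 3.665, 4.308, 5.085, 5.801],
    "[19]": [2.501, 2.938, 3.366, 3.908, 4.404],
    "[20]": [2.141, 2.366, 2.619, 2.952, 3.324],
    "Ours": [2.061, 2.238, 2.429, 2.701, 2.912],
}


def relative_saving(baseline, scheme="Ours"):
    """Fraction of the baseline overhead saved by `scheme`, for each value of i."""
    return [(b - s) / b for b, s in zip(TABLE_9[baseline], TABLE_9[scheme])]


if __name__ == "__main__":
    for baseline in ("[2]", "[19]", "[20]"):
        savings = relative_saving(baseline)
        print(baseline, [f"{100 * x:.1f}%" for x in savings])
```

Relative to the traditional scheme [2], the saving grows from about 31% at $i = 2$ to about 50% at $i = 6$, which is consistent with the observation that a larger shared part leaves more repeated computation to eliminate.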

8.3.2. Simulation Experiment with Clustering

Using the clustering method in Section 5.4, we achieve the clustering results shown in Figure 14.

In first-level clustering, we obtain the first cluster, composed of 17 access trees: $T_{2}$ (AND), $T_{4}$ (OR), $T_{7}$ (AND), $T_{18}$ (AND), $T_{20}$ (OR), $T_{25}$ (AND), $T_{28}$ (OR), $T_{29}$ (AND), $T_{30}$ (AND), $T_{32}$ (AND), $T_{33}$ (AND), $T_{34}$ (OR), $T_{35}$ (AND), $T_{36}$ (OR), $T_{44}$ (AND), $T_{45}$ (OR), and $T_{50}$ (OR). After classifying, second-level clustering, access tree amendment, and integration, these 17 access trees are integrated into one big access tree, which avoids repeated computation for primary shared sub-policies during the encryption process. Hence, in contrast to traditional CP-ABE, which encrypts these 17 objects separately, the encryption computations related to the minimal sub-policies $T_{N}$, $T_{P}$, $T_{F}$, $T_{D}$, $T_{L}$, $T_{C}$, $T_{B}$, $T_{H}$, $T_{M}$, $T_{T}$, $T_{O}$, and $T_{I}$ are saved 16, 4, 3, 2, 2, 3, 1, 1, 1, 1, 2, and 1 times, respectively.

In first-level clustering, we obtain the second cluster, composed of nine access trees: $T_{13}$ (AND), $T_{15}$ (AND), $T_{16}$ (OR), $T_{17}$ (OR), $T_{23}$ (OR), $T_{39}$ (AND), $T_{40}$ (OR), $T_{42}$ (AND), and $T_{47}$ (AND). After classifying, second-level clustering, access tree amendment, and integration, these nine access trees are integrated into one big access tree, which avoids repeated computation for primary shared sub-policies during the encryption process. Hence, in contrast to traditional CP-ABE, which encrypts these nine objects separately, the encryption computations related to the minimal sub-policies $T_{O}$, $T_{R}$, $T_{F}$, $T_{G}$, $T_{L}$, and $T_{M}$ are saved 8, 2, 1, 3, 1, and 1 times, respectively.

In first-level clustering, we obtain the third cluster, composed of 10 access trees: $T_{6}$ (OR), $T_{8}$ (AND), $T_{9}$ (OR), $T_{10}$ (OR), $T_{21}$ (OR), $T_{22}$ (OR), $T_{26}$ (AND), $T_{31}$ (OR), $T_{41}$ (AND), and $T_{46}$ (OR). After classifying, second-level clustering, access tree amendment, and integration, these 10 access trees are integrated into one big access tree, which avoids repeated computation for primary shared sub-policies during the encryption process. Hence, in contrast to traditional CP-ABE, which encrypts these 10 objects separately, the encryption computations related to the minimal sub-policies $T_{J}$, $T_{Q}$, $T_{H}$, $T_{D}$, $T_{T}$, $T_{I}$, and $T_{B}$ are saved 9, 1, 2, 1, 1, 2, and 1 times, respectively.

A comparison of the computational cost of encryption before and after clustering is shown in Table 10.

From Figure 15, we can see that after integrating access trees in the same cluster, we can encrypt the corresponding data objects as a whole, which reduces the computational overhead significantly compared to encrypting them separately.
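The figures in Table 10 can be summarized with a short sketch that computes, per cluster, the reduction in leaf nodes, the reduction in encryption time, and the encryption time per leaf node. It only post-processes the reported values; the names `TABLE_10`, `reduction`, and `time_per_leaf` are hypothetical.

```python
# Values from Table 10:
# (leaves before, leaves after, seconds before, seconds after) per cluster.
TABLE_10 = {
    "first": (304, 174, 14.248, 8.311),
    "second": (172, 115, 8.053, 5.443),
    "third": (137, 88, 6.435, 4.186),
}


def reduction(before, after):
    """Relative reduction (before - after) / before."""
    return (before - after) / before


def time_per_leaf(cluster):
    """Seconds per leaf node, separately and after integration."""
    leaves_before, leaves_after, t_before, t_after = TABLE_10[cluster]
    return t_before / leaves_before, t_after / leaves_after


if __name__ == "__main__":
    for name, (lb, la, tb, ta) in TABLE_10.items():
        print(name,
              f"leaves -{100 * reduction(lb, la):.1f}%",
              f"time -{100 * reduction(tb, ta):.1f}%",
              [round(x, 4) for x in time_per_leaf(name)])
```

In all three clusters, the time reduction closely tracks the leaf-node reduction (for example, about 42.8% fewer leaf nodes and about 41.7% less time in the first cluster), and the time per leaf node stays near 0.047 s, which agrees with the analysis in Section 8.1 that leaf-node operations dominate the encryption cost.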

9. Conclusions

Most research on CP-ABE efficiency tries to improve the encryption, key generation, or decryption algorithms. Such work ignores the fact that there might exist similarities among the access policies of different data objects released by the same data owner. Considering the practical existence of varying degrees of overlap or similarity among these access policies, we propose a policy clustering approach based on the soft set decision-making and cluster percolation methods. In addition, we describe how to merge the similar policy pieces and integrate access policies in the same cluster, enabling the encryption of the corresponding data objects as a whole. Thus, our approach helps to prevent redundant computations associated with these similar policy pieces during the encryption process, ultimately enhancing the overall encryption efficiency from the data owner’s standpoint. Our method is suitable for CP-ABE application scenarios in which the access policies of the data objects released by a data owner are composed of a small subset of the attribute universe and overlap substantially. Like most CP-ABE schemes, our scheme is IND-CPA secure.

Our proposal includes certain restrictions on the structure of access policies, and we have not yet provided an integration method for arbitrary access trees that have shared sub-policies. We will focus on developing a less restrictive integration method for similar access trees in the future. Additionally, we will explore more precise measuring methods of policy similarity.

Author Contributions

Conceptualization and validation, W.L. and N.H.; writing—original draft, W.L. and N.H.; writing—review and editing, W.L. and N.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was sponsored by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (grant number: 2023D01C02) and the National Natural Science Foundation of China (grant number: 61862059).

Data Availability Statement

Data are artificial and were randomly generated.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

1. Sahai, A.; Waters, B. Fuzzy identity-based encryption. In Proceedings of the Advances in Cryptology—EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, 22–26 May 2005; pp. 457–473.
2. Bethencourt, J.; Sahai, A.; Waters, B. Ciphertext-policy attribute-based encryption. In Proceedings of the 2007 IEEE Symposium on Security and Privacy (SP’07), Berkeley, CA, USA, 20–23 May 2007; pp. 321–334.
3. Zhou, Z.; Huang, D. Efficient and secure data storage operations for mobile cloud computing. In Proceedings of the 2012 8th International Conference on Network and Service Management (cnsm) and 2012 Workshop on Systems Virtualiztion Management (svm), Las Vegas, NV, USA, 22–26 October 2012; pp. 37–45.
4. Li, J.; Jia, C.; Li, J.; Chen, X. Outsourcing encryption of attribute-based encryption with mapreduce. In Proceedings of the Information and Communications Security: 14th International Conference, ICICS 2012, Hong Kong, China, 29–31 October 2012; pp. 191–201.
5. Luo, W.; Feng, C.; Zou, L.; Yuan, D.; Wu, T.; Li, M.; Wang, G. Attribute-based encryption scheme with fast encryption. J. Softw. 2020, 31, 3923–3936. (In Chinese)
6. Hohenberger, S.; Waters, B. Online/offline attribute-based encryption. In Proceedings of the Public-Key Cryptography–PKC 2014: 17th International Conference on Practice and Theory in Public-Key Cryptography, Buenos Aires, Argentina, 26–28 March 2014; pp. 293–310.
7. Leng, Q.; Luo, W. Attribute-based encryption with outsourced encryption. Commun. Technol. 2021, 54, 2242–2246.
8. Green, M.; Hohenberger, S.; Waters, B. Outsourcing the decryption of ABE ciphertexts. In Proceedings of the 20th USENIX Security Symposium (USENIX Security 11), San Francisco, CA, USA, 10–12 August 2011.
9. Feng, C.; Yu, K.; Aloqaily, M.; Alazab, M.; Lv, Z.; Mumtaz, S. Attribute-based encryption with parallel outsourced decryption for edge intelligent IoV. IEEE Trans. Veh. Technol. 2020, 69, 13784–13795.
10. Zhong, H.; Zhou, Y.; Zhang, Q.; Xu, Y.; Cui, J. An efficient and outsourcing-supported attribute-based access control scheme for edge-enabled smart healthcare. Future Gener. Comput. Syst. 2021, 115, 486–496.
11. Zhang, J.; Cheng, Z.; Cheng, X.; Chen, B. OAC-HAS: Outsourced access control with hidden access structures in fog-enhanced IoT systems. Connect. Sci. 2021, 33, 1060–1076.
12. Laicheng, C.; Yufei, L.; Xiaoye, D.; Xian, G. User privacy-preserving cloud storage scheme on CP-ABE. J. Tsinghua Univ. (Sci. Technol.) 2018, 58, 150–156.
13. Zou, L.; Feng, C.; Qin, Z.; Yuan, D.; Luo, W.; Li, M. CP-ABE scheme with fast decryption for public cloud. J. Softw. 2020, 31, 1817–1828. (In Chinese)
14. Li, J.; Sha, F.; Zhang, Y.; Huang, X.; Shen, J. Verifiable outsourced decryption of attribute-based encryption with constant ciphertext length. Secur. Commun. Netw. 2017, 2017, 3596205.
15. Zhang, R.; Ma, H.; Lu, Y. Fine-grained access control system based on fully outsourced attribute-based encryption. J. Syst. Softw. 2017, 125, 344–353.
16. Sheng, L. User privacy protection scheme based on verifiable outsourcing attribute-based encryption. Secur. Commun. Netw. 2021, 2021, 6617669.
17. Wang, S.; Zhou, J.; Liu, J.K.; Yu, J.; Chen, J.; Xie, W. An Efficient File Hierarchy Attribute-Based Encryption Scheme in Cloud Computing. IEEE Trans. Inf. Forensics Secur. 2016, 11, 1265–1277.
18. Wang, J. AccessPolicy for Attribute-Based Encryption. Ph.D. Thesis, Wuhan University, Wuhan, China, 2015. (In Chinese)
19. Li, W.; Liu, B.M.; Liu, D.; Liu, R.P.; Wang, P.; Luo, S.; Ni, W. Unified fine-grained access control for personal health records in cloud computing. IEEE J. Biomed. Health Inform. 2018, 23, 1278–1289.
20. Wang, Y.; Guo, T.; Helil, N. CP-ABE Optimization via the Flexible Integration of Access Policies Containing Multiple Shared Subpolicies. Secur. Commun. Netw. 2022, 2022, 2822846.
21. Maji, P.; Roy, A.R.; Biswas, R. An application of soft sets in a decision making problem. Comput. Math. Appl. 2002, 44, 1077–1083.
22. Roy, A.R.; Maji, P. A fuzzy soft set theoretic approach to decision making problems. J. Comput. Appl. Math. 2007, 203, 412–418.
23. Liu, Z.; Qin, K.; Pei, Z. A method for fuzzy soft sets in decision-making based on an ideal solution. Symmetry 2017, 9, 246.
24. Palla, G.; Derényi, I.; Farkas, I.; Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 2005, 435, 814–818.
25. Adamcsek, B.; Palla, G.; Farkas, I.J.; Derényi, I.; Vicsek, T. CFinder: Locating cliques and overlapping modules in biological networks. Bioinformatics 2006, 22, 1021–1023.

Figure 1. Example of cliques.

Figure 2. Example of communities.

Figure 3. Example of the cluster percolation method (http://web.stanford.edu/class/cs224w/, accessed on 1 March 2023).

Figure 4. Illustration of access tree amendment.

Figure 5. Structure of access trees $T_{2}$ (AND), $T_{7}$ (AND), $T_{25}$ (AND), $T_{32}$ (AND).

Figure 6. Amendment results of access trees $T_{2}$ (AND), $T_{7}$ (AND), $T_{25}$ (AND), $T_{32}$ (AND).

Figure 7. Amendment results of access trees $T_{2}$ (AND), $T_{25}$ (AND), $T_{32}$ (AND).

Figure 8. Integration result of access trees $T_{2}$ (AND), $T_{7}$ (AND), $T_{25}$ (AND), $T_{32}$ (AND).

Figure 9. Example of six access trees with different primary shared sub-policies.

Figure 10. Integration results of two types of access trees.

Figure 11. Final integration results of two types of access trees.

Figure 12. System architecture.

Figure 13. The trends of the computational cost of encryption of the four schemes with the number of attributes in the primary shared sub-policies.

Figure 14. Final clustering results.

Figure 15. Comparison of encryption time costs with and without clustering and integration (unit: s).

Table 1. Example of 10 primary shared sub-policies contained in 20 access trees.

	$T_{A}$	$T_{B}$	$T_{C}$	$T_{D}$	$T_{E}$	$T_{F}$	$T_{G}$	$T_{H}$	$T_{I}$	$T_{J}$
$T_{1}$	3		1		2	1			1
$T_{2}$	3		1		2					2
$T_{3}$		2		3		1		2	1
$T_{4}$	3				2
$T_{5}$		2	1			1			1
$T_{6}$				3
$T_{7}$						1				2
$T_{8}$				3			2
$T_{9}$						1				2
$T_{10}$		2						2
$T_{11}$			1
$T_{12}$		2			2		2		1
$T_{13}$			1			1
$T_{14}$	3									2
$T_{15}$				3
$T_{16}$		2	1
$T_{17}$						1		2
$T_{18}$		2					2			2
$T_{19}$					2				1
$T_{20}$	3		1			1		2
impact factor	15	12	7	12	10	8	6	8	5	10
maximum combinable number	12	10	6	9	8	7	4	6	4	8

Table 2. Example of access trees with primary shared sub-policies.

	$T_{A}$	$T_{B}$	$T_{C}$	$T_{D}$	$T_{E}$	$T_{F}$	$T_{G}$	$T_{H}$	$T_{I}$	$T_{J}$	$T_{K}$	$T_{L}$	$T_{M}$	$T_{N}$	$T_{O}$	$T_{P}$	$T_{Q}$	$T_{R}$	$T_{X}$	$T_{T}$
$T_{1} (A N D)$	0	0	0	3	0	0	3	0	0	0	0	3	0	0	0	0	1	4	0	2
$T_{2} (A N D)$	0	2	0	3	0	2	0	0	0	0	0	0	0	4	0	4	0	0	0	0
$T_{3} (O R)$	0	0	0	0	0	0	0	0	0	0	0	3	0	0	0	0	0	0	0	0
$T_{4} (O R)$	0	0	2	0	0	0	3	0	4	0	0	3	0	4	0	0	0	0	0	0
$T_{5} (A N D)$	0	2	0	0	0	0	0	0	0	0	0	3	3	0	0	0	0	0	4	2
$T_{6} (O R)$	0	0	0	0	0	0	0	0	4	3	0	0	0	0	0	4	0	0	0	2
$T_{7} (A N D)$	3	0	0	0	2	2	0	3	0	3	4	0	0	4	4	4	0	4	0	2
$T_{8} (A N D)$	0	0	0	0	2	0	3	0	0	3	0	0	3	0	0	0	1	0	0	0
$T_{9} (O R)$	3	0	0	0	2	0	0	3	0	3	0	0	3	0	0	0	1	0	4	0
$T_{10} (O R)$	0	2	0	0	0	0	0	0	4	3	0	0	3	0	0	0	0	4	0	0
$T_{11} (A N D)$	0	0	0	0	2	0	0	3	0	0	0	3	3	0	0	4	1	0	0	0
$T_{12} (A N D)$	0	0	0	0	2	2	0	3	0	0	0	0	0	0	4	4	0	0	4	0
$T_{13} (A N D)$	0	2	0	3	0	2	3	0	0	0	0	0	0	0	4	0	0	4	0	0
$T_{14} (A N D)$	0	2	0	3	0	0	0	3	0	0	0	0	0	0	0	0	0	0	0	0
$T_{15} (A N D)$	0	0	0	0	0	0	0	0	4	3	0	0	0	0	4	0	0	4	0	2
$T_{16} (O R)$	0	0	2	0	0	0	3	0	0	0	0	3	3	0	4	4	0	0	0	2
$T_{17} (O R)$	0	0	0	3	0	0	3	0	0	3	4	0	0	0	4	0	0	0	4	2
$T_{18} (A N D)$	0	0	2	0	0	0	0	0	0	3	0	0	0	4	0	0	0	0	0	2
$T_{19} (O R)$	0	0	0	3	0	0	0	0	0	0	4	0	0	0	0	0	1	0	4	0
$T_{20} (O R)$	3	0	0	0	2	0	0	3	0	0	0	0	3	4	4	4	0	0	0	0
$T_{21} (O R)$	0	0	2	0	2	0	0	0	0	3	0	0	0	0	0	0	0	0	0	0
$T_{22} (O R)$	0	0	0	3	0	0	3	3	0	3	0	0	0	0	0	0	0	4	0	2
$T_{23} (O R)$	3	0	0	0	0	0	0	0	4	0	0	0	0	0	4	0	0	4	0	0
$T_{24} (O R)$	3	0	0	0	0	0	0	0	4	0	0	0	0	0	0	4	0	0	0	0
$T_{25} (A N D)$	0	0	0	3	0	2	0	3	0	0	0	3	0	4	0	4	0	4	0	0
$T_{26} (A N D)$	0	0	2	0	0	0	0	0	0	3	0	0	0	0	0	0	0	0	0	0
$T_{27} (A N D)$	0	0	0	0	0	0	0	0	4	0	0	3	0	0	0	0	0	0	0	2
$T_{28} (O R)$	0	0	0	0	0	0	0	0	4	3	0	0	0	4	4	4	0	0	0	0
$T_{29} (A N D)$	3	2	2	0	0	2	3	0	0	0	4	3	0	4	0	0	0	0	0	0
$T_{30} (A N D)$	0	0	0	0	2	0	0	3	4	0	0	0	3	4	0	0	0	0	0	2
$T_{31} (O R)$	0	2	2	3	0	0	0	3	0	3	0	3	0	0	0	4	1	0	0	2
$T_{32} (A N D)$	0	0	0	3	0	2	0	0	0	0	0	3	3	4	4	4	0	0	4	0
$T_{33} (A N D)$	0	2	2	0	0	0	0	0	0	0	0	0	0	4	0	0	0	0	0	0
$T_{34} (O R)$	0	0	2	0	0	0	0	3	4	0	0	3	3	4	0	0	1	4	0	0
$T_{35} (A N D)$	0	0	0	0	0	0	3	3	0	0	0	0	3	4	0	0	0	0	0	2
$T_{36} (O R)$	0	0	0	3	0	0	0	0	0	3	0	0	0	4	4	0	0	0	0	0
$T_{37} (A N D)$	3	0	0	0	0	0	0	0	0	0	0	0	0	0	0	4	0	0	0	0
$T_{38} (A N D)$	3	0	0	0	0	0	0	0	0	0	4	0	3	0	0	0	0	0	4	0
$T_{39} (A N D)$	3	0	0	0	0	0	0	3	0	0	0	0	0	0	4	0	0	4	0	0
$T_{40} (O R)$	3	0	0	0	0	0	3	0	4	3	0	3	3	0	4	0	1	0	0	0
$T_{41} (A N D)$	0	2	0	0	0	0	0	0	0	3	0	0	0	0	0	0	1	0	4	0
$T_{42} (A N D)$	0	0	0	3	0	2	3	0	4	0	4	0	0	0	4	0	0	0	0	0
$T_{43} (A N D)$	0	0	2	0	0	2	0	0	0	0	0	0	0	0	0	0	0	4	0	0
$T_{44} (A N D)$	0	2	0	0	0	2	0	0	0	0	0	3	0	4	0	0	0	0	4	0
$T_{45} (O R)$	0	0	0	0	2	0	0	0	0	0	0	0	3	4	0	0	0	0	0	2
$T_{46} (O R)$	3	2	0	0	2	0	0	0	4	3	0	0	0	0	0	0	0	0	0	0
$T_{47} (A N D)$	0	2	0	0	0	2	3	3	0	0	0	0	0	0	4	0	0	0	4	2
$T_{48} (O R)$	0	2	0	0	2	2	0	3	0	0	0	0	0	0	0	0	0	0	0	0
$T_{49} (O R)$	0	2	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	4	0	0
$T_{50} (A N D)$	0	0	0	0	0	0	0	0	0	3	0	0	0	4	0	0	0	0	0	0

Table 3. Similarity between some of the access trees.

	$T_{1}$	$T_{2}$	$T_{3}$	$T_{4}$	$T_{5}$	$T_{6}$	$T_{7}$	$T_{8}$	$T_{9}$	$T_{10}$	$T_{11}$	$T_{12}$	$T_{13}$	$T_{14}$	$T_{15}$	$T_{16}$	$T_{17}$	$T_{18}$	$T_{19}$	$T_{20}$	$T_{21}$	$T_{22}$	$T_{23}$	$T_{24}$
$T_{1}$
$T_{2}$	3
$T_{3}$	3	0
$T_{4}$	6	4	3
$T_{5}$	5	2	3	3
$T_{6}$	2	4	0	4	2
$T_{7}$	6	10	0	4	2	9
$T_{8}$	4	0	0	3	3	3	5
$T_{9}$	1	0	0	0	7	3	11	9
$T_{10}$	4	2	0	4	5	7	7	6	6
$T_{11}$	4	4	3	3	6	4	9	6	9	3
$T_{12}$	0	6	0	0	4	4	15	2	9	0	9
$T_{13}$	10	7	0	3	2	0	10	3	0	6	0	6
$T_{14}$	3	5	0	0	2	0	3	0	3	2	3	3	5
$T_{15}$	6	0	0	4	2	9	13	3	3	11	0	4	8	0
$T_{16}$	8	4	3	8	8	6	10	6	3	3	10	8	7	0	6
$T_{17}$	8	3	0	3	6	5	13	6	7	3	0	8	10	3	9	9
$T_{18}$	2	4	0	6	2	5	9	3	3	3	0	0	0	0	5	4	5
$T_{19}$	4	3	0	0	4	0	4	1	5	0	1	4	3	3	0	0	11	0
$T_{20}$	0	8	0	4	3	4	20	5	11	3	12	13	4	3	4	11	4	4	0
$T_{21}$	0	0	0	2	0	3	5	5	5	3	2	2	0	0	3	2	3	5	0	2
$T_{22}$	12	3	0	3	2	5	12	6	6	7	3	3	10	6	9	5	11	5	3	3	3
$T_{23}$	4	0	0	4	0	4	11	0	3	8	0	4	8	0	12	4	4	0	0	7	0	4
$T_{24}$	0	4	0	4	0	8	7	0	3	4	4	4	0	0	4	4	0	0	0	7	0	0	7
$T_{25}$	10	13	3	7	3	4	17	0	3	4	10	9	9	6	4	7	3	4	3	11	0	10	4	4

Table 4. Results of the first clustering of 50 access trees.

	$T_{A}$	$T_{B}$	$T_{C}$	$T_{D}$	$T_{E}$	$T_{F}$	$T_{G}$	$T_{H}$	$T_{I}$	$T_{J}$	$T_{K}$	$T_{L}$	$T_{M}$	$T_{N}$	$T_{O}$	$T_{P}$	$T_{Q}$	$T_{R}$	$T_{X}$	$T_{T}$
$T_{2} (A N D)$	0	2	0	3	0	2	0	0	0	0	0	0	0	4	0	4	0	0	0	0
$T_{4} (O R)$	0	0	2	0	0	0	3	0	4	0	0	3	0	4	0	0	0	0	0	0
$T_{7} (A N D)$	3	0	0	0	2	2	0	3	0	3	4	0	0	4	4	4	0	4	0	2
$T_{18} (A N D)$	0	0	2	0	0	0	0	0	0	3	0	0	0	4	0	0	0	0	0	2
$T_{20} (O R)$	3	0	0	0	2	0	0	3	0	0	0	0	3	4	4	4	0	0	0	0
$T_{25} (A N D)$	0	0	0	3	0	2	0	3	0	0	0	3	0	4	0	4	0	4	0	0
$T_{28} (O R)$	0	0	0	0	0	0	0	0	4	3	0	0	0	4	4	4	0	0	0	0
$T_{29} (A N D)$	3	2	2	0	0	2	3	0	0	0	4	3	0	4	0	0	0	0	0	0
$T_{30} (A N D)$	0	0	0	0	2	0	0	3	4	0	0	0	3	4	0	0	0	0	0	2
$T_{32} (A N D)$	0	0	0	3	0	2	0	0	0	0	0	3	3	4	4	4	0	0	4	0
$T_{33} (A N D)$	0	2	2	0	0	0	0	0	0	0	0	0	0	4	0	0	0	0	0	0
$T_{34} (O R)$	0	0	2	0	0	0	0	3	4	0	0	3	3	4	0	0	1	4	0	0
$T_{35} (A N D)$	0	0	0	0	0	0	3	3	0	0	0	0	3	4	0	0	0	0	0	2
$T_{36} (O R)$	0	0	0	3	0	0	0	0	0	3	0	0	0	4	4	0	0	0	0	0
$T_{44} (A N D)$	0	2	0	0	0	2	0	0	0	0	0	3	0	4	0	0	0	0	4	0
$T_{45} (O R)$	0	0	0	0	2	0	0	0	0	0	0	0	3	4	0	0	0	0	0	2
$T_{50} (O R)$	0	0	0	0	0	0	0	0	0	3	0	0	0	4	0	0	0	0	0	0

Table 5. Results of the second clustering of 50 access trees.

	$T_{A}$	$T_{B}$	$T_{C}$	$T_{D}$	$T_{F}$	$T_{G}$	$T_{H}$	$T_{I}$	$T_{J}$	$T_{K}$	$T_{L}$	$T_{M}$	$T_{O}$	$T_{P}$	$T_{Q}$	$T_{R}$	$T_{X}$	$T_{T}$
$T_{13} (A N D)$	0	2	0	3	2	3	0	0	0	0	0	0	4	0	0	4	0	0
$T_{15} (A N D)$	0	0	0	0	0	0	0	4	3	0	0	0	4	0	0	4	0	2
$T_{16} (O R)$	0	0	2	0	0	3	0	0	0	0	3	3	4	4	0	0	0	2
$T_{17} (O R)$	0	0	0	3	0	3	0	0	3	4	0	0	4	0	0	0	4	2
$T_{23} (O R)$	3	0	0	0	0	0	0	4	0	0	0	0	4	0	0	4	0	0
$T_{39} (A N D)$	3	0	0	0	0	0	3	0	0	0	0	0	4	0	0	4	0	0
$T_{40} (O R)$	3	0	0	0	0	3	0	4	3	0	3	3	4	0	1	0	0	0
$T_{42} (A N D)$	0	0	0	3	2	3	0	4	0	4	0	0	4	0	0	0	0	0
$T_{47} (A N D)$	0	2	0	0	2	3	3	0	0	0	0	0	4	0	0	0	4	2

Table 6. Results of the third clustering of 50 access trees.

	$T_{A}$	$T_{B}$	$T_{C}$	$T_{D}$	$T_{E}$	$T_{G}$	$T_{H}$	$T_{I}$	$T_{J}$	$T_{L}$	$T_{M}$	$T_{P}$	$T_{Q}$	$T_{R}$	$T_{X}$	$T_{T}$
$T_{6} (O R)$	0	0	0	0	0	0	0	4	3	0	0	4	0	0	0	2
$T_{8} (A N D)$	0	0	0	0	2	3	0	0	3	0	3	0	1	0	0	0
$T_{9} (O R)$	3	0	0	0	2	0	3	0	3	0	3	0	1	0	4	0
$T_{10} (O R)$	0	2	0	0	0	0	0	4	3	0	3	0	0	4	0	0
$T_{21} (O R)$	0	0	2	0	2	0	0	0	3	0	0	0	0	0	0	0
$T_{22} (O R)$	0	0	0	3	0	3	3	0	3	0	0	0	0	4	0	2
$T_{26} (A N D)$	0	0	2	0	0	0	0	0	3	0	0	0	0	0	0	0
$T_{31} (O R)$	0	2	2	3	0	0	3	0	3	3	0	4	1	0	0	2
$T_{41} (A N D)$	0	2	0	0	0	0	0	0	3	0	0	0	1	0	4	0
$T_{46} (O R)$	3	2	0	0	2	0	0	4	3	0	0	0	0	0	0	0

Table 7. Operations and runtime involved in encryption algorithms.

Notation	Description	Running Time (ms)
$T (m_{1})$	Exponentiation in group $G_{1}$	15.30
$T (m_{2})$	Exponentiation in group $G_{2}$	1.08
$T (p m_{2})$	Multiplication on group $G_{2}$	0.02

Table 8. Comparison of different schemes in supporting shared sub-policies.

Scheme	Single Shared Sub-Policy	Multiple Shared Sub-Policies with Same Numbers	Multiple Shared Sub-Policies with Different Numbers	Policy Clustering
[2]	no	no	no	no
[19]	yes	no	no	no
[20]	yes	yes	no	no
Ours	yes	yes	yes	yes

Table 9. The encryption computational overhead of the four schemes (unit: s).

i	2	3	4	5	6
[2]	2.966	3.665	4.308	5.085	5.801
[19]	2.501	2.938	3.366	3.908	4.404
[20]	2.141	2.366	2.619	2.952	3.324
Ours	2.061	2.238	2.429	2.701	2.912

Table 10. Comparison of the computational cost of encryption before and after clustering.

	First Cluster of Access Trees	Second Cluster of Access Trees	Third Cluster of Access Trees
Number of access trees	17	9	10
Total number of original leaf nodes	304	172	137
Total number of leaf nodes after integration	174	115	88
Sum of encryption time costs without clustering and integration (separate encryption)	14.248 s	8.053 s	6.435 s
Sum of encryption time costs with clustering and integration	8.311 s	5.443 s	4.186 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
