1. Introduction
Anonymous networks hide users’ private information, such as the network addresses of communicating entities and the communication relationships between them, within the transmitted traffic, so that attackers cannot directly learn or infer the communication relationship between two parties or the identity or location of a communicating entity [1,2]. The security of routing information is therefore an important factor in ensuring anonymous network security. In onion routing (TOR) networks, some secure and reliable servers are designated as directory servers (DSs), which provide anonymous routing information describing the current state of the routing nodes. The user equipment (UE) can request and download the address information of routing nodes over HTTPS [3].
In anonymous networks, the identity and address information of routing nodes must be protected from identification by attackers. As shown in Figure 1a, the UE in TOR networks queries the DS for the routing node (RN) list [4], and the DS returns n qualified routing nodes to the UE. The UE randomly selects m of these n routing nodes to build a multi-hop transmission path, where m, n ∈ N+ and m ≤ n. As shown in Figure 1b, in Invisible Internet Project (I2P) networks, the UE queries the network database (NetDb) for routing information. NetDb stores and searches the routing information across n floodfill nodes using the Kad algorithm [5]. NetDb returns the RouterInfo and LeaseSet to the UE, and the UE uses this routing information to establish an outbound tunnel for sending data. The outbound tunnel runs from a gateway to an endpoint, and the LeaseSet contains the gateway of the recipient’s inbound tunnel [6]. After receiving the data, the endpoint of the outbound tunnel forwards them to the gateway of the recipient’s inbound tunnel. This routing scheme is also called garlic routing [7]. The problem with both routing table query schemes is twofold: the DS or NetDb knows the set of routing nodes from which the UE can choose, and the UE learns about routing nodes held by the DS or NetDb beyond those it actually selects. Take TOR networks as an example: in response to the RN request of the UE, the DS returns n routing nodes that meet the requirements of the UE. The UE then selects m of these n routing nodes, with the result that the address information of the remaining n − m routing nodes in the DS is leaked to the UE.
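The leakage described above can be illustrated with a minimal Python sketch. The function name `select_path`, the example addresses, and the choice of n = 8 and m = 3 are assumptions made for illustration only; they are not taken from TOR or from the scheme in this paper.

```python
import random


def select_path(returned_nodes, m, rng=None):
    """Pick m of the n returned routing nodes for the path.

    Returns the chosen path and the n - m nodes whose addresses were
    disclosed to the UE although they are not used.
    """
    rng = rng or random.Random()
    if not 0 < m <= len(returned_nodes):
        raise ValueError("need 0 < m <= n")
    path = rng.sample(returned_nodes, m)
    chosen = set(path)
    disclosed_unused = [node for node in returned_nodes if node not in chosen]
    return path, disclosed_unused


# Hypothetical list of n = 8 routing node addresses returned by the DS.
nodes = [f"10.0.0.{i}" for i in range(1, 9)]
path, leaked = select_path(nodes, 3, random.Random(0))
print(len(path), "nodes used,", len(leaked), "nodes disclosed but unused")
```

Whatever the random choice, the UE always ends up knowing n − m addresses it does not need, which is the exposure the SQRT scheme aims to remove.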
Oblivious transfer (OT) is an important branch of cryptographic research. It allows data owners to offer data retrieval and computation over their own data without disclosing the data themselves, so that the data are available but invisible [8,9]. This paper proposes a secure querying scheme of the routing table (SQRT) based on oblivious transfer. The scheme ensures that the user equipment obtains only the routing node information that the directory server returns for its use, without learning about any other routing nodes in the routing table of the directory server. The directory server, in turn, learns neither the specific routing node requirements submitted by the user equipment nor which part of the node information is fed back to the user equipment, which protects the privacy of the routing table data on both sides.
At present, some oblivious transfer schemes can be applied to the malicious model. The BLAZE scheme presented in [10] combines the semi-honest model with a consistency test that compares the hash value of a share in the input stage with that in the output stage, which might lead to high communication traffic for oblivious transfer extension. A multi-party secret sharing scheme, TRIDENT, is designed in [11], where the share of a product and its hash value are generated among multiple participants in the online stage and sent to the other parties to check consistency, which leads to a linear relationship between the storage overhead and the number of participants. In [12], a route selection scheme based on connectivity, delay, and trust (CDT) is proposed to help user equipment obtain good connectivity–delay–trust performance and prevent potential attacks from malicious routing nodes. However, if the user equipment is itself a malicious attacker trying to probe the network topology, that scheme cannot guarantee the anonymity of the routing nodes, and the attacker can sniff the identity and location information of all the routing nodes. In [13], a secure route optimization scheme based on decentralized identifiers (DIDs) is provided to defend against denial-of-service attacks at the routing layer. Its defect is that the directory server knows every routing node on the transmission path of the user equipment. Therefore, it is not applicable when the directory server is honest but curious. Based on an analysis of the schemes proposed in the recent literature, Table 1 summarizes the technical advantages and disadvantages of some current routing querying schemes. This qualitative analysis shows the innovation and contribution of the SQRT scheme proposed in this paper.
The remainder of this paper is organized as follows: Section 2 describes the system model of secure query of the routing table in anonymous networks; Section 3 presents the Nk-out-of-(Nk + ∆) oblivious transfer protocol and its application in the secure query of the routing table; Section 4 analyzes the security of the SQRT scheme; Section 5 provides the simulation environment, the numerical results, and some discussions; and Section 6 concludes this paper.
2. System Model
The system model of secure query of the routing table in anonymous networks is shown in Figure 2. The workflow of the SQRT scheme proposed in this paper includes three steps. First, the user equipment UEk generates a routing node query request. Second, the directory server stores and updates the status information of the routing nodes. Finally, the directory server queries and returns the status information of the routing nodes.
2.1. User Equipment Generates Routing Node Query Request
The user equipment UEk first generates a routing node query request and represents its routing node requirements as an m-dimensional vector with one entry for each constraint i = 1, 2, …, m, where k = 1, 2, …, l and l is the number of user equipment in the network. The first entry represents the first constraint of UEk on the routing nodes, for example, the network bandwidth of the requested routing node, BWj = 10 Mbps, j = 1, 2, …, n. The second entry represents the second constraint, for example, the online time of the requested routing node, Tj ≥ 12 h. The third entry represents the third constraint, for example, the number of requested routing nodes, Nk = 3, k = 1, 2, …, l. The user equipment UEk blinds this vector with a w × m-dimensional matrix Bk, where bai ∈ {0,1} is the selection bit and i = 1, 2, …, m. In this way, the user equipment UEk generates the w × m-dimensional matrix Mk.
The private and public keys of the user equipment UEk are (sk, pk), and Nk + ∆ random public keys pkt, t = 1, …, Nk + ∆, are sampled from the public key space, where ∆ is the redundancy added on top of the Nk routing nodes required by the user equipment UEk. The security of the SQRT scheme assumes a public key encryption scheme [14] in which Nk + ∆ random public keys can be sampled without obtaining the corresponding private keys, and the scheme is secured under the semi-honest attack model [15]. The DS sees only the Nk + ∆ public keys sent by the user equipment UEk; it can neither predict the corresponding private keys nor tell for which public key UEk holds the private key.
The user equipment UEk sends a routing node query request to the DS, including the blinded Mk and pkt, where t = 1, …, Nk + ∆.
2.2. The Directory Server Stores and Updates the Status Information of the Routing Nodes
The DS dynamically collects the status information of the routing nodes through the network heartbeat mechanism [16,17]. The DS regularly sends heartbeat detection packets to each routing node RNj, where j = 1, 2, …, n and n is the present number of routing nodes in the network, and waits for the responses. If no response is received from a routing node within a certain time, that routing node is considered to have gone offline, and the DS performs the operation n = n − 1. The heartbeat response packet returned by routing node j to the directory server contains the real-time network bandwidth BWj of that routing node. Based on the heartbeat response packets returned by routing node j, the DS also counts the online time Tj of the routing node up to the current time. If the DS detects a newly added routing node or one that has come back online, it collects the real-time network bandwidth and online time of that routing node by the same process and executes the operation n = n + 1.
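The heartbeat bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class and method names (`DirectoryServer`, `on_heartbeat`, `expire`), the explicit `now` timestamps, and the 30 s timeout are assumptions, not details given in the paper.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    bandwidth_mbps: float  # real-time bandwidth BW_j from the last heartbeat
    online_since: float    # time the node was first seen; T_j is derived from it
    last_seen: float       # time of the most recent heartbeat response


class DirectoryServer:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.db = {}  # routing table database: node id -> NodeStatus

    @property
    def n(self):
        """Current number of routing nodes in the routing table."""
        return len(self.db)

    def on_heartbeat(self, node_id, bandwidth_mbps, now):
        """Record a heartbeat response; a new or returning node raises n by 1."""
        status = self.db.get(node_id)
        if status is None:
            self.db[node_id] = NodeStatus(bandwidth_mbps, now, now)
        else:
            status.bandwidth_mbps = bandwidth_mbps
            status.last_seen = now

    def expire(self, now):
        """Drop nodes that have not answered within the timeout (n = n - 1 each)."""
        offline = [nid for nid, s in self.db.items()
                   if now - s.last_seen > self.timeout_s]
        for nid in offline:
            del self.db[nid]
        return offline

    def online_time(self, node_id, now):
        """Online time T_j of a node up to the current time."""
        return now - self.db[node_id].online_since


ds = DirectoryServer(timeout_s=30.0)
ds.on_heartbeat("RN1", 10.0, now=0.0)
ds.on_heartbeat("RN2", 20.0, now=0.0)
ds.on_heartbeat("RN1", 12.0, now=25.0)
print(ds.expire(now=40.0), ds.n, ds.online_time("RN1", now=40.0))
```

Passing `now` explicitly instead of reading the system clock keeps the sketch deterministic; a deployed directory server would take timestamps from its own clock.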
The DS can establish a routing table database based on the routing node data collected through the heartbeat mechanism [18], which is expressed as DB = {dj}, j = 1, 2, …, n, where n is the number of nodes in the routing table and dj is the state information set of routing node j. Each dj consists of m entries, one for each type of status information i = 1, 2, …, m. For example, the first entry of dj represents the first type of status information of node j, such as the network bandwidth BWj, and the second entry represents the second type of status information of node j, such as the online time Tj. The DS stores and updates the status information set dj of each routing node.
2.3. The Directory Server Queries and Returns the Status Information of the Routing Node
After obtaining the
Mk of the user equipment
UEk, the DS calculates the
Sij for the routing node
j,
j = 1, 2, …,
n,
where
u and
v represent different routing nodes, respectively, i.e.,
u,
v = 1, 2, …,
n, and
u ≠
v.
The DS returns the routing node information
to the user equipment
UEk, which can be obtained according to the received
This paper queries the routing nodes based on the distance between qi and dj. The flowchart in Figure 3 shows how the DS selects, from the n routing nodes, the Nk + ∆ nodes that meet the UEk requirements. The specific steps are given in Algorithm 1. Using Algorithm 1, the DS takes the elements of the routing request set of the user equipment UEk, compares them with the routing node status information in the routing table database, and finds the Nk + ∆ routing nodes nearest to the request. In this way, Algorithm 1 describes how the directory server feeds back the Nk + ∆ routing nodes that meet the routing node request of the user equipment.
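The nearest-node selection can be sketched as follows. The sketch makes two assumptions that the text does not fix: it uses the Euclidean distance between the request vector and each status vector, and it works on an unblinded request, so it shows only the ranking logic and not the privacy-preserving comparison. The names `distance` and `nearest_nodes` and the example values are hypothetical.

```python
import heapq
import math


def distance(request, status):
    """Euclidean distance between a request vector and a status vector (assumed metric)."""
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(request, status)))


def nearest_nodes(request, db, count):
    """Return the ids of the `count` routing nodes whose status is nearest to the request.

    `db` maps a node id to its status tuple d_j; `count` plays the role of N_k + delta.
    The result is ordered from nearest to farthest.
    """
    return heapq.nsmallest(count, db, key=lambda j: distance(request, db[j]))


# Hypothetical request: bandwidth 10 Mbps, online time 12 h; N_k = 2, delta = 1.
request = (10.0, 12.0)
db = {
    "RN1": (10.0, 12.0),
    "RN2": (8.0, 12.0),
    "RN3": (50.0, 1.0),
    "RN4": (10.0, 15.0),
}
print(nearest_nodes(request, db, count=3))
```

Using `heapq.nsmallest` avoids sorting the whole routing table when Nk + ∆ is much smaller than n, which is one possible design choice; a full sort as in the queue-based description gives the same result.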
3. SQRT Scheme
The secure querying of the routing table involves two participants, the DS and UEk. They want to jointly compute f(qi,(d1, d2, …, dm)) = rz, where rz is the satisfying routing node obtained by comparison, and UEk obtains the output rz of the function.
The process by which the user equipment UEk securely queries the routing table from the DS can be abstracted into an Nk-out-of-(Nk + ∆) oblivious transfer model. In a public key encryption scheme [19], there is an encryption function E, a decryption function D, a public key Keyp, and a private key Keys. The key length satisfies |Keyp| = |Keys| = K, and decryption inverts encryption, i.e., String = D(E(String, Keyp), Keys). In any secure public key encryption scheme, it is computationally infeasible to derive the corresponding private key from a given public key. Algorithm 1 shows the process of secure querying for the routing table.
Algorithm 1. The algorithm of secure querying for the routing table
Input: qi sent by user equipment UEk, and the status information dj stored in the directory server
Output: the IP addresses of the routing nodes RNj, where the number of routing nodes is Nk + ∆
Ini_Function(qi, dj) // Update the routing node requirements of UEk and the routing node status information of the DS as the initial conditions
for j = 1:n
    for i = 1:m
        // The DS calculates the distance between the request and the status information of node j
        if node j is nearer to the request than node j + 1 // Compare the distances of the routing nodes; u and v represent different routing nodes
            Queue_Fun(dj, dj+1) // The routing table database DB is sorted by distance, with the nearest node at the front
        else
            Queue_Fun(dj+1, dj)
        end if
    end for
end for
Queue_Fun(DB) → RN1, …, RN(Nk+∆) // The first Nk + ∆ of the n routing nodes in the routing table database
End
For the 1-out-of-2 oblivious transfer scheme [20], the input of the DS is two strings s0, s1 ∈ {0,1}^k, and the input of UEk is a choice bit c ∈ {0,1}. Figure 4 shows how the 1-out-of-2 oblivious transfer is used to construct the Nk-out-of-(Nk + ∆) oblivious transfer, that is, how Nk routing nodes are selected from the Nk + ∆ routing nodes in the oblivious transfer model.
1. The DS has inputs M1, M2, …, Mn ∈ {0,1}^k, selects a random string C ∈ {0,1}^k, and sends C to UEk;
2. UEk has different inputs c1, c2, …, cNk ∈ {1, 2, …, Nk + ∆}, constructs a pair consisting of a public key Keypc and a private key Keysc, and then calculates the other public key Keyp,1−c according to Keyp,1−c ⊕ Keypc = C, where the key lengths satisfy |Keypc| = |Keyp,1−c| = K;
3. UEk sends Keyp,0 and Keyp,1 to the DS. The DS verifies whether Keyp,0 ⊕ Keyp,1 = C. If not, the execution is rejected;
4. The DS sends E(s0, Keyp,0) and E(s1, Keyp,1) to UEk;
5. UEk uses its private key Keysc to recover sc, i.e., sc = D(E(sc, Keypc), Keysc); (4)
6. UEk and the DS perform Nk rounds of this oblivious transfer interaction. Each time the oblivious transfer protocol is executed, UEk obtains the address information of one routing node j, j = 1, 2, …, n. After Nk executions, UEk has obtained the Nk messages Mt, t ∈ {c1, c2, …, cNk}.
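The message flow of a single 1-out-of-2 round (steps 1 to 5 above) can be sketched in Python with a toy public key scheme. Everything cryptographic here is an assumption made for illustration: a hashed ElGamal variant over a small Mersenne prime stands in for the generic public key encryption scheme, the parameters are not secure, and the function names are hypothetical. The sketch only shows that the receiver recovers sc while holding a private key for just one of the two public keys.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy modulus (a Mersenne prime); far too small for real use
G = 3           # fixed base, chosen for illustration only


def keygen():
    """Return a (public key, private key) pair of the toy scheme."""
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk


def _pad(shared, length):
    # Derive a one-time pad from the shared group element (messages up to 32 bytes).
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:length]


def encrypt(msg, pk):
    r = secrets.randbelow(P - 2) + 1
    pad = _pad(pow(pk, r, P), len(msg))
    return pow(G, r, P), bytes(a ^ b for a, b in zip(msg, pad))


def decrypt(ct, sk):
    c1, body = ct
    pad = _pad(pow(c1, sk, P), len(body))
    return bytes(a ^ b for a, b in zip(body, pad))


def receiver_keys(c, C):
    """Step 2: build a real key pair for choice c and derive the other key as Key_c XOR C."""
    while True:
        pk_c, sk_c = keygen()
        pk_other = pk_c ^ C
        if 0 < pk_other < P:  # retry in the rare case the XOR leaves the valid range
            break
    pks = [0, 0]
    pks[c], pks[1 - c] = pk_c, pk_other
    return pks, sk_c


def sender_respond(s0, s1, pks, C):
    """Steps 3 and 4: check the XOR relation, then encrypt each string under its key."""
    if pks[0] ^ pks[1] != C:
        raise ValueError("public keys do not XOR to C; execution rejected")
    return encrypt(s0, pks[0]), encrypt(s1, pks[1])


def one_out_of_two_ot(s0, s1, c):
    C = secrets.randbelow(P - 1) + 1         # step 1: sender's random string
    pks, sk_c = receiver_keys(c, C)          # step 2
    cts = sender_respond(s0, s1, pks, C)     # steps 3 and 4
    return decrypt(cts[c], sk_c)             # step 5: only s_c can be recovered


print(one_out_of_two_ot(b"10.0.0.1", b"10.0.0.2", 1))
```

In this toy setting the receiver has no private key for the derived public key, so it cannot open the other ciphertext, and the sender sees two public keys without learning which one was generated honestly. Repeating the round Nk times, as in step 6, gives the receiver Nk routing node addresses.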
If both the DS and UEk are honest [21,22], UEk can obtain sc through (4), i.e., UEk can obtain the Nk messages Mt, t ∈ {c1, c2, …, cNk}. The DS, for its part, obtains during the interaction only the two strings Keyp,0 and Keyp,1 and their sum C, and it cannot determine the exact value of c. UEk can only construct the public and private keys in the order Keypc, Keysc → Keyp,1−c. According to the requirements of a public key encryption scheme, UEk cannot calculate Keys,1−c from Keyp,1−c. Thus, the Nk-out-of-(Nk + ∆) oblivious transfer scheme can be constructed from a general public key encryption scheme.
The DS selects Nk public keys pkt, t = 1, 2, …, Nk, from the Nk + ∆ random public keys sent by UEk. In addition, ∆ public keys pkt, t = 1, 2, …, ∆, are randomly generated. The DS uses these keys to encrypt the routing node information and returns the encrypted information of the Nk + ∆ routing nodes, (e0, e1, …, eNk+∆−1), to UEk. After receiving these ciphertexts, UEk decrypts them with sk and obtains only the address information of Nk routing nodes.
4. Security Analysis
Under the semi-honest attack model [23], if the Nk-out-of-(Nk + ∆) oblivious transfer is secure, then the SQRT scheme is secure. The proof is as follows.
Proof. Let the protocol with formal security proof be Π [
24]. Note that during the implementation of the protocol, the two views of
UEk and DS are
, including four parts of messages: {secret input, random number, messages transmitted from the other party, and output}. Based on the two views,
UEk and
DS can obtain the views as follows, respectively,
□
By the definition of security [25], it suffices to construct probabilistic polynomial-time algorithms S1 and S2 [26] that simulate the views of P1 (UEk) and P2 (the DS) given only their respective inputs and outputs, (qi, rz) for P1 and ({d1, d2, …, dm}, λ) for P2, i.e., {S1(qi, rz)} and {S2({d1, d2, …, dm}, λ)}, such that the simulated views are computationally indistinguishable from the views obtained during the real protocol execution.
Therefore, we need only to construct S1 and S2. Next, we will prove the security of the SQRT scheme in two cases.
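For reference, the standard simulation-based requirement for a two-party protocol Π in the semi-honest model can be written as below. The view symbols and the indistinguishability sign are notation assumed here; the paper's own display may use different symbols.

```latex
\{ S_1(q_i, r_z) \} \overset{c}{\equiv}
  \{ \mathrm{view}_1^{\Pi}(q_i, \{d_1, d_2, \ldots, d_m\}) \},
\qquad
\{ S_2(\{d_1, d_2, \ldots, d_m\}, \lambda) \} \overset{c}{\equiv}
  \{ \mathrm{view}_2^{\Pi}(q_i, \{d_1, d_2, \ldots, d_m\}) \}
```

Each simulator receives only one party's input and output, so anything that party sees during a real execution could have been produced from what it is entitled to know anyway.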
4.1. The DS Is Semi-Honest
If the DS is semi-honest, it is only necessary to construct S2, given the input {di}, i = 1, …, m, and the output λ of P2. In this case, the corresponding view is given by (6). Because the SQRT scheme is based on the security of the Nk-out-of-(Nk + ∆) oblivious transfer, the latter can be regarded as a black box [27], for which there exists an algorithm that, given the input {ri}, i = 1, …, m, and the output λ, produces the view of the DS during execution. This simulated view is computationally indistinguishable from the real view.
4.2. The UEk Is Semi-Honest
As in the previous case, only S1 needs to be constructed, given the input qi and the output rz of P1; the corresponding view is given by (5). Similarly, because the SQRT scheme is based on the security of the Nk-out-of-(Nk + ∆) oblivious transfer, there exists an algorithm that, given the input z and the output rz, produces the view of UEk executing the oblivious transfer.
The view of the output of
S1 is represented as
Since
sa and
Bk are randomly generated and satisfy
,
i = 1, 2, …,
m,
sa and
Bk can be obtained from the view
. Because of their randomness, we can ascertain that (
sa,
Bk) and (
,
) are indistinguishable. Similarly,
and
are computationally indistinguishable [
28].
Thus, combined with (9), we can obtain
Whereas we can deduce that
Therefore, the security of the SQRT scheme can be proven under the semi-honest model.
5. Experimental Results
The directory server used in this experiment has an Intel 4214R processor, 256 GB of memory, and a 25 Gbps network port, and runs CentOS Linux release 7.6. The user equipment has a W-2145 8-core CPU and 32 GB of memory. To obtain the packets of routing querying requests and responses, we created a packet capture module on the router in the local network. To mark the network traffic more efficiently, the method of capturing offline data needed to be improved. The experimental method was to run the routing table query operation in full buffer mode in a virtual machine, i.e., to continuously generate the data packets of routing querying requests and responses in the anonymous networks. During the experimental test, the data packets were captured at the router in the local network with Wireshark, TCPDump, and other packet capture tools.
Table 2 compares the security model and performance of the SQRT scheme proposed in this paper with those of the existing schemes. The security models are divided into the malicious model and the semi-honest model, and the network performance covers the running time of each scheme and the communication traffic it generates. The BLAZE and TRIDENT schemes realize multiplication under the malicious model; therefore, the communication traffic generated by these two schemes is relatively large, with TRIDENT generating less traffic than BLAZE. For the semi-honest model, the SQRT scheme adopts the 3-out-of-10 oblivious transfer model. Compared with onion routing and garlic routing, it reduces the running time by 16.9% and 34.5%, respectively, and the traffic by 19.6% and 46.5%, respectively.
Figure 5 shows the running time of the SQRT scheme when Nk and ∆ take different values. When analyzing the time overhead of the cryptographic algorithms, we focused on the public key algorithms, which involve heavy computation and long running times, and ignored the symmetric algorithms, whose overhead is small. Let Nk = 1, …, 10 and ∆ = 1, …, 10. As Nk and ∆ increase, the running time of the SQRT scheme increases accordingly; in particular, when Nk ≥ 5 and ∆ ≥ 4, the running time exceeds 500 ms. When Nk = 10 and ∆ = 10, so that the SQRT scheme adopts the more complex 10-out-of-20 oblivious transfer model, the running time reaches 714.5 ms. The secure querying of the routing table can therefore still be completed in less than 1 s. In this case, the user equipment obtains only the information of the 10 routing nodes it selected and learns nothing about the 10 additional routing nodes fed back by the directory server. The directory server, in turn, knows neither the specific routing node requirements submitted by the user equipment nor which 10 of the 20 returned nodes the user equipment has selected.
Figure 6 shows the communication traffic of the SQRT scheme when Nk and ∆ take different values. As in Figure 5, the communication traffic increases as Nk and ∆ increase, and it is affected more by changes in Nk than by changes in ∆. When Nk is 1, the traffic is generally less than 100 KB. When Nk is 10, the traffic is generally less than 400 KB. The experimental results show that the secure querying scheme of the routing table based on oblivious transfer has low computational and communication overhead and is suitable for applications requiring high querying efficiency and security. Therefore, the SQRT scheme has good security and availability.
Figure 7 compares the degree of anonymity of the SQRT scheme with that of the existing schemes. The degree of anonymity measures the extent to which the identity or address information of the routing nodes in the anonymous networks is not recognized by the attacker. In [29], the degree of anonymity of anonymous networks is defined as D = log2 h / log2 n, where h is the number of routing nodes whose private information is sniffed by the attacker, and n is the total number of routing nodes in the anonymous networks. Therefore, the smaller the value of D, the better the privacy and security protection provided by the secure querying scheme of the routing table. As the number of routing nodes in the anonymous networks increases, the degree of anonymity of every scheme declines, and the decline gradually slows. The degree of anonymity of the SQRT scheme is much lower than that of the other existing schemes, because the SQRT scheme effectively ensures that both the user equipment and the directory server faithfully follow the routing querying protocol and protects the private information of both parties to the greatest extent. The experimental results thus illustrate the security advantages of the SQRT scheme over the existing schemes.
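The definition D = log2 h / log2 n can be evaluated directly. The short Python sketch below assumes 1 ≤ h ≤ n and n ≥ 2 so that both logarithms are defined and the denominator is non-zero; the function name and the sample values are illustrative only and are not measurements from the paper.

```python
import math


def degree_of_anonymity(h, n):
    """D = log2(h) / log2(n), with h sniffed nodes out of n routing nodes in total."""
    if n < 2 or not 1 <= h <= n:
        raise ValueError("require n >= 2 and 1 <= h <= n")
    return math.log2(h) / math.log2(n)


# Illustrative values: with n = 16 nodes, exposing 4 of them gives D = 0.5,
# while exposing all 16 gives D = 1.0.
print(degree_of_anonymity(4, 16), degree_of_anonymity(16, 16))
```

Because h appears only in the numerator, a scheme that exposes fewer routing nodes for the same network size n always yields a smaller D.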
6. Conclusions
The SQRT scheme proposed in the paper can effectively ensure that both user equipment and the directory server faithfully follow the routing querying protocol and protect the privacy information of both parties to the greatest extent. Compared with the existing schemes, the SQRT scheme proposed in the paper has obvious performance advantages in the degree of anonymity, running time, and communication traffic. Specifically, the SQRT scheme has the following four advantages:
(1) The first is correctness. If the directory server and the user equipment abide by the protocol, the user equipment obtains the routing node information it needs after the scheme has been executed.
(2) The second is the confidentiality of the information of the other routing nodes in the directory server. After the scheme has been executed, the user equipment cannot obtain the information of any routing nodes other than those it needs.
(3) The third is the confidentiality of the routing node information obtained by the user equipment. After the scheme has been executed, the directory server knows neither the specific routing node requirements submitted by the user equipment nor which part of the routing node information the user equipment has obtained.
(4) The fourth is the efficiency of the SQRT scheme in operation. For the semi-honest model, the SQRT scheme not only ensures correctness and confidentiality but also reduces the running time and traffic compared with the other existing schemes. It is expected to provide support and reference for the design and planning of future secure communication systems.
This paper also has some limitations, mainly in the security analysis. It carried out only a theoretical analysis of the main possible threats; the analysis lacks a rigorous mathematical basis, and no provable security method was used to obtain a more reliable result. Therefore, the security analysis can be improved in future research.