An On-Demand Fault-Tolerant Routing Strategy for Secure Key Distribution Network

Wu, Zhiwei; Deng, Haojiang; Li, Yang

doi:10.3390/electronics13030525


by Zhiwei Wu ¹,², Haojiang Deng ¹,² and Yang Li ¹,²,*

¹ National Network New Media Engineering Research Center, Institute of Acoustics, Chinese Academy of Sciences, No. 21, North Fourth Ring Road, Haidian District, Beijing 100190, China

² School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, No. 19(A), Yuquan Road, Shijingshan District, Beijing 100049, China

* Author to whom correspondence should be addressed.

Electronics 2024, 13(3), 525; https://doi.org/10.3390/electronics13030525

Submission received: 28 December 2023 / Revised: 25 January 2024 / Accepted: 26 January 2024 / Published: 27 January 2024

(This article belongs to the Section Networks)


Abstract:

The point-to-point key distribution technology based on twinning semiconductor superlattice devices can provide high-speed secure symmetric keys, suitable for scenarios with high security requirements such as the one-time pad cipher. However, deploying these devices and scaling them in complex scenarios, such as many-to-many communication, poses challenges. To address this, an effective solution is to build a secure key distribution network for communication by selecting trusted relays and deploying such devices between them. The larger the network, the higher the likelihood of relay node failure or attack, which can impact key distribution efficiency and potentially result in communication key leakage. To deal with the above challenges, this paper proposes an on-demand fault-tolerant routing strategy based on the secure key distribution network to improve the fault tolerance of the network while ensuring scalability and availability. The strategy selects the path with better local key status through a fault-free on-demand path discovery mechanism. To improve the reliability of the communication key, we integrate an acknowledgment-based fault detection mechanism in the communication key distribution process to locate the fault, and then identify the cause of the fault based on the Dempster–Shafer evidence theory. The identified fault is then isolated through subsequent path discovery and the key status is transferred. Simulation results demonstrate that the proposed method outperforms OSPF, adaptive stochastic routing, and the multi-path communication scheme, achieving on average a 20% higher packet delivery ratio and a lower corrupted key ratio, thus highlighting its reliability. Additionally, the proposed solution exhibits a relatively low local key overhead, indicating its practical value.

Keywords:

secure key distribution; routing; fault tolerance; DS evidence theory; reliability

1. Introduction

The distribution of a secure symmetric key is essential to ensure the confidentiality of data transmission in secure communication. General symmetric key distribution methods such as the Diffie–Hellman algorithm are convenient to deploy [1]. However, the emergence of quantum computing [2] reduces the complexity of discrete logarithm problems and increases the risk of key cracking. Quantum Key Distribution (QKD) ensures the security of distributed keys through the fundamental principles of quantum mechanics [3]. However, this method requires a dedicated optical channel, and its key generation rate decays exponentially with distance [4]. A Secure Key Distribution (SKD) system based on Semiconductor Superlattice (SSL) Physical Unclonable Functions (PUF) synchronizes, over a public channel, the random numbers generated by the spatially separated chaotic systems of the two parties [5,6]. The random information output by an SSL device is a complex nonlinear function of the driving signal, and two twinning SSL devices from the same wafer are strongly correlated but are difficult to clone on a different wafer at a later time, even if the same process is adopted [7]. These properties not only ensure the security of distributed keys [8], but also avoid the deployment of dedicated channels, so the devices can be used on high-mobility terminals such as satellites [9].

To enable more users to communicate securely through a one-time pad cipher, each pair of users needs to install a pair of twinning SSL devices. However, in large-scale networks, a single node becomes a communication and computing bottleneck because it must handle tasks such as fuzzy extraction [6]. At the same time, the high manufacturing cost of SSL devices limits the deployment of a large number of devices [9]. Therefore, this kind of PUF technology scales poorly to complex network application scenarios.

By organizing point-to-point SKD systems into a Secure Key Distribution Network (SKDN), the application scope of the SKD system can be expanded. The basic functional requirement of an SKDN is to provide a secure symmetric key distribution service between any two nodes using the limited existing resources. In QKD, one implementation uses quantum repeaters, which employ quantum entanglement of photons to communicate over different optical channels; with such repeaters, the network nodes need not be trusted to ensure unconditional security during QKD operations [10]. Due to the limitations of the underlying physical mechanism, these methods are only suitable for QKD. An SKDN based on trusted relays is flexible and highly scalable. Secure key distribution among connected nodes can be completed by hop-by-hop forwarding over a public channel [11]. Moreover, it allows nodes to deploy other PUF devices for heterogeneous networking [12].

Compared with classical routing [13], the design goals of the routing strategy in SKDN have changed. During the key forwarding process, the routing strategy is responsible for finding a path with sufficient local key resources to encrypt the forwarded key, so that the key cannot be eavesdropped on in transit. In addition, the trusted relays on the selected path must not leak the key. The increase in the scale of SKDN, however, leads to a higher probability of network layer faults [14]. The term fault refers to disruptions that can significantly impact key distribution performance. The causes may be benign, such as node movement, fail-stop caused by bugs, shortage of local key resources, or congestion caused by bursty traffic. They may also be malicious. For example, a trusted relay that appears to participate normally in the routing process may exhibit Byzantine behavior such as tampering with, forging, or selectively discarding packets. In addition, a faulty node may tamper with routing and interfere with other key distribution processes, reducing key distribution security [15]. Improving the fault tolerance of routing protocols helps ensure normal network functions and, at the same time, alleviates the urgent need for fault tolerance in the underlying software and hardware [16]. Since SSL-based SKDN has no dedicated channel restrictions, the locations of nodes deployed with matched devices are highly decentralized and dynamic [9]. A well-designed distributed routing strategy can reduce system operation and maintenance costs.

Existing related research primarily focuses on reducing the possibility of faulty nodes passively eavesdropping on keys [15]. For example, random routing is used to avoid a path on which a faulty node is located, or multi-path key distribution is used so that at least one of the multiple paths contains no faulty node and the key is not leaked [10]. The key generation rate of the current point-to-point SSL-based technique, however, is only in the range of megabits per second [5]. These methods improve key confidentiality at the expense of consuming a large amount of local keys, which weakens availability. Additionally, they do not optimize for the performance issues that arise under the broader fault scenarios described above. Considering these issues, this paper proposes a practical routing strategy for SKDN. By jointly designing the path discovery and fault detection mechanisms of the routing strategy, together with the way they cooperate, we ensure that the key distribution system retains a degree of fault tolerance under network layer faults. The main contributions of this paper are summarized as follows:

An on-demand path discovery mechanism in SKDN is proposed. Fault-free path discovery is performed when necessary, and appropriate paths are selected based on local key status to reduce control message propagation overhead and improve key resource utilization.
A fault handling method in the communication key distribution stage is proposed. The acknowledgment-based fault detection mechanism first locates the fault. Then, based on the Dempster–Shafer (DS) evidence theory, the mass function of the weighted observation evidence is calculated to identify possible causes. A new round of path discovery can then isolate the cause of the fault and transfer the key status, improving the ability to handle exceptions.
The effectiveness of the proposed solution under different node scales and different proportions of faulty nodes is evaluated through simulation. Simulation results show that the proposed solution improves metrics such as the packet delivery ratio and the corrupted key ratio, which indicates that the strategy has practical value.

The rest of this paper is organized as follows. Section 2 summarizes the related research. The system and problem models are introduced in Section 3. Section 4 describes the proposed solution. Section 5 presents the simulation results and discussion. The paper is concluded in Section 6.

2. Related Work

In the secure key distribution networks based on the trusted relay assumption, existing work on the design of the routing mechanism mainly considers the performance and security of secure key distribution.

In terms of network performance considerations, early testbeds primarily made modifications based on the Internet routing protocol [11]. The Defense Advanced Research Projects Agency uses modified OSPF, where link cost is evaluated based on the number of generated keys distributed on a link within the Link State Announcement update interval, but the current load problem is ignored [4,17]. In the European project, SECOQC [18] evaluates the link workload by calculating the actual key forwarding rate, but does not provide a method for calculating the link cost. Some studies optimize key utilization based on the number of local keys of the relay and minimize the number of path hops [19]. The common feature among the above methods is the utilization of periodic path information exchange, but when the number of SKDN requests increases, on-demand probing provides lower overhead and more timely link status information [20]. In addition to processing messages according to priority in different message queues on nodes to improve the quality of service, research [21] has found that the similarity in resource distribution with Mobile Ad Hoc Networks (MANETs) provides a reference for SKDN design, and proposes on-demand routing by calculating geographical distance and link status, which reduces the number of routing control packets. Research [22] has found that the OLSR algorithm can be improved through interaction with the status of the local key, but the hop count has not been optimized, which also affects system performance; in addition, software-defined networks can be used to collect topology information and complete path calculation [23] based on application requirements [24].

Some SKD technologies, such as QKD, can prevent link eavesdropping. Attackers can exploit this to carry out denial-of-service attacks [25], but the aforementioned routing mechanisms can be used to select alternative paths if a trusted relay node is available. However, when the assumption of trusted relay nodes partially fails, the security risk of active attacks such as traffic redirection [26] increases.

Research [13] has pointed out that routing mechanisms in existing work pay little attention to key protection. Methods for improving the security of forwarded keys can be roughly divided into two types. The first realizes key distribution in an untrusted environment through multiple paths [27], which can prevent eavesdropping by malicious nodes. However, it incurs a huge overhead of key resources when the environment is in fact trustworthy, and it cannot prevent system paralysis caused by faulty nodes [15]. The second uses a stochastic routing mechanism so that key forwarding does not rely on a single path, avoiding possibly risky relay nodes [28]. Although this method can, to a certain extent, prevent active attacks from making the system unavailable, it exhibits limited performance improvement in the presence of faulty nodes [29]. It also lacks a mechanism for switching between trusted and untrusted environments in order to reduce system overhead.

3. System and Problem Model

In this section, we introduce the considered system as well as the problem model.

3.1. System Model

SKDN consists of SKD nodes, twinning SSL devices, and public channels; it extends the service scope of point-to-point secure key distribution and completes secure key distribution between any SKD nodes.

In the layered architecture of the SKDN, the link layer performs real-time point-to-point secure key distribution; the service layer receives the user’s confidential communication request, waits for the network layer to provide the communication key, and then returns it to the user. The network layer receives requests from the service layer, while the communication key is reconstructed through the local key generated by the link layer and returned to the service layer. The SKDN model is shown in Figure 1.

The link layer is composed of SKD nodes and logical links. A logical link consists of SKD nodes and the public channels connecting them. Each SSL device on a node matches an SSL device on another node within the SKDN. The link layer performs point-to-point secure key distribution between nodes containing twinning SSL devices based on the logical link. One node sends the driving signal to the other node through the logical link. Both parties convert the signal from digital to analog and then input it into the SSL device to generate a local pre-key. Twinning SSL devices produced on the same wafer have similar input and output characteristics [9], but it is still necessary to exchange auxiliary information derived from the local pre-key on the logical link, and then use the fuzzy extractor to perform information harmonization and privacy amplification to generate an unconditionally secure and consistent local key [5]. The locally generated key is cached in memory and used to establish the communication key within the SKD network layer.

The network layer is responsible for the control and management of local key resources and the execution of routing strategies for generating communication keys. The routing strategy is divided into two steps: path discovery and communication key distribution. Path discovery determines the set of relays used for subsequent communication key distribution. The selected trusted relay needs to cache enough local key resources to support the forwarding of communication key. The communication key distribution process is shown in Figure 2.

Assume that a path including A, B, C, and D is selected through the path discovery mechanism, and A generates a certain random number. Then the random number and the local key generated by A and the next hop B can be XORed to generate ciphertext. The ciphertext is forwarded to the next hop on the public channel. After decryption, the next hop continues to forward according to the rules until D obtains the random number. Since the key is sent in the form of ciphertext through a one-time pad on the classic channel, the information cannot be obtained by intercepting the ciphertext. Furthermore, assuming the relay is trustworthy, it can be proved that the communication key represented by the random number is unconditionally secure [3].
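The hop-by-hop forwarding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`xor_bytes`, `relay_key`) are hypothetical, and the per-link local keys are drawn from `secrets.token_bytes` as a stand-in for the keys that the link layer would actually supply.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """One-time pad step: XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pad and message must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def relay_key(path, key_len=16):
    """Forward a random communication key hop by hop along `path`.

    Returns the key generated at the source, the key recovered at the
    destination, and every ciphertext sent over the public channel.
    """
    links = list(zip(path, path[1:]))
    # Assumption: one fresh local key per logical link (stand-in for link-layer output).
    local_keys = {link: secrets.token_bytes(key_len) for link in links}

    comm_key = secrets.token_bytes(key_len)  # random number generated by the source
    plaintext = comm_key
    ciphertexts = []
    for link in links:
        pad = local_keys[link]
        ciphertext = xor_bytes(plaintext, pad)  # sender encrypts for the next hop
        ciphertexts.append(ciphertext)
        plaintext = xor_bytes(ciphertext, pad)  # next hop decrypts with the shared pad
    return comm_key, plaintext, ciphertexts


source_key, dest_key, sent = relay_key(["A", "B", "C", "D"])
```

Each relay sees the communication key in plaintext after decryption, which is why the security argument depends on the relays being trustworthy.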

The service layer obtains the communication key generated by the network layer. Therefore, SKDN provides a secure symmetric key distribution service for the application data network, which is of great significance to ensuring long-term security of many application data. There are existing symmetric key distribution algorithms on the protocol stack, such as IPSec, VPN, TLS, etc., to strengthen the security of user data [10]; as a basic module, it is also suitable for many application scenarios [9,30], such as the information-centric network, Internet of Things, Space–Air–Ground Integrated Network, etc.

3.2. Problem Model

We define a network G = (V, E), and denote F \subseteq V ∖ {v_{s}, v_{d}} as the set of faulty nodes, where {v_{s}, v_{d}} \subseteq V is the source and destination pair. First, we formally analyze the changes in the reliability of the key distribution system under fault scenarios, and describe the conditions that the routing strategy needs to meet to complete key distribution for secure communication. When the value of the binary structure function ϕ (x) of order | V | - 2 describing G is equal to 1, secure communication key distribution can be completed; when the value is equal to 0, the distribution fails. When the binary state x_{i} in the binary state vector x = (x_{1}, x_{2}, \dots, x_{| V | - 2}) is equal to 1, the corresponding relay component v_{i} \in V ∖ {v_{s}, v_{d}} belongs to F; otherwise it is equal to 0. The elements γ_{i} (x) \in Γ contained in the minimal path sets Γ derived from any ϕ (x) satisfy the following formula [31]:

ϕ (x) = 1 - \prod_{i = 1}^{| Γ |} (1 - γ_{i} (x))

(1)

where γ_{i} (x) is the product of the binary states corresponding to the set of components. Therefore, as long as ϕ (x) is equal to 1, at least one γ_{i} (x) is equal to 1, which means that the set of components that constitute this γ_{i} (x) does not belong to the set F. To this end, we hope to design a feasible and robust routing strategy, which first needs to discover such a γ_{i} (x), and second distributes the secure communication keys through a single path composed of the corresponding components.
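As a small worked illustration of Equation (1), the following Python sketch evaluates the structure function over a set of minimal paths. It assumes the usual reliability convention in which a state of 1 marks a relay that is usable (fault-free), so that the product γ_{i} (x) equals 1 exactly when every relay on path i is usable. The relay names and the two-path topology are hypothetical.

```python
from math import prod


def structure_function(x, minimal_paths):
    """Evaluate phi(x) = 1 - prod_i (1 - gamma_i(x)) over the minimal path sets.

    `x` maps each relay to a binary state. Assumption in this sketch:
    1 means the relay is fault-free, 0 means it is faulty.
    """
    gammas = [prod(x[v] for v in path) for path in minimal_paths]
    return 1 - prod(1 - g for g in gammas)


# Hypothetical topology: two relay paths between source and destination.
minimal_paths = [("r1", "r2"), ("r3",)]

all_ok = {"r1": 1, "r2": 1, "r3": 1}
r3_faulty = {"r1": 1, "r2": 1, "r3": 0}
both_paths_broken = {"r1": 0, "r2": 1, "r3": 0}

phi_all_ok = structure_function(all_ok, minimal_paths)
phi_r3_faulty = structure_function(r3_faulty, minimal_paths)
phi_broken = structure_function(both_paths_broken, minimal_paths)
```

Key distribution remains possible as long as one minimal path is entirely fault-free, which is the property the routing strategy tries to exploit.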

Next, we describe the specific scenarios in which the binary state x_{i} is equal to 1; that is, we introduce the network layer faults covered by the proposed solution according to the stage in which the fault occurs. In the path discovery phase, if the relay f \in F loses routing packets passing through it for benign reasons such as scarce key material [21] or mobility [9], the establishment of the key distribution path may fail and a new round of path discovery must be performed [32]. This increases the path discovery overhead and adds delay. If f is controlled by an attacker, the routing control information may be tampered with, forged, or discarded, thereby affecting the establishment of the actual key distribution path. The path may also be redirected through the node to facilitate eavesdropping on subsequent communication keys [15]. In the communication key distribution phase, the same benign reasons can cause communication key message forwarding to fail. However, when f exhibits Byzantine behavior, the communication key may be discarded when passing through the node (a black hole attack), delayed, or forwarded to a suboptimal exit [32]. In addition, the communication key may easily be leaked, thereby reducing the success rate and security of key distribution. In a real-world setting, due to the embedded nature of semiconductor superlattice devices, SKDN is suitable for incremental deployment on the underlay network to be compatible with existing IP devices, so the risks faced by general-purpose servers, such as broken access control and distributed denial-of-service attacks, also apply to secure key distribution networks. These potential security risks may cause the above Byzantine behavior. As shown in Figure 3, malicious faults can also occur in both stages. For example, in the path discovery stage, incorrect path information is injected so that the key distribution path is built through the faulty node. In the subsequent communication key distribution stage, the faulty node conducts a black hole attack or other Byzantine behavior, seriously affecting the reliability of secure key distribution.

4. Proposed Solution

Due to the criticality of the basic services provided by the SKDN, in addition to improving its scalability, it is also necessary to consider how to handle the failure of an SKD node. Based on the traffic consistency principle [33], if the aforementioned fault occurs, it will cause inconsistency between the inflow and outflow of nodes. To this end, nodes count the acknowledgments of the communication key, determine the traffic consistency, and detect and locate the abnormal link. However, detection can also be triggered by benign faults. Directly isolating the faulty link solely on the basis of the acknowledgment rate could significantly reduce system availability. We therefore apply DS evidence theory to identify the cause of the fault through multi-information aggregation and make corresponding decisions, and propose an on-demand fault-tolerant routing strategy for SKDN.

4.1. Overview

As shown in Figure 4, this routing strategy provides full life cycle management for communication keys, which is divided into six parts: path discovery, key transmission, fault detection, evidence collection, information aggregation, and decision making.

Path Discovery:

An improved on-demand path discovery mechanism based on signatures and flooding not only reduces the overhead but also prevents path discovery failures caused by tampering or selective loss of routing messages to a certain extent.

Key Transmission:

After the path is discovered, the node sends the communication key to the next hop node in the corresponding routing table entry of the path; at this time, the communication key status is uncertain and cannot be used by the upper layer.

Fault Detection:

The destination node will acknowledge the communication key. An abnormal acknowledgment rate will trigger the fault detection. The detection will start from the intermediate node of the path to detect the upstream and downstream status and recursively detect the faulty subpath until the faulty link is confirmed. This will allow the location of the faulty node to be narrowed down to a single link.

Evidence Collection:

Characterize the acknowledgment rate, bitmap autocorrelation coefficient, and local key status of the current path as evidence of the cause of the link fault, and define and derive the mass function corresponding to each piece of evidence; weight the evidence and perform fusion calculations to obtain the combined mass function.

Information Aggregation and Decision Making:

Select the cause of fault that can maximize the pignistic transformation of the combined mass function, conduct a corresponding new round of path discovery, and convert the communication key status generated during evaluation.
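The aggregation and decision steps in the overview can be illustrated with a minimal Python sketch of Dempster's rule of combination and the pignistic transformation on the two-element frame {Benign, Malicious}. It is a sketch under stated assumptions, not the paper's implementation: the evidence weighting step is omitted, the dictionary keys (`"Both"` stands for the whole frame) and function names are hypothetical, and the input masses are made-up values.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions on {Benign, Malicious} with Dempster's rule."""
    # Conflict: one source commits to Benign while the other commits to Malicious.
    conflict = m1["Benign"] * m2["Malicious"] + m1["Malicious"] * m2["Benign"]
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict

    def agree(label):
        # Products of focal sets whose intersection is exactly {label}.
        return (m1[label] * m2[label]
                + m1[label] * m2["Both"]
                + m1["Both"] * m2[label])

    return {
        "Benign": agree("Benign") / norm,
        "Malicious": agree("Malicious") / norm,
        "Both": m1["Both"] * m2["Both"] / norm,
    }


def pignistic(m):
    """Split the mass of the uncertain set evenly between its two elements."""
    return {
        "Benign": m["Benign"] + m["Both"] / 2,
        "Malicious": m["Malicious"] + m["Both"] / 2,
    }


def decide(masses):
    """Fuse all mass functions and pick the cause with the largest BetP."""
    combined = masses[0]
    for m in masses[1:]:
        combined = dempster_combine(combined, m)
    bet = pignistic(combined)
    return max(bet, key=bet.get), bet


# Made-up evidence values for illustration only.
m_a = {"Benign": 0.2, "Malicious": 0.6, "Both": 0.2}
m_b = {"Benign": 0.3, "Malicious": 0.5, "Both": 0.2}
cause, bet = decide([m_a, m_b])
```

With these inputs both sources lean towards a malicious cause, so the fused pignistic probability of Malicious is the larger of the two.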

4.2. Path Discovery

The process is mainly divided into two phases: route request and route response. At least one path (if any) free of faulty nodes is found by flooding routing packets.

Each node maintains a routing table, in which each entry contains the following fields: destination address, egress port, relay list, and entry status. When the routing table entry for the destination is unavailable, the node enters the route request phase. The SKD node generates and broadcasts the route request message. The message carries an incremented request sequence number and is signed by the source, so other nodes can identify the request. To ensure the privacy of routing messages, the messages are encrypted with a local key before broadcasting. Each relay maintains a list that stores recent routing requests. As shown in Figure 5, a received request message is broadcast to other ports only if it does not match an entry in this list, thereby reducing the overhead of routing messages and the consumption of local keys.
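The duplicate suppression performed by each relay in the route request phase can be sketched as follows. The sketch is deliberately minimal: signature verification and local key encryption are left out, and the class name `Relay`, its fields, and the port labels are hypothetical.

```python
class Relay:
    """Minimal relay that suppresses duplicate route requests."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = list(ports)
        # Recent requests, identified by (source, request sequence number).
        self.seen_requests = set()

    def on_route_request(self, source, seq, in_port):
        """Return the ports to rebroadcast on, or [] for a duplicate request."""
        request_id = (source, seq)
        if request_id in self.seen_requests:
            return []  # already handled: do not flood it again
        self.seen_requests.add(request_id)
        # First time this request is seen: forward on every other port.
        return [p for p in self.ports if p != in_port]


relay = Relay("B", ["p1", "p2", "p3"])
first = relay.on_route_request("A", 1, "p1")      # new request
duplicate = relay.on_route_request("A", 1, "p2")  # same request via another port
newer = relay.on_route_request("A", 2, "p2")      # next sequence number
```

Dropping a request that has already been seen is what keeps the flooding overhead, and therefore the local key consumption, bounded.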

When the route request message arrives at the destination, it enters the route response phase. If the current request has not been processed, a route response message containing the incremented response sequence number is broadcast. As shown in Figure 6, when the response matches no entry in the response list maintained by the relay, or it matches but the link status calculated from the packet is better than the record in the table, the cumulative link status in the table is updated. The relay then adds the link status at the entrance of this node to the link status list in the packet, signs and encrypts the packet, and forwards it to the other ports. If the source does not receive the route response message within a certain period, it simply resends the route request message. If there is at least one path between the source and the destination, the algorithm can discover the path and ensure the reachability of the route.

If the path priority is measured by accumulating link status when propagating routing response messages, a path that cannot provide local key services may be selected. As shown in Figure 7, if A triggers path discovery, this mechanism will cause the network layer to select the path containing A, B, and C. The key supply capability of the logical link between B and C is poor, so key transmission is more likely to fail. We instead take the path status to be a combination of the local key status and the hop count of the links traversed by the current message, and select the path with the largest value:

P = \underset{i}{\arg\max} \left( \frac{\min (S_{i,0}, S_{i,1}, \dots, S_{i,H_{i}})}{H_{i}} \right)

(2)

where P represents the selected path sequence number and H_{i} represents the number of hops of path i. The link status S in the formula is defined as:

S = \left( \frac{K_{cur}}{K_{max}} \right) \cdot K_{gen}

(3)

The first factor represents the status of the key cache, where K_{cur} and K_{max} represent the amount of currently cached local key and the storage limit, respectively. The second factor K_{gen} is the absolute net key generation rate of the current logical link. If the first factor is small, it implies that a significant number of communication keys have passed through the node in the past period, and the cache requires time to recover. If K_{cur} is 0, the current logical link cannot provide services.
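Equations (2) and (3) can be tried out with a short Python sketch. The function names and the numeric values are hypothetical, and the sketch assumes that each candidate path is given as a list of per-link (K_cur, K_max, K_gen) tuples whose length is taken as the hop count.

```python
def link_status(k_cur, k_max, k_gen):
    """Equation (3): cache fill ratio scaled by the net key generation rate."""
    return (k_cur / k_max) * k_gen


def path_status(links):
    """Bottleneck link status divided by the hop count, as in Equation (2).

    Assumption: `links` holds one (k_cur, k_max, k_gen) tuple per hop.
    """
    statuses = [link_status(*link) for link in links]
    return min(statuses) / len(links)


def select_path(candidates):
    """Return the index of the candidate path with the largest path status."""
    return max(range(len(candidates)), key=lambda i: path_status(candidates[i]))


# Made-up values: a short path with one nearly drained link
# versus a longer path whose links are all well stocked.
candidates = [
    [(800, 1000, 2.0), (100, 1000, 2.0)],
    [(600, 1000, 2.0), (600, 1000, 2.0), (600, 1000, 2.0)],
]
best = select_path(candidates)
```

Here the longer path wins because the drained link on the shorter path dominates its status, which is the behavior motivated by the Figure 7 example.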

4.3. Fault Detection

When a path is discovered, communication key distribution can be performed within the path. As mentioned earlier, if there are faulty nodes on the path, transmission abnormalities will occur. Therefore, this detection method is based on the destination node’s acknowledgment of the communication key: if the amount of acknowledgment in a sliding window is greater than the threshold, fault detection at the source is triggered.

Some faulty nodes may send bogus information to the destination node to maintain the traffic consistency principle. However, in the SKDN, the SKD node will utilize the local key to encrypt or decrypt messages when forwarding or receiving messages. If the faulty node does not use the local key, this bogus information will be decrypted into garbled characters and cannot be forwarded to the destination, causing fault detection to be triggered. Therefore, by using the local key, the traffic flowing through the faulty node is limited to a certain normal range, which improves the detection accuracy.

When fault detection is triggered, a detection list is maintained based on the relay list of the corresponding entry of this path in the routing table at this moment. We assume that the path, including the source and destination, contains an odd number N of nodes, numbered 1, 2, \dots, N. The detected abnormal path from 1 to N is divided into two parts by inserting the intermediate point (N - 1) / 2 in the detection list. When the next communication request arrives, the detection list is attached to the end of the packet carrying the communication key and then forwarded. All nodes in the detection list, including the destination, need to return an acknowledgment, and the two subpaths are detected recursively. This continues until detection is triggered on an indivisible path from i to j, at which point one of these two nodes is suspected to be faulty.
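The recursive narrowing can be pictured as a bisection over the path. The following Python sketch is a simplification under stated assumptions: it supposes a single persistent fault, uses zero-based indices with the integer midpoint of the current subpath, and replaces the acknowledgment exchange with a hypothetical probe function `segment_ok`.

```python
def locate_faulty_link(path, segment_ok):
    """Narrow an abnormal path down to a single link by repeated bisection.

    `segment_ok(i, j)` is a hypothetical probe that returns True when the
    acknowledgments show the subpath from path[i] to path[j] behaving normally.
    """
    lo, hi = 0, len(path) - 1
    while hi - lo > 1:          # more than one link left to examine
        mid = (lo + hi) // 2    # intermediate node added to the detection list
        if not segment_ok(lo, mid):
            hi = mid            # the fault lies upstream of the midpoint
        else:
            lo = mid            # upstream is consistent, so look downstream
    return path[lo], path[hi]


# Hypothetical seven-node path with a fault on the link between n5 and n6.
path = ["n1", "n2", "n3", "n4", "n5", "n6", "n7"]
FAULT_FROM, FAULT_TO = 4, 5     # indices of n5 and n6


def probe(i, j):
    """A segment is abnormal when it contains the faulty link."""
    return not (i <= FAULT_FROM and j >= FAULT_TO)


suspects = locate_faulty_link(path, probe)
```

Each round halves the suspect subpath, so the number of detection rounds grows logarithmically with the path length.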

A faulty node could discard the acknowledgment so that the loss is blamed on any node between itself and the destination on the path. To prevent this, when a relay receives a packet containing its own identity in the detection list, it starts a timer rather than sending an acknowledgment immediately. If the acknowledgment returned by the destination arrives at the relay within the time limit, the relay signs its own information together with the packet body and attaches the result to the end of the packet; otherwise, it generates and sends a new acknowledgment.

If a fault-triggered communication key is used for secure communication between source and destination, the communication of both parties may be leaked. We observe the asynchronous nature of key distribution and application, establishing four different states for communication keys: uncertain, suspicious, available, and unavailable. For convenience, we assume that the duration of the fault will exceed the full cycle of fault detection. When the communication key request arrives, the status of the forwarded communication key is uncertain. When the acknowledgment rate is lower than the threshold, the status of the batch of keys is converted to available and returned to the service layer. Otherwise, convert the key status to suspicious.

4.4. Evidence Collection

In addition to malicious causes, benign factors such as poor local key status can also cause detection triggers. Simply isolating the faulty link will reduce system availability. Therefore, we use the Dempster–Shafer evidence theory to provide theoretical tools for multi-information aggregation and subsequent decision making to make a reasonable link status assessment. The Dempster–Shafer evidence theory provides a mathematical model for the uncertainty and imprecision in events associated with certain evidence, and also shows how to combine different evidence to make reasonable deductions about associated events [34]. We define a frame of discernment Ω as the set of possible causes that trigger detection. Mass functions are assigned to the subjective and objective knowledge generated during the detection process, also known as evidence, and represent their respective estimations of the potential causes:

Ω = \{Benign, Malicious\}

(4)

The mass function m is a mapping from the power set 2^{Ω} to the interval [0, 1], satisfying:

\sum_{A \subseteq Ω} m (A) = 1, m (⌀) = 0

(5)

where m is defined as the following piecewise linear function:

m (Malicious) = \begin{cases} d, & 0 \leq x \leq a, \\ \frac{1 - 2 d}{c - a} (x - a) + d, & a < x \leq c, \\ 1 - d, & c < x \leq 1 . \end{cases}

(6)

m (Benign) = \begin{cases} 1 - d, & 0 \leq x \leq a, \\ \frac{2 d - 1}{b - a} (x - a) + 1 - d, & a < x \leq b, \\ d, & b < x \leq 1 . \end{cases}

(7)

m (Benign, Malicious) = \begin{cases} 0, & 0 \leq x \leq a, \\ \left( \frac{2 d - 1}{b - a} + \frac{1 - 2 d}{c - a} \right) (a - x), & a < x \leq b, \\ \frac{2 d - 1}{c - a} (x - a) - 2 d + 1, & b < x \leq c, \\ 0, & c < x \leq 1 . \end{cases}

(8)

for 0 < a < b < c < 1, 0 < d < 0.5 .

where x is derived from each piece of evidence, and b is the value of the input evidence x at which the uncertainty of the associated event is highest. The evidence is collected from the communication key acknowledgment rate, the current path status, and the autocorrelation coefficient of the acknowledgment bitmap, as shown below.
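The piecewise definitions in Equations (6)–(8) can be sketched in a few lines of Python. This is only an illustration: the function and class names are hypothetical, and the default parameters are the values listed in Table 1 (a = 0.1, b = 0.6, c = 0.9, d = 0.2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MassParams:
    # Defaults taken from Table 1; any 0 < a < b < c < 1 and 0 < d < 0.5 is allowed.
    a: float = 0.1
    b: float = 0.6
    c: float = 0.9
    d: float = 0.2


def mass_from_evidence(x: float, p: MassParams = MassParams()) -> dict:
    """Map one evidence value x in [0, 1] to masses following Eqs. (6)-(8)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("evidence value must lie in [0, 1]")
    a, b, c, d = p.a, p.b, p.c, p.d

    # Eq. (6): belief that the cause is malicious.
    if x <= a:
        malicious = d
    elif x <= c:
        malicious = (1 - 2 * d) / (c - a) * (x - a) + d
    else:
        malicious = 1 - d

    # Eq. (7): belief that the cause is benign.
    if x <= a:
        benign = 1 - d
    elif x <= b:
        benign = (2 * d - 1) / (b - a) * (x - a) + 1 - d
    else:
        benign = d

    # Eq. (8): mass left on the whole frame (uncertainty), largest at x = b.
    if x <= a or x > c:
        omega = 0.0
    elif x <= b:
        omega = ((2 * d - 1) / (b - a) + (1 - 2 * d) / (c - a)) * (a - x)
    else:
        omega = (2 * d - 1) / (c - a) * (x - a) - 2 * d + 1
    omega = max(0.0, omega)  # guard against tiny negative rounding errors

    return {"Malicious": malicious, "Benign": benign, "Omega": omega}
```

With the Table 1 parameters, an evidence value of x = 0.6 gives masses of 0.575, 0.2, and 0.225 for malicious, benign, and the whole frame, which sum to 1 as Equation (5) requires.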

Evidence 1: communication key acknowledgment rate

e_{1}

. As shown in the following formula,

N_{K}

is the number of communication key packets sent in a sliding window, and

N_{A}

is the number of acknowledgments received. This evidence is the acknowledgment rate of the communication keys that triggered the current detection process. Since false positives are a common problem of intrusion detection systems [35], this rate alone is not conclusive and must be evaluated together with further evidence to infer the cause of the fault more accurately:

e_{1} = \frac{N_{A}}{N_{K}}

(9)

Evidence 2: autocorrelation coefficient of acknowledgment bitmap

e_{2}

. This evidence is collected from the autocorrelation coefficient of acknowledgment bitmap that triggers the current detection process. A bitmap is represented by a one-dimensional sequence:

\vec{b} = (b_{1}, b_{2}, \dots, b_{N_{K}})

(10)

where

b_{j}

indicates whether the jth acknowledgment that triggers the current detection process is lost or not. Malicious faults usually result in irregular response packet loss, so we use the autocorrelation function [36] as shown below to evaluate the repeating pattern in the time series represented by the bitmap and further deduce the nature of the anomaly:

e_{2} = 1 - \frac{1}{N_{K}} \sum_{j = 0}^{N_{K} - 1} \sum_{k = 1}^{N_{K} - j} \frac{b_{k} b_{k + j}}{N_{K} - j}

(11)
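Equation (11) can be evaluated directly from the acknowledgment bitmap. The sketch below is an illustration under one assumption that the text does not fix: a bit value of 1 means the acknowledgment was received and 0 means it was lost. The function name is hypothetical.

```python
from typing import Sequence


def autocorrelation_evidence(bitmap: Sequence[int]) -> float:
    """Compute e_2 from Eq. (11) for a bitmap b_1..b_{N_K} of 0/1 values."""
    n = len(bitmap)
    if n == 0:
        raise ValueError("bitmap must contain at least one entry")
    total = 0.0
    for lag in range(n):  # j = 0 .. N_K - 1
        overlap = n - lag  # number of (b_k, b_{k+j}) pairs at this lag
        pair_sum = sum(bitmap[k] * bitmap[k + lag] for k in range(overlap))
        total += pair_sum / overlap
    return 1.0 - total / n
```

Under this encoding, a bitmap with every acknowledgment received yields 0, a bitmap with every acknowledgment lost yields 1, and the alternating pattern (1, 0, 1, 0) yields 0.75.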

Evidence 3: current path status

e_{3}

. This evidence is collected from the current path status

S_{c u r}

in the routing table.

K_{c m a x}

represents the maximum key generation rate over all logical links. A path with a poor current status can easily trigger detection. Conversely, when detection is triggered on a path whose status is good, the belief that the path contains a malicious fault is stronger.

e_{3} = \frac{S_{c u r}}{K_{c m a x}}

(12)

Evidence 4 to Evidence $3 + log ⌊N - 1⌋$ : communication key acknowledgment rate

e_{3 + i}

. This evidence is collected from the acknowledgment rate that triggered the ith round of the fault detection process:

e_{3 + i} = \frac{N_{A_{i}}}{N_{K_{i}}}

(13)

Evidence $4 + log ⌊N - 1⌋$ to Evidence $3 + 2 log ⌊N - 1⌋$ : autocorrelation coefficient of acknowledgment bitmap

e_{3 + log ⌊N - 1⌋ + i}

. This evidence is collected from the autocorrelation coefficient of the acknowledgment bitmap in the ith round of the fault detection process. A bitmap is represented by a one-dimensional sequence:

\vec{b_{i}} = (b_{i, 1}, b_{i, 2}, \dots, b_{i, N_{K}})

(14)

where

b_{i, j}

indicates whether the jth acknowledgment in the ith round of the fault detection process is lost or not.

e_{3 + log ⌊N - 1⌋ + i} = 1 - \frac{1}{N_{K}} \sum_{j = 0}^{N_{K} - 1} \sum_{k = 1}^{N_{K} - j} \frac{b_{i, k} b_{i, k + j}}{N_{K} - j}

(15)

4.5. Information Aggregation and Decision Making

The mass functions

m_{i}

and

m_{j}

of different observational evidence can be combined through Dempster’s rule to obtain the combined mass function

m^{'}

, as shown in the following formula.

m^{'}

also satisfies the condition of Equation (5) and is defined on the same power set

2^{Ω}

as

m_{i}

and

m_{j}

:

m^{'} (A) = \frac{1}{1 - k} \sum_{B \cap C = A} m_{i} (B) m_{j} (C)

(16)

k = \sum_{B \cap C = ∅} m_{i} (B) m_{j} (C)

(17)
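As an illustration of Equations (16) and (17), the following sketch combines two mass functions over the frame of Equation (4). Representing focal elements as frozensets and the function name `dempster_combine` are choices made for this example only.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]

BENIGN = frozenset({"Benign"})
MALICIOUS = frozenset({"Malicious"})
OMEGA = BENIGN | MALICIOUS


def dempster_combine(m_i: Mass, m_j: Mass) -> Mass:
    """Combine two mass functions with Dempster's rule, Eqs. (16)-(17)."""
    combined: Mass = {}
    conflict = 0.0  # k in Eq. (17)
    for (B, mass_b), (C, mass_c) in product(m_i.items(), m_j.items()):
        A = B & C
        if A:
            combined[A] = combined.get(A, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if 1.0 - conflict <= 0.0:
        raise ValueError("the two pieces of evidence are totally conflicting")
    return {A: value / (1.0 - conflict) for A, value in combined.items()}


# Two hypothetical pieces of evidence that both lean towards a malicious cause.
m1 = {MALICIOUS: 0.6, BENIGN: 0.1, OMEGA: 0.3}
m2 = {MALICIOUS: 0.5, BENIGN: 0.2, OMEGA: 0.3}
result = dempster_combine(m1, m2)
```

For these inputs the conflict is k = 0.17, and the combined masses are 0.63/0.83, 0.11/0.83, and 0.09/0.83 for malicious, benign, and the whole frame, so agreement between the two sources strengthens the malicious hypothesis.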

Therefore, different information can be considered comprehensively to further analyze and reason about the elements in the set. However, the normalized conjunctive rule of combination that defines Dempster’s rule assumes the same degree of belief for different evidence, and lacks the distinction of evidence importance through prior knowledge. For this reason, we calculate the combined mass function by using the weighted Dempster’s rule. After assigning a positive integer importance factor (IF) to each piece of evidence, for any

A \in 2^{Ω}

, Equations (16) and (17) will be transformed into:

m^{'} (A) = \frac{1}{k^{'}} \sum_{B \cap C = A} {m_{i} (B)}^{\frac{f_{i}}{f_{j}}} {m_{j} (C)}^{\frac{f_{j}}{f_{i}}}

(18)

k^{'} = \sum_{B \cap C \neq ∅} {m_{i} (B)}^{\frac{f_{i}}{f_{j}}} {m_{j} (C)}^{\frac{f_{j}}{f_{i}}}

(19)

where

f_{i}

and

f_{j}

are the IFs of

m_{i}

and

m_{j}

, respectively. The IF of the combined mass function

m^{'}

is

(f_{i} + f_{j}) / 2

. It can be verified that the weighted Dempster’s rule satisfies the condition of Equation (5). The weighted rule is more general: in the special case

f_{i} = f_{j}

, it reduces to Dempster’s rule. More importantly, Equations (18) and (19) show that the mass function with the higher IF contributes more to the combined mass function, effectively utilizing prior knowledge obtained from key transmission.

For the obtained combined mass function

m^{'}

, select the element

θ_{result} \in Ω

that maximizes the pignistic transform of

m^{'}

as the cause of the fault:

θ_{result} = \underset{θ}{\arg\max} \, BetP (θ)

(20)

where the pignistic transform

BetP (θ)

is as follows:

BetP (θ) = \sum_{A \subseteq Ω, θ \in A} \frac{m^{'} (A)}{| A | (1 - m^{'} (∅))}

(21)
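The decision step of Equations (20) and (21) can be sketched as follows. The frozenset representation of focal elements and the function names are assumptions of this example, not part of the original description.

```python
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def pignistic(m: Mass) -> Dict[str, float]:
    """Pignistic transform BetP of Eq. (21) for a mass function over the frame."""
    empty_mass = m.get(frozenset(), 0.0)
    betp: Dict[str, float] = {}
    for A, mass in m.items():
        if not A:
            continue  # the empty set carries no belief for any element
        share = mass / (len(A) * (1.0 - empty_mass))
        for theta in A:
            betp[theta] = betp.get(theta, 0.0) + share
    return betp


def decide_cause(m: Mass) -> str:
    """Return the element that maximizes BetP, as in Eq. (20)."""
    betp = pignistic(m)
    return max(betp, key=betp.get)


# Hypothetical combined mass function.
combined = {
    frozenset({"Malicious"}): 0.5,
    frozenset({"Benign"}): 0.3,
    frozenset({"Benign", "Malicious"}): 0.2,
}
cause = decide_cause(combined)
```

Here the mass on the whole frame is split evenly between the two causes, giving BetP values of 0.6 for malicious and 0.4 for benign, so the fault is attributed to a malicious cause.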

When the detection is triggered, we collect evidence from

e_{1}

to

e_{3 + 2 log ⌊N - 1⌋}

, calculate the combined mass function

m^{'}

, and calculate

θ_{result}

through Equation (20). When

θ_{result}

is benign, path discovery is performed, and the communication keys sent since the detection was triggered are converted to available and returned to the service layer. Otherwise, the communication keys in the buffer are cleared, and the detected link is added to the route request message in the next path discovery so that other nodes ignore the link during the route discovery phase. Algorithm 1 summarizes the calculation of the combined mass function

m^{'}

, where

f_{i}

is the IF of each

e_{i}

,

m_{i}

is calculated from Equations (6)–(8). The time complexity of combined mass function calculation is

O (log (N))

, where N is the number of nodes on the path. This shows that the computational overhead increases slowly as the network size increases, indicating the practical value of the algorithm.

Algorithm 1 Combined mass function calculation

Input: $m_{1}, m_{2}, \dots, m_{3 + 2 log ⌊N - 1⌋}, f_{1}, f_{2}, \dots, f_{3 + 2 log ⌊N - 1⌋}$
Output: $m^{'}, f^{'}$
1: $m^{'} = m_{1}, f^{'} = f_{1}$
2: for $i = 2$ to $3 + 2 log ⌊N - 1⌋$ do
3:     $k^{'} = \sum_{B \cap C \neq ∅} {m^{'} (B)}^{\frac{f^{'}}{f_{i}}} {m_{i} (C)}^{\frac{f_{i}}{f^{'}}}$
4:     for $A \in 2^{Ω}$ do
5:         $m^{'} (A) = \frac{1}{k^{'}} \sum_{B \cap C = A} {m^{'} (B)}^{\frac{f^{'}}{f_{i}}} {m_{i} (C)}^{\frac{f_{i}}{f^{'}}}$
6:     end for
7:     $f^{'} = (f^{'} + f_{i}) / 2$
8: end for
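A minimal Python sketch of Algorithm 1 is given below. It uses the normalization of Equations (18) and (19), that is, it divides by the sum over non-conflicting pairs, and it averages the importance factors after each step. The frozenset representation and the function names are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Mass = Dict[FrozenSet[str], float]

BENIGN = frozenset({"Benign"})
MALICIOUS = frozenset({"Malicious"})
OMEGA = BENIGN | MALICIOUS


def weighted_combine(m_a: Mass, f_a: float, m_b: Mass, f_b: float) -> Mass:
    """Weighted Dempster's rule for one pair of mass functions, Eqs. (18)-(19)."""
    raw: Mass = {}
    for B, mass_b in m_a.items():
        for C, mass_c in m_b.items():
            A = B & C
            if not A:
                continue  # conflicting pairs are excluded from k'
            term = (mass_b ** (f_a / f_b)) * (mass_c ** (f_b / f_a))
            raw[A] = raw.get(A, 0.0) + term
    k_prime = sum(raw.values())
    if k_prime <= 0.0:
        raise ValueError("the evidence is totally conflicting")
    return {A: value / k_prime for A, value in raw.items()}


def combine_all(masses: List[Mass], factors: List[float]) -> Tuple[Mass, float]:
    """Fold all mass functions into one, updating the importance factor each step."""
    if not masses or len(masses) != len(factors):
        raise ValueError("need one importance factor per mass function")
    m, f = masses[0], factors[0]
    for m_i, f_i in zip(masses[1:], factors[1:]):
        m = weighted_combine(m, f, m_i, f_i)
        f = (f + f_i) / 2
    return m, f


# Three hypothetical mass functions with the importance factors 3, 3, 6 of Table 1.
masses = [
    {MALICIOUS: 0.6, BENIGN: 0.1, OMEGA: 0.3},
    {MALICIOUS: 0.5, BENIGN: 0.2, OMEGA: 0.3},
    {MALICIOUS: 0.2, BENIGN: 0.7, OMEGA: 0.1},
]
m_prime, f_prime = combine_all(masses, [3, 3, 6])
```

Each iteration touches only the few subsets of a two-element frame, so the cost is proportional to the number of pieces of evidence. With equal importance factors the pairwise step coincides with the unweighted rule of Equations (16) and (17).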

4.6. Analysis

First, we analyze the security of the proposed solution. In the path discovery phase described above, the RSA digital signature algorithm is used to authenticate the relevant information in each message, which prevents a faulty node from impersonating other nodes. However, this does not affect the security of the communication key. Due to the characteristics of SKD technology, the generated symmetric keys are independent of each other. Therefore, if the RSA public and private key pair is cracked at some point in the future, neither the generated keys nor the one-time pad encryption using them is affected. This is called forward security. In addition, an issue that requires attention is the possibility that an adversary stores the current, supposedly secure ciphertext and waits for more advanced cryptanalysis technology to emerge in order to extract information from it. This is possible in traditional symmetric key distribution scenarios. In contrast, combining the keys generated by SKD technology with one-time pad encryption can ensure the long-term security of the ciphertext, which is of great significance for data such as personal medical records and business secrets.

Next, we analyze and compare the upper limit of fault tolerance and the overhead of this method and existing methods in the scenario shown in Figure 3, starting with the shortcomings of existing methods under this mixed fault. The method used in [15], combined with PBFT [37], ensures that in the presence of faulty nodes, any two normal nodes can reach a consensus on the same key distribution path construction scheme. Based on this, path discovery between normal nodes can resist Byzantine behavior from faulty nodes, avoiding the previously described traffic redirection attack. However, this method assumes that the leader node proposes a construction plan computed from global network information, which brings huge resource consumption when implemented in a dynamic SKDN [22]. In addition, the method can only tolerate at most

| F |

faulty nodes in

3 | F | + 1

nodes simultaneously. In the multi-path communication key distribution strategy proposed in [38], communication keys are distributed on

| F | + 1

disjoint paths in sequence. After the destination node XORs the results, it can obtain the ITS secure communication key. This ensures the security of key distribution. However, faulty nodes can reduce the success rate of the multi-path scheme by delaying the communication key of one of the paths. In the literature [14], it has been proposed that additional

| F |

disjoint paths are needed to improve the key distribution liveness. Although the combination of the above methods can ensure the reliability of secure key distribution in the fault scenario shown in Figure 3, there is an upper limit to the number of faulty nodes that can be tolerated, and a large number of redundant disjoint paths are required as the cost of improving reliability, which greatly reduces system availability. For example, when C, E, and G in Figure 3 become faulty, faulty nodes become the majority of network nodes, and there are not enough disjoint paths. Through its path discovery and fault detection mechanisms, our method can discover the path containing A, B, F, and D and distribute the key on this path. Therefore, ideally, as long as there is a path that does not contain a faulty node, communication keys can be distributed on this path without relying on local keys from other paths, which improves the availability of the system.
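The multi-path idea discussed above can be illustrated with a short sketch. It assumes, as one plausible reading of the description of [38], that the source sends |F| + 1 independent random shares, one per disjoint path, and that the destination XORs all of them to obtain the communication key. The helper names are hypothetical and are not taken from [38].

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("shares must have the same length")
    return bytes(p ^ q for p, q in zip(x, y))


def make_shares(num_faulty: int, key_len: int) -> List[bytes]:
    """Draw |F| + 1 independent random shares, one for each disjoint path."""
    return [secrets.token_bytes(key_len) for _ in range(num_faulty + 1)]


def combine_shares(shares: List[bytes]) -> bytes:
    """Destination side: XOR every received share into the communication key."""
    if not shares:
        raise ValueError("at least one share is required")
    return reduce(xor_bytes, shares)


shares = make_shares(num_faulty=2, key_len=16)  # three disjoint paths
key = combine_shares(shares)
```

An adversary that observes at most |F| of the shares learns nothing about the key, because the remaining share is uniformly random. The same construction explains the liveness problem noted above: delaying or dropping a single share is enough to prevent the destination from reconstructing the key.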

5. Simulations and Discussion

This section will verify the effectiveness of the proposed solution through experimental simulation and detailed analysis of relevant results.

5.1. Simulation Environment

We use the ns-3 simulator to compare the proposed solution with OSPF, adaptive stochastic routing (ASR) [28], and the multi-path communication scheme (MPCS) [38]. BRITE [39] is used to generate a random topology under the Waxman model, in which nodes and links are randomly distributed in the grid, so that the evaluation based on the simulation results is independent of any specific network structure. Table 1 shows the parameters used in the simulation. Each simulation ran for 100 s, and 20 simulations were run to obtain the average value. We consider the following two scenarios, which represent malicious faults in the communication key distribution phase and the path discovery phase, respectively. The first is the black hole attack (BHA), where the faulty node participates in path discovery normally while dropping packets selectively and uniformly at random. The second is the traffic redirection attack (TRA), which builds on BHA and additionally interferes with the path discovery process: the faulty node falsely reports its local key status, which causes nodes to select a path containing faulty nodes for key distribution with greater probability.
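Each reported point is an average over 20 runs, and the result figures show 95% confidence intervals. The text does not state how the intervals were computed; the sketch below assumes a two-sided Student-t interval, and the function name is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% Student-t critical value for 19 degrees of freedom (20 runs).
T_975_DF19 = 2.093


def mean_and_ci95(samples: Sequence[float]) -> Tuple[float, float]:
    """Return the sample mean and the half-width of its 95% confidence interval."""
    if len(samples) != 20:
        raise ValueError("this sketch hard-codes the critical value for 20 runs")
    mean = statistics.mean(samples)
    half_width = T_975_DF19 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width
```

The returned half-width corresponds to the length of the vertical segment drawn above and below each averaged point.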

Performance metrics including packet delivery ratio, key material utilization, corrupted key ratio, and hop count can be used to assess the availability and reliability of routing strategy within the specified simulation environment, as described below.

Packet Delivery Ratio: This metric is the ratio of communication keys successfully received by the destination to all communication keys sent by the source during the simulation. This is used to evaluate communication key distribution performance under fault scenarios.

Key Material Utilization: This metric is the ratio of local keys consumed by successfully received communication keys during the simulation to the total consumption of local keys. It can reflect the key utilization rate and certain routing overhead, reflecting the availability of the routing strategy.

Corrupted Key Ratio: The metric is the proportion of successfully received communication keys that have passed faulty nodes during the simulation. This reflects the confidentiality of communication keys successfully distributed in fault scenarios.

Hop Count: This metric is the average forwarding number of successfully received communication keys during the simulation. This reflects the effectiveness of the path discovery mechanism with a certain key distribution delay and improved fault tolerance.
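The four metrics above can be computed from per-key records as in the following sketch. The record fields and the separate overhead term are hypothetical; they only illustrate the definitions and do not describe the data structures of the actual simulation.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class KeyRecord:
    delivered: bool           # communication key reached the destination
    hops: int                 # number of forwarding hops on its path
    local_key_bytes: int      # local key material consumed to forward it
    passed_faulty_node: bool  # path traversed at least one faulty node


def compute_metrics(records: Sequence[KeyRecord], overhead_key_bytes: int = 0) -> Dict[str, float]:
    """Compute PDR, KMU, CKR, and average hop count from per-key records."""
    delivered = [r for r in records if r.delivered]
    sent = len(records)
    received = len(delivered)
    # Total consumption includes keys spent on lost traffic and on other overhead.
    total_local = sum(r.local_key_bytes for r in records) + overhead_key_bytes
    useful_local = sum(r.local_key_bytes for r in delivered)
    return {
        "pdr": received / sent if sent else 0.0,
        "kmu": useful_local / total_local if total_local else 0.0,
        "ckr": sum(r.passed_faulty_node for r in delivered) / received if received else 0.0,
        "hop_count": sum(r.hops for r in delivered) / received if received else 0.0,
    }


records = [
    KeyRecord(True, 3, 30, False),
    KeyRecord(True, 5, 50, True),
    KeyRecord(False, 2, 20, True),
    KeyRecord(True, 4, 40, False),
]
metrics = compute_metrics(records, overhead_key_bytes=60)
```

For this toy input, three of four keys are delivered (PDR 0.75), 120 of 200 consumed key bytes served delivered keys (KMU 0.6), one of the three delivered keys passed a faulty node (CKR 1/3), and the average hop count is 4.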

5.2. Results and Discussion

Based on the above simulation environment configuration, this section will conduct a comprehensive comparison of the proposed solution and the other three methods based on the above metrics in two fault scenarios.

Packet Delivery Ratio: Figure 8 shows the packet delivery ratio (PDR) of the four methods under different faulty node ratios (FNR) in the BHA scenario for different numbers of nodes. The vertical line segments depict the

95 %

confidence intervals of the results. As FNR increases, the PDR of all four methods decreases, in line with expectations. The proposed solution has a higher PDR than the other three methods at the same FNR. When the FNR is low, the proposed solution can infer from the different pieces of evidence that the cause of a fault is more likely benign, so it performs dynamic route discovery and plans a better path for communication key forwarding. The other methods have no identification mechanism and their paths are relatively fixed, so they have a lower PDR. ASR’s load balancing improves the PDR only slightly. MPCS reduces the possibility of key leakage by transmitting keys on redundant paths. However, this proportionally accelerates local key consumption, which decreases system availability. When FNR increases, the proposed solution is more likely to detect the faulty nodes. Although partitioning the network causes performance degradation and PDR reduction, the faulty nodes are isolated and the accuracy of path discovery information is improved, so the PDR is still higher than that of the other three methods. As the number of nodes increases, the distributed scalability of the proposed solution enables the system to maintain a high PDR. When the number of nodes is 40, the PDR of the proposed solution decreases significantly when FNR is around

0.1

to

0.2

, while when the number of nodes is 60, 80, and 100, PDR drops significantly when FNR is greater than

0.25

. This shows that when the number of nodes increases, even though the proposed solution isolates faulty nodes under the same FNR, complex topology provides more available paths.

Figure 9 shows the PDR of the four methods under FNR in the TRA scenario of different numbers of nodes. The vertical line segments depict the

95 %

confidence intervals of the results. As the FNR increases, the PDR of all methods decreases. However, compared to the BHA scenario, the PDR of OSPF, ASR, and MPCS drops rapidly under TRA. This is because the faulty node propagates erroneous status information during route discovery, causing established paths to pass through faulty nodes. As the FNR increases, this phenomenon becomes more serious. With the proposed solution, erroneous status information can likewise concentrate too many paths on faulty nodes, and the PDR inevitably drops sharply. However, the inconsistency between the reported status information and the observed PDR improves the fault detection efficiency, so the proposed solution avoids the related faulty nodes in subsequent path discovery and maintains a relatively stable PDR level. As a result, at each FNR the proposed solution performs better than the other three methods.

Key Material Utilization: Figure 10 shows the key material utilization (KMU) of the four methods under FNR in the BHA scenario for different numbers of nodes. The vertical line segments depict the

95 %

confidence intervals of the results. KMU reflects the proportion of key material used to successfully forward communication keys among the total key consumption, and is affected by the PDR and by the local key overhead spent on purposes other than communication key forwarding. Compared with OSPF and ASR, the KMU of the proposed solution is slightly lower, because detection and path discovery are more frequent under high FNR. MPCS has the lowest KMU due to the resource consumption caused by its redundant transmission. As long as OSPF and ASR correctly implement routing control, their local key overhead for other purposes does not grow as FNR increases. The proposed solution provides significant PDR gains, as mentioned above; the fault tolerance of the system is improved at the cost of this resource consumption. In general, the KMU difference between the proposed solution, OSPF, and ASR is small, reflecting the effective resource utilization of the proposed solution. For example, when the number of nodes is 100, the improved PDR makes the KMU of the proposed solution even better than that of the comparison methods at FNR around

0.2

to

0.3

.

Corrupted Key Ratio: Figure 11 shows the corrupted key ratio (CKR) of the four methods under FNR in the BHA scenario with 60 nodes. The CKR of every method increases with FNR, but the rate of increase of the proposed solution is much lower than that of the comparison methods. This is one of the advantages of the proposed solution. MPCS has a lower CKR than OSPF and ASR when FNR is low. However, as FNR increases, there are not enough redundant paths in the network for key distribution, which leads to a sharp increase in CKR under MPCS. Through the collaboration of the fault detection mechanism and the path discovery mechanism, the proposed strategy finds a suitable path for communication key distribution, which reduces the possibility of the communication key being intercepted by faulty nodes. Even at the maximum FNR of

0.4

, the CKR is still less than

5 %

. This information leakage can be eliminated with privacy amplification. Since the paths provided by OSPF and ASR are relatively fixed, their CKR is linearly and positively correlated with FNR, which greatly increases the possibility of communication key leakage.

Hop Count: Figure 12 shows the hop count of the four methods under FNR in the BHA scenario with 60 nodes. OSPF identifies a single shortest path for key distribution. ASR and MPCS select other sub-shortest paths for key distribution with a certain probability, which improves the distribution success rate but results in a larger hop count. As FNR increases, a slight decrease in their hop count can be observed. This is essentially because the comparison methods lack response measures to faulty nodes: long paths are more likely to contain faulty nodes, so their distribution success rate decreases and the hop count traversed by successfully received communication keys is reduced. In the absence of faulty nodes, benefiting from the reasonable definition of link status, the hop count of the proposed solution lies between those of OSPF and ASR, which also represents a lower distribution delay. The increase of its hop count with FNR means that the proposed solution can identify faulty nodes and find paths for key distribution that the comparison methods cannot find, which improves the fault tolerance of the strategy.

6. Conclusions

In this paper, we propose a practical on-demand fault-tolerant routing strategy to improve the availability and reliability of communication key distribution in the presence of network-layer faults. We combine path discovery and fault detection mechanisms to balance the effectiveness and fault tolerance of SKDN. In particular, the strategy adopts fault-free on-demand path discovery and selects the appropriate path for key forwarding based on the local key status. In addition, an acknowledgment-based fault detection mechanism is integrated into the distribution process to locate abnormal links, and the identification accuracy is improved by inferring possible causes based on the Dempster–Shafer evidence theory. The system’s reliability is enhanced by responding differently to different causes. The simulation results demonstrate the effectiveness and scalability of the proposed solution compared to the comparison methods under different faulty node ratios. Moreover, the proposed solution has a relatively low local key overhead, indicating its practicability. In future work, we will consider the impact of changes in bandwidth resources, including the reduction of fault identification accuracy. Additionally, a solution for nodes that dynamically join and exit will be considered to prevent delays in path information dissemination. We will incorporate additional evidence from other intrusion detection systems and reputation systems to analyze network status more comprehensively, and develop a more refined exception response mechanism to further enhance the fault tolerance of SKDN.

Author Contributions

Conceptualization, Z.W. and Y.L.; Funding acquisition, H.D.; investigation, Z.W.; Methodology, Z.W., H.D. and Y.L.; software, Z.W.; supervision, Y.L. and H.D.; validation, Z.W.; writing—original draft preparation, Z.W.; writing—review and editing, Z.W., Y.L. and H.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Strategic Priority Research Program of Chinese Academy of Sciences: Information Collaborative Service and Data Sharing (Grant No. XDA031050100).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Rescorla, E. The Transport Layer Security (TLS) Protocol Version 1.3. IETF RFC 8446. 2018. Available online: https://www.rfc-editor.org/rfc/rfc8446 (accessed on 27 December 2023).
Shor, P.W. Algorithms for quantum computation: Discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, Santa Fe, NM, USA, 20–22 November 1994; IEEE: Piscataway, NJ, USA, 1994; pp. 124–134.
Alléaume, R.; Branciard, C.; Bouda, J.; Debuisschert, T.; Dianati, M.; Gisin, N.; Godfrey, M.; Grangier, P.; Länger, T.; Lütkenhaus, N.; et al. Using quantum key distribution for cryptographic purposes: A survey. Theor. Comput. Sci. 2014, 560, 62–81.
Elliott, C.; Pearson, D.; Troxel, G. Quantum cryptography in practice. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Karlsruhe, Germany, 25–29 August 2003; pp. 227–238.
Liu, W.; Yin, Z.; Chen, X.; Peng, Z.; Song, H.; Liu, P.; Tong, X.; Zhang, Y. A secret key distribution technique based on semiconductor superlattice chaos devices. Sci. Bull. 2018, 63, 1034–1036.
Liu, J.; Xie, J.; Zhang, J.; Liu, B.; Chen, X.; Feng, H. A Secure Secret Key Agreement Scheme among Multiple Twinning Superlattice PUF Holders. Sensors 2023, 23, 4704.
Tong, X.; Chen, X.; Xu, S. Advances in superlattice cryptography research. Chin. Sci. Bull. 2020, 65, 108–116.
Keuninckx, L.; Soriano, M.C.; Fischer, I.; Mirasso, C.R.; Nguimdo, R.M.; Van der Sande, G. Encryption key distribution via chaos synchronization. Sci. Rep. 2017, 7, 43428.
Xu, L.; Wu, H.; Xie, J.; Yuan, Q.; Sun, Y.; Shi, G.; Luo, S. An SSL-PUF Based Access Authentication and Key Distribution Scheme for the Space–Air–Ground Integrated Network. Entropy 2023, 25, 760.
Cao, Y.; Zhao, Y.; Wang, Q.; Zhang, J.; Ng, S.X.; Hanzo, L. The evolution of quantum key distribution networks: On the road to the qinternet. IEEE Commun. Surv. Tutorials 2022, 24, 839–894.
Mehic, M.; Niemiec, M.; Rass, S.; Ma, J.; Peev, M.; Aguado, A.; Martin, V.; Schauer, S.; Poppe, A.; Pacher, C.; et al. Quantum key distribution: A networking perspective. ACM Comput. Surv. (CSUR) 2020, 53, 1–41.
Liu, S.; Jiang, N.; Zhang, Y.; Wang, C.; Zhao, A.; Qiu, K.; Zhang, Q. Secure key distribution based on hybrid chaos synchronization between semiconductor lasers subject to dual injections. Opt. Express 2022, 30, 32366–32380.
Kong, P.Y. Challenges of Routing in Quantum Key Distribution Networks with Trusted Nodes for Key Relaying. IEEE Commun. Mag. 2023, 1–7.
Lenzen, C.; Medina, M.; Saberi, M.; Schmid, S. Robust Routing Made Easy: Reinforcing Networks Against Non-Benign Faults. IEEE/ACM Trans. Netw. 2023, 2023, 1–15.
Luo, Y.; Li, Q.; Mao, H.K.; Chen, N. How to Achieve End-to-end Key Distribution for QKD Networks in the Presence of Untrusted Nodes. arXiv 2023, arXiv:2302.07688.
Avramopoulos, I.; Kobayashi, H.; Wang, R.; Krishnamurthy, A. Highly secure and efficient routing. In Proceedings of the IEEE INFOCOM 2004, Hong Kong, China, 7–11 March 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 1.
Elliott, C.; Colvin, A.; Pearson, D.; Pikalo, O.; Schlafer, J.; Yeh, H. Current status of the DARPA quantum network. In Proceedings of the Quantum Information and Computation III, Orlando, FL, USA, 29–30 March 2005; SPIE: Bellingham, WA, USA, 2005; Volume 5815, pp. 138–149.
Dianati, M.; Alléaume, R.; Gagnaire, M.; Shen, X. Architecture and protocols of the future European quantum key distribution network. Secur. Commun. Netw. 2008, 1, 57–74.
Yang, C.; Zhang, H.; Su, J. The qkd network: Model and routing scheme. J. Mod. Opt. 2017, 64, 2350–2362.
Chakraborty, K.; Rozpedek, F.; Dahlberg, A.; Wehner, S. Distributed routing in a quantum internet. arXiv 2019, arXiv:1907.11630.
Mehic, M.; Fazio, P.; Rass, S.; Maurhart, O.; Peev, M.; Poppe, A.; Rozhon, J.; Niemiec, M.; Voznak, M. A novel approach to quality-of-service provisioning in trusted relay quantum key distribution networks. IEEE/ACM Trans. Netw. 2019, 28, 168–181.
Yao, J.; Wang, Y.; Li, Q.; Mao, H.; El-Latif, A.A.A.; Chen, N. An Efficient Routing Protocol for Quantum Key Distribution Networks. Entropy 2022, 24, 911.
Chen, L.Q.; Zhao, M.N.; Yu, K.L.; Tu, T.Y.; Zhao, Y.L.; Wang, Y.C. ADA-QKDN: A new quantum key distribution network routing scheme based on application demand adaptation. Quantum Inf. Process. 2021, 20, 309.
Chen, L.; Zhang, Z.; Zhao, M.; Yu, K.; Liu, S. APR-QKDN: A Quantum Key Distribution Network Routing Scheme Based on Application Priority Ranking. Entropy 2022, 24, 1519.
Schartner, P.; Rass, S. Quantum key distribution and Denial-of-Service: Using strengthened classical cryptography as a fallback option. In Proceedings of the 2010 International Computer Symposium (ICS2010), Tainan, Taiwan, 16–18 December 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 131–136.
Rass, S.; König, S. Turning Quantum Cryptography against itself: How to avoid indirect eavesdropping in quantum networks by passive and active adversaries. Int. J. Adv. Syst. Meas 2012, 5, 22–33.
Salvail, L.; Peev, M.; Diamanti, E.; Alléaume, R.; Lütkenhaus, N.; Länger, T. Security of trusted repeater quantum key distribution networks. J. Comput. Secur. 2010, 18, 61–87.
Le Quoc, C.; Bellot, P.; Demaille, A. Stochastic routing in large grid-shaped quantum networks. In Proceedings of the 2007 IEEE International Conference on Research, Innovation and Vision for the Future, Hanoi, Vietnam, 5–9 March 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 166–174.
Wen, H.; Han, Z.; Zhao, Y.; Guo, G.; Hong, P. Multiple stochastic paths scheme on partially-trusted relay quantum key distribution network. Sci. China Ser. Inf. Sci. 2009, 52, 18–22.
Wang, J.; Chen, G.; You, J.; Sun, P. Seanet: Architecture and technologies of an on-site, elastic, autonomous network. J. Netw. New Media 2020, 6, 1–8.
Ruijters, E.; Stoelinga, M. Fault tree analysis: A survey of the state-of-the-art in modeling, analysis and tools. Comput. Sci. Rev. 2015, 15, 29–62.
Awerbuch, B.; Curtmola, R.; Holmer, D.; Nita-Rotaru, C.; Rubens, H. ODSBR: An on-demand secure Byzantine resilient routing protocol for wireless ad hoc networks. ACM Trans. Inf. Syst. Secur. (TISSEC) 2008, 10, 1–35.
Bradley, K.A.; Cheung, S.; Puketza, N.; Mukherjee, B.; Olsson, R.A. Detecting disruptive routers: A distributed network monitoring approach. IEEE Netw. 1998, 12, 50–60.
Zhao, Z.; Hu, H.; Ahn, G.J.; Wu, R. Risk-aware mitigation for MANET routing attacks. IEEE Trans. Dependable Secur. Comput. 2011, 9, 250–260.
Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37.
Shu, T.; Krunz, M. Privacy-preserving and truthful detection of packet dropping attacks in wireless ad hoc networks. IEEE Trans. Mob. Comput. 2014, 14, 813–828.
Castro, M.; Liskov, B. Practical byzantine fault tolerance. In Proceedings of the OSDI, New Orleans, LA, USA, 22–25 February 1999; Volume 99, pp. 173–186.
Zhou, H.; Lv, K.; Huang, L.; Ma, X. Quantum network: Security assessment and key management. IEEE/ACM Trans. Netw. 2022, 30, 1328–1339.
Mehic, M.; Maurhart, O.; Rass, S.; Voznak, M. Implementation of quantum key distribution network simulation module in the network simulator NS-3. Quantum Inf. Process. 2017, 16, 253.

Figure 1. SKDN model. Yellow, orange and pink region represent service, network and link layer in SKDN respectively.

Figure 2. Communication key distribution.

Figure 3. A mixed fault scenario.

Figure 4. Overview of the proposed solution.

Figure 5. Route request phase.

Figure 6. Route response phase.

Figure 7. A distribution of logical link resource status.

Cost_{XY}

represents the key supply capability of the logical link between X and Y.

Figure 8. PDR comparison of the proposed solution with OSPF, ASR, and MPCS in the BHA scenario. (a) Number of nodes = 40. (b) Number of nodes = 60. (c) Number of nodes = 80. (d) Number of nodes = 100.

Figure 9. PDR comparison of the proposed solution with OSPF, ASR, and MPCS in the TRA scenario. (a) Number of nodes = 40. (b) Number of nodes = 60. (c) Number of nodes = 80. (d) Number of nodes = 100.

Figure 10. KMU comparison of the proposed solution with OSPF, ASR, and MPCS in the BHA scenario. (a) Number of nodes = 40. (b) Number of nodes = 60. (c) Number of nodes = 80. (d) Number of nodes = 100.

Figure 11. CKR comparison of the proposed solution and the comparison methods in the BHA scenario.

Figure 12. Hop count comparison of the proposed solution and the comparison methods in the BHA scenario.

Table 1. Simulation setup.

Parameter	Value
Waxman parameters	$\alpha = 0.1$, $\beta = 0.1$
Number of nodes	40 / 60 / 80 / 100
Number of source-destination pairs	8 / 12 / 16 / 20
Ratio of faulty nodes	0 / 0.05 / 0.1 / 0.15 / 0.2 / 0.25 / 0.3 / 0.35 / 0.4
Acknowledgment rate threshold	0.5
Error rate	$[0, 0.5]$
Constant bit rate	$[0.5, 1.5]$ Mbps
Piecewise linear function parameters	$a = 0.1$, $b = 0.6$, $c = 0.9$, $d = 0.2$
Importance factors of $e_{1}$, $e_{2}$, $e_{3}$	3, 3, 6
Importance factor of evidence collected in the fault detection process	3
$K_{gen}$, $K_{cur}$, $K_{max}$	$[0.5, 1.5]$ Mbps, $[1, 25]$ MB, 100 MB

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
