1. Introduction
Nowadays, network communication applications are ubiquitous, which gives rise to various security problems. The data receiver wants to obtain all the data content sent by the sender and wants the data to be complete, authentic, and non-repudiable. When the existing network communication methods were first designed, the focus was only on data transmission connectivity, while data transmission security was ignored. This design fundamentally lacks endogenous security mechanisms and is the root cause of security problems such as identity spoofing, address forgery, route hijacking, and denial of service in cyberspace. Moreover, the weak association between the messages in a data stream leads to low reliability of the transmission process.
Traditional network communication methods do not have endogenous security mechanisms such as data integrity verification, so transmitted content is easy to tamper with and forge, and it is difficult to trace the source of an attack and the attacker’s identity. To solve such security problems, the IPSec [1] (IP Security) suite of the network layer is mainly used to perform integrity verification, data encryption, and data source authentication on the transmitted IP datagrams. Nevertheless, IPSec can usually only solve local problems on a regional scale. In particular, implementing IPSec is relatively complex, requiring two stages of negotiation before data transmission. The time and computing resources consumed by each step of the negotiation process are significant [2], which leads to poor deployability. Reference [3] has proposed an attack method for the first-phase authentication process, and the IKE protocol used in the negotiation also has vulnerabilities such as man-in-the-middle attacks [2] and denial-of-service attacks [4]. IPSec includes the Authentication Header (AH) protocol [5] and the Encapsulating Security Payload (ESP) protocol [6]. The AH protocol provides integrity, data source authentication, and anti-replay protection for transmitted messages, and the ESP protocol additionally provides data stream encryption. However, neither protocol can guarantee the non-repudiation of a message, and the two communicating parties cannot effectively synchronize or trace messages.
The latest technologies in vehicular ad hoc networks and the Internet of Things (IoT) provide solutions for traditional networks that lack security and trust mechanisms [7,8,9]. These technologies ensure the authenticity and reliability of the information in the network and the legitimacy of the vehicles disseminating such information. In traditional networks, the authenticity and reliability of packets transmitted between network nodes and the trust between nodes are also crucial. We consider that constructing a “chain” of messages communicated between nodes in a traditional network can provide a secure and reliable mechanism for the network. Lamport first proposed the concept of a “hash chain” to solve the problem of password tampering during transmission [10]. Existing research on the hash chain only constructs various forms of hash chain structures for application-layer data, and these studies make hash chains computationally expensive for security reasons. However, none of these schemes uses a sequence of network communication messages to construct a hash chain, and none provides a synchronization mechanism for network communication messages. Moreover, these schemes all use a hash chain to encrypt data or keys in order to achieve higher security for data content and to prevent encrypted content from being cracked and tampered with; none of them aims to improve the efficiency of secure data transmission.
Contributions
To address the shortcomings of the traditional network communication methods described above, we propose a novel secure communication method based on the message hash chain, referred to as the Message Hash Chain (MHC) method. The main contributions of the proposed MHC method are summarized as follows:
1. The MHC method adopts a new chain transmission method to meet the tamper-resistance, non-repudiation, and higher reliability requirements of multiple messages. The main idea is to iteratively hash the digest of each transmitted message to form a hash chain over the message sequence. The two communicating parties can ensure the integrity, immutability, and synchronization of the message sequence through the hash chain, thereby effectively guaranteeing the security of message transmission.
2. When performing data signature and authentication, both parties only need to sign and authenticate messages at certain intervals and do not need to do so for each message. In this way, the authenticity and non-repudiation of all previously transmitted messages can be ensured, the overhead of signature authentication is reduced, and the efficiency of secure message transmission is greatly improved.
3. The sequence number and node value of the message hash chain in the MHC method provide anti-replay protection and ensure reliability.
2. Related Works
The method proposed by Lamport applies a hash function to the password iteratively many times, and the verifier can verify the entire sequence of hash values using only the result of the most recent iteration.
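The idea can be illustrated with a minimal sketch of a Lamport-style one-time password chain. SHA-256, the chain length, and the names `hash_n` and `Verifier` are assumptions chosen for illustration and are not taken from [10].

```python
import hashlib


def hash_n(seed: bytes, n: int) -> bytes:
    """Apply SHA-256 to `seed` n times (hypothetical helper)."""
    value = seed
    for _ in range(n):
        value = hashlib.sha256(value).digest()
    return value


class Verifier:
    """Stores only the most recently accepted chain value."""

    def __init__(self, anchor: bytes):
        self.last = anchor

    def check(self, candidate: bytes) -> bool:
        # A candidate is accepted only if hashing it once yields the stored value.
        if hashlib.sha256(candidate).digest() == self.last:
            self.last = candidate  # move one step back along the chain
            return True
        return False


secret = b"correct horse"
n = 5
verifier = Verifier(hash_n(secret, n))            # verifier holds H^5(secret)
first_ok = verifier.check(hash_n(secret, n - 1))  # prover reveals H^4(secret)
replay_ok = verifier.check(hash_n(secret, n - 1)) # replaying the same value
second_ok = verifier.check(hash_n(secret, n - 2)) # prover reveals H^3(secret)
```

In this sketch a replayed value is rejected because the verifier has already moved its stored value one step back along the chain.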
Based on Lamport’s method, Chung et al. [11] proposed the star chaining technique and the tree chaining technique. The star chaining technique can verify each packet individually and can tolerate any degree of packet loss. The tree chaining technique can be regarded as a multi-layer star chaining technique. Although the tree chaining scheme achieves a smaller communication load than a star hash chain, it suffers from sender delay, requires buffering of packets before sending, and carries less payload.
Golle [12] proposed a hash chain with high performance and a high proportion of payload, but its biggest flaw is that it cannot avoid the risk of chain disconnection when the chain contains too many packets.
Liu [13] proposed a hash pre-streaming data signature scheme. The basic idea is to divide a long sequence into m subsequences and use the hash pre-streaming data signature scheme to sign the first packet of each of the m subsequences. At the same time, a buffer dedicated to storing the hash values and signatures of the n packets in a subsequence is added to the server.
Zhang et al. [14] proposed a butterfly-graph-based stream authentication scheme with advantages in payload, packet authentication probability, and packet loss tolerance. However, compared with other hash chain structures, this method needs to run the hash function many times, making it less efficient.
Miller et al. [15] improved the scheme proposed by Zhang. Although the security of the hash chain and the probability of data packet authentication were strengthened, the complex structure led to a further decrease in operating efficiency.
The hash chain-based authentication protocol proposed by Liu [16] calculates a continuous hash chain by performing multiple hash function calculations on the hash value of the data payload. Although the most notable feature of this authentication protocol is that it can resist replay attacks, it still cannot guarantee the non-repudiation of each packet.
Huang et al. [17] used different hash functions to iterate keys multiple times and obtained a hash chain authentication scheme for message integrity verification. However, the order of the hash functions in this scheme needs to be kept secret.
References [18,19] propose self-updating hash chains and optimized tree hashing, respectively. These two hash chain structures improve security and packet loss tolerance over the original structures. Still, their overall operating efficiency is not much different from, or is even slightly worse than, that of the original structures.
The concept of “hash chain” is currently widely studied in application fields such as the Internet of Things, autonomous driving protocols, data security, and lightweight transmission protocols. Hakeem et al. [20,21] proposed a hash chain-based V2X security protocol and a key generation and management protocol. The primary method uses the hash function to iterate the generated key many times, which realizes highly secure message authentication in V2X devices at a low cost. At the same time, it can solve the key update problem of remote WANs and can resist key leakage attacks and replay attacks. Huang et al. [22] proposed a hash chain-based data availability monitoring method, which applies the hash chain to a distributed system to solve the data consistency problem in the system. Kim et al. [23] proposed a lightweight authentication scheme applied to military networks. This scheme combines the hash chain with the one-time password, which ensures the integrity of the transmitted content and reduces the number of network transactions. Luo et al. [24] improved the blockchain consensus algorithm by using the hash chain to realize the recording and verification of blocks.
4. Construction Method of Message Hash Chain
Consider two communicating parties, A and B, where sender A transmits a message stream to receiver B. Each message consists of three parts: the content of the message sent by sender A in sequence, the sequence number of the message, and a chain node value that sender A calculates from the message content and the tail node of its message hash chain using the construction method of the chain. When receiver B receives a message, it uses the received content and its local tail node of the message hash chain to recalculate the node value for verification.
The construction method of the message hash chain is shown in Figure 3. The communication node calculates the first node value of the message hash chain from the first message by performing two hash function calculations on that message. After that, for each message, the node calculates a digest using a hash function and then splices this digest with the tail node of the message hash chain to calculate the corresponding node value of the chain.
The last node of each message hash chain is called the tail node, and the other nodes are called intermediate nodes. The sender makes the node corresponding to the most recently sent message the new tail node, and the original tail node becomes an intermediate node. The receiver verifies the messages in sequence and uses each successfully verified message to construct a node of the message hash chain. It then makes this newly constructed node the tail node, and the previous tail node becomes an intermediate node.
The iterative process of the message hash chain node is shown in Algorithm 1. Algorithm 1 uses two helper functions: one obtains the source and destination addresses from the message header, and the other matches the message hash chain between those two addresses.
Algorithm 1HC_Iteration |
Input: Header content, payload, node value of the message hash chain. Output: A new node value of the chain. - 1:
- 2:
. - 3:
. - 4:
return
|
The message hash chain construction algorithm is shown in Algorithm 2.
Algorithm 2 The construction process of the message hash chain |
- 1:
“” - 2:
for is not empty do - 3:
- 4:
- 5:
end for
|
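The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: SHA-256 as the hash function, a digest computed over the address pair plus the payload, the splice order "tail node followed by digest", and the names `chains` and `hc_iteration` are all hypothetical choices, not the exact definitions of Algorithms 1 and 2.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used throughout the sketch (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


# One chain per (source, destination) address pair; the last element is the tail node.
chains = {}


def hc_iteration(src: str, dst: str, payload: bytes) -> bytes:
    """Compute the next node value and append it as the new tail node."""
    chain = chains.setdefault((src, dst), [])      # match the chain between two addresses
    digest = h(f"{src}>{dst}|".encode() + payload)  # digest of the message (assumed layout)
    if not chain:
        node = h(digest)               # first node: two hash calculations on the message
    else:
        node = h(chain[-1] + digest)   # splice the tail node with the digest, then hash
    chain.append(node)                 # the new node becomes the tail node
    return node


# Construct the chain for a short message sequence from A to B.
sender_nodes = [hc_iteration("A", "B", m) for m in (b"m1", b"m2", b"m3")]
```

Because every node depends on the previous tail node, the value in `sender_nodes[-1]` commits to the whole sequence sent so far.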
The two communicating parties update the message hash chain every time they construct a message hash chain node. If a message hash chain node is constructed at a specific time t, then the complete message hash chain expression at time t is as follows:
6. Chain Signature
We improved the chain signature scheme previously proposed in [25] to achieve higher security and efficiency. As defined in Section 7, the digital signature scheme is an additional option of the message hash chain verification scheme, enabling the MHC method to guarantee the authenticity and non-repudiation of data. In this way, the scheme can verify all previous messages with only one signature, dramatically improving signature and authentication efficiency. Suppose the sender has a message and its corresponding message hash chain node. If the sender reaches the signature interval, or chain-signs the message as required, it first calculates the signature and then sends the encapsulated message to the receiver. If the receiver can successfully verify, in turn, the message hash chain node value and the signature carried in the encapsulated message, that is, if both verification equations hold, then the non-repudiation of all previously transmitted messages is guaranteed.
6.1. Chain Signature Process
Algorithm 3 shows the process of chain signature for both parties in communication. The messages transmitted by the two communicating parties include messages with a signature and messages without a signature, and the chain signature interval is d. In the process, the communication node constructs the message hash chain and transmits the messages synchronously, e.g., the node encapsulates the sequence number and the chain node value constructed from a message into that message and sends it to the destination. For the security of the message hash chain, the sender chain-signs a message when the signature counter reaches d or when necessary, e.g., after the sender signs one message, it next needs to sign only the message that is d messages later. A message with a signature carries the signature in addition to the fields of a message without a signature. The process by which the sender encapsulates messages is shown in Algorithm 3. The parameters used in Algorithm 3 are described below:
: Current signature interval.
: A signature is required when the sender’s signature interval reaches .
: The function that encapsulates parameters as header of message hash chain.
: The signature function described in Section 7.
: Sender’s private key.
Algorithm 3 Message Hash Chain Encapsulates Message Header
Input: Header content, signature interval, payload.
Output: Encapsulated MHC datagram.
1: According to the content of the message, the payload, and the tail node of the message hash chain, a node value of the chain is generated.
2: The sender inserts the new node at the end of the message hash chain.
3: The sender updates the message hash chain tail node.
4: if the signature counter has not reached the signature interval then
5:   The sender encapsulates the header.
6: else
7:   The sender computes the signature.
8:   The sender encapsulates the header together with the signature.
9: end if
10: return p
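A minimal sketch of the sender side, under several assumptions, is given below. In particular, `toy_sign` uses an HMAC from the Python standard library purely as a stand-in: an HMAC is a symmetric primitive and does not provide non-repudiation, so a real deployment would use an asymmetric signature with the sender's private key. Signing the node value (rather than some other field) and the names `Packet` and `Sender` are also assumptions for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Optional


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toy_sign(key: bytes, data: bytes) -> bytes:
    # Stand-in only: HMAC is NOT a digital signature and gives no non-repudiation.
    return hmac.new(key, data, hashlib.sha256).digest()


@dataclass
class Packet:
    seq: int
    payload: bytes
    node: bytes
    signature: Optional[bytes] = None  # present only on chain-signed messages


class Sender:
    def __init__(self, key: bytes, interval: int):
        self.key = key
        self.interval = interval  # chain signature interval d
        self.counter = 0          # messages sent since the last signature
        self.seq = 0
        self.tail = None          # tail node of the message hash chain

    def encapsulate(self, payload: bytes) -> Packet:
        digest = h(payload)
        node = h(digest) if self.tail is None else h(self.tail + digest)
        self.tail = node          # the new node becomes the tail node
        self.seq += 1
        self.counter += 1
        if self.counter < self.interval:
            return Packet(self.seq, payload, node)   # message without a signature
        self.counter = 0
        # Signing the node value covers every message chained before it.
        return Packet(self.seq, payload, node, toy_sign(self.key, node))


sender = Sender(b"demo-key", interval=3)
packets = [sender.encapsulate(f"msg{i}".encode()) for i in range(1, 7)]
signed_seqs = [p.seq for p in packets if p.signature is not None]
```

With an interval of 3, only every third message carries a signature, which is where the reduction in signing overhead comes from.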
6.2. Chain Authentication Process
Algorithm 4 shows the process of chain authentication of the message by the receiver. For messages without a signature, the receiver needs first to determine whether the sequence number of the messages is legal and then authenticate the node value of the message hash chain of the messages. For messages with a signature, the receiver needs to authenticate the signature and verify the sequence number of the messages and the node value of the message hash chain. The parameters used in Algorithm 4 are described below:
: Sequence number counter.
: Sender’s public key.
: Message hash chain node value used for verification.
Algorithm 4 The Receiver Verifying the Received Messages
Input: Messages.
Output: Verification status.
1: Obtain header information, message sequence number, payload, and node value of the chain from the message.
2: if the sequence number is not the next one in sequence then
3:   return -1. # A value of “-1” indicates that the sequence number is not sequential.
4: end if
5: if the message carries a signature then
6:   Verify the signature with the sender’s public key.
7:   if the signature verification fails then
8:     return -2. # A value of “-2” indicates an error in signature verification.
9:   end if
10: else
11:   Compute the message hash chain node value used for verification.
12:   if the computed node value equals the node value carried in the message then
13:     The receiver inserts the node at the end of the message hash chain and updates the message hash chain tail node.
14:     return 0. # A value of “0” indicates that the authentication of the message is successful.
15:   else
16:     return -3. # A value of “-3” indicates an error in message hash chain verification.
17:   end if
18: end if
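The receiver-side checks can be sketched as follows. The sketch follows the prose description above, so a signed message has both its signature and its node value checked. As before, the HMAC-based `toy_sign` and `toy_verify` are stand-ins for an asymmetric signature scheme, and `Receiver` and `make_packets` are hypothetical names introduced only for this example.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toy_sign(key: bytes, data: bytes) -> bytes:
    # Stand-in only: a real scheme signs with a private key and verifies with a public key.
    return hmac.new(key, data, hashlib.sha256).digest()


def toy_verify(key: bytes, data: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(toy_sign(key, data), sig)


class Receiver:
    def __init__(self, key: bytes):
        self.key = key
        self.expected_seq = 1   # sequence number counter
        self.tail = None        # local tail node of the message hash chain

    def verify(self, seq, payload, node, signature=None) -> int:
        if seq != self.expected_seq:
            return -1           # sequence number is not sequential
        if signature is not None and not toy_verify(self.key, node, signature):
            return -2           # error in signature verification
        digest = h(payload)
        check = h(digest) if self.tail is None else h(self.tail + digest)
        if not hmac.compare_digest(check, node):
            return -3           # error in message hash chain verification
        self.tail = check       # the verified node becomes the new tail node
        self.expected_seq += 1
        return 0                # authentication succeeded


def make_packets(key, payloads, interval):
    """Hypothetical sender used only to produce test input."""
    packets, tail = [], None
    for seq, payload in enumerate(payloads, start=1):
        digest = h(payload)
        tail = h(digest) if tail is None else h(tail + digest)
        sig = toy_sign(key, tail) if seq % interval == 0 else None
        packets.append((seq, payload, tail, sig))
    return packets


key = b"demo-key"
pkts = make_packets(key, [b"m1", b"m2", b"m3"], interval=3)

ok_codes = [Receiver.verify(rx, *p) for rx in [Receiver(key)] for p in pkts]
skipped = Receiver(key).verify(*pkts[1])                       # seq 2 arrives first
tampered = Receiver(key).verify(1, b"evil", pkts[0][2], None)   # payload modified
rx = Receiver(key)
rx.verify(*pkts[0])
rx.verify(*pkts[1])
bad_sig = rx.verify(pkts[2][0], pkts[2][1], pkts[2][2], b"\x00" * 32)
```

The four scenarios at the end exercise each of the return codes used in Algorithm 4.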
7. Safety Analysis
The necessary definitions for proving the security of the message hash chain are given below.
Definition 1. If there is always a for all e such that when , then is said to be a negligible value with μ as the parameter.
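For reference, Definition 1 corresponds to the standard notion of a negligible function. One common textbook statement is written out below; the symbols $\varepsilon$ and $\mu_0$ are assumptions chosen for illustration and may differ from the notation used elsewhere in this paper.

```latex
\forall e > 0 \;\; \exists \mu_0 \;\; \forall \mu > \mu_0 : \quad \varepsilon(\mu) < \frac{1}{\mu^{e}}
```

Under this reading, $\varepsilon(\mu)$ eventually falls below every inverse polynomial in the parameter $\mu$.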
Definition 2. Note that H is the set of all hash functions, and h is a hash function. If h can find a, b, , in polynomial time, then it is considered that h will have a hash collision. For , if the probability of hash collision in h is equal to , i.e., the probability of hash collision in h is negligible, then H is a non-collision hash function set.
Definition 3. Denote a digital signature scheme triple , which satisfies:
represents the asymmetric key generation algorithm. For the key pair , is the private key of the signature, and is the public key of the signature.
is the signature algorithm of the digital signature scheme. For the communication transmission sequence , there is on a certain segment of data transmitted, where q is a positive integer, and come from .
is the verification algorithm of the digital signature scheme. For the digital signature of a certain segment of data and the generated by , there is always .
Definition 4. For the digital signature scheme , if only the cannot forge the of the scheme in polynomial time, then the scheme is secure.
Definition 5. The message hash chain verification scheme takes the digital signature scheme as an option. On the basis of , it also satisfies:
is the construction algorithm of the message hash chain. For the transmission sequence , there is = , where come from , respectively.
After receiving the sequence and the encapsulated in it, the receiver also constructs a message hash chain node = for the received sequence through , and there is , , .
Theorem 1. The messages between two messages authenticated by chain signature also have authenticity and non-repudiation.
is a secure digital signature scheme, h is a known hash function, and the probability of hash collision at h is less than , i.e., h is a non-collision hash function. In this case, if the digital signatures of and can be verified successfully and satisfy , then , , can verify their authenticity and non-repudiation through message hash chain verification.
Proof of Theorem 1. It is assumed that the message hash chain verification scheme is insecure. This means that under the condition that is a secure digital signature scheme and h is a non-collision hash function, the message hash chain verification cannot guarantee the authenticity and non-repudiation of the message sequence between the message and the message that can be successfully verified by . Then there is an attacker who uses algorithm to forge the scheme, and obtains the signature sequence transmitted by the victim and the message hash chain node value sequence according to the victim’s , where , , and .
Then the scheme can output a valid signature sequence and message hash chain node sequence:
Specifically, algorithm uses to generate a pair of , and then uses to construct the message hash chain nodes of all message sequences . Finally, encapsulate them into the message sequence of the message hash chain, and sign with . The final output of algorithm is and .
According to the assumptions, the signed and verified messages satisfy and , . For , , only uses the message hash chain verification instead of the digital signature verification. Although the attacker cannot forge in , it can forge its as . From , the following two situations will inevitably occur.
. Obviously if , where h is a non-collision hash function, then is obtained. Next, algorithm can output the message hash chain sequence and finally get . If there should be , but , it means that algorithm can forge digital signature scheme. However, it obviously contradicts the assumption that is a secure digital signature scheme.
. Knowing that , there must be that can recursively get . For the message hash chains and digital signatures at both sides of the transmission corresponding to , they satisfy the relational expressions and . If , then the receiver can use the message hash chain to verify the authenticity and non-repudiation of , and then use the message hash chain to verify the authenticity and non-repudiation of in a recursive way, which contradicts the initial assumption.
In summary, the initial assumption does not hold. It means that under the condition that is a secure digital signature scheme and h is a non-collision hash function, the message hash chain verification scheme is secure. Therefore, the authenticity and non-repudiation of the data flow between two digital signature intervals can be ensured by using the message hash chain verification. □
Theorem 2. Through the chain signature and authentication of a message, all messages in the previous sequence of this message can be verified.
Under the same conditions as Theorem 1, the receiver verifies the digital signature of a message in the data stream. If , , then can judge its own authenticity and non-repudiation according to the correctness of ’s digital signature.
Proof of Theorem 2. There is a sequence , the sender will sign the , and the receiver will verify the signature. Suppose there is an attacker who can use algorithm to forge the node value of the message hash chain. This means that for the message hash chain sequences and constructed by , the algorithm can output the forged message hash chain node sequence according to , and make . In the absence of an attacker, when the receiver receives the , the verification of the signature must satisfy . If algorithm can output to satisfy , then it means that algorithm can forge scheme , but this obviously contradicts the assumption. This shows that if can be verified by digital signature, then also has authenticity and non-repudiation; otherwise, do not have authenticity and non-repudiation. □
Theorem 3. The message hash chain can ensure the integrity and immutability of the data flow.
Proof of Theorem 3. A message and its corresponding node value of message hash chain are jointly encapsulated into a message hash chain message , where includes the source address , destination address and other contents of the message header . Obviously, the equation can be obtained from the Formula (1). If any content of the message hash chain message is tampered with by an attacker, and the tampered values are , and , respectively, then must occur when the receiver verifies it. □
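The effect described in Theorem 3 can be observed with a small experiment. The sketch below assumes SHA-256 and the splice order "tail node followed by digest"; `chain_nodes` is a hypothetical helper, not a function defined in this paper.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain_nodes(messages):
    """Return the node value after each message in the sequence."""
    nodes, tail = [], None
    for m in messages:
        digest = h(m)
        tail = h(digest) if tail is None else h(tail + digest)
        nodes.append(tail)
    return nodes


original = chain_nodes([b"pay 10", b"pay 20", b"pay 30"])
tampered = chain_nodes([b"pay 10", b"pay 99", b"pay 30"])

# Index of the first node value that differs between the two chains.
first_difference = next(
    i for i, (a, b) in enumerate(zip(original, tampered)) if a != b
)
```

Changing the second message alters its node value and every node value after it, so a receiver holding the original chain would reject the tampered message and all messages chained behind it.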
8. Reliability Analysis
The MHC method requires a sequence number because the node values of the message hash chain must be calculated in strict order when constructing the chain. The sequence number in the message hash chain differs from that in IPSec: in IPSec the sequence number field is an optional field mainly used to provide anti-replay services, whereas in the MHC method it is a mandatory field, and each node of the message hash chain must be constructed according to the sequence number. After IPSec establishes an SA for the first time, or when the SA reaches the end of its life cycle and renegotiates parameters, it clears the sequence number stored in the SA and then counts each message incrementally. The sequence number of the message hash chain, by contrast, carries over from previous values and is not cleared, and the verification of each message must check whether the sequence number increases in order. The reliability of the message hash chain is mainly reflected in the fact that the receiver not only compares the sequence number to judge whether it increases in order, but also verifies the integrity, authenticity, and non-repudiation of the whole message through the message hash chain verification scheme, and uses the sequence number to complete packet loss retransmission, chain synchronization, and timely error detection.
8.1. Packet Loss Retransmission
If there is a data stream communication between the two communicating parties through the MHC method, the data stream sent by A to B, each packet is , where = . The message hash chain constructed by the sender is , and the chain constructed by the receiver is . Under the condition that the network has the possibility of packet loss, the following two situations must occur:
At least, there is a possibility that it is greater than , and the P received by B arrives in order, then the message hash chain constructed by B through P satisfies .
At least, there is a possibility that it is greater than , and the data stream received by B may arrive out of sequence or lose packets. Assume that at a certain time , the sequence number corresponding to the sender’s tail node is , and the sequence number corresponding to the tail node used by the receiver for verification is , . At this time, if the sender sends a new message to reach B, and its corresponding sequence number , then set the message retention time for . Subsequently, at time , where , if the message hash chain of the has not been successfully verified, the will be discarded, and the sender will request the following message corresponding to the sequence number of the current tail node of the chain. In contrast, if the chain of the can be successfully verified and the corresponding message hash chain is constructed at the receiver, the verification of the message corresponding to the last sequence number is continued.
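One possible reading of the second case is sketched below: the receiver holds a message that arrives ahead of its turn for a retention time, verifies it once the gap is filled, and otherwise discards it and asks for the message that follows the current tail node. The class `ReorderingReceiver`, the explicit `now` timestamps, and the choice to return the sequence numbers to re-request are assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class ReorderingReceiver:
    def __init__(self, retention: float):
        self.retention = retention  # message retention time
        self.tail = None            # local tail node
        self.tail_seq = 0           # sequence number of the local tail node
        self.held = {}              # seq -> (payload, node, deadline)

    def _try_append(self, payload: bytes, node: bytes) -> bool:
        digest = h(payload)
        check = h(digest) if self.tail is None else h(self.tail + digest)
        if check != node:
            return False
        self.tail, self.tail_seq = check, self.tail_seq + 1
        return True

    def receive(self, seq, payload, node, now):
        """Process one message; return sequence numbers to request again."""
        if seq > self.tail_seq + 1:
            # Arrived ahead of its turn: hold it until the retention time expires.
            self.held[seq] = (payload, node, now + self.retention)
        elif seq == self.tail_seq + 1 and self._try_append(payload, node):
            # The gap may now be filled, so verify held messages in order.
            while self.tail_seq + 1 in self.held:
                p, n, _ = self.held.pop(self.tail_seq + 1)
                if not self._try_append(p, n):
                    break
        return self.expire(now)

    def expire(self, now):
        expired = [s for s, (_, _, d) in self.held.items() if d <= now]
        for s in expired:
            del self.held[s]        # discard messages held for too long
        return [self.tail_seq + 1] if expired else []


# Build three chained messages (seq, payload, node).
msgs, tail = [], None
for seq, payload in enumerate([b"m1", b"m2", b"m3"], start=1):
    tail = h(h(payload)) if tail is None else h(tail + h(payload))
    msgs.append((seq, payload, tail))

rx = ReorderingReceiver(retention=2.0)
rx.receive(*msgs[0], now=0.0)
rx.receive(*msgs[2], now=1.0)                 # seq 3 arrives before seq 2 and is held
late_requests = rx.receive(*msgs[1], now=2.0) # seq 2 fills the gap in time

rx_loss = ReorderingReceiver(retention=2.0)
rx_loss.receive(*msgs[0], now=0.0)
rx_loss.receive(*msgs[2], now=1.0)
loss_requests = rx_loss.expire(now=5.0)       # seq 2 never arrived
```

In the first run the held message is verified as soon as the missing one arrives; in the second run it is discarded after the retention time and the message after the current tail node is requested again.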
8.2. Error Detection and Correction
The error detection function of the MHC method mainly relies on the chain signature and chain synchronization mechanisms. It verifies the integrity, authenticity, and non-repudiation of messages in real time by comparing the node values of the message hash chain and by signing and authenticating the chain at intervals. If an attacker tampers with or forges any message content, the verification of the node value and signature of the message hash chain will fail. Both communicating parties should then re-request the message that failed verification within a limited time, so that the data flow achieves higher reliability, or block the illegal message sender to reduce network security risk.