1. Introduction
Advances in globally distributed computing capacity have enabled a range of technologies, among which blockchain is one of the most significant. The idea of a blockchain came to popular awareness in 2008 when Nakamoto [
1] proposed that it be used with Bitcoin, the first cryptocurrency to use blockchain technology to record transactions. Blockchain is a kind of digital ledger technology that is based on distributed computing and decentralized data-sharing environments for recording network and user transactions [
2]. A key feature of blockchain is that it provides historical records of transactions that are publicly recorded in chronological order. This transparency is regarded as an incentive to cooperate. The fundamental concept underlying blockchain networks is decentralization, which means that virtually anybody anywhere in the world can download the code and begin “mining” cryptocurrencies such as Bitcoin [
3]. Ethereum is a blockchain-based, time-stamped state machine designed to efficiently execute smart contracts [
4]. These smart contracts enable various types of transactions and processes to be completed on the blockchain network [
5,
6]. Blockchain offers several distinctive characteristics, such as decentralization, security, privacy, speed, transaction and data integrity, a trusted platform, flexibility, and security code functionality. For these reasons, blockchain suits a variety of use cases beyond cryptocurrencies. It has now been applied in many scenarios, including the Internet of Things (IoT), healthcare, self-governance, government, renewable power grids, and global shipping [
7]. Reported benefits include reduced emissions and quicker, cheaper financial transactions. Thanks to these characteristics, blockchain has recently attracted considerable attention, and researchers have proposed opportunities for integrating it into the IoT environment. The IoT is a system of interrelated computing devices that can connect to other devices and equipment, including sensors that gather data from the surroundings and actuators that act on the environment, with the processed data communicated over the Internet through network nodes [
8].
Today, a vast amount of data is synchronized across different devices through Internet connectivity [
9]. These data are created across different sectors and applications. The IoT is expanding rapidly and is being represented in various applications such as smart buildings, urban development, smart transportation, smart healthcare, and smart industry [
10]. When devices are connected to the Internet, security concerns arise, particularly those related to privacy, and with the increasing complexity of the Internet infrastructure, these devices become more vulnerable to security breaches and intrusions than other devices such as smartphones and computers. Because IoT devices have limited resources, complex security arrangements cannot be implemented easily, which creates an opportunity for IoT devices to be maliciously attacked [
11].
According to some estimates, the number of IoT devices will surpass 39 billion by 2033. This is approximately 2.5 times the IoT installed base in 2022 [
12], and this growth indicates the increasing adoption and integration of IoT into various aspects of daily life. Blockchain deployment in the IoT architecture is expected to offer several advantages, such as improved speed, traceability, and reliability; enhanced security and privacy; reduced costs; and eliminated single points of failure [
13,
14]. The primary challenge in integrating blockchain into an IoT architecture is ensuring compatibility between the two domains: blockchain requires substantial resources for communicating with devices and for performing validation and mining operations, whereas many IoT devices fall short in processing power, storage, and energy [
15]. Another challenge is that IoT devices often have poor connectivity due to limited battery capacity, while blockchain is typically designed for stable network connections [
16].
Consensus in blockchain technology ensures that participants agree on the validity of transactions and on the order in which they are added to the blockchain [
17]. Consensus protocols are therefore crucial for maintaining the integrity of the blockchain ledger. Mining creates new blocks and appends them to the blockchain. Miners use their computational power to solve complex mathematical puzzles, and the first one to solve the puzzle proposes a new block of transactions to be added to the chain. This process is resource-intensive and requires a consensus mechanism to ensure that all participants agree on the validity of the proposed block. Hash functions are cryptographic algorithms that play a critical role in blockchain processes, such as generating and validating transaction signatures. After a block is generated, a hash function assigns it a unique cryptographic hash value, a kind of digital fingerprint that confirms the integrity of that block [
18]. The Merkle root value is also computed by hash functions. The Merkle root is a summary hash that represents all the transactions within a block. It provides a compact and efficient way to verify the contents of a block without needing to process each transaction individually [
19].
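The Merkle root computation described above can be sketched as follows. This is a minimal illustration, not the construction of any particular blockchain: it assumes single SHA-256 hashing and the common convention of duplicating the last node when a level has an odd number of entries, and the function name `merkle_root` is hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Compute a summary hash over all transactions in a block.

    Assumption: leaves are hashed once with SHA-256, and an odd level is
    padded by repeating its last node.
    """
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd level by repeating the last node
        # Hash each adjacent pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
    print(merkle_root(txs).hex())
```

Changing any single transaction changes the root, which is why a verifier only needs the root stored in the block header to detect tampering with the block contents.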
Blockchain implements cryptographic algorithms under specific computing conditions and with high power consumption, which is a challenge for resource-constrained IoT devices when cryptographic algorithms are implemented in blockchain applications for these devices [
20]. Cryptographic algorithms used in blockchain are classified into two major branches: symmetric and asymmetric encryption algorithms. The asymmetric encryption algorithms, like the RSA (Rivest–Shamir–Adleman) algorithm [
21], are indispensable for ensuring the requisite level of security and privacy in the blockchain. RSA is a well-recognized asymmetric algorithm that is highly secure yet not well suited to IoT devices. A further limitation is its key size: a key length of 2048 bits is generally regarded as secure, which is demanding for constrained devices. Elliptic curve cryptography (ECC) [
22] has superseded RSA in this setting because it integrates efficiently with IoT devices, consuming less energy and performing better than RSA.
Because of the limited resources available in IoT devices, consensus protocols such as Proof of Work (PoW) do not work well. In PoW protocols, the miner is selected through the mining process, which requires significant computational effort to compute the hash [
23]. Bitcoin uses SHA-256, a 256-bit secure hash algorithm that belongs to the SHA-2 family. The efficacy of SHA-256 has been assessed by a multitude of researchers across a spectrum of IoT devices [
24], and it was concluded that its large size makes it undesirable for use in size- and power-constrained environments such as the IoT. For IoT devices, more efficient and compatible ciphers such as Lesamnta-LW and Simon [
25] have been designed. A number of alternative hash functions have been proposed for use in blockchain-based applications, including Scrypt [
26], the X11 [
27] mining algorithm, and Blake [
28], which are promoted as faster and requiring less power to mine. These hash functions differ from one another, and the choice of hash function in turn affects latency and network availability during mining. A poorly chosen hash function may increase the delay of updates within the blockchain and thus prevent the network from being efficient or promptly available [
29].
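To make the mining cost concrete, the following minimal sketch searches for a nonce whose hash falls below a target. It is an illustration only, not the Bitcoin protocol: the header layout, the 8-byte nonce encoding, and the names `mine` and `difficulty_bits` are assumptions. The hash function is a parameter so that different functions available in Python's `hashlib` can be swapped in.

```python
import hashlib


def mine(header: bytes, difficulty_bits: int, hash_name: str = "sha256",
         max_nonce: int = 2**32):
    """Search for a nonce such that H(header || nonce) is below a target.

    Assumption: the target requires `difficulty_bits` leading zero bits,
    so the expected number of attempts is about 2**difficulty_bits.
    """
    digest_bits = hashlib.new(hash_name).digest_size * 8
    target = 1 << (digest_bits - difficulty_bits)
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(8, "big")
        digest = hashlib.new(hash_name, candidate).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no valid nonce found within max_nonce attempts


if __name__ == "__main__":
    for name in ("sha256", "blake2s", "sha3_256"):
        print(name, mine(b"example block header", 12, hash_name=name))
```

Each additional difficulty bit doubles the expected number of hash evaluations, which is why the per-hash cost on a constrained device dominates the feasibility of PoW-style mining.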
Several solutions have been proposed for the problems of blockchain-based IoT devices. They cover areas such as authentication mechanisms, consensus protocols, and lightweight architectures. While these proposals explore lightweight hash function alternatives for blockchain-based IoT devices, they offer little to no evaluation of the selected approach [
30]. Rather, they address energy-aware methods or the energy efficiency of hashing. Hash functions are an important part of blockchain operation, so the cost of hashing must be considered. For blockchain applications on IoT devices, energy efficiency is one performance metric a hash function should satisfy, but other metrics such as resource utilization, speed, and throughput should also be considered when selecting and evaluating hash functions [
31,
32].
Despite the advancements and application-specific advantages of existing lightweight hash functions, several challenges persist in achieving seamless integration between IoT devices and blockchain systems. These challenges include computational complexity, energy consumption, and compatibility with resource-constrained devices, as discussed above. Addressing these limitations requires innovative approaches to hashing that prioritize both performance and security while remaining suitable for the unique requirements of IoT environments. To address these challenges, this study introduces a novel hashing algorithm called HashLEA (Hash Lightweight Encryption Algorithm). Designed specifically for IoT–blockchain integration, HashLEA leverages the lightweight LEA cipher and incorporates efficient block processing techniques.
Blockchain-based IoT applications, such as supply chain monitoring or smart grid management, require efficient and secure hashing mechanisms to ensure data integrity and performance. The proposed HashLEA algorithm addresses these challenges through its lightweight design and high performance.
Figure 1 demonstrates an example of how HashLEA can be practically applied in a supply chain monitoring scenario, ensuring data security and integrity across the entire chain. Further technical details and algorithmic structure are provided in
Section 4. This figure illustrates a practical use case of HashLEA in supply chain monitoring. The algorithm ensures secure and efficient data handling across various stages of the supply chain, validating and aggregating information to maintain data integrity.
The remainder of the paper is structured as follows.
Section 2 reviews the related literature.
Section 3 provides an overview of blockchain technology, IoT applications, and hash functions and their key properties.
Section 4 describes the detailed structure and design of the proposed HashLEA function.
Section 5 presents the security analysis and performance evaluations, comparing HashLEA with existing hash algorithms. Finally,
Section 6 concludes the paper and outlines potential directions for future work.
2. Related Works
Recently, blockchain technology has attracted a lot of interest from researchers due to its unique characteristics and wide range of applications. As a foundation for secure, decentralized systems, blockchain is built on strong cryptographic components, including hash functions. Given the rapid progress and challenges in this area, different studies have been conducted to explore various aspects of hash functions for blockchain systems, such as architecture, performance, and cryptographic mechanisms. The following is a list of studies in this area in the literature.
Wang et al. [
29] conduct an experimental investigation into how the choice of hash functions affects blockchain performance. They replace the default SHA-3 implementation in Ethereum with alternative hash functions, including BLAKE2 and SHA-2, and evaluate their effect on metrics such as throughput, latency, and scalability. The study highlights that faster hash functions significantly improved throughput but had mixed effects on latency depending on network conditions and workloads.
Upadhyay et al. [
33] analyze the avalanche effect in 16 cryptographic hash functions, including MD5, SHA-1, and BLAKE, by assessing metrics such as the strict avalanche criterion and the bit independence criterion. The study ranks the hash functions according to how well they satisfy these criteria and evaluates their randomness using the randomness toolkit developed by NIST. The findings highlight the importance of the avalanche effect in ensuring the security of cryptographic systems.
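As a rough illustration of how the avalanche effect can be measured, the sketch below flips each input bit in turn and records the fraction of output bits that change. It is not the procedure used in the cited study; the helper names `avalanche_ratio` and `mean_avalanche` and the choice of a fixed 64-byte test message are assumptions. A well-diffusing hash function is expected to give a mean close to 0.5.

```python
import hashlib


def avalanche_ratio(message: bytes, bit_index: int, hash_name: str = "sha256") -> float:
    """Fraction of digest bits that change when one input bit is flipped."""
    flipped = bytearray(message)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    h1 = hashlib.new(hash_name, message).digest()
    h2 = hashlib.new(hash_name, bytes(flipped)).digest()
    diff = int.from_bytes(h1, "big") ^ int.from_bytes(h2, "big")
    return bin(diff).count("1") / (len(h1) * 8)


def mean_avalanche(message: bytes, hash_name: str = "sha256") -> float:
    """Average avalanche ratio over every single-bit flip of the message."""
    total_bits = len(message) * 8
    ratios = (avalanche_ratio(message, i, hash_name) for i in range(total_bits))
    return sum(ratios) / total_bits


if __name__ == "__main__":
    sample = bytes(range(64))  # fixed 64-byte test message (assumption)
    for name in ("md5", "sha1", "sha256", "blake2s"):
        print(f"{name}: {mean_avalanche(sample, name):.4f}")
```

This measures only the average bit-change rate; the strict avalanche criterion and bit independence criterion mentioned above additionally examine each output bit and pairs of output bits separately.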
Sharma and Saxena [
34] present a comparative analysis of cryptographic hash functions in the context of blockchain technology. The study evaluates hash functions such as Whirlpool, MD5, RIPEMD, SHA, and BLAKE based on performance metrics such as frequency and throughput. The authors highlight the critical role of hash algorithms in ensuring the security and scalability of blockchain networks while identifying SHA-256 and SHA-2 as particularly efficient and secure algorithms.
Sa’ed Abed et al. [
24] explore the intersection of blockchain and IoT, highlighting the challenges of integrating resource-intensive blockchain technology with the constrained environments of IoT devices. The study reviews common and lightweight hash functions, with a particular focus on SPONGENT, PHOTON, and QUARK, and evaluates their performance on Field-Programmable Gate Array (FPGA) platforms. The results reveal trade-offs between these functions: SPONGENT performs favorably in terms of security and throughput, QUARK achieves lower power and energy consumption with lower variations in security, and PHOTON, with its balanced characteristics, is the most suitable for blockchain–IoT applications. This work underlines the importance of optimizing a hash function to balance computational complexity against the limited resources of IoT devices.
Kuznetsov et al. [
35] analyze the performance of cryptographic hash functions in blockchain systems, comparing widely used algorithms like Keccak, RIPEMD160, and SHA-2 with others such as MD4 and EDONR. The study evaluates metrics such as cycles per byte and hash rate, thereby demonstrating the trade-offs between performance and security. Furthermore, the research examines the potential of Kupyna, Ukraine’s national hash standard, for use in blockchain applications, offering insights into the selection of efficient and secure hash functions for decentralized systems.
In their study, Wang et al. [
36] highlight the importance of hash functions in ensuring the security and availability of blockchain technology. They focus on two important security criteria, hiding and puzzle-friendliness, that a hash function must satisfy in blockchain applications. They describe these criteria mathematically within the Rogaway-Shrimpton framework and establish their relationship. The authors conclude that it is more challenging to break hiding and puzzle-friendliness than preimage resistance. This suggests that hash functions like SHA-256, which are preimage-resistant, are suitable for blockchain security. This theoretical framework provides insights into potential vulnerabilities and attacks on blockchain hash functions.
Panuntun et al. [
37] examine the significance of hash functions in blockchain systems, where they guarantee data integrity and support transaction speed. The efficiency of the hash algorithm is essential for maintaining stability and enabling rapid transactions, since faster hash functions speed up the processing of transactional data. The research tests a variety of hash functions within a simple blockchain framework, demonstrating that the BLAKE2b algorithm is particularly adept at processing data rapidly and efficiently on hardware, while the SHA family of algorithms strikes a balance between complexity and security.
Seok et al. [
38] investigate the potential for integrating blockchain technology with Industrial IoT (IIoT) in order to enhance data integrity and network efficiency. They propose a flexible blockchain architecture that adapts the hash function used for mining based on transaction volume, aiming to enhance the availability of the blockchain network. The selection of lightweight hash functions, including QUARK, PHOTON, and SPONGENT, ensures that the architecture exhibits high performance in terms of area, throughput, and power consumption, rendering it suitable for resource-constrained IIoT devices. The authors argue that this dynamic approach mitigates the computational load and latency, thereby enhancing scalability and efficiency, particularly in the context of monitoring and supervision applications within IIoT environments.
Kuznetsov et al. [
39] address the challenges of efficiency and computational overhead in blockchain data verification, particularly in Ethereum, by proposing a novel aggregation scheme for zero-knowledge proofs within Merkle Trees. The approach proposed by the authors markedly reduces the size of the proof and the computational resources required for verification, thereby achieving a balance between security and efficiency. The results of experimental evaluations conducted using actual Ethereum block data demonstrate a notable enhancement over traditional methods. This work offers a scalable and secure solution for blockchain data verification, thereby enhancing the performance and adaptability of blockchain technology across a range of applications, including financial transactions and supply chain management.
In addition to the studies examining the hash algorithms developed for blockchain applications mentioned above, many new hash algorithm architecture designs [
40,
41], new hash functions [
42,
43], and hash-based blockchain applications [
44,
45] have been presented.
In addition to the studies mentioned earlier, the key hash functions discussed in the literature are presented in detail in the following sections. Existing lightweight hash functions have demonstrated notable success and offered application-specific advantages in IoT–blockchain integration. However, challenges such as high computational complexity, energy consumption, and compatibility with resource-constrained IoT devices remain areas of concern. While these algorithms have their strengths, the proposed HashLEA algorithm has been designed to address these limitations by focusing on speed, performance, and security. Our evaluations indicate that HashLEA performs faster and more efficiently than the reviewed algorithms while maintaining a comparable level of security, making it a competitive option for IoT–blockchain applications. The contributions of the HashLEA algorithm are explained in detail below.
Our Contribution
The study contributes to the development of secure hashing algorithms adapted to the specific challenges presented by IoT applications. It offers a scalable and efficient solution for blockchain and other distributed systems. Comprehensive performance evaluations and security assessments have been carried out, benchmarking HashLEA against other hash algorithms from the existing literature. The main contributions of this paper are as follows:
A novel lightweight hash function, designated as HashLEA, is proposed for implementation. It utilizes the LEA block cipher for the purpose of enhancing both performance and security.
A comprehensive survey is conducted on hash functions. Extensive performance evaluations and security tests have been conducted, comparing HashLEA with other lightweight hash algorithms in the literature. A practical testbed simulating blockchain scenarios, along with hardware implementation on Raspberry Pi 4, validates the algorithm’s applicability.
The results of the security testing, which included statistical analysis of avalanche effect, uniformity of distribution, and collision resistance, demonstrate that HashLEA meets the security requirements and exhibits exceptional diffusion and unpredictability.
Performance benchmarking, conducted on both software and hardware platforms, demonstrates that HashLEA outperforms existing algorithms by achieving superior execution times across various input sizes, establishing it as a suitable choice for blockchain and IoT applications.
3. Background
3.1. Blockchain Technology and IoT Applications
The IoT aims to connect devices to the Internet in a peer-to-peer pattern for efficient data collection and sharing in daily life. Blockchain has been integrated with the IoT to resolve existing IoT problems and is now broadly used in IoT applications. In contrast to similar technologies that keep data and information in central locations, blockchain stores them in secure blocks distributed across the network, which enhances the security of the information. Because data storage is decentralized, unauthorized access or hacking becomes considerably more difficult. The strong capability of blockchain to encrypt and protect data contributes to the heightened security of information [
46].
Traditional systems often lack transparency regarding the location, storage, and transmission of private data. When blockchain technology is applied, it records all the data and communications exchanged between the consumer and the producer and lets the consumer access the information they need. This increases trust between the two parties. For example, highly secure transfers can be achieved with blockchain services that operate on a payment-first principle. Thus, the decentralized storage networks of blockchain facilitate secure and reliable fund transfers and transactions [
47].
Most IoT solutions use a centralized server/client paradigm in which clients connect over the Internet to servers hosted in the cloud, which can be costly to maintain. Although these solutions work successfully, a new paradigm is considered necessary as the IoT grows. The proposed decentralized solutions with a P2P pattern cannot guarantee security and privacy [
48]. Blockchain has the potential to address many of the challenges that arise from the usage of IoT [
49]:
The centralized server/client IoT solutions that are hosted in the cloud can be costly to maintain.
Maintaining millions of distributed smart devices is considered a significant challenge.
Transparency is crucial for safety and trust. It is recommended that open-source solutions be employed and evaluated in the context of the next generation of IoT devices.
The IoT usually relies on a central entity, which becomes a single point of failure. Managing aspects such as time synchronization, registries, privacy, and trust on a continuous basis is also challenging.
Blockchain applied to IoT has three big challenges [
50]:
Scalability issues: as the blockchain grows with the number of transactions, there are concerns about scalability and potential centralization.
Computing power and time constraints: IoT devices have limited processing power and energy, so it is difficult to use secure encryption algorithms, which may take too long to process.
Storage challenges: All nodes in a blockchain keep a copy of every transaction that has occurred since its creation. The ledger keeps growing, leading to potential storage issues over time, and IoT devices are limited in how much data they can store.
The challenges for blockchain as applied to IoT begin with limited resources. Although the processing power of most IoT devices is low, optimizing the algorithms on these devices makes certain operations feasible. Cryptographic operations are the most processing-intensive operations in blockchain, so testing and optimizing hash functions for IoT devices is essential to ensure feasible performance [
51].
Many companies and industries now use dedicated sensors in their IoT applications to control business operations and monitor how long those operations take. This helps them detect problems before they occur and avoid them. Implementing blockchain technology in the IoT systems of these organizations can significantly increase security and ensures reliable data and information on their systems. IoT sensing technologies extend beyond industrial organizations to commercial establishments, where they are designed to monitor and track products and goods. Integrating blockchain into the IoT for these companies creates a secure and confidential environment for sharing data and information with customers and consumers. This enhances the quality and efficiency of commercial systems compared with existing ones, contributes to product and service development, and increases consumers’ trust in these companies and their products [
52].
Blockchain technology, while promising for IoT applications, faces challenges in adapting cryptographic operations like hashing to resource-constrained IoT devices. Hash functions are central to blockchain processes, yet existing solutions often struggle with scalability, efficiency, and compatibility in these environments. To address these limitations, the next section discusses the role of hash functions in blockchain systems, leading to the motivation behind the proposed HashLEA algorithm.
3.2. Hash Functions for Blockchain-Based Applications
Hash functions are an essential component of blockchain and are used in a variety of blockchain-based implementations, particularly cryptocurrencies. One of these is Bitcoin, the earliest and most widely used cryptocurrency, which employs the PoW consensus protocol. PoW is recognized as a significant consumer of power and computation, and in Bitcoin it relies on the SHA-256 hash function. Other cryptocurrencies such as Titcoin, Peercoin, and Bitcoin Cash also use the SHA-256 hash function [
29]. One of the most well-known examples of a blockchain-based application is the Ethereum platform, which provides a decentralized environment for deploying and executing smart contracts. Ethereum allows participants to transact in a peer-to-peer network without any trusted intermediary. Transactions are sent to and received by Ethereum accounts created by users of the system. Smart contracts enable immutable, verifiable, and securely distributed transaction records, thus offering full transparency and control over their data to the participants. Ethereum moved from the PoW consensus mechanism to proof of stake, a more secure and energy-efficient way of governing the validation of blocks and the integrity of the blockchain network [
53]. Keccak, one of the hash function families based on the Sponge structure, is used in the Ethash function for coins that use Ethereum. Keccak was also selected as the basis of the SHA-3 standard. Another common hash function used by many cryptocurrencies is Scrypt.
The blockchain has a number of different uses for hash functions. Here is a comprehensive overview of the various crucial uses of hash functions in blockchain technology [
24,
29,
35]:
Data integrity: Hash functions are used to create a fixed-size hash value for each block of data in a blockchain. By recomputing the hash of a block and comparing it with the stored value, users can quickly verify whether the data within the block have been altered or tampered with. Even a slight change in the block results in a different hash value, revealing the modification and ensuring data integrity.
Merkle tree: Hash functions are employed in Merkle Trees to secure the integrity of the transactions in a block. Collision resistance makes it computationally infeasible to find two different Merkle Trees with the same root hash, so storing the root hash inside the block header protects the integrity of the block header and thus of the whole set of transactions.
PoW consensus algorithm: This algorithm defines a valid block as one whose block header hash is below a certain threshold value. Miners in PoW blockchains, such as Bitcoin, must find a hash meeting this criterion to add a block to the blockchain.
Digital signatures: Hash functions play an essential role in digital signatures to ensure data integrity and are utilized for authentication for blockchain transactions.
Chain of blocks: In the blockchain, each block header holds the hash of the previous block header, forming a chain. This ensures that it is impossible to change even a single block in a blockchain without being noticed. The modification of one block requires changing subsequent blocks, and this increases the difficulty and challenge.
Smart contracts: Hash functions are used in smart contracts for specific purposes, such as verifying input integrity or storing hashed data, ensuring the contract’s conditions remain tamper-proof.
Mining rewards: In PoW blockchains, miners must find a specific value (a nonce) that, when combined with the block data, produces a hash with a certain number of leading zeros. Finding this hash is the basis for rewarding miners with new cryptocurrency coins.
Block validation: Each block in the blockchain has a header that contains a hash of the previous block’s header. This chaining of blocks with their header hashes ensures that the blocks are connected in a linear, tamper-resistant manner. Any modification to an earlier block invalidates every block that follows it.
Thus, it can be concluded that hash functions are a vital part of the blockchain technology, safeguarding data integrity, ensuring the immutability of the data stored on the blockchain, and supporting various critical processes such as consensus algorithms, digital signatures, and smart contracts. A detailed overview of the key hash functions from the literature is provided in the following subsection, establishing the foundation for discussing their relevance to blockchain-based applications.
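The chaining and validation roles listed above can be illustrated with a small sketch. It is a simplified model under stated assumptions: blocks are plain dictionaries serialized with JSON, SHA-256 is used as the hash, and the names `block_hash`, `build_chain`, and `first_broken_link` are hypothetical rather than taken from any blockchain implementation.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block deterministically (keys sorted before serialization)."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(records: list[str]) -> list[dict]:
    """Link blocks by storing the hash of the previous block in each one."""
    chain = []
    prev = "0" * 64  # placeholder predecessor for the first block (assumption)
    for index, data in enumerate(records):
        block = {"index": index, "data": data, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list[dict]):
    """Return the index of the first block whose prev_hash does not match
    the recomputed hash of its predecessor, or None if all links hold."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


if __name__ == "__main__":
    chain = build_chain(["sensor reading A", "sensor reading B", "sensor reading C"])
    print(first_broken_link(chain))   # all links intact
    chain[0]["data"] = "tampered"
    print(first_broken_link(chain))   # link into block 1 no longer matches
```

In this simplified model, tampering with a block is detected through the link stored in its successor, so the most recent block is protected only once a further block is appended.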
3.3. Hash Functions
3.3.1. Quark
The Quark family refers to a collection of cryptographic hash functions that were intended for lightweight applications. The initial developers of Quark are Jean-Philippe Aumasson, Luca Henzen, Willi Meier, and María Naya-Plasencia [
54]. These functions were created to cater to the needs of resource-constrained devices, such as those found in the IoT and RFID (Radio Frequency Identification) protocols. Quark utilizes a Sponge construction with a capacity (
c) equivalent to the digest length (
n) and a core permutation derived from earlier cryptographic primitives. The design is focused on minimizing resource consumption, making it suitable for environments where computational resources are limited. This approach aligns with the demands of lightweight cryptography, ensuring a balance between efficiency and security in scenarios where devices have constraints on processing power and memory.
Given an initial state of
b bits, a Sponge construction processes the message m in three main phases. The first phase is initialization, where the message is padded by appending a single ‘1’ bit and then the minimum number of ‘0’ bits such that its length becomes a multiple of
r—the rate size. This step ensures that the message is properly formatted for the absorption process. During the absorbing phase, the
r-bit message blocks are XORed with the last
r bits of the internal state, and then the permutation
P is applied. It starts with an XOR of the first message block with the internal state, and then each subsequent message block alternates between an XOR and a permutation until the message is fully absorbed. The absorbing phase ends with an application of
P. The final phase, Squeezing, provides the output by extracting
r-bit chunks from the last
r bits of the internal state interleaved with applications of the permutation
P, continuing until
n bits of output are reached. The Quark hash function is based on a permutation P derived from the Grain stream cipher and the KATAN block cipher. The internal state of P consists of three feedback shift registers: two non-linear feedback shift registers (NFSRs) with $b/2$ bits each and a linear feedback shift register (LFSR) with $\lceil \log_2 4b \rceil$ bits [55].
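The three Sponge phases described above can be sketched as follows. This is a byte-oriented toy under stated assumptions: `toy_permutation` is a stand-in and is not Quark's permutation P, the initial state is all zeros rather than Quark's fixed initial value, and the sizes (rate of 4 bytes, capacity of 16 bytes) are illustrative. It follows the convention in the text of XORing message blocks into the last r positions of the state and of choosing the capacity equal to the digest length.

```python
def toy_permutation(state: bytes) -> bytes:
    """Stand-in for P: a simple byte mixer, NOT Quark's permutation."""
    out = bytearray(state)
    n = len(out)
    for rnd in range(8):
        for i in range(n):
            # add the neighbouring byte and a constant, then rotate left by 3
            v = (out[i] + out[(i + 1) % n] + rnd + i) & 0xFF
            out[i] = ((v << 3) | (v >> 5)) & 0xFF
    return bytes(out)


def pad(message: bytes, rate: int) -> bytes:
    """Append one '1' bit (byte 0x80), then the fewest '0' bits to fill a block."""
    padded = message + b"\x80"
    return padded + b"\x00" * (-len(padded) % rate)


def sponge_hash(message: bytes, rate: int = 4, capacity: int = 16,
                out_len: int = 16) -> bytes:
    state = bytes(rate + capacity)  # b = r + c bytes (all-zero start is assumed)
    padded = pad(message, rate)
    # Absorbing: XOR each r-byte block into the last r bytes, then apply P.
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        tail = bytes(s ^ m for s, m in zip(state[-rate:], block))
        state = toy_permutation(state[:-rate] + tail)
    # Squeezing: read r bytes at a time, applying P between reads.
    output = state[-rate:]
    while len(output) < out_len:
        state = toy_permutation(state)
        output += state[-rate:]
    return output[:out_len]
```

A small rate means more permutation calls per message byte, which is the usual trade-off lightweight Sponge designs make in exchange for a smaller state.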
3.3.2. Keccak
Keccak [
56] is a family of Sponge-based hash functions renowned for their flexibility and security. The family includes various instances and configurations to suit different requirements. All of them are built on the Keccak-f permutation, which consists of a series of elementary rounds made up of logical operations and bit permutations. SHA-3 is based on the largest of these permutations, Keccak-f[1600], while the smaller Keccak-f[200] and Keccak-f[400] permutations underlie the lightweight implementations reported by Kavun and Yalcin [57]. Keccak-f[$b$] denotes a permutation chosen from a collection of seven permutations, where $b \in \{25, 50, 100, 200, 400, 800, 1600\}$ is the width of the permutation and also the width of the state within the Sponge construction.
The permutation Keccak-f[$b$] operates on a state $a$, which is structured as a three-dimensional array of elements in $\mathrm{GF}(2)$, denoted as $a[5][5][w]$, where $w = 2^{\ell}$ represents the lane size, so that the state consists of five rows, five columns, and $w$ bits per lane. The bit at position $(x, y, z)$ in the state $a$ is indexed as $a[x][y][z]$, where $x, y \in \mathbb{Z}_5$ (modulo 5) and $z \in \mathbb{Z}_w$ (modulo $w$). The mapping between the linear bitstring $s$, used in the Sponge construction, and the three-dimensional state $a$ is expressed as $s[w(5y + x) + z] = a[x][y][z]$. This mapping ensures that the state is interpreted correctly in its 3D form during the permutation process. Indices $x$, $y$, and $z$ are taken modulo their respective dimensions, ensuring cyclic behavior as required by the design. In cases where the index $[z]$, the indices $[y][z]$, or all three indices are omitted, the statement applies universally to all valid values of the omitted indices. This notation allows for compact expressions in describing operations that affect the entire state or specific subsets of it [56].
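The correspondence between the flat bitstring and the three-dimensional state can be made concrete with a short sketch. It uses the indexing convention of the Keccak reference, $s[w(5y + x) + z] = a[x][y][z]$; the function names and the choice of $w = 8$ (that is, Keccak-f[200]) are illustrative assumptions.

```python
def bits_to_state(s, w):
    """Arrange 25*w bits as a[x][y][z] with s[w*(5*y + x) + z] = a[x][y][z]."""
    if len(s) != 25 * w:
        raise ValueError("bitstring length must be 25 * w")
    return [[[s[w * (5 * y + x) + z] for z in range(w)]
             for y in range(5)]
            for x in range(5)]


def state_to_bits(a, w):
    """Inverse mapping: flatten a[x][y][z] back into the bitstring s."""
    s = [0] * (25 * w)
    for x in range(5):
        for y in range(5):
            for z in range(w):
                s[w * (5 * y + x) + z] = a[x][y][z]
    return s


def get_bit(a, x, y, z, w):
    """x and y are reduced modulo 5 and z modulo w, giving cyclic addressing."""
    return a[x % 5][y % 5][z % w]


w = 8                                               # lane size of Keccak-f[200]
s = [1 if i == 91 else 0 for i in range(25 * w)]    # single set bit at s[91]
a = bits_to_state(s, w)                             # 91 = 8 * (5 * 2 + 1) + 3
```

With this input the only set bit lands at $(x, y, z) = (1, 2, 3)$, and out-of-range indices such as $(6, 7, 11)$ wrap around to the same position.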
3.3.3. Scrypt
Scrypt [26] is a cryptographic hashing algorithm that serves as the validation algorithm for a number of cryptocurrencies. It was first implemented in Tenebrix and was later adopted as the base algorithm for Litecoin and Dogecoin. Scrypt is also used by a number of other cryptocurrencies, including Mooncoin, ProsperCoin, MonaCoin, and CashCoin [3]. Scrypt is a password-based Key Derivation Function (KDF). In cryptography, a KDF derives one or more secret keys from a secret value, such as a password, passphrase, or master key, using a pseudorandom function. KDFs are typically designed to resist brute-force password guessing. Before Scrypt, KDFs such as Password-Based Key Derivation Function 2 (PBKDF2) were computationally intensive but not memory-intensive, which left them vulnerable to attacks using FPGAs and ASICs. Scrypt was specifically designed to address both aspects, being memory-intensive as well as computationally demanding [58, 59].
Scrypt depends on three main parameters: $N$, $p$, and $r$. $N$ defines the computation and memory cost and thus determines the resources needed to execute the algorithm. $p$ is the parallelization degree, while $r$ sets the size of a memory block and hence directly affects memory usage. Other parameters include settings for the hash function and the output hash length. The Scrypt operation takes two inputs: the message to be hashed and a random string called the salt. The salt introduces entropy and thus protects the system against attacks based on precomputed rainbow tables, which offer a time–memory trade-off for recovering plaintext keys from hashed keys. Initially, the data are fed into PBKDF2, which improves resistance to brute-force attacks and reduces the number of potential weak points. The output of PBKDF2 is an array of $p$ blocks, $B_0, B_1, \ldots, B_{p-1}$, each of $128 \times r$ bytes, according to the algorithm’s parameters. These blocks are then processed by the ROMix function. The output of ROMix serves as an expensive salt, which is fed into another iteration of PBKDF2 to produce the output key of the desired length [26].
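How the parameters translate into block sizes and memory can be sketched with the standard library. The first stage below follows the scrypt specification (RFC 7914), which uses PBKDF2-HMAC-SHA256 with a single iteration to produce $p$ blocks of $128 \times r$ bytes; the ROMix stage is omitted, and the function names and example inputs are illustrative assumptions.

```python
import hashlib


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate ROMix working memory per instance: N blocks of 128 * r bytes."""
    return 128 * n * r


def initial_blocks(password: bytes, salt: bytes, r: int, p: int):
    """First scrypt stage: one PBKDF2-HMAC-SHA256 iteration split into p blocks."""
    block_len = 128 * r
    material = hashlib.pbkdf2_hmac("sha256", password, salt, 1,
                                   dklen=p * block_len)
    return [material[i * block_len:(i + 1) * block_len] for i in range(p)]


blocks = initial_blocks(b"example password", b"example salt", r=8, p=2)
litecoin_style_memory = scrypt_memory_bytes(1024, 1)  # 128 KiB for N=1024, r=1
```

Raising $N$ or $r$ increases the memory each ROMix instance must hold, whereas raising $p$ adds independent instances that can run in parallel.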
3.3.4. X11/X13/X15/X17
The hashing algorithms belonging to the “X” family include X11, X13, X15, and X17. The family is known for its high degree of defense against attacks, and each algorithm is named after the number of hash functions it chains. The X11 algorithm uses 11 cryptographic hash functions: Keccak, Blake, BMW, Groestl, Skein, JH, CubeHash, Luffa, SHAvite, SIMD, and Echo. This multi-hashing structure adds complexity, which initially provided significant resistance to ASIC mining. The X13 algorithm builds upon X11 by incorporating two additional functions, namely HAMSI and FUGUE, thereby increasing the total number of cryptographic hash functions to 13. The X15 algorithm extends X13 with Shabal and Whirlpool. The X17 algorithm represents a further expansion of X13, incorporating four additional functions: Shabal, Whirlpool, LOSELOSE, and DJB2. This brings the total number of cryptographic functions to 17. The X-family algorithms have been widely adopted in the field of cryptocurrencies. X11 was first implemented in DarkCoin, now Dash, which was the first to use this algorithm in blockchain and cryptocurrency mining. X13 has been employed in cryptocurrencies such as DANK, AERO, and NAV. X15 has been used in coins like KOBO, HAL, SOLE, and MARYJ. X17 has been applied in other cryptocurrencies due to its broader range of cryptographic algorithms [
60,
61,
62,
63].
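The chaining pattern behind the X-family can be sketched as below. Python's `hashlib` does not ship most of the primitives listed above (BMW, Groestl, JH, and so on), so the three stages used here are stand-ins chosen only to show how the output of one function becomes the input of the next; this is not X11.

```python
import hashlib

# Stand-in stages: these only illustrate the chaining pattern, not X11 itself.
STAGES = ("blake2b", "sha3_512", "sha512")


def chained_hash(data: bytes, stages=STAGES) -> bytes:
    """Feed the digest of each stage into the next stage."""
    digest = data
    for name in stages:
        digest = getattr(hashlib, name)(digest).digest()
    return digest


example = chained_hash(b"block header bytes")
```

An attacker, or an ASIC designer, has to handle every stage in the chain, which is the source of the added complexity the X-family relies on.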
3.3.5. Xevan
Xevan is a cryptographic hash function and mining algorithm used in various cryptocurrencies. It is a distinctive variant of a doubled X17 algorithm that uses a 128-bit header. Xevan was first deployed and developed by BitSend (BSD), after which numerous other cryptocurrencies adopted the algorithm. Dedicated GPU miners for Xevan are available for both NVIDIA and AMD hardware: the ccminer fork by krnlx can be used for NVIDIA GPUs, and the Limxtec sgminer for AMD GPUs. Compiled versions of sgminer are available for both Linux and Windows, while the krnlx release for NVIDIA is specifically designed for Windows. Details of the hash functions and parameters used within the Xevan algorithm may vary slightly depending on the cryptocurrency. The Xevan algorithm was designed to be resistant to ASICs, promoting decentralization by favoring GPU miners. Cryptocurrencies like BitSend, Solaris, Nanucoin, Northern Coin, Amsterdam Coin, XHIMERA, B-Hash, Bitcoin Incognito, and Motion have used the Xevan algorithm in their mining operations [
64,
65].
3.3.6. CryptoNight
CryptoNight is designed to support efficient mining on CPUs while intentionally being less effective on GPUs and FPGAs. The algorithm is also intended to resist ASICs, and it is used to mine coins that implement the CryptoNote protocol, with Monero being a notable example. Because CryptoNight is built around a loop in which memory reads and writes occur repeatedly, it is very sensitive to memory latency, and its performance is largely determined by the memory subsystem. The result of these memory-intensive operations is fed into a later hashing step, which produces the output, that is, the potential block solution. A crucial design decision was to match the size of the working data to the share of cache memory available per core in a contemporary CPU. This type of memory has a markedly lower latency than conventional system DRAM or a GPU’s VRAM, so this design choice gives CryptoNight a substantial efficiency advantage on a CPU as opposed to a GPU [
66].
The following steps are involved in the CryptoNight algorithm: In the initialization phase, the hash input is hashed using the Keccak function (the basis of SHA-3) with parameters $b = 1600$ and $c = 512$. Bytes 0 to 31 of the final Keccak state are interpreted as an AES-256 key and expanded into 10 round keys. During the block extraction phase, bytes 64 to 191 are extracted as eight blocks of 16 bytes each to be encrypted later. The resulting data are then written to the scratchpad, which has a size of 2 MB (2,097,152 bytes). In the scratchpad operations stage, bytes 0 to 31 and bytes 32 to 63 of the Keccak state are XORed, and the resulting 32 bytes initialize the variables $A$ and $B$, 16 bytes each. These variables drive a read–write loop that runs 524,288 times over the scratchpad, which ties the algorithm’s performance to memory latency. In the final hash calculation phase, the final hash is computed over the data obtained in the previous steps [67].
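The byte ranges used in these steps can be illustrated with a small sketch. It assumes the layout given in the CryptoNote specification, in which the 32 XORed bytes are split into two 16-byte variables. The 200-byte state below is a stand-in obtained from SHAKE-256, since `hashlib` does not expose the internal Keccak state, and the AES key expansion and the main loop are omitted.

```python
import hashlib

SCRATCHPAD_BYTES = 2 * 1024 * 1024   # 2,097,152 bytes
ITERATIONS = 524_288                 # read-write loop count


def split_state(state: bytes):
    """Slice a 200-byte Keccak-f[1600] state into the pieces CryptoNight uses."""
    if len(state) != 200:
        raise ValueError("a Keccak-f[1600] state is 200 bytes")
    aes_key = state[0:32]                                         # bytes 0..31
    blocks = [state[64 + 16 * i:80 + 16 * i] for i in range(8)]   # bytes 64..191
    mixed = bytes(x ^ y for x, y in zip(state[0:32], state[32:64]))
    a, b = mixed[:16], mixed[16:]                                 # 16 bytes each
    return aes_key, blocks, a, b


# Stand-in state: SHAKE-256 is used only to obtain 200 deterministic bytes.
state = hashlib.shake_256(b"example block blob").digest(200)
aes_key, blocks, a, b = split_state(state)
```

The scratchpad holds 131,072 blocks of 16 bytes, so the 524,288 loop iterations correspond to four times the number of scratchpad blocks.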
3.3.7. Ethash
The Ethash algorithm was designed as the PoW function of the Ethereum blockchain and some other cryptocurrencies. It builds on the Keccak cryptographic function, the basis of SHA-3. It aims to be memory-hard, and thus resistant to optimization by ASICs, while remaining very efficient for GPU-based mining. At the same time, it is meant to allow verification that is both rapid and efficient. Ethash was specifically developed to deter the centralization of mining power and promote a more decentralized and egalitarian mining environment. The algorithm’s resilience to ASICs is intrinsic to its memory-intensive nature, which is achieved through a pseudo-random dataset that is initialized according to the blockchain’s length and therefore varies over time. This dataset, designated as the Directed Acyclic Graph (DAG), is regenerated approximately every 30,000 blocks, which is equivalent to approximately five days [
63,
68].
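The relationship between the regeneration interval and wall-clock time is simple arithmetic, sketched below. The 30,000-block epoch length comes from the text; the 14-second block time is an assumed round figure used only to reproduce the "approximately five days" estimate.

```python
EPOCH_LENGTH = 30_000  # blocks between DAG regenerations


def epoch_of(block_number: int) -> int:
    """Epoch index that determines which DAG a block is mined against."""
    return block_number // EPOCH_LENGTH


def epoch_duration_days(block_time_seconds: float) -> float:
    """Wall-clock length of one epoch for a given average block time."""
    return EPOCH_LENGTH * block_time_seconds / 86_400


days = epoch_duration_days(14.0)  # assumed average block time of 14 s
```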
3.3.8. Equihash
Equihash is a memory-oriented PoW mining algorithm designed to resist ASIC mining and hence to promote decentralization. It belongs to the class of asymmetric memory-hard PoW algorithms and is based on the generalized birthday problem in cryptography. Because Equihash is memory-hard, finding a solution requires a substantial amount of RAM, which makes efficient ASIC miners much harder to engineer and lets an average PC with sufficient RAM take part in mining. Furthermore, Equihash allows the mining workload to be divided into parts, so that ordinary users can distribute it across several systems, which enhances scalability and decentralization. Mining therefore remains open to small miners rather than being concentrated in large-scale operations. Its security rests on the hardness of the generalized birthday problem, while solutions remain quick to verify. These features make Equihash a good fit for cryptocurrencies like Zcash, where it enables efficient verification and scalability, particularly for users with limited device capabilities [
63,
69].
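The asymmetry between solving and verifying can be illustrated with a toy version of the birthday problem. This is the simplest case only (two values whose digests XOR to zero on 16 bits); actual Equihash searches for a larger set of values using Wagner's algorithm with far bigger parameters. The seed, the digest size, and the function names are illustrative assumptions.

```python
import hashlib


def h(seed: bytes, index: int, n_bytes: int = 2) -> bytes:
    """Truncated BLAKE2b digest of (seed, index)."""
    return hashlib.blake2b(seed + index.to_bytes(4, "little"),
                           digest_size=n_bytes).digest()


def find_collision(seed: bytes, n_bytes: int = 2):
    """Prover: remember every digest seen until two indices collide."""
    seen = {}
    index = 0
    while True:
        digest = h(seed, index, n_bytes)
        if digest in seen:
            return seen[digest], index
        seen[digest] = index   # memory use grows with the search
        index += 1


def verify(seed: bytes, i: int, j: int, n_bytes: int = 2) -> bool:
    """Verifier: two hash calls and one XOR, no table needed."""
    x, y = h(seed, i, n_bytes), h(seed, j, n_bytes)
    return i != j and all(p ^ q == 0 for p, q in zip(x, y))


i, j = find_collision(b"toy-seed")
```

The prover's table is what makes the search memory-bound, whereas the verifier needs neither the table nor the search.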
3.3.9. Fugue
Fugue is a cryptographic hash function developed by Shai Halevi, William E. Hall, and Charanjit S. Jutla. It was submitted to the NIST hash function competition as a candidate for the next generation of secure hashing standards. Fugue is capable of supporting variable-length inputs of up to $2^{64} - 1$ bits while producing fixed-length hash outputs of 224, 256, 384, or 512 bits. These variants are designated Fugue-224, Fugue-256, Fugue-384, and Fugue-512, respectively [
70]. Unlike traditional hash functions designed using the Merkle–Damgård paradigm, Fugue follows a different approach that avoids the inherent vulnerabilities of the Merkle–Damgård construction, such as multi-collision and herding attacks. Rather than depending on a compression function, Fugue maintains a substantial evolving internal state and utilizes an innovative “super-mix” operation to achieve diffusion across the state. The algorithm includes the following features [
71]:
In Fugue, the design adopts and extends the Grindahl approach [
72] by maintaining a large, evolving internal state and leveraging AES-like primitives for state evolution.
Building on the AES-inspired design of Grindahl, Fugue replaces AES’s 4 × 4 column mixing matrix with a 16 × 16 “super-mix” operation. This transformation improves diffusion while maintaining computational efficiency. The super-mix involves multiplying a 16-byte vector by a 16 × 16 matrix.
The round function is applied selectively to parts of the state rather than uniformly. This selective approach is combined with XOR-based mixing.
Computational methods have been used to verify that differential cryptanalysis and related attacks cannot find collisions in Fugue faster than the simple birthday attack.
3.3.10. Grostl
Grostl is a cryptographic hash function that operates as an iterated hash function, utilizing a compression function for processing message blocks. The design principles of Grostl are fundamentally distinct from those of the SHA family, incorporating innovative elements while drawing upon components from the AES block cipher. In particular, Grostl uses the same S-box as AES and constructs its diffusion layers in a comparable manner. However, it departs from the block cipher approach by being built on a small number of fixed permutations rather than on a block cipher, which defines a large family of keyed permutations. This results in a number of benefits, including simplified analysis, robust security, resistance to side-channel attacks, efficient parallelism, and protection against length-extension attacks [
73].
The Grostl algorithm employs the wide-pipe construction, which represents an enhancement to the fundamental Merkle–Damgård design through the utilization of larger internal state registers. This construction enables the execution of two parallel functions, designated as
P and
Q. Each function executes a series of four round transformations on message blocks [
74]. The first transformation, AddRoundConstant, introduces a constant value specific to the current round through a bitwise XOR operation. SubBytes is the subsequent non-linear substitution step, which uses the same S-box as the AES algorithm. The next transformation, ShiftBytes, performs a cyclic left shift on the rows of the internal state matrix. Finally, MixBytes applies matrix multiplication to achieve diffusion, thereby ensuring that each output bit depends on multiple input bits. These transformations are executed sequentially over a number of rounds, with 10 rounds recommended for 224- and 256-bit outputs and 14 rounds for 384- and 512-bit outputs. The permutations
P and
Q operate on message blocks that are represented as either 8 × 8 or 8 × 16 byte matrices, depending on the length of the hash [
75].
The Grostl compression function, designated as $f$, operates on two inputs, a chaining input and a message block, as $f(h, m) = P(h \oplus m) \oplus Q(m) \oplus h$, where $h$ is either the initial value (for the first message block) or the output of the previous computation, and $m$ is the current message block. The compression process is carried out by iterating over the padded and divided message $M$, using the chaining value from the previous iteration to process the next block. Once the final message block has been processed, the output transformation computes the final hash value as $\Omega(x) = \mathrm{trunc}_n(P(x) \oplus x)$, that is, by truncating the result to the desired hash size $n$ [76].
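The way the compression function and the output transformation fit together can be sketched as follows. The structure follows the Grøstl specification, $f(h, m) = P(h \oplus m) \oplus Q(m) \oplus h$ and $\Omega(x) = \mathrm{trunc}_n(P(x) \oplus x)$, but `toy_perm` is a stand-in and not the AES-like P and Q, and message padding is left out.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_perm(state: bytes, tweak: int) -> bytes:
    """Stand-in permutation; the real P and Q use AES-like rounds."""
    out = bytearray(state)
    n = len(out)
    for rnd in range(4):
        for i in range(n):
            v = (out[i] + out[(i + 1) % n] + tweak + rnd) & 0xFF
            out[i] = ((v << 1) | (v >> 7)) & 0xFF
    return bytes(out)


def P(x: bytes) -> bytes:
    return toy_perm(x, 0x00)


def Q(x: bytes) -> bytes:
    return toy_perm(x, 0xFF)


def compress(h: bytes, m: bytes) -> bytes:
    """f(h, m) = P(h XOR m) XOR Q(m) XOR h"""
    return xor_bytes(xor_bytes(P(xor_bytes(h, m)), Q(m)), h)


def output_transform(x: bytes, n_bytes: int) -> bytes:
    """Omega(x) = trunc_n(P(x) XOR x), keeping the trailing n bytes."""
    return xor_bytes(P(x), x)[-n_bytes:]


def toy_hash(blocks, iv: bytes, n_bytes: int) -> bytes:
    h = iv
    for m in blocks:          # chain the compression function over the blocks
        h = compress(h, m)
    return output_transform(h, n_bytes)


# 64-byte (512-bit) internal state, 32-byte (256-bit) output: a wide pipe.
digest = toy_hash([b"\x01" * 64, b"\x02" * 64], bytes(64), 32)
```

Because the chaining value is twice as wide as the final digest, an internal collision is harder to find than a collision on the output, which is the point of the wide-pipe construction.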
3.3.11. SHAvite-3
The SHAvite-3 family of cryptographic hash functions is proposed by E. Biham and O. Dunkelman and is based on the HAsh Iterative FrAmework (HAIFA). Two variants are currently available. SHAvite-3-256 has been designed for the generation of digests of up to 256 bits, while SHAvite-3-512 is intended for larger digests of up to 512 bits. The design of SHAvite-3 emphasizes simplicity, security, and performance, leveraging well-understood cryptographic primitives to address vulnerabilities in earlier hash standards like SHA-1 [
77].
The fundamental component of SHAvite-3-256 is the Feistel block cipher $E^{256}$, which employs a round function that is iterated 12 times. Each round comprises three complete rounds of the AES, thereby ensuring a high level of security through the use of non-linearity and diffusion. The Davies–Meyer construction transforms the $E^{256}$ block cipher into a compression function by XORing the output of the cipher with its input, thereby carrying the strength of the cipher directly into the hash process [78].
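The Davies–Meyer construction mentioned above can be sketched in a few lines. The message block acts as the cipher key and the chaining value as the plaintext, and the ciphertext is XORed with the chaining value. `toy_cipher` is a stand-in keyed mixer, not the SHAvite-3 block cipher, and the salt and counter inputs are omitted.

```python
def toy_cipher(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher: a keyed byte mixer, not SHAvite-3's cipher."""
    out = bytearray(block)
    n = len(out)
    for rnd in range(4):
        for i in range(n):
            k = key[(i + rnd) % len(key)]
            v = (out[i] + out[(i + 1) % n] + k) & 0xFF
            out[i] = ((v << 3) | (v >> 5)) & 0xFF
    return bytes(out)


def davies_meyer(h: bytes, m: bytes) -> bytes:
    """h' = E_m(h) XOR h: the message keys the cipher, h is the plaintext."""
    encrypted = toy_cipher(m, h)
    return bytes(a ^ b for a, b in zip(encrypted, h))


# 256-bit chaining value and 512-bit message block, as in SHAvite-3-256.
h_next = davies_meyer(bytes(32), bytes(64))
```

The feed-forward XOR is what makes the result one-way even though the cipher itself can be inverted by anyone who knows the message block.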
The compression function in SHAvite-3-256 takes four inputs: a 256-bit chaining value $h$, a 512-bit message block $m$, a 64-bit bit counter $b$, and a 256-bit salt $s$. The chaining value $h$ is encrypted using $E^{256}$, where the round keys $RK_i$ for each round $i$ are derived from a message expansion function. This function alternates between a non-linear step (comprising four rounds of the AES, keyed by the salt $s$ and partially XORed with the bit counter $b$) and a linear step (comprising XOR operations to derive new keys). The 36 round keys required by the cipher are generated by applying four non-linear and four linear steps to the input message block, which itself supplies the first four round keys [79].
3.3.12. Skein
The Skein hash function family, optimized for performance on 64-bit processors, was developed by Ferguson et al. [
80]. Skein is built on the tweakable block cipher Threefish, which supports block and key sizes of 256, 512, or 1024 bits. The distinctive chaining mode of Skein, designated as Unique Block Iteration (UBI), incorporates the Matyas–Meyer–Oseas construction with a flexible format specification for tweak values and a padding scheme. UBI functions as a unified mechanism for Initialization Vector (IV) generation, message compression, and output transformation, thereby facilitating the adaptability of Skein to a diverse array of use cases [
81].
The Skein system is based on the Threefish block cipher, which employs an ARX (Add–Rotate–XOR) construction. The addition is performed modulo $2^{64}$, which keeps the arithmetic efficient on 64-bit processors and provides the non-linearity of the cipher, while the bitwise XOR operation and the bit rotations contribute to the diffusion of the cipher state. The state of Threefish is organized into 64-bit words, with the MIX operation transforming pairs of these words. The rotation distances employed in the MIX operation depend on the block size, round index, and word positions, thereby ensuring both variability and security. Threefish applies 72 rounds to the input block (80 for the 1024-bit block size), with subkeys derived via a straightforward key schedule that combines the cipher key and tweak value through addition modulo $2^{64}$. A subkey is added to the intermediate state before the first round and after every fourth round [80].
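The MIX operation at the heart of Threefish can be written down directly: it maps a pair of 64-bit words $(x_0, x_1)$ to $y_0 = (x_0 + x_1) \bmod 2^{64}$ and $y_1 = (x_1 \lll R) \oplus y_0$. The sketch below also includes the inverse to show that MIX is a permutation of the word pair; the rotation distance of 14 and the input words are illustrative values, and the word permutation and key schedule are omitted.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, r: int) -> int:
    """Rotate a 64-bit word left by r positions."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def mix(x0: int, x1: int, r: int):
    """Threefish MIX: one modular addition, one rotation, one XOR."""
    y0 = (x0 + x1) & MASK64
    y1 = rotl64(x1, r) ^ y0
    return y0, y1


def unmix(y0: int, y1: int, r: int):
    """Inverse of MIX, used when decrypting."""
    x1 = rotl64(y1 ^ y0, 64 - r)   # rotating left by 64 - r rotates right by r
    x0 = (y0 - x1) & MASK64
    return x0, x1


x0, x1 = 0x0123456789ABCDEF, 0xFEDCBA9876543210
y0, y1 = mix(x0, x1, 14)
```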
Throughout the SHA-3 competition, Skein received a small number of refinements designed to enhance its security: the rotation constants were tweaked for the second round of the competition with the aim of optimizing diffusion, and a constant in the Threefish key schedule was adjusted for the final round of the competition with the objective of mitigating rotational cryptanalysis [
82]. The minimal nature of these changes serves to highlight the robustness of Skein and its ability to adapt to advancements in cryptanalysis.
3.3.13. Lyra2/Lyra2REv2
Lyra2 is a Password Hashing Scheme (PHS) that employs a cryptographic Sponge structure to generate pseudorandom outputs, which can be used as cryptographic keys or authentication strings. Designed with a focus on sequential execution, Lyra2 resists parallelization, thereby providing strong protection against attackers employing high-performance hardware, such as GPUs or custom ASICs. Furthermore, it is straightforward to implement in software and offers flexibility for legitimate users to adjust their memory and processing costs in order to meet security requirements [
83].
The Lyra2 system represents an improvement over its predecessor, the Lyra system, in that it addresses limitations and enhances security against a range of potential attack vectors. The Lyra2 employs a two-dimensional memory structure, designated as the memory matrix, wherein cells are successively read, written, and revisited throughout the hashing process. This matrix is initially configured and traversed through a stateful combination of the Sponge’s absorbing, squeezing, and duplexing operations. The stateful design guarantees that the matrix remains sequential, with its internal state carried forward rather than being reset [
84].
The Lyra2REv2 hashing algorithm is a chain of multiple hashing functions, commonly utilized as a PoW mechanism in various cryptocurrencies. Its primary objective is to enhance the resilience of the algorithm to the use of ASICs, thereby ensuring a more level playing field for mining participants. At the core of the Lyra2REv2 chain lies a critical component derived from a specific implementation of the general Lyra2 algorithm, which plays a crucial role in achieving the intended security and resilience properties of the algorithm. Discarding memory cells to conserve memory leads to the necessity of recomputation whenever those cells are accessed again, up to their last modification [
85]. This design feature makes Lyra2 highly resistant to memory reduction attacks.
3.3.14. BLAKE
The BLAKE design [
28] emphasizes simplicity and efficiency, with the objective of reducing complexity in implementation and debugging. This design choice has the effect of reducing the potential for errors and accelerating the deployment process, thereby making it accessible even to those without expertise in cryptography. BLAKE functions (e.g., BLAKE-224/256/384/512) operate through four main processes: initialization, permutation, compression, and finalization. Each of these steps employs modular addition, XOR operations, and bit rotations to achieve efficient and secure hashing [
86].
In the initialization process, a 16-word (512- or 1024-bit) state matrix is initialized using inputs such as a chain value, a message block, a salt, and a counter. The initialization ensures that different inputs generate unique initial states. The state is represented as a 4 × 4 matrix, akin to ChaCha and AES, facilitating visualization and comprehension. In the permutation process, the input message block is divided into 16 words, which are permuted across rounds using predefined permutations. In BLAKE, the permuted values are XORed with 16 constants, thereby enhancing diffusion. The core component of BLAKE is the compression function, which applies the G function eight times per round to iteratively transform the internal state. G uses addition, XOR, and bit rotation to update the state matrix, with the rotation distances differing between the 32-bit variants (BLAKE-224/256) and the 64-bit variants (BLAKE-384/512). Upon completion of the designated rounds, the final hash value is derived. The finalization process combines the internal state, chain value, and salt, resulting in the final output hash [
28].
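The G function can be sketched for the 32-bit variant as follows. The sequence of additions, XORs, and right rotations by 16, 12, 8, and 7 bits follows the BLAKE-256 specification; the selection of message words and constants through the round permutation is left to the caller, so the parameter names `m0`, `m1`, `k0`, and `k1` are illustrative.

```python
MASK32 = 0xFFFFFFFF


def rotr32(x: int, r: int) -> int:
    """Rotate a 32-bit word right by r positions."""
    return ((x >> r) | (x << (32 - r))) & MASK32


def g(a, b, c, d, m0, m1, k0, k1):
    """One BLAKE-256 G step on four state words.

    m0, m1 are the two message words and k0, k1 the two constants that the
    round permutation selects for this step (selection not shown here).
    """
    a = (a + b + (m0 ^ k1)) & MASK32
    d = rotr32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 12)
    a = (a + b + (m1 ^ k0)) & MASK32
    d = rotr32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 7)
    return a, b, c, d


changed = g(1, 0, 0, 0, 0, 0, 0, 0)   # a single set input bit
```

A round applies G to the four columns of the 4 × 4 state and then to its four diagonals, which accounts for the eight applications per round.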
3.3.15. Overview
While existing hash functions such as Scrypt, Groestl, Myr-Groestl, Keccak, BLAKE, Skein, Quark, X11 to X17, and the Lyra family have shown notable successes in specific applications, each of them presents limitations when applied to IoT–blockchain integration, particularly in terms of computational complexity, energy consumption, and scalability. Scrypt, Groestl, and Myr-Groestl are known for their high security, but they require substantial computational resources, making them less suitable for resource-constrained IoT devices. Although they support parallel processing, their energy consumption remains high, posing challenges for IoT systems with limited battery life.
Keccak, used in SHA-3, offers a higher level of security than SHA-2, making it a robust choice for blockchain systems. However, its processing time is longer, which can impact performance in environments like IoT, where processing power is limited. Despite its superior security features, Keccak’s high computational demands make it less suitable for IoT applications, where performance and energy efficiency are paramount. Quark, BLAKE, and Skein stand out for their lower energy consumption, making them favorable choices for IoT applications. However, this comes at the cost of reduced security compared with more robust algorithms like SHA-2 or Keccak. While X11 to X17 and the Lyra family offer energy-efficient solutions, they may not meet the highest security requirements in some blockchain applications.
While these algorithms have their strengths, HashLEA has been designed to address the specific challenges of IoT–blockchain integration. By focusing on optimizing speed, performance, and energy efficiency without compromising security, HashLEA provides a competitive solution that is faster and more efficient than the reviewed algorithms while maintaining a comparable level of security.
HashLEA leverages the lightweight and efficient LEA cipher, utilizing an ARX structure to optimize performance. Additionally, the specially developed, novel hash structure introduces a block-level mini-hash mechanism that creates a cascading dependency between input blocks, ensuring that any alteration in the data affects all subsequent computations. Thanks to this novel hash structure, the desired performance and security levels are achieved, making HashLEA suitable for low-resource environments while maintaining robust cryptographic properties. The following sections provide a detailed explanation of the HashLEA algorithm and its contributions to improving IoT–blockchain systems.
4. Proposed HashLEA Algorithm
This section provides a detailed explanation of the proposed hash function, HashLEA, which employs lightweight encryption techniques to facilitate efficient and secure hashing. The LEA cipher is chosen as the core of HashLEA due to its high performance and suitability for resource-constrained environments. In a previous study [87], we conducted a comparative analysis of several block cipher algorithms and selected high-performance algorithms for cryptographic operations. Among the top contenders, LEA demonstrated particularly strong performance in terms of both speed and energy efficiency. LEA is optimized for low-resource environments, offering remarkable performance while maintaining robust cryptographic properties. This made it an optimal selection for HashLEA, as our objective is to develop a distinctive hash structure that can fulfill the particular requirements of IoT–blockchain integration. By using LEA as the basis, HashLEA benefits from its high performance, ensuring fast processing times and low energy consumption, which is critical for IoT applications. The proposed scheme (
Figure 2) incorporates block processing, addition and XOR-based mini-hashing, and final hash aggregation to achieve desired cryptographic properties such as collision resistance and the avalanche effect.
In HashLEA, we use the LEA cipher with a 128-bit block size and 24 rounds. These parameter choices are made based on the performance and security trade-offs in cryptographic algorithms, as well as standard practices in lightweight cipher design. The 128-bit block size is a common choice for lightweight ciphers, as it strikes a balance between computational efficiency and security. This size is large enough to provide a high level of security while not overly burdening the limited resources of IoT devices. Additionally, this block size is consistent with other block ciphers like Speck and Simon, which are widely used in IoT applications. The selection of 24 rounds for the LEA cipher is guided by security evaluations and known attacks. In the context of cryptanalysis, the highest number of rounds affected by attacks such as Boomerang and Differential Linear has been reported to be 15 rounds. Therefore, selecting 24 rounds, which is the original design parameter of LEA, provides a significant security margin, ensuring resistance against these known attacks while maintaining computational efficiency. This choice balances robust security with fast processing, making it suitable for IoT environments.
The input data are divided into 128-bit sub-blocks, as this is the block size of the LEA-128 cipher. These blocks are processed in groups of eight, for a total of 1024 bits (128 bytes) per iteration. The design ensures efficient processing time and allows for scalability in the hashing of larger data streams. Within each iteration, two operations are performed to generate intermediate “mini-hash” values:
Bitwise XOR operation: Each 64-bit word from the eight blocks is XORed sequentially, resulting in a cumulative XOR value (miniHash1). This operation is fundamental to ensuring cryptographic diffusion. By combining bits from multiple input blocks, the XOR operation propagates any single-bit change in the input across the cumulative value. This sensitivity to input variations helps detect modifications within individual blocks and ensures that all blocks strongly influence the resulting miniHash1 in the group. Additionally, XOR’s simplicity ensures high performance even in resource-constrained environments.
Addition operation: Simultaneously, the same 64-bit words are added together, yielding a second cumulative value (miniHash2). This addition operation complements the XOR operation by introducing further diffusion into the mini-hash, making the hash sensitive to changes in the magnitude of input variations. The combined use of XOR and addition ensures that both positional and numeric differences in the input data contribute significantly to the hash computation.
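The two accumulations described above can be sketched as follows. The word layout is an assumption made for illustration: each 128-bit block is read as two little-endian 64-bit words, and the accumulators keep one 64-bit value per word position, with addition taken modulo $2^{64}$. The function names are hypothetical and do not come from the HashLEA implementation.

```python
import struct

MASK64 = (1 << 64) - 1


def split_words(group: bytes):
    """Split a 128-byte group into 8 blocks of two 64-bit words (assumed layout)."""
    if len(group) != 128:
        raise ValueError("a group is 8 blocks x 16 bytes = 128 bytes")
    words = struct.unpack("<16Q", group)
    return [(words[2 * i], words[2 * i + 1]) for i in range(8)]


def mini_hashes(group: bytes):
    """Return (miniHash1, miniHash2): cumulative XOR and cumulative sum."""
    xor_acc = [0, 0]   # miniHash1
    add_acc = [0, 0]   # miniHash2
    for block in split_words(group):
        for k in range(2):
            xor_acc[k] ^= block[k]
            add_acc[k] = (add_acc[k] + block[k]) & MASK64
    return tuple(xor_acc), tuple(add_acc)


# Two identical blocks cancel under XOR but not under addition.
repeated = b"\x01" + bytes(15) + b"\x01" + bytes(15) + bytes(96)
mini1, mini2 = mini_hashes(repeated)
```

The example shows why the two operations complement each other: a pair of equal blocks leaves no trace in the XOR accumulator, yet it is still visible in the sum.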
The miniHash1 and miniHash2 values are encrypted using the LEA cipher with precomputed round keys. The encryption process applies 24 rounds of LEA cipher operations, thereby ensuring robust mixing and security. Subsequent to encryption, the mini-hash values are modified by reapplying XOR and addition operations with the original block data, which increase cryptographic complexity across blocks by integrating nonlinear mixing, bitwise transformations, and modular arithmetic operations.
The intermediate hash values from each iteration are combined into a cumulative finalHash, which maintains the chaining property. For each block group processed, the partial hash values (hash[0…3]) are added to the corresponding slots in finalHash. This aggregation ensures that even minor changes in any block affect the final output hash. By means of an iterative process of addition and XOR operations, the algorithm propagates modifications throughout the hash, thereby achieving the desired avalanche effect. Furthermore, the design guarantees that the final hash output is resistant to collisions and highly responsive to alterations in the input data.
Algorithm 1 gives the pseudocode for the HashLEA function, outlining the step-by-step process of generating a cryptographic hash from input data. The pseudocode focuses on simplicity and clarity, providing a high-level view of the function’s logic without delving into implementation-specific details. Key steps include initializing variables for intermediate and final hash values, applying XOR and addition operations to compute mini-hashes, and encrypting these mini-hashes to enhance security. The result of each iteration is added to a cumulative hash value, ensuring that all input data contribute to the final output. The pseudocode also reveals the modularity of the process: the operations on each group of blocks are isolated, and their results are combined independently. This abstraction brings into focus only the core design principles of the hash function (iterative processing, encryption, and mixing operations), so the function’s structure is easy to understand and adaptable to different cryptographic needs. The HashLEA function itself is implemented in C to take advantage of the language’s efficiency and low-level memory control, which are important for cryptographic applications.
Algorithm 1 HashLEA
for each group of 8 blocks in the input do
    for i = 1 to 8 do
    end for
    for i = 1 to 8 do
    end for
    for k = 0 to 1 do
    end for
    for k = 0 to 3 do
    end for
end for
5. Security and Performance Analysis
We evaluate the proposed HashLEA function in four key aspects: avalanche effect, collision resistance, security analysis, and performance efficiency. HashLEA is implemented in C and tested on two platforms: a Windows 10 Pro machine with a 64-bit Intel i7-7700HQ 2.80 GHz processor and 16 GB of RAM, and a Raspberry Pi 4. To assess its practical efficiency, the proposed scheme is benchmarked against existing lightweight hash functions, with a focus on execution time. The comparison hash functions are likewise implemented in C to ensure efficiency and compatibility, whereas the security and performance tests are run in a Node.js-based node environment. This approach yields a testbed that resembles blockchain application scenarios and thereby provides a practical context for assessing the function’s effectiveness in real-world use cases. The testbed, which is meant to simulate the data security and fast processing needs of a distributed system, offers insight into the potential application of HashLEA in blockchain architectures.
5.1. Statistical Analysis of HashLEA
It is a fundamental property of hash functions that they exhibit both confusion and diffusion, which is critical for resisting statistical attacks. Diffusion ensures that any alteration to the input message results in significant, pseudorandom changes throughout the hash output, thereby maximizing the unpredictability of the changes. As defined by Shannon, confusion serves to mask the relationship between the input message and the resulting hash, thereby enhancing resistance to cryptographic analysis. In order for a hash function to be considered secure, it should be the case that a single bit change in the input will result in approximately 50% of the output bits being altered.
This behavior is typically evaluated using four statistical metrics. Let $N$ represent the total number of tests, $B_i$ denote the number of changed bits in the hash output for the $i$-th test, and $n$ signify the length of the hash output in bits. These metrics collectively assess the hash function’s capacity to achieve an optimal balance between confusion and diffusion.
The average number of changed bits:
$$\bar{B} = \frac{1}{N}\sum_{i=1}^{N} B_i$$
Percentage of bit change:
$$P = \frac{\bar{B}}{n} \times 100\%$$
The standard deviation of bit change:
$$\Delta B = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(B_i - \bar{B}\right)^2}$$
The standard deviation of $P$:
$$\Delta P = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{B_i}{n} - P\right)^2} \times 100\%$$
In the evaluation process, the Mersenne Twister (MT19937) pseudorandom number generator is employed to generate 1000 random input messages, and the corresponding hash values are computed. Subsequently, a single bit at an arbitrary position of each original message is altered, and the resulting hash values are generated for comparison.
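The four avalanche metrics can be computed from the per-test changed-bit counts along the lines of the following sketch. It assumes the usual sample standard deviation with an N − 1 denominator; the names `avalanche_stats`, `hamming_distance`, `summarize_avalanche`, and `sqrt_newton` are hypothetical. The small Newton-iteration square root is used only to keep the example free of a math-library dependency; `sqrt` from `<math.h>` would be the usual choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result record for the four metrics. */
typedef struct {
    double mean_bits;   /* average number of changed bits */
    double percent;     /* average percentage of changed bits */
    double sd_bits;     /* standard deviation of the changed-bit counts */
    double sd_percent;  /* standard deviation of the percentage */
} avalanche_stats;

/* Square root by Newton's iteration, starting at or above the root. */
double sqrt_newton(double x)
{
    if (x <= 0.0) return 0.0;
    double r = (x > 1.0) ? x : 1.0;
    for (int i = 0; i < 64; i++) r = 0.5 * (r + x / r);
    return r;
}

/* Number of differing bits between two digests of len bytes. */
unsigned hamming_distance(const uint8_t *a, const uint8_t *b, size_t len)
{
    unsigned bits = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t x = (uint8_t)(a[i] ^ b[i]);
        while (x) { bits += x & 1u; x >>= 1; }
    }
    return bits;
}

/* Summarize changed-bit counts from n_tests trials of a hash_bits-bit hash. */
avalanche_stats summarize_avalanche(const unsigned *changed, size_t n_tests,
                                    unsigned hash_bits)
{
    avalanche_stats s = {0.0, 0.0, 0.0, 0.0};
    assert(changed != NULL && n_tests > 1 && hash_bits > 0);

    double sum = 0.0;
    for (size_t i = 0; i < n_tests; i++) sum += changed[i];
    s.mean_bits = sum / (double)n_tests;
    s.percent = s.mean_bits / hash_bits * 100.0;

    double ss = 0.0;
    for (size_t i = 0; i < n_tests; i++) {
        double d = (double)changed[i] - s.mean_bits;
        ss += d * d;
    }
    s.sd_bits = sqrt_newton(ss / (double)(n_tests - 1));
    /* Dividing by the digest length converts the deviation to a percentage. */
    s.sd_percent = s.sd_bits / hash_bits * 100.0;
    return s;
}
```

In a test run, `hamming_distance` would be applied to each pair of original and modified digests, and the resulting counts passed to `summarize_avalanche` with `hash_bits` set to 256.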
Table 1 presents the alterations between the hash values of the original and modified messages, alongside data from other hash functions for comparison purposes. The proposed hash function consistently achieves the ideal 50% average bit change in both the number and percentage of altered bits across all tested versions. Furthermore, the standard deviation of these changes is notably low, indicating stable and predictable diffusion characteristics. The proposed function has shown strong consistency in comparison with other hash functions, reflected in the lower standard deviation values of the average number of bit changes ($\Delta B$) and the percentage of bit changes ($\Delta P$), providing further evidence of its robustness and reliability.
5.2. Distribution of Hash Values
A uniform distribution of hash values is critical for resisting pre-image and second pre-image attacks effectively. If the hash function has some vulnerability, brute-force attacks can be used to find a first or second pre-image. There are also faster pre-image attacks that arise from cryptanalysis targeting specific hash functions. To assess the uniformity of the hash value distribution, 2000 random input messages are generated and the frequency of each hexadecimal character in the hash outputs is calculated. Each hexadecimal character is expected to appear approximately equally often.
To further assess the distribution, a chi-square test (Equation (5)) is performed as a goodness-of-fit test to determine whether the hash values are uniformly distributed across all 256-bit versions of the proposed hash function. To evaluate the distribution of the hexadecimal characters, a random sample of 2000 messages is selected. The expected number of occurrences of each hexadecimal character (denoted as $E_i$) is 8000, that is, 2000 digests of 64 hexadecimal characters each, divided evenly among 16 symbols. The observed occurrences ($O_i$) of the hexadecimal characters are presented in Table 2.
The calculated chi-square value ($\chi^2$) indicates that the null hypothesis ($H_0$, asserting that the hash values are uniformly distributed) is not rejected at any commonly used significance level $\alpha$. The distribution of hexadecimal characters produced by the hash function is therefore statistically consistent with uniformity. For HashLEA-256, the observed counts of the hexadecimal characters range from 7851 to 8160. The chi-square value is 10.26, which reflects only a minimal deviation from the expected outcome.
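As a rough illustration of the goodness-of-fit test, the sketch below counts hexadecimal characters and computes the usual Pearson statistic $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$ against a uniform expectation, which is assumed here to be the form of Equation (5). The function names are hypothetical. With 16 categories the test has 15 degrees of freedom, for which the 5% critical value is approximately 25.0.

```c
#include <assert.h>
#include <stddef.h>

#define HEX_SYMBOLS 16

/* Add the hex characters of one digest string to the running counts.
   The caller is expected to zero the counts before the first call. */
void count_hex_chars(const char *hex, unsigned long counts[HEX_SYMBOLS])
{
    assert(hex != NULL && counts != NULL);
    for (; *hex != '\0'; hex++) {
        char c = *hex;
        if (c >= '0' && c <= '9') counts[c - '0']++;
        else if (c >= 'a' && c <= 'f') counts[c - 'a' + 10]++;
        else if (c >= 'A' && c <= 'F') counts[c - 'A' + 10]++;
        /* any other character is ignored */
    }
}

/* Pearson chi-square statistic of the counts against a uniform expectation. */
double chi_square_uniform(const unsigned long counts[HEX_SYMBOLS])
{
    unsigned long total = 0;
    for (size_t i = 0; i < HEX_SYMBOLS; i++) total += counts[i];
    assert(total > 0);

    double expected = (double)total / HEX_SYMBOLS;
    double chi2 = 0.0;
    for (size_t i = 0; i < HEX_SYMBOLS; i++) {
        double d = (double)counts[i] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}
```

For the experiment described above, `count_hex_chars` would be called once per digest, and the resulting statistic compared with the critical value for the chosen significance level.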
5.3. Collision Resistance
In cryptography, a collision is defined as the occurrence of two distinct messages that generate the same hash value. Resistance to collisions is one of the most fundamental properties of a cryptographic hash function. Although collisions are unavoidable in scenarios involving large sample spaces, it is of paramount importance to make their occurrence as challenging as possible. Given the vastness of the sample space ($2^{256}$ for a 256-bit hash), it is impractical to test every possible hash value. To assess the collision resistance of a hash function, an alternative approach proposed by Wong [88] is more practical. This method examines hash outputs for identical values at corresponding positions after bit alterations are introduced in the input messages, thereby enabling the prediction of the probability of generating identical hashes.
In our study, we generated 2000 random input messages and computed the resulting hash values in ASCII format. A single bit was then altered at an arbitrary position in each input message, and the corresponding hash value was recalculated. In accordance with Wong’s methodology, the number of identical hexadecimal characters at identical positions in the original and modified hash outputs is counted. The objective is to minimize this count, as a lower count indicates a higher resistance to collisions. The results, presented in Table 3, show the number of matching entries across the hash values. These results are compared with those of other widely used hash functions; as can be expected, the probability of finding matching entries tends to increase with the digest size. The collision resistance of the proposed HashLEA function remains robust within the expected limits.
Moreover, the absolute difference between the generated hash values is calculated in accordance with Equation (6), which quantifies the distance between the ASCII characters at the same position of the original and modified hash values. A comparison of the absolute difference values is presented in Table 4. The maximum, minimum, and mean values are derived from the set of absolute differences. The average absolute difference is calculated by dividing the mean value by the number of matching pairs of hexadecimal characters. The results demonstrate that the proposed HashLEA function yields a value of 85.64, which is slightly higher than the ideal value but still within an acceptable range. This indicates that HashLEA has stable and robust collision resistance in terms of consistency and predictability.
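The two per-pair quantities used in this test, the number of matching positions and the absolute difference, can be sketched as below. The function names are hypothetical, and the digests are treated as byte arrays of equal length, which is one reading of “ASCII format” in the text. For two independent, uniformly distributed byte values, the expected absolute difference is about 85.33 per position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of positions at which two equal-length digests hold the same value. */
size_t count_equal_positions(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t hits = 0;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++) {
        if (a[i] == b[i]) hits++;
    }
    return hits;
}

/* Sum over all positions of the absolute difference of the two values. */
unsigned long absolute_difference(const uint8_t *a, const uint8_t *b, size_t len)
{
    unsigned long d = 0;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++) {
        d += (a[i] > b[i]) ? (unsigned long)(a[i] - b[i])
                           : (unsigned long)(b[i] - a[i]);
    }
    return d;
}
```

Applied to each of the 2000 original and modified digest pairs, these functions would yield the distribution of matching positions and the set of absolute differences from which the maximum, minimum, and mean are taken.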
5.4. Software and Hardware Efficiency Analysis
To evaluate the performance efficiency of the proposed HashLEA function, a comprehensive comparison is conducted with a number of selected key algorithms, including X11, X13, X15, X17, Scrypt, Quark, Keccak, Xevan, Fugue, Groestl, Myr-Groestl, SHAvite-3, Skein, Lyra2z, Lyra2re, Lyra2REv2, and Blake. The execution times are measured for the hashing of messages of varying sizes (100 KB to 10 MB), with the results averaged over 1000 iterations.
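A timing loop of the kind described here could be structured as in the following sketch. The paper does not state how time was measured, so the use of the standard `clock()` function (which reports CPU time) is an assumption, as are the names `hash_fn`, `dummy_hash`, and `mean_time_ms`. The `dummy_hash` function is a placeholder that only makes the example runnable; it is not HashLEA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for a 256-bit hash function under test. */
typedef void (*hash_fn)(const uint8_t *msg, size_t len, uint8_t out[32]);

/* Placeholder workload, NOT HashLEA: XOR-folds the message into 32 bytes. */
void dummy_hash(const uint8_t *msg, size_t len, uint8_t out[32])
{
    for (size_t i = 0; i < 32; i++) out[i] = 0;
    for (size_t i = 0; i < len; i++) out[i % 32] ^= msg[i];
}

/* Mean CPU time per call, in milliseconds, over the given number of calls. */
double mean_time_ms(hash_fn fn, const uint8_t *msg, size_t len,
                    unsigned iterations)
{
    uint8_t digest[32];
    volatile uint8_t sink = 0;  /* keeps the calls from being optimized away */
    assert(fn != NULL && msg != NULL && iterations > 0);

    clock_t start = clock();
    for (unsigned i = 0; i < iterations; i++) {
        fn(msg, len, digest);
        sink ^= digest[0];
    }
    clock_t end = clock();
    (void)sink;

    return (double)(end - start) * 1000.0 / CLOCKS_PER_SEC / iterations;
}
```

Averaging over many iterations, as in the 1000 runs reported above, reduces the influence of the timer’s limited resolution on short measurements.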
The results, presented in Figure 3, demonstrate that HashLEA consistently exhibits superior performance compared with other hash functions across all tested message sizes. To illustrate, the processing of a 100 KB message is completed in 0.157 ms, which is significantly faster than the 0.233 ms required by X11 and the 0.235 ms required by Quark. Similarly, for larger inputs, such as 10 MB, HashLEA completes the hashing process in 15.48 ms, which is notably faster than Quark (20.564 ms) and Keccak (42.26 ms).
Scrypt is computationally intensive because it is built as a password-based key derivation function (KDF), and Fugue performs heavy truncation of the intermediate state to form the output digest. Additionally, the Groestl algorithm employs a compression function constructed from two fixed, large, distinct permutations. These design choices result in significantly longer execution times for the three hash functions. To illustrate, Scrypt processes a 10 MB message in 135.056 ms, while Groestl takes 118.737 ms. Consequently, these algorithms are less suitable for applications that require real-time performance in the software domain.
By employing ARX operations and a lightweight block cipher-based structure, HashLEA achieves a balance between computational efficiency and cryptographic robustness. ARX operations are inherently simple and fast, making them highly suitable for software implementations on general-purpose processors. Also, due to the uniform and predictable structure of ARX operations, vulnerability to some types of side-channel attacks, such as timing or differential power analysis, is significantly reduced, hence enhancing security. Its software-oriented design makes it well suited to resource-constrained environments, as demonstrated by its efficiency across all tested scenarios. The performance comparison highlights the practicality of HashLEA for use cases demanding both speed and robust security, such as blockchain and IoT applications.
A second evaluation of the HashLEA algorithm is conducted on a real hardware platform, specifically a Raspberry Pi 4 with an ARM Cortex-A72 processor, to assess its performance in the context of blockchain-based IoT applications. This practical setup offers a realistic view of the algorithm’s behavior when deployed on low-power, resource-constrained devices typically used in IoT environments. By running the experiments on the Raspberry Pi 4, which is widely adopted in IoT deployments due to its affordability and sufficient processing capabilities, we are able to simulate real-world conditions for blockchain applications. The performance results indicate that HashLEA remains computationally efficient even on this hardware, which is critical for IoT devices that often operate with limited resources.
The comparison with other hash algorithms, including Scrypt, Quark, Keccak, Groestl, Skein, Blake, and others, further emphasizes HashLEA’s advantage in terms of execution time and resource consumption (Figure 4). The processing of a 100 KB message is completed in 0.52 ms by HashLEA, while the closest competitor, Groestl, takes 2.1 ms. For larger inputs, such as 1 MB, HashLEA completes the hashing process in 5.39 ms, compared with 21.29 ms for Groestl. These factors are especially important for IoT devices, where processing speed directly impacts the scalability and performance of blockchain-based systems. In summary, by testing HashLEA on a Raspberry Pi 4, we can confidently assert that it is well suited for use in blockchain-based IoT applications, offering high performance and low computational overhead in environments with limited hardware resources. This demonstrates its potential for real-world implementation in decentralized IoT networks.
The energy consumption of the HashLEA algorithm is estimated from its execution times in experiments conducted on the Raspberry Pi 4. The calculation is based on voltage and frequency information, with power and current values determined according to the algorithm’s execution times. Voltage measurements are taken using the vcgencmd measure_volts command, and frequency values are recorded during idle and active states using the vcgencmd measure_clock arm command. The frequency is measured at 600 MHz during idle and 1.5 GHz during active execution. The voltage value is consistently recorded as 0.8375 V in both states. For the energy consumption calculation, the average power consumption of the Raspberry Pi 4 is assumed to be approximately 2 W. While this power may vary depending on the processor frequency and workload, it is treated as a constant value for this calculation. The relationship between power, voltage, and current is given by the equation $P = V \times I$. From this, the current $I$ can be calculated as $I = P / V$. Using this formula, estimated current values are calculated for both idle and active states. During idle, the current is assumed to be 1 A, while during active execution, the current is estimated to be 2.4 A. Energy consumption is then calculated for each execution time using the formula $E = P \times t$, where $E$ represents energy consumption, $P$ is power consumption, and $t$ is the execution time in seconds (converted from milliseconds). For 100 KB of input data (0.52 ms), this gives $E = 2.4\,\text{A} \times 0.8375\,\text{V} \times 0.52\,\text{ms} \approx 1.05\,\text{mJ}$; for 1 MB of input data (5.39 ms), it gives $E = 2.4\,\text{A} \times 0.8375\,\text{V} \times 5.39\,\text{ms} \approx 10.83\,\text{mJ}$. HashLEA’s closest competitor, Groestl, consumes more energy: while HashLEA’s energy consumption varies from 1.05 mJ to 10.83 mJ across the tested file sizes, Groestl consumes between 4.22 mJ and 42.79 mJ. HashLEA is therefore the more energy-efficient alternative, making it especially suitable for applications that prioritize low power consumption without compromising on performance. The calculated energy values for HashLEA and its competitors, including Scrypt, Quark, Keccak, Groestl, Skein, Blake, JH, Spongent, Lesamnta-LW, and Photon, are provided in Figure 5 for ease of reference.
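The energy estimate can be reproduced with a few lines of C. The sketch below simply encodes the relations P = V × I and E = P × t using the voltage, current, and execution times quoted in the text; the function names `current_amps` and `energy_mj` are hypothetical.

```c
#include <assert.h>

/* Current drawn for a given power and voltage: I = P / V. */
double current_amps(double power_w, double voltage_v)
{
    assert(voltage_v > 0.0);
    return power_w / voltage_v;
}

/* Energy in millijoules for a run lasting time_ms milliseconds:
   E = V * I * t, where watts times milliseconds gives millijoules. */
double energy_mj(double voltage_v, double current_a, double time_ms)
{
    assert(time_ms >= 0.0);
    return voltage_v * current_a * time_ms;
}
```

With these helpers, `current_amps(2.0, 0.8375)` is roughly 2.39 A, consistent with the 2.4 A estimate, and `energy_mj(0.8375, 2.4, 0.52)` and `energy_mj(0.8375, 2.4, 5.39)` reproduce the reported values of about 1.05 mJ and 10.83 mJ. The same calculation applied to Groestl’s 2.1 ms and 21.29 ms gives about 4.22 mJ and 42.79 mJ.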
6. Conclusions
This study presents HashLEA, a novel lightweight hash algorithm model based on the LEA encryption algorithm. The research began by considering the current literature in this area, paying attention to design principles, performance metrics, and applicability to constrained environments, including IoT and blockchain systems. Prominent algorithms such as Keccak, X11, Quark, Scrypt, and others were analyzed to identify their strengths, such as high levels of security and robustness, as well as their limitations, such as computational inefficiency, high power consumption, or lack of adaptability to resource-constrained devices. This comprehensive analysis provided a solid foundation for the design of the HashLEA algorithm, ensuring that it addresses the identified gaps while leveraging proven techniques from existing models. The internal structure of HashLEA has been carefully designed to balance performance against security.
Extensive performance and security tests were conducted to validate the HashLEA algorithm. The evaluations included frequency distribution analysis, which showed that the input values processed by HashLEA resulted in a uniform and balanced bit distribution in the hash outputs, highlighting the robustness of the algorithm against statistical predictability. Avalanche effect testing revealed a high degree of randomness, with small changes in the input, such as a single bit change, resulting in significant changes in the hash output. Furthermore, HashLEA exhibited strong collision resistance, achieving performance on par with or better than other established algorithms. This ensures that HashLEA minimizes the risk of producing duplicate outputs for different inputs, a critical property for cryptographic applications.
In addition to security testing, HashLEA has been compared for efficiency with selected key algorithms in the literature, including variants of the SHA family and other lightweight hash functions. The comparisons showed that HashLEA delivers comparable or better performance in key metrics such as computational efficiency while maintaining its lightweight characteristics. The performance benchmarking showed the efficiency of HashLEA across different message sizes. The proposed algorithm’s low execution time, coupled with its lightweight design, makes it highly suitable for IoT environments and other scenarios where computational resources are limited. Unlike heavyweight cryptographic algorithms, HashLEA offers a favorable balance between speed and resource utilization, making it particularly promising for blockchain applications in constrained environments.
Additionally, the performance of the HashLEA algorithm was evaluated on a real hardware platform, specifically the Raspberry Pi 4 (ARM processor), to assess its impact on blockchain-based IoT applications. Both execution time and energy consumption were measured to provide a comprehensive analysis of its efficiency. The tests demonstrated that HashLEA achieves high efficiency, even on low-power, resource-constrained devices. Comparisons with other hash algorithms showed that HashLEA offers faster processing times and lower energy consumption, highlighting its computational efficiency for IoT applications. These findings indicate that HashLEA is a suitable solution for blockchain-based IoT applications, providing strong performance and security, particularly on devices with limited resources.
The results of this study suggest that HashLEA provides a meaningful alternative to lightweight cryptographic solutions by improving the security/performance trade-off. It features both competitive and innovative properties compared with existing hash algorithms, especially for applications requiring strong hashing with minimum computational overhead.
Future work will focus on the practical implementation of HashLEA in IoT devices and blockchain systems, beyond Raspberry Pi. In addition, hardware integration of HashLEA on platforms such as FPGAs or ASICs will be explored to optimize its performance in real-world applications. Parallel processing and multi-core support will also be investigated to assess its scalability for large datasets and computational tasks.