**1. Introduction**

The Internet of Things (IoT) is based on the idea that objects, or things, can use wireless communication to communicate with each other [1]. IoT has attracted the attention of societies, governments, and industries for a wide range of applications, including smart homes, healthcare services, environmental monitoring, smart transportation, smart networks, security, fire detection, finance tracking, and smart lighting [2].

In this context, wireless sensor networks (WSNs) play an important role in increasing the number of networks built from low-cost smart devices that can be easily installed. WSNs are widely used in fields such as the environment, health, and the military. Examples include wildlife and environment supervision, health monitoring, pediment supervision, border supervision and control, and security [3–5]. Each node is composed of a sensor unit, a processing unit (microcontroller), a radio communication unit, and an energy source. Figure 1 shows these components.

**Figure 1.** Components of the sensor node.

Wireless nodes typically have capabilities such as sensing, computing, and self-organized routing and data transmission to a base station (BS) [6]. However, they are limited by short-range communication, low bandwidth, restricted processing and storage and, particularly, energy [7]. Energy consumption is one of the main problems in IoT systems based on wireless nodes. Each node is powered by a battery that stores a limited amount of energy, and because deployed nodes are often not easily accessible, recharging or replacing the battery is rarely possible. Energy is consumed mainly in sensing, data processing, and data transmission, with data transmission accounting for most of it [7]. Reducing energy consumption in these networks is therefore an ongoing research topic [8,9].

Two factors that can significantly affect the lifetime of a network are energy-efficient routing between nodes and balanced energy consumption among them. One method to reduce energy consumption in these networks is to find an optimal route by applying clustering [7,9,10].

In the clustering method, nodes are divided into clusters and one node within each cluster is selected as the cluster head (CH). The cluster member (CM) nodes sense the environment and send their data to the CH. The CH receives the data from the CMs and aggregates them; finally, the aggregated data are transferred to the BS either directly or indirectly through intermediate nodes. In fact, the purpose of clustering is to find an optimal route for sending data to the BS [11].

Clustering protocols are usually classified as static or dynamic. In static protocols, clustering is performed once and the nodes always remain in the same cluster. Virtual concentric circle band-based clustering (VCCBC) [12] and an energy-efficient protocol with static clustering (EEPSC) [13] are two examples of static clustering methods in WSNs. Static protocols reduce network overhead but become unstable over long periods of operation; their main disadvantage is that they deplete the energy of several nodes [14,15]. In dynamic protocols, clustering is performed in each round, so new clusters form during each round. Low-energy adaptive clustering hierarchy (LEACH) [3] is an example of a dynamic clustering method. Dynamic clustering can improve network lifetime, but it usually has a high overhead [14,15].

Clustering is one of the most important energy-efficient techniques. In this technique, the sensor nodes are organized into groups termed clusters. The regular nodes in a cluster are called cluster members, and a CH is selected among them [16,17]. To protect the network from the hot spot issue, unequal clustering techniques can be utilized for load balancing between the CHs [18,19].

Recently, hybrid static–dynamic methods have been proposed for clustering. The hybrid unequal clustering with layering protocol (HUCL) [15] is one example. This method is a hybrid of static and dynamic clustering: as in static protocols, the clusters remain the same during several rounds, and as in dynamic protocols, clustering is repeated over the lifetime of the network. Within each cluster, a node is assigned as CH and remains CH until another node is selected as the new CH. After some rounds, clustering and cluster formation are performed again. This procedure continues throughout the lifetime of the network. The hybrid method can reduce overhead while also improving network stability and lifetime [14,15].

Furthermore, clustering protocols and CH selection methods for wireless nodes are typically divided into two categories: centralized and distributed clustering [10,14]. In the centralized clustering method, the BS uses global knowledge of the network to cluster the nodes, so it must collect information about the status of the nodes within the network. This method is not applicable to large-scale networks [14]. In the distributed method, nodes perform routing in a self-organized manner without requiring additional information about the network. In distributed algorithms, each node decides its own CH probability based on some parameters [10], and the BS has no effect on CH selection. In contrast to centralized clustering, distributed methods are more efficient for large-scale networks, and they incur less overhead because the messages exchanged between nodes and the BS are omitted [14].

In this paper, given the above-mentioned advantages, a novel hybrid static–dynamic protocol is proposed, with an emphasis on distributed clustering. The probability of selecting a node as a CH is determined on the basis of the energy level, the number of neighbors, and the distance to the BS of each sensor and its neighbors. Moreover, CMs send data packets to the CH using an assistance to cluster heads (ACHs) mechanism, and CHs send packets to the BS through an energy-aware multi-hop routing method. Clustering is performed unequally: CHs close to the BS have a smaller radius, so they have fewer CMs and consume less energy receiving intracluster data. In this way, they can spend more energy on receiving data from CHs far from the BS.

The remainder of this paper is organized as follows. A literature review is presented in Section 2. Section 3 gives our proposed algorithm in detail. Analysis of our proposed HCD algorithm is further discussed in Section 4. In Section 5, extensive evaluation and simulation results are given with discussion and analysis and Section 6 concludes this paper.

#### **2. Literature Review**

In the last decades, extensive research has been carried out around wireless nodes clustering. Some studies were performed on the basis of centralized clustering whilst some others presented a distributed clustering method.

Widely used clustering algorithms include LEACH, LEACH-centralized (LEACH-C) [3], and hybrid energy-efficient distributed clustering (HEED) [20].

One of the main distributed clustering algorithms used to reduce energy consumption in WSNs is the LEACH algorithm. In this algorithm, the CH is selected by random rotation, which distributes the energy load throughout the network. If fixed nodes were selected as CHs, their energy would be depleted quickly and they would die earlier than other nodes. Hence, nodes are selected as CHs with a fixed probability, and they announce themselves to the whole network. Since CHs are selected probabilistically, they may be close to each other; one of the problems of LEACH is therefore the uneven distribution of CHs in the environment.

Another widely used algorithm is HEED [20]. In this algorithm, each node generates a random number between zero and one. If the number is smaller than the node's CH probability, the node is selected as a tentative CH. If the CH probability is equal to one, the node becomes a final CH; otherwise, it remains a tentative CH and its CH probability is doubled. The same steps are repeated in each iteration: a node whose CH probability has reached one becomes a final CH, while any other node draws a new random number, becomes a tentative CH if the number is smaller than its CH probability, and doubles its CH probability. This procedure continues until the CHs are selected. CHs are selected according to the remaining energy of the node, with the communication cost inside the cluster considered as the second parameter. Finally, each node joins the CH to which its distance is smallest.

Another distributed clustering algorithm is the density and distance based CH selection algorithm (DDCHS) [21]. In the clustering step, the area is virtually divided into hexagons so that overlap between circular clusters is prevented and clusters have defined borders. For each hexagon, a CH is selected and, subsequently, subcircles are defined inside the virtual hexagons according to the average distance of ordinary nodes to the cluster center. This algorithm is composed of three steps: (i) local grouping, which determines a center along the length and width of the clustering area and divides the area into four equal parts; (ii) comparison of node density, in which the node density of each part is determined and one part is selected as the candidate quarter; and (iii) comparison of node distances, in which the distance from each node to the other nodes in the candidate quarter is computed and the node with the shortest distance to the other nodes is selected as the CH.

An energy-aware distributed dynamic clustering protocol based on fuzzy logic (ECPF) [22] was introduced in 2012. This method performs clustering using fuzzy logic and takes the energy of nodes as a nonprobabilistic parameter. The node selection is done sporadically. The protocol reduces network overhead by providing a mechanism similar to a setup phase rather than performing the setup phase in each round. With regard to energy saving, the purpose of this algorithm is to select, by means of a fuzzy system, a set of appropriate clusters that covers the whole region. The input parameters of this fuzzy system include the degree and the center of the nodes. The use of fuzzy logic in selecting the CH and the reduced overhead contribute to improving the lifetime of the network. One of the challenges of this clustering method is the lack of complete network coverage over time.

Other solutions, such as energy-based clustering for WSNs lifetime optimization and balancing energy consumption in clustered WSNs (BLAC), were proposed by Ducrocq et al. [23]. BLAC uses the energy level together with the degree or density of nodes for clustering. The protocol is distributed and dynamic. BLAC balances the energy consumption between nodes and improves network lifetime and stability. To balance the energy, the CH role rotates between nodes. These methods incur overhead.

The energy and coverage-aware distributed clustering (ECDC) protocol [24] was proposed in 2014. The protocol is based on two components: energy and coverage. In this protocol, nodes share their information with their neighbors to calculate a delay time, and CHs are elected according to the calculated delay time. The ECDC protocol improves coverage and thereby contributes to decreased energy consumption and an increased network lifespan. However, this method is dynamic and incurs higher overhead.

The hybrid unequal clustering with layering protocol (HUCL) was presented in 2015 [15]. HUCL is a hybrid of dynamic and static clustering methods in which clusters closer to the BS are smaller. CHs are selected based on the energy status of the node, the distance to the BS, and the number of neighbors, and data are transferred to the BS over multiple hops. Each node shares its own information with its neighbors. Each node then computes its own delay time, CHs are selected according to the computed delay times, and the remaining nodes join the nearest CH. A CH that has no member node changes its own state to member node and joins the nearest CH. The data transmission step is divided into time periods. Member nodes send their data to the CH, and CHs use multi-hop routing to transmit data to the BS. Simulation results show that, compared to other available protocols, HUCL is able to reduce the network overhead, optimize energy consumption, and increase the network lifetime. However, this algorithm does not take the distance of a node to the BS into account when computing the cluster radius, the energy consumption of nodes, or the delay time.

An improved energy-aware distributed unequal clustering protocol (EADUC-II) was proposed in 2016 [25]. In this method, clusters are formed with unequal sizes so that clusters near the BS are smaller. The competition radius takes additional parameters into account, including node energy, and routing between CHs uses node energy as the criterion for selecting the next hop. However, the distance to the BS is not considered when computing the node delay time and node density. Although this method is a hybrid one, its reduction in overhead is small.

Another algorithm is the unequal multi-hop balanced immune clustering protocol (UMBIC), which was proposed in 2016 [26]. In multi-hop routing, CHs near the BS deplete their energy because they relay the data of farther CHs. This protocol supports WSNs of various sizes, with homogeneous or heterogeneous nodes and distinctive densities, and results in an improved network lifetime. UMBIC uses an unequal clustering method to optimize the energy consumption in intra- and intercluster routing, forming unequal clusters based on the distance of a node to the BS and the energy level of the node. UMBIC also uses a multi-objective immune algorithm to build the routing tree with the aim of minimizing the cost of the nodes' relationship. Hence, this method can effectively reduce the network overhead and improve network lifetime.

A grid-based reliable routing protocol (GBRR) was proposed in 2016 [27]. This protocol improves the quality of intracluster and intercluster communication by creating virtual clusters based on square grids and by properly selecting the hops. The protocol divides the network into equal square grids, each of which may contain one or more nodes. Clustering is performed according to local information about the condition of the nodes and grids, in such a way that one cluster can occupy one or more grids. To reduce the overhead of CHs, the multi-hop routing algorithm calculates the most efficient route between clusters; therefore, the source node does not need to transmit the data through the intermediate CHs to the BS. The CH competition stage of this method could be improved by considering the node's distance to the BS.

An energy-efficient QoS routing for WSNs using a self-stabilizing algorithm was proposed in 2015 [28]. This paper presented a self-stabilizing hop-constrained energy-efficient (SHE) clustering and multi-hop routing algorithm for reducing delay and improving the quality of packet transmission. The protocol is a hybrid clustering method: CHs are determined by the BS deterministically and offline, while routing is distributed and online. The advantage of this protocol is that clustering is initialized only once, at the start of the network. Within a cluster, the protocol uses TDMA schedules produced by the clustering algorithm so that cluster nodes send their data to the CH with a tolerable delay. An adaptable routing protocol was also proposed for data transmission between CHs and the BS.

Chanak et al. [29] proposed an energy-aware distributed routing algorithm to tolerate network failure in WSNs. This protocol has specific routing schemes for better tolerating network failure, and it includes three new algorithms. In distributed energy-efficient heterogeneous clustering (DEEHC), network clustering is done according to residual energy in order to minimize the energy required for data transmission. During clustering, each sensor node finds k-vertex disjoint paths for sending data to the CH according to the energy levels of neighbor nodes; that is, routing between nodes, CHs, and the BS is based on the proposed k-vertex disjoint path routing (KVDPR) algorithm. In addition, the route maintenance mechanism (RMM) algorithm enables nodes and CHs to maintain a route based on the energy conditions of neighbor nodes and thereby prevents failure. In the DEEHC algorithm, when the CH is determined using the computed time, the distance to the BS and node density are not considered.

Naeem et al. [30] proposed a dynamic and cooperative clustering method together with a new technique called the neighborhood formation scheme. This algorithm aims to distribute the energy demand among nodes and to optimize the number of sensors involved in detecting and reporting events. The algorithm is distributed and dynamic. Results show that the proposed framework improves network lifetime and the reliability of data transmission.

An energy-aware multi-hop routing (EAMR) protocol was proposed in 2017 [31]. This method aims to reduce overhead by reducing how often CHs change. A method similar to LEACH [3] is used to select the initial CHs and form clusters. EAMR allows a node to operate as a cluster head as long as its energy is not lower than a threshold; therefore, in this algorithm, CHs change only when required. To improve network lifetime, this algorithm employs multi-hop routing. However, the membership of nodes in a cluster does not change until the end, and many important parameters, such as node density and distance from the BS, are not considered when selecting the initial CHs.

To sum up, this section has summarized clustering protocols designed to increase the lifetime of wireless nodes and has reviewed recent hybrid methods that reduce overhead and increase that lifetime.

Table 1 shows a summary and comparison of some of the clustering algorithms.


**Table 1.** Comparison of some of the clustering algorithms.

The proposed protocol is an improvement of the HUCL protocol and is described in the following sections. The improvement stems from the fact that our protocol is a novel hybrid unequal clustering method that uses a simple, efficient algorithm for selecting the CH node based on density, energy level, and distance to the BS. It also proposes a new mechanism that assists the CH with intracluster data transmission and improves intercluster data transmission using a layered mechanism. The main objective is to provide a high-precision clustering protocol that balances energy consumption, decreases overhead, substantially increases network lifetime and, ultimately, improves on existing methods.

#### **3. The Proposed HCD Algorithm**

In this section, we introduce hybrid unequal multi-hop clustering based on density (HCD) to improve network lifetime and throughput. HCD is a distributed method in which clustering is performed as a hybrid of static and dynamic methods. We assume that nodes can estimate their distance from the received signal strength indicator (RSSI), and CH selection takes into account the energy level and the situation of nodes in the network. Since a CH consumes more energy than other nodes, nodes without a suitable energy status are prevented from being selected as CHs, which increases network stability. In addition, an energy-aware multi-hop method is used for routing between CHs.

#### *3.1. Network Model*

The primary elements of our model are an IoT network of wireless nodes and a BS with an unlimited power supply connected to the network. Data are sampled by sensor nodes and routed to the BS. Each node can act as a CH or a non-CH node. Some assumptions of the network model are as follows.


The first-order radio model of energy consumption used in the proposed method is the same as in the LEACH protocol [3]. The energy consumed by transmission is given by Equation (1).

$$E\_{TX}(i, K, d\_{ij}) = \begin{cases} E\_{elec}K + E\_{fs}Kd\_{ij}^{2} & \text{if } d\_{ij} \le d\_0 \\\\ E\_{elec}K + E\_{mp}Kd\_{ij}^{4} & \text{if } d\_{ij} > d\_0 \end{cases} \tag{1}$$

where *K* is the number of data bits and *dij* is the distance between two nodes *i* and *j*. *Eelec* is the energy consumed by the sender or receiver circuit per data bit. *Efs* and *Emp* are the energies consumed by the sender amplifier per data bit in the free-space and multipath models, respectively, depending on the distance between the receiver and sender. The threshold distance *d*<sub>0</sub> is obtained from Equation (2).

$$d\_0 = \sqrt{\frac{E\_{fs}}{E\_{mp}}}\tag{2}$$

In this energy model, the energy consumed by a receiver is given by Equation (3).

$$E\_{RX}(i, K) = E\_{elec}K \tag{3}$$
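As an illustration, Equations (1) to (3) can be sketched in a few lines of Python. The numeric constants below are placeholder values chosen only for the example (they are assumptions, not parameters reported in this paper), and the function names are hypothetical.

```python
import math

# Placeholder radio parameters (assumed for illustration only).
E_ELEC = 50e-9       # J/bit, circuit energy per bit
E_FS = 10e-12        # J/bit/m^2, free-space amplifier energy
E_MP = 0.0013e-12    # J/bit/m^4, multipath amplifier energy


def threshold_distance(e_fs=E_FS, e_mp=E_MP):
    """Equation (2): d0 = sqrt(E_fs / E_mp)."""
    return math.sqrt(e_fs / e_mp)


def tx_energy(k_bits, d):
    """Equation (1): energy to transmit k_bits over distance d (metres)."""
    if d <= threshold_distance():
        return E_ELEC * k_bits + E_FS * k_bits * d ** 2
    return E_ELEC * k_bits + E_MP * k_bits * d ** 4


def rx_energy(k_bits):
    """Equation (3): energy to receive k_bits."""
    return E_ELEC * k_bits
```

With these assumed constants the threshold distance is roughly 88 m, so a 50 m link falls under the quadratic branch and a 100 m link under the quartic branch.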

According to the data aggregation model used in our simulation, it is assumed that the information collected by a set of *N* nodes can be packed into a single *k*-bit packet.

#### *3.2. Protocol Performance*

After the nodes are deployed, a layering stage is performed. Each node computes its distance from the BS: the BS broadcasts a signal that is heard by all nodes, and each node estimates its distance to the BS from the RSSI. The node then reports its position to the BS, and the BS starts layering the network. First, the BS calculates the difference between the distances of the closest and the furthest nodes and, based on experiments, defines four layers for the network. The BS then sends messages through the network to assign a layer ID to each node.
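The layering step can be sketched as follows. The text states that the BS uses the range between the closest and furthest nodes and defines four layers, but it does not state how the layer boundaries are placed; the sketch assumes equal-width distance bands, and `assign_layers` is a hypothetical helper name.

```python
def assign_layers(distances, num_layers=4):
    """Map each node to a layer ID in 1..num_layers.

    distances: dict of node id -> estimated distance to the BS.
    Assumption: layers are equal-width bands between the minimum
    and maximum distance (the paper does not specify the boundaries).
    """
    d_min = min(distances.values())
    d_max = max(distances.values())
    width = (d_max - d_min) / num_layers
    layers = {}
    for node, d in distances.items():
        if width == 0:
            layers[node] = 1  # all nodes are equally far from the BS
        else:
            # The furthest node would land in band num_layers + 1, so clamp.
            layers[node] = min(int((d - d_min) / width) + 1, num_layers)
    return layers


layers = assign_layers({"a": 10.0, "b": 35.0, "c": 60.0, "d": 110.0})
```

In this example the bands are 25 m wide, so the four nodes are assigned to layers 1, 2, 3, and 4, respectively.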

All HCD operations belong to either the cluster initialization stage or the data transmission stage. The cluster initialization stage is composed of four phases: the delay time calculation phase, the CH selection phase, the cluster formation phase, and the route construction phase. The data transmission stage is divided into a number of major slots. Each major slot consists of several rounds and two substages, called CH rotation and route adjustment. Each round consists of an intracluster transmission phase and an intercluster transmission phase. In CH rotation, the CH role rotates among member nodes to prevent the CH from exhausting its energy. The operation of HCD is displayed in Figure 2, and Figure 3 shows the flowchart of the proposed algorithm.

**Figure 2.** Hybrid unequal multihop clustering based on density (HCD) operation.

**Figure 3.** Flowchart of the HCD protocol.

#### 3.2.1. Cluster Initialization

In this stage, network nodes are divided into groups to form clusters. In the first phase, when the network is operational, the probability of each node to become a CH and its cluster radius are determined based on the energy level of the node and its distance to the BS. Nodes calculate their own cluster radius according to Equation (4).

$$R\_c(i) = \left[1 - \alpha\left(\frac{d\_{BS,max} - d\_{i,BS}}{d\_{BS,max} - d\_{BS,min}}\right) - \beta\left(1 - \frac{E\_{rem}(i,r)}{E\_{max}}\right)\right] RL\_{max} \times \lambda \tag{4}$$

In Equation (4), *Rc*(*i*) is the cluster radius of node *i* and *RLmax* is the predetermined maximum competition radius for being a CH. *Erem*(*i,r*) is the remaining energy of node *i* in round *r*, and *Emax* is the maximum energy capacity of the node. *dBS,max* and *dBS,min* are the maximum and minimum distances of nodes from the BS, and *di,BS* is the distance of the *i*th node from the BS. *α* and *β* are weight factors that can vary between zero and one. *λ* is the weight factor associated with the layer in which the node is located. In the lower layers, *RLmax* is multiplied by a smaller coefficient, so nodes closer to the BS have a smaller radius; in contrast, clusters further away are multiplied by a greater coefficient and have a larger radius. Accordingly, a CH close to the BS has more energy available to receive and transmit the data of CHs that are further from the BS.
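A minimal sketch of Equation (4) is given below. The default values of `alpha` and `beta` and the example inputs are assumptions chosen for illustration; the paper only states that both weights lie between zero and one.

```python
def cluster_radius(d_i_bs, d_bs_min, d_bs_max, e_rem, e_max,
                   rl_max, lam, alpha=0.3, beta=0.3):
    """Equation (4): competition radius of node i.

    alpha, beta: weight factors in [0, 1] (default values are assumed).
    lam: layer weight factor (lambda in the text).
    """
    span = d_bs_max - d_bs_min
    # Equals 1 for the node closest to the BS and 0 for the furthest one.
    dist_term = (d_bs_max - d_i_bs) / span if span > 0 else 0.0
    # Equals 0 for a full battery and grows as the node drains.
    energy_term = 1.0 - e_rem / e_max
    return (1.0 - alpha * dist_term - beta * energy_term) * rl_max * lam


far = cluster_radius(100.0, 10.0, 100.0, 1.0, 1.0, rl_max=40.0, lam=1.0)
near = cluster_radius(10.0, 10.0, 100.0, 1.0, 1.0, rl_max=40.0, lam=1.0)
```

With the same layer weight and full batteries, the furthest node keeps the full radius `rl_max * lam`, while the node closest to the BS is reduced by the factor `1 - alpha`.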

Before computing the delay time, each node shares its location information and energy level with its neighbors, so that every node within its radio range receives its message. Once a node has heard from all of its neighbors, it can calculate their average energy using Equation (5).

$$E\_{Ave}(i, r) = \frac{\sum\_{j \in N\_{nbr}(i, r)} E\_{rem}(j, r)}{\max(|N\_{nbr}(i, r)|, \varepsilon)}\tag{5}$$

*Nnbr*(*i,r*) is the set of neighbor nodes of node *i* in round *r*, and *|Nnbr*(*i,r*)*|* is the node degree, i.e., the number of neighbors of node *i* in round *r*. The parameter ε is a very small number that avoids division by zero. Each node assigns itself a score, which drives the density-based routine for becoming a CH. In the first cluster initialization stage, the score of each node is zero. Each node then examines its neighbors and, if the energy level of a neighbor is smaller than its own energy level, increases its score by one. The score needed for each node to become a CH is updated by the computation given in Equation (6).

$$\forall j \in N\_{nbr}(i, r): \quad \left\{ \begin{array}{ll} \text{if } E\_{rem}(i, r) > E\_{rem}(j, r) & \text{then } p(i) = p(i) + 1 \\\\ \text{if } |N\_{nbr}(i, r)| > |N\_{nbr}(j, r)| & \text{then } p(i) = p(i) + 1 \\\\ \text{if } id\_{Layer}(i) < id\_{Layer}(j) & \text{then } p(i) = p(i) + 1 \\\\ \text{if } d\_{i,BS} < d\_{j,BS} & \text{then } p(i) = p(i) + 1 \end{array} \right. \tag{6}$$

*p*(*i*) is the score of node *i*, and *idLayer*(*i*) is the ID of the layer in which node *i* is located. In this way, nodes that have suitable energy, a higher number of neighbors, and a smaller distance to the BS acquire a higher score for being a CH. In this protocol, the score of every alive node is set to zero in the initial round, and Equation (6) is applied in each round for the *i*th node over all of its neighbors in *Nnbr*(*i,r*). After computing its score, each node computes its delay time for announcing itself as a CH according to Equation (7).

$$T\_w(i, r) = \begin{cases} \frac{1}{p(i)} \times V\_r \times T\_2 & \text{if } E\_{rem}(i, r) \ge E\_{Ave}(i, r) \\\\ V\_r \times T\_2 & \text{otherwise} \end{cases} \tag{7}$$

In this Equation, *Tw* is the delay time of node *i*, *T*<sub>2</sub> is the duration of the second phase, and *Vr* is a random number in the range 0.9 to 1 that introduces some variation between nodes. In this paper, a neighbor of node *i* is a node located at a distance smaller than the radius of node *i*, which is calculated according to Equation (4). Since a CH consumes more energy, nodes with an undesirable energy status are prevented from becoming CHs, which increases network stability.
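The following sketch strings Equations (5) to (7) together for a single node. The dictionary layout of a node record is an assumption made for the example, and the guard for a score of zero is also an assumption, since Equation (7) is undefined when *p*(*i*) = 0.

```python
import random


def average_neighbor_energy(neighbor_energies, eps=1e-9):
    """Equation (5): mean remaining energy of the neighbors."""
    return sum(neighbor_energies) / max(len(neighbor_energies), eps)


def score(node, neighbors):
    """Equation (6): one point per neighbor and per criterion won."""
    p = 0
    for nb in neighbors:
        p += node["energy"] > nb["energy"]   # more remaining energy
        p += node["degree"] > nb["degree"]   # more neighbors
        p += node["layer"] < nb["layer"]     # lower layer ID
        p += node["d_bs"] < nb["d_bs"]       # closer to the BS
    return p


def delay_time(p, e_rem, e_ave, t2, rng=random):
    """Equation (7); falls back to V_r * T2 when p == 0 (assumed guard)."""
    v_r = rng.uniform(0.9, 1.0)
    if e_rem >= e_ave and p > 0:
        return v_r * t2 / p
    return v_r * t2


node = {"energy": 1.0, "degree": 5, "layer": 1, "d_bs": 20.0}
nbrs = [{"energy": 0.8, "degree": 3, "layer": 2, "d_bs": 40.0},
        {"energy": 1.2, "degree": 5, "layer": 1, "d_bs": 15.0}]
p = score(node, nbrs)
e_ave = average_neighbor_energy([nb["energy"] for nb in nbrs])
t_w = delay_time(p, node["energy"], e_ave, t2=10.0, rng=random.Random(0))
```

Here the node beats its first neighbor on all four criteria and its second neighbor on none, so its score is 4 and its delay time lies between 0.9 × 10 / 4 and 10 / 4.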

In the second phase, each node waits until its delay time expires. If a node does not receive a CH message from its neighbors during the delay time, it announces itself as a CH when the delay time ends. A node with a shorter delay time therefore has a higher probability of becoming a CH. On the other hand, if a node receives a CH message, it stops its own timer and cannot be selected as a CH.
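The timer-based election described above can be sketched as a simple event-ordered simulation. It assumes idealized conditions that the text does not state: CH messages are delivered instantly and without loss, and the neighbor relation is given explicitly as a dictionary. The function name is hypothetical.

```python
def elect_cluster_heads(delays, neighbors):
    """Return the set of nodes that announce themselves as CH.

    delays: dict of node id -> delay time.
    neighbors: dict of node id -> set of nodes that hear its CH message.
    Assumption: messages arrive instantly, so processing nodes in
    order of increasing delay reproduces the timer behaviour.
    """
    heads = set()
    silenced = set()  # nodes that heard a CH message and stopped their timer
    for node in sorted(delays, key=delays.get):
        if node in silenced:
            continue
        heads.add(node)
        silenced.update(neighbors.get(node, ()))
    return heads


delays = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
heads = elect_cluster_heads(delays, neighbors)
```

In this chain topology node `a` fires first and silences `b`; node `c` has heard nothing when its timer expires, so it also becomes a CH and silences `d`.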

After the CHs are selected, the next phase is cluster formation. Each node selected as a CH broadcasts a head message in the network. On receiving the head messages, each non-CH node joins the closest CH and transmits a join message containing its energy level and its distance to the CH. The presented protocol applies an assistance to CH mechanism that allows a CH to use member nodes to share the traffic load. Each CH applies intracluster layering: nodes whose distance to the CH is less than the threshold (the average distance) are placed in the first layer, and nodes whose distance is greater than the threshold are placed in the second layer. The CH then identifies the nodes in layer 1 whose energy is more than half of the average energy as assistants to the CH (ACHs). If a CM is located in layer 2, the CH chooses the ACH closest to that CM, and the CM sends its packets to this ACH. If instead the CM is located in layer 1, that is, its distance to the CH is less than the threshold distance, it transmits directly to the CH. The CH schedules the cluster so that farther nodes transmit earlier; therefore, when an ACH receives a packet from its neighbor, it aggregates the packet with its own data and sends the result to the CH. If there is no ACH between the source node and the CH, the source node sends its data directly to the CH. The CH schedules the nodes using time division multiple access (TDMA) and sends the scheduling information to the members. Figure 4 shows the flowchart of intracluster routing in HCD.
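The intracluster layering and ACH selection can be sketched as below. The sketch simplifies one point: a layer-2 node falls back to direct transmission only when the cluster has no ACH at all, whereas the text speaks of an ACH lying between the source node and the CH. The member record layout, the function name, and the `energy_factor` parameter are assumptions for illustration.

```python
import math


def plan_intracluster(ch_pos, members, energy_factor=0.5):
    """Split members into layers, pick ACHs, and choose each next hop.

    members: dict of node id -> (x, y, remaining_energy).
    energy_factor: ACHs need more than this fraction of the average
    energy (0.5 follows "more than half of the average energy").
    """
    if not members:
        return set(), set(), {}
    dist = {m: math.dist(ch_pos, (x, y)) for m, (x, y, _) in members.items()}
    avg_dist = sum(dist.values()) / len(dist)
    avg_energy = sum(e for _, _, e in members.values()) / len(members)
    layer1 = {m for m in members if dist[m] < avg_dist}
    achs = {m for m in layer1 if members[m][2] > energy_factor * avg_energy}
    next_hop = {}
    for m, (x, y, _) in members.items():
        if m in layer1 or not achs:
            next_hop[m] = "CH"  # direct transmission
        else:
            # Layer-2 node: relay through the closest ACH.
            next_hop[m] = min(
                achs, key=lambda a: math.dist((x, y), members[a][:2]))
    return layer1, achs, next_hop


members = {"m1": (1, 0, 1.0), "m2": (0, 2, 0.1),
           "m3": (6, 0, 1.0), "m4": (0, 7, 1.0)}
layer1, achs, next_hop = plan_intracluster((0, 0), members)
```

In this example the average distance is 4, so `m1` and `m2` form layer 1; only `m1` has enough energy to act as an ACH, and both layer-2 nodes relay through it.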

**Figure 4.** Routing flowchart of intraclustering in HCD.

The fourth phase constructs a data transmission path to the BS using two messages, route request and route reply. Member nodes are inactive in this phase. According to its distance from the BS, each CH broadcasts a route request containing its identifier, energy level, number of cluster member nodes, and distance to the BS. A CH that receives such a message updates its routing table. A CH transmits its data packets to the BS either directly or through multiple hops: after computing the transmission cost to the BS through each intermediate CH, it selects the intermediate CH with the minimum cost for multi-hop transmission. CHs in layer 1 transmit directly to the BS. The value of the evaluation parameter is computed according to Equation (8).

$$relay(i) = \begin{cases} E\_{TX}(i, l, d\_{i,j}) + E\_{TX}(j, l, d\_{j,nexthop\_j}) + E\_{RX}(j, l) & \text{if } E\_{rem}(i, r) \ge E\_{TX}(i, l, d\_{i,j}) \text{ and } E\_{rem}(j, r) \ge E\_{TX}(j, l, d\_{j,nexthop\_j}) + (E\_{RX}(j, l) \times (M(j) + R(j) + 1)) \\\\ \text{Inf} & \text{otherwise} \end{cases} \tag{8}$$

where *ETX*(*i,l,di,j*) is the energy required to transmit data from CH *i* to CH *j*, *ETX*(*j,l,dj,nexthopj*) is the energy required to transmit data from CH *j* to its next hop, and *ERX*(*j,l*) is the energy required to receive data at node *j*. *M*(*j*) is the number of member nodes of CH *j*, and *R*(*j*) is the number of CHs for which node *j* acts as a relay node and receives data. Each CH computes the cost if there is a CH between it and the BS and if it receives a route request. After selecting the CH with the minimum value, CH *i* sends a route reply message to the CH of the next hop.
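A minimal sketch of the relay cost in Equation (8) and of the minimum-cost selection is shown below. The energy terms are passed in as precomputed numbers, the argument names are hypothetical, and returning `None` when every candidate has infinite cost is an assumption (the caller could then fall back to direct transmission).

```python
import math


def relay_cost(e_tx_ij, e_tx_j_next, e_rx_j, e_rem_i, e_rem_j, m_j, r_j):
    """Equation (8): cost for CH i of relaying through candidate CH j."""
    # Energy CH j needs to forward once and to receive from its own
    # members, from the CHs it already relays for, and from CH i.
    j_budget = e_tx_j_next + e_rx_j * (m_j + r_j + 1)
    if e_rem_i >= e_tx_ij and e_rem_j >= j_budget:
        return e_tx_ij + e_tx_j_next + e_rx_j
    return math.inf


def choose_next_hop(candidates):
    """Pick the candidate CH with minimum relay cost, or None."""
    costs = {j: relay_cost(**kw) for j, kw in candidates.items()}
    best = min(costs, key=costs.get, default=None)
    if best is None or math.isinf(costs[best]):
        return None
    return best


candidates = {
    "j1": dict(e_tx_ij=3e-4, e_tx_j_next=2e-4, e_rx_j=1e-4,
               e_rem_i=1.0, e_rem_j=1.0, m_j=5, r_j=1),
    # j2 is cheaper per hop but lacks the energy to serve its load.
    "j2": dict(e_tx_ij=1e-4, e_tx_j_next=1e-4, e_rx_j=1e-4,
               e_rem_i=1.0, e_rem_j=5e-4, m_j=5, r_j=1),
}
best = choose_next_hop(candidates)
```

Candidate `j2` would be cheaper, but its remaining energy is below the budget required by the second condition of Equation (8), so its cost is infinite and `j1` is chosen.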

In Algorithm 1, lines 3 to 15 correspond to the first phase, the delay time calculation phase. Lines 17 to 27 and lines 28 to 46 correspond to the second and third phases, the CH selection phase and the cluster formation phase, respectively. Within the third phase, the computation of the intracluster layers is represented in line 41. In this step, the CH divides the CMs into two layers. Nodes that are in the first layer, that is, close to the CH, and whose energy is higher than the average energy of the nodes are selected as ACHs, and the schedule is formulated such that farther nodes transmit data sooner. Then, the schedule is broadcast. Lines 47 to 63 represent the route construction phase.



#### 3.2.2. Data Transmission

The data transmission stage includes several major slots. Each major slot consists of several rounds, a CH rotation, and an adjustment route. Each round consists of two phases: (i) intracluster data transmission and (ii) intercluster data transmission.

#### Round

In the first phase of each round, the nodes send their data to the CH according to the schedule set by the CH. When the CH receives the packets from its members, it aggregates them. In the second phase, after the nodes have sent their data to the CH, each CH sends the data to the BS through the path created in the route construction phase. The route is designed so that the energy consumed in sending each packet is minimized. In this phase, the protocol uses the carrier-sense multiple access (CSMA) method to transfer data.

#### CH Rotation

Except in the last major slot, a CH rotation takes place at the end of each major slot. In this part, the cluster member nodes send their data to their own CH. The CH selects the cluster member node with the most energy as the CH for the next rounds. After collecting the relevant information on layering in the cluster, node scheduling, and the data transmission path to the BS, the current CH sends this information to the new CH and then switches to being a member node.

#### Adjustment Route

After the CH rotation, one adjustment route part follows. In this part, the cluster formation phase and the route construction phase of the cluster initialization stage are run again with slight differences. The only difference in the cluster formation phase is that the new CH obtains the information needed for layering and for determining the ACH nodes from the previous CH; all other actions are unchanged.
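The structure of the data transmission stage (major slots made of rounds, each followed by a CH rotation and an adjustment route, except the last) can be sketched as a simple event generator. The function name `transmission_schedule` and the event labels are hypothetical and serve only to show the ordering described above.

```python
def transmission_schedule(major_slots, rounds_per_slot):
    """Yield (major_slot, event, round_number) tuples in execution order.

    Each round has an intracluster phase followed by an intercluster phase.
    Every major slot except the last ends with a CH rotation and an
    adjustment route.
    """
    for slot in range(1, major_slots + 1):
        for rnd in range(1, rounds_per_slot + 1):
            yield (slot, "intracluster", rnd)
            yield (slot, "intercluster", rnd)
        if slot < major_slots:
            yield (slot, "ch_rotation", None)
            yield (slot, "adjustment_route", None)


if __name__ == "__main__":
    for event in transmission_schedule(major_slots=2, rounds_per_slot=2):
        print(event)
```

With *M* major slots, the rotation and adjustment events occur *M* − 1 times, which is the multiplier that appears in the control message count of the data transmission stage.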

#### **4. Analysis of HCD**

**Lemma 1.** *The control message complexity of clustering is O(N), and in the proposed method, in the worst-case scenario, it is reduced by a factor of r.*

**Proof of Lemma 1.** In this protocol, *N* is the number of nodes, *r* is the number of rounds in a major slot, *M* is the number of major slots in the data transmission stage, and *R* is the total number of rounds in the data transmission stage, which is equal to *r* × *M*.

During the layering stage, the BS broadcasts a message in the network. Subsequently, the *N* nodes transmit self\_info\_msg to the BS, and the BS broadcasts a cmd\_msg message in the network. Therefore, *N* + 2 messages are required for this stage. In the cluster initialization stage, the *N* nodes broadcast node\_msg, which includes information on node energy and location. Then, *K* head\_msg messages are broadcast by the CHs, and *N* − *K* join\_msg messages are transmitted to the CHs by the CMs. Next, *K* TDMA\_msg messages are broadcast by the CHs. Moreover, 2*K* control messages are required for constructing the data transmission path to the BS. Therefore, *N* + *K* + (*N* − *K*) + *K* + 2*K* = 2*N* + 3*K* messages are required for clustering. In total, 3*N* + 3*K* + 2 messages are required during the above-mentioned stages. Therefore, in the worst-case scenario, the control message complexity is *O*(*N*).

In hybrid operation, it is not necessary to perform clustering in each round; data transmission is performed in *M* major slots, each of which consists of *r* rounds. For CH rotation and route adjustment, at most (*N* − *K*) + *K* and *K* + 2*K* messages are required, respectively. Therefore, in the data transmission stage, (*M* − 1) × ((*N* − *K*) + *K* + *K* + 2*K*) = (*M* − 1) × (*N* + 3*K*) messages are needed.
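The message counts derived so far can be checked numerically with a short Python sketch. The function names `setup_messages` and `data_stage_messages` are hypothetical, and the values *N* = 100, *K* = 5, and *M* = 10 in the usage lines are arbitrary illustrative inputs, not figures from the evaluation.

```python
def setup_messages(n, k):
    """Control messages in the layering and cluster initialization stages."""
    layering = n + 2  # BS broadcast + N self_info_msg + cmd_msg broadcast
    clustering = (
        n          # node_msg broadcast by every node
        + k        # head_msg broadcast by the K CHs
        + (n - k)  # join_msg sent by the cluster members
        + k        # TDMA_msg broadcast by the K CHs
        + 2 * k    # route construction control messages
    )
    return layering + clustering  # equals 3N + 3K + 2


def data_stage_messages(n, k, m):
    """Upper bound on control messages in the data transmission stage."""
    rotation = (n - k) + k    # CH rotation
    adjustment = k + 2 * k    # route adjustment
    return (m - 1) * (rotation + adjustment)  # equals (M - 1)(N + 3K)


if __name__ == "__main__":
    n, k, m = 100, 5, 10  # illustrative values only
    print(setup_messages(n, k))          # 3*100 + 3*5 + 2 = 317
    print(data_stage_messages(n, k, m))  # 9 * (100 + 15) = 1035
```

Both counts grow linearly in *N* for fixed *K* and *M*, which matches the *O*(*N*) bound stated above.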
