#### *8.1. Characterization of Canon Clustering-Based Routing Protocols and Deduction of MOO Metrics*

A systematic survey (see Table 9) and characterization of LEACH-based routing protocols were conducted in terms of the clustering process, CH features, and cluster features, as indicated in Figure 20, in order to derive the core MOO metrics for the proposed CA-IoT network framework. These three aspects determine the performance of the resulting architecture and the quality of its sampled data.

**Figure 20.** Characterization of cluster-based networks and deduced taxonomy of MOO metrics for optimizing Agri-IoT networks.


As depicted in Figure 20a, the cluster features define the underlying connectivity issues, such as cluster quality indices (i.e., cluster count and cluster size) and intra-cluster and inter-cluster communication types (i.e., single-hop, multihop, or both) [23,24]. From the network design viewpoint, the cluster quality depends on the optimality of the CH count and cluster sizes, which in turn rely on the core design parameters, such as the spatial density and uniformity of the deployed nodes, the specification of the wireless communication standard, the routing architecture, and the size of the network [47]. Since the deployment of SNs in a typical Agri-IoT can be controlled, the stipulated cluster quality properties can be optimized to resolve the connectivity issues in Figure 20b. In a randomly deployed field, these cluster quality parameters can be optimized using pairing-based SN duty-scheduling mechanisms [9,12].

Secondly, the CHs can be static, mobile, or role-rotated in both homogeneous and heterogeneous networks [9,12], based on the SNs' resource hierarchy. Additionally, the CHs can be assigned different tasks, such as data aggregation, FM, coordinating network reconfiguration or duty cycling, and network maintenance, depending on their resource capacities and the network requirements. This case study is based on static SNs and the distributed network construction approach (see [9,12,33,126–132]), where the SNs locally manage the entire clustering process, and a CH is elected without information about the entire network.

As shown in Figure 20a, the clustering process can be characterized by the clustering method/network type (i.e., centralized or distributed), the CH selection method, reclustering or network adaptability to topological or scalability conditions, and the complexities (i.e., control message and computational complexities) of the entire network operation cycle. Unlike the static approach with a fixed CH, the adaptive clustering approach selects the CH based on the current network conditions and rotates this role. However, both approaches can incorporate self-reclustering techniques to self-heal SN-out-of-service faults. Data outlier faults are best detected and corrected, with the least complexity, using threshold-based decision theory or spatial correlation methods. Due to the large scale and high deployment densities of WSN-based Agri-IoT, the distributed clustering process is more suitable for enhancing local FM, scalability, network management, and power optimization than the centralized approaches [37,47].

Generally, the CA-IoT network can be optimized by formulating the deduced optimal decision metrics in Figure 20a into a MOO framework and multihop routing model in order to provide the guidelines for the design of the WSN sublayer of Agri-IoT. From the comparative evaluations of Figures 10 and 20a, a taxonomy of MOO metrics for designing an efficient WSN-based CA-IoT network is proposed in Figure 20b. To enhance the clarity of the state of the art on cluster-based protocols and justify the need for the proposed MOO metrics, a comparative summary based on the characterization parameters is presented in Table 9.

#### *8.2. CH Election Techniques*

The CH selection process is critical to the performance of the resulting network. Apart from centralized networks and the computationally expensive fuzzy-based clustering approaches [133,134], the efficiencies of all LEACH-inherited protocols depend mainly on their CH selection techniques [47,49]. Therefore, the correct estimation of the cluster quality metrics (i.e., CH count and cluster size) is pivotal in attaining the objectives in Figure 10. With the aid of the nodes' residual energy and location metrics, the optimal CH count and cluster size can be preset before network deployment. Currently, these metrics are randomly selected using a probabilistic approach in LEACH-inherited protocols [9,21] or derived using a deterministic or an attribute-based method [47,135]. Probabilistic clustering, as in the LEACH-inherited protocols, is expected to perform better in terms of network lifespan, minimal clustering overhead, improved connectivity, network/coverage stability, low latency, collision-free routing, load balancing, high network stability span, and algorithmic simplicity, provided that the optimal CH count is predefined correctly [136]. However, the CH count is randomly predefined in these protocols [9,21], which undermines the CHs' stability and the resulting architecture's optimality. This challenge can be addressed via common CH selection metrics, including Euclidean distance, intra-cluster/inter-cluster communication costs, energy-harvesting capacities, and probabilistic factors. To date, the related attempts in [49,126,137–139] have relied only on the SNs' residual energy and location information to re-elect CHs after the initial CH count is defined, which is not ideal for WSN-based Agri-IoT.

For instance, an active SN in a particular round decides whether or not to become a CH by choosing a random number (*rn*) ranging between 0 and 1 and comparing the number with a specified threshold *Th*. A node, therefore, becomes a CH for that round if *rn* < *Th*, where *Th* is expressed as:

$$Th = \begin{cases} \dfrac{p_d}{1 - p_d \times \left(r \bmod \frac{1}{p_d}\right)}, & \text{if } n \in G \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

where $p_d$ is the desired percentage of CHs or CH count per round, $r$ is the current round number, $n$ is the node under consideration, and $G$ is the set of SNs that have not been a CH in the previous $1/p_d$ rounds.
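The election rule around Equation (1) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the `last_ch_round` bookkeeping dictionary, and the assumption that $1/p_d$ is a whole number of rounds are all introduced here for illustration and do not come from the surveyed protocols.

```python
import random


def leach_threshold(p_d, r, eligible):
    """Return Th from Equation (1) for one node in round r."""
    if not eligible:                 # node is not in G
        return 0.0
    epoch = round(1 / p_d)           # assumption: 1/p_d is a whole number of rounds
    return p_d / (1 - p_d * (r % epoch))


def elect_cluster_heads(node_ids, p_d, r, last_ch_round, rng=random):
    """Run one election round.

    last_ch_round is a hypothetical bookkeeping dict mapping a node id to
    the last round in which that node served as CH.
    """
    epoch = round(1 / p_d)
    heads = []
    for n in node_ids:
        last = last_ch_round.get(n)
        # Membership in G: not a CH during the previous 1/p_d rounds.
        eligible = last is None or r - last >= epoch
        rn = rng.random()            # random number in [0, 1)
        if rn < leach_threshold(p_d, r, eligible):
            heads.append(n)
            last_ch_round[n] = r
    return heads


if __name__ == "__main__":
    rng = random.Random(42)          # fixed seed only for repeatability
    history = {}
    for r in range(10):
        chs = elect_cluster_heads(range(100), 0.1, r, history, rng)
        print(f"round {r}: {len(chs)} CHs")
```

Because the threshold grows toward 1 as the epoch progresses, every node that has not yet served is elected by the last round of the epoch. The number of CHs in any single round, however, is random, which is the instability discussed above.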

The authors in [119] proposed a three-layered LEACH (TL-LEACH) that operates in three functional phases—CH election, MN recruitment, and data transfer—to enhance the energy efficiency of LEACH. Their first-level CH election approach modified Equation (1) into an enhanced threshold *T*(*i*), which is expressed as:

$$T(i) = \begin{cases} (r+1) \times \operatorname{mod}\left(\frac{1}{p} \times p\right), & \text{if } i \in G \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where *p* is the CH count, *r* is the current round number, and *G* is the set of SNs that have not been a CH in the previous 1/*p* rounds. The second-level CHs are selected from the first-level CHs based on the shortest distance to the BS, and they function as aggregated packet forwarders or relay CHs (RCHs).

The authors in [120] introduced energy (*E*(*i*)) and distance (*D*(*i*)) attributes into Equation (1) to improve the load-balancing merit of LEACH. The resulting *Th* is expressed as:

$$Th = \begin{cases} \dfrac{p_d}{1 - p_d \left[r \bmod \frac{1}{p_d}\right]} \times \left[E(i) + (1 - D(i))\right], & \text{if } n \in G \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

Multihop routing via relay CHs (RCHs) was recommended for distant CHs in the future scope of [120].
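The effect of the energy and distance terms in Equation (3) can be illustrated with a short Python sketch. It assumes that *E*(*i*) and *D*(*i*) are normalized to the range [0, 1] (the text above does not state the normalization), and the function name and arguments are hypothetical.

```python
def weighted_threshold(p_d, r, energy, distance, eligible):
    """Return Th from Equation (3) for one node in round r.

    energy   -- E(i), assumed normalized to [0, 1] (1 = full battery)
    distance -- D(i), assumed normalized to [0, 1] (1 = farthest node)
    eligible -- True if the node belongs to G
    """
    if not eligible:
        return 0.0
    epoch = round(1 / p_d)           # assumption: 1/p_d is a whole number of rounds
    base = p_d / (1 - p_d * (r % epoch))   # same term as in Equation (1)
    return base * (energy + (1 - distance))


if __name__ == "__main__":
    # Two eligible nodes in the same round: one energy-rich and near,
    # the other depleted and far away.
    near_rich = weighted_threshold(0.1, 0, energy=0.9, distance=0.2, eligible=True)
    far_poor = weighted_threshold(0.1, 0, energy=0.3, distance=0.8, eligible=True)
    print(near_rich, far_poor)
```

Under this normalization, the weighting factor lies between 0 and 2, so a node with high residual energy and a short distance has a larger threshold than in Equation (1), while a depleted, distant node has a smaller one.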

In the LEACH with a distance-based threshold (LEACH-DT) algorithm presented in [121], the probability of becoming a CH depends on the relative distance between a node and the BS. This algorithm differs from the LEACH algorithm because the desired percentage of CHs ($p_i$) is predefined per node using Equation (5), while the threshold $T(i,r)$ is expressed as:

$$T(i,r) = \begin{cases} \dfrac{p_i}{1 - p_i \times \left[r \bmod \frac{1}{p_i}\right]}, & \text{if } G_i(r) = 0 \\ 0, & \text{if } G_i(r) = 1 \end{cases} \tag{4}$$

Note that the terms retain their usual definitions, namely:

$$p_i = k \frac{\zeta_i}{\sum_{j=1}^{N} \zeta_j}, \quad 0 \le p_i \le 1, \tag{5}$$

where

$$\zeta_i = 1/\overline{E_{CH}} \times d_i - \overline{E_{non-CH}} \tag{6}$$

The variable $d_i$ denotes the distance between node *i* and the BS, and $\overline{E_{CH}}$ and $\overline{E_{non-CH}}$ are the average residual energies in CHs and non-CHs, respectively. The authors further established the need for a multihop routing approach in simulations and real-world WSNs to validate the numerous theoretical propositions and benefits.
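The normalization in Equation (5) can be sketched as follows. The sketch takes the per-node weights $\zeta_i$ as given inputs rather than computing them from Equation (6), treats *k* as the desired number of CHs per round (an assumption, since *k* is not defined above), and clips each result to 1 so that the stated constraint on $p_i$ holds. The clipping and the function name are illustrative choices, not part of [121].

```python
def ch_probabilities(zeta, k):
    """Map per-node weights zeta_i to CH probabilities p_i as in Equation (5).

    zeta -- list of positive weights, one per node (taken as given here)
    k    -- assumed to be the desired number of CHs per round
    """
    total = sum(zeta)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Clipping to 1 is an illustrative way to enforce 0 <= p_i <= 1.
    return [min(1.0, k * z / total) for z in zeta]


if __name__ == "__main__":
    weights = [1.0, 2.0, 1.0, 4.0]   # hypothetical zeta values
    probs = ch_probabilities(weights, k=2)
    print(probs, sum(probs))
```

When no value is clipped, the probabilities sum to *k*, so the expected CH count per round matches the target while each node's share follows its weight.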

In the decentralized energy-efficient hierarchical cluster-based routing algorithm (DHCR) [123], SNs compete to become CHs. First, the BS broadcasts a trigger message over a specific range. The receiving nodes then compete to become a CH by disseminating a new message containing their residual energies and distances from the BS. A neighboring node *i* within the target range receives this message and uses the information to calculate its $CHS_{nfun_i}$ as:

$$CHS_{nfun_i} = a \times \frac{E_{re_i}}{E_{max}} + b \times \frac{1}{\text{Dis-To-BS}_i}, \tag{7}$$

where $E_{re_i}$ and $E_{max}$ are the residual and initial energy levels of node *i*, respectively; $\text{Dis-To-BS}_i$ is the distance between node *i* and the BS; and *a* and *b* are real random values between 0 and 1 such that *a* + *b* = 1. The values of $\text{Dis-To-BS}_i$ of node *i* and its neighbors are compared, and the node with the highest $\text{Dis-To-BS}_i$ value is selected as the CH. A first-level CH broadcasts its residual energy, neighboring node count, and distance from the BS via a specific route. The next-level CHs receive this information and repeat the procedure so that every node determines a redistributor CH toward the BS at the same time. A redistributor CH has more energy and fewer neighbors (a lower neighboring degree).
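As a rough illustration of Equation (7), the following Python sketch evaluates the selection function for a few nodes. The function name, the default weight, and the sample values are hypothetical; the sketch only computes the score and does not reproduce the DHCR message exchange or its election step.

```python
def chs_score(e_residual, e_max, dist_to_bs, a=0.5):
    """Evaluate Equation (7) for one node.

    e_residual -- residual energy of the node
    e_max      -- initial energy of the node
    dist_to_bs -- distance between the node and the BS (must be > 0)
    a          -- weight of the energy term; b = 1 - a weights the distance term
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    if dist_to_bs <= 0:
        raise ValueError("distance to the BS must be positive")
    b = 1.0 - a
    return a * (e_residual / e_max) + b * (1.0 / dist_to_bs)


if __name__ == "__main__":
    # Hypothetical nodes: (residual energy, initial energy, distance to BS)
    nodes = {"n1": (1.8, 2.0, 40.0), "n2": (1.0, 2.0, 10.0), "n3": (0.4, 2.0, 5.0)}
    for name, (e_re, e_max, dist) in nodes.items():
        print(name, chs_score(e_re, e_max, dist))
```

Note that the energy term is dimensionless while the distance term depends on the unit of distance, so the relative influence of *a* and *b* changes with the chosen length scale.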

In contrast, the Hamilton energy-efficient routing protocol (HEER) [124] forms clusters of nodes, aggregates their data, and transmits them to the BS via a Hamiltonian path created by all the nodes in the cluster. It controls the cluster size by selecting one node as the CH using the probability function *p*, which can be expressed as:

$$p = \frac{L_{\text{message}}}{F_{\text{max}}} \tag{8}$$

where $L_{\text{message}}$ is the message size of every node, and $F_{\text{max}}$ is the maximum size of a frame. The HEER protocol creates the clusters only once, in the first round, based on LEACH, and it rotates the CH role according to the energy of the nodes on the Hamiltonian path after a determined period.

Similarly, the two-phased EAMR protocol [125] randomly preselects the CH. Each CH also selects its closest CH as its redistributor CH. The clusters are static over the entire network lifetime, and the CH role rotates randomly within the clusters according to a cluster replacement threshold. The new CH inherits the redistributor role if the old CH had one. Overall, since the node location, residual energy, and sleep schedule are indispensable in the CH selection process, the CH selection methods proposed by the authors in [9,12,36,120,140] are recommended for WSN-based Agri-IoT applications.
