1. Introduction
The recognition and detection of underwater mines is an active research field motivated by the need to clear mines, due to their harmful effects on the environment [
1]. An underwater mine is a destructive object that represents a significant threat to human and marine life [
2,
3]. Many systems for detecting underwater mines have been developed to reduce the negative impact of their explosion. However, almost all of the existing methods require sophisticated, expensive equipment to explore the sea and/or human operators to maintain an ideal system. Therefore, a detection system is needed that improves the efficiency of the mine clearance process, with a significant reduction in the operational time, cost, and the system operator’s risk of injury or loss of life, and with high detection accuracy.
Wireless sensor networks (WSN) hold great potential for aquatic environment monitoring, since they can sense, gather, and transmit data without a physical connection [
2,
4]. This potential has led, indirectly, to the development of underwater wireless sensor networks (UWSNs) [
5], which are considered an alternative to manual operations, such as cable interactions and aquatic systems, for implementations such as autonomous underwater vehicles (AUVs) and their management [
1]. These systems provide an attractive solution for the low-cost continuous monitoring of underwater environments [
4,
6,
7,
8]. Underwater acoustic sensor networks (UWASN) can be applied for the detection of underwater mines. Furthermore, these devices incorporate sensors and other components that can send and receive different signals. They can communicate through acoustic waves, which are used to build and deploy UWSN systems in deep underwater settings.
The sensor nodes have strong limitations in their processing ability, embedded battery power, wireless bandwidth, and storage space. The major obstacle that calls into question the feasibility of applications built on these sensors is the energy constraint. Therefore, in order to extend the sensor battery lifetime, a low-complexity scheme for data processing and communication is required [
9,
10]. The clustering approach is one of the practical solutions to managing network energy consumption efficiently [
11]. It also helps to distribute the energy consumption among the nodes in the network. The working mechanism for this approach involves grouping the sensor nodes into the cluster and electing one of these nodes to be the cluster head. The cluster head is responsible for gathering the data from its members and sending them to the base station.
In most cases, the nodes are deployed densely to cover all of the required areas, which allows some of the nodes to enter sleep mode, thereby reducing the energy consumption. The use of a cluster-based architecture helps to share the processing load among the sensors of the cluster, which consequently reduces the per-node energy consumption and contributes to extending the network lifetime. Furthermore, the clustering approach reduces the amount of transmitted information, which increases the network lifetime [
12,
13]. A critical aspect of the proposed approach is represented by the need to perform advanced signal processing at the sensors, which entails significant energy consumption and makes the feature extraction mechanism essential to reduce energy consumption. Furthermore, the energy-aware design of systems solving complex problems requires efficient management of energy consumption without losing performance, which is carried out at a design level by solving the optimization problems involving energy consumption as a metric [
12,
14,
15,
16,
17,
18].
In the UWSN, the transmission process consumes more energy compared with sensing or computation processes. It consumes approximately 80% of the power for each sensor node [
12]. Thus, if we minimize the size of the data, it will reduce the energy consumption of each node.
Compared to terrestrial WSNs, underwater environments are characterized by unique features and face several issues, such as the depth-related impact on temperature, salinity, and pressure, as well as winds and waves. These characteristics significantly affect the high-frequency waves used to collect sea-environment information (e.g., EM waves), which suffer from severe attenuation. Similarly, optical waves need high-precision pointing beams and suffer from scattering.
Underwater signal acquisition methods should have the capability to resist seawater characteristics. For an underwater medium, acoustic waves are less lossy and support long-range signal transmission. Thus, acoustic signals are primarily employed in underwater communication. Sound is a series of pressure perturbations that travel as a wave and exhibit phenomena such as reflection, diffraction, and interference [
15]. Sonar sensors are considered an efficient choice because of their low fabrication cost and low power consumption. Moreover, sonar signals suffer less attenuation compared to other underwater techniques [
16]. Developing a successful underwater mine detection system requires that mines can be distinguished (or classified) from other mine-like objects with great accuracy. Therefore, there is a solid need to extract the relevant information from the sonar data in order to evaluate and understand the signal properly. So-called feature extraction directly affects a system’s classification performance [
19]. If the extracted features are not expressive for a certain problem, then the classification is not satisfactory [
20]. At present, numerous techniques have been proposed for feature extraction, including spectrogram correlation, time-frequency analysis, hidden Markov models, wavelet transformation (WT), and other approaches. The WT of signals has been widely employed for feature extraction. It converts the signals into the time-frequency domain, and the resultant wavelet coefficients can be used for classification [
19]. Compared to the other feature extraction techniques, such as slope vector waveform, Fourier transforms, and chaos methods, WT consumes less energy, as it extracts only the expressive information from the original signal.
In this context, the main contribution of this research paper is to propose a clustered underwater wireless acoustic sensor network (UWASN) for mine detection. This system is designed to be lightweight and to reduce energy consumption, while automating the whole procedure of detecting and monitoring aquatic environments efficiently. The system provides the following characteristics:
Effective, lightweight mine detection using the wavelet-based extracted features of sonar signals.
Precise mine surveillance systems and short mine monitoring.
The rest of this paper is structured as follows:
Section 2 presents a comprehensive study and review of the related works on detecting underwater mines based on sonar signals. It also reviews the studies of clustered UWSNs.
Section 3 presents the UWASN energy consumption model.
Section 4 demonstrates the proposed scheme.
Section 5 provides different sensing methods.
Section 6 covers the experiment setup implementation and the simulation environment.
Section 7 contains an evaluation of the results, and the conclusions with recommendations for future work are presented in
Section 8.
3. UWASN Energy Consumption Model
In recent years, target detection has gained attention from researchers. Many detection mechanisms have been proposed to detect underwater mines with high accuracy using WSNs. However, processing acoustic signals consumes considerable energy, directly affecting the network lifetime. Energy consumption is one of the primary issues in sensor networks, due to the inability to replace or recharge the sensor batteries. Furthermore, each sensor node has significant power constraints, and the amount of energy consumed will impact both the network performance and the lifetime of the sensors.
The network contains many acoustic sensor nodes and sink nodes that are placed and distributed over the area of interest to monitor the surrounding environment by sending acoustic signals. The network is partitioned into clusters, each containing a CH and several member nodes. Instead of sending the raw sensor node data to the CH, each member sensor node is responsible for sensing the surrounding environment, processing the received signals, and extracting the features from them. When an event of interest occurs, the sensor nodes extract the features from the acoustic signal and use them to identify the type of detected object using the classification algorithms. Then, the sensors send the packets to their CH. After that, the CH applies the classification process to classify the detected object. Once the CH detects a mine, it transmits the target’s detection information to the BS. Since each sensor extracts features from the signal, and the CH applies a classification process to classify the detected object, the size of the transmitted packet sent to the BS is reduced, leading to decreased energy consumption for communication within the network. The communication process consumes more energy compared to signal processing. When we reduce the size of the transmitted packets, we preserve energy, which leads to an increased network lifetime.
Furthermore, it is important to use a classification algorithm that can accurately classify the detected object using the extracted features and with a high accuracy rate. This helps to reduce the memory space overhead. Since the energy consumption of the sensor nodes is directly affected by the computational complexity of the adopted algorithms, the energy consumption increases during data processing. Therefore, it is essential to use low-complexity algorithms that can accurately classify the objects using fewer instructions.
The UWASN energy model is based on the acoustic energy dissipation model used in [
14]. Acoustic waves are generated when a mechanical disturbance compresses and dilates the medium, and their propagation depends on the elasticity of the medium [
6].
where
TL is the transmission loss and
SL is the source level. The purpose of
SL is to quantify the amount of sound radiated by a sound source. It refers to the intensity of the radiated sound at a distance of 1 m from the source, where the intensity is the amount of sound power transmitted through a unit area in a particular direction. The source level is a relative intensity expressed in decibels (dB). In underwater acoustics, decibels are measured relative to a reference pressure of 1 micropascal (µPa). All of the parameters in Equation (1) are in dB re 1 µPa, and the intensity corresponding to 1 µPa is 0.67 × 10^−22 W/cm². Different spreading geometries have different transmission losses. For cylindrical spreading, the transmission loss is as follows:
The required threshold value of
, indicated by
, must be larger than that of
to obtain a better reception. However,
is a monotonically decreasing function of frequency
f. For convenience, we consider
to be
(
f), hereafter. The essential transmitter power
to obtain intensity
at a distance of 1 m is as follows:
where
is defined in terms of
SL as follows:
Finally,
is represented as follows:
where
and
H is the water depth in meters.
With the transmission of l bits over distance d, the dissipated transmission energy can be expressed as follows:
and the receiver radio energy consumption can be expressed as follows:
where
and
are the energy consumed by the transmitter and receiver to process the l bits of data, respectively, and
is the bit duration in seconds.
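As a rough illustration of how an energy model of this kind can be evaluated, the sketch below assumes Thorp's empirical formula for the absorption coefficient, cylindrical spreading, and a passive-sonar-style link budget (SL = SNR + TL + NL - DI). These forms, the function names, and all parameter values are assumptions made for illustration; they are not claimed to be the exact equations of [14].

```python
import math

def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical absorption coefficient a(f) in dB/km, f in kHz (assumed form)."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003

def cylindrical_tl_db(d_m, f_khz):
    """Transmission loss for cylindrical spreading plus absorption over d_m meters."""
    return 10 * math.log10(d_m) + thorp_absorption_db_per_km(f_khz) * d_m * 1e-3

def transmit_power_w(sl_db, depth_m):
    """Power needed to reach source level sl_db at 1 m in water of depth H = depth_m."""
    # 0.67e-18 W/m^2 is the intensity of a 1 uPa wave (equal to 0.67e-22 W/cm^2).
    intensity = 10 ** (sl_db / 10) * 0.67e-18
    return 2 * math.pi * 1.0 * depth_m * intensity

def tx_energy_j(l_bits, d_m, f_khz, depth_m, snr_db, nl_db, di_db, e_elec, t_b):
    """Energy to send l_bits over d_m: electronics cost plus acoustic power per bit."""
    sl_db = snr_db + cylindrical_tl_db(d_m, f_khz) + nl_db - di_db  # assumed link budget
    return l_bits * e_elec + transmit_power_w(sl_db, depth_m) * t_b * l_bits

def rx_energy_j(l_bits, e_elec):
    """Energy to receive l_bits, counting only the receiver electronics."""
    return l_bits * e_elec

# Hypothetical values: 500-byte packet, 100 m link, 10 kHz carrier, 50 m depth.
packet_bits = 500 * 8
print(tx_energy_j(packet_bits, 100.0, 10.0, 50.0, 20.0, 50.0, 0.0, 50e-9, 1e-4))
```

Under these assumptions the transmission energy grows linearly with the number of bits, which is why shrinking the packet (by sending features instead of raw signals) directly reduces the per-node energy cost.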
4. Proposed Work
The proposed approach consists of two phases. In Phase 1, we start at the edge to determine the required extracted features. Then, we deploy these features to the sensors. In Phase 2, the sensors sense the surrounding environment in order to detect mines. When an object is detected, they extract the features and send packets to their CH. The CH performs the classification and sends the notification packets to the BS if the detected object is a mine. Finally, the CH assigns weights to all of the features received from its cluster members based on the signal strength. After a certain period, the edge can send an improved list of required extracted features to the sensors.
The main contribution of this paper is the design of a scheme that focuses on decreasing the complexity of the object detection process in the UWASN and enhancing the classification process. In order to achieve this goal, an appropriate algorithm for feature extraction and classification must be applied [
38]. The following two significant factors are necessary to accomplish this task:
Network resource constraints, such as a small memory size, limited power supplies, and low communication bandwidth, are the main characteristics of WSNs. Therefore, it is essential to use a model that can make recognition decisions with a minimum amount of data.
The computational complexity is another critical factor. Performing feature extraction and classification at the sensor level increases the energy consumption. However, the amount of transmitted data is decreased [
38].
The goal of this research is to develop an efficient method for detecting and disarming underwater mines and related substances in various marine environments by using underwater acoustic sensors with wavelet transform (WT). This system will provide accurate and reliable information on the location of underwater mines and related substances. The use of underwater wireless sensor networks (UWSNs) will enable the gathering of the necessary information under different circumstances, facilitating the identification and clearance of underwater minefields. This approach will help to protect marine life and related activities from unexpected blasts, which destroy aquatic life and degrade the marine environment. The proposed work involves several stages and models working together to provide accurate information on the location of underwater mines. The first step is to extract the features from the sonar signal to distinguish mines from other mine-like objects underwater, and the second step is to deliver a notification when a mine is detected.
4.1. Network Model
There are three main types of transmission mediums for underwater communication: radio wave communication (electromagnetic), optical communication (light), and acoustic communication (sound). However, radio and optical communication are inefficient underwater, due to poor performance, leaving acoustic communication as the primary option, due to its low attenuation in water [
39].
Acoustic communication uses sound signals for communication between the sensors in the network. These signals can travel over longer distances than electromagnetic and optical waves, making them ideal for underwater communication. Underwater wireless acoustic sensor networks (UWASNs) consist of numerous sensors that use acoustic signals to communicate for various underwater applications, such as risk monitoring. Each sensor sends an acoustic wave and receives its reflection from nearby objects. The received acoustic signal is processed, and the extracted features are transmitted to the cluster head (CH), which processes them along with any notifications coming from the other sensors in the same cluster. The CH is in charge of using these features to detect mines with an ML-based classifier. The underwater sensor nodes communicate with their respective CHs using acoustic signals to transmit packets [
40]. The density of the nodes impacts the detection accuracy of mines. Specifically, if we increase the node density, the distance between the mines and the acoustic sensors decreases, which provides better received signal power. Furthermore, a higher number of sensors will report the detection of a mine to the cluster head, which gives the cluster head more data with which to accurately classify the detected object.
Acoustic sensor nodes are placed underwater and organized into multiple clusters. One CH in each cluster collects the sensor node packets and transfers them to the BS over acoustic signals. The BS is located near the shore, close to the water. It obtains the packets from the CHs submerged underwater and then forwards them to the on-land controller using radio frequency (RF) communication.
Figure 2 illustrates the network model used in this work.
4.2. Detection Scheme at the Sensor Node
The acoustic sensor node continuously senses the surrounding environment by sending acoustic signals underwater and receiving the signals reflected from underwater objects. When the sensors receive signals, they process them using WT to extract their features. Then, they transmit these features in packets to their CH, which completes the classification process to determine the type of detected object. After that, if the detected object is a mine, the CH sends a notification packet to the BS. The detection scheme is presented in
Figure 3.
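The message flow of this scheme can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the class names, function names, and the "M"/"R" label convention are hypothetical, and the feature extractor and classifier are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

@dataclass
class FeaturePacket:
    """Packet a member sensor sends to its CH: features only, not the raw echo."""
    sensor_id: int
    features: List[float]

@dataclass
class Notification:
    """Packet the CH sends to the BS when at least one report is classified as a mine."""
    cluster_id: int
    sensor_ids: List[int]

def sensor_step(sensor_id: int, echo: Sequence[float],
                extract: Callable[[Sequence[float]], Sequence[float]]) -> FeaturePacket:
    # The sensor transforms the received echo and keeps only the extracted features.
    return FeaturePacket(sensor_id, list(extract(echo)))

def cluster_head_step(cluster_id: int, packets: Sequence[FeaturePacket],
                      classify: Callable[[Sequence[float]], str]) -> Optional[Notification]:
    # The CH classifies each report and notifies the BS only when a mine is found.
    reporters = [p.sensor_id for p in packets if classify(p.features) == "M"]
    return Notification(cluster_id, reporters) if reporters else None
```

In this sketch nothing is sent to the BS when every report is classified as a non-mine object, which reflects the intent of keeping the CH-to-BS traffic small.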
4.3. Feature Extraction Using Wavelet Transformation
WT is a time- and frequency-domain method that is used to extract significant features and reveal hidden information from the original signal. It analyzes the signal at different levels and resolutions, thereby extracting more relevant information [41]. It is widely used to detect, inter alia, heart rates and specific objects. It can be applied as either a continuous wavelet transformation (CWT) or a discrete wavelet transformation (DWT). CWT is defined as follows:
$$\mathrm{CWT}(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt$$
where f(t) is a signal, ψ(t) is a mother wavelet, a is the dilation, and b is the translation. CWT is not commonly used for predictions, because it is computationally difficult and time consuming [42], and it creates redundant coefficients that lead to a substantial volume of computation [43]. DWT is defined as follows:
$$\mathrm{DWT}(c, d) = \frac{1}{\sqrt{2^{c}}} \sum_{t} f(t)\, \psi\!\left(\frac{t - d\,2^{c}}{2^{c}}\right)$$
where c is the scale and d is the translation variation. DWT is often used because it requires less computation time and is simpler to apply. Furthermore, DWT is more suitable for time-critical applications or situations where the power supply is limited [44]. For these reasons, DWT is an effective choice for sonar signal processing. Different wavelet families implement DWT, each with unique features. These families include Haar, Morlet, complex Morlet, Meyer, Daubechies, Coiflets, and Shannon–Kotelnikov. The Haar wavelet family is preferred over the other families because it is simple and sufficiently resolves various problems [43]. In addition, Haar has a high computation speed and is memory-efficient, as it does not require extra memory for its calculations [45]. Since the principal aims of this work are to deal with limited computational capability and reduce the energy use, the Haar wavelet was chosen as the function to transform the sonar signals and extract the features. Furthermore, implementing the Haar wavelet increases the ability of the classifier to distinguish mines from other objects, because it extracts only the crucial features from the signals.
Discrete wavelet transformation (DWT) is a widely used technique for feature extraction, due to its efficiency in this area. Previous studies have demonstrated that using this technique can produce good results. The primary goal of this feature extraction method is to reduce the dimensionality by removing the irrelevant features and selecting the optimal group of attributes from the original data [
46]. Furthermore, this feature extraction method can reduce the time needed for training and processing data and improve the accuracy by using DWT to remove the redundant features and clean the data.
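To make the dimensionality reduction concrete, the following is a minimal NumPy sketch of a multi-level Haar DWT. The function names are hypothetical, and padding odd-length inputs by repeating the last sample is an assumption of this sketch rather than a detail taken from this work. Each level halves the signal into pairwise averages (approximation) and pairwise differences (detail), both scaled by 1/√2.

```python
import numpy as np

def haar_step(x):
    """One Haar DWT level: returns (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:                      # assumption: pad odd lengths with the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_features(signal, levels=3):
    """Apply `levels` Haar steps; returns the final approximation and all details."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# A 60-value pattern shrinks to 8 approximation coefficients after 3 levels
# (60 -> 30 -> 15 -> padded to 16 -> 8).
approx, details = haar_features(np.linspace(0.0, 1.0, 60), levels=3)
print(approx.shape)
```

Only additions, subtractions, and one scaling per pair are needed, which is the reason the Haar wavelet is attractive for nodes with limited computation and memory.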
4.4. Classification and Mine Detection
After the transformation of the signal, a high level of accuracy is required for the classification. Misclassified mines could lead to explosions, rendering the detection system useless. In order to enable the classification at the sensor level, a classifier must possess the following characteristics:
First, the classifier must have a high accuracy in classifying mines and related substances, with a low misclassification rate. Second, it should have a low computational complexity, as the energy consumption of the sensor node is directly affected by the computational complexity of the classifier. Third, it should have a small memory footprint, as the sensor node has limited memory resources. Finally, it should be able to handle the non-stationary nature of the underwater environment, as the acoustic signal characteristics can change over time.
The Naïve Bayes classifier is a probabilistic classifier based on Bayes' theorem, which assumes that each attribute X_i is conditionally independent of the other attributes given the output class Y. This is called the conditional independence assumption [47]. Considering that X contains n attributes, its representation is given by [48], as follows:
$$P(X_1, \ldots, X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y)$$
A supervised classifier is used to classify the data in order to make predictions about the outcomes [
49]. Compared to the other classifiers, Naïve Bayes is efficient because it uses a simple calculation, requires less computational complexity and memory, and has high accuracy [
50]. Given these advantages, we selected the Naïve Bayes classifier to discriminate the mines from the other objects.
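The low memory and computation cost can be seen in a small sketch. The version below is a Gaussian Naïve Bayes written with NumPy; modeling each numeric attribute with a per-class normal distribution is an assumption of this sketch (the variant used in this work is not specified here), and the class name is hypothetical. Training stores only one prior, one mean vector, and one variance vector per class.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: per-class priors, means, and variances."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.log_prior_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small constant avoids division by zero for constant attributes.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.mean_[None, :, :]          # (samples, classes, attrs)
        # Sum of per-attribute log-likelihoods: the conditional independence assumption.
        log_lik = -0.5 * (np.log(2 * np.pi * self.var_)[None] + diff ** 2 / self.var_[None])
        scores = self.log_prior_ + log_lik.sum(axis=2)          # (samples, classes)
        return self.classes_[np.argmax(scores, axis=1)]

# Toy usage with synthetic, well-separated data (not the sonar dataset).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (50, 3)), rng.normal(1.0, 0.1, (50, 3))])
y = np.array(["R"] * 50 + ["M"] * 50)
model = GaussianNaiveBayes().fit(X, y)
print(model.predict([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]))
```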
The clustered protocol proposed in this work is based on the protocol presented in [
14]. The network includes multiple sensors that use a transmission medium to perform distributed sensing. The main idea here was to compare the performance of the cluster protocol using the following three different transmission media: acoustic, free-space optics (FSO), and electromagnetic (EM). The findings of the protocol were as follows [
14]:
In a Gaussian-distributed underwater sensor network (UWSN), acoustic waves outperform free-space optics (FSO) and electromagnetic (EM) communication techniques in terms of the optimal number of clusters.
Therefore, for any underwater application using a clustering topology, acoustic communication requires less energy.
However, acoustic underwater communication is limited by the bandwidth, and the behavior of optimal clustering is not uniform across the bandwidth. The best number of clusters can be achieved at the lower bound of the bandwidth.
Acoustic waves are, thus, the least lossy underwater, as they support long-range signal transmission. Moreover, they are mainly employed in underwater communication. Acoustic communication is bringing back this once-defunct underwater communication mechanism.
In their work, the distribution of the sensors uses a mathematical formula to achieve an optimal number of clusters. The aim is to overcome the issue of using too much energy because of the increased overall communication overhead if the distribution is based on having more clusters while distributing the sensors equally on each cluster [
14]. However, fewer clusters use more energy to transmit the data from the CH to the BS. The ideal number of clusters by means of the acoustic waves is defined by the following formula [
14]:
where
N is the number of nodes,
a(
f) is the absorption coefficient,
is the bit duration,
Z is a constant,
H represents the sea depth,
M is the length,
is the energy dissipation in the electrical circuit, and
d is the distance between the transmitter and the receiver in meters.
The optimal number of clusters depends on the dimensions of the sensing field (M), the number of sensor nodes (N), the distance between the nodes and the BS (), and the energy consumption of the transmitter electronics (). Consequently, the optimal number of clusters is independent of the energy consumption of the transmitter electronics.
Using the optimal number of clusters as a guide, the sensor nodes can self-organize into clusters using distance-based segmentation to group themselves in a decentralized manner. This method outperforms the low-energy adaptive clustering hierarchy (LEACH) protocol in resolving energy imbalances. These imbalances usually occur in LEACH because it does not consider the sensing coverage and the distance from the base station (BS) when selecting the cluster heads (CHs). The CHs are selected by using the distribution formula, which is calculated in each node. The self-elected nodes then announce their decision to the other sensor nodes in the network, and the other nodes organize themselves into clusters by joining the most suitable CH among the self-elected nodes. The CHs then use time-division multiple-access (TDMA) methods to send packets from the sensor nodes to the BS.
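The cluster formation step can be sketched as follows, assuming the CHs have already elected themselves. In this illustration "most suitable" is taken to mean the nearest CH by Euclidean distance, and the TDMA schedule is a simple fixed-length slot per member; both choices, along with the function names, are assumptions made here and not details confirmed by [14].

```python
import math

def assign_to_nearest_ch(nodes, cluster_heads):
    """Group member nodes under their nearest CH.

    nodes and cluster_heads map an id to an (x, y, z) position in meters.
    """
    clusters = {ch_id: [] for ch_id in cluster_heads}
    for node_id, pos in nodes.items():
        if node_id in cluster_heads:
            continue                      # a CH does not join another cluster
        nearest = min(cluster_heads, key=lambda ch: math.dist(pos, cluster_heads[ch]))
        clusters[nearest].append(node_id)
    return clusters

def tdma_schedule(members, slot_s):
    """Give each member its own transmit slot; returns slot start times in seconds."""
    return {node_id: i * slot_s for i, node_id in enumerate(sorted(members))}

# Hypothetical layout: two CHs and three member nodes in a 50 m x 50 m x 50 m volume.
chs = {"A": (0.0, 0.0, 10.0), "B": (40.0, 40.0, 10.0)}
nodes = {**chs, "n1": (5.0, 5.0, 12.0), "n2": (35.0, 38.0, 9.0), "n3": (1.0, 2.0, 30.0)}
clusters = assign_to_nearest_ch(nodes, chs)
print(clusters, tdma_schedule(clusters["A"], slot_s=0.5))
```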
5. Sensing Methods
The sensors used in the UWSNs can operate in different modes and methods depending on the circumstances and environment deep underwater. In underwater acoustic sensing systems, the sensors operate in the following three different modes: 2D, 3D, and hybrid [
51]. These modes function differently to detect and provide data on substances underwater.
In a two-dimensional (2D) environment, static sensor nodes are typically installed in submerged positions on the seabed. These nodes connect with a sink node for data transfer via multi-hop communication across multiple clusters [
51].
In a three-dimensional (3D) design using inflated buoys as supports, sensors are deployed at different depths by modifying the length of the cable that connects to the anchor on the sea bottom [
51]. In the mobile architecture, the sensor nodes have the freedom to move around, allowing for the dynamic reconfiguration of the network topology. The mobile nodes require two transceivers for proper functioning. To enhance the network capabilities and gather data, remotely operated underwater vehicles (ROVs), autonomous underwater vehicles (AUVs), or sea gliders can be used. A hybrid design is a third type of architecture, which mixes static and mobile sensor nodes to fulfill specific functions [
42]. Mobile nodes can operate as routers or controllers in a hybrid design to connect with static or standard sensors in a distributed system for data sensing [
51].
Past research has shown that acoustic communication is suitable for underwater communication. Acoustic signals can travel long distances underwater, and the communication range between the nodes is large, allowing for sparsely dispersed underwater acoustic sensors. This kind of deployment is suitable for many applications, such as pollution and habitat monitoring, where the loss of some data is acceptable to a certain extent [
52]. However, for critical applications that involve critical data, such as intruder detection and mine detection, a dense deployment of nodes is required. In such cases, losing even the slightest amount of data is not acceptable [
53].
6. Implementation and Simulation Environment
The dataset used in this work was obtained from the UCI Machine Learning Repository and is called the connectionist bench sonar dataset (mines vs. rocks) [
54]. The task is to train a network to discriminate between sonar signals reflected off a metal cylinder and those reflected off a roughly cylindrical rock. This dataset contains two files. The first file is “sonar.mines,” which consists of 111 patterns acquired by bouncing sonar signals off a metal cylinder at various angles and under different conditions, labeled “M”. The second file is “sonar.rocks,” with 97 patterns of signals that bounced off rocks under similar conditions, labeled “R”.
The transmitted sonar signal is a frequency-modulated acoustic chirp whose frequency increases over time (in general, a chirp is a signal whose frequency increases or decreases over time). The signals are transmitted at various angles, covering 180 degrees for the rocks and 90 degrees for the metal cylinder. Each pattern contains 60 decimal numbers, with values between 0 and 1 representing the attributes or features of the bounced signal. Each attribute represents the amount of energy within a particular frequency band.
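A pattern file of this kind can be parsed and split with a few lines of Python. The sketch below assumes a comma-separated layout with 60 numeric attributes followed by the class label on each line, and a random 70/30 split of the indices; the function names and the fixed seed are hypothetical, and no stratification is applied.

```python
import numpy as np

def parse_sonar(lines):
    """Parse lines of '60 comma-separated floats, label' into (X, y) arrays."""
    X, y = [], []
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 61:              # skip blank or malformed lines
            continue
        X.append([float(v) for v in parts[:60]])
        y.append(parts[60])
    return np.array(X), np.array(y)

def split_70_30(n_samples, seed=0):
    """Return shuffled (train, test) index arrays holding 70% and 30% of the samples."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    cut = int(round(0.7 * n_samples))
    return idx[:cut], idx[cut:]

# Synthetic stand-in rows; real rows would be read from the downloaded data file.
row = ",".join(["0.5"] * 60)
X, y = parse_sonar([row + ",M\n", row + ",R\n"])
train_idx, test_idx = split_70_30(111 + 97)
print(X.shape, len(train_idx), len(test_idx))
```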
For the evaluation process, we have used 70% of the data for training and 30% of the data for testing. We used the Python programming language and NS-3 simulation environments to measure the different metrics of the work. We employed the following metrics to evaluate the proposed work’s mine detection scheme and network performance (
Table 1 lists the simulation environment parameters). We selected a square area of 50 m × 50 m, with a depth of 50 m. In this area, we deployed a variable number of nodes that ranged from 1 to 100, with a base station located in the center. The packet size was 500 bytes for the exchange of data between the different nodes.
where
TP and
TN represent the number of true positive and true negative predictions, respectively, and
FP and
FN represent the number of false positive and false negative predictions, respectively.
Throughput: The rate at which the packets are successfully transmitted between the sources and the destinations in the network, measured in packets per second.
Packet delivery ratio (PDR): The ratio of the number of packets successfully delivered to the number of packets transmitted in the network.
Network delay: The end-to-end delay in the transmission process, measured as the mean time from when the source sends a packet to when the message is successfully received at the intended destination.
Average energy consumption vs. the number of rounds: The average remaining energy in the nodes at a specific round.
Alive nodes: The number of nodes that are alive at a specific round.
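The metrics above can be computed from simple counters, as in the following sketch. The accuracy formula shown is the usual (TP + TN) / (TP + TN + FP + FN) definition, which is assumed here to be the one intended; the function names and the example numbers are hypothetical and are not results from this work.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def packet_delivery_ratio(delivered, sent):
    """Packets successfully delivered divided by packets sent."""
    return delivered / sent if sent else 0.0

def throughput_pps(delivered, duration_s):
    """Successfully delivered packets per second over the observation window."""
    return delivered / duration_s

def mean_delay_s(send_times, receive_times):
    """Mean end-to-end delay over packets that were delivered (paired by index)."""
    delays = [rx - tx for tx, rx in zip(send_times, receive_times)]
    return sum(delays) / len(delays)

def alive_nodes(residual_energy_j):
    """Number of nodes whose residual energy is still above zero."""
    return sum(1 for e in residual_energy_j if e > 0)

# Illustrative values only.
print(accuracy(tp=50, tn=40, fp=5, fn=5))          # 0.9
print(packet_delivery_ratio(delivered=90, sent=100))  # 0.9
```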
7. Results and Discussion
To evaluate the accuracy of the proposed method during the analysis stage, experiments were performed using different classification algorithms, including support vector machines (SVMs), random tree, J48, and K-star. Relying on a single algorithm could lead to limited results; at the same time, in order to save time and cost, we could not try every algorithm. We therefore selected a group of popular algorithms in the literature that belong to different families of ML algorithms (numerical, symbolic, etc.).
Table 2 shows the outcomes obtained from using these algorithms. Additionally,
Figure 4 illustrates the distribution of data points belonging to the two classes (mines in blue and rocks in red) after applying the Level 3 Haar function to the dataset. This shows a clear grouping of each class after feature extraction, thereby enhancing the classification accuracy.
This result can be explained by the fact that Haar wavelet transform provides details and high resolution in the time-frequency domain of the processed acoustic signal, which helps to extract the relevant features that will be efficiently used in the classification. The wavelet transform applied to the different levels allows us to extract the different frequency sub-bands that comprise the signal and has high adequacy in classifying the object accurately.
The results have shown that the Naïve Bayes classifier achieved the highest accuracy (95.1691%) when using the selected dataset in both the sevenfold and the fivefold cross-validation, outperforming the other classifiers. The random tree classifier produced similar results to Naïve Bayes, but Naïve Bayes was lighter and simpler, making it more suitable for practical applications.
Table 3 presents detailed results of the selected classifier, successfully classifying 197 out of 207 objects and incorrectly classifying only 10 objects, with a low mean absolute error (MAE) of 0.0473. Additionally, a comparison of the fivefold and tenfold classifications is shown in the table.
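The reported accuracy follows directly from the counts given above, as this short check shows (197 correct and 10 incorrect out of 207 classified objects):

```python
correct, incorrect = 197, 10
total = correct + incorrect                 # 207 classified objects

accuracy_pct = 100 * correct / total
error_pct = 100 * incorrect / total

print(f"{accuracy_pct:.4f}%")               # 95.1691%
print(f"{error_pct:.4f}%")                  # 4.8309%
```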
Table 4 compares this study with previous efforts carried out on the same dataset. Three of the previous studies used the wavelet transform (WT), while four used different techniques. As the table indicates, the proposed method yielded better results than all of the previously used methods. While the accuracy of the three studies that utilized a comparable methodology (wavelet) varied between 80% and 94%, the proposed method achieved a classification accuracy of 95.1691%. The four other studies yielded accuracies of between 72% and 89%. This suggests that WT improves the classification accuracy. The higher accuracy of the proposed method compared to all of the previous studies therefore represents a meaningful contribution to this field.
Figure 5 illustrates the amount of energy consumed for specific numbers of nodes in the proposed work. The energy consumption is consistent with the detection scheme executed at the sensor node: the detection steps comprise signal transformation followed by classification, and these procedures were carried out each time an object was spotted. We also employed an optimum clustering method to decrease the energy consumption, since the node’s residual energy plays a role in picking the CH. Compared to the work in [
29], the proposed work consumed less energy; in the former work, the amount of consumed energy ranged from 210.28960 J to 214.57170 J, depending on the packet size (50, 100, 150, 200, and 250 bytes). Moreover, even though the packet sizes in [
29] were smaller than the packet size in the proposed work (500 bytes), the energy consumption in [
29] was still higher. The chart in
Figure 5 compares the proposed work and the previous protocols used in [
37] in terms of consumed energy. With 10 nodes, the proposed work consumed a reasonable amount of energy in comparison to the ReVOHPR protocol. When the number of nodes increased to 25, the energy consumption increased, but the proposed work still performed well in comparison to the other protocols. Once the number of nodes reached 100, the proposed work consumed less energy only in comparison to the PSO protocol.
Figure 6 shows that the proposed work consumed less energy than all of the protocols presented in [
30]. Overall, the outcomes of the proposed work are positive in terms of consumed energy.
Figure 7 illustrates the network delay in relation to the number of nodes. As the number of nodes increases, the delay increases, because the CHs receive more packets from the sensor nodes. The comparison of the proposed work with the other protocols from [
37] shows that the proposed work had a lower delay than all of the other protocols across the varying numbers of nodes. Furthermore, the proposed work had a lower delay than all of the protocols presented in [
36]. With 100 nodes, the lowest delay among those protocols was 6.8 s, achieved by the CA-DBR protocol, which is still higher than the delay of the proposed work.
Figure 8 shows a comparison between the proposed work’s delay and the protocol delay provided in [
30]. The proposed work has less delay in comparison to the other protocols.
Compared to the previous study [
29], which achieved a PDR of 67% for 100 nodes, the PDR of the proposed work was higher. Moreover, with 100 nodes, the highest PDR value in the work of [
27] was 80%, which is lower than the value obtained in the proposed study. Furthermore, in [
36], all of the protocols produced a lower PDR, with the CA-DBR protocol providing the highest value, 78% at 100 nodes.
Figure 9 illustrates a comparison between the proposed work and the previous procedures [
37]. The proposed work produced better results than all of these protocols except ReVOHPR, which achieved a PDR of 96% at 100 nodes compared to 90% for the proposed work.
Figure 10 provides a comparison between the proposed work and the protocols provided in [
30]. The proposed work provided a higher PDR than these protocols for the same number of nodes. In terms of throughput,
Figure 11 shows that, as the number of sensor nodes rises, the throughput increases, and the transmission rate in the network increases with it. The high throughput attests to the capability of this approach to support the intensive exchange between the cluster nodes participating in the detection and classification of the acoustic signal. It also shows that the network can carry a high frequency of notifications submitted to the sink node by multiple cluster heads. However, compared to the protocols in [
37], the proposed work generally provided lower throughput values, as can be seen from the graph. Also, when the proposed work was compared to the protocols in [
30], the latter achieved a better throughput.
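For reference, the three network metrics discussed above (PDR, delay, and throughput) can be computed from a packet trace as sketched below. The formulas are the commonly used definitions and are assumed here, since this section does not restate them; the `PacketRecord` type, the function names, and the ten-packet trace are hypothetical, with only the 500-byte packet size taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PacketRecord:
    size_bytes: int
    sent_at: float                 # send time in seconds
    received_at: Optional[float]   # None if the packet was lost

def packet_delivery_ratio(records: List[PacketRecord]) -> float:
    """Percentage of sent packets that reached the sink."""
    delivered = sum(1 for r in records if r.received_at is not None)
    return 100.0 * delivered / len(records)

def average_delay(records: List[PacketRecord]) -> float:
    """Mean end-to-end delay (seconds) over delivered packets only."""
    delays = [r.received_at - r.sent_at
              for r in records if r.received_at is not None]
    return sum(delays) / len(delays)

def throughput_bps(records: List[PacketRecord], duration_s: float) -> float:
    """Delivered payload bits per second over the observation window."""
    bits = sum(8 * r.size_bytes for r in records if r.received_at is not None)
    return bits / duration_s

# Hypothetical trace: ten 500-byte packets, the last one lost.
trace = [PacketRecord(500, float(i), i + 0.5) for i in range(9)]
trace.append(PacketRecord(500, 9.0, None))

pdr = packet_delivery_ratio(trace)      # 90.0
delay = average_delay(trace)            # 0.5
rate = throughput_bps(trace, 10.0)      # 3600.0
```

Under these definitions, a protocol can deliver a high share of its packets with low delay and still show a lower throughput if it injects fewer bits per unit time.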
In
Figure 12, the number of alive nodes declines from 100, but more than half are still alive after 300 rounds. The proposed work therefore runs for more than 900 rounds before all of the nodes are dead. By comparison, Ref. [
28] and Ref. [
31] reached the end of the network lifetime in the 800th and 500th rounds, respectively. Therefore, even though the energy consumption of their nodes was lower than that of the proposed work, the proposed method of conserving energy through optimal clustering is better at extending the network’s lifetime.
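The intuition behind choosing the CH according to residual energy can be shown with a deliberately simple toy model. The sketch below is not the clustering method of the proposed work: it assumes a single cluster, hypothetical integer energy costs per round, and it measures the round of the first node death rather than the full alive-node curve.

```python
def first_death_round(num_nodes, initial_energy, ch_cost, member_cost, pick_ch):
    """Run rounds until some node runs out of energy; return that round."""
    energy = [initial_energy] * num_nodes
    round_no = 0
    while all(e > 0 for e in energy):
        round_no += 1
        ch = pick_ch(energy)
        for i in range(num_nodes):
            # The cluster head pays more because it aggregates and forwards.
            energy[i] -= ch_cost if i == ch else member_cost
    return round_no

def fixed_ch(energy):
    """Baseline: node 0 is always the cluster head."""
    return 0

def max_residual_ch(energy):
    """Pick the node with the most residual energy (lowest index on ties)."""
    return max(range(len(energy)), key=lambda i: energy[i])

# Hypothetical parameters (assumptions): 5 nodes, 20 energy units each,
# 4 units per round for the CH and 1 unit per round for a member.
fixed_rounds = first_death_round(5, 20, 4, 1, fixed_ch)
rotating_rounds = first_death_round(5, 20, 4, 1, max_residual_ch)

print(fixed_rounds, rotating_rounds)  # 5 11
```

In this toy setting, rotating the CH role toward the node with the most remaining energy spreads the load and more than doubles the time until the first node dies, which is consistent with the lifetime argument made above.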