*Article* **A Compact and High-Performance Acoustic Echo Canceller Neural Processor Using Grey Wolf Optimizer along with Least Mean Square Algorithms**

**Eduardo Pichardo \*, Esteban Anides \*, Angel Vazquez, Luis Garcia, Juan G. Avalos, Giovanny Sánchez, Héctor M. Pérez and Juan C. Sánchez**

> Instituto Politécnico Nacional, ESIME Culhuacan, Av. Santa Ana No. 1000, Ciudad de México 04260, Mexico **\*** Correspondence: edua\_95pim@hotmail.es (E.P.); eanidesc1800@alumno.ipn.mx (E.A.);

Tel.: +52-55-2101-9551 (E.P. & E.A.)

**Abstract:** Recently, the use of acoustic echo canceller (AEC) systems in portable devices has increased significantly. Therefore, the need for superior audio quality in resource-constrained devices opens new horizons in the creation of high-convergence-speed adaptive algorithms and optimal digital designs. Nowadays, AEC systems mainly use the least mean square (LMS) algorithm, since its implementation in digital hardware architectures demands low area consumption. However, its performance in acoustic echo cancellation is limited, and the algorithm presents local convergence problems. Recently, new approaches based on stochastic optimization algorithms have emerged to increase the probability of encountering the global minimum. However, the simulation of these algorithms requires high-performance computational systems; as a consequence, they have only been conceived as theoretical approaches. The creation of a low-complexity algorithm therefore potentially allows the development of compact AEC hardware architectures. In this paper, we propose a new convex combination, based on the grey wolf optimization and LMS algorithms, to save area and achieve high convergence speed by fully exploiting the best features of each algorithm. In addition, the proposed convex combination algorithm shows superior tracking capabilities when compared with existing approaches. Furthermore, we present a new neuromorphic hardware architecture to simulate the proposed convex combination. Specifically, we present a customized time-multiplexing control scheme to dynamically vary the number of search agents. To demonstrate the high computational capabilities of this architecture, we performed exhaustive testing. In this way, we proved that it can be used in real-world acoustic echo cancellation scenarios.

**Keywords:** grey wolf optimization; swarm intelligence; real world application; spiking neural P system; AEC system; LMS; neuromorphic architecture; FPGA

**MSC:** 68W10; 68Q06; 68Q45; 68W99; 94A12

#### **1. Introduction**

Nowadays, acoustic echo canceller (AEC) systems mainly use the least mean square (LMS) adaptive filter algorithm, since it exhibits low computational complexity [1]. Therefore, this algorithm is easy to implement in low-area devices. However, its use can cause instability in the system, since it assumes a unimodal error surface. Hence, this algorithm is limited when a multimodal error surface is considered, since it must be initialized in the valley of the global optimum in order to converge to it [2]. Another aspect is linked to its convergence speed, which depends on the eigenvalue spread of the correlation matrix (**R**) [3]. To overcome these problems, bio-inspired evolutionary [4,5] and swarm intelligence (SI) techniques [6–8] have emerged as potential solutions for parameter optimization. Recently, the grey wolf optimization (GWO)

**Citation:** Pichardo, E.; Anides, E.; Vazquez, A.; Garcia, L.; Avalos, J.G.; Sánchez, G.; Pérez, H.M.; Sánchez, J.C. A Compact and High-Performance Acoustic Echo Canceller Neural Processor Using Grey Wolf Optimizer along with Least Mean Square Algorithms. *Mathematics* **2023**, *11*, 1421. https://doi.org/10.3390/ math11061421

Academic Editor: Gaige Wang

Received: 1 February 2023 Revised: 28 February 2023 Accepted: 13 March 2023 Published: 15 March 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

algorithm [6], which is considered to be one of the most recent metaheuristic SI methods, has proven to be very successful in multiple fields, such as image processing [9,10], health care/bioinformatics [11,12], control systems [13], electromagnetics [14,15], environmental applications [16,17], and adaptive filtering [18,19], among others [20,21], since this algorithm offers impressive characteristics in contrast with other swarm intelligence methods.

In general terms, the GWO algorithm shows a good balance between the exploration and exploitation processes during the search. As a consequence, it achieves high accuracy. From an engineering point of view, this algorithm can be seen as a potential solution in practical applications, since it involves fewer parameters and less memory than the most advanced metaheuristic SI methods. Because it requires few controlling parameters, its practicality is increased significantly [21]. Based on these features, new variants of the GWO algorithm can be developed for practical and real-time acoustic echo canceller (AEC) systems.

In general, bio-inspired algorithms are capable of globally optimizing any class of adaptive filter structure. Therefore, the use of these algorithms opens new opportunities in the development of advanced adaptive filters and enables their implementation in embedded devices [22]. Specifically, recent studies have proven that the particle swarm optimization (PSO) algorithm can be used in the development of AEC systems. For example, the authors of [23] proposed a PSO-based AEC system to guarantee a high degree of accuracy in terms of echo return loss enhancement (ERLE). Specifically, they used the PSO to perform the error minimization in the frequency domain. The authors of [24] presented a PSO-based AEC system in which the PSO algorithm provides a very fast convergence rate by performing the error minimization in the time domain. Another PSO-based AEC system was developed in [25]. This approach was applied to multichannel systems to improve stability.

After analysing all the works mentioned above, we found that the PSO algorithm may suffer some limitations, especially when heavily constrained optimization is required. In this situation, the algorithm may become trapped in local minima. To avoid this, the GWO algorithm can be seen as a potential solution, since it offers a high convergence rate, at the cost of exhibiting high computational complexity and low tracking capabilities [26–28]. In particular, good performance can be obtained when a large population is used, i.e., population-based metaheuristics generally have greater exploration capabilities than single-solution-based algorithms [6]. However, this makes their implementation in current embedded devices infeasible. Hence, the development of new variants of GWO algorithms that decrease the computational cost whilst maintaining performance is still a great challenge. Recently, some authors have proposed convex combinations of adaptive filters, in which two adaptive algorithms with complementary capabilities are used to improve the steady-state MSE and the tracking performance [29,30]. With respect to the latter, the tracking performance determines the capability of the system to track variations in the statistics of the signals of interest [31]. Therefore, in practical AEC systems, the improvement of this factor is highly required [32].

Here, we propose a new variant of the convex adaptive filter, based on the GWO and LMS algorithms. In addition, we include the block-processing scheme in both algorithms to easily implement them in parallel hardware architectures. As a consequence, these algorithms were applied to real-time and practical AEC applications. Specifically, we used the GWO algorithm to guarantee a high convergence rate and high tracking capabilities. With respect to the latter, we added new exploration capabilities by dynamically adjusting the search space, especially in the context of abrupt changes occurring in the acoustic environment, which cannot be achieved when using the conventional GWO algorithm. A significant improvement in terms of computational cost was achieved by dynamically decreasing the population of the GWO algorithm over the filtering process. In addition, the use of the LMS allowed us to improve the steady-state MSE.

From an engineering perspective, the development of low computational complexity metaheuristic SI methods makes their implementation in current resource-constrained devices feasible. In addition, the development of advanced and novel implementation techniques also opens new opportunities in the creation of compact and high-performance devices. Specifically, the simulation of the proposed convex GWO/LMS adaptive filter requires an enormous number of large-precision multipliers. Therefore, the development of low-area, low-latency and large-precision adders and multipliers is still a challenging task. Inspired by neural phenomena, Ionescu et al. presented, for the first time, a class of distributed and parallel computing models, denominated spiking neural P systems [33]. This new area of membrane computing intends to exploit, to the maximum, the intrinsic parallel capabilities of the soma of the neurons to create novel processing systems. In recent years, several authors have focused their efforts on creating advanced arithmetic circuits, such as adders and multipliers.

• *Parallel neural adder*.

Recently, Ref. [34] developed a neural adder circuit to compute the sum of two large integer numbers in parallel. In spite of this achievement, this adder demands an enormous number of synapses and neurons to process large integer numbers. As a result, its use for simulation purposes becomes impractical, since metaheuristic SI methods demand high numerical precision.

• *Parallel neural multiplier*.

In [35], the authors intended to significantly reduce the number of synapses to create an ultra-compact parallel multiplier. Despite decreasing the area consumption, the processing time was still high.

Analyzing previous works, we noted that these circuits were proposed to process large integer numbers in parallel at the cost of increased processing time. Therefore, none of these circuits can be used in the simulation of real-time AEC systems. In addition, higher-precision arithmetic circuits exhibit a clear trade-off between area consumption and processing speed. From an engineering perspective, several authors continue to make tremendous efforts to design advanced high-precision neural arithmetic circuits for use in real-time AEC systems. Specifically, these developments face two large challenges:


Regarding the latter, we developed three potential proposals:


Our results showed that the proposal of new variants of the GWO algorithm, along with new implementation techniques, generates a compact neuromorphic architecture to be used in practical and real-time AEC applications.

The paper is structured as follows. Section 2 introduces the fundamentals of the GWO and LMS algorithms. Additionally, we introduce the proposed convex GWO/LMS algorithm, which involves a new set of equations to improve its tracking capabilities and computational complexity. Section 3 presents software simulation tests, including the justification of the selection of some tuning parameters, and displays comparisons between existing approaches and the proposed algorithm. In Section 4, we present a new parallel neural adder circuit and a new parallel neural multiplier circuit, both of which are heavily used in the computation of the proposed algorithm. This section also reports experimental results of the implementation of the proposed convex GWO/LMS algorithm on the Stratix IV GX EP4SGX530 FPGA. Finally, Section 5 provides the conclusions.

#### **2. The Proposed Block Convex GWO/LMS Algorithm**

#### *2.1. GWO Algorithm*

In general terms, the GWO is a population-based optimization technique inspired by the behavior of grey wolves (*Canis lupus*) [6]. This algorithm mimics the hunting behavior and social hierarchy of grey wolves. The hierarchy involves four levels: alpha (*α*) represents the fittest solution in the population, beta (*β*) and delta (*δ*) denote the second and third best solutions, respectively, and omega (*ω*) denotes the remaining search agents. To simulate the GWO algorithm, the following equations are used:

$$\overrightarrow{W}(n+1) = \overrightarrow{W}_p(n) - \overrightarrow{A} \cdot |\overrightarrow{C} \cdot \overrightarrow{W}_p(n) - \overrightarrow{W}(n)|\tag{1}$$

where $n$ denotes the current iteration, $\overrightarrow{W}_p$ is the position vector of the prey, $\overrightarrow{W}$ represents the position of a grey wolf, and $\overrightarrow{A} = 2\overrightarrow{a} \cdot \overrightarrow{r}_1 - \overrightarrow{a}$ and $\overrightarrow{C} = 2\overrightarrow{r}_2$ denote coefficient vectors, where $\overrightarrow{r}_1$ and $\overrightarrow{r}_2$ are random vectors in $[0, 1]$ and $\overrightarrow{a}$ is linearly decreased from 2 to 0 using the following equation

$$
\overrightarrow{a}(n) = 2 - \frac{2n}{MaxIter} \tag{2}
$$

where *MaxIter* is the total number of iterations.

To update the position of the search agents, the following equations must be used

$$
\overrightarrow{W}_1(n) = \overrightarrow{W}_{\alpha}(n) - \overrightarrow{A}_1 \cdot |\overrightarrow{C}_1 \cdot \overrightarrow{W}_{\alpha}(n) - \overrightarrow{W}(n)|\tag{3}
$$

$$
\overrightarrow{W}_2(n) = \overrightarrow{W}_{\beta}(n) - \overrightarrow{A}_2 \cdot |\overrightarrow{C}_2 \cdot \overrightarrow{W}_{\beta}(n) - \overrightarrow{W}(n)|\tag{4}
$$

$$
\overrightarrow{W}_3(n) = \overrightarrow{W}_{\delta}(n) - \overrightarrow{A}_3 \cdot |\overrightarrow{C}_3 \cdot \overrightarrow{W}_{\delta}(n) - \overrightarrow{W}(n)|\tag{5}
$$

$$
\overrightarrow{W}_p(n+1) = \frac{\overrightarrow{W}_1(n) + \overrightarrow{W}_2(n) + \overrightarrow{W}_3(n)}{3} \tag{6}
$$

where $\overrightarrow{W}_{\alpha}(n)$, $\overrightarrow{W}_{\beta}(n)$ and $\overrightarrow{W}_{\delta}(n)$ are the positions of the three best wolves at each iteration and $\overrightarrow{W}_p(n+1)$ is the new position of the prey.
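As a concrete illustration, the update scheme of Equations (2)–(6) can be sketched in a few lines of Python. This is a minimal, unconstrained sketch on a toy sphere fitness function; the agent count, dimensionality, bounds and iteration budget are illustrative and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(w):                                   # toy fitness: f(w) = sum(w_i^2)
    return np.sum(w * w, axis=-1)

def gwo(fitness, n_agents=20, n_dims=5, max_iter=100):
    wolves = rng.uniform(-1.0, 1.0, size=(n_agents, n_dims))
    for it in range(max_iter):
        order = np.argsort(fitness(wolves))      # rank agents by fitness
        alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]
        a = 2.0 - 2.0 * it / max_iter            # Eq. (2): linear decrease 2 -> 0
        cands = []
        for leader in (alpha, beta, delta):
            r1 = rng.random(wolves.shape)
            r2 = rng.random(wolves.shape)
            A = 2.0 * a * r1 - a                 # coefficient vector A
            C = 2.0 * r2                         # coefficient vector C
            D = np.abs(C * leader - wolves)      # encircling distance
            cands.append(leader - A * D)         # Eqs. (3)-(5)
        wolves = (cands[0] + cands[1] + cands[2]) / 3.0   # Eq. (6)
    return wolves[np.argsort(fitness(wolves))[0]]

best = gwo(sphere)
```

On this convex toy problem the pack collapses onto the leaders as *a* shrinks, so the returned position lies close to the global minimum at the origin.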

#### *2.2. LMS Algorithm*

The LMS algorithm is a practical method to obtain estimates of a filter's weights, **w**(*n*), in real time [1]. The equation for updating the weights of the LMS algorithm in a sample-by-sample fashion is given by:

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu e(n)\mathbf{x}(n) \tag{7}$$

where **x**(*n*) is the current input vector, *μ* is a fixed step-size in the range [0, 1], and *e*(*n*) is the error signal, calculated using:

$$e(n) = d(n) - \mathbf{w}^T(n)\mathbf{x}(n) \tag{8}$$

and *d*(*n*) is the desired signal.
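To make the update concrete, the following Python sketch identifies a short FIR "echo path" with the sample-by-sample LMS of Equations (7) and (8). The 4-tap path, step-size and signal length are illustrative choices, not values from the paper:

```python
import numpy as np

def lms(x, d, N=4, mu=0.05):
    """Sample-by-sample LMS: e(n) = d(n) - w^T x(n); w <- w + mu*e(n)*x(n)."""
    w = np.zeros(N)
    e = np.zeros(len(x))
    for n in range(N - 1, len(x)):
        xn = x[n - N + 1:n + 1][::-1]     # x(n) = [x(n), x(n-1), ..., x(n-N+1)]
        e[n] = d[n] - w @ xn              # error, Eq. (8)
        w = w + mu * e[n] * xn            # weight update, Eq. (7)
    return w, e

# identify a known 4-tap FIR "echo path" h from noisy-free observations
rng = np.random.default_rng(1)
h = np.array([0.5, -0.3, 0.2, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)]            # desired signal d(n) = (h * x)(n)
w, e = lms(x, d)
```

In this noiseless system-identification setting the weight vector converges to the true path *h* and the error decays toward zero.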

#### *2.3. Convex GWO/LMS*

As discussed above, the GWO algorithm is widely used to solve a variety of problems since it exhibits significant properties. However, it also suffers from inevitable drawbacks. The main limitation comes from the no-free-lunch (NFL) theorem, which states that no optimization algorithm is able to solve all optimization problems [40]. This means that the GWO might require modification when solving some real-world problems. In particular, we propose some modifications to the conventional GWO for use in the development of AEC systems. Specifically, we present a new combination of the GWO and LMS algorithms to improve the tracking capabilities and the steady-state error of the filter, respectively, since it has been proven that convex combination approaches significantly improve the performance of adaptive schemes by combining two adaptive algorithms [41].

As can be observed in Figure 1, the proposed system computes the coefficients of the filters, **w**<sub>1</sub> and **w**<sub>2</sub>, as follows:

• *The calculation of the filter coefficients by employing the LMS algorithm*. Here, we use the block LMS algorithm to calculate the **w**<sub>1</sub> weights. This has allowed us to create efficient implementations by using commercially available embedded devices [42,43]. The adaptive filter coefficients **w**<sub>1</sub> are defined as:

$$\mathbf{w}_1(n+1) = \mathbf{w}_1(n) + \mu \mathbf{X}(n) \mathbf{e}_{LMS}(n) \tag{9}$$

where $\mathbf{w}_1(n) = [w(0), w(1), \cdots, w(N-1)]^T$ is the weight vector, *μ* is the step-size, and $\mathbf{e}_{LMS}(n)$ is the error vector, composed as:

$$\mathbf{e}_{LMS}(n) = \begin{bmatrix} e(nL), & e(nL-1), & \cdots, & e(nL-L+1) \end{bmatrix}^T \tag{10}$$

and **X**(*n*) is derived from the current input block

$$\mathbf{x}(n) = \begin{bmatrix} x(nL), & x(nL-1), & \cdots, & x(nL-N+1) \end{bmatrix} \tag{11}$$

where *L* denotes the length of the block and *N* is the size of the filter. Additionally, **X**(*n*) is calculated as follows:

$$\mathbf{X}(n) = \begin{bmatrix} x(nL) & x(nL-1) & \cdots & x(nL-N+1) \\ x(nL-1) & x(nL-2) & \cdots & x(nL-N) \\ \vdots & \vdots & \ddots & \vdots \\ x(nL-L+1) & x(nL-L) & \cdots & x(nL-L-N+2) \end{bmatrix}^T \tag{12}$$

The error vector is obtained as follows:

$$\mathbf{e}_{LMS}(n) = \mathbf{d}(n) - \mathbf{y}_{LMS}(n) \tag{13}$$

where the desired response vector **d**(*n*) is given by

$$\mathbf{d}(n) = \begin{bmatrix} d(nL), & d(nL-1), & \cdots, & d(nL-L+1) \end{bmatrix}^T \tag{14}$$

The filter output **y***LMS*(*n*) of each block is given by the following matrix vector product:

$$\mathbf{y}_{LMS}(n) = \mathbf{X}(n) \cdot \mathbf{w}_1(n) \tag{15}$$

• *The calculation of the filter coefficients by employing the GWO algorithm*. Here, the use of the block-processing scheme in SI algorithms allowed us to process the signal in real-time applications. As a consequence, this potentially allows full exploitation of the performance capabilities of the parallel hardware architectures by simulating the intrinsic parallel computational capabilities of the SI algorithms. In this work, we introduce, for the first time, the block-based GWO algorithm to be applied in practical and real-time AEC applications, as shown in Figure 2.

The encircling behavior of the grey wolves is mathematically described as follows:

$$
\overrightarrow{W}(n+1) = \overrightarrow{W}_p(n) - \overrightarrow{A} \cdot |\overrightarrow{C} \cdot \overrightarrow{W}_p(n) - \overrightarrow{W}(n)|\tag{16}
$$

where $\overrightarrow{A} = 2\phi(n) \cdot \overrightarrow{r}_1 - \phi(n)$ denotes a coefficient vector, and $\phi(n)$ is a function of the instantaneous error, taking values in the range $[0, 2)$. This can be described by:

$$\phi(n) = \frac{4}{1 + e^{-|e_{GWO}(n)|}} - 2 \tag{17}$$

To obtain the best solution, a fitness function, defined in terms of the mean square error (MSE), is used to evaluate each search agent. Hence, the fitness value $f_k$ of the position $\overrightarrow{W}_k$ is expressed as:

$$f_k(n) = \frac{1}{L} \sum_{i=1}^{L} e_k^2(i) \tag{18}$$

where *k* = 1, 2, ... , *P*(*n*). Here, we propose a mechanism to dynamically adjust the number of search agents over the filtering process, i.e., *P*(*n*) denotes the current number of search agents. Specifically, this adjustment is a function of the power of the instantaneous error, and is obtained as follows:

$$P(n) = \left\lfloor \frac{2\,(P_{\max} - P_{\min})}{1 + e^{-|e_{GWO}(n)|}} - (P_{\max} - P_{\min}) \right\rfloor + P_{\min} \tag{19}$$

where $P_{\max}$ and $P_{\min}$ define the maximum and minimum number of search agents, respectively. In addition, we use Equations (3)–(6) to update the positions of the search agents and the prey. On the other hand, the error signal, $\mathbf{e}_{GWO}(n)$, and the filter output, $\mathbf{y}_{GWO}(n)$, are described as follows:

$$\mathbf{e}_{GWO}(n) = \mathbf{d}(n) - \mathbf{y}_{GWO}(n) \tag{20}$$

$$\mathbf{y}_{GWO}(n) = \mathbf{X}(n) \cdot \mathbf{w}_2(n) \tag{21}$$

where $\mathbf{w}_2(n) = \overrightarrow{W}_{\alpha}(n)$.

Considering the outputs of both filters, $\mathbf{y}_{GWO}(n)$ and $\mathbf{y}_{LMS}(n)$, at time *n*, we obtain the output of the parallel filter as:

$$\mathbf{y}(n) = \lambda(n) \cdot \mathbf{y}_{GWO}(n) + [1 - \lambda(n)] \cdot \mathbf{y}_{LMS}(n) \tag{22}$$

where *λ*(*n*) is a mixing parameter, which is in the range [0, 1]. This parameter is used to control the combination of the two filters at each iteration, and is defined by:

$$
\lambda(n) = \frac{1}{1 + e^{-a(n)}}\tag{23}
$$

where *a*(*n*) is an auxiliary parameter used to minimize the instantaneous square error of the filters, and is obtained as follows:

$$a(n+1) = a(n) + \mu_a \cdot e(1) \cdot \left[ e_{GWO}(1) - e_{LMS}(1) \right] \cdot \lambda(n) \cdot [1 - \lambda(n)] \tag{24}$$

Finally, the performance of the combined filter can be further improved by transferring a portion of $\mathbf{w}_1$ to $\overrightarrow{W}_{\alpha}$, $\overrightarrow{W}_{\beta}$ and $\overrightarrow{W}_{\delta}$. This can be formulated as follows:

$$
\overrightarrow{W}_{\alpha}(n) = \lambda(n) \cdot \overrightarrow{W}_{\alpha}(n) + [1 - \lambda(n)] \cdot \mathbf{w}_1(n) \tag{25}
$$

$$
\overrightarrow{W}_{\beta}(n) = \lambda(n) \cdot \overrightarrow{W}_{\beta}(n) + [1 - \lambda(n)] \cdot \mathbf{w}_1(n) \tag{26}
$$

$$
\overrightarrow{W}_{\delta}(n) = \lambda(n) \cdot \overrightarrow{W}_{\delta}(n) + [1 - \lambda(n)] \cdot \mathbf{w}_1(n) \tag{27}
$$

In this way, the GWO filter can reach a lower steady-state MSE while keeping a high convergence rate.
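The error-driven controls of the proposed combination can be summarised in a short Python sketch. The *P*min/*P*max values below are illustrative, and λ(*n*) is taken as the logistic sigmoid commonly used in convex adaptive filtering:

```python
import numpy as np

def phi(e):
    """Eq. (17): search-range control, in [0, 2); grows with |e_GWO(n)|."""
    return 4.0 / (1.0 + np.exp(-abs(e))) - 2.0

def population(e, p_min=5, p_max=50):
    """Eq. (19): error-driven search-agent count between p_min and p_max."""
    s = 2.0 * (p_max - p_min) / (1.0 + np.exp(-abs(e))) - (p_max - p_min)
    return int(np.floor(s)) + p_min

def mix(y_gwo, y_lms, a):
    """Eqs. (22)-(23): convex combination of the two filter outputs."""
    lam = 1.0 / (1.0 + np.exp(-a))            # mixing parameter in [0, 1]
    return lam * y_gwo + (1.0 - lam) * y_lms, lam
```

When the residual error vanishes the population shrinks to *P*min (saving arithmetic), and a large error drives it back toward *P*max to restore exploration; a neutral auxiliary parameter *a* = 0 weights both filters equally.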

**Figure 1.** Proposed convex structure.

**Figure 2.** The flowchart of the block GWO algorithm.

#### **3. Pure Software Simulation**

Before implementing the proposed convex GWO/LMS adaptive filter in parallel hardware architectures, we simulated it in MATLAB for testing and comparison purposes. Specifically, we simulated the conventional LMS, the GWO and our proposal to compare their performance. In addition, we used the AEC structure shown in Figure 3, in which the existing approaches and the proposed convex GWO/LMS adaptive filter were embedded. As can be observed, *x*(*n*) is the far-end input signal, *e*(*n*) denotes the residual echo signal, and *d*(*n*) represents the sum of the echo signal *y*(*n*) and the background noise *e*<sub>0</sub>(*n*).

To simulate the proposed convex GWO/LMS adaptive filter and existing approaches, we considered the following conditions:


**Figure 3.** Structure of the acoustic echo canceller.

**Figure 4.** Acoustic echo path used for the simulation of the existing and the proposed algorithms.

Considering a single-talk scenario, we performed three experiments to verify the performance of the proposed convex GWO/LMS adaptive filter in terms of the echo return loss enhancement, $ERLE = 10\log_{10}\!\left(\frac{d^2(n)}{e^2(n)}\right)$.
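For reference, the ERLE metric can be estimated from the two signals as follows. This is a minimal sketch that averages over whole signals; the small `eps` guard is an implementation detail added here only to avoid division by zero:

```python
import numpy as np

def erle_db(d, e, eps=1e-12):
    """ERLE = 10*log10(mean(d^2) / mean(e^2)), in dB."""
    return 10.0 * np.log10((np.mean(d ** 2) + eps) / (np.mean(e ** 2) + eps))

# toy check: residual echo 100x smaller in amplitude -> about 40 dB of ERLE
d = np.ones(1000)            # echo-plus-noise (desired) signal
e = 0.01 * np.ones(1000)     # residual echo after cancellation
```

A higher ERLE value indicates stronger echo attenuation, which is why the learning curves below are reported on this scale.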

• *Effect of varying the number of coefficients of the proposed convex GWO/LMS adaptive filter*. Figure 5 shows the evaluation of the ERLE level of the proposed algorithm. In this evaluation, we used a population size of 30 search agents and varied the number of coefficients of the adaptive filter from 150 to 500. The aim of this experiment was to observe how the ERLE level was affected by different numbers of coefficients. The proposed convex GWO/LMS adaptive filter guaranteed the same ERLE level regardless of the number of coefficients, as shown in Figure 5. Therefore, we used the minimum number of coefficients, since this factor is relevant, especially when the filter is implemented in resource-constrained devices.

**Figure 5.** ERLE level for different number of coefficients *N*.

• *Effect of varying the number of search agents of the proposed convex GWO/LMS adaptive filter*. In this experiment, we varied the number of search agents from 30 to 200 to evaluate the performance of the proposed algorithm in terms of ERLE level. Since the minimum number of adaptive filter coefficients (150) guaranteed a good ERLE level, we used this number for the experiment. As can be observed from Figure 6, we obtained the same performance by using different numbers of search agents. In this way, we confirmed that, when employing the minimum number of search agents, the proposed method reached a good ERLE level. From an engineering perspective, this has a great impact on the performance of resource-constrained devices, since the proposed algorithm intends to reduce its computational cost by decreasing the number of search agents over the adaptive process.

**Figure 6.** ERLE for different numbers of search agents *P*.

The algorithms under comparison were configured with the following tuning parameters:

1. **LMS**
   - Convergence factor = 9 × 10<sup>−7</sup>
2. **GWO**
   - *a* decreases linearly from 2 to 0
   - Lower bound = −1
   - Upper bound = 1
   - Population size = 50
3. **PSO**
   - Acceleration coefficient, *c*<sub>1</sub> = 1.6
   - Acceleration coefficient, *c*<sub>2</sub> = 1
   - Inertia weight = 0.8
   - Lower bound = −1
   - Upper bound = 1
   - Population size = 100
4. **DE**
   - Crossover rate = 0.35
   - Scaling factor = 0.8
   - Combination factor = 0.25
   - Lower bound = −1
   - Upper bound = 1
   - Population size = 50
5. **ABC**
   - Evaporation parameter = 0.1
   - Pheromone = 0.6
   - Lower bound = −1
   - Upper bound = 1
   - Population size = 50
6. **PSO-LMS**
   - Evaporation parameter = 0.1
   - Pheromone = 0.6
   - Lower bound = −1
   - Upper bound = 1
   - Population size = 50
   - Convergence factor = 3 × 10<sup>−5</sup>

As can be observed from Figure 7, the proposed convex GWO/LMS adaptive filter showed the best performance in terms of ERLE level and convergence speed, at the expense of a large number of additions and multiplications, as shown in Table 1. In contrast, the LMS algorithm expended fewer additions and multiplications than the proposed algorithm, at the cost of exhibiting a slow convergence speed. In general, the excessive number of additions and multiplications makes the implementation of the GWO adaptive filter in current embedded devices, such as DSP and FPGA devices, impractical, since they have a limited number of these circuits. Here, our proposal dynamically decreases the number of search agents, as shown in Figure 8. As a consequence, the number of multiplications and additions is also reduced (Equation (19)). In this way, the implementation of our proposal in embedded devices becomes feasible.

**Table 1.** Comparison between the proposed convex GWO/LMS system and existing approaches in terms of the number of additions and multiplications.


**Figure 7.** ERLE learning curves obtained by simulating existing approaches and the proposed algorithm; (**a**) by shifting the acoustic path at the middle of iterations and (**b**) by multiplying the acoustic path by −1 at the middle of iterations [1,15,23,48–51].

**Figure 8.** The number of search agents used during the adaptation process.

• *Statistical comparison between the proposed convex GWO/LMS and existing approaches*. Statistical results were obtained with two different evaluations: the average ERLE value in dB and its corresponding standard deviation. The maximum number of iterations was set to 2,000,000 and each algorithm was run 10 times. The results are reported in Table 2.


**Table 2.** Comparison between the proposed convex GWO/LMS system and existing approaches in terms of average value of ERLE and standard deviation in dB.

As can be observed from Table 2, the proposed convex GWO/LMS achieved a good average ERLE level, in comparison with other existing algorithms. It should be noted that the MABC algorithm possessed the highest average value. Nonetheless, this algorithm presented a lower convergence speed, especially when abrupt changes occurred, as shown in Figure 7b. On the other hand, the GWO, PSO and PSO-LMS algorithms presented lower standard deviations in comparison with the proposed Convex GWO/LMS algorithm. Nonetheless, the proposed method achieved a higher average value.

#### **4. Pure Hardware Simulation**

To adequately simulate the proposed convex GWO/LMS system, we made extraordinary efforts to develop compact, high-speed and high-precision neural arithmetic circuits, namely an adder and a multiplier, since these two circuits are heavily demanded in the computation of the proposed algorithm, as shown in Table 1. Specifically, we developed, for the first time, a customized floating-point representation to perform additions and multiplications with high precision.

To compute the numbers in floating-point format, we established the format criteria of the input numbers *u* and *v*, to be either added or multiplied, as follows:

1. In general terms, the numbers *u* and *v* are separated into integer and fractional digits. Specifically, the number of integer digits and the number of fractional digits can be chosen over a range (*u*<sub>*g*0</sub> ··· *u*<sub>*g*1</sub> ··· *u*<sub>*gm*</sub>, *v*<sub>*g*0</sub> ··· *v*<sub>*g*1</sub> ··· *v*<sub>*gm*</sub>), as shown in Figure 9. Here, we coloured the digits, using a variant of the SN P systems called coloured spikes, to easily distinguish the units, tens, hundreds, etc., of the integer part and the tenths, hundredths, thousandths, etc., of the fractional part. This strategy was also used to represent the digits of the results of the addition and multiplication operations.

**Figure 9.** General structure of the proposed floating-point neural adder.


The proposed neural adder circuit ∏*add* has a set of neurons, *σ*<sub>*A*0</sub>, ··· , *σ*<sub>*Am*</sub>, and a neuron, *σ*<sub>*p*</sub>. The set of neurons *σ*<sub>*A*</sub> is in charge of computing the addition of two numbers, where each number is composed of an integer part and a fractional part, while neuron *σ*<sub>*p*</sub> determines the position of the point that segments the number into an integer part and a fractional part, as shown in Figure 9.

The proposed neural adder circuit ∏*add* computes the addition as follows:

In the initial state, the neurons, *σ*<sub>*A*</sub>, are empty. At this time, the dendritic branches of synaptic channels (1) and (2) are activated according to the values of the digits of *u* and *v*, respectively. For example, if the value of a digit of *v* or *u* is equal to five, then five dendritic branches are activated. Therefore, these dendritic branches allow the flow of five spikes towards a specific neuron, *σ*<sub>*A*</sub>. Simultaneously, the neuron, *σ*<sub>*p*</sub>, places the point that segments the number into integer and fractional digits by setting the firing rule *a* → *a*<sub>*p*</sub>{*X*}. This spiking rule implies that, if neuron *σ*<sub>*p*</sub> receives a spike at any time, it fires and sends a spike to a specific neuron, *σ*<sub>*Ax*</sub>. To do this, we used a variant of the SN P systems called target indications. In this way, the point is allocated according to the desired precision. Therefore, the point, represented as a spike, is stored in a specific neuron, *σ*<sub>*Ax*</sub>. Once the neural circuit is configured, the partial additions of the spikes start.

Here, the addition of the numbers is performed in a single simulation step, since all neurons, *σA*, process their respective input spikes simultaneously. Additionally, a carry spike is generated when any neuron, *σA*, accumulates ten spikes. At this moment, the spiking rule *a*<sup>9</sup>*a*<sup>+</sup>/*a*<sup>10</sup> → *a*, associated with synaptic channel (3), is applied. After one simulation step, the result, represented by the remaining spikes in the soma of each neuron, is placed at the output synapses. In addition, neuron *σAx* sends the point spike to the environment by enabling its spiking rule, *ap* → *ap*.
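The digit-wise behaviour described above can be mirrored in software. The sketch below is our own illustrative model, not the authors' circuit: each neuron *σA* is modeled as a plain spike counter, channels (1) and (2) inject as many spikes as the corresponding input digits, and ten accumulated spikes produce one carry spike, as in the rule applied on channel (3).

```python
def neural_add(u_digits, v_digits):
    """Digit-wise addition with spike counters (least-significant digit
    first). Each counter models a neuron sigma_A: channels (1) and (2)
    inject as many spikes as the corresponding digits of u and v, and a
    counter holding ten spikes emits one carry spike to its neighbour
    (the rule a^9 a^+ / a^10 -> a on channel (3))."""
    n = max(len(u_digits), len(v_digits)) + 1   # room for a final carry
    soma = [0] * n
    for i in range(n - 1):
        u = u_digits[i] if i < len(u_digits) else 0
        v = v_digits[i] if i < len(v_digits) else 0
        soma[i] += u + v                        # parallel spike injection
    for i in range(n - 1):                      # carry propagation
        soma[i + 1] += soma[i] // 10
        soma[i] %= 10
    return soma

# 47.5 + 8.9, digits LSD-first with one fractional digit: [5,7,4] + [9,8,0]
print(neural_add([5, 7, 4], [9, 8, 0]))  # -> [4, 6, 5, 0], i.e. 56.4
```

Note that the neural circuit resolves all digits within a single simulation step and tracks the point position with neuron *σp*; this sketch propagates the carries sequentially and leaves the point implicit, purely for clarity.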

In this work, we proved that the use of several variants of the SN P systems, such as coloured spikes [36], rules on the synapses [37], target indications [51], extended channel rules [38], extended rules [39] and dendritic trunks [52], creates an ultra-compact and high-performance circuit, instead of only using the soma as conventional SN P systems do. In particular, the proposed neural circuit required fewer simulation steps, neurons and synapses than the existing approach [34], as shown in Table 3.

**Table 3.** Comparison between the existing neural adder [34] and this work in terms of synapses/neurons and simulation steps. Here, *n* represents the number of digits of *v* and *u*. Adapted from [53].


• *Parallel neural multiplier circuit* ∏*mul*.

Since the multiplier is one of the most demanding components in terms of processing and area consumption, a large number of techniques have been developed to minimize these factors. Recently, several authors have used SN P systems to create efficient parallel multiplier circuits. However, processing speed remains an issue, since most of these studies focused on reducing area consumption. Improving processing speed potentially allows the development of high-performance systems that support AEC systems in real time. In addition, the development of a high-precision neural multiplier is still a challenging task, and this factor is especially relevant when metaheuristic algorithms are simulated. Here, we developed a neural multiplier that achieves higher processing speeds than existing approaches while keeping area consumption low. To achieve this, we reduced the processing time by using cutting-edge variants of the SN P systems, such as coloured spikes [36], rules on the synapses [37], target indications [51], extended rules [39] and dendritic trunks [52]. Specifically, we used these variants to significantly reduce the time expended in the computation of the partial products of the multiplication, in comparison with the most recent approach [35].

The proposed neural multiplier circuit ∏*mul* is composed of a set of neurons (*σ<sub>A0</sub>*, ··· , *σ<sub>An−j</sub>*, ··· , *σ<sub>An−1</sub>*, ··· , *σ<sub>A2(n−j)</sub>*, ··· , *σ<sub>A2n−1</sub>*) and a neuron, *σin*, as shown in Figure 10. In general terms, the neurons, *σA*, perform the addition of the partial products, where each partial product is computed by using dendritic branches. In particular, each neuron *σ<sup>A</sup>* computes the partial product between a single digit of *u* and a digit of *v*: the digits of *u* are represented as *p* spikes, which are generated by neuron *σin* when its spiking rule, (*a<sup>pu</sup>*)<sup>+</sup>/*a<sup>pu</sup>* → *a<sup>pu</sup>*, is applied, and the value of each digit of *v* activates an equal number of dendritic branches.

The proposed neural multiplier circuit ∏*mul* performs the multiplication as follows. At the initial simulation step, the neurons, *σA*, are empty. At this time, neuron *σin* places the spike point upon receiving a spike: it fires and sends the spike point to a specific neuron, *σAx*, using the target indications [51]. In this way, the digits of *u* or *v* are segmented into integer and fractional parts. Once the multiplier is configured, *m* · *n* partial products are executed in parallel. To perform any partial product, neuron *σin* fires *p* spikes, which are sent to its corresponding neuron, *σA*. The soma of this neuron receives several copies of the *p* spikes, as a function of the number of active dendritic branches. For example, if a neuron multiplies 3 × 3, then neuron *σAx* receives three copies of three spikes by means of three dendritic branches. In this way, the neuron, *σA*, increases the potential of its soma by performing three additions. Therefore, synaptic weights, which many approaches use to perform partial products, are not required. Hence, the number of branched connections can be variable, and each neuron *σ<sup>A</sup>* enables the optimal number of synaptic connections. From an engineering perspective, we proposed the use of a variable number of forked connections, since the implementation of a very large number of synaptic connections in advanced FPGAs creates critical routing problems. Once the result is obtained, neuron *σin* places the spike point according to the sum of the numbers of fractional digits, as in conventional multiplication. Therefore, the neuron *σA* that received the spike point at the initial simulation step fires the point spike by applying its firing rule (*ap* → *ap*). As can be observed from Table 4, we achieved a significant improvement in terms of simulation steps, since only one simulation step is required to perform a multiplication of two numbers of any length.
This aspect is relevant, especially when real-time AEC system simulations are required. Additionally, we reduced the number of synapses in comparison with the existing work.

**Table 4.** Comparison between the proposed neural multiplier and the existing neural multiplier [35] in terms of simulation steps, synapses and neurons. Here, *n* denotes the number of digits of *v* and *u*. Adapted from [53].
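As a software analogy of this scheme, the following sketch (ours, with illustrative names) accumulates each partial product *u<sub>i</sub>* · *v<sub>j</sub>* in a counter indexed by *i* + *j*, mimicking a neuron that receives *v<sub>j</sub>* copies of *u<sub>i</sub>* spikes through *v<sub>j</sub>* active dendritic branches, and then resolves the carries:

```python
def neural_multiply(u_digits, v_digits):
    """Parallel partial products with spike counters (LSD first). Each
    counter soma[i+j] models a neuron sigma_A receiving v_j copies of
    u_i spikes (v_j active dendritic branches); carries are resolved
    afterwards."""
    n = len(u_digits) + len(v_digits)
    soma = [0] * (n + 1)
    for i, u in enumerate(u_digits):
        for j, v in enumerate(v_digits):
            soma[i + j] += u * v       # v branches, each carrying u spikes
    for k in range(n):                 # carry spikes
        soma[k + 1] += soma[k] // 10
        soma[k] %= 10
    return soma[:n]

print(neural_multiply([3, 2], [4, 1]))  # 23 * 14 = 322 -> [2, 2, 3, 0]
```

As in the adder sketch, the point position is tracked separately (by neuron *σin* in the circuit), and the partial-product injection is what the hardware performs in one simulation step.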


**Figure 10.** Structure of the neural multiplier.

#### *4.1. Experimental Results*

To demonstrate the computational capabilities of the proposed convex GWO/LMS adaptive filter, we considered an arbitrary single-talk scenario, as shown in Figure 11.

**Figure 11.** Scheme of the AEC prototype.

Under this configuration, we used the experimental setup of Figure 11 to simulate the proposed algorithm. Specifically, the proposed system has an AEC neural processor as its main core, which cancels the echo signal and the background noise by simulating the proposed convex GWO/LMS adaptive filter. As can be observed from Figure 12, the AEC neural processor is composed of a control unit (CU), a set of processing cores (*α*, *β*, *δ*, *ω*, LMS and convex) and BRAMs. Figure 13 shows a sequence diagram that specifies how each processing core computes specific parts of the proposed convex GWO/LMS adaptive filter. Here, we used the BRAMs to store *L* and *N* samples of the signals *x* and *d*, where *L* = *N*. In this way, we implemented the block-processing scheme. Once these values are stored in their respective BRAMs, the computation of the proposed convex GWO/LMS adaptive filter starts, and the AEC neural processor computes the new variant of the GWO algorithm and the LMS algorithm simultaneously. Specifically, we propose a new time-multiplexing control scheme to automatically update the number of search agents of the GWO algorithm over the processing time. Under this scheme, the search agents, *ω*, which are divided into three parts, are enabled or disabled by the processing cores *α*, *β* or *δ*. It is important to keep in mind that the processing cores *α*, *β* and *δ* evaluate the responses of their respective processors, *ω*, simultaneously. Accordingly, the proposed time-multiplexing control scheme uses the signals *en*\_*w*\_*α*, *en*\_*w*\_*β* and *en*\_*w*\_*δ* to enable or disable search agents according to the simulation needs, as shown in Figure 12. Here, the use of the time-multiplexing control scheme allowed us to significantly decrease the number of buses needed to transfer the coefficients between the processing cores *ω* and the processing cores *α*, *β* or *δ*.
This saving allowed us to implement a full connection between the processing cores, *α*, *β* and *δ*. As a consequence, the evaluation of the searching point of each agent was performed by processing cores, *α*, *β* and *δ* simultaneously.
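The effect of the time-multiplexing control scheme on the agent population can be modelled in a few lines. The linear decay schedule below is an illustrative assumption on our part; the actual hardware enables or disables ω cores through the *en*\_*w* signals according to the simulation needs.

```python
def active_agents(iteration, total_iters, n_max=70, n_min=15):
    """Number of omega cores the control unit keeps enabled at a given
    iteration, assuming an illustrative linear decay from n_max down to
    n_min search agents over the processing time."""
    frac = iteration / max(1, total_iters - 1)
    return round(n_max - (n_max - n_min) * frac)

schedule = [active_agents(t, 10) for t in range(10)]
print(schedule)  # starts at 70 agents, ends at 15
```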

**Figure 12.** Scheme of the proposed AEC neural processor.

**Figure 13.** The sequential diagram to describe how the proposed convex GWO/LMS adaptive filter is executed by means of the processing cores.

In this work, we used the proposed AEC neural processor to simulate the single-talk and double-talk scenarios by considering the following conditions:

• We employed 1024 adaptive filter coefficients and varied the number of search agents from 70 down to 15.


#### *4.2. Single-Talk Scenario*

In this section, we demonstrate the computational capabilities of the proposed algorithm under a single-talk scenario. As can be observed from Figures 14 and 15, the proposed AEC neural processor, which simulated the proposed convex GWO/LMS algorithm, reached a good ERLE level when processing two different signals. To carry this out, we configured the AEC neural processor to support one *α*, one *β*, one *δ* and 70 *ω* processing cores. Here, the most demanding core, in terms of area and processing speed, was the *ω* processing core, since each one required 1000 neural multipliers and 1000 neural adders. Therefore, we physically implemented six *ω* processing cores to virtually simulate 70 *ω* processing cores, achieving substantial area savings. In general terms, the implementation of these components required 420,380 LEs, which represented 79% of the total area of a Stratix IV GX EP4SGX530 FPGA. Furthermore, the AEC neural processor required 114.56 μs (14,320 clock cycles multiplied by the 8 ns system clock period) to simulate the proposed convex GWO/LMS algorithm. This time corresponds to the worst case, in which all the search agents are used. However, the number of search agents decreased over the processing time. When fifteen search agents, the minimum number, were employed, only 1300 clock cycles were expended, reducing the processing time from 114.56 μs to 10.4 μs. Hence, real-time AEC simulation was guaranteed, since the maximum allowed latency was 125 μs.
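The latency figures above follow directly from the cycle counts and the 8 ns clock period, as this quick check shows:

```python
clock_period_ns = 8                        # system clock period
worst_cycles, best_cycles = 14_320, 1_300  # 70 agents vs. 15 agents

worst_us = worst_cycles * clock_period_ns / 1000  # 114.56 us
best_us = best_cycles * clock_period_ns / 1000    # 10.4 us
print(worst_us, best_us, worst_us < 125)  # both under the 125 us real-time budget
```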

**Figure 14.** ERLE learning curve of the proposed convex GWO/LMS considering an AR(1) process as input signal.

#### *4.3. Double-Talk Scenario*

To perform the experiments under this configuration, we employed a double-talk detector circuit to avoid adaptation during periods of simultaneous far-end and near-end speech. It should be noted that the area consumption of this circuit was negligible. Therefore, this implementation expended around 79% of the total area of a Stratix IV GX EP4SGX530 FPGA, as in the previous case. Figures 16 and 17 show the ERLE of the proposed convex GWO/LMS algorithm considering an AR(1) process and a speech sequence signal, respectively. As can be observed from these figures, the proposed algorithm showed good tracking capabilities and achieved a good ERLE level. This aspect is relevant, since these levels are required in the development of practical, real-world AEC applications.

**Figure 15.** ERLE learning curve of the proposed convex GWO/LMS considering a speech sequence signal as input signal.

**Figure 16.** ERLE learning curve of the proposed convex GWO/LMS considering an AR(1) process as input signal and a double-talk scenario.

**Figure 17.** ERLE learning curve of the proposed convex GWO/LMS considering a speech sequence signal as input signal considering a double-talk scenario.

#### **5. Conclusions**

In this work, we present, for the first time, the development of a high-speed and compact FPGA-based AEC neural processor to efficiently simulate a new convex GWO/LMS algorithm. Here, we grouped our contributions as follows:

• *From the AEC model point of view*.

Here, we made intensive efforts to reduce the computational cost of AEC systems so that they can be implemented in resource-constrained devices. In addition, we significantly improved the convergence properties of these systems by using a cutting-edge metaheuristic swarm intelligence method in combination with a gradient descent algorithm, for use in practical acoustic environments. Specifically, we present a new variant of the GWO algorithm along with the LMS algorithm. The use of this combination allowed us to guarantee a higher convergence rate and a lower MSE level, in comparison to when gradient descent algorithms or metaheuristic SI methods are used separately. To improve the tracking capabilities of the conventional GWO algorithm, the proposed variant has new exploration capabilities, since the search space is dynamically adjusted. To make the implementation of the proposed variant of the GWO algorithm in embedded devices feasible, we used the block-processing scheme. In this way, the proposed convex GWO/LMS algorithm can be easily implemented in parallel hardware architectures and, as a consequence, simulated at high processing speeds. In addition, we significantly reduced the computational cost of the proposed convex GWO/LMS algorithm by proposing a method to dynamically decrease the population of the GWO variant over the filtering process.

• *From the SN P systems point of view*.

Here, we present, for the first time, a compact and high-processing-speed floating-point neural adder and multiplier circuit. We used cutting-edge variants of the SN P systems (coloured spikes, rules on the synapses, target indications, extended channel rules, extended rules and dendritic trunks) to create a customized floating-point neural adder and multiplier. Specifically, the proposed neural adder and multiplier exhibit higher processing speeds than existing SN P adders and multipliers, since both expend only one simulation step, which is the best improvement achieved so far.

• *From the digital point of view*.

In this work, we present, for the first time, the development of a parallel hardware architecture that simulates a variable number of search agents by using the proposed time-multiplexing control scheme. In this way, we properly implemented the proposed GWO method, in which the number of search agents increases or decreases according to the simulation needs. In addition, the use of this scheme allowed us to exploit, to the maximum, the flexibility and scalability of the GWO algorithm.

Finally, we carried out several experiments to prove that the proposed convex GWO/LMS algorithm, along with the new techniques inspired by biological neural processes, potentially allows the creation of practical and real-time AEC processing tools. Part of our future work is to develop new convex combinations in which other metaheuristic algorithms can be used in other adaptive filtering applications, such as active noise control, channel equalization and noise cancellers. In addition, new digital techniques will be explored to mimic bio-inspired behavior with high accuracy.

**Author Contributions:** Conceptualization, E.P.; Data curation, L.G. and H.M.P.; Formal analysis, G.S.; Funding acquisition, J.G.A. and J.C.S.; Investigation, A.V. and J.G.A.; Methodology, J.G.A. and G.S.; Resources, E.P., E.A. and A.V.; Software, E.P., E.A. and A.V.; Supervision, G.S. and J.C.S.; Validation, E.P. and E.A.; Writing—original draft, L.G. and H.M.P.; Writing—review & editing, H.M.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** The authors would like to thank the Instituto Politécnico Nacional for the financial support.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** No new data were created or analyzed in this study. Data sharing is not applicable to this article.

**Acknowledgments:** The authors would like to thank the Consejo Nacional de Ciencia y Tecnologia (CONACYT) and the IPN for the financial support to make this work.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

**Fulin Tian \*, Jiayang Wang and Fei Chu**

School of Computer Science and Engineering, Central South University, Changsha 410083, China **\*** Correspondence: fulintian@csu.edu.cn

**Abstract:** In order to compensate for the low convergence accuracy, slow rate of convergence, and easily falling into the trap of local optima for the original Harris hawks optimization (HHO) algorithm, an improved multi-strategy Harris hawks optimization (MSHHO) algorithm is proposed. First, the population is initialized by Sobol sequences to increase the diversity of the population. Second, the elite opposition-based learning strategy is incorporated to improve the versatility and quality of the solution sets. Furthermore, the energy updating strategy of the original algorithm is optimized to enhance the exploration and exploitation capability of the algorithm in a nonlinear update manner. Finally, the Gaussian walk learning strategy is introduced to avoid the algorithm being trapped in a stagnant state and slipping into a local optimum. We perform experiments on 33 benchmark functions and 2 engineering application problems to verify the performance of the proposed algorithm. The experimental results show that the improved algorithm has good performance in terms of optimization seeking accuracy, the speed of convergence, and stability, which effectively remedies the defects of the original algorithm.

**Keywords:** swarm intelligence; Harris hawks optimization; elite opposition-based learning; Sobol sequence; nonlinear weight; Gaussian walk learning

**MSC:** 68T20

**Citation:** Tian, F.; Wang, J.; Chu, F. Improved Multi-Strategy Harris Hawks Optimization and Its Application in Engineering Problems. *Mathematics* **2023**, *11*, 1525. https://doi.org/10.3390/ math11061525

Academic Editor: Gaige Wang

Received: 19 February 2023 Revised: 13 March 2023 Accepted: 18 March 2023 Published: 21 March 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

The swarm intelligence algorithm is a common approach in computational intelligence and an emerging evolutionary computing technique. Its basic idea is to emulate the behavior of groups of ants, birds, bees, wolves, bacteria and other organisms in nature, using the mechanisms of interaction, information exchange and cooperation between group members to form collective intelligence that solves complex problems. Numerous researchers have carried out much work on such intelligent algorithms and proposed many novel algorithms, such as the grey wolf optimizer (GWO) [1], lightning search algorithm (LSA) [2], marine predators algorithm (MPA) [3], sine cosine algorithm (SCA) [4], salp swarm algorithm (SSA) [5], water cycle algorithm (WCA) [6], whale optimization algorithm (WOA) [7], cuckoo search (CS) [8], artificial bee colony (ABC) [9] and moth flame optimization (MFO) [10].

The Harris hawks optimization (HHO) algorithm is a novel swarm intelligence algorithm proposed by Heidari et al. [11], who were inspired by observing the chase-and-escape behavior between Harris hawks and their prey. The algorithm is simple in principle and has few parameters while offering a powerful global search capability, so it has received widespread attention and has been adopted in many engineering fields since its introduction. However, similar to other intelligent optimization algorithms, the basic Harris hawks algorithm is susceptible to defects such as low convergence precision and a tendency to fall into local optima during the search process when solving complex optimization problems. Numerous scholars have proposed different improvement schemes to address these shortcomings. Qu et al. [12] introduced a method of information exchange to increase the diversity of populations and proposed a nonlinear energy escape factor, perturbed by chaotic interference, to balance the local exploitation and global exploration of the algorithm. Liu et al. [13] optimized the method by setting up a square neighborhood topology with multiple subgroups to lead individuals in each subgroup to explore randomly in both directions. Tang et al. [14] introduced an elite hierarchy strategy to make full use of the dominant population, enhancing population diversity and improving the convergence of the algorithm in terms of speed and accuracy. Kaveh et al. [15] combined HHO and the imperialist competitive algorithm (ICA) [16] into imperialist competitive Harris hawks optimization (ICHHO), which showed good performance on structure optimization problems. Elgamal et al. [17] applied chaotic maps at the initialization phase of the HHO and analyzed the current best solution with the simulated annealing (SA) [18] algorithm to improve the utilization of the HHO. Li [19] developed a form of HHO incorporating a novice protection tournament (NpTHHO), which adds a novice protection mechanism to better reallocate resources and introduces a variation mechanism in the exploration phase to further enhance the global search efficiency of the HHO algorithm. Essam H. et al. [20] combined HHO with a support vector machine (SVM) for the selection of chemical descriptors and the activity of compounds in drug design and discovery. Wunnava et al. [21] introduced an adaptive improvement performed by differencing to address the problem that the exploration ability of the algorithm is limited when the escape energy of the HHO equals zero, resulting in invalid random behavior at that stage, and they applied the algorithm to image segmentation.

In summary, the main goal in improving the HHO algorithm is to strengthen local exploitation and global exploration through various optimization strategies, and thereby improve the convergence accuracy and performance of the algorithm before applying it in practical engineering. Among the existing improved versions of HHO, some are achieved by incorporating other algorithms, such as those in [15,17], while others improve one or several stages of the underlying algorithm. Compared with the improved algorithms that have been proposed, the improvement strategies proposed in this paper cover a more comprehensive scope, including the population initialization phase, the population update phase, the energy escape factor and the variation strategy. To overcome the weaknesses of the basic Harris hawks algorithm, such as low accuracy, slow convergence, and easily being trapped in local optima, this paper presents improved multi-strategy Harris hawks optimization (MSHHO). The main contributions of this research are as follows:


• We apply it to two engineering application problems to inspect the practicality of the introduced algorithm. A new scheme is provided for the swarm intelligence algorithm in practical engineering applications.

This paper is organized as follows. Section 1 describes the current state of development of swarm intelligence algorithms and some existing strategies for improving the Harris hawks algorithm. Section 2 describes the basic principles of the basic Harris hawks algorithm. Section 3 details the improvement strategy introduced in this paper and gives the time complexity of the algorithm. Section 4 shows the experimental results conducted to demonstrate the effectiveness of the proposed algorithm. Section 5 shows the application of the algorithm presented in this paper to engineering optimization problems. Section 6 concludes the paper and provides an outlook for future work.

#### **2. Harris Hawks Optimization (HHO)**

The HHO algorithm models the Harris hawk's strategy for capturing prey under different mechanisms in a mathematical formulation, where individual Harris hawks form candidate solutions and the optimal solution produced by every iteration is considered the prey. The algorithm comprises two main phases, namely exploration and exploitation, and transitions between the two phases are performed by the magnitude of the prey's escape energy. The original Harris hawks optimization algorithm is described below.

#### *2.1. Exploration Phase*

The global search phase is mainly driven by the location information of the Harris hawk population, and its update strategy is as follows:

$$X(t+1) = \begin{cases} X\_{\text{rand}}(t) - r\_1 |X\_{\text{rand}}(t) - 2r\_2 X(t)| & q \ge 0.5\\ \left( X\_{\text{prrg}}(t) - X\_{\text{m}}(t) \right) - r\_3 (LB + r\_4 (LB - LB)) & q < 0.5 \end{cases} \tag{1}$$

where *X*(*t* + 1) represents the location of the hawks in iteration *t* + 1, *Xprey*(*t*) represents the location of the prey, *X*(*t*) represents the position of the hawks in the current iteration *t*, *r*<sub>1</sub>–*r*<sub>4</sub> and *q* are randomly generated within (0,1) and renewed in each iteration, *UB* and *LB* are the upper and lower bounds of the population, respectively, *Xrand*(*t*) represents a hawk chosen randomly from the current population, and *Xm*(*t*) represents the average position of the individuals in the current population, which is obtained from Equation (2):

$$X\_m(t) = \frac{1}{n} \sum\_{k=1}^n X\_k(t) \tag{2}$$

where *Xk*(*t*) denotes the position of hawk *k* in the iteration *t* and *n* denotes the number of hawks.
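A minimal, one-dimensional sketch of the exploration update of Equations (1) and (2); the function and variable names are ours, and the clamping to the bounds is an implementation detail rather than part of Equation (1):

```python
import random

def exploration_step(X, X_prey, lb, ub):
    """One exploration update per Eq. (1); X is a list of hawk
    positions (floats, 1-D for brevity), X_prey the current best."""
    X_m = sum(X) / len(X)                      # Eq. (2): population mean
    new_X = []
    for x in X:
        q, r1, r2, r3, r4 = (random.random() for _ in range(5))
        if q >= 0.5:                           # perch on a random hawk
            x_rand = random.choice(X)
            x_new = x_rand - r1 * abs(x_rand - 2 * r2 * x)
        else:                                  # perch relative to prey and mean
            x_new = (X_prey - X_m) - r3 * (lb + r4 * (ub - lb))
        new_X.append(min(ub, max(lb, x_new)))  # clamp to bounds
    return new_X

random.seed(1)
print(exploration_step([0.2, -0.5, 0.9], X_prey=0.1, lb=-1.0, ub=1.0))
```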

#### *2.2. Transition from Exploration to Exploitation*

The energy equation controlling the escape of prey is as follows:

$$E = 2E\_0(1 - t/T) \tag{3}$$

where *t* is the current number of iterations, *T* denotes the maximum number of iterations, and *E*<sub>0</sub> is a random number within (−1,1) that indicates the initial state of the energy. When the escape energy |*E*| ≥ 1, the Harris hawks search different areas to further explore the location of the prey, which corresponds to the global exploration phase; when |*E*| < 1, the Harris hawks explore the adjacent solutions locally, which corresponds to the local exploitation phase.
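Equation (3) and the resulting phase switch can be expressed directly:

```python
def escape_energy(t, T, E0):
    """Eq. (3): the prey's escape energy decays linearly over iterations."""
    return 2 * E0 * (1 - t / T)

# |E| >= 1 -> global exploration; |E| < 1 -> local exploitation
print(escape_energy(0, 100, 0.9))    # 1.8: exploration at the start
print(escape_energy(90, 100, 0.9))   # ~0.18: exploitation near the end
```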

#### *2.3. Exploitation Phase*

In this phase, the Harris hawk besieges the target prey after finding it, based on the exploration results of the previous phase, while the prey tries to escape from the pursuit. On the basis of the behavior of the Harris hawk and its prey, four possible strategies are proposed for this phase of the simulation. The choice between a hard besiege and a soft besiege is determined by *E*, and the parameter *r* indicates whether the prey successfully escapes or not.

#### 2.3.1. Soft Besiege

When |*E*| ≥ 0.5 and *r* ≥ 0.5, the prey tries to escape from the pursuit by jumping, and the Harris hawk will use a soft besiege to gradually consume the prey's energy. The behavior is modeled as follows:

$$X(t+1) = \Delta X(t) - E\left|J X\_{prey}(t) - X(t)\right| \tag{4}$$

$$
\Delta X(t) = X\_{\text{prey}}(t) - X(t) \tag{5}
$$

where *r*<sub>5</sub> represents a randomly generated number within (0,1) and *J* = 2(1 − *r*<sub>5</sub>) is the jump strength, introduced to simulate the nature of prey movement, with its value randomly varied in each iteration.
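A one-dimensional sketch of the soft besiege update of Equations (4) and (5); the code is our illustration, with *J* = 2(1 − *r*<sub>5</sub>) as in the original HHO formulation:

```python
import random

def soft_besiege(x, x_prey, E):
    """Eqs. (4)-(5): the hawk encircles the prey, which jumps with strength J."""
    r5 = random.random()
    J = 2 * (1 - r5)                          # jump strength, J = 2(1 - r5)
    delta_x = x_prey - x                      # Eq. (5)
    return delta_x - E * abs(J * x_prey - x)  # Eq. (4)

random.seed(0)
print(soft_besiege(x=0.4, x_prey=0.1, E=0.6))
```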

#### 2.3.2. Hard Besiege

When |*E*| < 0.5 and *r* ≥ 0.5, the prey does not have enough energy to escape, and so the Harris hawk attacks in a hard besiege manner, using Equation (6) to update the current position:

$$X(t+1) = X\_{prey}(t) - E|\Delta X(t)| \tag{6}$$

#### 2.3.3. Soft Besiege with Progressive Rapid Dives

When |*E*| ≥ 0.5 and *r* < 0.5, the prey has enough energy to escape from the pursuit. The Harris hawks update their positions according to the rule in Equation (7):

$$Y = X\_{prey}(t) - E\left|J X\_{prey}(t) - X(t)\right| \tag{7}$$

$$Z = Y + S \times LF(D) \tag{8}$$

where *D* is the dimension of the problem, *S* is a random vector of a size 1 × *D*, and *LF* is the levy flight function, which can be described as in Equation (9):

$$\begin{cases} LF(x) = 0.01 \times \frac{u \times \sigma}{|v|^{\frac{1}{\beta}}} \\\\ \sigma = \left(\frac{\Gamma(1+\beta) \times \sin(\frac{\pi\beta}{2})}{\Gamma(\frac{1+\beta}{2}) \times \beta \times 2^{(\frac{\beta-1}{2})}}\right)^{\frac{1}{\beta}} \end{cases} \tag{9}$$

where *u* and *v* are random values within (0,1) and *β* is the default constant, set to 1.5.
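Equation (9) translates almost verbatim into code; the helper below is a sketch using only the standard library:

```python
import math
import random

def levy_flight(beta=1.5):
    """Eq. (9): draw one Levy-distributed step length."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u, v = random.random(), random.random()   # u, v within (0, 1)
    return 0.01 * u * sigma / abs(v) ** (1 / beta)

random.seed(2)
print([round(levy_flight(), 5) for _ in range(5)])  # mostly small steps, occasional large jumps
```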

Hence, the final strategy for updating the hawks' positions during the soft besiege phase can be executed by Equation (10):

$$X(t+1) = \begin{cases} Y & \text{if } F(Y) < F(X(t))\\ Z & \text{if } F(Z) < F(X(t)) \end{cases} \tag{10}$$

where *Y* and *Z* are obtained using Equations (7) and (8), respectively.
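The greedy selection of Equation (10) can be sketched as follows; note that when neither *Y* nor *Z* improves the fitness, the original HHO keeps the current position, an assumption we make explicit here:

```python
def greedy_update(x, Y, Z, F):
    """Eq. (10): keep whichever dive (Y first, then Z) improves the fitness F."""
    if F(Y) < F(x):
        return Y
    if F(Z) < F(x):
        return Z
    return x                     # neither dive helped: stay put

sphere = lambda x: x * x
print(greedy_update(0.5, Y=0.3, Z=-0.1, F=sphere))  # -> 0.3
```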

2.3.4. Hard Besiege with Progressive Rapid Dives

When |*E*| < 0.5 and *r* < 0.5, the prey does not have enough energy to make an escape, and the following strategy is defined to be executed under these conditions:

$$X(t+1) = \begin{cases} Y & \text{if } F(Y) < F(X(t))\\ Z & \text{if } F(Z) < F(X(t)) \end{cases} \tag{11}$$

$$Y = X\_{prey}(t) - E\left|J X\_{prey}(t) - X\_m(t)\right| \tag{12}$$

$$Z = Y + S \times LF(D) \tag{13}$$

where *Xm*(*t*) is obtained using Equation (2).

#### *2.4. The Main Steps of HHO*

The main steps of the overall HHO algorithm are as shown in Algorithm 1.
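As a compact software sketch of these steps, the following simplified, per-dimension implementation (ours, not the authors' reference code) strings the phases of Sections 2.1–2.3 together on a toy sphere function:

```python
import math
import random

def hho(F, dim, lb, ub, n_hawks=20, T=200, seed=0):
    """Compact, simplified sketch of one HHO run (per-dimension updates
    following Eqs. (1)-(13)); returns the best position found."""
    rng = random.Random(seed)

    def levy(beta=1.5):
        sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
                 / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        return 0.01 * rng.random() * sigma / abs(rng.random() or 1e-12) ** (1 / beta)

    clamp = lambda x: [min(ub, max(lb, v)) for v in x]
    X = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_hawks)]
    best = min(X, key=F)[:]
    for t in range(T):
        X_m = [sum(col) / n_hawks for col in zip(*X)]   # Eq. (2)
        for i, x in enumerate(X):
            E = 2 * rng.uniform(-1, 1) * (1 - t / T)    # Eq. (3)
            J, r = 2 * (1 - rng.random()), rng.random()
            if abs(E) >= 1:                             # exploration, Eq. (1)
                if rng.random() >= 0.5:
                    xr = X[rng.randrange(n_hawks)]
                    x_new = [xr[d] - rng.random() * abs(xr[d] - 2 * rng.random() * x[d])
                             for d in range(dim)]
                else:
                    x_new = [(best[d] - X_m[d]) - rng.random() * (lb + rng.random() * (ub - lb))
                             for d in range(dim)]
            elif r >= 0.5:                              # besieges, Eqs. (4)-(6)
                if abs(E) >= 0.5:
                    x_new = [(best[d] - x[d]) - E * abs(J * best[d] - x[d]) for d in range(dim)]
                else:
                    x_new = [best[d] - E * abs(best[d] - x[d]) for d in range(dim)]
            else:                                       # rapid dives, Eqs. (7)-(13)
                base = x if abs(E) >= 0.5 else X_m
                Y = clamp([best[d] - E * abs(J * best[d] - base[d]) for d in range(dim)])
                Z = clamp([Y[d] + rng.random() * levy() for d in range(dim)])
                x_new = Y if F(Y) < F(x) else (Z if F(Z) < F(x) else x)
            X[i] = clamp(x_new)
            if F(X[i]) < F(best):
                best = X[i][:]
    return best

sphere = lambda x: sum(v * v for v in x)
best = hho(sphere, dim=2, lb=-5.0, ub=5.0)
print(best, sphere(best))  # best position found and its fitness
```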


#### **3. Improved Multi-Strategy Harris Hawks Optimization (MSHHO)**

The HHO algorithm has multiple exploitation modes and shifts between them, which makes it effective at local exploitation but also prone to falling into local optima. To remedy this deficiency, we introduce four strategies to improve the original algorithm, which are described in detail in the following subsections.

#### *3.1. Sobol Sequence Initialization Populations*

The distribution of the initial solutions in the solution space largely affects the convergence speed and precision of an intelligent algorithm. In the basic HHO algorithm, the initial population is generated randomly. However, individuals generated in this way are not homogeneously distributed throughout the exploration space, which in turn affects the speed of convergence and the precision of the algorithm. The Sobol sequence [22] is a deterministic low-discrepancy sequence that distributes points in the space as uniformly as possible compared with a random sequence. The initial population generated by the Sobol sequence can be expressed as

$$X\_i = Lb + S\_n \times (Ub - Lb) \tag{14}$$

where *Lb* and *Ub* are the lower and upper bounds of the exploration space, respectively, and *Sn* is the random number generated by the Sobol sequence, where *Sn* ∈ [0, 1].
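Equation (14) can be sketched with a one-dimensional low-discrepancy generator; the radical inverse below coincides with the first dimension of a Sobol sequence (a full multi-dimensional Sobol generator needs direction numbers, e.g. from a QMC library):

```python
def van_der_corput(i):
    """Base-2 radical inverse: bit-reverse i across the binary point,
    giving a deterministic low-discrepancy value in [0, 1)."""
    s, denom = 0.0, 1.0
    while i:
        denom *= 2
        s += (i & 1) / denom
        i >>= 1
    return s

def sobol_init(pop_size, lb, ub):
    """Eq. (14): scale low-discrepancy samples S_n into [lb, ub]."""
    return [lb + van_der_corput(n) * (ub - lb) for n in range(1, pop_size + 1)]

print(sobol_init(8, 0.0, 1.0))  # evenly spread: 0.5, 0.25, 0.75, 0.125, ...
```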

Assuming that the search space is two-dimensional, the population size is 100, and the upper and lower bounds are 1 and 0, respectively, Figure 1 compares the initial population spaces produced by random initialization and by Sobol sequence initialization.

**Figure 1.** Comparison of Sobol population generation and random population generation.

As shown in Figure 1, the original population generated by the Sobol sequence is more uniformly distributed, thus enabling the optimization algorithm to perform a better global exploration in the exploration space, increasing the diversity of the population and enhancing the convergence speed of the algorithm.

#### *3.2. Elite Opposition-Based Learning*

Opposition-based learning (OBL) [23] is an effective method of intelligent computing which was proposed by Tizhoosh in 2005. In recent years, this strategy has been employed to improve various algorithms and has achieved outstanding optimization results [24,25]. Assuming that a feasible solution in a *d*-dimensional search space is **X** = (*x*1, *x*2, ··· , *xd*) with *xj* ∈ [*aj*, *bj*], its opposition-based solution is defined as **X̄** = (*x̄*1, *x̄*2, ··· , *x̄d*), where *x̄j* = *r*(*aj* + *bj*) − *xj* and *r* is a uniformly distributed random coefficient inside [0, 1].

The inverse solution generated by the opposition-based learning strategy does not necessarily search for the global optimal solution more easily than the current solution. To address this problem, elite opposition-based learning (EOBL) is proposed. Assuming that the extreme point of the current population in the search space is the elite individual **X**<sup>e</sup> = (*x<sup>e</sup><sub>1</sub>*, *x<sup>e</sup><sub>2</sub>*, ··· , *x<sup>e</sup><sub>d</sub>*), its inverse solution **X̄**<sup>e</sup> = (*x̄<sup>e</sup><sub>1</sub>*, *x̄<sup>e</sup><sub>2</sub>*, ··· , *x̄<sup>e</sup><sub>d</sub>*) can be specified as follows:

$$\overline{x\_j^e} = k \cdot (a\_j + b\_j) - x\_j^e \tag{15}$$

where *x<sup>e</sup><sub>j</sub>* ∈ [*aj*, *bj*], *k* is a random value inside [0, 1], and *aj* = min(*x<sup>e</sup><sub>j</sub>*), *bj* = max(*x<sup>e</sup><sub>j</sub>*) are the lower and upper bounds of the dynamic boundary, respectively. Replacing the fixed boundary with a dynamic boundary helps the generated inverse solutions gradually narrow the search space and speeds up the convergence of the algorithm. Since an elite inverse solution may jump out of the boundary and lose feasibility, its value is reset as follows:

$$\overline{x\_j^e} = rand(a\_j, b\_j) \tag{16}$$
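The EOBL update in Equations (15) and (16) can be sketched as follows; the dynamic bounds are taken over the elite individuals, and the helper name and the final clip to the feasible region are illustrative choices:

```python
import numpy as np

def elite_opposition(elite, lb, ub, rng=None):
    """elite: (n, d) array of elite individuals; lb, ub: feasible bounds."""
    if rng is None:
        rng = np.random.default_rng(0)
    a = elite.min(axis=0)       # dynamic lower bound a_j
    b = elite.max(axis=0)       # dynamic upper bound b_j
    k = rng.random()            # k ~ U[0, 1]
    opp = k * (a + b) - elite   # Equation (15)
    # Equation (16): components that left the dynamic boundary are
    # reset to a uniform draw from [a_j, b_j].
    out = (opp < a) | (opp > b)
    low = np.broadcast_to(a, opp.shape)[out]
    high = np.broadcast_to(b, opp.shape)[out]
    opp[out] = rng.uniform(low, high)
    return np.clip(opp, lb, ub)
```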

#### *3.3. Escape Energy Update Optimization*

In the basic HHO, a Harris hawk relies on the energy factor *E* to manage the transition of the algorithm from the global search phase to the local search phase. However, as shown in Equation (3), the energy factor *E* is reduced from 2 to 0 by a linear update, which tends to trap the algorithm in a local optimum in the second half of the iterations. To overcome the deficiency of performing only local search in the later phase, a new update of the energy factor is used:

$$E = \begin{cases} \cos(\pi \times (t/T + 1/2)) + 2, & t \le T/2\\ \cos\left(\pi \times (t/T - 1/2)^{1/3}\right), & t > T/2 \end{cases} \tag{17}$$

$$E\_1 = E \times (2 \times rand - 1)\tag{18}$$

where *t* is the current number of iterations, *T* is the maximum number of iterations, and *rand* is a random number inside [0, 1].
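A minimal sketch of the schedule in Equations (17) and (18), assuming the "+2" offset sits outside the cosine so that *E* decays smoothly from 2 to 1 over the first half of the iterations; names are illustrative:

```python
import numpy as np

def escape_energy(t, T, rng=None):
    """Piecewise-cosine escape energy E (Eq. 17) and fluctuating E1 (Eq. 18)."""
    if rng is None:
        rng = np.random.default_rng(0)
    if t <= T / 2:
        E = np.cos(np.pi * (t / T + 0.5)) + 2.0   # decays from 2 to 1
    else:
        E = np.cos(np.pi * (t / T - 0.5) ** (1.0 / 3.0))
    E1 = E * (2.0 * rng.random() - 1.0)           # Equation (18)
    return E, E1
```

At *t* = 0 the energy starts at 2, reaches 1 continuously at *t* = *T*/2, and then shrinks rapidly, while *E*1 fluctuates inside [−|*E*|, |*E*|].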

From Figure 2, we can see that early in the iterations, *E* decreases quickly, which controls the global search capability of the algorithm. In the middle of the iterations, the rate of decrease slows down to balance local exploitation and global exploration. In the later iterations, the local search speeds up, and the value of *E* shrinks rapidly. From Figure 3, it can be seen that *E*1 fluctuates throughout the iterative process, so the algorithm remains capable of both global and local searching at every stage: global exploration dominates in the early stage, while the later stage is mostly local exploitation that still retains the possibility of global exploration.

**Figure 2.** Iterative change graph of *E*.

**Figure 3.** Iterative change graph of *E*1.

#### *3.4. Gaussian Walk Learning*

Gaussian walk learning (GWL) is a classical stochastic walk strategy with strong exploitation capability [26–28]. Thus, this paper uses this strategy to mutate the population individuals to improve the diversity of the population while helping it leap out of the local optimum trap. The Gaussian walk learning model is shown in Equation (19):

$$X(t+1) = Gauss(X(t), \tau) \tag{19}$$

$$\tau = \cos(\pi/2 \times (t/T)^2) \times (X(t) - X\_r(t))\tag{20}$$

where *X*(*t*) denotes an individual in generation *t*, *Gauss*(*X*(*t*), *τ*) is the Gaussian distribution with *X*(*t*) as the expectation and *τ* as the standard deviation, and *Xr*(*t*) is the location of a random individual in generation *t*. The step size of Gaussian walk learning is adjusted by the function cos(*π*/2 × (*t*/*T*)<sup>2</sup>), shown in Figure 4. To balance the search ability of the algorithm, the perturbation applied in the early iterations is large and decreases rapidly in the later stages to increase the algorithm's exploitation ability.
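Equations (19) and (20) can be sketched directly with a NumPy Gaussian draw; the function name is illustrative, and |*τ*| is used elementwise as the standard deviation:

```python
import numpy as np

def gaussian_walk(X, X_r, t, T, rng=None):
    """Equations (19)-(20): Gaussian mutation with a cosine-decayed step."""
    if rng is None:
        rng = np.random.default_rng(0)
    tau = np.cos(np.pi / 2 * (t / T) ** 2) * (X - X_r)   # Equation (20)
    # Gauss(X, tau): mean X, standard deviation |tau| (elementwise)
    return rng.normal(loc=X, scale=np.abs(tau))          # Equation (19)
```

Because the cosine factor vanishes as *t* approaches *T*, the mutation collapses onto *X*(*t*) late in the run, matching the step-length decay in Figure 4.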

**Figure 4.** Graph of wandering step length change control.

#### *3.5. Flow of the MSHHO Algorithm*

In summary, the main steps of the improved multi-strategy Harris hawks optimization (MSHHO) algorithm are shown in Algorithm 2, and the flow chart of MSHHO is shown in Figure 5.

#### **Algorithm 2** Main steps of MSHHO algorithm

```
Input: Population size N and the maximum number of iterations T
 1: Initialize the population according to Equation (14)
 2: while t < T do
 3:   Generate the reverse population using the elite opposition-based
      learning mechanism and calculate the fitness of the original
      population and its reverse population individuals
 4:   if the algorithm is stagnant then
 5:     Update the location according to Equation (19)
 6:   else
 7:     for i = 1 : N do
 8:       Update the escape energy E according to Equation (18)
 9:       if |E| ≥ 1 then
10:         Update the location according to Equation (1)
11:       else if |E| < 1 then
12:         if |E| ≥ 0.5 and r ≥ 0.5 then
13:           Update the location according to Equation (4)
14:         else if |E| < 0.5 and r ≥ 0.5 then
15:           Update the location according to Equation (6)
16:         else if |E| ≥ 0.5 and r < 0.5 then
17:           Update the location according to Equation (10)
18:         else if |E| < 0.5 and r < 0.5 then
19:           Update the location according to Equation (11)
20:         end if
21:       end if
22:     end for
23:   end if
24:   t = t + 1
25: end while
26: return X_prey
```

**Figure 5.** The flow chart of MSHHO.

#### *3.6. Time Complexity Analysis of the Algorithm*

The time complexity of the basic HHO algorithm depends mainly on three stages: the initialization phase, the fitness calculation process, and the location update operation of the population. Assuming that the size of the Harris hawk population is *N*, the problem dimension is *D*, and the maximum number of iterations is *T*, the time complexity of the initialization phase is *O*(*N*), and the time complexity of finding the prey's location and updating the population's location vector is *O*(*N* ∗ *T*) + *O*(*N* ∗ *D* ∗ *T*), so the time complexity of the basic HHO algorithm is *O*(*N* ∗ (1 + *T* + *D* ∗ *T*)). For the improved algorithm proposed in this paper, the time complexity of the Sobol sequence initialization population is *O*(*N* ∗ *D*), the time complexity of the elite reverse learning strategy to optimize the population is *O*(*N* ∗ *D* ∗ *T*), and the average time complexity of the Gaussian random wandering strategy is *O*(*N*/2 ∗ *D* ∗ *T*). Thus, the time complexity of MSHHO is *O*(*N* ∗ (3/2 ∗ *D* ∗ *T* + *D* + 1)).

#### **4. Experiment and Results**

To verify the performance of the proposed MSHHO algorithm, the GWO [1], SCA [4], SSA [5], and WOA [7] approaches, together with the original HHO [11] algorithm, were selected for comparison. The same experimental environment, platform, and parameters were used for all experiments. Table 1 shows the parameter settings of the comparison algorithms. The simulation environment was the 64-bit Windows 10 operating system, the CPU was an Intel(R) Core(TM) i7-7700HQ at 2.80 GHz, and the simulation software was MATLAB R2016b.


**Table 1.** Parameter sets of the algorithms.

#### *4.1. Benchmark Functions and Numerical Experiment*

Twenty-three functions were selected from the CEC2005 benchmark [29,30], where *F*1–*F*7 are unimodal functions, *F*8–*F*13 are multimodal functions, and *F*14–*F*23 are fixed-dimension functions. The specific information of the benchmark functions is shown in Tables A1–A3. For this experiment, the dimension of the test functions *F*1–*F*13 was set to 30, while the other functions had their own dimensions, all lower than 30. For each algorithm, the population size was set to 30, and the maximum number of iterations was 500. Each algorithm was run 30 times independently on each test function to prevent chance from biasing the experimental results, and the mean, best value, and standard deviation of each algorithm's runs are shown in Tables 2 and 3.

The CEC2017 benchmark functions are characterized by a large problem size and more complex optimization searching, which can effectively distinguish the direct differences in the searching ability of different algorithms. There are 30 single-objective benchmark functions in CEC2017, including unimodal functions (*F*1–*F*3), simple multimodal functions (*F*4–*F*10), hybrid functions (*F*11–*F*20), and composition functions (*F*21–*F*30). In order to further verify the improvement effect of the MSHHO algorithm, 10 benchmark functions (*F*1, *F*3, *F*5, *F*7, *F*14, *F*15, *F*18, *F*21, *F*24, and *F*30) with different characteristics were selected for testing in the experiment, and their characteristics are shown in Table 4. Each algorithm was also run 30 times in the experiment, with a maximum number of iterations of 1000 and a population size of 100. The experimental results are shown in Table 5.

#### *4.2. Results Analysis*

In the experiment with CEC2005 as the test function, the unimodal functions *F*1–*F*<sup>7</sup> selected for this experiment were used to test the development capability of the algorithm.

From the experimental results, it can be seen that for the test functions *F*1–*F*4, MSHHO could directly find the best value of zero, HHO had the second-best performance, and the SCA, SSA, and WOA performed poorly to varying degrees. For *F*5 and *F*6, MSHHO performed best in terms of both average and best results, with much higher accuracy than the other algorithms. For *F*7, MSHHO performed similarly to HHO, but numerically, MSHHO had a slightly better mean, best value, and stability and performed significantly better than the other comparison algorithms. Overall, among all unimodal test functions, MSHHO had the best performance, stable results, and significantly better optimization than the comparison algorithms.

*F*8–*F*23 are multimodal and fixed-dimension functions used to evaluate the exploration capability of the algorithm. The experimental results show that on *F*8–*F*23, compared with the other algorithms, MSHHO achieved the best optimization effect on most functions and found the best value on many of them, such as *F*9, *F*11, *F*14, *F*16, *F*17, *F*18, and *F*19. On *F*17 and *F*19, the stability of MSHHO was slightly inferior to that of the SSA. Overall, the combined performance of MSHHO on the multimodal test functions was still the best.

In the experiments with CEC2017 as the test function, it can be seen that MSHHO performed well in the hybrid functions and composition functions, both of which could obtain the best optimal and mean values with good stability. In unimodal functions and simple multimodal functions, although the performance was not the best, it had a great improvement effect compared with the original HHO.

To visualize the convergence performance of MSHHO, the iterative convergence curves of the test functions were experimentally plotted, and the convergence plots of some of the test functions are shown in Figures 6–9. As can be seen from the figures, both in terms of the speed of convergence and the accuracy of convergence, MSHHO outperformed the other comparison algorithms. It showed good performance not only in the unimodal test functions but also the multimodal functions. The box graph shows that MSHHO also performed better in terms of stability compared with the other algorithms.


**Table 2.** Results of CEC2005 benchmark functions.


**Table 3.** Results of CEC2005 benchmark functions.

**Table 4.** The CEC2017 benchmark functions selected for the experiment.



**Table 5.** Results of CEC2017 benchmark functions.

#### *4.3. Nonparametric Statistical Analysis*

In order to analyze the test results of each experiment more precisely and avoid chance influencing the validation of the experimental results, the results of 30 runs of the 6 algorithms on 33 test functions were passed through the Wilcoxon rank sum test at a significance level of 0.05 to identify significant discrepancies between the results of the comparison algorithms and MSHHO. If the p-value of the rank sum test was greater than 0.05, there was no significant difference between the two sets of results; otherwise, the results of the two algorithms differed significantly overall. The results of the rank sum test are shown in Table 6, where NaN indicates that the two groups of samples were identical. For the CEC2005 benchmark functions, the results in the table show that MSHHO was significantly different from GWO, the SCA, and the SSA in all 23 functions, from HHO in 22 functions, and from the WOA in 19 functions. Among the 10 benchmark functions in CEC2017, MSHHO was significantly different from the SCA in all functions, from HHO and the WOA in 9 functions, from GWO in 8 functions, and from the SSA in 5 functions. Therefore, it can be concluded that there was a statistically significant difference in optimization performance between MSHHO and the other algorithms, and MSHHO performed significantly better.
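The test used here corresponds to `scipy.stats.ranksums`; a sketch with placeholder run data (the two Gaussian samples stand in for the 30 per-function results of two algorithms):

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
runs_a = rng.normal(0.0, 1.0, 30)   # 30 runs of algorithm A (placeholder)
runs_b = rng.normal(1.0, 1.0, 30)   # 30 runs of algorithm B (placeholder)

stat, p = ranksums(runs_a, runs_b)
significant = p < 0.05              # True -> results differ significantly
```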

**Figure 6.** Qualitative results of F5, F6, F7, and F8 (CEC2005).

**Figure 7.** Qualitative results of F10, F12, F13, and F14 (CEC2005).

**Figure 8.** Qualitative results of F15, F17, F19, and F20 (CEC2005).

**Figure 9.** Qualitative results of F18, F21, F24, and F30 (CEC2017).


**Table 6.** Results of Wilcoxon rank sum test for different algorithms.

#### *4.4. Comparison with Other Improved Strategies of the HHO Algorithm*

To better illustrate the improvement in optimization performance, the experimental results of the chaotic elite Harris hawks optimization (CEHHO) algorithm in [14] were selected for comparison with the MSHHO algorithm proposed in this paper. The parameters were the same as those set in [14]: the population size was 50, the maximum number of iterations was 300, and 17 common test functions were selected for the experiments. The comparison results are shown in Table 7.

**Table 7.** Comparison of the results of different improved algorithms.


From the results in the table, it is apparent that for functions *F*9, *F*10, and *F*11, both optimization algorithms could reach the optimal results with a standard deviation of zero, while for *F*16, *F*17, and *F*18, both algorithms found the optimal values as well. The standard deviation of MSHHO was smaller, and the algorithm was more stable in comparison. For the other functions, MSHHO had better performance in terms of both mean and standard deviation.

#### *4.5. Experimental Analysis of Solving High-Dimensional Functions*

Based on the experimental results above, we verified the optimization effectiveness of the improved MSHHO algorithm in low-dimensional functions. However, most algorithms would be much less effective or even fail when solving complex problems in high-dimensional functions.

In order to verify the practicality of MSHHO in high-dimensional problems, the original HHO algorithm and the improved MSHHO algorithm were selected for comparison experiments on 100-dimensional and 500-dimensional *F*1–*F*<sup>13</sup> functions, respectively, and the experimental results are shown in Table 8.

From the results in Table 8, it can be seen that the improved MSHHO algorithm still had better result values than the original HHO algorithm for each test function in 100 and 500 dimensions with good stability, and the MSHHO algorithm still found the optimal results in functions *F*1–*F*4.

**Table 8.** Experimental analysis of solving high-dimensional functions.


#### **5. Engineering Optimization Problems**

#### *5.1. Pressure Vessel Design Problem*

The pressure vessel is an essential piece of equipment in industrial production, the main function of which is to store liquids or gases under a certain pressure. The pressure vessel design problem is a nonlinear programming problem with multiple constraints. The objective of the problem is to minimize the cost of making the pressure vessel. The design of the pressure vessel is shown in Figure 10.

**Figure 10.** Pressure vessel design problem.

The pressure vessel is composed of a cylindrical section and hemispherical caps at the head and tail ends. *L* is the length of the cylindrical section, *R* is the radius of the inner wall of the cylindrical section, *S* is the wall thickness of the cylindrical section, and *H* is the wall thickness of the hemispherical cap. *L*, *R*, *S*, and *H* are the four variables to be optimized in the pressure vessel design problem. The objective function and constraints of the problem can be expressed as follows:

$$\begin{array}{l} x = [x\_1, x\_2, x\_3, x\_4] = [S, H, R, L] \\ \mathrm{Min}\, f(x) = 0.6224 x\_1 x\_3 x\_4 + 1.7781 x\_2 x\_3^2 + 3.1661 x\_1^2 x\_4 + 19.84 x\_1^2 x\_3 \end{array}$$

subject to

$$\begin{array}{l} g\_1(x) = -x\_1 + 0.0193 x\_3 \le 0\\ g\_2(x) = -x\_2 + 0.00954 x\_3 \le 0\\ g\_3(x) = -\pi x\_3^2 x\_4 - \frac{4}{3}\pi x\_3^3 + 1296000 \le 0\\ g\_4(x) = x\_4 - 240 \le 0\\ 0 \le x\_i \le 100, \ i = 1, 2\\ 10 \le x\_i \le 200, \ i = 3, 4 \end{array}$$
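A direct encoding of the objective and constraints (with the volume constraint *g*3 including the *x*4 term, as in the standard formulation of this benchmark), plus a simple static penalty, can be sketched as follows; function names and the penalty weight are illustrative:

```python
import math

def pv_cost(x):
    """Manufacturing cost of the vessel, x = [S, H, R, L] = [x1, x2, x3, x4]."""
    x1, x2, x3, x4 = x
    return (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3 ** 2
            + 3.1661 * x1 ** 2 * x4 + 19.84 * x1 ** 2 * x3)

def pv_constraints(x):
    """Return g1..g4; each must be <= 0 for a feasible design."""
    x1, x2, x3, x4 = x
    return [
        -x1 + 0.0193 * x3,
        -x2 + 0.00954 * x3,
        -math.pi * x3 ** 2 * x4 - (4.0 / 3.0) * math.pi * x3 ** 3 + 1296000.0,
        x4 - 240.0,
    ]

def pv_penalized(x, penalty=1e6):
    """Objective plus a static penalty on constraint violations."""
    return pv_cost(x) + penalty * sum(max(0.0, g) for g in pv_constraints(x))
```

Any of the compared metaheuristics can then minimize `pv_penalized` directly over the box bounds given above.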

The problem was solved by the MSHHO and HHO algorithms [11] as well as the SCA [4], SSA [5], and WOA [7] with the same relevant parameter settings for the algorithms, and the results are shown in Table 9. It can be seen that MSHHO demonstrated the best results in solving this problem.

**Table 9.** Results of pressure vessel design problem for different algorithms.


The convergence curve and box plot of the problem are shown in Figure 11. It can be visually seen that the MSHHO performed well in terms of accuracy and stability.

**Figure 11.** Qualitative results of pressure vessel design problem.

#### *5.2. Compression Spring Design Problem*

The optimal design of the spring must achieve the minimum value of its mass within the constraints of the shear stress, surge frequency, deflection, and other relevant index criteria. The design is shown in Figure 12 and includes three design variables, namely the spring wire diameter *d*, the average spring coil diameter *D*, and the number of effective spring coils *N*. The objective function and constraints are described as follows:

$$\begin{array}{l} x = [x\_1, x\_2, x\_3] = [d, D, N] \\ \mathrm{Min}\, f(x) = (x\_3 + 2) x\_2 x\_1^2 \end{array}$$

subject to

$$\begin{array}{l} g\_1(x) = 1 - \frac{x\_2^3 x\_3}{71785 x\_1^4} \le 0\\ g\_2(x) = \frac{4 x\_2^2 - x\_1 x\_2}{12566(x\_2 x\_1^3 - x\_1^4)} + \frac{1}{5108 x\_1^2} - 1 \le 0\\ g\_3(x) = 1 - \frac{140.45 x\_1}{x\_2^2 x\_3} \le 0\\ g\_4(x) = \frac{x\_1 + x\_2}{1.5} - 1 \le 0\\ 0.05 \le x\_1 \le 2.00\\ 0.25 \le x\_2 \le 1.3\\ 2.00 \le x\_3 \le 15.0 \end{array}$$
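The spring problem can be encoded the same way; the sketch below uses the standard forms of the constraints (including the 4*x*2² numerator in *g*2 and the "−1" terms in *g*2 and *g*4), with illustrative function names:

```python
def spring_mass(x):
    """Spring mass, x = [d, D, N] = [x1, x2, x3]."""
    x1, x2, x3 = x
    return (x3 + 2) * x2 * x1 ** 2

def spring_constraints(x):
    """Return g1..g4; each must be <= 0 for a feasible design."""
    x1, x2, x3 = x
    return [
        1 - x2 ** 3 * x3 / (71785 * x1 ** 4),
        (4 * x2 ** 2 - x1 * x2) / (12566 * (x2 * x1 ** 3 - x1 ** 4))
            + 1 / (5108 * x1 ** 2) - 1,
        1 - 140.45 * x1 / (x2 ** 2 * x3),
        (x1 + x2) / 1.5 - 1,
    ]

def spring_feasible(x):
    return all(g <= 0 for g in spring_constraints(x))
```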

The problem was solved by the MSHHO and HHO algorithms [11], as well as the SCA [4], SSA [5], and WOA [7] with the same relevant parameter settings for the algorithms, and the results are shown in Table 10, while the convergence curve and box plot of the spring design problem are shown in Figure 13. It can be seen that MSHHO demonstrated the best results in solving this problem.

**Figure 12.** Spring design problem.

**Table 10.** Results of spring design problem for different algorithms.


**Figure 13.** Qualitative results of spring design problem.

#### **6. Conclusions**

In this paper, a multi-strategy improvement method was presented to address the weaknesses of the original Harris hawks optimization algorithm, such as low accuracy, slow convergence speed, and a tendency to become trapped in local optima. The improved algorithm first uses a Sobol sequence to initialize the population and improve population diversity. During the population update iterations, elite opposition-based learning is used to increase population diversity and quality. The energy update strategy of the original algorithm is improved to balance the exploration and exploitation capability of the algorithm through a nonlinear update. Finally, the Gaussian walk learning strategy is incorporated to keep the algorithm from stagnating and falling into local optima.

To validate the performance of MSHHO, 23 benchmark functions with unimodal, multimodal, and fixed-dimension characteristics from CEC2005 and 10 benchmark functions covering unimodal, simple multimodal, hybrid, and composition functions from CEC2017 were selected for comparison experiments against other algorithms, including HHO and GWO as well as the SCA, SSA, and WOA. Subsequently, comparisons with the basic HHO algorithm were made at 100 and 500 dimensions to verify the practicality of the algorithm on high-dimensional problems. The results show that the improved algorithm had good performance in terms of search accuracy, convergence speed, and stability, effectively compensating for the defects of the original algorithm. In addition, the MSHHO algorithm was applied to two engineering application problems in this paper. The experimental results show that MSHHO achieved the best results compared with the other algorithms.

In future work, we will focus on applying the MSHHO algorithm to large-scale, complex multi-objective optimization problems and practical engineering applications, such as microgrid scheduling optimization.

**Author Contributions:** Conceptualization, F.T. and J.W.; methodology, F.T.; software, F.T. and F.C.; validation, F.T., J.W., and F.C.; formal analysis, F.T.; investigation, F.T.; resources, J.W.; data curation, F.T.; writing—original draft preparation, F.T.; writing—review and editing, F.T.; visualization, F.T.; supervision, J.W.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China under grant number 61772031 and the Natural Science Foundation of Hunan Province under grant number 2020JJ4753.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

#### *Appendix A.1*

Tables A1–A3 provide specific information on the benchmark function.


**Table A1.** Information on unimodal benchmark functions.

**Table A2.** Information on multimodal benchmark functions.



**Table A3.** Information on fixed-dimension benchmark functions.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
