*3.1. Grey Wolf Optimizer*

GWO is a population-based metaheuristic algorithm that mimics the grey wolf hunting process. Population-based and single-solution optimization algorithms differ in the number of agents used to search for the global optimum; each agent is a candidate solution. Figure 2 shows the GWO simulation. The search begins when all agents obtain random *x*, *y* values, where lower bound ≤ *x*, *y* ≤ upper bound. In each iteration, the cost function then identifies the best candidates α, β, γ among them (Figure 2a):

$$f(\overrightarrow{x}) = \sum\_{i=1}^{n} x\_i^2 \tag{1}$$
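
As a concrete illustration, the initialization and leader-selection step can be sketched in NumPy as follows; the bounds, population size, and variable names are illustrative assumptions, not values from the original study:

```python
import numpy as np

def sphere(x):
    # Cost function of Equation (1): sum of squared coordinates.
    return np.sum(x**2, axis=-1)

rng = np.random.default_rng(seed=0)
lower, upper = -5.0, 5.0       # assumed search bounds
n_agents, n_dims = 20, 2       # assumed population size; (x, y) search space

# Random initial positions with lower bound <= x, y <= upper bound
positions = rng.uniform(lower, upper, size=(n_agents, n_dims))

# Rank agents by cost; the three best become the alpha, beta, gamma leaders
order = np.argsort(sphere(positions))
alpha, beta, gamma = positions[order[0]], positions[order[1]], positions[order[2]]
```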

Every single agent needs to compute $\overrightarrow{D}\_{\alpha}$, $\overrightarrow{D}\_{\beta}$, $\overrightarrow{D}\_{\gamma}$, from which $\overrightarrow{X}\_1$, $\overrightarrow{X}\_2$, $\overrightarrow{X}\_3$ can be found. The random and adaptive vectors $\overrightarrow{A}$ and $\overrightarrow{C}$ are updated in each iteration.

$$
\overrightarrow{D}\_{\alpha} = |\overrightarrow{C}\_1 \cdot \overrightarrow{X}\_{\alpha} - \overrightarrow{X}| \\
\overrightarrow{D}\_{\beta} = |\overrightarrow{C}\_2 \cdot \overrightarrow{X}\_{\beta} - \overrightarrow{X}| \\
\overrightarrow{D}\_{\gamma} = |\overrightarrow{C}\_3 \cdot \overrightarrow{X}\_{\gamma} - \overrightarrow{X}| \\
\tag{2}
$$

$$
\overrightarrow{X}\_1 = \overrightarrow{X}\_{\alpha} - \overrightarrow{A}\_1 \cdot \overrightarrow{D}\_{\alpha}, \quad \overrightarrow{X}\_2 = \overrightarrow{X}\_{\beta} - \overrightarrow{A}\_2 \cdot \overrightarrow{D}\_{\beta}, \quad \overrightarrow{X}\_3 = \overrightarrow{X}\_{\gamma} - \overrightarrow{A}\_3 \cdot \overrightarrow{D}\_{\gamma} \tag{3}
$$

Finally, the next position of the agent is the mean of $\overrightarrow{X}\_1$, $\overrightarrow{X}\_2$, and $\overrightarrow{X}\_3$, as illustrated in Figure 2c:

$$
\overrightarrow{X}(t+1) = \frac{\overrightarrow{X}\_1 + \overrightarrow{X}\_2 + \overrightarrow{X}\_3}{3} \tag{4}
$$
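
Equations (2)–(4) together define a single agent's move toward the three leaders. A minimal sketch of this update, assuming the leader positions and the coefficient vectors $\overrightarrow{A}\_i$, $\overrightarrow{C}\_i$ are already available as NumPy arrays (the function name is our own):

```python
import numpy as np

def update_position(x, leaders, A, C):
    # x       : (n_dims,) current position of the agent
    # leaders : (3, n_dims) positions of alpha, beta, gamma
    # A, C    : (3, n_dims) coefficient vectors A_1..A_3 and C_1..C_3
    D = np.abs(C * leaders - x)   # Equation (2): distances to the leaders
    X = leaders - A * D           # Equation (3): candidates X_1, X_2, X_3
    return X.mean(axis=0)         # Equation (4): mean of the three candidates
```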

The same calculation is repeated for every search agent in each iteration.

Depending on the values of the vectors $\overrightarrow{A}$ and $\overrightarrow{C}$, also referred to as the GWO parameters, the search transitions between two phases: divergence (exploration) and convergence (exploitation) toward the optimal solution. The GWO parameters are calculated as follows:

$$
\overrightarrow{A} = 2\overrightarrow{a} \cdot \overrightarrow{r}\_1 - \overrightarrow{a}, \\
\overrightarrow{C} = 2 \cdot \overrightarrow{r}\_2,
$$

where the value of $\overrightarrow{a}$ decreases linearly from 2 to 0 through the update applied at every iteration *t*:

$$
\overrightarrow{a}\_{t+1} = \overrightarrow{a}\_{t} - \frac{a\_0}{t\_{\max}},
\tag{5}
$$

in which $a\_0 = 2$ is the initial value and $t\_{\max}$ is the maximum number of iterations, and $\overrightarrow{r}\_1$ and $\overrightarrow{r}\_2$ are random vectors with components ranging from 0 to 1.
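
Under these definitions, the parameter update can be sketched as follows; the value $a\_0 = 2$ and the linear decrease follow the description above, while the function and variable names and the choice of $t\_{\max}$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
a0, t_max, n_dims = 2.0, 100, 2    # a starts at 2 and reaches 0 at t_max

def gwo_parameters(t):
    # Closed form of the linear decrease: a_t = a0 - a0 * t / t_max
    a = a0 - a0 * t / t_max
    r1 = rng.random(n_dims)        # random vector, components in [0, 1)
    r2 = rng.random(n_dims)
    A = 2.0 * a * r1 - a           # A = 2a . r1 - a
    C = 2.0 * r2                   # C = 2 . r2
    return A, C
```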

In GWO, the parameter $\overrightarrow{A}$ governs the balance between exploration and exploitation in the search behavior. Each agent of the population diverges when $|\overrightarrow{A}| > 1$ and converges toward the α, β, γ agents when $|\overrightarrow{A}| < 1$. Figure 2 illustrates how an agent substantially changes its position between iterations *t* and *t* + 1 under the influence of the parameter $\overrightarrow{A}$, whose range shrinks as $\overrightarrow{a}$ decreases linearly.

The parameter $\overrightarrow{C}$ randomly favors exploration or exploitation without any dependence on the iteration count. This stochastic mechanism enhances the search for the optimum by letting agents reach different positions around the best solutions.
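
Putting the pieces together, a compact end-to-end sketch of the GWO loop might look as follows; the sphere cost, bounds, and parameter values are illustrative assumptions, and clipping agents to the bounds is a common implementation choice rather than part of Equations (1)–(5):

```python
import numpy as np

def gwo(cost, lower, upper, n_agents=20, n_dims=2, t_max=100, seed=0):
    # Minimal GWO loop combining Equations (1)-(5).
    rng = np.random.default_rng(seed)
    X = rng.uniform(lower, upper, size=(n_agents, n_dims))
    for t in range(t_max):
        leaders = X[np.argsort(cost(X))[:3]]         # alpha, beta, gamma
        a = 2.0 - 2.0 * t / t_max                    # linear decay of a
        for i in range(n_agents):
            A = 2.0 * a * rng.random((3, n_dims)) - a
            C = 2.0 * rng.random((3, n_dims))
            D = np.abs(C * leaders - X[i])           # Equation (2)
            X[i] = np.mean(leaders - A * D, axis=0)  # Equations (3)-(4)
        X = np.clip(X, lower, upper)                 # keep agents in bounds
    return X[np.argmin(cost(X))]

# Example: minimize the sphere function of Equation (1) in two dimensions
best = gwo(lambda x: np.sum(x**2, axis=-1), lower=-5.0, upper=5.0)
```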

In recent years, the GWO algorithm has been modified in various studies. In study [42], the authors improved the convergence speed of GWO by guiding the population with the alpha solution. In another study [43], a new operator called reflecting learning was introduced; it improves the search ability of GWO using the principle of light reflection in physics. The optimization was also enhanced in studies [44–46] by random walk strategies and Lévy flight distributions.

**Figure 2.** GWO simulation at the first iteration: (a) initialization of the grey wolf population and identification of the non-dominated agents by the sphere cost function of Equation (1); (b) calculation of the distances of each agent's position to the α, β, γ agents by Equation (2); and (c) calculation of the next agent position by Equations (3) and (4). The agent at position *x*(*t*) = 2.3531, *y*(*t*) = 2.8036 moves in the next iteration to *x*(*t* + 1) = 0.2914, *y*(*t* + 1) = 0.9727, where *t* = 1.
