#### 2.2.4. Selection

The selection process of SHADE is the same as that of DE, except that the external archive must be updated during selection. If a better trial individual is generated, the original individual *x<sub>i,G</sub>* is stored in the external archive. If the external archive exceeds its capacity, a randomly chosen member of the archive is deleted.
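The archive update described above can be sketched as follows. This is a minimal illustration for a minimization problem, not the reference implementation: the function name `select_with_archive`, its parameters, and the choice to let a trial survive on ties while archiving the parent only on strict improvement are assumptions made for the example.

```python
import random


def select_with_archive(population, trials, fitness, archive,
                        archive_capacity, rng=random):
    """Greedy DE selection (minimization) that also maintains the archive.

    `archive` is modified in place. All names here are hypothetical.
    """
    next_population = []
    for parent, trial in zip(population, trials):
        f_parent, f_trial = fitness(parent), fitness(trial)
        if f_trial < f_parent:
            # A strictly better trial replaces the parent,
            # and the replaced parent is stored in the archive.
            archive.append(parent)
            next_population.append(trial)
        elif f_trial == f_parent:
            # Assumption: ties let the trial survive without archiving.
            next_population.append(trial)
        else:
            next_population.append(parent)

    # If the archive exceeds its capacity, delete random members.
    while len(archive) > archive_capacity:
        archive.pop(rng.randrange(len(archive)))
    return next_population
```

Passing a seeded `random.Random` instance as `rng` makes the random deletions reproducible.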

#### 2.2.5. Historical Memory Update

Updating the historical memory is another important operation in SHADE. The historical memories *M<sub>CR</sub>* and *M<sub>F</sub>* are initialized by Formula (3), but their contents change as the algorithm iterates. These memories store the "successful" crossover rates *CR* and scaling factors *F*, where "successful" means that the trial vector *u* is selected instead of the original vector *x* to survive to the next generation. In each generation, the successful values of *CR* and *F* are first stored in the arrays *S<sub>CR</sub>* and *S<sub>F</sub>*, respectively. After each generation, one unit of each of the historical memories *M<sub>F</sub>* and *M<sub>CR</sub>* is updated. The updated unit is specified by the index *k*, which is initialized to one and increases by one after each generation. If *k* exceeds the memory capacity *H*, it is reset to one. The following formulas are used to update the *k*-th unit of the historical memory:

$$M_{CR,k,G+1} = \begin{cases} \text{mean}_{WA}(S_{CR}) & \text{if } S_{CR} \neq \emptyset \\ M_{CR,k,G} & \text{otherwise} \end{cases} \tag{9}$$

$$M_{F,k,G+1} = \begin{cases} \text{mean}_{WL}(S_{F}) & \text{if } S_{F} \neq \emptyset \\ M_{F,k,G} & \text{otherwise} \end{cases} \tag{10}$$

If no individual in the *G*-th generation generates a better trial vector, i.e., *S<sub>F</sub>* = *S<sub>CR</sub>* = ∅, the historical memory is not updated. The weighted arithmetic mean mean<sub>*WA*</sub> and the weighted Lehmer mean mean<sub>*WL*</sub> are calculated using the following formulas, respectively:

$$
\text{mean}_{WA}(S_{CR}) = \sum_{k=1}^{|S_{CR}|} w_k \times S_{CR,k} \tag{11}
$$

$$\text{mean}_{WL}(S_F) = \frac{\sum_{k=1}^{|S_F|} w_k \times S_{F,k}^2}{\sum_{k=1}^{|S_F|} w_k \times S_{F,k}} \tag{12}$$

To improve the adaptability of the parameters, the weight vector *w* is calculated from the absolute difference between the objective function value of the trial vector and that of the original vector in the current generation *G*, as follows:

$$
w_k = \frac{\Delta f_k}{\sum_{k=1}^{|S_{CR}|} \Delta f_k} \tag{13}
$$

where Δ*f<sub>k</sub>* = |*f*(*u<sub>k,G</sub>*) − *f*(*x<sub>k,G</sub>*)| in (13).
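As a rough illustration of Formulas (9)–(13), the memory update for one generation might look like the sketch below. The function names, the 0-based index `k`, and the fallback to uniform weights when all improvements are zero are assumptions of this example; following the text, the index advances after every generation.

```python
def weighted_arithmetic_mean(values, weights):
    """Formula (11): sum of w_k * S_k."""
    return sum(w * v for w, v in zip(weights, values))


def weighted_lehmer_mean(values, weights):
    """Formula (12): sum of w_k * S_k^2 divided by sum of w_k * S_k."""
    numerator = sum(w * v * v for w, v in zip(weights, values))
    denominator = sum(w * v for w, v in zip(weights, values))
    return numerator / denominator


def update_memory(m_cr, m_f, k, s_cr, s_f, delta_f):
    """Update unit k (0-based) of both memories in place.

    s_cr, s_f: successful CR and F values of this generation.
    delta_f:   absolute fitness improvements, one per success.
    Returns the index to use in the next generation.
    """
    if s_cr and s_f:
        total = sum(delta_f)
        if total > 0:
            weights = [d / total for d in delta_f]  # Formula (13)
        else:
            # Assumption: fall back to uniform weights if no improvement.
            weights = [1.0 / len(delta_f)] * len(delta_f)
        m_cr[k] = weighted_arithmetic_mean(s_cr, weights)  # Formula (9)
        m_f[k] = weighted_lehmer_mean(s_f, weights)        # Formula (10)
    # Empty success sets leave the memory unchanged.
    return (k + 1) % len(m_cr)  # wrap around after H units
```

Because the Lehmer mean weights each *F* by its own magnitude, it is pulled toward the larger successful scaling factors compared with the arithmetic mean.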

The pseudo-code of the SHADE algorithm is shown in Algorithm 2.


#### *2.3. Linear Decrease in Population Size: L-SHADE*

In [21], a linear reduction of the population size was introduced into SHADE to improve its performance. The basic idea is to gradually reduce the population size during evolution to improve the exploitation capability. In L-SHADE, the population size is recalculated after each generation using Formula (14). If the new population size *NP<sub>new</sub>* is smaller than the previous population size *NP*, all individuals are sorted by objective function value and the worst *NP* − *NP<sub>new</sub>* individuals are removed. The size of the external archive |*A*| also decreases synchronously with the population size:

$$NP_{new} = \text{round}\left( NP_{init} - \frac{FES}{MaxFES} \times \left( NP_{init} - NP_f \right) \right) \tag{14}$$

where *NP<sub>init</sub>* and *NP<sub>f</sub>* are the initial and final population sizes, respectively, *FES* and *MaxFES* are the current and maximum numbers of fitness function evaluations, respectively, and round(·) is a rounding function.
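The reduction step in Formula (14) can be sketched as below, assuming a minimization problem. The helper names are hypothetical, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the unspecified rounding function.

```python
def next_population_size(fes, max_fes, np_init, np_final):
    """Formula (14): linear interpolation from np_init down to np_final."""
    return round(np_init - fes / max_fes * (np_init - np_final))


def shrink_population(population, fitness_values, new_size):
    """Keep the new_size best individuals (lowest objective values)."""
    order = sorted(range(len(population)), key=lambda i: fitness_values[i])
    keep = order[:new_size]
    return ([population[i] for i in keep],
            [fitness_values[i] for i in keep])
```

For example, with *NP<sub>init</sub>* = 100 and *NP<sub>f</sub>* = 4, halfway through the evaluation budget the target size is 100 − 0.5 × 96 = 52.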

#### *2.4. Weighted Mutation Strategy with Parameterization Enhancement: jSO*

The jSO [24] algorithm won the CEC2017 single-objective real parameter optimization competition [46]. It is a variant of the iL-SHADE algorithm that uses a weighted mutation strategy [47]. The iL-SHADE algorithm extends L-SHADE by initializing all units of the historical memories *M<sub>F</sub>* and *M<sub>CR</sub>* to 0.8, statically fixing the last unit of *M<sub>F</sub>* and *M<sub>CR</sub>* at 0.9, updating *M<sub>F</sub>* and *M<sub>CR</sub>* with the weighted Lehmer mean, limiting the crossover rate *CR* and scaling factor *F* in the early stage of the search, and calculating *p* for the "current-to-*p*best/1" mutation strategy as:

$$p = p_{min} + \frac{FES}{MaxFES}\left(p_{max} - p_{min}\right) \tag{15}$$

where *p<sub>min</sub>* and *p<sub>max</sub>* are the minimum and maximum values of *p*, respectively, and *FES* and *MaxFES* are the current and maximum numbers of fitness function evaluations, respectively.
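Formula (15) is a plain linear interpolation in the consumed evaluation budget, which a short sketch makes explicit. The function name `pbest_rate` is a hypothetical choice for this example.

```python
def pbest_rate(fes, max_fes, p_min, p_max):
    """Formula (15): p moves linearly from p_min (FES = 0) to p_max (FES = MaxFES)."""
    return p_min + fes / max_fes * (p_max - p_min)
```

With *p<sub>min</sub>* = 0.125 and *p<sub>max</sub>* = 0.25, the value halfway through the run is 0.125 + 0.5 × 0.125 = 0.1875.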

The jSO algorithm sets *p<sub>max</sub>* = 0.25 and *p<sub>min</sub>* = *p<sub>max</sub>*/2, the initial population size to *NP<sub>init</sub>* = 25 √*D* log *D*, and the size of the historical memory to *H* = 5. All units of *M<sub>F</sub>* and *M<sub>CR</sub>* are initialized to 0.3 and 0.8, respectively, and the weighted mutation strategy current-to-*p*best-w/1 is used:

$$
\mathbf{v}_{i,G} = \mathbf{x}_{i,G} + F_w \times \left(\mathbf{x}_{pbest,G} - \mathbf{x}_{i,G}\right) + F_i \times \left(\mathbf{x}_{r1,G} - \mathbf{x}_{r2,G}\right), \tag{16}
$$

where *Fw* is calculated as:

$$F_w = \begin{cases} 0.7 F_i, & FES < 0.2\,MaxFES, \\ 0.8 F_i, & FES < 0.4\,MaxFES, \\ 1.2 F_i, & \text{otherwise} \end{cases} \tag{17}$$
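Formulas (16) and (17) can be combined into a small sketch of the weighted mutation. The function names are hypothetical, vectors are plain Python lists, and the choice of *x<sub>pbest</sub>*, *x<sub>r1</sub>*, and *x<sub>r2</sub>* is assumed to have been made by the caller.

```python
def weighted_scale(f_i, fes, max_fes):
    """Formula (17): stage-dependent weight applied to the pbest term."""
    if fes < 0.2 * max_fes:
        return 0.7 * f_i
    if fes < 0.4 * max_fes:
        return 0.8 * f_i
    return 1.2 * f_i


def mutate_current_to_pbest_w(x_i, x_pbest, x_r1, x_r2, f_i, fes, max_fes):
    """Formula (16): current-to-pbest-w/1 mutant vector, component-wise."""
    f_w = weighted_scale(f_i, fes, max_fes)
    return [
        xi + f_w * (xp - xi) + f_i * (a - b)
        for xi, xp, a, b in zip(x_i, x_pbest, x_r1, x_r2)
    ]
```

The effect of Formula (17) is that the pull toward *x<sub>pbest</sub>* is damped during the first 40% of the evaluation budget and amplified afterwards, while the difference term keeps the unmodified *F<sub>i</sub>*.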
