4.2.3. Individual Exploration

In BCE, the evolution of the NPC population usually operates under high selection pressure and therefore converges quickly. However, NPC evolution tends to converge to one or a few regions of the PF rather than to the entire PF. This leads to a lack of diversity, since some areas of the PF are never explored. Individual exploration lets NPC evolution reach these unexplored areas of the PF and thereby increases the diversity of the NPC population. It explores only some promising individuals in the PC population rather than all of them, because some PC individuals have already been well explored by the NPC population. These promising individuals have generally been eliminated by NPC evolution, are under-developed, or have not been visited by NPC evolution at all. From this point of view, the discussion focuses on two types of individuals in the PC population:


The first type consists of individuals that do not lie in the niche of any individual in the NPC population. Such individuals are far from the NPC population in the objective space and are clearly not favored by the NPC criterion. However, it is precisely these individuals that lie in areas NPC evolution has not explored. The second type consists of individuals that have a single NPC individual in their niche, which is few considering that *k* is set to 3. Such individuals are likely to lie in areas where NPC evolution is incomplete, so these promising individuals also need to be explored.

During individual exploration, the two kinds of individuals described above are first marked in the PC population and stored in set *S* (the set of individuals to be explored). The variation operation is then applied to the individuals in *S*, and all new individuals it generates are stored in set *T* (the set of new individuals generated by individual exploration) for the next PC selection. The variation operator can be chosen arbitrarily, but the number of parent individuals must be adjusted to match what the chosen operator requires.
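The marking step can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name `mark_for_exploration`, the use of Euclidean distance in the objective space, and the threshold `max_npc_neighbors=1` (at most one NPC individual in the niche) are assumptions made for the example.

```python
import numpy as np

def mark_for_exploration(pc_objs, npc_objs, radius, max_npc_neighbors=1):
    """Return indices of PC individuals whose niche contains at most
    `max_npc_neighbors` NPC individuals (hypothetical helper).

    Assumptions: objectives are comparable in scale, and the niche is a
    Euclidean ball of the given radius in the objective space.
    """
    pc_objs = np.asarray(pc_objs, dtype=float)
    npc_objs = np.asarray(npc_objs, dtype=float)
    # Pairwise distances between PC and NPC individuals, shape (|PC|, |NPC|).
    dists = np.linalg.norm(pc_objs[:, None, :] - npc_objs[None, :, :], axis=2)
    # Number of NPC individuals inside each PC individual's niche.
    counts = (dists <= radius).sum(axis=1)
    # Keep individuals with zero or one NPC neighbor (the set S).
    return np.flatnonzero(counts <= max_npc_neighbors)

pc = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
npc = [[0.0, 1.0], [0.05, 0.95], [0.5, 0.55]]
S = mark_for_exploration(pc, npc, radius=0.1)
print(S.tolist())  # indices of the PC individuals to explore
```

In this toy example the first PC individual has two NPC individuals in its niche and is skipped, while the other two are marked. The variation operator would then be applied to the marked individuals to build set *T*.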

The influence of the niche radius should also be considered. A relatively small radius may cause all individuals in the PC population to be explored, because few NPC individuals fall into each individual's niche. Conversely, a large radius may leave all individuals unexplored. Therefore, a dynamically varying radius is used, which changes with the size of the PC population.

As PC and NPC continue to evolve, more and more non-dominated individuals are produced, and the selection pressure of PC gradually decreases. PC evolution slows down once the number of newly created non-dominated individuals exceeds the remaining capacity of the PC population. In this state, less individual exploration is performed, so that the high selection pressure of NPC evolution can play a greater role. The dynamic radius of the niche is set as follows:

$$
r = \frac{N'}{N} \cdot r_0 \tag{11}
$$

where *N* represents the PC population size, *N*′ represents the size of the PC population before population maintenance, and *r*<sub>0</sub> represents the base niche radius calculated during population maintenance.
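As a small worked illustration of Equation (11), the sketch below scales a base radius by the ratio of the PC population size before maintenance to the nominal PC population size. The function name and the numeric values are hypothetical and chosen only for the example.

```python
def dynamic_niche_radius(n_before_maintenance, n, r0):
    """Scale the base niche radius r0 by N'/N (sketch of Equation (11))."""
    if n <= 0:
        raise ValueError("PC population size N must be positive")
    return (n_before_maintenance / n) * r0

# Few non-dominated individuals so far: small radius, more exploration.
early = dynamic_niche_radius(n_before_maintenance=60, n=100, r0=0.2)
# Many non-dominated individuals: larger radius, less exploration.
late = dynamic_niche_radius(n_before_maintenance=150, n=100, r0=0.2)
print(early, late)
```

A larger radius puts more NPC individuals inside each niche, so fewer PC individuals qualify for exploration, which matches the behavior described above.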

Under a fixed computational budget (function evaluations), adapting the amount of exploration to the evolutionary state of the population is necessary. On the one hand, individual exploration can make up for the lack of diversity in the NPC population. On the other hand, when convergence is lacking, more computing resources can be given to NPC evolution to accelerate convergence under higher selection pressure.

Figure 2 illustrates individual exploration on a 3-objective optimization problem. The triangle in the figure represents the Pareto front of the problem, and the points represent the distribution of individuals in the objective space. Suppose the NPC population is as shown in Figure 2a and the PC population is as shown in Figure 2b. Because of the characteristics of the NPC population, the obtained solution set may cover only part of the Pareto front. For example, the population in Figure 2a is concentrated on the left and at the top of the Pareto front, while no individuals are located on the right. The PC population, in contrast, is distributed relatively evenly around the Pareto front, but its convergence is poor (some points have not converged to the Pareto front). The role of individual exploration is to explore the promising individuals in the PC population in order to promote the diversity of the NPC population. The individuals marked in red in Figure 2b are still promising, although they have not converged to the Pareto front. By exploring these solutions, it is possible to get solutions that have never been explored in the PC population but have good diversity. After continuous individual exploration, the diversity of the PC population is also improved and finally reaches the state shown in Figure 2c. The population in Figure 2c balances convergence and diversity well, thus achieving the purpose of individual exploration.

**Figure 2.** The individual exploration process in bi-criterion evolution (BCE). (**a**) The optimal solution set obtained by non-Pareto criterion; (**b**) the optimal solution set obtained by Pareto criterion; (**c**) the optimal solution set obtained by the individual exploration.

#### *4.3. Two-Population Coevolutionary Algorithm with Dynamic Learning Strategy*

From the above description, it can be seen that these strategies have great advantages in solving many-objective optimization problems. Next, we introduce DL-TPCEA in this context.

#### 4.3.1. The Process of DL-TPCEA

Algorithm 2 gives the whole process of DL-TPCEA. Its input parameters are the population size *N*, the number of objectives *M*, and the number of function evaluations (*FEs*); its final output is the population obtained by PC evolution. First, the parameters are set (Line 1, Algorithm 2), mainly the current iteration number *gen* and the maximum iteration number *maxgen* used in Equation (5). In addition, DLS uses an *Lp*-norm-based distance for diversity maintenance, so the value of *p* is also initialized here; the inverse of the number of objectives (1/*M*) is used as the value of *p*. The parameter setting in the later step (Line 5, Algorithm 2) does the same thing.

Before BCE proceeds, the populations of both evolutionary approaches (the PC population and the NPC population) must be initialized (Lines 2–3, Algorithm 2). The NPC population is initialized by randomly generating *N* decision vectors of dimension *D* within the domain according to a normal distribution, where *D* is the number of decision variables. The PC population is then generated by applying PC selection to the NPC population, which ensures that the individuals stored in the PC population are always non-dominated.

#### **Algorithm 2** Framework of DL-TPCEA


When the algorithm begins to iterate, the individual exploration described in Section 4.2.3 is performed first (Line 6, Algorithm 2). This step checks whether the PC population contains individuals that the NPC population has not (fully) explored. If such individuals exist, they are stored in set *S* as described above. The new individuals generated by applying the variation operator to *S* are then stored in set *T*. Finally, the returned NewPC population consists of all individuals in *T*. *ExRatio* is a ratio coefficient that represents the proportion of individuals to be explored. It is calculated as follows:

$$ExRatio = \frac{Length(S)}{Length(PC)}\tag{12}$$

where *Length*(·) represents the size of a set or population. When *ExRatio* is greater than 0, there are individuals in the PC population that need to be explored. The larger *ExRatio* is, the more individuals in the PC population need to be explored. The value of *ExRatio* lies in [0, 1).

*ExRatio* is used to dynamically adjust the convergence factor of DLS (the dynamic convergence factor) when DLS is later used for environmental selection. Most of the new individuals generated by individual exploration are located in areas that NPC evolution has not explored or has not fully explored. The exploration at this iteration should therefore pay more attention to these individuals, which means that more diversity-related individuals should be selected so that NPC evolution can better explore these regions. To this end, the convergence factor is scaled down according to the value of *ExRatio* at this iteration. The detailed process is described in Section 4.3.2.

After individual exploration, the NPC population (Lines 7–9, Algorithm 2) and the PC population (Line 10, Algorithm 2) evolve in turn. First, environmental selection is carried out: individuals in the mixed population of the NewPC population and the NPC population are selected using the non-Pareto criterion and stored in the NPC population. The variation operator is then applied to the NPC population to generate the NewNPC population. Next, the individuals that perform better under the non-Pareto criterion are selected from the mixed population of the NPC population and the NewNPC population. The evolution of the PC population uses PC selection to select the non-dominated individuals in the mixed population of the original PC population, the NPC population, and the NewNPC population. All non-dominated individuals from the three populations are thus archived in the PC population. Population maintenance is performed on the PC population if necessary (when *Length*(PC) > *N*).
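The PC selection step can be illustrated with a simple pairwise dominance check. The sketch below is an assumption-laden example rather than the paper's code: it assumes minimization of all objectives, the function names are hypothetical, and population maintenance is not included.

```python
import numpy as np

def non_dominated_mask(objs):
    """Boolean mask of non-dominated rows, assuming minimization.

    Pairwise comparison, so the cost grows as O(M * N^2).
    """
    objs = np.asarray(objs, dtype=float)
    mask = np.ones(objs.shape[0], dtype=bool)
    for i in range(objs.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(objs <= objs[i], axis=1)
        better = np.any(objs < objs[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask

def pc_selection(pc, npc, new_npc):
    """Merge the three populations and keep the non-dominated individuals."""
    merged = np.vstack([pc, npc, new_npc]).astype(float)
    return merged[non_dominated_mask(merged)]

pc = [[1, 4]]
npc = [[2, 2], [3, 3]]
new_npc = [[4, 1], [2, 5]]
new_pc = pc_selection(pc, npc, new_npc)
print(new_pc.tolist())
```

Identical objective vectors do not dominate each other in this sketch, so duplicates are kept. If the result holds more than *N* individuals, population maintenance would follow.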

#### 4.3.2. Environmental Selection in NPC Evolution

The process of environmental selection in NPC evolution is shown in Algorithm 3. The environmental selection mainly uses DLS to select the NPC population. Before selection, however, the dynamic convergence factor *α'* must be set according to the evolutionary state of the current population. Because the number of individuals explored by individual exploration differs at each iteration, the value of *ExRatio* also differs. Whenever individuals need to be explored, the convergence factor *α* should be scaled down. To reflect the number of individuals to be explored, the dynamic convergence factor *α'* is calculated as follows:

$$
\alpha' = \alpha - \omega \cdot \sin\left(\frac{\text{ExRatio} \cdot \pi}{2}\right) \tag{13}
$$

where *ω* is a dynamic scaling factor and is set to 0.1. The main purpose of this setting is to prevent the convergence factor from being scaled down too much, because good convergence performance can be maintained when the convergence factor is about 0.9. Since *ExRatio* lies in [0, 1), taking *α* = 0.9 gives a dynamic convergence factor *α'* in (0.8, 0.9]. This allows DLS to remain effective even during individual exploration.
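The relation between *ExRatio* and the dynamic convergence factor can be checked numerically with a short sketch. The function names are hypothetical, and the default *α* = 0.9 is an assumption based on the value mentioned in the text; *ω* = 0.1 follows the stated setting.

```python
import math

def exploration_ratio(num_marked, pc_size):
    """ExRatio = Length(S) / Length(PC), as in Equation (12)."""
    if pc_size <= 0:
        raise ValueError("PC population must not be empty")
    return num_marked / pc_size

def dynamic_convergence_factor(ex_ratio, alpha=0.9, omega=0.1):
    """Scale the convergence factor down as more individuals are explored.

    alpha=0.9 is an assumed base value; omega=0.1 is the stated scaling factor.
    """
    return alpha - omega * math.sin(ex_ratio * math.pi / 2)

for marked in (0, 25, 50, 99):
    ratio = exploration_ratio(marked, pc_size=100)
    print(ratio, round(dynamic_convergence_factor(ratio), 4))
```

With no individuals to explore, the factor stays at its base value; as *ExRatio* approaches 1, it approaches *α* − *ω* without reaching it.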

After the predefined parameters are set, the individuals in the candidate population are selected using DLS, as shown in Algorithm 1. Here, the candidate population is generated by the BCE process rather than being a hybrid population with a parent–child relationship. In addition, the convergence factor used is the dynamic convergence factor *α'*, which is scaled according to the state of individual exploration.

#### **Algorithm 3** Environmental Selection in NPC Evolution


#### 4.3.3. The Time Complexity Analysis of DL-TPCEA

The time complexity of DL-TPCEA is determined by whichever of PC evolution and NPC evolution is more expensive. In PC selection, the time complexity of selecting non-dominated individuals from the three-part population (Line 10 of Algorithm 2) is *O*(*MN*<sup>2</sup>). The time complexity of population maintenance and of individual exploration is also *O*(*MN*<sup>2</sup>). The time complexity of PC evolution is therefore *O*(*MN*<sup>2</sup>). In NPC evolution, the time complexity of the first non-dominated sort is *O*(*N* log<sup>*M*−2</sup> *N*). Calculating the number of convergence-related and diversity-related individuals takes constant time *C*, while calculating the two indicators of the candidate solutions takes *O*(*MN*<sup>2</sup>) and *O*(*N*<sup>2</sup>), respectively. Using the indicators to select the candidate solutions takes *O*(*N*). In conclusion, the time complexity of DL-TPCEA is max{*O*(*N* log<sup>*M*−2</sup> *N*), *O*(*MN*<sup>2</sup>)}.
