*5.1. Algorithm Framework*

Hashimoto et al. [103] first showed that MFEA can be viewed as a special island model and then implemented a simple MTEC framework under the standard island model, as illustrated in Figure 7. Note that it is essentially an explicit multi-population structure, in which knowledge transfer across tasks is achieved through periodic migration.
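The periodic migration step of such an island model can be sketched as follows. The replacement policy (each island's best individuals overwrite every other island's worst) and the minimization setting are illustrative assumptions, not the exact scheme of [103]:

```python
def migrate(islands, fitnesses, n_migrants=2):
    """Periodic migration step of an island-model MTEC sketch.

    islands   : list of populations, one per task (individuals are floats here)
    fitnesses : list of per-task fitness functions (lower is better)
    The best n_migrants of each island replace the worst individuals
    of every other island.
    """
    # Collect migrants from all islands before modifying any population.
    bests = [sorted(pop, key=fit)[:n_migrants]
             for pop, fit in zip(islands, fitnesses)]
    for i, (pop, fit) in enumerate(zip(islands, fitnesses)):
        incoming = [ind for j, b in enumerate(bests) if j != i for ind in b]
        pop.sort(key=fit)                 # best individuals first
        pop[-len(incoming):] = incoming   # worst individuals are replaced
    return islands
```

In a full algorithm this step would run every few generations, between ordinary per-island evolution steps.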

**Figure 7.** An illustration of the MTEC framework under the standard island model [103].

A multi-population evolution framework (MPEF) was also established for MTO, as shown in Figure 8, wherein each population addresses its own optimization task and genetic material transfer with the other populations can be implemented and controlled in an effective manner [82,83]. Moreover, by adaptively adjusting the random mating probability, the framework effectively encourages positive knowledge transfer while avoiding negative knowledge transfer.

**Figure 8.** An illustration of the multi-population evolution framework (MPEF) [83].

Liu et al. [86] proposed an efficient surrogate-assisted multi-task memetic algorithm (SaM-MA) for solving MTO problems. In the proposed method, the population is divided into multiple sub-populations, with each sub-population focusing on solving one task. In addition, a Gaussian process surrogate model is used to predict the best solution, reducing the number of fitness evaluations and improving search efficiency.

In order to isolate the information of each task, a lightweight multi-population framework was developed, in which each population corresponds to a single task [131]. In the proposed framework, depicted in Figure 9, inter-task knowledge transfer (individual immigration) is employed to generate the offspring, and the successful individuals (those generated from inter-task crossover that survive into the next generation) can replace the inferior individuals of the corresponding task.

**Figure 9.** An illustration of the multipopulation technique for multitask optimization [131].

Besides this, research articles [84,90,100] also proposed MTEC algorithms based on the multi-population framework, in which the number of populations equals the number of tasks to be optimized and each population concentrates on solving a specific task.

In order to clearly understand the focuses and differences of existing and potential works on MTEC, Jin et al. [132] proposed a general multitasking DE (MTDE) framework, which contains three major components, i.e., the DE solver, knowledge transfer, and knowledge reuse. As illustrated in Figure 10, knowledge transfer is defined as the combined processes of transferring knowledge out and in, and knowledge reuse as the process of utilizing knowledge selected from the archive. In addition, two DE-specific knowledge reuse strategies were also studied in [132]: the base-vector-based strategy and the differential-vector-based strategy.
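A base-vector reuse strategy of this flavor can be sketched in a few lines: with some probability, the base vector of a DE/rand/1 mutation is drawn from an archive of solutions transferred from other tasks. The archive handling, the parameter `p_reuse`, and the function name are assumptions for illustration, not the exact MTDE design of [132]:

```python
import random

def mutate_with_reuse(pop, archive, F=0.5, p_reuse=0.3):
    """DE/rand/1 mutation with base-vector knowledge reuse (sketch).

    pop     : list of solutions (lists of floats) of the current task
    archive : solutions transferred in from other tasks
    With probability p_reuse the base vector comes from the archive;
    otherwise it is a random member of the current population.
    """
    mutants = []
    for _ in pop:
        if archive and random.random() < p_reuse:
            base = random.choice(archive)   # reuse transferred knowledge
        else:
            base = random.choice(pop)
        r1, r2 = random.sample(pop, 2)
        mutants.append([b + F * (a - c) for b, a, c in zip(base, r1, r2)])
    return mutants
```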

**Figure 10.** An illustration of multitasking DE (MTDE) framework [132].

Inspired by the cluster-based search feature of brain storm optimization (BSO), a brain storm multi-task problems solver (BSMTPS) framework was proposed by dividing individuals into several groups [99]. As illustrated in Figure 11, the offspring are generated by the internal brain storm (IBS) and the cross-task brain storm (CBS), achieving knowledge transfer within a single task and across different tasks, respectively. Zheng et al. [98] also employed the clustering technique to cluster similar solutions into one group. In this way, it can avoid knowledge transfer between dissimilar tasks and speed up the solving process.

**Figure 11.** An illustration of the brain storm multi-task problems solver (BSMTPS) framework [99].

MFEA adopts a simple inter-task knowledge transfer with randomness and tends to suffer from excessive diversity, thereby resulting in a slow convergence speed. To deal with this issue, a two-level transfer learning framework was proposed for MTO [133]. Particularly, the upper level performs inter-task knowledge transfer via crossover and exploits the knowledge of elite individuals to enhance the efficiency and effectiveness of genetic transfer. The lower level is an intra-task knowledge transfer, which transmits beneficial information from one dimension to other dimensions to improve the exploration ability of the proposed algorithm. As a result, the two levels cooperate with each other in a mutually beneficial fashion.

In order to accelerate convergence and improve solution accuracy, Xie et al. [134] introduced a hybrid algorithm combining MFEA and PSO, in which the PSO step is applied to the intermediate population in each generation after the genetic operations of MFEA. Furthermore, an adaptive variation adjustment factor was proposed to dynamically adjust the velocity of each particle and prevent the convergence velocity from becoming too fast.

#### *5.2. Similarity Measure between Tasks*

Some researchers have focused on analyzing and measuring task relatedness [135]. As a pioneering work in [136], the similarity between tasks for MFEA was measured from three different perspectives, i.e., the distance between best solutions, the fitness rank correlation, and the fitness landscape analysis.

Based on a correlation analysis of the objective function landscapes of distinct tasks, Gupta et al. [137] presented a synergy metric (*ξ*) for capturing and quantifying a promising mode of complementarity between distinct optimization tasks. The metric can explain when and why the notion of implicit genetic transfer of MTEC algorithms may lead to performance enhancements.

For classification tasks, the relatedness between tasks is estimated by comparing their most appropriate patterns [138]. Nguyen et al. [138] proposed a multiple-XOF system, which can dynamically guide feature transfer among learning classifier systems. The proposed method improves the learning performance of individual tasks when they are related, and reduces harmful signals from other tasks when they are not supportive of a target task.

#### *5.3. Many-Task Optimization Problem*

Until now, the existing MTEC approaches have mainly focused on solving two optimization tasks simultaneously, and few works have been developed for solving many-task optimization (MaTO) problems. The work [139] in 2016 is the first attempt to demonstrate the feasibility of MTEC for real-world problems with more than two tasks. In an MaTO environment, a natural idea of knowledge exchange is to select the best-matching individuals from all tasks [122,123]. When the number of tasks to be optimized exceeds two, this approach becomes time-consuming; to avoid it, it is important to choose the most suitable task (or assisted task) to be paired with the present task (or target task) for effective knowledge transfer. The problem of recommending an internal source task has been considered an open challenge in the MaTO context [140].

In [102], a roulette method based on the measured similarity of each task pair was used to select the source task. In this way, a task with high similarity to the target task has a high chance of being selected, which reduces the harm of negative transfer because mainly useful knowledge is transferred.
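Such a similarity-proportional roulette wheel can be sketched as follows; the similarity measure itself is problem-specific, and the `rng` parameter is an assumption added here to make the selection testable:

```python
import random

def roulette_pick_source(similarities, rng=random.random):
    """Roulette-wheel selection of a source task (sketch, in the spirit
    of [102]): similarities[k] is the similarity of candidate source
    task k to the target task; higher similarity means a proportionally
    higher chance of being selected."""
    total = sum(similarities)
    r = rng() * total
    acc = 0.0
    for k, s in enumerate(similarities):
        acc += s
        if r <= acc:
            return k
    return len(similarities) - 1  # numerical safety fallback
```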

An adaptive mechanism for choosing suitable tasks was also proposed by simultaneously considering the similarity between tasks and the accumulated rewards of knowledge transfer during evolution [141]. Based on reliable archives storing sufficiently many individuals, the similarity between different tasks is measured by the Kullback–Leibler divergence. Inspired by the idea of reinforcement learning, a reward system was further developed in the proposed framework. Finally, the most likely beneficial task is identified, and knowledge is transferred via a new crossover method.

As task similarity may not capture the useful knowledge between tasks, instead of using similarity measures for task selection, Shang et al. [142] proposed a task selection approach based on credit assignment to conduct positive knowledge transfer. This approach selects the appropriate task according to how well the solutions transferred from different tasks have performed along the evolutionary search process. The probability of selecting task *Tj* for task *Ti* is defined by:

$$SP\_j = \frac{W\_{ij}}{\sum\_{j=1}^{K} W\_{ij}} \tag{27}$$

where the element $W\_{ij}$ indicates how useful task *Tj* is for helping task *Ti*. In addition, the task assigned to individual *xi* is selected by the task selective probability $p\_i^k$ defined by [95]:

$$p\_i^k = \frac{\exp(a \cdot q\_i^k)}{\sum\_{k=1}^{K} \exp(a \cdot q\_i^k)}\tag{28}$$

where $q\_i^k$ is the degree to which individual *xi* can handle task *Tk*, which is defined by

$$q\_i^k = \frac{N - r\_i^k + 1}{\sum\_{k=1}^{K} (N - r\_i^k + 1)}\tag{29}$$

where $r\_i^k$ is the rank of individual *xi* in task *Tk*.
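Equations (27)–(29) can be computed directly. The sketch below assumes plain Python lists for the credit matrix *W* and the per-task rank vector; function and parameter names are illustrative:

```python
import math

def transfer_probabilities(W, i):
    """Equation (27): probability of selecting each task T_j to help
    task T_i, where W[i][j] scores how useful T_j has been for T_i."""
    row = W[i]
    total = sum(row)
    return [w / total for w in row]

def task_selective_probabilities(ranks_i, N, a=1.0):
    """Equations (28)-(29): softmax over the ability degrees q_i^k of
    individual x_i, derived from its rank r_i^k (1 = best) among the
    N individuals evaluated on each task T_k."""
    q = [N - r + 1 for r in ranks_i]
    denom = sum(q)
    q = [v / denom for v in q]            # Equation (29)
    e = [math.exp(a * v) for v in q]
    return [v / sum(e) for v in e]        # Equation (28)
```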

Moreover, Tang et al. [130] proposed a group-based MFEA by clustering the similar tasks (tasks with near global optima) and dispersing the dissimilar tasks. More importantly, the genetic materials can only be transferred within the same groups so that negative genetic transfers are eliminated.

Recently, Bali et al. [79] further utilized an RMP matrix in place of the scalar parameter *rmp* to effectively control many-task genetic transfers online. It offers the distinct advantage of adapting the extent of knowledge transmission between diverse task pairs with possibly nonuniform inter-task similarities.

#### *5.4. Decision Variable Translation Strategy*

For MTO problems, the optimal solutions of all constituent tasks tend to lie in different locations of the unified search space. Within the range between the optimal solutions of different tasks, the objective functions may trend in different directions. As a result, the effectiveness of knowledge transfer and sharing in MTEC may degrade or even become negative in this case. The main purpose of the decision variable translation strategy is to map the optimal solutions of all tasks to the center point of the unified search space so that the growth trends of all tasks become similar, facilitating knowledge transfer during the optimization process [39,143,144].

In generalized MFEA (G-MFEA), each individual in the population was translated to a new location according to Equations (30) and (31):

$$op\_i = p\_i + d\_k \tag{30}$$

$$d\_k = sf \cdot \alpha \cdot (cp - m\_k) \tag{31}$$

where *pi* and *opi* (*i* = 1, 2, ... , *Np*) are the *i*th solution and the corresponding transformed solution in the unified search space, respectively, *Np* is the population size, and the translation value *dk* is estimated based on the promising solutions of the *k*th task. Furthermore, *mk* is the estimated optimum, determined by calculating the mean value of the *μ* percent best solutions of the *k*th task.
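Equations (30) and (31) can be sketched directly. Here the *μ*-percent best solutions of task *k* are passed in pre-selected, and *cp* is taken as 0.5 in every dimension of a unified [0, 1] space; these are assumptions for illustration:

```python
def translate_population(pop, best_of_task_k, sf=1.0, alpha=1.0, cp=0.5):
    """G-MFEA translation sketch, Equations (30)-(31).

    pop            : list of solutions (lists of floats) in [0, 1]^D
    best_of_task_k : the mu-percent best solutions of task k, whose mean
                     m_k estimates the task optimum
    Every solution is shifted by d_k = sf * alpha * (cp - m_k).
    """
    D = len(pop[0])
    m_k = [sum(s[j] for s in best_of_task_k) / len(best_of_task_k)
           for j in range(D)]
    d_k = [sf * alpha * (cp - m) for m in m_k]       # Equation (31)
    return [[p[j] + d_k[j] for j in range(D)] for p in pop]  # Equation (30)
```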

Note that the translated direction and distance are both fixed for all individuals. Unfortunately, it is easy for individuals to go beyond the legal range, and manual repair is then required to ensure their legality. As a result, the original population distribution is inevitably destroyed. With this in mind, a novel variable transformation strategy and the corresponding inverse transformation were defined as Equations (32) and (33), respectively [143,144]:

$$op\_{ij} = \begin{cases} \frac{cp\_j}{m\_j} \cdot p\_{ij}, & p\_{ij} \le m\_j\\ \frac{cp\_j - 1}{m\_j - 1} \cdot p\_{ij} + \frac{m\_j - cp\_j}{m\_j - 1}, & p\_{ij} > m\_j \end{cases}, \quad j = 1, 2, \dots, D \tag{32}$$

$$p\_{ij} = \begin{cases} \frac{m\_j}{cp\_j} \cdot op\_{ij}, & op\_{ij} \le cp\_j\\ \frac{m\_j - 1}{cp\_j - 1} \cdot op\_{ij} + \frac{cp\_j - m\_j}{cp\_j - 1}, & op\_{ij} > cp\_j \end{cases}, \quad j = 1, 2, \dots, D\tag{33}$$

where *cp* = (0.5, 0.5, ... , 0.5) is the center point of the unified search space, *pi* = {*pi*1, *pi*2, ... , *piD*} is the *i*th solution in the original unified search space, and *opi* = {*opi*1, *opi*2, ... , *opiD*} is the corresponding *i*th solution in the transformed unified search space. Furthermore, *m* is the estimated optimal solution, which can be calculated as the mean value of the top *μ*·*Np* best solutions in the current generation.
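The piecewise-linear map of Equation (32) sends the estimated optimum *mj* to the center *cpj* while keeping the endpoints 0 and 1 fixed, and Equation (33) is its exact inverse, so individuals never leave the legal range. A minimal sketch over a unified [0, 1] space:

```python
def transform(p, m, cp=0.5):
    """Equation (32): piecewise-linear map sending the estimated
    optimum m_j to the center cp_j of the unified [0, 1] space."""
    return [cp / mj * pj if pj <= mj
            else (cp - 1) / (mj - 1) * pj + (mj - cp) / (mj - 1)
            for pj, mj in zip(p, m)]

def inverse_transform(op, m, cp=0.5):
    """Equation (33): the exact inverse of Equation (32)."""
    return [mj / cp * oj if oj <= cp
            else (mj - 1) / (cp - 1) * oj + (cp - mj) / (cp - 1)
            for oj, mj in zip(op, m)]
```

A quick check confirms the round trip: transforming a solution and applying the inverse recovers it exactly, and the estimated optimum maps to 0.5.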

#### *5.5. Decision Variable Shuffling Strategy*

When the decision-space dimensionalities of the tasks in an MTO problem differ, a good solution for a low-dimensional task may be poor or incomplete for a high-dimensional task, and the trailing decision variables of a solution are never used by the low-dimensional tasks. Thus, the canonical MFEA is inefficient for MTO problems in this particular case.

To address this issue, a decision variable shuffling strategy was introduced [39]. To be specific, this strategy first randomly changes the order of the decision variables of individuals with small dimensions to give each variable an opportunity for knowledge transfer between two tasks. Then, the decision variables of individuals for the small dimensional task that are not in use are replaced with those of individuals for the large dimensional task to ensure the quality of the transferred knowledge.
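The two steps described above can be sketched as follows; working in a unified space of the larger dimensionality and filling exactly the trailing unused dimensions are assumptions for illustration, not the full strategy of [39]:

```python
import random

def shuffle_and_fill(small_ind, large_ind, d_small, rng=None):
    """Decision-variable shuffling sketch in the spirit of [39].

    small_ind : individual of the small-dimensional task (length D_max
                in the unified space; only the first d_small entries used)
    large_ind : individual of the large-dimensional task (length D_max)
    Step 1: randomly permute the d_small active variables, giving each
            of them a chance to take part in inter-task transfer.
    Step 2: fill the unused trailing dimensions with the corresponding
            variables of the large-dimensional individual.
    """
    rng = rng or random.Random()
    head = list(small_ind[:d_small])
    rng.shuffle(head)
    return head + list(large_ind[d_small:])
```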

Zhang and Jiang [145] systematically analyzed the defects of MFEA in dealing with heterogeneous MTO problems, and proposed the concepts of harmful transfer and defective parents. Then, hetero-dimensional assortative mating and self-adaptive elite replacement were proposed to overcome these issues. On six hetero-dimensional MTO problems, the proposed algorithm performed better than the other algorithms.

Generally speaking, the order of the decision variables has no significant influence on single-task EAs. In contrast, the situation is significantly different for MTEC, in which the optimization process of one task more or less influences the optimization processes of the other tasks. Wang et al. analyzed the influence of the order of the decision variables on single-task optimization (STO) and MTO problems, respectively. In addition, three orderings of decision variables were proposed in [146,147]: full reverse order, bisection reverse order, and trisection reverse order. An important feature of these orderings is that an individual recovers its original form after the reordering is applied twice.
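The involution property (applying the reordering twice recovers the original individual) is easy to verify for the first two orderings. The exact split points are an assumption of this sketch; trisection reverse order follows the same pattern with three segments:

```python
def full_reverse(x):
    """Full reverse order of the decision variables [146,147]."""
    return x[::-1]

def bisection_reverse(x):
    """Bisection reverse order (sketch): reverse each half of the
    variable vector independently, so the operator is its own inverse."""
    mid = len(x) // 2
    return x[:mid][::-1] + x[mid:][::-1]
```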

#### *5.6. Adaptive Operator Selection Strategy*

It has been found that different crossover operators have different capabilities for solving optimization problems. Therefore, an appropriate configuration of the crossover is necessary for robust search performance in MFEA. Zhou et al. [148] first investigated how different types of crossover operators affect knowledge transfer in MFEA on both single-objective optimization (SOO) and MOO problems. As an efficient and robust MTEC, a new MFEA with adaptive knowledge transfer (MFEA-AKT) was further proposed, in which the crossover operator employed for knowledge transfer across tasks is self-adapted based on the information collected along the evolutionary search process.

In DE, a mutant vector is obtained by perturbing a base vector with several weighted difference vectors via a certain mutation strategy. Applying different mutation operators to the current population can generate different search directions and offspring populations. Multiple commonly used mutation strategies (DE/rand/1, DE/best/1, DE/current-to-rand/1, DE/current-to-best/1, DE/rand/2, DE/best/2, and DE/best/1 + *ρ*) were investigated to accelerate the convergence speed in [23,115,149], where DE/best/1 + *ρ* is defined as follows:

$$\mathbf{v}\_{i}^{k} = \mathbf{x}\_{\text{best}}^{k} + F\_{i}(\mathbf{x}\_{r1}^{k} - \mathbf{x}\_{r2}^{k}) + F\_{i} \left(\frac{gen}{Gmax}\right)^{\rho} (\mathbf{x}\_{r3}^{k} - \mathbf{x}\_{r4}^{k}). \tag{34}$$

In the proposed mutation strategy, the value of *ρ* varies from 0 to 1. Its rationale is that the currently found best solution is adequately utilized to guide the search to promising areas in the early phase, while an increasing perturbation is subsequently integrated for diverse exploration [149]. Note that the suitable mutation strategy is selected randomly in [115], or adaptively, according to the success rates in previous generations, in [23].
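Equation (34) can be sketched as a single mutation step; the index handling (distinct random indices drawn without the usual exclusion of the target index) and the minimization setting are simplifying assumptions:

```python
import random

def de_best_1_rho(pop, fitness, F, gen, gen_max, rho):
    """One DE/best/1 + rho mutant per Equation (34), as a sketch.

    Early on, gen/gen_max is small, so the second difference vector is
    damped and the best solution dominates; its weight grows with the
    generation counter, adding perturbation for later exploration.
    """
    best = min(pop, key=fitness)
    r1, r2, r3, r4 = random.sample(pop, 4)
    w = F * (gen / gen_max) ** rho
    return [b + F * (a - c) + w * (d - e)
            for b, a, c, d, e in zip(best, r1, r2, r3, r4)]
```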

#### *5.7. Multi-Task Optimization under Uncertainties*

Optimization problems often have different kinds of uncertainties in practice due to the influence of subjective and objective factors [150,151]. Specifically, the objective and constraint functions across tasks usually contain uncertain variables [152].

The MFEA algorithm was extended to solve the interval MTO problem under uncertainty conditions [44]. In the proposed method, an interval crowding distance based on shape evaluation is calculated to evaluate the interval solutions more comprehensively. In addition, an interval dominance relationship based on the evolutionary state is designed to obtain the interval confidence level, which considers the difference of average convergence levels and the relative size of the potential possibility between individuals.

#### *5.8. Hyper-Heuristic Multi-Task Evolutionary Computation*

Instead of searching directly in the solution space like conventional meta-heuristics, hyper-heuristics work at the higher-level search space of a set of low-level heuristics [153,154]. The goal of hyper-heuristics is to solve the problem at hand by selecting existing low-level heuristics or generating new low-level heuristics.

Although hyper-heuristics search in heuristics space, their current paradigms still focus on solving isolated optimization problems independently. To integrate the advantages of MTEC and hyper-heuristics effectively, Hao et al. [78] proposed a unified framework of the evolutionary multi-task graph-based hyper-heuristic (EMHH). Note that, in EMHH, the concept of MTEC and graph heuristics are used as the high-level search methodology and low-level heuristics, respectively. It has been evaluated on examination timetabling and graph coloring problems and the experimental results demonstrate the effectiveness and efficiency of the proposed framework.

#### *5.9. Auxiliary Task Construction*

The distinctive performance of MTEC algorithms greatly depends on the similarity of the tasks in the MTO problem. These methods may fail when no prior knowledge of the task correlations is available or even no related tasks exist. Therefore, constructing an auxiliary, related task for the main task is essential to improving the performance of the evolutionary search [155,156].

As the first attempt in this direction, Da et al. [80] solved a complex traveling salesman problem (TSP) in conjunction with a closely related (but artificially generated) multi-objective optimization task in a multi-task setting. The motivation behind the proposal is that the associated MOO task can often act as a helper task which aids the search process of the original problem by leveraging implicit genetic transfer. Specifically, the MOO task is formulated by decomposing the original TSP into two distinct sub-tours.

Similarly, the vehicle routing problem with time windows (VRPTW) was modeled as a two-task problem in [157], i.e., a MOO version (main task) and a single-objective version (auxiliary task). The auxiliary task provides inspiration for the creation of bone routes and semi-finished product solutions, which together speed up algorithm convergence by using these illegal solutions in the search process.

Feng et al. [111] proposed an evolutionary multitasking assisted random embedding method (EMT-RE) for solving the large-scale optimization problem. Besides the original problem, several low-dimensional auxiliary tasks are constructed by random embedding to assist target optimization in a multi-task scenario.
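The core of a random embedding is a fixed random matrix that lifts a low-dimensional auxiliary point into the original high-dimensional space, so the auxiliary task can be searched cheaply. The Gaussian entries and function names below are assumptions of this sketch, not the exact EMT-RE construction of [111]:

```python
import random

def make_embedding(d_low, d_high, rng=None):
    """Random-embedding sketch in the spirit of EMT-RE [111]: a fixed
    random matrix A maps a low-dimensional auxiliary point y to a
    candidate x = A y in the original d_high-dimensional space."""
    rng = rng or random.Random(0)
    A = [[rng.gauss(0.0, 1.0) for _ in range(d_low)]
         for _ in range(d_high)]

    def embed(y):
        # Matrix-vector product A y, one output component per row.
        return [sum(a * yj for a, yj in zip(row, y)) for row in A]

    return embed
```

Because `A` is fixed, the map is linear: evaluating the original objective at `embed(y)` turns the low-dimensional auxiliary task into a cheap proxy for the high-dimensional one.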

For a given MOO problem, each single-objective problem naturally shares great similarity with it [158]. Therefore, the optimization processes on these single-objective functions can generate useful knowledge to enhance the problem-solving process on the target MOO problem. Huang et al. [158] treated each single-objective problem as a separate task domain and then discussed the detailed designs for building the dynamic domain mapping and conducting knowledge transfer from multiple single-objective problems to the multi-objective problem.

In industrial production, excessive process data are generated and collected, even leading to information overload. These data are predicted by models with different precision. In [119], the operational indices optimization was first established based on an accurate model (a multilayer perceptron) and two assistant models (a first-order polynomial regression model and a second-order polynomial regression model). Note that the assistant models are used alternately with the accurate model in the multi-task environment to realize good knowledge transfer from the assistant models to the accurate model.

Inspired by the idea of the weight function, Zheng et al. [159] introduced a new additional helper task to accelerate the convergence of the main task in a multi-task scenario. As expected, the proposed method promotes positive inter-task knowledge transfer by adding possibly similar tasks.

#### **6. Applications of Multi-Task Evolutionary Computation**

Since the first establishment of MFEA, a number of MTEC algorithms have been proposed and successfully applied in many benchmark problems and real-world problems over the past few years, as summarized in Table 3.


**Table 3.** Application domains of MTEC algorithms in the past five years.



#### *6.1. Benchmark Problems*

#### 6.1.1. Continuous Optimization Problem

Evolutionary algorithms often lose their effectiveness and efficiency when applied to large-scale optimization problems. Feng et al. [111] presented a preliminary study on solving large-scale optimization problems (up to 2000 dimensions) via the evolutionary multi-task assisted random embedding method.

EAs are not well suited to solving computationally expensive optimization problems, in which the evaluation of candidate solutions requires time-consuming numerical simulations or expensive physical experiments. Ding et al. [39] extended the basic MFEA to handle expensive optimization problems by transferring knowledge from multiple computationally cheap tasks to computationally expensive tasks. Similarly, a multi-surrogate based approach was adopted that regards two surrogates as two related tasks [163]. The global surrogate model (expensive) is trained using all available data, and the local surrogate model (cheap) is trained using only a subset selected from the sorted data.

A bi-level optimization problem (BLOP) is defined in the sense that one optimization task (the lower level problem) is nested within another (the upper level problem), which together comprise a pair of objective functions [181]. A multi-task bi-level evolutionary algorithm (M-BLEA) was provided as a promising paradigm to promote solving the upper level problem [37]. In M-BLEA, multiple lower level optimization tasks were to be appropriately solved during every generation of the upper level optimization, thereby facilitating the exploitation of underlying commonalities among them.

Although the original MFEA was designed for SOO problems [18], the idea of knowledge transfer or sharing across constitutive tasks also holds for MOO problems. As a pioneer in multi-objective MTO, Gupta et al. [38] first extended the MFEA framework to the MOO domain. As a key element, a meaningful order of preference among candidate solutions in different tasks was proposed. Notice that, for ordering individuals in a population, the binary preference relationship between two individuals satisfies the properties of irreflexivity, asymmetry, and transitivity [38].

Inspired by the division approach, Mo et al. [162] proposed a decomposition-based multi-objective multi-factorial evolutionary algorithm (MFEA/D-M2M). It adopts the M2M approach to decompose the MOO problem into multiple constrained sub-problems in order to enhance population diversity. Note that a mating pool is also constructed to ensure genetic transfer across different sub-problems.

Yang et al. [120] presented the TMO-MFEA algorithm, in which decision variables were divided into two types, namely, diversity variables and convergence variables. The knowledge transfer on diversity variables is intensified to obtain evenly distributed solutions over the Pareto front (PF), whereas the knowledge transfer on convergence variables is restrained to maintain the convergence of the solution population toward the PF.

In MFEA based on decomposition strategy (MFEA/D), through multiple sets of weight vectors, each multi-objective task was decomposed into a series of SOO subtasks optimized with an independent population [161].

Recently, Ruan et al. [182] investigated when and how knowledge transfer works or fails in dynamic multi-objective optimization. Empirically, knowledge transfer works poorly on problems with a fixed Pareto optimal set and under small environmental changes. In addition, the Gaussian kernel function used is not always adequate for knowledge transfer.

#### 6.1.2. Discrete Optimization Problem

As a preliminary attempt, several NP-hard combinatorial problems were efficiently solved within the MTEC framework, such as the knapsack problem (KP) [18], Sudoku puzzles [48], the traveling salesman problem (TSP) [56], the quadratic assignment problem (QAP) [56], the linear ordering problem (LOP) [56], the job-shop scheduling problem (JSP) [56], vehicle routing problems (VRPs) [53], and the deceptive trap function (DTF) [164].

Recently, Feng et al. [57] presented a generalized variant of the vehicle routing problem with occasional drivers (VRPOD), namely, the vehicle routing problem with heterogeneous capacity, time window, and occasional driver (VRPHTO), which takes the capacity heterogeneity and time windows of vehicles into consideration. To illustrate its benefit, 56 new VRPHTO instances were generated based on existing common vehicle routing benchmarks. In addition, the stochastic team orienteering problem with time windows (TOPTW) models the trip design problem under more realistic settings by incorporating uncertainties. In [167], a new MTEC approach based on the island model was developed to effectively enable knowledge sharing and transfer across search spaces.

The CluSTP problem has been solved by MFEA with new genetic operators [62,64]. In [62], the main idea of the novel genetic operators is to first construct a spanning tree for the smallest sub-graph and then build the spanning tree for a larger sub-graph based on the spanning tree of the smaller one. Thanh et al. [64] also proposed genetic operators based on the Cayley code. Tran et al. [63] proposed an MTEC algorithm to solve multiple instances of the minimum routing cost clustered tree problem (CluMRCT) together. Crossover and mutation operators were designed to create valid solutions, and a new method of evaluating a CluMRCT solution was also introduced to reduce resource consumption. More recently, Thanh et al. [68,70] further presented a novel MFEA algorithm for the CluSPT problem. Its notable feature is that the proposed MFEA has two tasks: the goal of the first task is to find as fit a solution as possible for the original problem, while the goal of the second is to determine the best tree spanning all vertices of the problem.

Rauniyar et al. [166] put forward an MFEA based on NSGA-II to solve the pollution-routing problem (PRP). The authors considered a PRP formulation with two conflicting objectives: minimization of fuel consumption and minimization of total travel distance.

In the literature, the n-bit parity problem is used to demonstrate the effectiveness and superiority of particular neural network architectures, training algorithms, or neuroevolution methods. Chandra et al. [58] presented an evolutionary multi-task learning (EMTL) approach for feedforward neural networks that evolves modular network topologies for the n-bit parity problem.
