2.2.3. Multi-Form Optimization

Unlike multi-task optimization, which deals with distinct self-contained tasks simultaneously, multi-form optimization is a novel concept that exploits multiple alternate formulations of a single target task [12]. As illustrated in Figure 4, instead of treating each formulation independently, the basic idea of multi-form optimization is to combine the different formulations into a single multi-task optimization algorithm [20].

**Figure 4.** An illustration of the multi-form optimization problem [30].

The challenge of multi-form optimization lies in the fact that it may often be difficult to ascertain which formulation is most suited for a particular problem at hand, given the known limits on computational resources. Alternate formulations induce different search behaviors, some of which may be more effective than others for a particular problem instance [30].

#### *2.3. Multifactorial Evolutionary Algorithm*

As a pioneering implementation of multi-task optimization, the multifactorial evolutionary algorithm (MFEA), inspired by multifactorial inheritance [35,36], has attracted increasing research interest due to its effectiveness [18]. Algorithm 1 describes the entire process of the canonical MFEA.

In the initialization phase, MFEA randomly generates a single population of *N*·*K* individuals in a unified search space (line 1). Each individual is then assigned a skill factor (see Definition 3 in Section 2.1), which indicates its most suitable task in terms of its ranking values on the different tasks, and a scalar fitness (see Definition 4 in Section 2.1), which is determined by the reciprocal of its ranking value on that most suitable task (lines 2–8).
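The ranking-based quantities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the cost-matrix layout, and the tie-breaking rule (lowest task index wins) are assumptions, and lower cost is assumed to be better.

```python
from typing import List, Tuple


def factorial_ranks(costs: List[List[float]]) -> List[List[int]]:
    """costs[i][j] is the cost of individual i on task j (lower is better).

    Returns ranks[i][j]: the 1-based position of individual i when the
    population is sorted by ascending cost on task j.
    """
    n, k = len(costs), len(costs[0])
    ranks = [[0] * k for _ in range(n)]
    for j in range(k):
        order = sorted(range(n), key=lambda i: costs[i][j])
        for position, i in enumerate(order, start=1):
            ranks[i][j] = position
    return ranks


def skill_and_fitness(costs: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (skill factor, scalar fitness) for every individual."""
    result = []
    for row in factorial_ranks(costs):
        best_rank = min(row)
        skill_factor = row.index(best_rank)  # assumption: ties go to the lowest task index
        result.append((skill_factor, 1.0 / best_rank))  # reciprocal of the best rank
    return result


# Three individuals, two tasks (hypothetical costs).
example = skill_and_fitness([[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]])
```

In this toy population, the first individual ranks best on the second task and the second individual ranks best on the first task, so both receive a scalar fitness of 1; the third individual ranks second on both tasks and receives 0.5.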

There are two key features of MFEA, called assortative mating and selective imitation, which distinguish it from traditional EAs. The assortative mating mechanism allows not only standard intra-task crossover between parents from the same task (lines 13–15) but also inter-task crossover between distinct optimization instances (lines 16–18). The intensity of knowledge transfer is controlled by a user-defined parameter called the random mating probability (*rmp*). Since mutation is essential in genetic algorithms, MFEA with mutation applied to all newly generated candidates may achieve better performance (lines 20–23). As each newly generated individual has been assigned a skill factor, it is evaluated only on the task corresponding to that skill factor (line 24). After evaluation, the whole population obtains new ranking values and thus new skill factors and scalar fitness values (lines 26–27), which are then used to select survivors for the next generation (line 28). Selective imitation is derived from the memetic concept of vertical cultural transmission and aims to reduce the computational burden by evaluating each individual on its assigned task only.
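The mating step described above can be sketched as follows. This is a hedged illustration rather than Algorithm 1 itself: uniform crossover and clipped Gaussian mutation are stand-in operators chosen for brevity, and all function names and the `sigma` parameter are hypothetical. The part taken from the text is the control flow: parents with the same skill factor always cross over, parents with different skill factors cross over with probability *rmp*, every offspring is mutated, and each offspring carries a skill factor inherited from a parent so that it is later evaluated on that task only.

```python
import random
from typing import List, Tuple

Individual = Tuple[List[float], int]  # (random-key genes in [0, 1], skill factor)


def uniform_crossover(a, b, rng):
    """Stand-in crossover (assumption): swap each gene with probability 0.5."""
    child_a, child_b = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def gaussian_mutation(genes, rng, sigma=0.05):
    """Stand-in mutation (assumption): Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genes]


def assortative_mating(pa: Individual, pb: Individual, rmp: float, rng) -> Tuple[Individual, Individual]:
    genes_a, tau_a = pa
    genes_b, tau_b = pb
    if tau_a == tau_b or rng.random() < rmp:
        # Intra-task crossover, or inter-task crossover permitted by rmp.
        child_a, child_b = uniform_crossover(genes_a, genes_b, rng)
        # Selective imitation: each child imitates one parent's skill factor.
        tau_ca = rng.choice((tau_a, tau_b))
        tau_cb = rng.choice((tau_a, tau_b))
    else:
        # No crossover: each child descends from a single parent.
        child_a, child_b = list(genes_a), list(genes_b)
        tau_ca, tau_cb = tau_a, tau_b
    # Mutation is applied to every newly generated candidate.
    return (gaussian_mutation(child_a, rng), tau_ca), (gaussian_mutation(child_b, rng), tau_cb)
```

Setting `rmp` to 0 disables inter-task crossover entirely in this sketch, while values close to 1 let genetic material flow almost freely between tasks.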


#### *2.4. Literature Review and Analysis*

After searching several major full-text databases, abstract databases, and Google Scholar, 69 articles published in peer-reviewed journals and 71 papers published in conference proceedings were collected and reviewed for this paper. The number of papers published each year is given in Table 1.

**Table 1.** The quantity of papers published each year in the past five years. The number in parentheses represents the quantity of papers published first online.


As the first paper in this field, [24] is a keynote presentation abstract published in 2016 by Springer, although the International Conference on Computational Intelligence, Cyber Security and Computational Models was held in Coimbatore, India in December 2015. Interestingly, the first journal paper [37] was received on 1 December 2015 and published online on 26 February 2016, although it appeared in the first volume of Complex & Intelligent Systems in 2015. For simplicity, both papers are counted towards 2016, as shown in Table 1.

Table 1 shows that the number of publications increased over the past five years and grew sharply in the past two. It reached 39 in 2019 and 57 in 2020, which together account for more than two thirds of the total. These results demonstrate the high research intensity and productivity in MTO, which has become a hot research topic in the evolutionary computation community.

These articles involve 277 co-authors from 12 countries, including China (184), Vietnam (19), Singapore (18), New Zealand (11), and the UK (10), as shown in Figure 5. The most prolific contributing authors in this field are summarized in Table 2. China and Singapore have clearly demonstrated great research strength in this field, and some well-known research teams have emerged from both countries. It is worth noting that these prominent scholars have some kind of academic connection (research scientist, Ph.D. candidate, co-investigator, etc.) with the pioneer of MTO, Prof. Ong. In addition, each paper was written by 4.21 co-authors on average.

These articles were published in 34 journals and 24 international conferences. The preferred journals include IEEE Transactions on Cybernetics (12), IEEE Transactions on Evolutionary Computation (12), IEEE Access (4), and Information Sciences (3), while the preferred conferences include the IEEE Congress on Evolutionary Computation (IEEE CEC) (33), the Genetic and Evolutionary Computation Conference (GECCO) (8), and the IEEE Symposium Series on Computational Intelligence (IEEE SSCI) (6). The publication distribution is evidently highly concentrated. Authors tend to publish these research results in the top journals and conferences of the evolutionary computation community in order to promote their academic reputations. Open Access journals (such as IEEE Access), meanwhile, are new options for scholars trying to seize the initiative and achieve high visibility.

**Figure 5.** Number of co-authors from different countries.


**Table 2.** The most prolific contributing authors devoted to MTO and MTEC.

As of January 31, 2021, the most cited papers are [11,12,18,21,38,39], in descending order, and the other papers were each cited fewer than 70 times. Although [18] by Gupta et al. is not the first paper published in a journal or submitted to a journal, it has been widely recognized by the evolutionary computation community. A possible reason is that it provided the algorithmic background, biological foundation, basic concepts, algorithm framework, simulation experiments, and extensive experimental results of MFEA. As a result, this paper has been cited 233 times so far and is considered the most classic paper in MTO and MTEC.

#### **3. Theoretical Analyses of Multi-Task Evolutionary Computation**

Experimentally, many success stories have surfaced in multi-task optimization scenarios in recent years, and demonstrated the superiority of multi-task evolutionary computation over traditional methods. A natural question is whether MTEC always improves convergence performance.

Following directly from Holland's schema theorem [40], the expected number of individuals in a population containing a given schema at a given generation is derived in [30] under fitness-proportionate selection, single-point crossover, and no mutation. This demonstrates the potential of MTEC, compared to conventional methods, to utilize knowledge transferred from other tasks in the multi-task environment to accelerate convergence towards high-quality schemata. Further, it was proved that the MFEA with parent-centric evolutionary operators and (*μ*, *λ*) selection can asymptotically converge to the global optimum of each constitutive task, regardless of the choice of *rmp* [41]. On the other hand, the reduction in the convergence rate of MFEA depends on the chosen *rmp*, and single-task optimization may converge faster in the worst case.
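For reference, the classical single-population form of Holland's schema theorem under the same assumptions (fitness-proportionate selection, single-point crossover, no mutation) is usually stated as below. This is the textbook bound, not the multi-task expression derived in [30]. The symbols are the conventional ones and are assumptions here: $m(H,t)$ is the number of individuals matching schema $H$ at generation $t$, $f(H,t)$ is their mean fitness, $\bar{f}(t)$ is the mean fitness of the population, $p_c$ is the crossover probability, $\delta(H)$ is the defining length of $H$, and $l$ is the chromosome length.

$$\mathbb{E}\left[m(H, t+1)\right] \geq m(H, t) \cdot \frac{f(H, t)}{\bar{f}(t)} \cdot \left[1 - p_c \, \frac{\delta(H)}{l - 1}\right]$$

Read this way, a schema whose mean fitness exceeds the population mean and whose defining length is short is expected to gain copies from one generation to the next.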

Referring to [41], Tang et al. further proved that, by aligning two subspaces, the inter-task knowledge transfer method proposed in [42] can implicitly minimize the KL divergence between two different subpopulations. In this way, low-drift inter-task knowledge transfer can be implemented.

In [43], adaptive model-based transfer (AMT) was proposed and analyzed theoretically. The theoretical result indicates that, by combining all available (source + target) probabilistic models, the gap between the underlying distributions of the parent population and the offspring population is reduced. In fact, with an increasing number of source models, the gap can in principle be made arbitrarily small. Therefore, the proposed AMT framework facilitates global convergence.

Yi et al. [44] discovered mathematically that the proposed interval dominance method has a strict transitive relation to the original method when *γ* = 0.5 and can be applied when comparing the dominance relationship between interval values.

The principal finding of [45] is that, for vehicle routing problems (VRPs), the positive knowledge transfer across tasks is strictly related to the intersection degree among the best solutions. More concretely, Osaba et al. have shown that intersection degrees greater than 11% are enough for ensuring a minimum positive activity.

Recently, Lian et al. [46] provided a novel theoretical analysis and evidence of the effectiveness of MTEC. It was proved that the upper bound on the expected running time of the proposed simple (4 + 2) MFEA on the *Jump<sub>k</sub>* function can be improved to *O*(*n*<sup>2</sup> + 2<sup>*k*</sup>), while the best upper bound for single-task optimization on the same problem is *O*(*n*<sup>*k*−1</sup>). This theoretical result indicates that MTEC is probably a promising approach for dealing with some distinct problems in the field of evolutionary computation. The proposed MFEA was further analyzed on several benchmark pseudo-Boolean functions [47]. The theoretical results show that, with a properly set *rmp*, for a group of problems with similar tasks the upper bound on the expected runtime of the (4 + 2) MFEA on the harder task can be improved to match that on the easier one, while for a group of problems with dissimilar tasks the upper bound on the expected runtime of the (4 + 2) MFEA on each task is the same as that of solving the tasks independently. This study theoretically explains why some existing MFEAs perform better than traditional EAs.
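For readers unfamiliar with the jump benchmark mentioned above, the following Python sketch gives the definition commonly used in the runtime-analysis literature. It is an assumption that [46] uses exactly this variant, and the function name `jump_k` is chosen here for illustration. Fitness rises with the number of one-bits up to *n* − *k* ones, then drops into a valley of width *k* − 1 that must be crossed to reach the all-ones optimum.

```python
from typing import Sequence


def jump_k(x: Sequence[int], k: int) -> int:
    """Jump function on a bit string x with gap parameter k (to be maximized)."""
    n = len(x)
    ones = sum(x)
    if ones <= n - k or ones == n:
        return k + ones  # slope towards the plateau edge, or the global optimum
    return n - ones      # inside the fitness valley just below the optimum


# Fitness as a function of the number of one-bits for n = 6, k = 3.
profile = [jump_k([1] * m + [0] * (6 - m), 3) for m in range(7)]
```

Escaping the valley by mutation alone requires flipping the remaining *k* zero-bits at once, which is why bounds for this function depend so strongly on *k*.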

#### **4. Basic Implementation Approaches of Multi-Task Evolutionary Computation**

Gupta and Ong [48] provided a clearer picture of the relationship between implicit genetic transfer and population diversification. The experimental results highlighted that genetic transfer is a more appropriate metaphor for explaining the success of MTEC. Da et al. [49] further considered the incorporation of gene-culture interaction to be a pivotal aspect of effective MTEC algorithms. In [50], the inheritance probability (IP) of selective imitation was first defined, and its influence on the MTEC algorithm was then studied experimentally. To alleviate the influence of IP on algorithm performance, an adaptive inheritance mechanism (AIM) was introduced to automatically adjust the IP value for different tasks at different evolutionary stages.

A natural way to solve the multi-task optimization problem is the multi-population evolution strategy, in which each subpopulation evolves and explores a separate search space independently in order to solve the corresponding task. As an example, Figure 6 depicts a multi-population evolution model for solving two tasks [51]. Following the multi-population evolution model of MTEC, the various implementation approaches proposed so far for each element are described in detail in the following subsections.

**Figure 6.** Multi-population evolution model for a simple case comprising two tasks [51].

#### *4.1. Chromosome Encoding and Decoding Scheme*

For effective EAs including MTEC, the unified individual representation scheme coupled with the decoding process is perhaps the most important ingredient, which directly affects the problem-solving process.

The canonical MFEA employs a unified representation scheme in a unified search space [18]. In particular, every variable of an individual is simply encoded by a random key between 0 and 1 [52]. For the case of continuous optimization, decoding can be achieved in a straightforward manner by linearly mapping each random key from the genotype space to the design space of the appropriate optimization task [18,38]. For instance, consider a task *T<sub>j</sub>* in which the *i*th variable is bounded in the range [*L<sub>i</sub>*, *U<sub>i</sub>*]. If the *i*th random key of a chromosome *y* takes the value *y<sub>i</sub>* ∈ [0, 1], then the decoding procedure is given by

$$x_i = L_i + (U_i - L_i) \cdot y_i. \tag{3}$$
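Equation (3) translates directly into code. The sketch below is illustrative only: the function name and argument names are hypothetical, and it additionally assumes that a task with fewer variables than the unified chromosome simply reads the leading keys, which the text above does not state.

```python
from typing import List, Sequence


def decode_random_keys(y: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> List[float]:
    """Map random keys y_i in [0, 1] to x_i = L_i + (U_i - L_i) * y_i.

    Only the first len(lower) keys are used (assumption), so a longer unified
    chromosome can be decoded for a lower-dimensional task.
    """
    return [lo + (hi - lo) * yi for yi, lo, hi in zip(y, lower, upper)]


# A 3-variable task with bounds [-5, 5], [0, 100], and [10, 20] (hypothetical values).
x = decode_random_keys([0.0, 0.5, 1.0], lower=[-5, 0, 10], upper=[5, 100, 20])
```

A key of 0 maps to the lower bound, a key of 1 to the upper bound, and intermediate keys are interpolated linearly.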

In contrast, for the case of discrete optimization (such as knapsack problem (KP), quadratic assignment problem (QAP), and capacitated vehicle routing problem (CVRP)), the chromosome decoding scheme is usually problem dependent.

However, there are two obvious limitations of using a random key representation when dealing with permutation-based combinatorial optimization problems (PCOPs) [53]. Firstly, decoding can be inefficient, since the transformation from the random key representation to the permutation is required for each fitness evaluation. Secondly, the decoding process can be highly lossy, since only information on relative order is retained. Therefore, Yuan et al. [53] introduced an elegant and effective variant, called the permutation-based unified representation, to better suit PCOPs. To encode multiple VRPs, the permutation-based representation [54,55] was also adopted [56,57]. With it, a chromosome is encoded as a giant tour represented by a sequence in which each dimension is a customer id. In addition, the extended split approach [54,55] was introduced to translate a permutation-based chromosome into a feasible routing solution.
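Both limitations of random keys can be seen in a small Python sketch. It assumes the usual sort-based decoding, in which positions are visited in ascending order of their keys; the function name is hypothetical and this is not the scheme of [53].

```python
from typing import List, Sequence


def keys_to_permutation(keys: Sequence[float]) -> List[int]:
    """Decode random keys into a permutation by sorting indices by key value."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


# Two different chromosomes decode to the same permutation, because only the
# relative order of the keys survives decoding.
perm_a = keys_to_permutation([0.7, 0.1, 0.4])
perm_b = keys_to_permutation([0.9, 0.2, 0.3])
```

The sort has to be repeated before every fitness evaluation, and the actual key magnitudes are discarded, which is the loss of information referred to above.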

Chandra et al. [58] employed a direct encoding strategy for weight representation, in which all the weights are encoded in consecutive order. Therefore, different tasks result in variable-length real-parameter chromosomes in the MTEC algorithm.

The solutions offered by genetic programming (GP) are typically represented by an expression tree [59]. In the multifactorial GP (MFGP) paradigm, a novel scalable chromosome encoding scheme, gene expression representation with automatically defined functions [60], was utilized to effectively represent multiple solutions simultaneously [61]. In particular, this encoding scheme using a fixed length of strings contains one main function and multiple automatically defined functions (ADFs). The main function gives the final output, while the ADFs represent subfunctions of the main function. The corresponding decoding scheme was also proposed in [61].

Binh et al. [62] proposed an individual encoding and decoding method in a unified search space for solving the clustered shortest-path tree (CluSPT) problem. The number of clusters of an individual is equal to the maximum number of clusters over all tasks, and the number of vertices of cluster *i* is the maximum number of vertices of cluster *i* over all tasks. Note that such individual encoding and decoding approaches can also be applied to the minimum routing cost clustered tree (CluMRCT) problem [63].

Thanh et al. [64,65] introduced the Cayley code encoding mechanism to solve clustered tree problems. The Cayley code was chosen as the solution representation for two reasons. The first is that it can encode a spanning tree more easily than other methods. The second is that it takes full advantage of existing evolutionary operators such as one-point crossover and swap-change mutation. In addition, three typical coding types in the Cayley code family were analyzed on both single-task and multi-task optimization problems.
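To illustrate why such codes combine well with standard operators, the sketch below decodes a Prüfer sequence, a classic code of this kind, into a spanning tree. The choice of the Prüfer code is an assumption made for illustration; the text above does not name the three coding types analyzed in [64,65], and the function name is hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple


def prufer_to_tree(code: Sequence[int], n: int) -> List[Tuple[int, int]]:
    """Decode a Prüfer sequence of length n - 2 over labels 0..n-1 into n - 1 tree edges."""
    if n < 2 or len(code) != n - 2:
        raise ValueError("a Prüfer sequence for n vertices has length n - 2")
    degree = [1] * n
    for v in code:
        degree[v] += 1
    # Current leaves, kept in a min-heap so the smallest label is used first.
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two vertices remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


star = prufer_to_tree([3, 3, 3], n=5)
```

Every sequence of length *n* − 2 over the vertex labels decodes to a valid spanning tree of the complete graph, so one-point crossover or swap mutation applied to the code cannot produce an infeasible tree.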

The edge-sets structure has been shown to be efficient for finding spanning trees in graphs [66]. In [67], it was used to construct optimal data aggregation trees in wireless sensor networks. Each gene represents an edge and takes a value of 0 or 1, indicating whether the edge is present in the spanning tree. In [68], a solution represented by the edge-sets representation was also built for the CluSPT problem. An individual has three properties: an ES property (edges connecting all clusters), an IE property (vertices in each cluster connecting it to other vertices of different clusters), and an LR property (roots of all clusters). In order to transform a chromosome in the unified search space into solutions for each task, the decoding scheme contains two separate parts. For the first task, a solution for the CluSPT problem is constructed from an individual in the unified search space by using its key properties, while the decoding method for the second task is the HBRGA algorithm proposed in [69]. However, this method cannot guarantee that the sub-graphs in clusters are also spanning trees, which may lead to invalid solutions. Recently, Binh and Thanh [70] introduced another method for generating random solutions that produces only valid solutions.
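The binary edge encoding and the validity concern raised above can be illustrated with a short Python sketch. It is a generic check, not the decoding procedure of [67] or [68]: the candidate edge list, the function name, and the use of a union-find structure are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def is_spanning_tree(num_vertices: int, edges: Sequence[Tuple[int, int]], genes: Sequence[int]) -> bool:
    """Check whether the edges whose gene equals 1 form a spanning tree."""
    selected = [edge for edge, gene in zip(edges, genes) if gene == 1]
    if len(selected) != num_vertices - 1:
        return False  # a spanning tree has exactly |V| - 1 edges
    parent: List[int] = list(range(num_vertices))

    def find(v: int) -> int:
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in selected:
        root_u, root_w = find(u), find(w)
        if root_u == root_w:
            return False  # this edge would close a cycle
        parent[root_u] = root_w
    return True


# Hypothetical graph with four vertices and five candidate edges.
candidate_edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
valid = is_spanning_tree(4, candidate_edges, [1, 1, 1, 0, 0])
invalid = is_spanning_tree(4, candidate_edges, [1, 1, 0, 0, 1])
```

An arbitrary bit string is not guaranteed to pass this check, which is why operators or generation methods that only produce valid trees are attractive for this representation.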

Nowadays, connectivity among communication devices in networks plays a significant role, and multi-domain networks have been designed to help resolve scalability issues. Recently, Binh et al. [71] introduced an MFEA with a new solution representation. With it, a chromosome consists of two parts in a unified search space: the first part encodes the priority of the corresponding nodes, while the second part encodes the index of edges in the solution. In addition, the corresponding decoding scheme was also proposed in [71].

Constructing optimal data aggregation trees in wireless sensor networks is an NP-hard problem for larger instances. A new MFEA was proposed to solve multiple minimum energy cost aggregation tree (MECAT) problems simultaneously [67]. The authors also presented an encoding and decoding strategy, a crossover operator, and a mutation operator enabling multifactorial evolution between instances.

For solving multiple optimization tasks of fuzzy systems, an encoding and decoding scheme was proposed in [72]. Each individual comprises multiple chromosomes, one for each fuzzy variable of the fuzzy system. Each chromosome is a sequence of genes, and each gene corresponds one-to-one to a membership function parameter of the fuzzy variable. Decoding proceeds according to the task space to be decoded: the output variable is decoded first and the input variables afterwards. For each chromosome, the first few parameters of the required length are taken and arranged in ascending order, and the results are then spliced together to obtain the decoded individual.

For solving the community detection problem and the active module identification problem simultaneously, a unified genetic representation and problem-specific decoding scheme was proposed [73]. An individual is encoded as an integer vector, with each integer representing the label of the community to which the corresponding node is assigned.
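The label-based encoding described above can be decoded into explicit communities with a few lines of Python. This sketch covers only the community-detection view and uses hypothetical names; the problem-specific decoding for active module identification in [73] is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def decode_communities(labels: Sequence[int]) -> List[List[int]]:
    """Group node indices by community label; labels[i] is the label of node i."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for node, label in enumerate(labels):
        groups[label].append(node)
    # Return the communities ordered by label for a deterministic result.
    return [groups[label] for label in sorted(groups)]


communities = decode_communities([2, 0, 2, 1, 0])
```

Note that the label values themselves carry no meaning in this sketch: relabeling the communities consistently yields the same partition.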

For semantic Web service composition, a permutation-based representation was proposed [74]. A permutation is a sequence of all the services in the repository, and each service appears exactly once in the sequence. Using a forward graph building technique [75], a DAG-based solution can easily be decoded from the above permutation-based solution.

Membership functions play an important role in mining fuzzy associations. Wang and Liaw [76] proposed a structure-based representation MFEA for mining fuzzy associations. The optimization of each membership function is treated as a single task, and the proposed method can optimize all tasks in one run. More importantly, the structure-based representation [77] can avoid illegal solutions through the transformation procedure and also reduce the number of arrangements of membership functions.

Very recently, in an evolutionary multitasking graph-based hyper-heuristic (EMHH), the chromosome of an individual is represented as a sequence of heuristics, with each bit representing a low-level heuristic [78].
