**1. Introduction**

Due to its extensive application in science and engineering, global optimization is a topic of great interest nowadays. Without loss of generality, it implies the minimization of a specific objective function or fitness function [1]. Effective and common approaches for optimization problems can be broadly divided into deterministic and heuristic methods. Deterministic methods (such as linear programming and nonlinear programming) can find a global or an approximately global optimum using mathematical formulas. Generally speaking, they exploit the analytical properties of the optimization problem to generate a sequence of solutions that converges to a global optimum [2]. Heuristic methods, on the other hand, rely on random processes and thus cannot guarantee the quality of the obtained solutions. Comparatively speaking, deterministic approaches need fewer objective function evaluations than stochastic approaches to find an acceptable solution.

**Citation:** Xu, Q.; Wang, N.; Wang, L.;Li, W.; Sun, Q. Multi-Task Optimization and Multi-Task Evolutionary Computation in the Past Five Years: A Brief Review. *Mathematics* **2021**, *9*, 864. https:// doi.org/10.3390/math9080864

Academic Editor: Fabio Caraffini

Received: 22 March 2021 Accepted: 9 April 2021 Published: 14 April 2021



However, stochastic approaches have been found to be more flexible and efficient than deterministic approaches, especially for complex "black box" problems [3].

Evolutionary algorithms (EAs) are a class of population-based stochastic optimization methods inspired by the Darwinian principles of natural selection and survival of the fittest [4–8]. The algorithm starts with a population of randomly generated individuals. Then, new offspring are produced iteratively by applying evolutionary operators such as crossover and mutation, and fitter offspring survive to the next generation. The production and selection procedure terminates when a predefined condition is satisfied. Due to their simple implementation and strong search capability, in the last few decades, EAs have been successfully applied to solve a wide range of real-world optimization problems in areas such as defense and cybersecurity, biometrics and bioinformatics, finance and economics, sport, and games [9,10].
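The generate-and-select loop just described can be sketched in a few lines of Python. This is a minimal illustration rather than any specific published EA; the search bounds, uniform crossover, Gaussian mutation, and truncation selection are all assumptions made for the sketch:

```python
import random

def evolutionary_algorithm(fitness, dim, pop_size=20, generations=100,
                           mutation_rate=0.1, seed=0):
    """Minimal generational EA minimizing `fitness` over [-5, 5]^dim."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            # Crossover: uniform recombination of two random parents.
            p1, p2 = rng.sample(pop, 2)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Mutation: small Gaussian perturbation applied gene-wise.
            child = [g + rng.gauss(0, 0.3) if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        # Survival of the fittest: keep the best pop_size of parents + offspring.
        pop = sorted(pop + offspring, key=fitness)[:pop_size]
    return pop[0]

# Minimize the 3-dimensional sphere function as a toy example.
best = evolutionary_algorithm(lambda x: sum(g * g for g in x), dim=3)
```

The truncation (elitist) selection here guarantees that the best solution found so far is never lost between generations.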

Despite their great successes in science and engineering, existing EAs still have some drawbacks. One major point is that traditional EAs typically start to solve a problem from scratch, assuming a state of zero prior knowledge, and focus on solving one problem at a time [11,12]. However, it is well known that real-world problems seldom exist in isolation and are usually intertwined with each other. The knowledge extracted from past learning experiences can be constructively applied to solve more complex or newly encountered tasks.

Traditional machine learning algorithms only work well under the common assumption that the distributions of the training and test data are the same [13]. Nevertheless, the domains, tasks, and distributions may be very different in many real-world applications. In such cases, transfer learning or multitask learning between multiple source tasks and a target task is desirable. In contrast to tabula rasa learning, transfer learning in the field of machine learning can leverage a pool of available data from various source tasks to improve the learning efficacy on a related target task. The fundamental motivation for transfer learning in the machine learning community was discussed in a NIPS (Conference and Workshop on Neural Information Processing Systems) 1995 post-conference workshop on "Learning to Learn: Knowledge Consolidation and Transfer in Inductive Systems" [14]. Since 1995, it has attracted substantial scholarly attention and achieved significant success [13,15–17]. Although the notion of knowledge transfer or transfer learning has been prominent in machine learning, it has received far less attention in the evolutionary computation community. A detailed description of transfer learning in machine learning is beyond the scope of this review article, which is limited to transfer learning and multi-task learning in evolutionary computation.

As a novel paradigm, transfer optimization can facilitate automatic knowledge transfer across optimization problems [11,12]. Following from the formalization, the conceptual realizations of this paradigm are classified into three distinct categories, namely sequential transfer optimization, multi-task optimization (MTO), the main focus of this article, and multiform optimization. Note that the concept of multi-task optimization is also described using other terms, such as multifactorial optimization (MFO) [18], multitasking optimization (MTO) [19], multi-task learning (MTL) [20], multitask optimization (MTO) [11], multitasking [12], evolutionary multitasking (EMT) [21,22], and multifactorial operation optimization (MFOO) [23].

The basic concept of multi-task optimization was originally introduced by Prof. Ong [24]. In contrast to traditional EAs, which optimize only one task in a single run, the main idea of MTO is to solve multiple self-contained optimization tasks simultaneously. Due to its strong search capability and parallel nature, it has attracted great research attention since it was proposed in 2015. Nevertheless, to the best of our knowledge, no comprehensive survey of MTO, especially of its future trends and challenges, has been conducted. Thus, this article presents an attempt to fill this gap.

Up to now, no research monograph on this topic has been published, except a book chapter written by Gupta et al. [25]. The review of the literature in this paper consists of 140 articles from refereed journals and conference proceedings. The papers listed in the bibliography are drawn from the past five years. Note that dissertations [26–29] have generally not been included, although the tendency is to be inclusive when dealing with borderline cases. The main reason is that the results and key contributions in dissertations are usually collections of results previously published in journals or conference proceedings.

The remainder of this review is organized as follows. The basic definition and some easily confused concepts of MTO are introduced in Section 2. In this section, we also conduct a statistical analysis of the literature. In Section 3, a mathematical analysis of conventional multi-task evolutionary computation (MTEC) is provided, which theoretically explains why some existing MTECs perform better than traditional methods. Then, Section 4 describes some basic implementation approaches for MTEC, such as the chromosome encoding and decoding scheme, intra-population reproduction, inter-population reproduction, the balance between intra-population and inter-population reproduction, and the evaluation and selection strategy. Further, related extension issues of MTEC are summarized in Section 5. In Section 6, a review of the applications of MTEC in science and engineering is conducted. The trends and challenges for further research in this exciting field are discussed in Section 7. Finally, Section 8 is devoted to the main conclusions.

#### **2. Basic Concept of Multi-Task Optimization and Multi-Task Evolutionary Computation**

#### *2.1. Definition of Multi-Task Optimization*

Generally, the goal of multi-task optimization is to find the optimal solutions of multiple tasks in a single run. Without loss of generality, suppose there are *K* minimization tasks to be optimized simultaneously, and denote *Ti* as the *i*th minimization task to be solved. Then, an MTO problem can be mathematically represented as follows [18]:

$$\mathbf{x}_i^* = \operatorname*{argmin}_{\mathbf{x}} T_i(\mathbf{x}), \quad i = 1, 2, \cdots, K \tag{1}$$

where **x***i*<sup>∗</sup> is the optimal solution of the *i*th task *Ti*. Note that *Ti* itself could be a single-objective or a multi-objective optimization problem. A general schematic of multi-task optimization is depicted in Figure 1.

**Figure 1.** An illustration of a multi-task optimization problem [30].

To evaluate the individuals in MTO, several properties associated with every individual are defined as follows [18]:

**Definition 1 (Factorial Cost):** *The factorial cost of individual pi on task Tj is the objective value fj of potential solution pi, which is denoted as ψij*.

**Definition 2 (Factorial Rank):** *The factorial rank of pi on Tj is the rank index of pi in the list of objective values sorted in ascending order, which is denoted as rij*.

**Definition 3 (Skill Factor):** *The skill factor is defined as the index of the task assigned to an individual. The skill factor of pi is given by τi* = argmin*j*∈{1,2,...,*K*} *rij*.

**Definition 4 (Scalar Fitness):** *The scalar fitness of pi is the inverse of its best factorial rank over all tasks, which is given by ϕi* = 1/min*j*∈{1,2,...,*K*} *rij*.

Herein, the skill factor is regarded as the cultural trait which can be inherited from its parents in MTO. The scalar fitness is used as the unified performance criterion in a multi-task framework.
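Definitions 1–4 can be illustrated with a short sketch. The two task functions and the four individuals below are hypothetical, chosen only so the computed values are easy to follow by hand:

```python
# Hypothetical tasks: in practice each T_j is a self-contained problem.
tasks = [
    lambda x: sum(g * g for g in x),           # T_1: sphere
    lambda x: sum((g - 2.0) ** 2 for g in x),  # T_2: shifted sphere
]
population = [[0.1, 0.2], [1.9, 2.1], [1.0, 1.0], [3.0, 3.0]]
n, K = len(population), len(tasks)

# Definition 1: factorial cost psi[i][j] of individual p_i on task T_j.
psi = [[T(p) for T in tasks] for p in population]

# Definition 2: factorial rank r[i][j] = 1-based rank of p_i when the
# population is sorted by factorial cost on T_j in ascending order.
r = [[0] * K for _ in range(n)]
for j in range(K):
    order = sorted(range(n), key=lambda i: psi[i][j])
    for rank, i in enumerate(order, start=1):
        r[i][j] = rank

# Definition 3: skill factor tau_i = index of the task on which p_i ranks best.
tau = [min(range(K), key=lambda j: r[i][j]) for i in range(n)]

# Definition 4: scalar fitness phi_i = inverse of the best factorial rank.
phi = [1.0 / min(r[i]) for i in range(n)]
```

For this population, the first individual is best on *T*1 and the second on *T*2, so both receive a scalar fitness of 1, while the remaining two individuals receive 1/2 and 1/3, respectively.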

#### *2.2. Confusing Concepts of MTO*

As an emerging paradigm in the evolutionary computation community, multi-task optimization is easily confused with other optimization concepts, which are outlined and distinguished in this section.

#### 2.2.1. Multi-Objective Optimization (MOO)

In a real-world scenario, a decision maker in the general case has to simultaneously account for multiple disparate or even contradictory criteria while selecting a particular plan of action. Mathematically, a multi-objective optimization problem can be formulated as follows:

$$\min F(\mathbf{x}) = \left(f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_m(\mathbf{x})\right)^T \tag{2}$$

where *x* is the decision variable vector. Typically, no single solution can minimize all the objectives simultaneously due to conflicts among them. Thus, the main purpose of an MOO problem is to obtain an optimal solution set, called the Pareto solution set, with good convergence and diversity.

In the literature, multi-objective evolutionary algorithms (MOEAs) that are commonly used today can be classified into three categories [31]: (a) dominance-based MOEAs, such as NSGA-II [32], (b) indicator-based MOEAs, such as HypE [33], and (c) decomposition-based MOEAs, such as MOEA/D [34].

Although MOO and MTO problems both involve the optimization of multiple objective functions, they are two distinct optimization paradigms. MOO focuses on efficiently resolving conflicts among competing objective functions within one task. As a result, solving an MOO problem typically yields a Pareto solution set that provides the best trade-offs among all objective functions. In contrast, MTO aims to leverage the implicit parallelism of a population-based search to seek out the optimal solutions of two or more tasks simultaneously. Therefore, the output of an MTO problem contains two or more optimal solutions, one corresponding to each task.

To further exhibit the distinction between MOO and MTO, we refer to their population distributions in Figure 2. Imagine a scenario in which you plan to buy a cheap and fine table in a furniture store. The problem you face is a multi-objective optimization problem. Based on the definition of Pareto optimality, the individuals {*p*2, *p*3, *p*4, *p*5} are incomparable to each other and are better than the individuals {*p*1, *p*6} in Figure 2a. As a result, the output of this MOO problem is the Pareto optimal solution set {*p*2, *p*3, *p*4, *p*5}, and you can buy any table from this set based on personal preference.

**Figure 2.** Population distribution for multi-objective optimization (MOO) and multi-task optimization (MTO) problems. (**a**) Multi-objective optimization problem finding a cheap and fine table. (**b**) Multi-task optimization problem finding a cheap table and a cheap chair concurrently.

In contrast, you may plan to buy the cheapest table and the cheapest chair at once, which is a typical multi-task optimization problem. In Figure 2b, the individuals {*p*1, *p*2} are the cheapest chairs, and the individuals {*p*5, *p*6} are the cheapest tables in this furniture store. Thus, the output of this MTO problem is two optimal solution sets, {*p*1, *p*2} and {*p*5, *p*6}, and you can buy any one table from {*p*5, *p*6} and any one chair from {*p*1, *p*2}.
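The two furniture examples can be made concrete with a short sketch. All of the price and quality numbers below are hypothetical, chosen only to mimic the qualitative layout of Figure 2:

```python
# MOO view (cf. Figure 2a): one task, two conflicting objectives.
# Hypothetical (price, quality) pairs; lower price and higher quality are better.
items = {"p1": (120, 2), "p2": (100, 3), "p3": (150, 5),
         "p4": (200, 7), "p5": (250, 9), "p6": (300, 8)}

def dominates(a, b):
    """True if a Pareto-dominates b: no worse in both objectives
    (minimize price, maximize quality) and strictly better in one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

# The MOO output is the set of non-dominated (Pareto-optimal) items.
pareto_set = sorted(k for k, v in items.items()
                    if not any(dominates(w, v) for w in items.values()))

# MTO view (cf. Figure 2b): two tasks, one objective each -> one optimal
# set per task. Prices are again hypothetical.
table_prices = {"t1": 250, "t2": 250, "t3": 400}
chair_prices = {"c1": 30, "c2": 30, "c3": 80}
best_tables = [k for k, v in table_prices.items()
               if v == min(table_prices.values())]
best_chairs = [k for k, v in chair_prices.items()
               if v == min(chair_prices.values())]
```

With these numbers, the MOO view returns the non-dominated set {p2, p3, p4, p5}, whereas the MTO view returns one cheapest set per task.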

#### 2.2.2. Sequential Transfer Optimization

The search process of many existing EAs typically begins from scratch, assuming a state of zero prior knowledge. However, a great deal of knowledge from past exercises can be exploited across similar search spaces to improve algorithm performance. For instance, an engineering team designing a turbine for an aircraft engine would use, as a reference, past designs that have been successful and modify them accordingly to suit the current application [20].

Mathematically, we make the strict assumption that while tackling task *TK*, the tasks *T*1, *T*2, ..., *TK*−1 have already been addressed, with the extracted information available in the knowledge base *M* [12]. Herein, *TK* acts as the target optimization task, while *T*1, *T*2, ..., *TK*−1 are the source tasks. As illustrated in Figure 3, the objective of sequential transfer optimization is to improve the search on the target task using knowledge from the source tasks.

**Figure 3.** An illustration of a sequential transfer optimization problem [12].
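One simple realization of this idea is to seed part of the target task's initial population with elite solutions stored in the knowledge base *M*. The sketch below is a generic transfer heuristic, not a specific published algorithm; the search bounds and the `transfer_ratio` parameter are assumptions made for illustration:

```python
import random

def seeded_init(knowledge_base, pop_size, dim, transfer_ratio=0.25, seed=0):
    """Build an initial population for the target task: a fraction is
    copied from elite source-task solutions (the knowledge base M),
    and the remainder is sampled uniformly at random from [-5, 5]^dim."""
    rng = random.Random(seed)
    n_transfer = min(int(pop_size * transfer_ratio), len(knowledge_base))
    pop = [list(s) for s in knowledge_base[:n_transfer]]
    while len(pop) < pop_size:
        pop.append([rng.uniform(-5, 5) for _ in range(dim)])
    return pop

# M: elite solutions extracted from source tasks T_1 .. T_{K-1} (hypothetical).
M = [[0.9, 1.1], [1.2, 0.8]]
population = seeded_init(M, pop_size=8, dim=2)
```

The transferred individuals bias the early search toward regions that were promising for the source tasks, while the random remainder preserves diversity in case the tasks turn out to be unrelated.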
