1. Introduction
In the plan for the “United Nations Decade of Ocean Science for Sustainable Development (2021–2030)”, maintaining a healthy and clean marine environment received considerable attention [1]. Moreover, there is rising concern about the accumulation of floating plastic debris in the marine environment [2]. In 2018, China’s survey of marine garbage showed that plastic debris accounted for 88.7% of floating waste on the sea, 77.5% of beach garbage, and 88.2% of submarine garbage [3]. As marine debris is difficult to degrade, its accumulation may lead to severe pollution and damage to the entire marine environment. It is particularly worth noting that almost half of the world’s plastic waste comes from packaging [1]. Most of this plastic cannot be recycled or incinerated, which wastes resources and pollutes the environment, and it is also the primary source of marine plastic waste. Therefore, removing and disposing of marine plastics is key to maintaining a clean, sustainable, and productive ocean. The magnitude and the fate of cleaning marine plastics are still open questions [2]. Reducing marine plastic waste is the shared responsibility of all countries and of every person. Stakeholders at different governance levels, including government departments, public and private enterprises, civil organizations, and individuals, need to be urged to reduce ocean plastic garbage through bold innovation and action. In recent years, many countries have attached great importance to marine debris and microplastic pollution. Finding an effective way to clean marine plastics is the key to reducing marine plastic pollution.
At first, manpower was mainly used to conduct various underwater tasks. However, using trained divers to dispose of marine plastics raises many issues, such as ensuring human safety and reaching the water depth required for the cleaning work. Moreover, by 2015, humans had produced at least 6.9 billion tons of plastic waste [4]. Due to improper disposal, less than 9% of plastic waste is recycled, 12% is incinerated, and 79% is landfilled or arbitrarily discarded into the environment. Therefore, replacing human labor with underwater tools is inevitable, especially given the rapid development of artificial intelligence. At the same time, underwater robots have been applied to various marine engineering fields, such as ship-hull cleaning [5], ship-hull inspection [6], and deep ocean mining [7]. The same type of autonomous underwater vehicle (AUV) can be applied in different scenarios when equipped with different tools. Underwater tracked vehicles (UTVs) in particular have many application scenarios, for example, a UTV equipped with a cutter bar for burying pipelines underwater [8], a UTV with a rock-crushing tool for underwater rock excavation [9], and a UTV with a ladder trench for deep ocean mining [7]. Although there are few AUVs for marine plastics cleaning, researchers continue to explore and innovate. In 2020, an Italian research team released a “lobster robot”, SILVER 2, which attracted public attention. The main purpose of this robot is to film and clean marine plastics under the sea. SILVER 2 can adapt well to various types of seabed, including seaweed, soil, rocks, and sand. It also shows good stability in the simulated water flow test [10].
With the gradual increase in the intelligence of AUVs and the complexity of underwater tasks, multiple AUVs working together perform far better than a single AUV [11]. Therefore, coordinating multiple robots to conduct large-scale cleaning tasks is necessary and is also meaningful for the governance of marine plastic pollution. The key to successfully allocating tasks to multiple AUVs is coordinating the cleaning AUVs to obtain the optimal marine plastic-cleaning utility. At the same time, cooperation among a fleet of robotic agents is necessary to improve the overall performance of any mission [12]. Coordinating multiple robots to conduct a given task is known as multi-robot task allocation (MRTA), and the robotic system is called a multi-robot system (MRS). The MRTA problem is a crucial concept in MRS. It can be modeled with two distinct sets: a set of tasks to be achieved and a set of robots capable of doing these tasks [13].
Due to the harsh underwater conditions, it is very difficult to keep even a single AUV working stably [14], let alone to communicate with the AUVs and change their allocated tasks in time. Therefore, the MRTA problem for AUVs requires reliable pre-set task allocation values. There are two main approaches to solving MRTA problems: market-based and optimization-based approaches. Several allocation methods have been designed for harsh environments. For example, the location-aided task allocation framework (LAAF) method is specially designed to balance the objectives and the individual constraints of the AUVs [15]. Similar to the LAAF method, and inspired by evolutionary game theory (EGT), a specific and novel MRTA model for AUVs is proposed here on the basis of the optimization-based approaches. The benefits of combining the MRTA model with the replicator dynamics of EGT are as follows. First, in the process of MRTA, the demand of the whole multi-robot system and the allocated tasks of the robots are constantly and mutually adjusted and improved, which means many games are in progress [16]. The agents also imitate and learn from the advanced experience of other agents to build their own methodological system, similar to the biological evolution process [17]. Second, EGT can emulate the bounded rationality of multiple robots, which means that the robots evolve through mutual learning and adjustment and eventually reach a strategic equilibrium and evolutionary stability [18]. In other words, although most AUVs cannot agree on the optimal strategy of maximizing collective reward or minimizing cost, EGT can provide a relatively stable strategy after several rounds of dynamic evolution. Third, it has been proven that, under certain conditions, if the Karush–Kuhn–Tucker (KKT) first-order conditions of constrained optimization are satisfied, the equilibrium of EGT must exist [19]. Finally, EGT has been successfully applied to several challenging problems: [20] modeled the demand-side management of a smart grid as a controlled networked evolutionary game, which minimizes the total cost of the smart grid, and [21] proposed a novel EGT-based algorithm that successfully addressed the challenges of dynamic placement of virtual machines (VMs). All in all, applying several concepts of EGT to the MRTA model is feasible and meaningful.
The contributions are threefold. First, inspired by the replicator dynamics of EGT, a novel MRTA model is constructed on the basis of the optimization-based approach. It is a continuous optimization problem, unlike the discrete, combinatorial formulation of the common MRTA model. Second, by introducing the replicator dynamics into the proposed MRTA model, the final task-allocation values reach a relatively stable state. Third, the EO algorithm is applied to the MRTA problem for the first time. The simulation results show that the EO algorithm obtains the correct values when solving the proposed MRTA problem. In addition, several conclusions are obtained regarding the applicability of the proposed model and the impact of the total cleaning tasks.
The paper is structured as follows. In Section 2, we introduce the related work, including the current situation of marine plastic pollution and the various solutions to MRTA problems. Then, in Section 3, a novel MRTA model is constructed based on the optimization-based approach and replicator dynamics. Unlike the common expression of the MRTA problem, the expression in this section is a continuous optimization problem. In addition, after the replicator dynamics are introduced, the expression includes a utility function and a stable function. The EO algorithm selected to solve the proposed model is described in Section 4. Simulation results and discussions about the applicability of the proposed model, the impacts of the total tasks, and the effectiveness of the EO algorithm are presented in detail in Section 5. Finally, conclusions are drawn in Section 6.
3. A Novel MRTA Model
To construct an MRTA model that yields optimal and stable task allocations, a distributed robotic system including $n$ AUVs and a controlling center is assumed. Inspired by the evolutionary model and population dynamics, a novel MRTA model is constructed on the basis of the optimization-based approach, where the total weight of marine plastics to be allocated is $W$ and the set $\{w_1, w_2, \ldots, w_n\}$ contains the weights of the cleaning tasks allocated to the $n$ AUV agents by the controlling center. Because these allocated weights are continuous variables, the proposed MRTA model differs from the standard MRTA model with its combinatorial formulation.
During the allocation, the sum of the elements of the set $\{w_1, w_2, \ldots, w_n\}$ is required to equal the total weight $W$. The set $\{f_1(w_1), f_2(w_2), \ldots, f_n(w_n)\}$ represents the goal functions of the corresponding AUVs in the MRS after completing their allocated tasks. In other words, they are the feedback that the AUV agents return to the controlling center after conducting their tasks. Therefore, the total feedback of cleaning all the allocated plastic tasks in this distributed MRS is the sum $\sum_{i=1}^{n} f_i(w_i)$. The relationships and interactions among the $n$ AUVs and the controlling center are shown in Figure 1.
The components of the proposed distributed system are shown in Figure 1. This distributed robotic system is composed of multiple distributed AUVs. Every distributed AUV has its own goal function in cleaning marine plastics. There is no master–slave relationship among these distributed AUVs; that is, they are equal during the task allocation of marine plastic cleaning. Moreover, they conduct the allocated cleaning tasks autonomously and individually. The global constraint for this distributed task-allocation system is that the total weight of the allocated tasks, $\sum_{i=1}^{n} w_i$, must equal $W$. The local constraint for this distributed MRS is that the task allocated to each single AUV must satisfy $w_i^{\min} \le w_i \le w_i^{\max}$. Under the above global and local constraints, this distributed task-allocation system can coordinate the AUVs to conduct the allocated tasks efficiently and harmoniously.
3.1. Formulation of the MRTA Model
After sorting out the relationship between the AUVs and the controlling center, we can translate this kind of MRTA problem into a mathematical model based on optimization. The MRTA model is formulated as follows:
$$\min \sum_{i=1}^{n} f_i(w_i) \quad \text{s.t.} \quad \sum_{i=1}^{n} w_i = W, \quad w_i^{\min} \le w_i \le w_i^{\max}, \quad i = 1, \ldots, n \tag{1}$$
This model aims to minimize the goal function $\sum_{i=1}^{n} f_i(w_i)$. To satisfy the initial requirements mentioned previously, this model is constructed under the global constraint that the sum of the $w_i$ (kg) must be $W$ (kg). Each $w_i$ (kg) should lie between its own minimum and maximum values, that is, $w_i^{\min} \le w_i \le w_i^{\max}$.
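The constrained formulation above can be illustrated with a minimal Python sketch. The linear per-AUV costs, the numerical bounds, and the helper names `is_feasible` and `total_cost` are hypothetical and chosen only for illustration; they are not taken from the paper's model.

```python
from typing import Callable, Sequence


def is_feasible(w: Sequence[float], total: float,
                w_min: Sequence[float], w_max: Sequence[float],
                tol: float = 1e-9) -> bool:
    """Check the global constraint (sum of w equals total) and the local bounds."""
    if abs(sum(w) - total) > tol:
        return False
    return all(lo - tol <= wi <= hi + tol
               for wi, lo, hi in zip(w, w_min, w_max))


def total_cost(w: Sequence[float],
               cost_fns: Sequence[Callable[[float], float]]) -> float:
    """Objective to minimize: the sum of the per-AUV goal functions."""
    return sum(f(wi) for f, wi in zip(cost_fns, w))


# Hypothetical example: three AUVs share 100 kg of marine plastics.
W = 100.0
w_min = [0.0, 0.0, 0.0]
w_max = [50.0, 60.0, 40.0]
# Assumed linear costs (cost per kg), used here only as placeholders.
cost_fns = [lambda x: 2.0 * x, lambda x: 1.5 * x, lambda x: 3.0 * x]

candidate_a = [40.0, 60.0, 0.0]   # satisfies both kinds of constraint
candidate_b = [50.0, 60.0, 10.0]  # violates the global constraint (sums to 120)
```

Any optimizer applied to the model only needs these two ingredients: a way to evaluate the objective and a way to reject or repair allocations that break the constraints.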
3.2. Cost Function
The formulation of the cost function could be the most challenging part of an optimization-based MRTA problem. The general cost function of the MRTA model for underwater cleaning can be illustrated as follows:
where
denotes the cleaning costs of a single cleaning-AUV, and
denotes the MPC tasks allocated to this cleaning-AUV. After the AUV finishes its allocated cleaning tasks, its owner receives payment from the government or the relevant companies. Therefore, the task allocation should consider both the costs and the profits of the AUVs. Then, the cleaning costs
per unit task and the cleaning profits
per unit tasks are chosen to be the other two coefficients in the general cost function. To be more specific,
is the cost by the AUV when it finishes cleaning 1 kg plastics, including electricity cost, staffing consumption, etc.,
is the relevant department of the government affords
c to AUV’s owners. That is to say, when the AUV is allocated 1 kg, the government or the relevant companies will pay
to the AUV.
In this general cost function, the relationship between
and
is linear. Besides,
and
are always consistent. However, these two properties make the linear expression impractical, for two main reasons. First, most real-world relationships are non-linear, and a linear expression is only a simplification or approximation of the underlying engineering problem. Second, many factors cause
to change with
changing. For example, due to the cleaning experience accumulated during the task conduction, when
approaches its maximum cleaning weight
, the changing rate of costs would decrease in real life. When
approaches zero, the changing rate of cost should increase. The curve of the above two behaviors is shown in
Figure 2.
Thus, we can see that changing rate of
is changeable and depends mainly on the value of
. We therefore use a logistic-type function to model this behavior, which upgrades the linear expression to the non-linear expression in Equation (3).
where we limit the value
to the range of
.
still represents the total cost of the AUV, and
and
denote the maximum value and the minimum value of
. To reduce the number of coefficients in equation,
is introduced as the only coefficient in Equation (3). It denotes the plastics-cleaning ability of the AUV regardless of the value of
, which would be calculated by the technical parameter of the AUV. Four main technical parameters of the cleaning AUVs are chosen to calculate
, which are the recharge mileage
, the maximum weight of the cleaning-plastics
, the dive depth
, and the maximum speed
. According to their importance in the marine plastics-cleaning, they are ranked in the order of
,
,
, and
. Then,
can be calculated by the following Equation (4):
In Equation (4), the set of
denotes the importance level of the corresponding index.
,
,
, and
represent the ratios of the current states to the average values.
denotes the number of cleaning-AUVs. Based on the above construction and illustration of the cost function, the MRTA model can be further translated from Equation (1) to the following expression:
3.3. Combination with Replicator Dynamics
As mentioned previously, the proposed MRTA model is inspired by the evolutionary game and population dynamics, and evolutionary game concepts are suitable for this MRTA model. Replicator dynamics, the core concept in evolutionary game theory, is defined in Definition 1 [18]. The replicator dynamics are introduced into the MRTA model to obtain a stable task allocation among the AUVs. According to the definition of replicator dynamics and the corresponding Equation (10), a specific stable function for the proposed MRTA model is constructed and shown in Equation (6):
where
reflects the changing rate of the
ith AUV, which is assigned to
cleaning tasks. Correspondingly,
represents the adaptability of the
ith AUV and
denotes the average adaptability among the
AUVs. Therefore, if the value of
is zero, it means that the
ith AUV has stopped changing its allocated tasks. That is to say, in this situation, the
ith AUV has reached a relatively stable equilibrium state. For multiple AUVs to reach a stable task allocation, the replicator dynamics of all AUVs need to be zero or very close to zero. So that the replicator dynamics can be handled by an optimization algorithm, Equation (6) is modified to Equation (7).
Based on the above work, a stable function for the proposed MRTA model is constructed as follows.
According to the mathematical expression of the stable function (8) based on the replicator dynamics, its goal is to minimize F(w) so that it approaches zero. As the cost function (5) and the stable function (8) are both to be minimized and are both positive-definite, the goal function of the proposed MRTA model is obtained by adding the two functions, as shown in Equation (9). After this combination, the multi-objective optimization becomes a single-objective optimization.
In the common optimization-based MRTA model, the objective function is just the utility function or the cost function. Therefore, compared to the common MRTA model, this novel model (9) not only satisfies the minimization of the cost, but also reaches a relatively stable state of the task allocation.
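How a cost term and a replicator-based stability term might be summed into a single objective can be sketched as follows. The exact form of the modified replicator term is an assumption here: a squared replicator rate is used so that the term is non-negative and vanishes at equilibrium, and the plain arithmetic mean is assumed for the average adaptability. All function names and numbers are hypothetical.

```python
from typing import List, Sequence


def replicator_rates(w: Sequence[float], fitness: Sequence[float]) -> List[float]:
    """Replicator-style rate per AUV: w_i * (fitness_i - average fitness).

    Assumption: the average is the arithmetic mean over the AUVs.
    """
    avg = sum(fitness) / len(fitness)
    return [wi * (fi - avg) for wi, fi in zip(w, fitness)]


def stability_term(w: Sequence[float], fitness: Sequence[float]) -> float:
    """Assumed stable function: sum of squared rates, zero when all rates vanish."""
    return sum(r * r for r in replicator_rates(w, fitness))


def combined_objective(w: Sequence[float], costs: Sequence[float],
                       fitness: Sequence[float]) -> float:
    """Single objective: total cost plus the stability term."""
    return sum(costs) + stability_term(w, fitness)


# Hypothetical two-AUV allocation with equal adaptability (already stable).
stable_value = combined_objective([40.0, 60.0], [80.0, 90.0], [1.0, 1.0])
# Same allocation with unequal adaptability (penalized by the stability term).
unstable_value = combined_objective([40.0, 60.0], [80.0, 90.0], [1.0, 2.0])
```

With equal adaptability the stability term is zero and only the cost remains, whereas unequal adaptability adds a penalty, which is the behavior the combined goal function is meant to reward and discourage, respectively.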
Definition 1. Replicator Dynamic.
The replicator dynamic is a differential equation that describes how the frequency of an adopted strategy changes in a specific population. It can be expressed by the following formula.
$$\dot{x}_i = x_i \left[ f_i(x) - \bar{f}(x) \right] \tag{10}$$
In the above formula, $x_i$ is the proportion or probability of choosing pure strategy $i$ in a population, $f_i(x)$ represents the fitness when using pure strategy $i$, and $\bar{f}(x)$ denotes the average fitness of the population.
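The behavior described by Definition 1 can be explored numerically with a short Python sketch. It assumes the standard form of the replicator dynamic, in which the rate of change of each proportion equals that proportion times the gap between its fitness and the population-average fitness, and integrates it with forward-Euler steps. The constant fitness values, the step size, and the function names are hypothetical simplifications; in general the fitness depends on the population state.

```python
from typing import List, Sequence


def replicator_step(x: Sequence[float], fitness: Sequence[float],
                    dt: float) -> List[float]:
    """One forward-Euler step of dx_i/dt = x_i * (f_i - f_bar)."""
    # Population-average fitness, weighted by the current proportions.
    f_bar = sum(xi * fi for xi, fi in zip(x, fitness))
    return [xi + dt * xi * (fi - f_bar) for xi, fi in zip(x, fitness)]


def simulate(x0: Sequence[float], fitness: Sequence[float],
             dt: float = 0.01, steps: int = 5000) -> List[float]:
    """Integrate the replicator dynamic from x0 for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, fitness, dt)
    return x


# Hypothetical constant fitness values for three pure strategies.
x_final = simulate([0.3, 0.3, 0.4], fitness=[1.0, 2.0, 1.5])
```

Because the rates sum to zero whenever the proportions sum to one, the total stays at one during the simulation, and the strategy with the highest fitness gradually takes over the population.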
6. Conclusions
Owing to the harsh communication conditions for AUVs, and inspired by evolutionary game theory and population dynamics, a novel and specific MRTA model for marine plastics cleaning is established. To obtain optimized and relatively stable task-allocation values, the goal function of this MRTA model consists of a cost function and the replicator dynamics of the AUVs. Then, to solve this MRTA problem, the EO algorithm is chosen for its good performance in other fields. This is the first time that the EO algorithm has been applied to the MRTA problem. The simulations support the following conclusions. First, the EO algorithm is able to calculate the correct and expected values in the proposed MRTA model. Second, the proposed MRTA model is applicable to multi-robot systems of different scales. Finally, an AUV with a larger cleaning ability factor r, determined by its four main technical parameters, is allocated more tasks, and its allocation increases faster as the total tasks increase.
It must be admitted that, in order to apply the replicator dynamics to the MRTA model, the constraints and the goal function in the model are set to be relatively simple. Compared with the existing literature, the difficulties of this research are as follows. First, there are no powerful underwater robots designed for cleaning marine plastics on the market. Second, considerable further study is needed to make this research practical. Future work could include constructing a more complex MRS for underwater operations with more constraints and combining the proposed MRTA model with the multi-robot path planning problem.