Auction-Based Behavior Tree Evolution for Heterogeneous Multi-Agent Systems

Wen, Shanghua; Wu, Wendi; Li, Ning; Wang, Ji; Yang, Shaowu; Ben, Chi; Yang, Wenjing

doi:10.3390/app14177896


by Shanghua Wen †, Wendi Wu †, Ning Li *, Ji Wang, Shaowu Yang, Chi Ben and Wenjing Yang

College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Appl. Sci. 2024, 14(17), 7896; https://doi.org/10.3390/app14177896

Submission received: 30 July 2024 / Revised: 20 August 2024 / Accepted: 22 August 2024 / Published: 5 September 2024

(This article belongs to the Special Issue Automatic Control of Multi-agent Systems)


Abstract:

Collaboration in Multi-Agent Systems (MASs) is crucial but challenging in robotics, especially in heterogeneous MASs where robots have different capabilities. The key issue in current research on collaboration in MASs is how to fully utilize the capabilities of heterogeneous agents. To address this issue, we propose Auction-Based Behavior Tree Evolution (ABTE), a novel two-layer framework designed to learn behavior trees (BTs) for heterogeneous MASs. In the first layer, which we call the command layer, robots receive their tasks through an auction algorithm enhanced by our three-way handshaking communication protocol, which is embedded in the BT implementation and ensures more efficient task allocation. The second layer of ABTE defines the specific execution behaviors of agents and is therefore named the execution layer. The behaviors in this layer are automatically generated by Grammatical Evolution (GE), which has been proven to be a general and effective method for generating swarm BTs. Our experiments are conducted within a Disaster Rescue Scenario, which requires intricate collaboration among multiple robots with diverse capabilities. The results indicate that ABTE outperforms the baseline algorithm, GEESE, in terms of resource utilization. Moreover, it demonstrates robust effectiveness in covering high-priority tasks, thereby validating the efficacy of employing an auction algorithm for generating BTs tailored for heterogeneous MASs.

Keywords:

multi-agent systems; task allocation; hierarchical behavior tree; grammatical evolution; three-way handshaking communication protocol

1. Introduction

Heterogeneous Multi-Agent Systems (MASs) are clusters of heterogeneous unmanned platforms such as drones, unmanned boats, and industrial robots. Such swarms are able to exhibit more intricate forms of swarm intelligence and therefore display extensive potential for application across both military and civilian sectors. Behavior trees (BTs) are directed rooted trees whose nodes can be divided into root nodes, control flow nodes, and execution nodes. Execution in a BT starts from the root node, which generates signals or control flows, often called ticks, at a particular frequency [1]. The control flow nodes of a BT mainly include Sequence, Selector, Parallel, and Decorator, and the execution nodes mainly include Condition and Action [2]. In the common representation of BT models, a sequence node is represented by a box with an arrow, while a selector node is represented by a box with a question mark. As for execution nodes, condition nodes are usually represented by ellipses, while action nodes are typically represented by rectangles. Due to their advantages of modularity, generality, and reactivity over various other controlled hybrid systems, BTs have become a popular control architecture in MASs in recent years [3]. The behaviors controlled by BTs arise from switching between a set of tasks based on changing observed input signals [4]. This allows BTs to react to environment changes by using frequent ticks to activate behaviors at runtime.
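The tick semantics described above can be illustrated with a minimal sketch. The class names (`Status`, `Leaf`, `Sequence`, `Selector`) and the example leaves are illustrative assumptions, not the implementation used in this work. Parallel and Decorator nodes are omitted, and RUNNING is simply passed upward.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class Leaf:
    """Execution node wrapping a callable; used for both Condition and Action."""

    def __init__(self, name: str, fn: Callable[[], Status]):
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return self.fn()


class Sequence:
    """Ticks children left to right; stops at the first child that is not SUCCESS."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status
        return Status.SUCCESS


class Selector:
    """Ticks children left to right; stops at the first child that is not FAILURE."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE


# Hypothetical example: avoid an obstacle if blocked, otherwise move to the task.
blocked = {"value": True}
log: List[str] = []


def is_blocked() -> Status:
    return Status.SUCCESS if blocked["value"] else Status.FAILURE


def avoid() -> Status:
    log.append("avoid")
    return Status.SUCCESS


def move() -> Status:
    log.append("move")
    return Status.SUCCESS


tree = Selector([
    Sequence([Leaf("IsBlocked?", is_blocked), Leaf("AvoidObstacle", avoid)]),
    Leaf("MoveToTask", move),
])

first = tree.tick()        # blocked: the Sequence succeeds and "avoid" runs
blocked["value"] = False   # the observed input changes between ticks
second = tree.tick()       # not blocked: the Sequence fails and "move" runs
```

Because the tree is re-evaluated from the root on every tick, the change in the observed condition between the two ticks is enough to switch the active behavior.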

Although BTs have been demonstrated as an effective policy representation for various robotic applications, manual coding of BTs remains a time-consuming task that heavily relies on expert knowledge. To address the need for robots to adapt to unexpected environments, several approaches, including reinforcement learning, imitation learning, and evolutionary algorithms, have been explored [5]. In particular, Genetic Programming (GP) and Grammatical Evolution (GE) are commonly used in research. These algorithms measure the fitness of each population member by some specified metric(s), aiming to improve the performance of population members over multiple generations [6]. For example, Iovino et al. [7] used GP to generate BTs and control robots to carry objects in an unknown environment, proving that a learned BT is tolerant to faults in manipulation, localization, and navigation actions. Neupane et al. [8] designed three common tasks for swarms and verified the effectiveness of learning swarm BTs using GE. These biomimetic swarm-based methods effectively solve complex collective problems but often overlook the matching of tasks to heterogeneous robot capabilities. With the advancement of robotics, heterogeneous MASs have been applied to a wider range of scenarios. Therefore, it is necessary to explore a method that lets heterogeneous MASs cooperate based on their respective capabilities to perform tasks efficiently. To achieve this goal, two main challenges need addressing: first, establishing a cooperative mechanism that considers the varying capabilities, and second, designing an autonomous generation method for BTs so that robots can adapt to the variation of tasks.

The solution explored in this paper is mainly targeted at the scenario where heterogeneous MASs need to perform multiple tasks, so it is necessary to propose a communication protocol for multi-agent collaboration. Although heterogeneity increases computing costs, its application in many scenarios promotes research on task allocation algorithms for heterogeneous MASs [9]. There are two kinds of architectures for task allocation problems: A centralized architecture is able to find optimal solutions without the consensus stage but cannot adapt to the changes in the environment and robots, resulting in poor scalability and robustness. In contrast, decentralized architecture is robust and has lower computing overhead, but it usually results in a sub-optimal solution [10]. In our design, we combine the advantages of centralized and decentralized methods. Each robot is able to become the central coordinator once the tasks are allocated. In terms of the task allocation algorithm, we adopt the auction algorithm, which is a commonly used market-based approach to assign tasks [11]. Inspired by economics, this kind of algorithm carries out multi-robot task allocation through an auction-like process [12]. Singhal et al. [13] first used a centralized auction algorithm to solve the multi-robot task allocation problem and proposed a polynomial-time auction algorithm for single-task allocation. Auction-based methods can utilize the information and preferences of robots to obtain effective solutions in resource-limited situations [14]. In addition, such methods support dynamic task allocation and are highly inclusive of changes in the environment.

Therefore, this paper aims to study the autonomous planning and collaboration of heterogeneous MASs in environments that involve a great deal of task variety, focusing on building a BT framework to enhance the autonomy of systems. In this framework, all available information about robots and the environment is considered comprehensively in the autonomous generation of behaviors, and a novel three-way handshaking communication protocol for task allocation is integrated. As the command layer in the framework, the communication protocol consists of a sequence of fixed behaviors and standardized package formats, which are capable of controlling the generation and initiation of behaviors at the execution layer. The overall framework is shown in Figure 1.

The contribution of this study lies in providing a general solution for heterogeneous MASs to plan collaboratively and adapt autonomously in multi-task environments. The details are as follows:

We propose Auction-Based Behavior Tree Evolution (ABTE), a novel hierarchical behavior control framework that consists of a command layer and an execution layer. This framework makes full use of the modularity of behavior tree models by layering command and execution behaviors. Command behaviors remain fixed, handling task allocation and controlling the generation of execution behaviors. Execution behaviors are generated by the GE algorithm after task allocation, which is designed to improve the completion rate and execution efficiency of MAS.
We propose a three-way handshaking communication protocol based on the SSI auction algorithm. This communication protocol, which comprises a set of auction behaviors and standard data formats, is a crucial component of the command layer. With this protocol, the command layer can fully utilize the information of the environment and robots to allocate tasks, thereby maximizing the strengths of heterogeneous robots.
We design simulation experiments in the Disaster Rescue Scenario, where there are a large number of tasks with different priorities. Our framework aims to complete as many high-priority tasks as possible while consuming less energy. The results prove that ABTE, compared to GEESE, can achieve higher resource utilization in heterogeneous MASs and display robust effectiveness in covering high-priority tasks.

In the ensuing section, an innovative three-way handshaking communication protocol based on auction algorithms is discussed, which constructs a task allocation architecture among agents. Section 3 proposes Auction-Based Behavior Tree Evolution (ABTE), a hierarchical behavioral control framework used for combining fixed behaviors with generated behaviors. The results of the experiments are presented in Section 4. Finally, the main conclusions are summarized and future lines of research are proposed in Section 5.

2. Three-Way Handshaking Communication Protocol

In this section, we propose a three-way handshaking communication protocol for collaboration and task planning in MAS. We first formulate the information of the environment and robots as the standard formats of communication data. Then, we propose a sequence of three-way handshaking actions based on the SSI auction algorithm and transform them into nodes of the BT. This kind of BT ultimately forms the command layer in the ABTE framework.

2.1. The Formats of Auction Information

To enable efficient cooperation for heterogeneous MASs in multi-task environments, we adopt an auction-based three-way handshaking communication protocol. The fixed information formats specified in this protocol are described as follows. Let $R = \{R_1, R_2, \ldots, R_n\}$ represent a set of $n$ robots. Each robot $R_i$ ($i \in [1, n]$) is modeled by a set of state information $S_i = \{V, C, L, E, Tk\}$. In order to bid for tasks, bidders are required to report their $S_i$ to the auctioneer. $V$ and $C$ are two constant attributes that represent the traveling speed of robots and their capability to execute different tasks, respectively. The location and energy level of robots during task execution are recorded by $L$ and $E$, and $x$ tasks $Tk = \{Tk_1, Tk_2, \ldots, Tk_x\}$ are allocated to robots. Each task $Tk_m$ ($m \in [1, x]$) consists of attributes $Tk_m = \{l, t, p, \tau, \mu, \sigma\}$, and after the bidders have received the allocation result $Tk$, they are ready to execute these tasks. Among these attributes, $l$ is the coordinate of a task in a two-dimensional plane. Tasks of different types $t$, such as Excavation, Communication, Transportation, and Search, have different attributes. A task is assigned a priority $p$, which is reflected in the time constraint $\tau$. The last two attributes $\mu$ and $\sigma$ represent the energy consumption per unit of time and the total completion level required for a task, respectively; thus, the energy and time that a robot $R_i$ requires to execute a task $Tk_m$ depend on the capability $C_{im}$.
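As a minimal sketch, the two information formats could be held in plain data classes. The field names mirror the symbols above; the class names, the per-type capability dictionary, and the speed and energy values in the example are assumptions made for illustration only. The capability, priority, and task constants follow the settings reported in Section 4.1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Task:
    l: Tuple[float, float]  # location in the two-dimensional plane
    t: str                  # task type, e.g. "Excavation"
    p: int                  # priority
    tau: float              # time constraint
    mu: float               # energy consumption per unit of time
    sigma: float            # total completion level required


@dataclass
class RobotState:
    V: float                 # traveling speed (constant)
    C: Dict[str, float]      # capability per task type (constant)
    L: Tuple[float, float]   # current location
    E: float                 # current energy level
    Tk: List[Task] = field(default_factory=list)  # tasks allocated so far


# Hypothetical excavation specialist: full capability 10 in one type, half in others.
r1 = RobotState(
    V=1.0,          # assumed speed
    C={"Excavation": 10, "Communication": 5, "Transportation": 5, "Search": 5},
    L=(50, 50),
    E=10000.0,      # assumed initial energy
)
task = Task(l=(60, 70), t="Excavation", p=4, tau=60, mu=100, sigma=100)

# Execution time depends on the capability matching the task type.
execution_time = task.sigma / r1.C[task.t]
```

Keeping `Tk` inside the robot state reflects the protocol: the same structure that a bidder reports to the auctioneer is later updated with the allocation result.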

2.2. Task Allocation Architecture

To enhance the robustness of heterogeneous MASs, we implement a distributed auction architecture in which each robot can take on the role of auctioneer or bidder. Task allocation begins when a robot becomes the auctioneer upon receiving task information, prompting the other robots to calculate their capability and bid. Task Instruction is the first handshake: the auctioneer shares the task details and then enters a blocking state to wait for bids. The auctioneer unblocks when it has received bids from all bidders or when the waiting time exceeds the time limit. During Capability Instruction, the second handshake, bidders calculate their bids from the task information and their own states, employing diverse algorithms and normalizing the results. Task Appointment is the third handshake, during which the auctioneer calculates the auction results and informs all bidders. After the auction finishes, robots that have received tasks evolve BTs in parallel, replace their original BTs with the evolved BTs, and then continue to run the BTs to execute tasks. The procedure is shown in Figure 2.
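The blocking and time-limit behavior of the three handshakes can be sketched with threads and queues. This is only an illustration under simplifying assumptions: the function names, the string task description, and the fixed bid values are hypothetical, the auctioneer does not bid itself, and one bidder is made to stay silent so that the time limit takes effect.

```python
import queue
import threading
import time
from typing import Dict, Optional


def collect_bids(inbox: queue.Queue, n_bidders: int, time_limit: float) -> Dict[str, float]:
    """Block until all bidders have answered or the time limit expires."""
    deadline = time.monotonic() + time_limit
    bids: Dict[str, float] = {}
    while len(bids) < n_bidders:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            name, value = inbox.get(timeout=remaining)
        except queue.Empty:
            break
        bids[name] = value
    return bids


def bidder(name: str, bid: Optional[float], task_box: queue.Queue,
           bid_box: queue.Queue, result_box: queue.Queue, seen: dict) -> None:
    task = task_box.get()                   # first handshake: Task Instruction
    if bid is not None:
        bid_box.put((name, bid))            # second handshake: Capability Instruction
    seen[name] = (task, result_box.get())   # third handshake: Task Appointment


bids_to_send = {"R2": 2.24, "R3": 1.59, "R4": None}  # R4 never answers
bid_box: queue.Queue = queue.Queue()
task_boxes = {n: queue.Queue() for n in bids_to_send}
result_boxes = {n: queue.Queue() for n in bids_to_send}
seen: dict = {}

threads = [
    threading.Thread(target=bidder,
                     args=(n, b, task_boxes[n], bid_box, result_boxes[n], seen))
    for n, b in bids_to_send.items()
]
for th in threads:
    th.start()

# Auctioneer side: notify, wait for bids (bounded), announce the winner.
for box in task_boxes.values():
    box.put("Excavation@(60,70)")
bids = collect_bids(bid_box, len(bids_to_send), time_limit=0.2)
winner = max(bids, key=bids.get)
for box in result_boxes.values():
    box.put(winner)
for th in threads:
    th.join()
```

The bounded wait in `collect_bids` is what keeps the auction from stalling when a robot is unreachable; the silent bidder still learns the result in the third handshake.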

In applications of MASs, auction algorithms like single-round combinatorial (SRC) auction, parallel single-item (PSI) auction, and sequential single-item (SSI) auction [15] are typically used for task allocation. In these auction algorithms, SSI auctions integrate the advantages of both SRC auctions and PSI auctions. They are much easier to implement than SRC auctions since the central auctioneer receives exponentially less information and does not need to solve an NP-hard problem, and the team performance of the coordinate system based on SSI auctions is empirically close to optimal and much better than PSI auctions. Additionally, in comparison to SRC auctions, SSI auctions encounter challenges in attaining the optimal solution for total travel distance. However, in the scenarios primarily discussed in this paper, the energy of the robot is mainly consumed during task execution rather than movement. Therefore, we take SSI auctions as the basic algorithm of the task allocation architecture in this paper. Our SSI-based algorithm starts with unassigned tasks. Robots bid on each task and the highest bidder wins the corresponding task. This repeats until all tasks are assigned. To calculate bid value, the bidders simulate their task execution processes respectively, considering path cost $P_c$, time cost $T_c$, and energy consumption $E_c$. The path cost of the robot moving to the task is described in terms of the Manhattan distance, as follows:

$$P_c = |l(x) - L(x)| + |l(y) - L(y)| \tag{1}$$

Time cost consists of movement and execution, as follows:

$$T_c = \frac{P_c}{V} + \frac{\sigma}{C_k} \tag{2}$$

Similar to time cost, energy consumption consists of movement and execution, as follows:

$$E_c = f \times P_c + \mu \times \frac{\sigma}{C_k} \tag{3}$$

Here, $f$ is the energy factor in Equation (3), which is set to a constant in the program. Based on the above calculation results, the bid value $B_{im}$ of robot $R_i$ for task $Tk_m$ can be calculated as follows:

$$B_{im} = \frac{E - E_c}{E_{initial}} \times 100 \times \frac{1}{T_c} \tag{4}$$

Thus, the bid value prioritizes residual time and energy consumption to evenly distribute tasks. Upon task allocation, bidders update their state information. Specifically, the location L and energy E will be modified after the completion of tasks so as to enable participation in subsequent auctions. The auction concludes when tasks are fully assigned or robots deplete their energy during task execution. The procedure of task allocation based on the auction algorithm is shown in Algorithm 1.
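As a worked illustration of Equations (1)–(4), consider a robot at $L = (50, 50)$ bidding for a task at $l = (60, 70)$. The values $V = 1$, $f = 1$, and $E = E_{initial} = 10000$ are assumptions chosen only for this example; $\mu = \sigma = 100$ and the capability values 10 (matching) and 5 (non-matching) follow the settings in Section 4.1.

$$P_c = |60 - 50| + |70 - 50| = 30$$

$$T_c = \frac{30}{1} + \frac{100}{10} = 40, \qquad E_c = 1 \times 30 + 100 \times \frac{100}{10} = 1030$$

$$B = \frac{10000 - 1030}{10000} \times 100 \times \frac{1}{40} \approx 2.24$$

For a robot at the same location whose capability for this task type is 5, the same steps give $T_c = 50$, $E_c = 2030$, and $B = \frac{10000 - 2030}{10000} \times 100 \times \frac{1}{50} \approx 1.59$. Under these assumptions, the robot whose capability matches the task submits the higher bid.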

Algorithm 1 Auction Process

Input: $R_n$ (the initial states of all robots), $Tk_m$ (the known tasks in the environment)

Output: $\{Tk_{R1}, Tk_{R2}, \ldots, Tk_{Rn}\}$ (the results of task allocation)

NotificationTasks($Tk_m$)
while $i < robotnumbers \land time < timelimit$ do
    $R_i \Leftarrow$ ReceiveInformation()
    $i++$
end while
for $Tk_j \in Tk_m$ do
    $B_{ij} \Leftarrow$ CalculateBid($R_i$, $Tk_j$)
    $Tk_{Ri} \Leftarrow$ Max($B_{ij}$)
end for
AllocateTasks($\{Tk_{R1}, Tk_{R2}, \ldots, Tk_{Rn}\}$)
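The allocation loop of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in the experiments: robots and tasks are plain dictionaries, the energy factor, speed, and initial energy are assumed values, the communication steps are omitted, and a task is left unassigned when the best bid is not positive.

```python
from typing import Dict, List, Tuple

F = 1.0  # energy factor f of Equation (3); the value is an assumption


def costs(robot: dict, task: dict) -> Tuple[float, float]:
    """Return (time cost, energy cost) following Equations (1)-(3)."""
    p_c = abs(task["l"][0] - robot["L"][0]) + abs(task["l"][1] - robot["L"][1])
    exec_time = task["sigma"] / robot["C"][task["t"]]
    return p_c / robot["V"] + exec_time, F * p_c + task["mu"] * exec_time


def calculate_bid(robot: dict, task: dict) -> float:
    """Bid value following Equation (4)."""
    t_c, e_c = costs(robot, task)
    return (robot["E"] - e_c) / robot["E_initial"] * 100 / t_c


def ssi_auction(robots: Dict[str, dict], tasks: List[dict]) -> Dict[str, List[dict]]:
    allocation: Dict[str, List[dict]] = {name: [] for name in robots}
    for task in tasks:                      # one task is auctioned per round
        bids = {name: calculate_bid(r, task) for name, r in robots.items()}
        winner = max(bids, key=bids.get)
        if bids[winner] <= 0:               # assumption: no robot can afford the task
            continue
        _, e_c = costs(robots[winner], task)
        robots[winner]["E"] -= e_c          # the winner updates its energy ...
        robots[winner]["L"] = task["l"]     # ... and its location for later rounds
        allocation[winner].append(task)
    return allocation


def make_robot(speciality: str) -> dict:
    """Hypothetical robot: capability 10 for its speciality, 5 otherwise."""
    types = ("Excavation", "Communication", "Transportation", "Search")
    return {"V": 1.0, "L": (50, 50), "E": 10000.0, "E_initial": 10000.0,
            "C": {t: 10.0 if t == speciality else 5.0 for t in types}}


robots = {"R1": make_robot("Excavation"), "R2": make_robot("Communication")}
tasks = [
    {"l": (60, 70), "t": "Excavation", "mu": 100.0, "sigma": 100.0},
    {"l": (40, 50), "t": "Communication", "mu": 100.0, "sigma": 100.0},
]
allocation = ssi_auction(robots, tasks)
```

Updating the winner's state inside the loop is what makes the auction sequential: each later bid is computed from the location and energy the robot would have after its earlier tasks.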

3. The Hierarchical Behavior Control Framework

In this section, we propose ABTE, a hierarchical framework based on BT, to solve the collaboration problem for heterogeneous MASs. We first design the layers, explain the workflow of the framework, and then introduce the algorithm for autonomously generating behavior trees and propose the fitness function, which tends to improve the efficiency of MASs.

3.1. The Layers of the Framework

BTs have attracted much attention in the robotics field in recent years, which generalize existing control architectures and bring unique advantages for building robot systems [16]. BT is a kind of control model with strong modularity; this characteristic is very helpful for expansion and pruning. Therefore, we propose an extensible two-layer framework. The behaviors defined by the three-way handshaking communication protocol, which we define as auction behaviors, are incorporated within the first layer of the framework, namely the command layer. Such a design is the basis for the framework to be distributed in heterogeneous MASs, ensuring accurate execution of protocol-related actions. The second layer, also known as the execution layer, is an empty tree in the initial state of MASs. After the system completes task allocation in the command layer, the node for grammatical evolution will be activated. Robots then generate heterogeneous execution behavior trees according to task planning, thus forming the execution layer of our framework. In summary, we create a mechanism to convert the auction-based communication protocol into fixed BT behaviors while allowing the autonomous generation of execution behaviors based on tasks and improving the adaptability and flexibility of MASs. Figure 3 shows the BTs for robots in an experiment required to execute excavation tasks.

The task allocation behaviors based on the three-way handshaking communication protocol are fixed in the BT controller. After the control flow is started by ticks, the robots start the “Auction()” thread through the “Call Service” node to continuously monitor the tasks in the environment. When one robot in the cluster receives a demand to allocate tasks, the auction process begins. The robot that receives the tasks acts as both auctioneer and bidder, and the others serve as bidders. The auctioneer robot executes “Notify”, “Receive Information”, “Calculate Bid”, and “Allocate Tasks” in sequence in the “Auction()” thread. Bidder robots, under the BT controller, execute “Send Information” once the blocked “Receive Notification” node receives the auction notification, and then block in the “Wait Allocation” node. If the allocation is received within the time constraint, the robots evolve matching BTs according to the allocated tasks, replace the original “Execution Behaviors” with the generated BTs, and continue to execute tasks.

There are two significant advantages of using ABTE to generate BTs. First, we implement a multi-thread approach that represents the auction-based three-way handshaking communication protocol with a single behavioral sequence. This allows every robot embedded with this BT to act as both auctioneer and bidder. Second, the BTs for executing tasks can be generated according to the allocated tasks and environment information, which improves the ability of MASs to adapt to unknown changes and challenges in the environment. The details of evolving execution behaviors are explained next.

3.2. The Evolution of Execution BTs

GE is a context-free grammar-based genetic programming paradigm derived from GP. Through the conversion rules of a “Backus–Naur Form (BNF)” grammar, the derivation tree is represented as a binary string, which enables string-based genetic operations. GE can therefore model BTs efficiently while retaining high reliability and interpretability, and it has been explored as a way of generating BTs autonomously. A central task in incorporating BTs into the GE framework is to find a grammar that can be used in the genotype-to-phenotype mapping [17]. The transformation uses a BNF grammar, which specifies the language of the produced solutions. In our experiments, robots are required to complete various tasks in the environment. The basic experiment assumes that there are four types of tasks: Excavation, Communication, Transportation, and Search. The production rules in BNF grammar for the basic task coverage experiment are defined in Equations (5)–(13). Equation (5) indicates that the BT as a whole follows the sequence structure. Equations (6)–(8) recursively expand the control nodes. Equations (9)–(11) define multiple execution nodes. The production rules in Equations (12) and (13) define the specific behaviors that robots need to perform. The phenotype created from this mapping process is a BT controller represented by a string (e.g., [‘Seq’, ‘action1’, ‘condition1’, ‘/Seq’]).

$$\langle root \rangle ::= \langle sequence \rangle \tag{5}$$

$$\langle sequence \rangle ::= \text{\_Seq}\ \langle control \rangle \langle control \rangle\ \text{\_/Seq} \mid \text{\_Seq}\ \langle executions \rangle\ \text{\_/Seq} \tag{6}$$

$$\langle selector \rangle ::= \text{\_Sel}\ \langle control \rangle \langle control \rangle\ \text{\_/Sel} \mid \text{\_Sel}\ \langle executions \rangle\ \text{\_/Sel} \tag{7}$$

$$\langle control \rangle ::= \langle sequence \rangle \mid \langle selector \rangle \tag{8}$$

$$\langle executions \rangle ::= \langle conditions \rangle \langle actions \rangle \mid \langle actions \rangle \tag{9}$$

$$\langle actions \rangle ::= \langle action \rangle \langle actions \rangle \mid \langle action \rangle \tag{10}$$

$$\langle conditions \rangle ::= \langle condition \rangle \langle conditions \rangle \mid \langle condition \rangle \tag{11}$$

$$\begin{aligned} \langle condition \rangle ::={} & \text{\_IsBlocked?} \mid \text{\_ReachLocation?} \mid \text{\_Excavation?} \mid \\ & \text{\_Communication?} \mid \text{\_Transportation?} \mid \text{\_Search?} \end{aligned} \tag{12}$$

$$\begin{aligned} \langle action \rangle ::={} & \text{\_CheckState} \mid \text{\_GetTasks} \mid \text{\_MoveToTask} \mid \text{\_AvoidObstacle} \mid \\ & \text{\_ExecuteExcavation} \mid \text{\_ExecuteCommunication} \mid \\ & \text{\_ExecuteTransportation} \mid \text{\_ExecuteSearch} \end{aligned} \tag{13}$$
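The genotype-to-phenotype mapping over this grammar can be sketched with the standard GE modulo rule, in which each codon selects one production of the leftmost non-terminal. The sketch below makes several assumptions: the genotype is given as a list of integers rather than a binary string, codons are consumed only when a rule has more than one production, wrapping is limited by a hypothetical `max_wraps` parameter, and both alternatives of the selector rule are delimited by `Sel`. Terminal names are written without the leading underscore, as in the phenotype example above.

```python
from typing import Dict, List, Optional

GRAMMAR: Dict[str, List[List[str]]] = {
    "<root>": [["<sequence>"]],
    "<sequence>": [["Seq", "<control>", "<control>", "/Seq"],
                   ["Seq", "<executions>", "/Seq"]],
    "<selector>": [["Sel", "<control>", "<control>", "/Sel"],
                   ["Sel", "<executions>", "/Sel"]],
    "<control>": [["<sequence>"], ["<selector>"]],
    "<executions>": [["<conditions>", "<actions>"], ["<actions>"]],
    "<actions>": [["<action>", "<actions>"], ["<action>"]],
    "<conditions>": [["<condition>", "<conditions>"], ["<condition>"]],
    "<condition>": [[c] for c in (
        "IsBlocked?", "ReachLocation?", "Excavation?",
        "Communication?", "Transportation?", "Search?")],
    "<action>": [[a] for a in (
        "CheckState", "GetTasks", "MoveToTask", "AvoidObstacle",
        "ExecuteExcavation", "ExecuteCommunication",
        "ExecuteTransportation", "ExecuteSearch")],
}


def map_genotype(codons: List[int], max_wraps: int = 2) -> Optional[List[str]]:
    """Leftmost derivation; returns None if the codon budget is exhausted."""
    symbols = ["<root>"]
    phenotype: List[str] = []
    used = 0
    budget = len(codons) * (max_wraps + 1)  # codons may be reused max_wraps times
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:           # terminal: emit it
            phenotype.append(symbol)
            continue
        choices = GRAMMAR[symbol]
        if len(choices) == 1:               # no decision needed, no codon consumed
            production = choices[0]
        else:
            if used >= budget:
                return None                 # invalid individual
            production = choices[codons[used % len(codons)] % len(choices)]
            used += 1
        symbols = production + symbols
    return phenotype


bt = map_genotype([7, 4, 9, 13, 3, 10])
```

For this genotype, the codons select, in order, the second sequence production, the conditions-plus-actions form, a single condition (`ReachLocation?`), a single action, and `MoveToTask`, so `bt` is `['Seq', 'ReachLocation?', 'MoveToTask', '/Seq']`. A genotype such as `[0]` keeps expanding control nodes until the budget runs out and is rejected.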

The initial population is composed of BTs with a maximum depth of no more than 7, which are randomly generated based on the derivation tree method. For the crossover and mutation, Tournament Selection is often used to choose better parents for the next generation. This method generally works with any fitness value, including negative fitness, leading to a full exploration of the search space and a reduced likelihood of falling into local minima [18]. Since selection determines the initial population of every generation, the fitness function guides the evolutionary process. To measure the performance of the BT controller, we design the fitness function as Equation (14). In our system, lower fitness means better performance.

$$Fitness = \alpha D^2 + \beta T + \gamma S(w, d) \tag{14}$$

where function $S(w, d)$ is the size of BT, which consists of width $w$ and depth $d$, as follows:

$$S(w, d) = \sqrt{w^2 + d^2} \tag{15}$$

In Equation (14), $D$ represents the distance one robot travels to complete all tasks, and $T$ means the total time cost, including the time spent traveling and executing. Parameters $\{\alpha, \beta, \gamma\}$ are weights of $\{D, T, S\}$. In order to finish the same work with a smaller BT to avoid meaningless branches, $S$ is required to have the greatest influence on fitness. In addition, trying to get robots to perform tasks they are good at is also an important consideration for better performance of BTs. Since robots take less time to execute tasks that match their capability, the weight of $T$ is more important than the weight of $D$. Considering the order of magnitude, the values of parameters are designed as follows:

$$\{\alpha = 0.02,\ \beta = 0.05,\ \gamma = 1\} \tag{16}$$
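Equations (14)–(16) and the selection step can be sketched together. The weights are those of Equation (16); the function names, the tournament size, and the sample values of $D$, $T$, $w$, and $d$ are assumptions for illustration. Since lower fitness is better here, the tournament keeps the contender with the smallest value.

```python
import math
import random
from typing import Sequence

ALPHA, BETA, GAMMA = 0.02, 0.05, 1.0  # weights from Equation (16)


def tree_size(width: int, depth: int) -> float:
    """Equation (15): size of a BT from its width and depth."""
    return math.sqrt(width ** 2 + depth ** 2)


def fitness(distance: float, total_time: float, width: int, depth: int) -> float:
    """Equation (14); lower values indicate better BTs."""
    return ALPHA * distance ** 2 + BETA * total_time + GAMMA * tree_size(width, depth)


def tournament_select(fitnesses: Sequence[float], k: int, rng: random.Random) -> int:
    """Return the index of the best (lowest-fitness) of k random contenders."""
    contenders = rng.sample(range(len(fitnesses)), k)
    return min(contenders, key=lambda i: fitnesses[i])


# Two hypothetical BTs that achieve the same D and T with different sizes.
compact = fitness(distance=30, total_time=40, width=3, depth=4)   # S = 5
bloated = fitness(distance=30, total_time=40, width=6, depth=8)   # S = 10
parent = tournament_select([compact, bloated], k=2, rng=random.Random(0))
```

With equal travel distance and time, the size term alone separates the two candidates, so the compact tree is selected as the parent.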

4. Results

As exploratory research, we test our framework in a simulated Disaster Rescue Scenario in which multithreading is used to simulate multiple agents. The information on the environment and robots is preset by configuration files, and tasks in the environment are randomly generated according to the task information template. To evaluate the performance of heterogeneous MASs based on the proposed ABTE, we take the grammatical evolution algorithm for cluster behaviors (GEESE) [8] as the baseline and carry out a multi-task execution experiment in a preset environment.

4.1. Experimental Settings

Experiments are set in a 2D 100 × 100 coordinate system, as shown in Figure 4. Four types of tasks are randomly distributed: Excavation (stars), Communication (signals), Transportation (boxes), and Search (flags). These correspond to tasks frequently performed by MASs in the real world, as proposed by Hogg et al. [19]. Experimental MASs consist of multiple groups of robots, which are placed at the starting point (50, 50). Each group is composed of four types of robots: R1, R2, R3, and R4. Each type has the full capability (value 10) in one of the Excavation, Communication, Transportation, and Search tasks, and half of that in the others. These values represent execution efficiency. The task properties are represented by $\{l, t, p, \tau, \mu, \sigma\}$ according to the communication protocol described in Section 2.1, including priority $p$ (Excavation = 4, highest), time limit $\tau$ (all set to 60, except for Search), and energy/completion constants $\mu$, $\sigma$ (both 100). It is worth noting that, in our research, tasks with time constraints are classified as high-priority tasks, while Search tasks, due to their routine nature, are not subject to time restrictions.

We use the ABTE proposed in this paper and GEESE as a contrast to control robots to execute tasks. In order to analyze the advantages of the proposed method, we gradually increase the number of tasks and observe the energy consumption and coverage ratio of high-priority tasks of MASs. In addition, we increase the number of robots and observe the changes in the coverage ratio of high-priority tasks so that we can obtain strong evidence that the proposed method is better at performing high-priority tasks.

4.2. Simulation Studies

4.2.1. Execute Tasks with Sufficient Energy

In the first set of experiments, the MAS has enough energy for all tasks. Experiments end when robots mimic real-world energy conservation, meeting one of two conditions: (1) all tasks are completed or (2) the remaining energy is 5–10%. In experiment 1, there are 60 tasks randomly distributed in the environment. After a random robot receives the task information, it acts as both an auctioneer and a bidder (R1 in the experiment). According to the auction method mentioned in Section 2.2, the robots begin to individually evolve BTs after all tasks have been allocated. The design of the fitness function used in the GE process is expounded upon in Section 3.2, and the parameters involved in GE are shown in Table 1.

Figure 5 displays the execution BTs evolved by ABTE, with nodes in dashed boxes subject to replacement. Fitness functions promote compact BTs, resulting in diverse, task-specific BTs for heterogeneous robots. R1 executes Excavation and Search without obstacle avoidance, while R2’s BT includes obstacle avoidance for Communication tasks. Other robots can also evolve distinct BTs by permuting and combining these nodes. It is evident that the BTs generated by ABTE endow heterogeneous robots with diverse execution behaviors, significantly enhancing the adaptability of MASs.

Taking the evolutionary process of R1’s BT as an example, the initial population is composed of 50 BTs, which are randomly generated based on derivation tree method [20]. Subsequent iterations involve crossover and mutation to produce succeeding generations, with a cap of 100 individuals per generation—a measure designed to curtail excessive expansion of the search space. The fitness function is tailored to foster the development of compact BTs capable of fulfilling all tasks. Following approximately 60 evolutionary rounds, the BT most suited for R1’s tasks is derived.

Within the ABTE framework, the most substantial computational overhead stems from the GE used to generate BTs, with the longest evolutionary duration in our experimental context being around 12 s. Notably, this overhead only exists in the stage before executing tasks and does not affect the task execution process. Furthermore, given that robots evolve their BTs in a distributed parallel mode, the cumulative computational overhead on the system is only affected by the robot with the longest evolution time.

In experiment 2, the MAS consists of three groups of R1, R2, R3, and R4, which are set to complete different numbers of tasks (12–120). After all tasks have been executed, the trend and comparison of the mean and standard deviation of energy consumption among MASs controlled by ABTE and GEESE are shown in Figure 6 and Figure 7.

The BTs evolved by GEESE operate globally, leading robots to favor closer tasks. However, in many cases, the energy consumption of executing tasks is greater than that of traveling [21]. In contrast, ABTE yields task-specific, heterogeneous BTs. Considering robotic capabilities and task priorities, each robot prioritizes high-priority tasks matching its abilities, reducing the overall energy use of the system. Figure 6 shows that the MAS controlled by ABTE outperforms GEESE in average energy consumption, with an increasing advantage as tasks rise. The standard deviation of energy consumption is also lower with the ABTE approach, ensuring more uniform resource use within the MAS. Additionally, as shown in Figure 7, we introduce resource dissipation as an evaluation metric, which elaborates on the average energy consumed by MASs to complete each task. In an ideal scenario, robots in MASs would only need to complete tasks that best match their capabilities without any travel-related consumption. In the experiments of this paper, the ideal resource dissipation is set at 5%, as indicated by the green dashed line in this figure. The resource dissipation of ABTE is closer to the ideal value, reflecting a higher utilization of resources. Overall, ABTE allows MASs to complete more tasks before reaching energy limits, ensuring remaining tasks are handled if one robot powers down, making ABTE more efficient and robust than GEESE.

4.2.2. Ability to Cover High-Priority Tasks

In the second set of experiments, the initial energy of MASs may be insufficient for all tasks. Robots start at (50, 50), with stopping conditions the same as those of the first set. The aim is to compare the high-priority task coverage ratio between methods when energy is limited. Excavation, Communication, and Transportation tasks have time constraints, making them high-priority compared to Search tasks. If tasks are performed outside the time limit, they are counted as unfinished tasks. The high-priority task coverage ratio is calculated by dividing the number of high-priority tasks completed by the total number of high-priority tasks.
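The coverage ratio defined above can be computed as in the following sketch. The record format, a pair of time limit and finish time per task, and the choice to count a task finished exactly at its limit as completed are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# (time_limit, finish_time): time_limit is None for tasks without a constraint
# (e.g. Search); finish_time is None for tasks that were never executed.
Record = Tuple[Optional[float], Optional[float]]


def high_priority_coverage(records: Iterable[Record]) -> float:
    """Completed high-priority tasks divided by all high-priority tasks."""
    total = 0
    completed = 0
    for limit, finish in records:
        if limit is None:          # not a high-priority task
            continue
        total += 1
        if finish is not None and finish <= limit:
            completed += 1
    return completed / total if total else 0.0


records = [
    (60, 35),      # finished in time
    (60, 72),      # finished too late: counted as unfinished
    (60, None),    # never executed
    (None, 500),   # Search task: excluded from the ratio
    (60, 60),      # finished exactly at the limit
]
ratio = high_priority_coverage(records)
```

In this example, two of the four time-constrained tasks are completed within their limits, so the ratio is 0.5; the Search task does not affect it.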

Figure 8 presents the results of experiment 1 from the second set. The x-axis indicates total tasks, while the y-axis shows the coverage ratio of high-priority tasks. The MAS comprises three groups of R1, R2, R3, and R4, with tasks increasing from 12 to 120 in increments of 4. ABTE fully covers high-priority tasks up to 60, whereas GEESE cannot complete all high-priority tasks beyond 28, reaching only 50% of ABTE’s coverage at 60 tasks. With 120 tasks, ABTE maintains a 50% coverage ratio and GEESE drops to 24.4%. After 60 tasks, the coverage ratio of ABTE declines sharply due to the saturation of MASs at 45 high-priority tasks; beyond this, the system can complete only 45 within the time limit. At 72 tasks, GEESE peaks at 26 high-priority tasks, marking a significant gap from the saturation point of ABTE.

In experiment 2 of the second set, the number of tasks is held constant at 120. We start with an MAS consisting of three groups of R1, R2, R3, and R4 and add one group of robots to the cluster at a time until both GEESE and ABTE can cover all the high-priority tasks. The result is shown in Figure 9. ABTE already reaches a 100% coverage ratio when seven groups of robots are included in the cluster, whereas GEESE cannot fully cover the high-priority tasks until twelve groups are included.

The outcomes of ABTE are influenced by execution time and remaining energy, favoring robots with matching capabilities or idle tasks. This enables heterogeneous MASs (robots with diverse task-specific traits) to rapidly complete high-priority tasks, a benefit particularly evident in systems with limited initial energy. Since the total energy of MASs is insufficient for all tasks, the collaboration mechanism provided by ABTE tends to guide the cluster to maximize high-priority task completion before reaching the energy threshold. Consequently, ABTE outperforms GEESE in terms of high-priority task coverage ratio.

The above two sets of results show that ABTE allows robots to process a relatively large number of tasks with less energy consumption and to complete more high-priority tasks. In addition, ABTE not only tries to allocate as many tasks as possible to the robots that are good at executing them but also evolves more efficient, customized BTs to control the robots based on the task allocation. As a result, the BTs of the MAS do not need to be hand-coded by experts in advance, which gives the MAS stronger adaptability and flexibility in the face of changes and challenges in the environment.

4.3. Discussion

The above experiments mainly verify the excellent performance of the ABTE framework in an environment with a large number of known tasks. Firstly, we fix 12 robots in MASs and gradually increase the number of tasks from 12 to 120 under the condition of sufficient system energy. The performance of ABTE is evaluated by using energy consumption and resource dissipation as the metrics. The results show that compared to GEESE, the MAS under the control of ABTE requires less energy to complete all tasks, and the gap in energy consumption increases as tasks increase. Additionally, in terms of resource dissipation, ABTE is closer to the ideal results. Secondly, we increase the number of tasks and robots to examine the ability of MASs to complete high-priority tasks. The results indicate that ABTE can achieve a higher coverage ratio of high-priority tasks using fewer robots.

The two frameworks for BT generation differ in character. GEESE is a reactive framework in which robots execute tasks as they encounter them; the robots therefore prioritize the nearest task. In contrast, ABTE is an auction-based framework that integrates the concept of task allocation. It auctions all tasks based on robot capabilities and task priorities, with each robot prioritizing high-priority tasks that match its capabilities. Based on real-world situations, we assume that the energy consumption of task execution is higher than that of traveling and that robots consume less energy when performing tasks they are good at. Under these assumptions, ABTE exhibits lower energy consumption and resource dissipation than GEESE. Additionally, the consideration of task priority in the auction ensures a high coverage ratio of high-priority tasks.
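The contrast between nearest-task and auction-based allocation can be illustrated with a toy example. This is a simplified sketch, not the ABTE auction or the GEESE controller: the bid formula, the travel cost constant, the robot and task values, and the omission of the three-way handshake and of robot movement are all assumptions made for brevity.

```python
from dataclasses import dataclass
from math import dist

TRAVEL_COST_PER_UNIT = 0.1  # assumed: traveling is cheap relative to execution


@dataclass(frozen=True)
class Task:
    name: str
    kind: str
    pos: tuple
    priority: int  # larger means more urgent


@dataclass(frozen=True)
class Robot:
    name: str
    pos: tuple
    exec_cost: dict  # task kind -> execution energy (lower means better match)


def bid(robot, task):
    """Hypothetical bid: estimated energy to reach and execute the task."""
    return TRAVEL_COST_PER_UNIT * dist(robot.pos, task.pos) + robot.exec_cost[task.kind]


def auction_allocate(robots, tasks):
    """Auction tasks in descending priority; the lowest bid wins each task."""
    allocation, total = {}, 0.0
    for task in sorted(tasks, key=lambda t: -t.priority):
        winner = min(robots, key=lambda r: bid(r, task))
        allocation[task.name] = winner.name
        total += bid(winner, task)
    return allocation, total


def nearest_allocate(robots, tasks):
    """Reactive baseline: the closest robot takes each task, whatever its skills."""
    allocation, total = {}, 0.0
    for task in tasks:
        winner = min(robots, key=lambda r: dist(r.pos, task.pos))
        allocation[task.name] = winner.name
        total += bid(winner, task)
    return allocation, total


robots = [
    Robot("R1", (0, 0), {"Excavation": 5, "Search": 20}),
    Robot("R2", (10, 0), {"Excavation": 20, "Search": 5}),
]
tasks = [
    Task("T1", "Excavation", (9, 0), priority=2),
    Task("T2", "Search", (1, 0), priority=1),
]
auction_plan, auction_energy = auction_allocate(robots, tasks)
nearest_plan, nearest_energy = nearest_allocate(robots, tasks)
```

In this toy setup each task lies next to the robot that is poorly suited to it, so the nearest-task rule pays the high execution cost twice, while the auction accepts a longer trip in exchange for a cheaper execution. The effect depends on the assumption, stated in the text, that execution costs more energy than travel.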

5. Conclusions

The performance of heterogeneous MASs in multi-task environments, especially those with numerous high-priority tasks, has become a key concern. To help MASs execute tasks autonomously, efficiently, and collaboratively, this paper designed a hierarchical BT-based framework for planning and control, called Auction-Based Behavior Tree Evolution (ABTE). This framework combines a command layer and an execution layer, ensuring efficient task allocation and automatic behavior generation. The experiments are designed around a typical Disaster Rescue Scenario, where heterogeneous agents are grouped according to their functions. The effectiveness of the framework is verified by the performance of MASs facing numerous tasks with different priorities. The experimental results indicate that, compared with the current methods, the three-way handshaking communication protocol in ABTE makes better use of the advantages of individual robots and improves resource utilization within heterogeneous MASs. Furthermore, the grammatical evolution algorithm can not only generate execution behaviors tailored to different tasks but also enable MASs to preferentially complete tasks with higher priority.

It is worth noting that the Disaster Rescue Scenario represents just one application for the hierarchical behavior control framework proposed in this paper. This framework, which stratifies command and execution behaviors, is a paradigm that utilizes the modularity of BT, allowing the tree to expand and prune dynamically according to varying tasks. For heterogeneous MASs, the introduction of this framework can significantly boost adaptability, synergy, and autonomy. However, when implementing ABTE in alternative applications, it is crucial for researchers to ascertain that all tasks within the environment are known. Such knowledge provides the foundational basis for task allocation at the command layer. Furthermore, as assumed in Disaster Rescue Scenarios, the energy utilized by robots in performing specific tasks ought to surpass the energy expended during traveling. This premise is based on the emphasis of ABTE on optimizing the sequence of task execution rather than shortening the path. Future efforts will focus on refining robot behaviors to explore the diversity of evolutionary BTs and enhancing the task bidding value computation model. The integration of planning algorithms for minimization and enhancement will also be a key focus.

Author Contributions

N.L. and J.W. participated in the conception and design of the work; S.W. and W.W. investigated and designed the methodology; S.W. designed the software and experiments; S.W. and W.W. performed the experiments; W.W. and C.B. analyzed the data; S.W. wrote the original draft; W.W., J.W. and S.Y. edited the paper; S.Y. and W.Y. administered the project. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62106278 and 91948303-1), the National Natural Science Foundation of China (Grant Nos. 611803375, 12002380, and 62101575), the National Key R&D Program of China (Grant No. 2021ZD0140301), and the Postgraduate Scientific Research Innovation Project of Hunan Province (Grant No. QL20210018).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available in ABTE at https://github.com/shanghuaw/ABTE.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

MAS	Multi-agent system
BT	Behavior tree
GP	Genetic programming
GE	Grammatical evolution
SRC	Single-round combinatorial
PSI	Parallel single item
SSI	Sequential single item

References

1. Colledanchise, M.; Ögren, P. How behavior trees modularize hybrid control systems and generalize sequential behavior compositions, the subsumption architecture, and decision trees. IEEE Trans. Robot. 2017, 33, 372–389.
2. Colledanchise, M.; Natale, L. Analysis and exploitation of synchronized parallel executions in behavior trees. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 6399–6406.
3. Iovino, M.; Scukins, E.; Styrud, J.; Ögren, P.; Smith, C. A survey of behavior trees in robotics and AI. Robot. Auton. Syst. 2022, 154, 104096.
4. Scheide, E.; Best, G.; Hollinger, G.A. Behavior tree learning for robotic task planning through monte carlo DAG search over a formal grammar. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 4837–4843.
5. Styrud, J.; Iovino, M.; Norrlöf, M.; Björkman, M.; Smith, C. Combining planning and learning of behavior trees for robotic assembly. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 11511–11517.
6. Partlan, N.; Soto, L.; Howe, J.; Shrivastava, S.; Seif El-Nasr, M.; Marsella, S. EvolvingBehavior: Towards co-creative evolution of behavior trees for game NPCs. In Proceedings of the 17th International Conference on the Foundations of Digital Games, Athens, Greece, 5–8 September 2022; pp. 1–13.
7. Iovino, M.; Styrud, J.; Falco, P.; Smith, C. Learning behavior trees with genetic programming in unpredictable environments. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 4591–4597.
8. Neupane, A.; Goodrich, M.A. Learning swarm behaviors using grammatical evolution and behavior trees. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence IJCAI-19, Macao, China, 10–16 August 2019; pp. 513–520.
9. Parker, L.E.; Rus, D.; Sukhatme, G.S. Multiple mobile robot systems. In Springer Handbook of Robotics; Bruno, S., Oussama, K., Eds.; Springer International Publishing: Singapore, 2016; pp. 1335–1384.
10. Wu, S.H.; Li, H.; Xiao, R.B.; Liu, J. Modeling and simulation of dynamic ant colony’s labor division for task allocation of UAV swarm. Phys. A Stat. Mech. Its Appl. 2018, 491, 127–141.
11. Turner, J.; Turner, Q.G.; Schaefer, G.; Whitbrook, A.; Soltoggio, A. Distributed task rescheduling with time constraints for the optimization of total task allocations in a multirobot system. IEEE Trans. Cybern. 2018, 48, 2583–2597.
12. Baroudi, U.A.; Al-Shaboti, M.; Koubâa, A.; Trigui, S. Dynamic multi-objective auction-based (DYMO-auction) task allocation. Appl. Sci. 2020, 10, 3264.
13. Singhal, R.; Goyal, P.; Agarwal, S.; Makkar, M.; Kumar, P. Multi-robot task allocation in e-commerce warehouses: A comparative analysis of distance minimization and priority-based approaches. In Proceedings of the ISME International Conference on Advances in Mechanical Engineering, Singapore, 27 July 2024; pp. 375–382.
14. Coltin, B.; Veloso, M.M. Mobile robot task allocation in hybrid wireless sensor networks. In Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 2932–2937.
15. Koenig, S.; Tovey, C.A.; Lagoudakis, M.G.; Markakis, V.; Kempe, D.; Keskinocak, P.; Kleywegt, A.; Meyerson, A.; Jain, S. The power of sequential single-item auctions for agent coordination. In Proceedings of the 21st National Conference on Artificial Intelligence, Boston, MA, USA, 16–20 July 2006; pp. 1625–1629.
16. Cai, Z.X.; Li, M.L.; Huang, W.R.; Yang, W.J. BT expansion: A sound and complete algorithm for behavior planning of intelligent robots with behavior trees. In Proceedings of the AAAI Conference on Artificial Intelligence, Palo Alto, CA, USA, 2–9 February 2021; pp. 6058–6065.
17. Hallawa, A.; Schug, S.; Iacca, G.; Ascheid, G. Evolving instinctive behaviour in resource-constrained autonomous agents using grammatical evolution. In Applications of Evolutionary Computation; Pedro, A.C., Juan, L.J., Francisco, F.V., Eds.; Springer International Publishing: Seville, Spain, 2020; pp. 369–383.
18. Le Goff, L.K.; Buchanan, E.; Hart, E.; Eiben, A.E.; Li, W.; De Carlo, M.; Winfield, A.F.; Hale, M.F.; Woolley, R.; Angus, M.; et al. Morpho-evolution with learning using a controller archive as an inheritance mechanism. IEEE Trans. Cogn. Dev. Syst. 2022, 15, 507–517.
19. Hogg, E.; Hauert, S.; Harvey, D.; Richards, A. Evolving behaviour trees for supervisory control of robot swarms. Artif. Life Robot. 2020, 25, 569–577.
20. Neupane, A.; Goodrich, M.A. Efficiently evolving swarm behaviors using grammatical evolution with PPA-style behavior trees. arXiv 2022, arXiv:2203.15776.
21. Colledanchise, M.; Almeida, D.; Ögren, P. Towards blended reactive planning and acting using behavior trees. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 8839–8845.

Figure 1. Collaboration in heterogeneous MASs within a multi-task environment. The entire allocation process is controlled by the same auction BT, adhering to a three-way handshaking communication protocol based on the auction algorithm. Upon the completion of task allocation, robots evolve heterogeneous execution BTs tailored to their tasks.

Figure 2. Three-way handshaking communication process based on auction algorithm.

Figure 3. ABTE generates BT with auction behaviors and execution behaviors.

Figure 4. Within the environment, where obstacles are stationary, there exist four types of tasks: Excavation, Communication, Transportation, and Search. These tasks must be completed by an MAS.

Figure 5. Execution BTs for all robots by ABTE. The sections outlined with dotted lines represent components that can be replaced during the evolutionary process, depending on the varying tasks that the robots are required to execute.

Figure 6. A comparison of the overall energy consumption with an increasing number of tasks.

Figure 7. A comparison of the resource dissipation with an increasing number of tasks.

Figure 8. A comparison of the coverage ratio with an increasing number of tasks.

Figure 9. A comparison of the coverage ratio with an increasing number of robots.

Table 1. GE parameters used for experiments.

Parameter	Value
Individuals in population	100
Max tree depth	7
Generations	100
Crossover	Variable_onepoint
Crossover probability	0.75
Mutation probability	0.01
Selection	Tournament
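
For readers who want to reproduce a comparable setup, the parameters in Table 1 could be captured in a small configuration object, as in the sketch below. The class and field names are assumptions, and the tournament size `k` is hypothetical because Table 1 does not report it; the selection function shows only the general idea of tournament selection, not the authors' implementation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GEParams:
    """Values copied from Table 1; the field names are illustrative."""
    population_size: int = 100
    max_tree_depth: int = 7
    generations: int = 100
    crossover: str = "variable_onepoint"
    crossover_probability: float = 0.75
    mutation_probability: float = 0.01
    selection: str = "tournament"


def tournament_select(population, fitness, k, rng):
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


params = GEParams()
rng = random.Random(0)
population = list(range(params.population_size))  # stand-in genomes
# k=3 is a hypothetical tournament size, not a value reported in the paper.
chosen = tournament_select(population, fitness=lambda g: g, k=3, rng=rng)
```

Freezing the dataclass keeps the parameter set immutable during a run, which makes it easier to log and compare experiments.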

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

