• Grid model

The uniformly distributed grid represents the working environment of the robot, and each grid corresponds to the position of whether an obstacle exists or not [22]. For example, if a grid has an obstacle, it is marked as 1; otherwise, it is 0.

As shown in Figure 6, Bouzouita et al. [28] obtained georeferenced pictures through UAVs and then created a full map of a vineyard. The processing entails first dividing the UAV workspace into multiple regular grids and defined borders (line segments of workspace in the grid-based environment) using Bressenham's linear algorithm (BLA). Then, the procedure to distinguish the subareas in the grid-based environment is a recursive flood-fill algorithm that picks an empty cell (not marked as occupied) and floods in four directions while there are empty cells, and each flooded cell is marked as occupied. This mapping procedure is repeated until all nodes of the grid are occupied by georeferenced pictures. However, the drawback of this mapping is a tradeoff between acquiring an equally sized set of images in cells and inefficient aerial sampling with UAVs.

• Geometric model

**Figure 6.** Grid model with rasterized areas [28].

Geometric features such as line segments or curves are extracted from the environmental perception information and used by robots to construct an environmental map [22]. As shown in Figure 7, Ball et al. [65] defined the boundary of a 55-hectare sorghum stubble field by manually selecting appropriate latitude and longitude coordinates but omitted the headland where the UGVs turned around at the end of each row. Besides, the positions of multiple robots and obstacle information (human, pole, vehicle, etc.) were periodically sent to the central controller through sensors of a real robot and 12 simulated robots. Finally, the central controller constructed a 2D map of the environment surrounding the robot.

**Figure 7.** Geometric models were constructed with points and lines [65].

A more scientific approach than manually labeling is the Voronoi diagram [66]. For example, a fruit tree on the robot's driving path was taken as a point in the orchard map, and the two-dimensional coordinates of this point were found in the image acquired by the UAV. At the same time, the data of planting interval and density of fruit trees were learned from this image, and the two-dimensional coordinates of other fruit trees were obtained, and then the K-means clustering algorithm was used to find the center of clusters of multiple fruit trees, and the orchard was divided into multiple sub-regions for robot operation according to the principle that the fruit tree point is most adjacent to this center. But the experimental results after mapping are to be further validated on a real agricultural spraying multi-robot.

• Topological model

The topological relationship among objects in the environment is represented by the environment model [21]. The topological model consists of multiple unrelated points and lines with a simple model. Some researchers proposed to use this model in agricultural multi-robot path planning, but it is only an idea and has not been applied [67].

The specific application scope and characteristics of the multi-robot environmental model are shown in Table 3.


**Table 3.** Comparison of mapping methods [68–70].

Compared with 2D maps [14,73–75], 3D maps can provide more information. By fusing the original measurement or small local maps generated from multiple robots to construct global maps at the same time, or matching the 3D maps constructed by heterogeneous robots, more and more abundant data are obtained, which is a new research direction of mapping.

For example [71], the point clouds of the two images acquired by the UAVs and UGVs respectively were first represented by a grid model, and each cell stored excess vegetation index information and ground height information. After that, the relative displacement and rotation parameters between the two grid maps were extracted using the data provided by GPS and Attitude and Heading Reference Systems (AHRS), and the maps were matched by these parameters. Then the correspondence of point clouds between the matched maps was calculated using the Large Displacement Dense Optical Flow (LDOF) system and a voting scheme was used to select a coherent correspondence dataset with high data similarity to infer the initial conversion relationship between the maps, and a non-rigid alignment algorithm was used to eliminate orientation errors. Finally, the input point clouds were brought in for a point-to-point alignment, and non-vegetation points were removed to obtain the 3D maps. The experimental results showed that the constructed maps matched well, but the time taken to construct each such image was quite long.

As the farming environment is often dynamic, learning specific farming information while shortening the time to build maps will help improve the efficiency of multi-robot operations, which should be a direction for further research. For example, simultaneous localization and composition (SLAM) allows the robot to get information about the image while localizing it, and this technique is currently used more often for indoor multi-robots, especially for catastrophic search and rescue where high responsiveness and mobility are required [76]. In the literature [56] it was mentioned that this technique was used on a ground robot for heterogeneous multi-robot in a greenhouse and that the robot was motion controlled using an Augmented Monte Carlo Localization (AMCL) methods. The experimental results showed that the UGV using SLAM takes less time to construct the map than the remotely operated UAV. And based on the map, the ground robot can complete collision-free autonomous movement between the initial and target points. It is obvious that SLAM technology has great potential for multi-robot mapping [77], and the application of multi-SLAM on agricultural multi-robots will be one of the future research directions.

#### *3.3. Multi-Robot Task Allocation*

Multi-robot task allocation (MRTA) provides evaluation indicators of a multi-robot system, a task set, and system performance and finds a suitable robot for each subtask to perform, bringing the benefit of a robot system to perform mostly task collection. Thus, the quality of the MRTA results directly affects the efficiency of the entire system and whether each robot of the system can maximize its capabilities [78].

Solve the MRTA problem [79] involves many aspects, such as the capability attributes of the system members, the structural attributes of the tasks, the robot coordination mechanism, and the strategy of task allocation [80]. This approach divides the assignment of agricultural multi-robots into centralized components based on the decision of the central controller in this part.

• Centralized Allocation

The centralized allocation means that the leading robot or control center of the system decomposes the global task and then sends the decomposed subtask to each robot according to the corresponding allocation method.

In 2012, reference [14] reported the work area division and task assignment of two UGVs by a person through remote monitoring based on a citrus orchard map. However, as the number of robots increased, manual task assignment stops when a robot hits a tree or fails to make a reliable turn. Some scholars [66] solved the manual assignment of tasks to multiple robots from the perspective of dividing maps, namely, as many sub-regions as there are robots needed. The edges of these sub-regions are generated from discrete fruit tree points and K-mean clustering points on the map boundary using an integer programming approach. However, this view is too ideal and does not consider the case where the number of robots is less or more than the total number of tasks.

A similar study on the number of robots over the number of tasks is multiple robots and a small number of refueling stations [81]. In this reference, an approximate arbitrary time algorithm based on the branch delimitation method was used to obtain the task sequence of multi-robot collaborative refueling. The limit value of the path length distance was first calculated and used as the upper and lower bounds of the algorithm nodes based on the rules for robots walking infield and the total time cost function of refueling and waiting in the queue. Then the optimal solution was obtained by deleting the sub-nodes whose lower limits were greater than or equal to the optimal upper limits during iteration. Simulation experiments showed that the optimal approximate solution on resource utilization can be found by this method, but it is difficult to apply the method to other aspects of agriculture, such as spraying.

A reinforcement learning-based method (Dyna-Q +) was used to find the optimal search path from the current point to the endpoint, that is, multiple robots randomly selected actions (front, back, left, right) and each action was recorded, and the optimal path was obtained by rewarding and punishing the actions selected by the robots according to the presence or absence of obstacles [62]. Then a weighted graph was used to represent the Gird model, including parameters such as the current position of the robot, the set of grids, and whether there is a path between the grids. The minimum cost time for the UAVs to fly at different speeds under the optimal path was calculated using Dijkstra's algorithm, and finally, the search space was obtained based on the optimal path and time and allocated to multiple robots in proportion to the size of the space. A reinforcement learning [82] is a Markovian decision process where the basic idea is all about modeling or fitting a strategy using a function for more complex decision problems. However, the method requires a large number of samples and a long time when used.

Also, multiple robots working in an agricultural environment are often subject to resource-sharing conflicts. For predictable conflicts, relying only on a central task allocation approach to avoid conflicts, the adaptability of multiple robots is very limited, while adding a decision support system (DDS) to provide options for multi-robot collaboration, i.e., identifying problems and building or modifying decision models to avoid resource conflicts based on explicit goals. For example, as shown in Figure 8, in reference [58], the authors adopted a central entity (OptiVisor) to build a multi-robot seeding map based on inputs such as the location of static obstacles in a large field, the seeding method, the seed density, the location of the central controller, and the number of robots. Based on this map, the location and density of each seed are precisely located and the sowing task is assigned to multiple robots. When the robot finished the job, the path from the robot location to the Central Logistic Unit (CLU) was recorded and the robot was allowed to return to the CLU. Especially when one or more robots failed, the task of the failed robot was reassigned to the other robots of the failed robot, and the sowing path was updated for these replacement robots. OptiVisor could also stop a robot's motion when a multi-robot collision is imminent, and define restricted motion areas for the faulty robot. However, this task assignment method was implemented in a simulation environment and needs to be further tested in real applications.

• Distributed allocation

**Figure 8.** A centralized entity was used to plan the seeding task, seeding patterns, and seed densities for different robots in the simulation environment. At the same time, the centralized entities were used to monitor the whole seeding process to avoid collisions between robots [58]. The small blue dots are the locations of the seeds planned to be sown, the yellow dots are the seeds that have been successfully planted, and the larger red dots are the border seed sowing locations.

In general, the distributed task allocation method has the feature that each robot does not have global knowledge of the task but calculates and plans individually according to the local information obtained by the sensor. The performance of the whole system not only is closely related to the individual but also depends on the combined effects of individuals [83].

It was reported in the literature [84] that when the number of tasks is more than the number of farm machines, the ant colony algorithm can be used to find a suitable task sequence for multiple robots. First, the distance between plots was calculated based on the specified coordinate information of each plot, and the relevant parameters of the ant colony algorithm were set. Then randomly generated the starting points of multiple parcels, according to the state transfer probability formula for path selection, and put the generated path parcels into the forbidden table. Finally, the path distance of the plots was calculated, and the pheromones on the path were continuously updated according to the set rules, and iterations were repeated until the optimal task planning for multiple robots was found. The method is mainly used in simulated farmland environments and has not yet seen the practical application.

In reference [85], researchers adopted Semantic MozartSpaces to describe a task allocation data model based on a resource description framework (RDF) and SPARQL (a query language and data acquisition protocol developed by the RDF) in task storage where the RDF was used to construct nested blank nodes and SPARQL was used for querying, updating and interactions of the entry. As shown in Figure 9a, a task was mapped to a nested blank node to generate a semantic tuple (entry) in the task-allocation model. The entry was stored in the task storage with an internal ID that concluded the URL (uniform resource locator). Then, the entry could be selected with a URL according to the relationship between the robot's function and the task requirement. The results of the simulation experiment suggested that the execution time increases correspondingly with an increasing number of tasks, followed by a gradual decrease in production efficiency. It would be necessary to add new robots temporarily to ensure productivity, but the production cost would also increase, so the tasks need to be set in advance.

**Figure 9.** Distributed task allocations: (**a**) An entry was stored in task storage with ID numbers. And robots could select ID to work according to the relationship between the robot's function and the task requirement [85]; (**b**) A task was split into several subtasks. And each robot proposed the largest possible task allocation initially, then gradually decreased its offers based on negotiation mechanism until a deal was done [28].

As shown in Figure 9b, researchers developed market and auction-based approaches for task subdivision and allocation based on the Rubinstein negotiation protocol [28]. The advantage of such a protocol was that the auctioneer robot had to split the task into subtasks during the negotiation. In each round of negotiation, each robot initially started proposing the largest possible task allocation for itself and decreased its offers based on a negotiation of cost functions (discount factors) at each round until the other party indicated acceptance. With distributed task assignment under this protocol, each robot comes with a DDS, and each robot can dynamically adjust to the actual situation to get a suitable task. However, the task allocation result was generally an approximate optimal solution because of the discount factors, such as the area of the mission area, the distance of the robot from the goal, the area beyond the target, and the overlapping work area.

Combined with the above task allocation technology, the research progress of agricultural multi-robots in task allocation technology in the past 10 years is summarized in Table 4.


**Table 4.** Task allocation classification comparison of agricultural multi-robots [86,87].

It can be seen from Table 5 that the current task allocation methods for agricultural multi-robots are mainly centralized, and most of them are implemented in the simulation experiment.

The centralized task allocation mainly adopts the integer programming method, which describes the nature of the task allocation problem by establishing the objective function and constraints. Integral programming (IP) and mixed-integer programming (MIP) problems are an important branch in the field of operations research, which includes branch and bound method [89], cutting plane algorithm [90], graph theory method, etc [91]. The idea is to determine the transfer method and transfer relationship from one search point to another, and the result is a unique optimal solution, which is suitable for small scale; when the scale is extended, the computation is considerable and the computation time will be greatly increased [92]. Thus, the computation of the algorithm grows exponentially with increases in the number of tasks and robots. In general, it is often difficult to meet real-time requirements in task allocation issues.

Distributed task allocation mainly uses a method based on behavioral motivation and a market-based approach. The former applies to multi-robot systems with strong autonomy, but the method has low system allocation efficiency; the latter has a wide range of applications based on the resource optimization configuration idea of economics, but the difficulty of this research method is how to design the negotiation mechanism and reasonably determine the cost-income models of the task [44].

**Figure 10.** Heterogeneous sensors carried by agricultural multi-robots. (**a**) Typical sensors for aerial robots: inertial measurement unit (IMU), Global Positioning System (GPS), and pressure gauge [28]. (**b**) Typical sensor for ground robots RT1 and RT2: GPS, Laser, and IMU [45].

#### *3.4. Path Planning*

Path planning is the fundamental guarantee for multiple robots to accomplish tasks together. This technology refers to using the known static environment information, or the dynamic environment information obtained by the sensor, to autonomously plan a collision-free optimal path for each robot from a known starting point to a target point, which requires not only a single robot to avoid obstacle but also a plan to satisfy collision avoidance among multiple robots [93]. The path planning methods for single robots to avoid obstacles mainly include traditional methods, intelligent methods, and other methods. The traditional methods include the construction space method, V-Graphic (visibility graph), Voronoi diagram, grid method, A\* algorithm, and artificial potential field [94]. Intelligent methods include the ant colony algorithm, particle swarm algorithm (PSO), reinforcement learning algorithm, immune algorithm, genetic algorithm (GA), neural network, and fuzzy logic algorithm [95]. Other methods include dynamic programming (DP) and optimal control algorithms. Collision avoidance strategies among multiple robots include the priority method, rate adjustment method, traffic management rule, and consultation method.

The path planning method can be divided into centralized path planning and distributed path planning according to the ability classification of path planning [96].

• Centralized path planning

The centralized path planning method uses a centralized control unit to plan the optimal path for swarm robots. This method can improve the ability of "close coordination and optimal coordination" among mobile robots [97]. However, it encounters other problems such as "dimensions", "computational complexity", and "non-deterministic polynomial problem (NP) difficulties" with the increase in the number of robots, task difficulty, and space complexity. In particular, the "NP difficulties" problem [98,99], in theory, has not yet received a simple or fast solution.

Connecting known path points into a line, which is simulated to a topological model, and letting multiple robots walk along their respective paths is one of the simplest and easiest path planning methods to implement. But, this method requires obtaining accurate known point information in advance, which is a large preliminary job and is not suitable for situations with a large number of robots or a large operating area. A method similar to the point-to-point method is the visible map method, which aims to reduce collisions. That is, the information of the edge projection points of each obstacle is obtained in advance, and the robot free walking path, path points that can be combined or disconnected are represented by edges and nodes, respectively. Then the starting node is connected to the target node, or the starting node is connected to the raised point of each obstacle edge until it reaches the target node, forming a multi-robot walking path. Finally, depending on the size of the robot, the path width is increased or decreased appropriately. Similarly, the workload of measuring points is larger and does not apply when there are more obstacles.

Planning multi-robot paths on mapping is another method of global path planning. For example, in the grid method, the map was divided into multiple grid cells, and the paths were extended in eight directions with each grid cell as the center, and the path segments were formed by connecting the center of the grid vertically and diagonally with the centers of other grid cells. To get the globally optimal path in the grid, the A\* algorithm was used to search for the path segment with the lowest travel cost, in which the cost of the free space cell was set to 0, the cost of the cell with obstacles was set to the maximum, and the travel cost was the sum of all grid cells on the travel path segment, and the globally optimal path was set when the sum was the smallest.

Since information about the farmland changes dynamically, multiple robots operating with precision need to re-plan to create multiple paths each time based on different information. This multi-robot path planning problem with time windows has also been solved as a multi-objective optimization problem. As shown in Figure 11, where a multi-robot system including two aerial robots and three ground robots was jointly developed in Spain and other countries [49]. This system adopted a centralized control unit to provide global path planning for the multiple robots on a grid map with weed information, which divides the sequence of operations for the multiple robots in advance. The No dominated Sorting Genetic Algorithm II (NSGA-II) algorithm [100,101] was then used to coordinate the relationship between the distance of the robot travel path, the number of turns, the number of robots, the amount of weed killer used, and the capacity function to obtain an approximate optimal solution between the time and money spent by the multiple robots and the cost of weed treatment. It is a type of genetic algorithm, which mainly focuses on the simulation of crossover, variation, and hereditary phenomena occurring during natural selection and genetic inheritance, incorporating the natural law of superiority and inferiority, and deriving the candidate solutions for each generation based on the results, and finally deriving the optimal solution from the derived candidate solutions. However, this method is more computationally intensive and the experimental results are not suitable for fields that are unstructured or d fields without a fixed column or row lengths.

• Distributed path planning

The distributed path planning method requires little calculation and is robust but exhibits low efficiency and can provide only a suboptimal solution [102,103]. In a fully known environment, it is necessary to consider each robot obstacle avoidance method and collision avoidance strategy among robots [104], that is, selecting a robot for path planning first, then broadcasting its path to other robots, and finally planning paths of other robots by themselves. However, this method is difficult to achieve [97] for large numbers of robots. In an unknown environment, the preferred method is to plan a path for each robot to avoid static obstacles based on neglecting the movement of other mobile robots in the environment and then using the multi-robot collision avoidance strategy to solve the conflict problem among mobile robots.

**Figure 11.** The central controller plans the driving path for the unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) [49]: (**a**) The path planning of two UAVs; (**b**) The path planning of three UGVs.



<sup>1</sup> The A \* algorithm is a direct search method for solving the shortest path in a road network.

At present, there is little research on the distributed path planning of agricultural multi-robots. As shown in Figure 12, Bouzouita et al. [28] developed UAVs to monitor agricultural information in the vineyard. The path planning of the UAVs is based on the grid map and A\*. On the map, the UAV path planning function is constructed according to constraint conditions, such as the number of UAV turns, the number of covered grid visits, and the time to complete the single partition. And using a heuristic algorithm like A\*, the next best node to be expanded is obtained by partying the generation value of each node. Then, the breadth-first search (BFS) algorithm is used to find the local maximum of the function by the distance between the cells. The result is the path from any starting point in the environment to the target unit. The practice shows that the method can find the approximate optimal solution, reduce the possibility of repeated access to the same cell, and facilitate the avoidance of known obstacles. However, it needs to consider the local environmental conditions to find wide applicability.

**Figure 12.** Distributed path planning [28], 1, 2, and 3 are the number of the UAVs, the red line is the planned path of the robot in the delimited area, and the blue line is the actual flight path of the robot.

Combining the above path planning techniques, we summarize the research progress of agricultural multi-robots in path planning methods over the past 10 years, as shown in Table 5.

It can be seen from Table 5 that agricultural mobile robots mainly use ground robots in farmland, and the number of robots is no more than four. The path planning method of multi-robots is mainly conducted in fully known conditions, and the grid method and V-Graphic under centralized planning are the methods most commonly used.

The collision avoidance strategy between agriculture robots usually does not incorporate changes in the path [110], and the obstacle avoidance strategy of a single robot comprises mainly speed adjustment and the priority principle. The path planning of multirobot generally does not consider the presence of obstacles in agricultural production.

When the UGVs turn around at the headland, they need to consider the relationship between the minimum turning radius and the headspace. The general head-turning method has a bulb shape (as shown in Figure 11b), zigzag or U-shape, etc. The U-shape predominates in practice. As shown in Figure 13, the zigzag (forward-reverse-forward) is used in smaller spaces; in contrast, the U-shape turn (turn-straight- turn) is used in larger spaces. However, the difficulty of robot control is increased with the zigzag turning shape, and the task allocation and path planning of the multi-robot are prepared for U-shape turning in advance.

**Figure 13.** Agricultural ground multiple robot head-turning mode: (**a**) Multi-robot turned on the ground with a zigzag way, that is, first forward, then backward, then forward again [47]. (**b**) Multi-robot turned on the ground with a U-shape (GMU is the abbreviation for Ground Mobile Unit) [25].
