For resources 1 and 2 the competing robots are 1 and 3. Since , then robot 1 is supposed to get access before robot 3. For resources 3 and 4 the competing robots are 2 and 3. Since and , then robot 2 will be the second to access resource 3, but the first to access resource 4.
The above setup would work well if each robot required only one resource at a time. However, a robot may require several resources, each with its own two-element queue. With multiple queues, the concept of “being first” is not trivial. We have considered three possibilities:
Options 1 and 2 are simple, but they cause problems when the number of resources is high, as the robot would almost never (option 1) or almost always (option 2) be considered first. This would either cause frequent deadlocks or defeat the purpose of a solution enforcing the movement order. Thus, we have chosen option 3.
With this, we have formulated the task of robot coordination in a continuous 2-D workspace as a discrete optimization problem with decision variables in the form of a single binary vector. As such, our formulation can be seen as a variant of a scheduling problem, for example a job shop scheduling problem, except that a job can be processed on multiple machines at once.
4.1. Feasibility and Deadlock Avoidance
As with other problems concerned with acquiring resources, deadlocks can occur. A deadlock occurs when there is a non-empty set of robots that cannot proceed (are stopped indefinitely) despite not having completed their tasks yet. Deadlocks result in infeasible solutions, so they need to be avoided or resolved. In our problem there are two types of deadlocks, type I and type II. We will describe both, as well as methods of avoiding deadlocks of type I and resolving deadlocks of type II.
Type I is a classic deadlock caused by robots holding resources while waiting for other resources, which leads to the formation of resource-waiting cycles. A simple example involves two robots a and b. First, a acquires one resource and b acquires another. Next, a wants to acquire the resource held by b, while b wants to acquire the resource held by a. As a result, each robot is holding its own resource while waiting for the one held by the other. Thus, no robot will proceed, as the required resource is unavailable, and no resource will be released. We will now show a method for avoiding such deadlocks, based on the method shown in [2].
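The circular wait in the two-robot example can be made concrete with a small sketch. It is only an illustration: the resource names `r1` and `r2`, the dictionaries `holds` and `wants`, and the function `find_wait_cycle` are hypothetical and are not part of the method of [2].

```python
def find_wait_cycle(holds, wants):
    """Return a list of robots forming a circular wait, or None.

    holds: dict mapping robot -> set of resources it currently holds
    wants: dict mapping robot -> resource it is waiting for (or None)
    """
    # Map every held resource to the robot holding it.
    owner = {res: robot for robot, held in holds.items() for res in held}
    for start in holds:
        path, robot = [], start
        # Follow "waits for the holder of" links until they end or repeat.
        while robot is not None and robot not in path:
            path.append(robot)
            robot = owner.get(wants.get(robot))
        if robot is not None:
            # The chain came back to a robot already on the path: a cycle.
            return path[path.index(robot):]
    return None


# Hypothetical instance: a holds r1 and waits for r2, b holds r2 and waits for r1.
holds = {"a": {"r1"}, "b": {"r2"}}
wants = {"a": "r2", "b": "r1"}
cycle = find_wait_cycle(holds, wants)
```

Here `cycle` contains both robots, since each waits for a resource the other holds. If b were not waiting for anything, no cycle would be reported.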
Let us start by introducing some useful notation. Let x be an n-element vector describing the “sector state” of the system, with its a-th element describing the state of robot a. If a is in its i-th sector, then the a-th element of x equals i. It should be pointed out that this numbering is not the same as the overall numbering for sectors. Two special values 0 and are used for when the robot has not started its task yet and for when it has completed its task, respectively. The initial state of the system is . Similarly, the final state is .
As explained earlier, we are not concerned with what happens while a robot moves along a given sector, as this has no effect on resources. We are only concerned with a robot changing sectors, which we describe as an event. Event means that robot a moves to the next sector, changing system state from to . Ideally, event can occur in state x if and only if:
Robot a has not completed its task i.e., .
All resources required for a to enter its next sector are free. Equivalently, this means that in state x no robot is in a sector that conflicts with the sector robot a wants to enter.
The state to which the event leads is safe.
In general, state x is safe if the final state is reachable from it. However, the problem of determining whether a given state is safe is considered NP-hard [31]. Due to this, we will not fully check the safety of x. Instead, we will check a sufficient condition for x being safe. This might lead to us treating some safe states as unsafe, but it is much easier to determine.
First, let us define non-conflicting sectors. A sector i is non-conflicting if and only if it requires no resources, i.e., it is not in conflict with any other sector. Next, the nearest non-conflicting sector for robot a is the first sector, starting from a’s current sector, which is non-conflicting. For this purpose, robots that have not yet started their task and robots that have already completed it are considered to be in their nearest non-conflicting sectors.
With this, for a state x to be safe, it is sufficient to show that there is a “safe sequence” of robots such that the first robot in the sequence moves to its nearest non-conflicting sector while the remaining robots do not change sectors. Next, the second robot in the sequence moves, and so on up to the last one. The resulting state is safe, as all robots are in non-conflicting sectors, meaning all resources are free. Thus, the final state is reachable from the resulting state. Since the resulting state is reachable from x, the final state is reachable from x as well, meaning that x is safe. The procedure for finding a safe robot sequence for state x is as follows:
.
For each robot determine its nearest non-conflicting sector based on current state x.
Find the first robot a in , such that a can move to i.e., either (1) or (2) no robot in state x is in a sector that is in conflict with any sector through .
- (a) If a does not exist, then the algorithm stops and x is not safe.
- (b) Else .
If , then go to step 3, otherwise, state x is safe.
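The procedure above can be sketched in Python under several stated assumptions, none of which are the authors' notation: each robot's route is a list of resource sets (one per sector), a sector is treated as non-conflicting when its resource set is empty, two sectors conflict when they share a resource, the "completed" state is encoded as one past the last sector, and a robot that moves to its nearest non-conflicting sector is removed from the pending set with the state updated accordingly. The names `resources_at`, `nearest_free` and `is_safe` are hypothetical.

```python
def resources_at(route, pos):
    """Resources needed in sector `pos` (1-based); none before start or after finish."""
    return route[pos - 1] if 1 <= pos <= len(route) else frozenset()


def nearest_free(route, pos):
    """First position >= pos whose sector needs no resources (assumed definition)."""
    while resources_at(route, pos):
        pos += 1
    return pos


def is_safe(routes, x):
    """Sufficient (not necessary) safety test: look for a safe robot sequence."""
    x = list(x)                          # work on a copy of the state
    pending = set(range(len(routes)))    # robots not yet placed in the sequence
    while pending:
        for a in sorted(pending):
            target = nearest_free(routes[a], x[a])
            # Resources robot a needs on the way from its next sector to target.
            needed = set()
            for pos in range(x[a] + 1, target + 1):
                needed |= resources_at(routes[a], pos)
            # Resources occupied by all other robots in the current state.
            held = set()
            for b in range(len(routes)):
                if b != a:
                    held |= resources_at(routes[b], x[b])
            if not (needed & held):
                x[a] = target            # a moves; its resources become free
                pending.remove(a)
                break
        else:
            return False                 # no robot can move: treat x as unsafe
    return True


# Hypothetical two-robot instance with resources "r1" and "r2".
routes = [
    [{"r1"}, {"r1", "r2"}, set()],
    [{"r2"}, {"r1", "r2"}, set()],
]
```

With these routes, the state in which both robots occupy their first sectors is reported as unsafe, because each would need the resource the other holds, whereas states in which at most one robot has started are reported as safe.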
The above method solves the issue of type I deadlocks: we avoid them by never entering unsafe states. By doing so, it will always be possible to move all robots to their nearest non-conflicting sectors, which require no resources, meaning all resources will become available.
Next, we consider type II deadlocks. These deadlocks are never caused by resource unavailability alone, but also by the order of robots in the solution. We illustrate this with a simple example. First, we assume there are three robots and no robot has completed its task yet. We consider the three possible events from state x, one for each robot. Let us assume that only the event of robot 2 is possible, i.e., the events of robots 1 and 3 would lead to unsafe states. In that situation, only robot 2 can move. Next, we assume that robot 2 has to acquire resource r, for which it competes with robot 1, and that neither of them has acquired this resource yet. Finally, let us assume that, according to the solution, robot 1 is “planned” to acquire resource r before robot 2. With this, a deadlock is reached. If robot 1 or 3 moves, then we reach an unsafe state, resulting in a type I deadlock. However, robot 2 cannot move either, because it would need to acquire r, while the solution forces it to wait until robot 1 acquires and releases r first. In this case, no robot will proceed, which we describe as a type II deadlock.
We can also illustrate it with the instance from Table 1, except that we need to modify the robot speeds. Let us assume the solution is . Thus, robot 3 (blue), given a high-enough speed, will travel through sectors and and will enter . However, we choose the speeds in such a way that before robot 3 enters , robot 2 (green) will reach the end of and will try to enter . This is, however, impossible, as robot 2 requires resources 3 and 4 for this, but resource 3 is unavailable because robot 3 is still in sector . Robot 2 thus has to wait. However, robot 3 will not move either, as to enter it requires resource 4, but , thus out of robots 2 and 3, the one with the lower number should obtain that resource first. We see that a type II deadlock is not caused by physical resources, but by the solution (or a mix of both).
We considered two approaches to dealing with type II deadlocks. The first option is to detect such deadlocks and consider the solution for which they occurred as infeasible (we will explain the detection as a part of the solution evaluation shortly). However, such an approach is cumbersome for the solving algorithms. For algorithms considering a single solution (e.g., constructive heuristics), if such a solution is infeasible, then the algorithm does not work. For algorithms considering multiple candidate solutions (e.g., metaheuristics), infeasible solutions are also problematic, as they force the algorithm to evaluate meaningless solutions. Metaheuristics also require an initial solution, which should be feasible as well.
Thus, we have adopted a second approach to resolving type II deadlocks: when such a deadlock is detected, we temporarily ignore the order imposed by the solution. Instead, we consider the events of the robots one by one, in order of robot number, and choose the first event that is feasible (i.e., a has not completed its task yet, all required resources are available, and the resulting state is safe). This turns a potentially infeasible solution into a feasible one.
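A minimal sketch of this fallback rule is given below. It is an illustration only: the function `pick_event` and the two predicates `is_feasible` and `respects_order` are hypothetical stand-ins for the feasibility conditions and the queue order imposed by the solution, and the sketch selects a single event rather than simulating concurrent movement.

```python
def pick_event(robots, is_feasible, respects_order):
    """Choose the robot whose event is executed next.

    robots: robot numbers in increasing order
    is_feasible(a): task not completed, resources free, resulting state safe
    respects_order(a): a is first in the queues of the resources it needs
    Returns (robot, order_ignored), or (None, False) if no event is feasible.
    """
    # Normal mode: the event must be feasible and agree with the solution.
    for a in robots:
        if is_feasible(a) and respects_order(a):
            return a, False
    # Type II deadlock: ignore the imposed order, take the first feasible event.
    for a in robots:
        if is_feasible(a):
            return a, True
    return None, False


# Hypothetical situation from the three-robot example: only the event of
# robot 2 is feasible, but the solution ranks robot 1 ahead of robot 2.
choice = pick_event([1, 2, 3], lambda a: a == 2, lambda a: a != 2)
```

In this situation `choice` is robot 2 with the order flag set, which corresponds to the deadlock being resolved by temporarily ignoring the solution.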
4.2. Solution Evaluation
The last issue related to a solution is its evaluation, i.e., a method of transforming a solution into an actual robot movement schedule in order to obtain the task completion times and the makespan. The procedure works by simulating robot movements and is as follows.
Initially, the system state is and time is . The procedure then enters a main loop, which continues until all robots complete their task i.e., until state is reached. In each loop iteration, each robot is assessed to see if it will be able to move in this iteration. For each robot a there are the following possibilities:
If a has already completed its task (i.e., ), then a is ignored and cannot move. We set .
Otherwise, if a is not at the end of its current sector, then a can move and we set to the time it will take a to reach the end of its current sector.
Otherwise, a has to be at the end of the current sector. We check if it is possible for a to enter a new sector (all needed resources are available, the resulting new state is safe and a is considered first in its resource queues). If the access is denied, then the robot cannot move and we set . If the access is granted, then the robot moves to the next sector and:
- (a) If a completed its task, then we set and .
- (b) Otherwise, the situation is similar to case 2: a can move until the end of the newly entered sector, so we set to the time it will take a to reach the end of it.
There is one additional possibility to consider. It might happen that robot a cannot move (e.g., resources are not available), but then another robot changes sectors, freeing the resources and enabling a to move. Thus, the above robot assessment procedure is repeated until no new robots are designated to move (no new values are assigned).
After the assessment is done, we have a set of values , with each value indicating either that a cannot move () or that a can safely move for time inside its sector. Let . Two possibilities can occur.
(i.e., at least one robot can advance). In that case each movable () robot a advances for time (i.e., by distance ). We record in the schedule that a moves in time interval from t to . After all movable robots have advanced, we update the simulation time .
(i.e., no robot can advance). This indicates that a type II deadlock occurred. In this case, we repeat the assessment procedure once more, but this time we ignore solution and the queues, thus at least one robot will be able to move, reducing this to case 1.
After the procedure completes, we obtain for each robot its movement schedule (intervals during which it advances), task completion times , and compute the makespan.
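The time-advance step of the procedure can be sketched as follows. This is a simplified illustration under assumptions: the result of the assessment is passed in as a list of booleans, "cannot move" is encoded as an infinite time (the actual sentinel value is not recoverable from the text here), and the names `advance_step`, `offsets`, `lengths`, `speeds` and `movable` are hypothetical.

```python
import math


def advance_step(offsets, lengths, speeds, movable, t):
    """Advance all movable robots until the first one reaches its sector end.

    offsets[a]: distance robot a has already covered in its current sector
    lengths[a]: length of that sector
    speeds[a]:  speed of robot a (assumed positive)
    movable[a]: True if the assessment allowed a to move
    Returns the new simulation time, or None if no robot can advance.
    """
    # Time each robot needs to reach the end of its sector (inf = cannot move).
    tau = [
        (lengths[a] - offsets[a]) / speeds[a] if movable[a] else math.inf
        for a in range(len(offsets))
    ]
    step = min(tau)
    if math.isinf(step):
        return None                      # no robot can advance
    for a, tau_a in enumerate(tau):
        if math.isfinite(tau_a):
            offsets[a] += speeds[a] * step   # distance = speed * time
    return t + step


# Hypothetical data: robot 0 needs 4 time units, robot 1 needs 1 time unit.
offsets = [0.0, 2.0]
new_t = advance_step(offsets, [4.0, 4.0], [1.0, 2.0], [True, True], 0.0)
```

After the call, robot 1 is at the end of its sector and robot 0 has covered one unit of distance. A return value of `None` corresponds to the case in which the assessment has to be repeated while ignoring the solution and the queues.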
At this point, a careful reader might ask why a different solution representation was not used. Namely, a solution might have multiple resources between the same pair of robots, i.e., robots a and b compete for two different resources. In such a case one could simply assume the same access order for both resources, significantly reducing the solution space. However, there exist instances for which such a “matching” policy is not optimal. For example, consider the instance from Figure 3. Here there are only two resources, but robot 2 acquires them at the same time, while robot 1 acquires them separately. For this instance the possible solutions are , , , and . A simple brute force algorithm results in , , and . Thus, . In general, the idea of such “matching” between some elements of the solution is interesting, but determining which elements can be safely matched is non-trivial.
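To give a sense of how much the “matching” policy would shrink the solution space, the sketch below enumerates both spaces for the four-resource example described earlier (resources 1 and 2 contested by robots 1 and 3, resources 3 and 4 contested by robots 2 and 3). The 0/1 encoding of each element and the names `competitors`, `all_solutions` and `matched_solutions` are assumptions made for illustration.

```python
from itertools import product

# Resource number -> pair of robots competing for it (from the earlier example).
competitors = {1: (1, 3), 2: (1, 3), 3: (2, 3), 4: (2, 3)}


def all_solutions(competitors):
    """Every binary vector with one element per resource."""
    return list(product((0, 1), repeat=len(competitors)))


def matched_solutions(competitors):
    """Binary vectors in which resources shared by the same pair get equal values."""
    pairs = sorted(set(competitors.values()))
    result = []
    for bits in product((0, 1), repeat=len(pairs)):
        choice = dict(zip(pairs, bits))
        result.append(tuple(choice[competitors[r]] for r in sorted(competitors)))
    return result


full = all_solutions(competitors)
matched = matched_solutions(competitors)
```

For this example the full space has 16 vectors and the matched space only 4. Any vector that orders robots 2 and 3 differently on resources 3 and 4 lies outside the matched space, which is why the policy can exclude the optimum.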