Collective Memory: Transposing Pavlov’s Experiment to Robot Swarms

Campo, Alexandre; Nicolis, Stamatios C.; Deneubourg, Jean-Louis

doi:10.3390/app11062632


by Alexandre Campo *, Stamatios C. Nicolis and Jean-Louis Deneubourg

Biological & Artificial Self-Organized Systems Team, Interdisciplinary Center for Nonlinear Phenomena and Complex Systems, Université libre de Bruxelles, Av. F.D. Roosevelt 50, CP231, 1050 Brussels, Belgium

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(6), 2632; https://doi.org/10.3390/app11062632

Submission received: 24 January 2021 / Revised: 6 March 2021 / Accepted: 8 March 2021 / Published: 16 March 2021

(This article belongs to the Special Issue Recent Advances in Swarm Robotics)


Abstract:

Remembering information is a fundamental aspect of cognition present in numerous natural systems. It allows behavior to be adapted as a function of previously encountered situations. For instance, many living organisms use memory to recall whether a given situation incurred a penalty or a reward and rely on that information to avoid or reproduce that situation. In groups, memory is commonly studied in the case where individual members are themselves capable of learning and a few of them hold pieces of information that can later be retrieved for the benefit of the group. Here, we investigate how a group may display memory when the individual members have reactive behaviors and cannot learn any information. The well-known conditioning experiments of Pavlov illustrate how single animals can memorize stimuli associated with a reward and later trigger a related behavioral response even in the absence of reward. To study and demonstrate collective memory in artificial systems, we draw inspiration from the Pavlov experiments and propose a setup tailored for testing our robotic swarm. We devised a novel behavior based on the fundamental process of aggregation with which robots exhibit collective memory. We show that the group is capable of encoding, storing, and retrieving information that is not present at the level of the individuals.

Keywords:

collective memory; swarm robotics; swarm intelligence; adaptive complex systems

1. Introduction

The study and design of collective behaviors in the context of complex systems may favor future developments in a multitude of applications. A relatively well known example is swarm robotics, a field that focusses on the design and implementation of groups of robots capable of taking advantage of their number to perform designated tasks. Desired properties of these swarms include scalability, and increased robustness compared to single robots [1,2]. Moreover, swarm robotics partly takes inspiration from collective behaviors observed in biological systems and self-organized systems, with a focus on decentralized implementations that use few behavioral rules [3,4] and favor local and situated communication [5,6].

Another field that may strongly benefit from a better understanding of collective behaviors is synthetic biology, in which new steps are taken at a fast pace. Indeed, among other possibilities, this field holds the promise of building biological machines, that is, robots near the molecular scale, which opens up vast perspectives and possibilities [7,8]. In these systems as well, the question of how to effectively coordinate large groups is crucial. By nature, biological systems are effective at replicating material, for instance producing large numbers of cells or other biological material, and their operating mechanisms can be rather complex. However, increased complexity inevitably comes with a cost in terms of design, building time, robustness, and required materials. It is therefore necessary to investigate and build a repertoire of fundamental collective behaviors that may be reused, adapted, and combined to design and implement such complex systems.

In this paper, we focus on the cognitive capability of memorizing information in groups. Memory is a common property that allows a system to adapt its behavior based on its past experience. There are different ways to implement a collective memory, with the most obvious one having all individuals retain the same information. This approach is robust, but it involves large redundancy that implies more complexity at the individual level. A variation of this approach that can reduce the implementation burden is to have only a fraction of individuals retain all or fragments of information. Another approach is stigmergy [9,10]: the group modifies its environment, adding marks that can later be exploited. This is a sort of external memory which has been successfully exploited in Nature (for instance, ants that deposit pheromones to maintain foraging paths), and much less in artificial systems. Here, we propose a new mechanism to implement collective memory that does not rely on individual memory nor on the modification of the environment. Instead, the information is stored in the spatial configuration adopted by the group. To study and demonstrate this collective behavior, we introduce an experiment based on Pavlov classical conditioning that has been adapted for testing a robotic swarm.

The Pavlov classical conditioning experiment [11] is a simple and effective demonstration of how animals can learn to associate different stimuli. In its best known version (see Figure 1), a dog is repeatedly presented food together with the sound of a bell, and it learns to associate these two stimuli. After some training, the dog reacts to the sole bell sound and starts to salivate even though it sees no food. The food is an unconditional stimulus (US) that triggers salivation by anticipation, which is an unconditional response (UR) from the dog. Initially, the dog salivates when presented food, and it does not display any particular response when it hears the bell sound. Only after training has the dog learned to associate the two stimuli, so that it may display a salivation response when hearing the bell sound, which is thus referred to as the conditional stimulus (CS).

This experiment raises interest in studying memory in groups because it provides a simple and straightforward method to demonstrate basic learning capabilities. The experiment can be divided into two main phases: first, a training phase in which the subject of the experiment learns to associate two different stimuli, that is, memorize simple information; second, a testing phase in which the subject must recall the information and use it to produce an adapted response.

In the following, we present the robots used, the experimental setup we devised, and detail how we transposed the Pavlov conditioning experiment to groups of robots, in order to test collective memory. A new collective behavior is introduced that can encode information in the group and extract the information when needed. The specificity of this behavior is that the group can display memory as a cognitive capability, although individual members have a purely reactive behavior and do not memorize any information. We then report results of simulated experiments that demonstrate collective memory and finally we discuss the future perspectives of this group behavior.

2. Experimental Setup

2.1. The Robots

In this experiment, we have simulated underwater autonomous robots, modeled after the aFish (and its predecessor, the Jeff robot), which have been created respectively in the subCULTron and CoCoRo EU projects [12,13]. These robots are able to detect target objects using a camera or using other signals such as acoustics or modulated light. To remain in a delimited area and not spread out in the open, the robots can rely on acoustic signals from a beacon at the surface. Moreover, robots can perceive each other using LEDs that produce light signals at short range (perceived in one meter range in daylight conditions). Robots can also perceive a stimulus from an experimenter, using a dedicated acoustic signal. Finally, the robots can navigate in three dimensions in the water body using three thrusters, two at the back to provide forward motion and one near the front that provides lateral motion. They can also regulate their underwater depth using a buoyancy system that operates by moving a piston against a rubber membrane, thereby modifying the overall volume of the submersed robot.

Figure 2 shows the robot that is simulated and summarizes the different sensors and actuators that are relevant in this experiment. It would have been possible to consider other robots to perform this experiment, for instance, wheeled robots moving in two dimensions, with the ability to advertise themselves and perceive local neighbors, and perceive additional stimuli in the environment such as a target object and signals from an experimenter.

2.2. The Setup

The setup is described in Figure 3. The simulated experiments take place inside a circular pool (12.5 m diameter), with a surface beacon maintained in the center. The surface beacon is used as an aggregation device that periodically emits acoustic signals. The robots detect and use these signals to aggregate under the beacon to remain in range and stay together. The beacon is perceived by robots in a range of 1.75 m. During the experiments, different stimuli can be sent to the robots using acoustic messages. These stimuli are perceived at once by all the robots. The unconditional stimulus (US) produces an unconditional response from the robots (UR), but the conditional stimulus is not a priori connected to a specific response. Experiments are carried out with groups of 30 robots in total.

2.3. The Learning and Testing Phases

The Pavlov experiment relies on two main stimuli, the unconditional stimulus (food) that triggers an unconditional response (salivation by anticipation), and a conditional stimulus (bell sound) which does not a priori elicit a specific reaction from the subject. In a first training phase, the subject is presented the unconditional stimulus together with the conditional stimulus. Later, in a testing phase, the subject is only presented the conditional stimulus. When the subject has learned to associate the two stimuli together, it reacts to the conditional stimulus and produces what is called a conditioned response. As a control, in the absence of training, the conditional stimulus does not elicit the conditioned response.

The Pavlov experiment is rather straightforward to adapt to a group of robots, starting by defining the specific stimuli that the robot can use. The unconditional stimulus is a target (implemented here as an acoustic message) that robots are tasked and rewarded to detect. When robots make a successful detection, they provide a positive response and advertise the finding with their status LEDs, displaying the equivalent of the dog’s salivation. The conditional stimulus (CS) is also an acoustic message that can be perceived by all the robots at once when triggered. This stimulus has no specific reward associated.

Figure 4 shows the unfolding of a collective memory experiment with a group of robots. The experiment starts with the initial state of the robots which are randomly scattered in the pool. In the training phase, the robots are presented either the unconditional stimulus alone or the unconditional and the conditional stimuli together. During that time, the robots move with a random walk and may stop when they are in range of the central beacon, thus aggregating together. In the testing phase, the robots are presented only the conditional stimulus and their response is observed. The robots are expected to collectively respond to this stimulus if they have learned to associate it with the unconditional stimulus. They advertise their response using their status LEDs. If the conditional stimulus was not presented jointly with the unconditional stimulus during the training phase, the robots are expected to provide a negative response with their status LEDs.

3. Robot’s Behavior

3.1. Encoding Information in the Collective Memory

The behavior of the robots is purely reactive [4] and does not require any memory to be implemented in the individual robots. As previously hinted, we implement the memory of the group in the spatial structure formed by the robots. In this experiment, there is only one bit of information to encode: whether to associate two stimuli or not. We therefore selected a simple method to produce two spatial configurations to be discriminated. The basic idea is to divide the group into two teams, with red and yellow robots, for instance, and let the memory bit be 0 when robots are in a mixed state, and be 1 when robots are in a segregated state. In the mixed state, each aggregated robot has on average the same number of red and yellow neighbors. In the segregated state, each aggregated robot has on average a majority of neighbors sharing its own color. The coloring is a fixed characteristic attached to each robot, and never changes (e.g., the robots are painted red or yellow). Figure 5 summarizes the two spatial configurations that are used to encode the single bit collective memory of the group.

To encode information in the collective memory, we rely on the fundamental behavioral component of aggregation, inspired by the behavior of cockroaches that gather under dark shelters [14]. We define aggregation as the physical gathering of individuals in a cluster, such that individuals in the aggregate can perceive or detect their immediate neighbors. As reported in [14] and adapted to robot swarms in subsequent literature [15,16,17], the collective behavior of aggregation can be implemented by having robots move randomly in their environment and stop when they encounter one or more other robots when in the range of an aggregating device (here, the central beacon, which is equivalent to the shelter in experiments involving cockroaches). Moreover, robots have a probability Q to leave and resume motion that depends on the local density of robots $D = X / \kappa$, with $X$ the number of robots under the shelter and $\kappa$ the carrying capacity of the shelter. This departure rule prevents small clusters of two or three robots from being as stable as larger ones, hence leading the largest aggregate to eventually attract all the robots.

We extend this behavior to achieve mixed or segregated aggregation states with a group of robots made of two teams, red and yellow. When robots are aggregated, they periodically make a decision with probability Q whether to remain static or to resume motion based on the number of local neighbors, discriminating the red ones $X_{red}$ and the yellow ones $X_{yellow}$, or more generally, the neighbors sharing the same color $X_{S}$, and the ones with opposite color $X_{O}$. If robots indistinctively take into account all neighbors, red or yellow, the group eventually aggregates in a mixed configuration because they have on average the same probability to encounter robots of either color. However, if robots only take into account local neighbors that have the same color as themselves, they will tend to remain with similarly colored robots, and leave robots of opposite color, eventually leading to a segregated configuration.

The probability Q of robots to leave an aggregate and resume motion is calculated as follows:

$$Q(\alpha) = \frac{\theta}{1 + \rho \left( \frac{X_{S} + \alpha X_{O}}{\kappa} \right)^{2}}, \tag{1}$$

where parameters $\theta$ and $\rho$ determine the minimum and maximum probabilities of leaving the beacon area, and $\alpha$ is a parameter that represents the affinity with which robots aggregate with other robots of the opposite color. Moreover, we used parameter $\kappa$ as a normalization coefficient. Based on preliminary tests, we have fixed $\theta = 0.9$, $\rho = 100$, and $\kappa = 40$ (note that robots only estimate neighbors by counting perceived signals, therefore $\kappa \neq 30$). In this paper, the robots are blind ($\alpha = 0$) to opposite colored robots when the unconditional and conditional stimuli are perceived together, leading to the encoding of information via a segregated spatial configuration. Otherwise, robots do not discriminate their neighbors by color ($\alpha = 1$) and aggregate in a mixed spatial configuration. In the following, we use the notation $P_{seg} = Q(0)$ to indicate that robots aggregate in a segregated configuration, and $P_{mix} = Q(1)$ to indicate that robots aggregate in a mixed configuration.
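As an illustration, Equation (1) can be evaluated directly. The following minimal Python sketch uses the parameter values reported above; the function name `leave_probability` and the example neighbor counts are hypothetical and only serve to show how $\alpha$ changes the outcome.

```python
def leave_probability(same, opposite, alpha, theta=0.9, rho=100.0, kappa=40.0):
    """Probability Q(alpha) of leaving the aggregate, following Equation (1).

    same, opposite: perceived neighbors of the same / opposite color.
    alpha: affinity for opposite-colored robots (0 = blind, 1 = no discrimination).
    """
    density = (same + alpha * opposite) / kappa
    return theta / (1.0 + rho * density ** 2)


# A robot that perceives no neighbor leaves with the maximum probability theta.
q_alone = leave_probability(0, 0, alpha=1.0)

# Illustrative (made-up) situation: 5 same-colored and 5 opposite-colored neighbors.
q_mix = leave_probability(5, 5, alpha=1.0)  # all 10 neighbors are counted
q_seg = leave_probability(5, 5, alpha=0.0)  # the 5 opposite-colored ones are ignored

print(q_alone, q_mix, q_seg)
```

With these illustrative counts, a robot that ignores opposite-colored neighbors ($\alpha = 0$) perceives a lower density and is therefore more likely to leave than a robot that counts all of its neighbors ($\alpha = 1$).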

To study all the possible configurations adopted by the robots with this behavior, we have analyzed a mean-field mathematical model that we hereby introduce. As a means to describe the mixed and segregated spatial configurations under a single beacon, we have divided the aggregation zone (the range of the beacon) into two areas, named $a_{1}$ and $a_{2}$. When robots are mixed, they indifferently occupy any location under the beacon, but, when they are segregated, they spatially divide into two teams that each occupy an area on its own. As a consequence, the model uses four variables ($X_{S, 1}$, $X_{O, 1}$, $X_{S, 2}$, and $X_{O, 2}$) that describe populations of robots by their type (S for self, and O for opposite), and the area in which they reside (1 or 2). To limit any potential confusion, we would like to stress that the division of the aggregation zone into two areas is a pure abstraction introduced to allow expressing and studying the robots’ behavior with a mathematical model.

One can write the rates at which robots of type S and O move from area $a_{2}$ to area $a_{1}$ as:

$$\begin{aligned} \Phi_{S, 2 \to 1} &= \frac{\theta \gamma_{1}}{1 + \rho \left( X_{S, 2} + \alpha X_{O, 2} \right)^{2}} \\ \Phi_{O, 2 \to 1} &= \frac{\theta \gamma_{1}}{1 + \rho \left( \alpha X_{S, 2} + X_{O, 2} \right)^{2}}, \end{aligned} \tag{2}$$

where $\gamma_{1} = 1 - \frac{X_{S, 1} + X_{O, 1}}{\kappa}$ represents the saturation of area $a_{1}$, that is, the degree to which it can still accept new robots. Moreover, in the mathematical model, $\kappa$ represents the carrying capacity of each area, so that the whole aggregation zone can contain at most $2 \kappa$ robots. This formulation only considers robots’ exchanges between the two areas: individuals can instantly move from one to the other, and the mathematical description of their transit outside the aggregation zone and back to it is neglected.

We can therefore describe the dynamics of the robots in the aggregation zone as a sum of incoming and outgoing flows using the following differential equations:

$$\begin{aligned} \frac{d X_{S, 1}}{d t} &= - X_{S, 1} \cdot \Phi_{S, 1 \to 2} + X_{S, 2} \cdot \Phi_{S, 2 \to 1} \\ \frac{d X_{O, 1}}{d t} &= - X_{O, 1} \cdot \Phi_{O, 1 \to 2} + X_{O, 2} \cdot \Phi_{O, 2 \to 1}, \end{aligned} \tag{3}$$

with the dynamics of robots in area $a_{2}$ deduced, by conservation of the number of robots, as $d X_{S, 2} = - d X_{S, 1}$ and $d X_{O, 2} = - d X_{O, 1}$. This mean-field model is further detailed and investigated in [18].
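The flows of Equations (2) and (3) can be integrated numerically to explore the model. The sketch below is a minimal forward Euler integration in Python, not the analysis carried out in [18]. It assumes that the reverse rates $\Phi_{S, 1 \to 2}$ and $\Phi_{O, 1 \to 2}$ are obtained from Equation (2) by swapping the indices 1 and 2, with $\gamma_{2} = 1 - (X_{S, 2} + X_{O, 2}) / \kappa$. The initial state, time step, number of steps, and function names are hypothetical.

```python
def derivatives(state, alpha, theta=0.9, rho=100.0, kappa=40.0):
    """Right-hand side of Equation (3) for state = (X_S1, X_O1, X_S2, X_O2)."""
    xs1, xo1, xs2, xo2 = state
    # Remaining room in each area (gamma_2 is assumed by symmetry with gamma_1).
    gamma1 = 1.0 - (xs1 + xo1) / kappa
    gamma2 = 1.0 - (xs2 + xo2) / kappa
    # Rates from area 2 to area 1, as written in Equation (2).
    phi_s_21 = theta * gamma1 / (1.0 + rho * (xs2 + alpha * xo2) ** 2)
    phi_o_21 = theta * gamma1 / (1.0 + rho * (alpha * xs2 + xo2) ** 2)
    # Rates from area 1 to area 2 (assumption: indices 1 and 2 swapped).
    phi_s_12 = theta * gamma2 / (1.0 + rho * (xs1 + alpha * xo1) ** 2)
    phi_o_12 = theta * gamma2 / (1.0 + rho * (alpha * xs1 + xo1) ** 2)
    dxs1 = -xs1 * phi_s_12 + xs2 * phi_s_21
    dxo1 = -xo1 * phi_o_12 + xo2 * phi_o_21
    # Conservation: what leaves area 1 enters area 2, and vice versa.
    return (dxs1, dxo1, -dxs1, -dxo1)


def integrate(state, alpha, dt=0.1, steps=1000, **params):
    """Forward Euler integration of the mean-field model."""
    for _ in range(steps):
        rates = derivatives(state, alpha, **params)
        state = tuple(x + dt * dx for x, dx in zip(state, rates))
    return state


# Hypothetical start: team S slightly over-represented in area 1, team O in area 2.
start = (10.0, 5.0, 5.0, 10.0)
end = integrate(start, alpha=0.0)
print(end)
```

With $\alpha = 0$, the initial imbalance of this example grows over time, since each team leaves the area where it is under-represented faster than the one where it is over-represented. A fixed-step Euler scheme is used only for brevity; an adaptive solver would be preferable for a systematic study.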

3.2. Decoding Information from the Collective Memory

To produce a collective response to the conditional stimulus, the group of robots needs to evaluate its collective memory and to determine whether it has associated the conditional stimulus with the unconditional one. In practice, this means that the group must find out whether it is aggregated in a mixed or segregated state. Intuitively, each robot can count the number of surrounding neighbors of red and yellow colors. In a segregated state, one might expect all the neighbors to be of the same color, while, in a mixed state, one would expect a balanced share of both colors. We name R the recall coefficient, which is an estimate of the group’s spatial configuration made by a single robot based on its local observations. It is formalized as:

$$R = \frac{X_{S} - X_{O}}{X_{S} + X_{O}}, \tag{4}$$

with $X_{S}$ the number of signals received from neighbors with the same color, and $X_{O}$ the number of signals received from neighbors with the opposite color. When $X_{S}$ and $X_{O}$ tend to the same value, as is the case in a mixed configuration, the recall coefficient R tends to 0. In a segregated configuration, $X_{S} \gg X_{O}$, $X_{O}$ tends to 0, and R tends to 1. In reality, robots may not always have a clear-cut local perception, where all the neighbors belong to the same team, or with a perfectly balanced number of red and yellow neighbors. Therefore, in the following section, we investigate $R_{T}$, the ideal threshold at which a robot should consider it is in a segregated configuration in order to minimize false positives or false negatives.
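The local estimate of Equation (4) and its comparison with a threshold can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the default threshold of 0.3 is the value reported in the Results section, and both the strict comparison and the handling of a robot that perceives no signal at all are assumptions, since the text does not specify them.

```python
def recall_coefficient(same, opposite):
    """Recall coefficient R of Equation (4), from counts of perceived signals."""
    total = same + opposite
    if total == 0:
        return None  # assumption: the estimate is undefined without any signal
    return (same - opposite) / total


def initial_opinion(same, opposite, threshold=0.3):
    """True if the robot locally estimates a segregated configuration."""
    r = recall_coefficient(same, opposite)
    # Assumption: strict comparison, and "mixed" by default when R is undefined.
    return r is not None and r > threshold


print(recall_coefficient(6, 6))  # balanced neighborhood: R = 0
print(recall_coefficient(8, 0))  # only same-colored neighbors: R = 1
print(initial_opinion(7, 3))     # R = 0.4, above the threshold
```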

We investigated the perception of signals emitted by neighboring robots in mixed and segregated configurations by monitoring the perception of all the aggregated robots in repeated trials (1000 replications, 30 robots, for a total of 60,000 observations analyzed). In Figure 6, we report the average perception observed in mixed aggregates, while, in Figure 7, we report the average perception in segregated aggregates. Data clearly indicate that mixed aggregates yield a very similar distribution of signals from both teams, and segregated aggregates yield a significantly higher count of signals from the self team. Hence, the two configurations can in principle be discriminated using the recall coefficient R. However, the reliability of this procedure depends on the number of observations and increases with it. Individual robots estimate the aggregation configuration by measuring local signals within a limited time window, which reduces the number of observed signals and increases the impact of noise and randomness on the recall coefficient.

To produce a homogeneous collective response from the group, we can strongly reduce the impact of noise and random events by using a decision-making phase, also called a quorum, in which each individual shares and updates its opinion in a peer-to-peer fashion so as to converge to a common decision, a consensus that averages out the various random fluctuations of individual estimates. The quorum phase is implemented by having the aggregated robots periodically advertise their opinion, at a rate of 3.3 signals per second on average. They also observe signals from their neighbors during time windows of 10 s and update their own opinion to the majority (including their own). This simple algorithm and its derivatives are well studied in the literature [19,20,21,22], and it has been shown to converge to a coherent decision in the group, in favor of the opinion that is initially most represented.
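The majority rule described above can be illustrated with a short Python sketch. It is a simplified abstraction rather than the controller running on the robots: updates are synchronous, each robot samples a fixed number of opinions uniformly from the rest of the group instead of listening to spatial neighbors during a time window, and ties leave the opinion unchanged. These choices, as well as the names `majority_update` and `quorum`, are assumptions made for the example.

```python
import random


def majority_update(own, observed):
    """New opinion of a robot: majority of the observed opinions and its own."""
    votes = list(observed) + [own]
    yes = sum(votes)
    no = len(votes) - yes
    if yes == no:
        return own  # assumption: keep the current opinion on a tie
    return yes > no


def quorum(opinions, neighbors_per_round=6, rounds=50, seed=0):
    """Synchronous majority-rule rounds until unanimity or the round limit."""
    rng = random.Random(seed)
    opinions = list(opinions)
    n = len(opinions)
    for _ in range(rounds):
        if len(set(opinions)) == 1:
            break  # the whole group already shares one opinion
        updated = []
        for i in range(n):
            others = [j for j in range(n) if j != i]
            sample = rng.sample(others, neighbors_per_round)
            updated.append(majority_update(opinions[i], [opinions[j] for j in sample]))
        opinions = updated
    return opinions


# Hypothetical start: 21 of 30 robots initially estimate "segregated" (True).
final = quorum([True] * 21 + [False] * 9)
print(sum(final), "robots out of", len(final), "answer 'segregated'")
```

Sampling an even number of other robots gives an odd number of votes once the robot's own opinion is included, so ties cannot occur with the default setting.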

3.3. Behavior Implemented

The behavior executed by the robots is purely reactive, and depends only on the signals perceived at the present time. We describe this behavior in Figure 8 with a decision tree, which is well suited to this case because it clearly shows the decision path taken to execute particular subroutines. As a technical side note, the specific implementation on these robots requires short windows of time during which signals are counted. This is because robots use modulated light messages to identify their neighbors and to exchange opinions during a quorum. These messages are corrupted if they overlap spatially and temporally. To reduce the chances of message collisions, robots do not emit constantly, but rather emit with a given probability. Therefore, robots must count signals during a small time window to perceive local neighbors. This aspect is specific to our robot design, and would not be relevant if, for instance, robots were painted in two different colors and were using cameras to detect their neighbors. More broadly, any type of robot capable of aggregation and of discriminating two types of neighbors would be suited for this experiment.

4. Results

We start by analyzing the mean-field model defined by Equation (3) and its steady states (see [18] for details). The model possesses four types of solutions:

a homogeneous solution (referred to in the sequel as a mixed configuration), in which all four variables are equal ( $X_{S, 1} = X_{O, 1} = X_{S, 2} = X_{O, 2}$ ),
four semi-homogeneous solutions defined by $X_{S, 1} = X_{O, 1}$ (referred to in the sequel as a default mixed configuration) and by $X_{S, 1} = X_{O, 2}$ (referred to in the sequel as a segregation configuration),
and four inhomogeneous solutions defined by $X_{S, 1} \neq X_{S, 2} \neq X_{O, 1} \neq X_{O, 2}$ .

Figure 9A shows a bifurcation diagram of the steady solutions of Equation (3) with fixed $\kappa$ and $\rho$ values and varying $\alpha$. As can be seen, for increasing values of $\alpha$, the system switches from a segregated configuration, where each team $X_{S}$ and $X_{O}$ aggregates respectively in one area, to a mixed configuration, where robots of each team are equally distributed either in both areas, or all in a single area (default mixed configuration).

As a side note, the default mixed configuration may seem to be an artifact of the mathematical model since the division of the aggregation zone in two areas is abstract and only serves the purpose of mathematical description. However, this particular solution can be interpreted as the result of competition between different early aggregates. If, by chance, robots from the opposite team encounter and start forming an aggregate earlier than robots of the same team, they offer an alternative aggregation site that may eventually capture all the robots of each team, thereby producing a mixed configuration by default. Notice that this state may only be observed when $\alpha > 0$, which is not the case when robots are segregating in our experiments.

Figure 9B shows a state diagram against parameters $\kappa$ and $\alpha$ for a particular value of $\rho$ for which all the states can be observed. In particular, we see that increasing the carrying capacity $\kappa$ (the size of the aggregation zone) favors the mixed configurations for large values of $\alpha$. For small values of $\alpha$ and $\kappa$, the segregated configuration dominates. Finally, when $\alpha$ is small but $\kappa$ is large, we observe the coexistence of mixed and segregated configurations.

All in all, the model described by Equation (3) shows non-trivial features such as different aggregation configurations and coexistence between them. Of special interest is the segregated configuration, which is not induced by any agonistic behavior but only by the finite size of the aggregation zone, since the individuals of the two teams interact in the same way.

In the following, we focus on results obtained with individual based simulations. Figure 10 shows snapshots of simulated experiments. In the training phase, two different conditions are tested: the robots are either exposed to the unconditional and the conditional stimuli together, or the unconditional stimulus alone. During that phase, we observe that robots exposed to the unconditional stimulus only form a single cluster under the central beacon and aggregate in a mixed state, disregarding the color of their neighbors. When the two stimuli are produced, the robots successfully aggregate in a segregated spatial configuration. In the subsequent testing phase, the robots are exposed to the conditional stimulus alone and their response is tested. Robots first make an initial opinion about the local configuration and then proceed to make a collective decision to reach quorum using peer to peer opinion exchange and update. At that time, robots display their current opinion using black or white color, for mixed or segregated configuration, respectively. We observe that, depending on their past experience, the group of robots reacts positively to the stimulus only if it was previously perceived in association with the unconditional stimulus.

In a first series of experiments, we investigate $R_{T}$, the threshold applied to the recall coefficient R, at which a robot considers that it is in a segregated configuration from a local point of view. More precisely, this threshold is used by the robots to decide whether the recall coefficient based on their local perception indicates segregation. Robots first calculate the coefficient and use the threshold to form their initial opinion, and then enter the quorum phase to elect the majority opinion. If this threshold is set too high, robots will only consider themselves segregated when they do not perceive any member of the opposite team. However, because segregated teams are gathered at the same beacon, they are in close proximity and may sometimes perceive each other. On the other hand, if the threshold is set too low, a slight imbalance in the number of neighbors from each team may lead a robot to decide it is in a segregated configuration. Thus, the threshold $R_{T}$ must be adjusted to minimize the risk of false positives where the group mistakenly perceives its configuration as segregated, and false negatives where the group fails to detect a segregated configuration.

In Figure 11, we report the impact of $R_{T}$ on the robots’ quorum answer, depending on the mixed or segregated configuration of the group. Robots aggregate under the beacon either in a mixed configuration (leaving the aggregate with probability $P_{mix}$), or in a segregated configuration ($P_{seg}$). After a period of 2500 s, sufficient to have the whole group aggregated in the range of the beacon, the robots form an initial opinion about the group configuration and enter a quorum phase. The quorum phase also lasts 2500 s to ensure that the whole group has converged to a common decision. We explored $R_{T}$ values from 0 to 0.6 with a 0.05 increment. For each tested value, we performed 1000 simulations in both conditions, with mixed or segregated aggregation. We measured the proportion of experiments that ended with an incorrect decision of the group, which are false positives in simulations where the aggregated robots are in a mixed configuration, and false negatives where the robots are in a segregated configuration. We observe an optimal value at $R_{T} = 0.3$ that minimizes the risk of error when evaluating the configuration, mixed or segregated, adopted by the group.

With the optimized setting of the $R_{T}$ parameter, we have run two sets of experiments with 1000 simulations involving 30 robots for each condition. The results reported in Figure 12 show data obtained from the testing phase of the experiments, after robots are aggregated and right when the conditional stimulus alone is produced. We observe that the quorum process always converges strongly, starting with about 70% of the robots sharing the same opinion about their current spatial configuration, and ending with the whole group electing the majority opinion. While the group always ends up making a collective decision, it may nevertheless collectively produce an incorrect response. For instance, in the first condition, the robots are exposed to only the unconditional stimulus during the training phase. They are thus expected not to learn any association between the two stimuli and, when exposed to the conditional stimulus in the testing phase, they should produce a negative response and not recall information. We observe that, in the first condition, robots have not memorized the association in 79% of the trials, but produce a mistaken response 21% of the time. In the second condition, in which robots are expected to learn the association, we observe a successful recall of the information in 82% of the trials, and 18% in which robots do not recall the information. Results are statistically significant (binomial test, p-value < 0.001).
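To give a sense of the order of magnitude behind such a significance statement, the following Python sketch computes an exact one-sided binomial tail probability with the standard library. The null hypothesis used here, a group that answers correctly with probability 0.5, is an assumption made for illustration, since the text does not state which null hypothesis was tested. The counts 790 and 820 correspond to the reported 79% and 82% of 1000 trials.

```python
from fractions import Fraction
from math import comb


def binomial_tail(successes, trials, p=Fraction(1, 2)):
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    q = 1 - p
    return sum(
        comb(trials, k) * p ** k * q ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed null hypothesis: correct collective answers occur by chance (p = 0.5).
p_value_control = binomial_tail(790, 1000)  # first condition, 79% correct
p_value_recall = binomial_tail(820, 1000)   # second condition, 82% correct

print(float(p_value_control) < 0.001, float(p_value_recall) < 0.001)
```

Exact rational arithmetic avoids the underflow that floating-point terms of this size would cause; under the assumed null hypothesis, both tail probabilities fall far below 0.001.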

5. Conclusions

We introduced a novel collective behavior that allows groups to memorize simple information, although the individual members of the group do not use any memory. Instead of relying on the individuals, the information is encoded and stored in the spatial configuration of the group when it is aggregated.

The seminal Pavlov experiment was adapted to work with groups of robots, providing a simple and straightforward method to demonstrate the collective cognitive capability of memorizing information. We detailed a specific implementation of the behavior for a swarm of underwater robots, including a method to produce two different types of robot aggregates which represent a single bit of memory. This was achieved by dividing the group into two colored teams that can cluster in a mixed or in a segregated configuration. To put the collectively stored information to use, we introduced a local estimate, the recall coefficient, that allows each individual member to form an opinion about the state of the collective memory. The whole group then processes all opinions with a quorum phase in which we have shown that all the robots converge to a collective decision that follows the information stored in the collective memory. We have shown that the proposed behavior successfully recalls information in 82% of our repeated experiments ($n = 1000$).
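The convergence of individual opinions to a single collective decision can be illustrated with a generic majority-rule model. This is only a sketch in the spirit of majority-rule opinion dynamics, not the quorum procedure implemented on the robots: the group size of 3, the well-mixed random sampling of peers, and the names `majority_rule_quorum` and `max_steps` are all assumptions made for illustration.

```python
import random


def majority_rule_quorum(opinions, rng, group_size=3, max_steps=100_000):
    """Repeatedly let a random group of robots adopt its local majority opinion.

    Opinions are 0 or 1. Returns the final opinions and the history of the
    fraction of robots holding opinion 1. Stops early once all robots agree.
    """
    opinions = list(opinions)
    history = [sum(opinions) / len(opinions)]
    for _ in range(max_steps):
        if len(set(opinions)) == 1:
            break  # consensus reached
        members = rng.sample(range(len(opinions)), group_size)
        votes = sum(opinions[i] for i in members)
        majority = 1 if 2 * votes > group_size else 0
        for i in members:
            opinions[i] = majority
        history.append(sum(opinions) / len(opinions))
    return opinions, history


rng = random.Random(0)           # fixed seed for reproducibility
initial = [1] * 21 + [0] * 9     # 70% of 30 robots share the same opinion
rng.shuffle(initial)
final, history = majority_rule_quorum(initial, rng)
print(history[0], history[-1], len(history) - 1)
```

In this toy model the initial majority usually wins, but not always, which mirrors the observation that the group always reaches a decision while that decision is occasionally incorrect.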

The results show that, with relatively simple agents and simple behavioral rules, a group can have a memory of its own, independently of the internal state of its members. This first experiment is rather simple and involves a single-bit memory used to learn the association of two stimuli. Several directions may be considered to investigate how more information may be stored: it would be interesting to introduce more teams, and also multiple aggregation sites. Encoding more information may also be achieved with a larger palette of spatial configurations, but this might in turn involve higher individual complexity in order to create the configurations and to discriminate them effectively. With only one aggregate, the two basic mixed and segregated states, and three teams, we may already encode up to 12 different configurations. Another point of potential improvement is the accuracy of storing and recalling information. We suggest that two aspects of the experiments mainly play a large role. First, the quality of individual perception, because it directly impacts the ability of the robots to perceive their spatial configuration, and therefore what information is stored. Second, the randomness that is inherent in this work (in particular when robots move using a random walk, and when they make a decision to remain stopped), which causes the spatial configurations themselves to be less accurately defined. The impact of random events may be strongly reduced by working with larger group sizes. For instance, when segregated, a smaller fraction of the teams will perceive both colors.

The collective behavior introduced in this paper may seem complicated for a task that is routinely achieved by most of the electronic devices in existence. However, we believe it bears significant interest in two cases: when the cost of implementing memory in an individual is high, for instance in synthetic biology with molecular machines and engineered bacteria, or in more elementary biochemical systems (to produce cognitive capabilities such as habituation or solve the detour problem), and when studying the behavior of groups with collective memory properties that may remain hidden when examining the individual members separately.

Author Contributions

Conceptualization: A.C. and J.-L.D.; simulations: A.C.; mathematical modeling: S.C.N. and J.-L.D.; results: A.C., S.C.N. and J.-L.D.; writing: A.C. and S.C.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the EU project CoCoRo, No. 270382, call FP7-ICT-2009-6, and the RoboCoenosis project, No. 899520, H2020-EU.1.2.1.—FET Open program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available here: https://github.com/AlexandreCampo/CollectiveMemory (accessed on 12 March 2021).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figure 1. Summary of the different phases of the Pavlov conditioning experiment with a dog. (A) Initially, the dog is untrained and, when presented the conditional stimulus (CS, the bell ring), it shows no reaction. When presented the unconditional stimulus (US, food), it starts to salivate by anticipation (UR, the unconditional response). (B) In the training phase of the experiment, the dog is presented food and at the same time it hears the bell ring (US + CS training). (C) In the testing phase, no food is presented to the dog, but the bell ring alone triggers its salivation (CR, conditioned response). Hence, with training, the dog learns to associate the two stimuli together and eventually reacts positively to the bell ring even in the absence of food.

Figure 2. The aFish robot from the subCULTron project serves as a model for the simulated robots. It is about 50 cm long and 20 cm high. The main sensors and actuators used include thrusters (forward and backward motion, lateral rotation) and a buoyancy system for navigation. An acoustic transceiver is present for long range communication (<500 m), and modulated light transceivers arranged around the body are used for short range communication and perception of other robots (<0.5 m). A camera oriented towards the bottom allows for detecting target objects. In the simulations, the light transceivers are implemented with a perception cone to take into account range and aperture, and visual occlusions are not considered.

Figure 3. The experimental setup in our simulations is a circular pool (12.5 m diameter), with a beacon in its center that periodically emits acoustic messages and acts as an aggregation device to maintain the robots together (1.75 m range). Conditional and unconditional stimuli can be presented to the robots at any time, by triggering long-range acoustic signals.

Figure 4. Timeline of an experiment, with four main situations represented. (A) In the initial condition, robots are randomly scattered in the pool. (B) In the training phase, the robots can be exposed to two different conditions, either the unconditional stimulus alone, or both the unconditional and the conditional stimuli together. During this time, robots perform a random walk and aggregate under the beacon, in a mixed or segregated configuration depending on the stimuli perceived. (C) In the testing phase, once robots are aggregated, the conditional stimulus is triggered alone to test whether robots have learned to associate it with the unconditional stimulus. Robots observe their immediate neighbors to form an opinion about their local configuration. (D) To obtain a collective response, robots exchange their opinions in a peer-to-peer manner and converge to a single opinion.

Figure 5. The collective memory is encoded in the spatial structure, that is, the configuration adopted by the group of robots. To this end, the robots are divided in two teams that differ only by their color, red or yellow. On the left, the robots are aggregated in a mixed state, in which on average each robot has the same number of red and yellow neighbors. On the right, the robots are in a segregated state and they each have on average a majority of neighbors with the same color as themselves.

Figure 6. Average perception of a robot observing its neighbors in a mixed aggregate. The number of signals perceived from robots of each team ($X_{S}$ for self team and $X_{O}$ for opposite team) is highly symmetrical and shows that the most frequent observations involve fewer neighbors’ signals. The range and aperture of sensors (field of view) limit the detection of all present neighbors at once. Signals are accumulated in time windows of 10 s; a total of 60,000 observations are represented.

Figure 7. Average perception of a robot observing its neighbors in a segregated aggregate, which is used to encode learned information in the collective memory. The number of signals perceived by a robot from the opposite team ($X_{O}$) is significantly lower than from its own team ($X_{S}$). Signals are accumulated in time windows of 10 s; a total of 60,000 observations are represented.

Figure 8. Decision tree describing the reactive behavior implemented by the robots. In circles are the signals that can be perceived by the robot, with B the beacon signal of the aggregating device, US the unconditional stimulus, and CS the conditional stimulus (B, US, and CS are implemented in simulation as acoustic messages that can be perceived by all robots at once). In rounded rectangles are the subroutines that can be executed, with mainly the random walk, the quorum, and the decision to stop any motion with a probability that can either depend on any (self or opposite team) nearby robots ($P_{mix}$) or on the nearby robots of the same team only ($P_{seg}$). This behavior does not require memory at the individual level as it is only based on the execution of different subroutines controlled by the current perception of the robots.

Figure 9. (A) Bifurcation diagram of the steady states of $X_{S,1}$ of model (3) as a function of parameter $\alpha$ for $\kappa = 2.1$; (B) state diagram of the type of existing solutions as a function of $\alpha$ and $\kappa$. Other parameter values are $\rho = 56.25$.

Figure 10. Snapshots of the simulated experiments, relating the different phases of the experiments and the resulting behavior of the robots. In the initial condition, the 30 robots start randomly scattered in the pool. Two different conditions are tested: in the upper part of the figure, only the unconditional stimulus (US) is presented to the robots. In the lower part of the figure, the unconditional stimulus (US) and the conditional stimulus (CS) are presented together as acoustic messages that are perceived at once by all robots. During the training phase, the robots aggregate in the range of the beacon, adopting different spatial configurations depending on the perceived stimuli. During the testing phase, each robot forms an initial opinion that is advertised using white and black colors from their status LEDs. They then carry out a quorum after which the whole group has converged to a collective decision. In these snapshots, when the robots are exposed to the US and CS stimuli during the training phase, they afterwards respond positively to the CS stimulus alone, indicating that they have learned the association. When only the US stimulus is presented during the training phase, the robots produce a negative response to the CS stimulus in the testing phase, indicating that they did not learn the association.

Figure 11. Impact of the recall threshold $R_{T}$ on the retrieval of information in the collective memory. The quorum responses of the robots in two different configurations, mixed or segregated aggregates, are tested for different values of $R_{T}$. For each configuration and each tested value, 1000 simulations are performed. When $R_{T}$ is low, robots will have higher chances to consider that their configuration is segregated based on their local observations. Therefore, lower $R_{T}$ values increase the risk of false positives when robots are in mixed configuration. Conversely, when $R_{T}$ is high, robots have higher chances to detect a mixed configuration. Higher $R_{T}$ values increase the risk of false negatives in which the group fails to detect a segregated configuration. In addition, 95% confidence intervals are displayed around the proportion of errors in the experiments.

Figure 12. Outcome of the testing phase in the Pavlov experiment (n = 1000 trials, group of 30 robots). Left plots show the dynamics of the quorum and how the majority opinion is gradually propagating until the whole group has made a collective decision (median ±95% CI). Right plots show the response advertised by the group resulting from the quorum and in response to the conditional stimulus (CS) in the testing phase. The results show that, when the group is trained with the unconditional stimulus (US), it does not recall information in the testing phase in 79% of the trials. However, when the group is trained with the unconditional and the conditional stimuli together, it learns to associate the two stimuli and recalls the association during the testing phase, providing a positive response in 82% of the trials.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
