Next Article in Journal
When Does a Dual Matrix Have a Dual Generalized Inverse?
Next Article in Special Issue
Symmetric and Asymmetric Diffusions through Age-Varying Mixed-Species Stand Parameters
Previous Article in Journal
Three-Way Multi-Attribute Decision Making Based on Outranking Relations under Intuitionistic Fuzzy Environments
Previous Article in Special Issue
Nonlinear Analysis of the C-Peptide Variable Related to Type 1-Diabetes Mellitus
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

An Entropic Gradient Structure in the Network Dynamics of a Slime Mold

by
Vincenzo Bonifaci
Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre, 00154 Rome, Italy
Symmetry 2021, 13(8), 1385; https://doi.org/10.3390/sym13081385
Submission received: 23 June 2021 / Revised: 20 July 2021 / Accepted: 21 July 2021 / Published: 29 July 2021

Abstract

:
The approach to equilibrium in certain dynamical systems can be usefully described in terms of information-theoretic functionals. Well-studied models of this kind are Markov processes, chemical reaction networks, and replicator dynamics, for all of which it can be proven, under suitable assumptions, that the relative entropy (informational divergence) of the state of the system with respect to an equilibrium is nonincreasing over time. This work reviews another recent result of this type, which emerged in the study of the network optimization dynamics of an acellular slime mold, Physarum polycephalum. In this setting, not only the relative entropy of the state is nonincreasing, but its evolution over time is crucial to the stability of the entire system, and the equilibrium towards which the dynamics is attracted proves to be a global minimizer of the cost of the network.

1. Introduction

Information-theoretic concepts, such as negative entropy, and the related notion of free energy have long been recognized as relevant to natural and living systems [1,2,3,4]. It turns out that the approach to equilibrium in some classes of dynamical systems can be usefully studied from an information-theoretic perspective. Baez and Pollard [5] give an excellent overview of such an approach. Consider, for example, a system modeled as a continuous-time Markov process on K distinct states, where the probability of the state of the system at time t being i { 1 , 2 , , K } is denoted by p i ( t ) . Such a system is ruled by the so-called master equation:
d d t p ( t ) = H p ( t )
where p ( t ) = ( p 1 ( t ) , p 2 ( t ) , , p K ( t ) ) and H is an appropriate K × K matrix. Drawing from information theory, one can consider the relative entropy (informational divergence) between the evolving distribution p ( t ) and another probability distribution q = ( q 1 , q 2 , , q K ) on the set of states; this is defined as
D ( q , p ( t ) ) : = i = 1 K q i log q i p i ( t ) .
If one chooses q as a steady state distribution (that is, H q = 0 ), then one can show that
d d t D ( q , p ( t ) ) 0 ,
and if the steady state distribution is unique, this can in turn be used to argue that p ( t ) is attracted to q over time. In other words, by proving monotonicity of the relative entropy (2) over time, one can conclude that the steady state distribution q is also a global attractor. This approach is not limited to continuous-time Markov processes and the linear ordinary differential equation system (1), but extends to other types of models such as discrete-time Markov chains, chemical reaction networks, and several models of population dynamics, such as the replicator dynamics [5,6,7,8,9].
This work discusses a different, nonlinear, class of systems for which the information-theoretic approach is successful: a family of dynamical systems modeling the network dynamics of the slime mold Physarum polycephalum [10,11].
The slime mold Physarum polycephalum is an acellular slime mold (myxogastrid) [12]. In its plasmodium stage, it forms a tubular network that reshapes and adapts to the environmental stimuli. The reshaping is driven by the streaming of cytoplasm through the network; tubes not sustained by cytoplasmic streaming eventually decay and dissolve. This reshaping mechanism has been experimentally observed to be remarkably effective at optimizing the use of resources [13,14]. In particular, in one laboratory experiment, the slime mold network is distributed uniformly over a maze, and two food sources are positioned in two points A and B of the maze. Over time, the slime mold retracts all tubes except those corresponding to the shortest path between A and B.
A mathematical model of the network dynamics of P. polycephalum’s plasmodium has been first proposed by Tero, Kobayashi and Nakagaki [10], who also showed how the model implied, for certain values of the parameters and for very simple networks, convergence to the optimal equilibrium point represented by the shortest path in the network. Subsequently, variations of this model have been proved to converge to the optimal network configuration for arbitrary network topologies for a wide range of model parameters [15,16,17,18]. The goal of this paper is to give a gentle introduction to Physarum polycephalum’s network modeling and to its analysis, and to explain how such analysis can be seen as another example of the information-theoretic approach to dynamical systems. Thus, the significance of the results is that they widen the scope and applicability of the information-theoretic approach in dynamical systems.
Structure of this article. Section 2 introduces one possible mathematical model for Physarum polycephalum’s network dynamics. This model is analyzed in Section 3. Section 4 discusses related works and extensions of the model. Some conclusions are drawn in Section 5.

2. Flow Constraints and Network Dynamics

2.1. Some Notation

For a vector x R m , diag ( x ) is used to denote the m × m diagonal matrix with the coefficients of x along the diagonal. The standard inner product of two vectors x , y R m is denoted by ( x , y ) : = x T y . For a vector x R m , x p denotes the p -norm of x ( 1 p < ), that is, x p : = ( x 1 p + x 2 p + + x m p ) 1 / p .

2.2. Cytoplasmic Flow and Kirchhoff’s Laws

In the experiments of Nakagaki et al. [10,13], P. polycephalum’s plasmodium forms an evolving tubular network, in which the transport capacity of the edges, that is, the tubular channels routing the cytoplasm, changes over time (see [19] for a video illustration). This network is modeled mathematically as a connected weighted undirected graph G, with node set N = { v 1 , v 2 , , v n } and edge set E = { e 1 , e 2 , , e m } . Edges represent the tubular channels, and nodes represent junctions between the tubes. Two special nodes in s 0 , s 1 N are distinguished; they correspond to the location of the food sources in the experiments. Node s 0 is called the source of the cytoplasmic flow, node s 1 the sink; the choice of which node is the source and which the sink is purely conventional.
The weight of an edge of G encodes the capacity of the corresponding tubular channel, which is directly related (as detailed below) to the radius and length of the tube. The capacity is a dynamic quantity, since as food is absorbed by the slime mold, and cytoplasm flows through the network, the radius of the tubular channels will respond to the flow; this is discussed in the next section. For now, let us focus on the situation at a specific instant in time.
At any time, the cytoplasmic flow through the network will be entirely determined by the capacity of the edges. Fix an arbitrary orientation of the edges, and let B R N × E be the incidence matrix of G under this orientation, that is,
B v e + 1 if node v s . is the tail of edge e 1 if node v s . is the head of edge e 0 otherwise ( v N , e E ) .
Let N = N { s 1 } and let A R N × E be obtained from B by removing the row corresponding to the sink s 1 . Moreover, let b R N be the vector defined by
b v 0 if v s . s 0 , 1 if v = s 0 , ( v N ) .
A flow (flux) q is represented by a vector in R E that expresses, for each oriented edge e E , the amount of flow along the positive direction of that edge at a given time. Any fluid flow from the source to the sink should obey flow conservation; this is expressed by the linear system of equations
A q = b .
This is nothing but Kirchhoff’s current law which, in words, requires that the flow has zero divergence everywhere except at nodes s 0 and s 1 , where it has divergence, respectively, + 1 and 1 [20] (Chapter IX).
Example 1.
Consider the network of Figure 1a. It holds N = { v 1 , v 2 , v 3 } with source s 0 = v 1 , sink s 1 = v 3 , and E = { e 1 , e 2 , e 3 } . If the network is oriented as in Figure 1b, then B, A and b are, respectively,
B = 1 0 1 1 1 0 0 1 1 , A = 1 0 1 1 1 0 , b = 1 0 .
Note that matrix A is simply matrix B with row 3 dropped (since the sink is v 3 ). Kirchhoff’s current law for a flow q = ( q 1 , q 2 , q 3 ) R 3 through the edges ( e 1 , e 2 , e 3 ) corresponds to the constraints:
q 1 + q 3 = 1 q 1 + q 2 = 0
Two examples of valid flows in this case are q = ( 1 , 1 , 0 ) (all the flow goes through the upper path) and q = ( 1 / 2 , 1 / 2 , 1 / 2 ) (half of the flow goes through the upper path, and half goes through the lower path). In general, any vector q R m such that A q = b encodes a valid flow.
An additional assumption is that the cytoplasmic flow q satisfies Kirchhoff’s potential law (Section II.1) [20], which says that there exist real values { p v } v N (the pressure potentials at the nodes) satisfying the hydrodynamic analogue of Ohm’s law (Poiseuille’s law). For an edge e between nodes u and v, Poiseuille’s law states that
q e = π R e 4 ( t ) 8 η ( p u p v ) l e , for every e = ( u , v ) E
where R e ( t ) is the radius of the tube at time t, and η is a viscosity constant. Thus, if the capacity of edge e is defined as x e ( t ) : = π R e 4 ( t ) / 8 η l e and the resistance as r e : = 1 / x e , these are directly related to the length and radius of the tube at time t, and one can simply write
q e = ( p u p v ) x e = p u p v r e , for every e = ( u , v ) E .
It is a standard fact from electrical network theory that, given a capacity vector x R > 0 E (or equivalently, a resistance vector r R > 0 E ), the flow q satisfying (3) and (5) is unique (Chapter IX) [20]; this will be called the cytoplasmic flow, and is analogous to the electrical flow in electrical networks. In fact, such a flow is also the unique valid flow from s 0 to s 1 of least dissipation, that is, the unique optimal solution to the following optimization problem:
minimize q 𝖳 X 1 q subject to A q = b .
Here, X R E × E is the diagonal matrix with the capacities x e 1 , , x e m along the main diagonal. The quantity E : = q 𝖳 X 1 q = e E r e q e 2 is called the energy of the flow q. The energy of the flow equals the difference between the source and sink potentials, times the value of the flow, which in the setting of this paper is unitary (Corollary IX.4) [20]:
E = b 𝖳 p = ( p s 0 p s 1 ) b s 0 = p s 0 p s 1 .
Hence, in this setting E is also the potential difference between source and sink.
An alternative, equivalent way to express the cytoplasmic flow arises from the Laplacian operator of the network [21]. The (reduced) Laplacian of G is the symmetric and positive semidefinite matrix L : = A X A 𝖳 . If one represents the potential vector of all nodes except the sink by p R N , assuming without loss of generality p s 1 = 0 , then (5) can be written in matrix form as
q = X A 𝖳 p .
Multiplying both sides by A yields the discrete Poisson equation L p = b , with solution p = L 1 b . Substituting this in (8), one gets a direct expression for the cytoplasmic flow in terms of the network structure and edge capacities:
q = X A 𝖳 L 1 b .
This also allows us to express each q e as a (nonlinear) function of the capacity vector x R > 0 E . Using the Laplacian matrix, the energy of the cytoplasmic flow can also be expressed as
E = b 𝖳 p = b 𝖳 L 1 b .
Example 2.
Continuing the same example from above, for a given vector x = ( x 1 , x 2 , x 3 ) R > 0 3 of edge capacities, the capacity matrix X and the reduced Laplacian matrix L are, respectively,
X = diag ( x ) = x 1 0 0 0 x 2 0 0 0 x 3 , L = A X A 𝖳 = x 1 + x 3 x 1 x 1 x 1 + x 2 ,
with the latter having determinant det ( L ) = x 1 x 2 + x 1 x 3 + x 2 x 3 . Hence,
L 1 = 1 det ( L ) x 1 + x 2 x 1 x 1 x 1 + x 3 , p = L 1 b = 1 det ( L ) x 1 + x 2 x 1
and the cytoplasmic flow is
q = X A 𝖳 L 1 b = 1 det ( L ) x 1 x 2 x 1 x 2 x 1 x 3 + x 2 x 3 = 1 x 1 x 2 + x 1 x 3 + x 2 x 3 x 1 x 2 x 1 x 2 x 1 x 3 + x 2 x 3 .
The energy of the cytoplasmic flow is E = p s 0 p s 1 = p s 0 = ( x 1 + x 2 ) / det ( L ) .
Notice that q 1 = q 2 and q 1 + q 3 = 1 , as one expects due to flow conservation. Equation (11) also clearly shows that the cytoplasmic flow along an edge e j is a function of the capacities of the entire network, not just of the capacity x j of the same edge.

2.3. Response Dynamics on an Edge

A dynamics on the edge capacities will now be introduced. This represents the slime mold’s positive feedback mechanism, which relates the pressure gradient along a tubular channel to the rate of increase or decrease of the capacity of the tube. The underlying idea is simple: tubes along which there is a strong pressure tend to increase their capacity, while tubes along which there is a weak pressure tend to decrease their capacity. Namely, the following will be assumed:
x ˙ e = x e φ | q e | x e 1 for all e E .
Here, q e is (as before) the cytoplasmic flow along edge e, and φ : R 0 R 0 is some strictly increasing, differentiable function such that φ ( 1 ) = 1 . This function models the physical response of the tube to the flow. Some observations:
  • By Poiseuille’s law (5), | q e | / x e = | p u p v | , thus the larger the potential difference along edge e, the larger x ˙ e will be.
  • For a given value of the potential difference, a tube tends to expand less if it is smaller, due to the dependency on x e in (12).
  • Because of the absolute value in (12), the actual direction of the flow has no influence on the dynamics. In particular, exchanging the role of the source and sink nodes (thus reversing the flow) does not alter the tubes’ dynamics. This is the reason why one can arbitrarily select any food source as the flow source and the other one as the flow sink.
  • A tube e E is in equilibrium if and only if | q e | = x e , that is, q e = ± x e .
In the remaining sections, the response function φ ( z ) = z 2 will be assumed, which is the mathematically most convenient. However, the qualitative evolution of the dynamics under other response functions appears to be (and in some cases, it provably is) similar; see the discussion in Section 4.1.

2.4. The Dynamics as an Ode

The adaptation Equation (12) can be given a natural interpretation as a local feedback mechanism: the flow q e and the capacity x e jointly determine the rate of change of the capacity x e of the tube.
However, as seen in Section 2.2 the amount of flow on an edge is a function of the capacities of the entire network, due to Equation (9). This means that each q e can be expressed as a (nonlinear) function of all the capacities of the network, that is, one can—at least in principle—rewrite Equation (12) as
x ˙ 1 = F 1 ( x 1 , x 2 , , x m ) x ˙ 2 = F 2 ( x 1 , x 2 , , x m ) x ˙ m = F m ( x 1 , x 2 , , x m )
for some appropriate nonlinear functions F 1 , , F m . In this way, one obtains an autonomous system of coupled, nonlinear ordinary differential equations, describing the evolution of the edge capacities.
In practice, writing down explicitly the functions F 1 , , F m is inconvenient but for the smallest networks, and it turns out to be unnecessary for the stability analysis. In the following, the algebraic-differential formulation of Equations (9) and (12) will be used:
x ˙ e = x e φ | q e | x e 1 for all e E , q = X A 𝖳 ( A X A 𝖳 ) 1 b .
Note that the singularity of (13) when some x e = 0 can be removed: from the expression q = X A 𝖳 ( A X A 𝖳 ) 1 b it can be seen that q e / x e = ( A 𝖳 ( A X A 𝖳 ) 1 b ) e , a rational function of x that is well-defined as long as the Kirchhoff polynomial det ( A X A 𝖳 ) of the network does not vanish (which it cannot when the edges corresponding to nonzero x e ’s form a connected graph [22]).
In particular, with the quadratic response function φ ( z ) = z 2 , one obtains
x ˙ e = x e q e 2 x e 2 1 for all e E , q = X A 𝖳 ( A X A 𝖳 ) 1 b .
The initial condition x ( 0 ) of the dynamics can be any point in the positive orthant R > 0 m . System (14) will be called the quadratic-response Physarum system. The question of whether this system has a solution x ( t ) defined over all t 0 is postponed until Section 3.3.

2.5. Equilibria

What are the equilibrium points of (14)? Any state x such that x ˙ 1 = x ˙ 2 = = x ˙ m = 0 should satisfy, for each e E , either x e = 0 or q e = ± x e . It turns out that equilibrium points are directly related to the paths connecting the source node to the sink node in the network. Specifically, for a source-sink path P, let its characteristic vector χ P R E be defined as
χ e P 1 if e P , 0 if e P .
Then one can prove the following fact.
Lemma 1.
If P is any source-sink path in the network, then χ P is an equilibrium point of (14). Conversely, any equilibrium point of (14) is a convex combination of characteristic vectors of source-sink paths. Moreover, if each source-sink path has a distinct length, then any equilibrium point is the characteristic vector χ P of some source-sink path P, and the energy of the corresponding cytoplasmic flow equals the length of the path.
Proof. 
Let P be a source-sink path, and χ P its characteristic vector. Consider the state x = χ P and let q be the cytoplasmic flow with respect to x. By definition of the cytoplasmic flow A q = b , and q e can be nonzero only when x e is nonzero (due to q = X A 𝖳 L 1 b ). In fact, q e = ± 1 for each e P , and q e = 0 for each e P , since a unit flow is sent from source to sink and the only path with nonzero capacity (due to the choice of x) is P. This means that q e = ± x e for all e E , and hence x = χ P is an equilibrium point.
Conversely, consider any equilibrium point x, satisfying either x e = 0 or q e = ± x e for all e E . By orienting the edges of E as necessary, one can assume q 0 and thus q = x . Since q = X A 𝖳 L 1 b , this implies that ( A 𝖳 L 1 b ) e = 1 whenever x e > 0 . In other words, along any edge e = ( u , v ) with positive capacity, the potential difference p u p v is 1 (recall that L 1 b = p , hence ( A 𝖳 L 1 b ) e is the potential difference along edge e). However, the potential difference along any path only depends on the difference between the potentials of the endpoints of the path. Hence, all paths with positive capacity from source to sink have the same length, p s 0 p s 1 . Since the cytoplasmic flow can only be nonzero on paths with positive capacity, this means that q is a convex combination of characteristic vectors of source-sink paths, all of the same length. Moreover, when each source-sink path has a distinct length, there can be only one nonzero-flow path in the convex combination, with length p s 0 p s 1 = E by (7). □
In the following, for the sake of exposition it will be assumed that different source-sink paths have distinct length in the network. This implies a finite set of isolated equilibrium points for the autonomous system (14), one for each source-sink path. If the assumption is not satisfied, convergence towards the convex set of minimum-energy equilibrium points can still be argued, although one cannot argue convergence towards a specific point of that set.
Example 3.
Let us continue the example from above. The autonomous system (14) is
x ˙ j = x j q j 2 x j 2 1 j = 1 , 2 , 3 , q = 1 x 1 x 2 + x 1 x 3 + x 2 x 3 x 1 x 2 x 1 x 2 x 1 x 3 + x 2 x 3 .
In this case there are only two source-sink paths: P 1 = ( e 1 , e 2 ) , and P 2 = ( e 3 ) . They have distinct lengths, hence there is one isolated equilibrium for each of P 1 and P 2 . The first corresponds to the vector of capacities x ( 1 ) = χ P 1 = ( 1 , 1 , 0 ) and cytoplasmic flow q ( 1 ) = ( 1 , 1 , 0 ) . The second corresponds to the vector of capacities x ( 2 ) = χ P 2 = ( 0 , 0 , 1 ) and cytoplasmic flow q ( 2 ) = ( 0 , 0 , 1 ) . Note that the energy of the first cytoplasmic flow is 2, while the energy of the second cytoplasmic flow is 1. A linear stability analysis around each equilibrium reveals that the first equilibrium point is unstable, while the second equilibrium point is stable.
The situation in Example 3 is not accidental: in the following, a general result will be shown implying that the equilibrium of minimum energy attracts the entire positive orthant, and hence all equilibria of non-minimal energy must be unstable.

3. Stability Analysis: An Optimization Perspective

Both the experimental results of Nakagaki et al. [13] and the toy example considered above suggest that, perhaps, the stability of equilibrium points of (14) is related to the minimality of the corresponding paths in the network. Hence, in order to study the stability of the system (14), it might be useful to adopt an optimization perspective. In this section, it will be shown that this is indeed the case.

3.1. The Shortest Path Problem

In particular, let us consider the shortest path problem in the given network, where one wants to construct a source-sink path that uses as few edges as possible. Formally, the shortest path problem can be formulated as follows:
minimize f 1 subject to A f = b , f R E ,
where f 1 : = e E | f e | = | f 1 | + | f 2 | + + | f m | . In transportation terminology: find a way to ship a unit of a commodity from the source to the sink, while minimizing the total amount of commodity shipped along the edges of the network. The quantity f e encodes the amount of commodity shipped along edge e; thus a vector f satisfying A f = b satisfies flow conservation.
Note that at this point, the only obvious relation between shortest path flows f, which solve the shortest path problem (16), and cytoplasmic flows q, which solve the least dissipation problem (6), is that they are both unit flows, that is, they satisfy Kirchhoff’s current law ( A q = b , A f = b ). Apart from this, in general they could be very different vectors. For instance, in Example 3, the equilibrium flow ( 1 , 1 , 0 ) is a cytoplasmic flow, but not a shortest path flow. Moreover, a cytoplasmic flow is defined only after a capacity vector x has been specified, while in the shortest path problem (16) there are no capacities to speak of.

3.2. A Variational Reformulation of the Shortest Path Objective

Interestingly, however, the shortest path problem can be related to a modified least dissipation problem. To see this, first observe that for any real number a,
| a | = inf x > 0 1 2 a 2 x + x .
Hence, the 1 -norm of any vector f R m can be equivalently expressed as
f 1 f = inf x R > 0 m 1 2 j = 1 m f j 2 x j + x j .
Therefore
min f R m A f = b f 1 f = min f R m A f = b inf x R > 0 m 1 2 j = 1 m f j 2 x j + x j = inf x R > 0 m 1 2 min f R m A f = b f 𝖳 X 1 f + 1 2 1 𝖳 x = inf x R > 0 m 1 2 b 𝖳 L ( x ) 1 b + 1 2 1 𝖳 x ,
with the last identity following from (10). Let us define, for any positive vector x,
F ( x ) : = b 𝖳 L ( x ) 1 b + 1 𝖳 x .
If one interprets x as a vector of capacities, then the term b 𝖳 L ( x ) 1 b is the energy of the cytoplasmic flow induced by x. Thus, the function F is built from two terms: the first can be interpreted as the cost of transport, which is proportional to the dissipated energy, and the second as the cost of maintaining the transport infrastructure. For shortness, F ( x ) will be called the dissipation potential of the vector x. By (18), finding a flow f minimizing f 1 (a shortest path flow) is equivalent to finding an assignment x of capacities to the edges of the network that minimizes the dissipation F ( x ) . The following can be concluded.
Theorem 1
([16,23]). The value of the optimization problem
minimize f 1 subject to A f = b , f R m
equals the value of the optimization problem
minimize 1 2 1 𝖳 x + 1 2 b 𝖳 ( A X A 𝖳 ) 1 b ( = 1 2 F ( x ) ) subject to x R > 0 m
where X = diag ( x ) .
The dissipation function F is defined on the positive orthant and, importantly, is convex.
Lemma 2
([23]). The dissipation function F is positive, convex and differentiable on R > 0 m .
Proof. 
Positivity follows from Equation (10). For convexity, it suffices to show that the mapping x b 𝖳 L 1 ( x ) b is convex on R > 0 m . This mapping can be seen as the composition of two other mappings: the first is the matrix-valued map x A X A 𝖳 , which is linear since each of the entries of A X A 𝖳 is a linear function of x, and yields a positive definite matrix Y = A X A 𝖳 = ( A X 1 / 2 ) ( A X 1 / 2 ) ; the second is the matrix to scalar map Y b 𝖳 Y 1 b , which is convex on the cone of positive definite matrices, for any b R n (see for example [24] Section 3.1.7). It follows that the composed mapping x b 𝖳 ( A X A 𝖳 ) 1 b is convex, and hence so is F . Finally, since the entries of L ( x ) are linear functions of x, the dissipation function F is a rational function with no poles in R > 0 m , hence differentiable. □
The dissipation function can even be defined on the boundary of the positive orthant by convex closure (that is, by posing F ( x ) = lim inf x x F ( x ) when x is on the boundary; see [23] for details). The extension is convex on the nonnegative orthant and differentiable in its interior, and attains its minimum, although the minimizer x * might lie on the boundary of the positive orthant. Theorem 1 allows us to identify this minimizer.
Corollary 1.
The minimizer of F over R 0 m is the characteristic vector χ P * of the shortest path P * of the network. The corresponding cytoplasmic flow is an equilibrium flow.
Proof. 
Let us consider the characteristic vector χ P * and interpret it as a vector of capacities x * χ P * . Since the capacity x e * is zero for any edge e P * , by Poiseuille’s law the resulting cytoplasmic flow q = X * A 𝖳 ( A X * A 𝖳 ) 1 b will also satisfy q e = 0 = x e * for all e P * . Thus, the support of flow q is contained in the path P * , which by the flow constraints A q = b implies that along any edge of P * there is a unit amount of flow | q e | = 1 = x e * . This implies that | q e | = x e * for all edges e E . The flow q is thus an equilibrium cytoplasmic flow, since φ ( | q e | / x e * ) = φ ( 1 ) = 1 . Moreover, its energy is, by (10), b 𝖳 L 1 b = q 𝖳 X * 1 q = j = 1 m q j 2 / x j * = x * 1 . Therefore,
1 2 F ( x * ) = 1 2 b 𝖳 L 1 b + 1 2 1 𝖳 x * = 1 2 x * 1 + 1 2 x * 1 = x * 1 .
Since by construction x * was chosen to minimize x * 1 , by Theorem 1 it also minimizes F ( x * ) . □

3.3. Physarum Dynamics as a Hessian Gradient Flow

Let us set aside Physarum’s autonomous system (14) for a moment, and consider how one could set up a dynamical system in R > 0 m , the solutions of which converge, over time, to the characteristic vector of the shortest path. Given Corollary 1, one possibility is to aim at minimizing the differentiable convex function F over the positive orthant. To minimize a generic differentiable convex function F over the positive orthant, one might set up the following set of ordinary differential equations:
x ˙ j = x j F ( x ) x j , j = 1 , , m ,
with initial condition x ( 0 ) = x 0 for some x 0 R > 0 m . This is an instance of the mirror descent dynamics, a well-studied dynamics in convex optimization theory [25,26]. The intuition behind (20) is simple: to approach a global minimum, one should follow the (negative) gradient of F , but the rate of change of the j-th component should be reduced the smaller x j is, in order not to violate the constraint x j 0 .
When F is the dissipation potential (19), what does one get from (20)? Let us compute the gradient of the dissipation potential.
Lemma 3
([23]). Let x R > 0 m . For any j = 1 , , m ,
F ( x ) x j = 1 ( a j L 1 ( x ) b ) 2 = 1 q j 2 x j 2 ,
where a j stands for the jth column of matrix A.
Proof. 
Let us start by observing that, by definition, L ( x ) = A X A 𝖳 = j = 1 m x j a j a j 𝖳 and thus L / x j = a j a j 𝖳 . Applying the following identity for the derivative of a matrix inverse (Section 8.4) [27]:
L 1 x j = L 1 L x j L 1 ,
yielding
b 𝖳 L 1 b x j = b 𝖳 L 1 L x j L 1 b = b 𝖳 L 1 a j a j 𝖳 L 1 b = ( a j 𝖳 L 1 b ) 2 .
By (9), a j 𝖳 L 1 b = q j / x j . The claim now follows by the definition of F , (19), since ( 1 𝖳 x ) / x j = 1 . □
Plugging Lemma 3 into (20) yields the dynamics
x ˙ j = x j q j 2 x j 2 1 , j = 1 , , m ,
which is nothing but the Physarum system (14)!
In other words, the quadratic-response Physarum system (14) can be reformulated as a Hessian gradient flow [28]: it can be written in the form
x ˙ = H 1 ( x ) F ( x )
where H ( x ) = 2 h ( x ) is the Hessian of a convex function h; namely, here H ( x ) = X 1 , and h : R > 0 m R is the negative entropy function
h ( x ) j = 1 m x j log x j j = 1 m x j .
The Hessian gradient form immediately implies the existence of a solution to system (14) for all t 0 , using standard arguments [28].
System (23) can also be expressed as
d d t h ( x ) x j = F ( x ) x j , j = 1 , , m ,
or more succinctly,
d d t h ( x ) = F ( x ) ,
another form of the mirror descent dynamics, also known as natural gradient flow [29].
The equivalent formulations (23) and (25) of the Physarum dynamics (14) show that the dynamics follows the gradient of the dissipation function F , under the geometry dictated by the negative entropy function h, defined by (24). It is thus reasonable to expect convergence to the minimum of the dissipation function, i.e., to an equilibrium point where the resulting cytoplasmic flow is the shortest path flow. To rigorously prove this, thanks to Corollary 1 it is sufficient to show that the solutions of system (23) indeed converge to a minimizer of the convex function F .

3.4. Basin of Attraction

The fact that any solution of the mirror descent dynamics (20) with initial condition in R > 0 m converges to a minimizer of a convex function F is, in fact, a result already established in the optimization literature [28,30]. A self-contained proof is presented here. Let us start with a straightforward lemma showing that F is monotonically nonincreasing along the trajectories of the dynamical system.
Lemma 4.
The values F ( x ( t ) ) with x ( t ) a solution of (20) are nonincreasing in t.
Proof. 
By the multivariable chain rule and (20),
d d t F ( x ( t ) ) = j = 1 m F x j ( x ) x ˙ j = j = 1 m x j F x j ( x ) 2 0 .
A key role in the analysis of the mirror descent dynamics is played by the notion of Bregman divergence of a convex function h. This measures the distance between the value of h at a point x, and the approximate value of h at x predicted by a linear model of the function constructed at another point y; see Figure 2 for an illustration.
Definition 1.
The Bregman divergence of a convex function h : R m ( , + ] is defined by D h ( x , y ) h ( x ) h ( y ) h ( y ) x y .
Convexity of h implies the nonnegativity of D h ( x , y ) . When h is the negative entropy, D h is the relative entropy:
D h ( x , y ) = j = 1 m x j log x j y j j = 1 m ( x j y j ) ,
which boils down to (2) when x and y are probability distributions (since j x j = j y j = 1 ). The relative entropy satisfies D h ( x , y ) = 0 if and only if x = y [29].
Using the notion of Bregman divergence, let us prove that the solutions of the mirror descent dynamics converge to the minimizer of the convex function F .
Theorem 2.
Let x ( 0 ) R > 0 m and let x * R 0 m be the minimizer of F (assumed unique). Then any solution x ( t ) of (20) satisfies, for all t 0 ,
F ( x ( t ) ) F ( x * ) 1 t D h ( x * , x ( 0 ) ) .
In particular, as t , the values F ( x ( t ) ) converge to F ( x * ) and hence x ( t ) converges to x * .
Proof. 
A proof will be given that streamlines that in [30]. In the following, to shorten notation let us write x in place of x ( t ) . Since ( d / d t ) h ( x ) + F ( x ) = 0 by (25), for any y one has ( d / d t ) h ( x ) + F ( x ) x y = 0 . This is equivalent to
d d t h ( x ) x y + F ( x ) x y = 0 .
On the other hand, since ( d / d t ) h ( x ) = h ( x ) x ˙ , Definition 1 yields
d d t D h ( y , x ) = d d t h ( x ) + d d t h ( x ) x y = h ( x ) x ˙ + d d t h ( x ) x y + h ( x ) x ˙ = d d t h ( x ) x y .
Combining (27) and (28), and plugging in y = x * ,
d d t D h ( x * , x ) = F ( x ) x x * .
The proof is concluded by a Lyapunov argument. Define Φ : [ 0 , ) R by
Φ ( t ) D h ( x * , x ) + t ( F ( x ) F ( x * ) ) .
Its time derivative is, by (29),
d d t Φ ( t ) = F ( x ) x x * + F ( x ) F ( x * ) + t d d t F ( x ) ,
where the last summand is nonpositive by Lemma 4 and the other terms sum to, by definition, D F ( x * , x ) 0 (by the convexity of F , Lemma 2. Hence, Φ ( t ) Φ ( 0 ) for all t 0 , which is equivalent to
D h ( x * , x ) + t ( F ( x ) F ( x * ) ) D h ( x * , x ( 0 ) ) ,
proving (26) since D h ( x * , x ) 0 .
Because x ( 0 ) R > 0 m by assumption, the divergence D h ( x * , x ( 0 ) ) is a finite constant and (26) implies convergence of F ( x ( t ) ) to F ( x * ) . Convergence of x ( t ) to x * follows from the uniqueness of the minimizer x * . □
Corollary 2.
In the quadratic-response Physarum system (14), the basin of attraction of the shortest path equilibrium x * = χ P * contains R > 0 m .
Proof. 
It was already argued in Section 3.3 that the Physarum system (14) can be written in the mirror descent form (20). The minimizer x * is unique because of the assumption that each source-sink path has a distinct length. Thus, for any x ( 0 ) R > 0 m , Theorem 2 guarantees attraction to x * . □
Note that, with essentially the same proof, Theorem 2 can be partially extended to the case where there is more than one minimizer of F , in the following sense: as t , the values F ( x ( t ) ) converge to the minimum of F over R 0 m . However, when the minimizer is no longer unique, one cannot directly conclude that x ( t ) converges against any specific minimizer.

4. Related Work and Generalizations of the Model

4.1. Beyond the Quadratic Response Function

Going back to the general formulation of the Physarum system (13), it is natural to ask whether the stability result of Corollary 2 can be extended to other response functions, beyond the quadratic response φ ( z ) = z 2 .
Indeed, initial work on the Physarum dynamics considered the linear response φ ( z ) = z . Tero et al. [10] were the first to introduce such a model, and proved an analogue of Corollary 2 when the network is a simple cycle, with two nodes and two edges of different lengths. The analysis of the linear-response Physarum system was later extended to certain planar networks [31], and ultimately to all networks [15]. The latter stability proofs are substantially more involved than in the case of the quadratic response, because the system cannot in those cases be expressed as a Hessian gradient system, as it was done in (23). Nevertheless, Lyapunov arguments are still the essential ingredients of the stability proofs.
Tero et al. [10] also considered nonlinear response functions, but in a formulation that is somewhat different from (13); they assume that the tubular response of edge e is controlled by the sheer amount of flow | q e | , as opposed to the pressure | q e | / x e :
x ˙ e = φ ( | q e | ) x e .
When the response is linear, that is, φ ( z ) = z , formulation (30) by Tero et al. is equivalent to (13). However, when the response is not linear, the two models are qualitatively different. In particular, the model by Tero et al. has multiple stable equilibria when φ ( z ) = z μ with μ > 1 . In contrast, the model (13) has a unique stable equilibrium even in those cases [11].
Formulation (13) was first proposed in [11], where it was shown that on a network topology consisting of parallel links, the analogue of Corollary 2 holds for all strictly increasing, differentiable functions φ : R 0 R 0 such that φ ( 1 ) = 1 . In fact, such a result holds even when each edge e of the network has a distinct response function φ e . This result was later extended to all network topologies by Karrenbauer, Kolev and Mehlhorn (Theorem 3) [17], under the following additional assumption for the response functions:
φ e ( z ) 1 + α e ( z 1 ) for some α e > 0 and all z 0 .
Condition (31) is satisfied, in particular, by all convex increasing functions (to see this, take α e = φ e ( 1 ) ). As mentioned in [17], it is an open problem whether condition (31) can be relaxed.

4.2. Beyond the Network Setting

Although the Physarum dynamics (13) and (30) originated in the context of mathematical biology, as seen in Section 3 it can also be understood from an optimization perspective. From such a more abstract view, the same dynamics can be shown to converge to the solution of optimization problems that substantially generalize the shortest path problem (16).
Namely, if instead of defining the constraint matrix A in terms of a network, one allows any full-rank matrix A R n × m , then the dynamics (13) is still well-defined, and under mild technical conditions it converges to an optimal solution of the following 1 -norm minimization problem:
min f 1 s . t . A f = b , f m .
This looks formally the same as the shortest path formulation (16), but represents a more general problem, since A is not restricted to be a network matrix and b can be an arbitrary vector in R n . Problem (32) is sometimes called basis pursuit and is an important problem in signal processing and statistics: it can yield sparse signals (f) that explain the observations (b) from a set of linear measurements (A) (see for example [32]). This abstract viewpoint on the Physarum dynamics has been explored in several works of the networks and optimization community [16,18,23,33].

4.3. Multiple Sources of Food

Finally, let us note that food-foraging experiments have been carried out in which the slime mold is provided with more than two sources of food [14]. Instead of collapsing into a path, the slime mold in this case eventually connects the sources of food into a complex network structure. One can ask how such a scenario could be modeled effectively, and what functional (if any) is optimized by the network adaptation process in this case.
In this context, a key modeling issue is to strike a balance between tractability and plausibility of the proposed model. In particular, in order for the model to be biologically plausible, it should preserve the symmetry between the sources of food. In the case of two sources of food, this symmetry is guaranteed by the absolute value in Equation (12), which makes the dynamics oblivious to the direction of the flow. In the case of more than two sources, however, it is unclear how to guarantee the same property with a conceptually simple model; the original proposal by Tero et al. [14], for example, involves a periodic random selection of the flow sink node, which seems rather challenging to analyze formally.
The formal model (13) can certainly be extended by allowing the cytoplasmic flow to have multiple sources s 1 , , s k and sinks t 1 , , t l , each with a different supply/demand of flow; interestingly, in this case, the functional optimized is the so-called transshipment cost [33,34], yielding a connection with optimal transport theory [35,36]. However, symmetry with respect to the food locations is lost: exchanging the role of a source s i with that of a sink t j will generally result in a different set of dynamical equations, with incompatible solutions. Therefore, such a model does not appear to be plausible from a biological point of view, despite certainly being interesting from an optimization perspective. All in all, development of a plausible, yet tractable, mathematical model of P. polycephalum’s network dynamics with multiple sources of food remains a challenging problem for future research.

5. Conclusions

A mathematical model of the network dynamics of P. polycephalum was presented that exhibits, from a qualitative standpoint, a behavior compatible with that observed in the laboratory food foraging experiments. The analysis of the model reveals at least two interesting aspects.
The first is the fact that the stable equilibrium point of the dynamics provably minimizes a combination of the infrastructural cost of the network—the term 1 𝖳 x in (19), which corresponds to the total capacity of the network—and of the transport cost–the term b 𝖳 L 1 ( x ) b in (19), which corresponds to the energy of the cytoplasmic flow. The fact that an organism like P. polycephalum achieves a convenient tradeoff between transport efficiency and infrastructural cost of the network should not be surprising, since after all it presumably yields an evolutive advantage. Nevertheless, it is somewhat remarkable that this optimization objective emerges so clearly from the simple positive feedback response (12) of the tubular channels to the flow.
A second remarkable fact is the central role of information-theoretic concepts in the stability analysis. It was shown that the quadratic-response dynamics can be interpreted as a gradient descent in the non-Euclidean geometry dictated by the negative entropy function h ( x ) , and that the corresponding relative entropy D h ( x * , x ) plays a crucial role in the stability proof, as it is able to track the symmetry breaking from (for example) an initially uniform configuration towards the optimal network configuration. In fact, even for response functions that are not quadratic, relative entropy makes an appearance, whether implicitly or explicitly, in all known stability proofs. Thus, information-theoretic concepts emerge as very relevant, and perhaps indispensable, mathematical tools in this context.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

This work was carried out in association with Istituto di Analisi dei Sistemi ed Informatica, Consiglio Nazionale delle Ricerche, Italy.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Schrödinger, E. What Is Life? Cambridge University Press: Cambridge, UK, 1944. [Google Scholar]
  2. Johnson, H.A. Information Theory in Biology after 18 Years. Science 1970, 168, 1545–1550. [Google Scholar] [CrossRef]
  3. Gatenby, R.; Frieden, B. Information theory in living systems, methods, applications, and challenges. Bull. Math. Biol. 2007, 69, 635–657. [Google Scholar] [CrossRef]
  4. Friston, K. The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 2010, 11, 127–138. [Google Scholar] [CrossRef] [PubMed]
  5. Baez, J.C.; Pollard, B.S. Relative Entropy in Biological Systems. Entropy 2016, 18, 46. [Google Scholar] [CrossRef]
  6. Van Kampen, N. Stochastic Processes in Physics and Chemistry; North-Holland: Amsterdam, The Netherlands, 2007. [Google Scholar]
  7. Hofrichter, J.; Host, J.; Tran, T.D. Information Geometry and Population Genetics; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
  8. Guberman, J.M. Mass Action Reaction Networks and The Deficiency Zero Theorem. Master’s Thesis, Harvard University, Cambridge, MA, USA, 2003. [Google Scholar]
  9. Harper, M.; Fryer, D.E.A. Lyapunov Functions for Time-Scale Dynamics on Riemannian Geometries of the Simplex. Dyn. Games Appl. 2015, 5, 318–333. [Google Scholar] [CrossRef]
  10. Tero, A.; Kobayashi, R.; Nakagaki, T. A mathematical model for adaptive transport network in path finding by true slime mold. J. Theor. Biol. 2007, 244, 553–564. [Google Scholar] [CrossRef] [Green Version]
  11. Bonifaci, V. A revised model of fluid transport optimization in Physarum polycephalum. J. Math. Biol. 2017, 74, 567–581. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Alim, K.; Andrew, N.; Pringle, A. Physarum. Curr. Biol. 2013, 23, R1082–R1083. [Google Scholar] [CrossRef] [Green Version]
  13. Nakagaki, T.; Yamada, H.; Tóth, Á. Maze-solving by an amoeboid organism. Nature 2000, 407, 470. [Google Scholar] [CrossRef] [PubMed]
  14. Tero, A.; Takagi, S.; Saigusa, T.; Ito, K.; Bebber, D.P.; Fricker, M.D.; Yumiki, K.; Kobayashi, R.; Nakagaki, T. Rules for Biologically Inspired Adaptive Network Design. Science 2010, 327, 439–442. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Bonifaci, V.; Mehlhorn, K.; Varma, G. Physarum Can Compute Shortest Paths. J. Theor. Biol. 2012, 309, 121–133. [Google Scholar] [CrossRef] [Green Version]
  16. Facca, E.; Cardin, F.; Putti, M. Physarum Dynamics and Optimal Transport for Basis Pursuit. arXiv 2019, arXiv:1812.11782v1. [Google Scholar]
  17. Karrenbauer, A.; Kolev, P.; Mehlhorn, K. Convergence of the non-uniform Physarum dynamics. Theor. Comput. Sci. 2020, 816, 260–269. [Google Scholar] [CrossRef] [Green Version]
  18. Straszak, D.; Vishnoi, N.K. Iteratively reweighted least squares and slime mold dynamics: Connection and convergence. Math. Program. 2021. [Google Scholar] [CrossRef]
  19. YouTube Video. 2011. Available online: https://youtu.be/czk4xgdhdY4 (accessed on 20 May 2021).
  20. Bollobás, B. Modern Graph Theory; Springer: New York, NY, USA, 1998. [Google Scholar]
  21. Strang, G. A framework for equilibrium equations. SIAM Rev. 1988, 30, 283–296. [Google Scholar] [CrossRef]
  22. Biggs, N. Algebraic Graph Theory; Cambridge University Press: Cambridge, UK, 1974. [Google Scholar]
  23. Bonifaci, V. A Laplacian Approach to 1-Norm Minimization. Comput. Optim. Appl. 2021. [Google Scholar] [CrossRef]
  24. Boyd, S.; Vanderberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  25. Nemirovski, A.S.; Yudin, D.B. Problem Complexity and Method Efficiency in Optimization; Wiley: Chichester, UK, 1983. [Google Scholar]
  26. Bubeck, S. Convex Optimization: Algorithms and Complexity. Found. Trends Mach. Learn. 2015, 8, 231–357. [Google Scholar] [CrossRef]
  27. Magnus, J.R.; Neudecker, H. Matrix Differential Calculus with Applications in Statistics and Econometrics; Wiley: Chichester, UK, 2019. [Google Scholar]
  28. Alvarez, F.; Bolte, J.; Brahic, O. Hessian Riemannian Gradient Flows in Convex Programming. SIAM J. Control Optim. 2004, 43, 477–501. [Google Scholar] [CrossRef]
  29. Amari, S. Information Geometry and Its Applications; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  30. Wilson, A. Lyapunov Arguments in Optimization. Ph.D. Thesis, University of California at Berkeley, Berkeley, CA, USA, 2018. [Google Scholar]
  31. Miyaji, T.; Ohnishi, I. Physarum can solve the shortest path problem on Riemannian surface mathematically rigourously. Int. J. Pure Appl. Math. 2008, 47, 353–369. [Google Scholar]
  32. Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic Decomposition by Basis Pursuit. SIAM Rev. 2001, 43, 129–159. [Google Scholar] [CrossRef] [Green Version]
  33. Ito, K.; Johansson, A.; Nakagaki, T.; Tero, A. Convergence Properties for the Physarum Solver. arXiv 2011, arXiv:1101.5249v1. [Google Scholar]
  34. Straszak, D.; Vishnoi, N.K. Natural Algorithms for Flow Problems. In Proceedings of the 27th ACM-SIAM Symposium on Discrete Algorithms, Arlington, VA, USA, 10–12 January 2016; SIAM: Philadelphia, PA, USA, 2016; pp. 1868–1883. [Google Scholar]
  35. Facca, E.; Cardin, F.; Putti, M. Towards a Stationary Monge–Kantorovich Dynamics: The Physarum Polycephalum Experience. SIAM J. Appl. Math. 2018, 78, 651–676. [Google Scholar] [CrossRef] [Green Version]
  36. Cardin, F. Trasporto ottimo, sistemi viventi (in Italian). Mat. Cult. Soc. Riv. Dell’Unione Mat. Ital. 2017, 2, 327–341. [Google Scholar]
Figure 1. Example network.
Figure 1. Example network.
Symmetry 13 01385 g001
Figure 2. Illustration of the Bregman divergence D h ( x , y ) .
Figure 2. Illustration of the Bregman divergence D h ( x , y ) .
Symmetry 13 01385 g002
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Bonifaci, V. An Entropic Gradient Structure in the Network Dynamics of a Slime Mold. Symmetry 2021, 13, 1385. https://doi.org/10.3390/sym13081385

AMA Style

Bonifaci V. An Entropic Gradient Structure in the Network Dynamics of a Slime Mold. Symmetry. 2021; 13(8):1385. https://doi.org/10.3390/sym13081385

Chicago/Turabian Style

Bonifaci, Vincenzo. 2021. "An Entropic Gradient Structure in the Network Dynamics of a Slime Mold" Symmetry 13, no. 8: 1385. https://doi.org/10.3390/sym13081385

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop