Article

A Measure-on-Graph-Valued Diffusion: A Particle System with Collisions and Its Applications

The Institute of Statistical Mathematics, Tokyo 190-8562, Japan
Mathematics 2022, 10(21), 4081; https://doi.org/10.3390/math10214081
Submission received: 10 September 2022 / Revised: 25 October 2022 / Accepted: 26 October 2022 / Published: 2 November 2022
(This article belongs to the Special Issue Random Combinatorial Structures)

Abstract

A diffusion taking values in probability measures on a graph with vertex set $V$, $\sum_{i\in V}x_i\delta_i$, is studied. The masses on the vertices satisfy the stochastic differential equation of the form $dx_i=\sum_{j\in N(i)}\sqrt{x_ix_j}\,dB_{ij}$ on the simplex, where $\{B_{ij}\}$ are independent standard Brownian motions with the skew symmetry $B_{ij}=-B_{ji}$, and $N(i)$ is the neighbourhood of the vertex $i$. A Markov chain on integer partitions that is dual to the Markov semigroup associated with the diffusion is used to show that the support of an extremal stationary state of the adjoint semigroup is an independent set of the graph. We also investigate the diffusion with a linear drift, which gives a killing of the dual Markov chain on a finite integer lattice. The Markov chain is used to study the unique stationary state of the diffusion, which generalizes the Dirichlet distribution. Two applications of the diffusions are discussed: analysis of an algorithm to find an independent set of a graph, and Bayesian graph selection based on computation of the probability of a sample by coupling from the past.

1. Introduction

Consider a finite graph $\mathcal{G}=(V,E)$ consisting of vertices $V=\{1,\ldots,r\}$, $r\in\{2,3,\ldots\}$, and edges $E$. Throughout this paper, a graph is undirected and connected. The neighbourhood of the vertex $i\in V$ is denoted by $N(i):=\{j\in V : j\sim i\}$, where $j\sim i$ means that $i$ and $j$ are adjacent. The degree of the vertex $i$ is denoted by $d_i:=|N(i)|$, the cardinality of the set $N(i)$. An independent set of $\mathcal{G}$ is a subset of $V$ in which no two vertices are adjacent. In other words, a set of vertices is independent if and only if it is a clique in the complement of $\mathcal{G}$.
If two vertices of a graph $\mathcal{G}$ have precisely the same neighbourhood, throughout this paper we call the graph obtained by identifying these two vertices while keeping the adjacency a reduced graph of $\mathcal{G}$.
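As an illustration (ours, not from the paper), the defining property of an independent set can be checked mechanically; the adjacency-dictionary representation and the function name below are hypothetical conveniences.

```python
import itertools

def is_independent(adj, subset):
    """Check that no two vertices of `subset` are adjacent in the graph
    given by the adjacency dict `adj` (vertex -> set of neighbours)."""
    return all(v not in adj[u] for u, v in itertools.combinations(subset, 2))

# The four-cycle: vertices 1..4 with i ~ i+1 (mod 4).
C4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(is_independent(C4, {1, 3}))  # True: no two of 1, 3 are adjacent
print(is_independent(C4, {1, 2}))  # False: 1 and 2 are adjacent
```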
Let $\mathcal{P}(\Delta_{r-1})$ be the totality of probability measures on the simplex
$$\Delta_{r-1}=\Big\{(x_1,\ldots,x_r)\in\mathbb{R}_{\ge0}^V : \sum_{i\in V}x_i=1\Big\}$$
equipped with the topology of weak convergence. Itoh et al. [1] discussed a diffusion taking values in probability measures on a graph $\mathcal{G}$, whose state is identified with the probability measure $\sum_{i=1}^rx_i(t)\delta_i$ and the masses on each vertex
$$\{x(t)=(x_1(t),\ldots,x_r(t))\in\Delta_{r-1},\,P_x : t\ge0\}\in\mathcal{P}(\Delta_{r-1}),$$
which starts from $x(0)=x$ and satisfies the stochastic differential equation of the form
$$dx_i=\sum_{j\in N(i)}\sqrt{x_ix_j}\,dB_{ij},\quad i\in V,\tag{1}$$
where $\{B_{ij}\}_{i\sim j\in V}$ are independent standard Brownian motions with the skew symmetry $B_{ij}=-B_{ji}$. The generator $L$ of the diffusion operates on a function $f\in C_0^2(\Delta_{r-1})$ as
$$Lf=\frac{1}{2}\sum_{i,j\in V}\sigma_{ij}(x)\frac{\partial^2f}{\partial x_i\partial x_j},\tag{2}$$
where $\sigma_{ii}(x)=x_i\sum_{j\in N(i)}x_j$, $\sigma_{ij}(x)=-x_ix_j$ if $i\sim j$, and $\sigma_{ij}(x)=0$ otherwise, and $C_0^2(\Delta_{r-1})$ is the totality of functions with continuous derivatives up to the second order and compact support in $\Delta_{r-1}$.
We say a face of the simplex $\Delta_{r-1}$ corresponds to a set of vertices $U\subset V$ if the face is the interior of the convex hull of $\{e_i:i\in U\}$, denoted by $\mathrm{int}(\mathrm{conv}\{e_i:i\in U\})$, where $\{e_1,\ldots,e_r\}$ is the standard basis of the vector space $\mathbb{R}^r$. If $U$ consists of a single vertex, say $e_i$, $\mathrm{int}(\mathrm{conv}\{e_i\})$ should be read as $\{e_i\}$. An observation on the diffusion is as follows.
Proposition 1.
Every point in a face of the simplex $\Delta_{r-1}$ corresponding to an independent set $V_I\subset V$ of a graph $\mathcal{G}$ is a fixed point of the stochastic differential Equation (1). Namely,
$$x\in\mathrm{int}(\mathrm{conv}\{e_i:i\in V_I\})\tag{3}$$
is a fixed point.
Proof. 
For each $i\in V$, $x_i(t)$ is a martingale with respect to the natural filtration generated by the diffusion $(x(t),P_x)$. The condition (3) implies $x_i=0$ for each $i\notin V_I$; hence $E(x_i(t))=0$, and thus $x_i(t)=0$, $t\ge0$, almost surely. Then, for each $i\in V_I$, (1) reduces to $dx_i(t)=0$, and we have $x_i(t)=x_i$, $t\ge0$. Therefore, (3) is a fixed point of (1).    □
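The dynamics (1) can be probed numerically. The following is a minimal sketch, not a scheme from the paper: a naive Euler-Maruyama discretization in which each undirected edge carries one Brownian increment applied with opposite signs to its two endpoints, so the total mass is conserved exactly; clipping at zero and renormalizing is a crude boundary guard of our own.

```python
import numpy as np

def euler_maruyama(adj, x0, T=1.0, dt=1e-4, seed=0):
    """Naive Euler-Maruyama sketch of dx_i = sum_{j~i} sqrt(x_i x_j) dB_ij,
    with the skew symmetry B_ij = -B_ji realized by applying each edge
    increment with opposite signs to the two endpoint vertices."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    for _ in range(int(T / dt)):
        for i, j in edges:
            db = rng.normal(0.0, np.sqrt(dt))
            inc = np.sqrt(max(x[i - 1] * x[j - 1], 0.0)) * db
            x[i - 1] += inc
            x[j - 1] -= inc
        np.clip(x, 0.0, None, out=x)   # crude guard against overshooting 0
        x /= x.sum()                   # renormalize onto the simplex
    return x

S2 = {1: (2, 3), 2: (1,), 3: (1,)}     # star graph: centre 1, leaves 2 and 3
x = euler_maruyama(S2, [0.5, 0.25, 0.25])
print(x.sum())                          # 1.0 up to floating-point error
```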
Itoh et al. [1] discussed the diffusion $(x(t),P_x)$ as an approximation of the following discrete stochastic model. Consider a system of $N$ particles, where each particle is assigned to one vertex in $V$, and there is a continuous-time Markov chain
$$\{(n_1(s),\ldots,n_r(s))\in\mathbb{N}_0^V : n_1(s)+\cdots+n_r(s)=N,\ s\ge0\},\qquad \mathbb{N}_0:=\{0,1,\ldots\}.$$
Here, $n_i(s)$ is the number of particles assigned to the vertex $i\in V$ at time $s$. At each instant of time, two of the $N$ particles are chosen uniformly at random. If the chosen particles are at adjacent vertices $i\sim j$, one of the two particles is chosen with equal probabilities and moved to the other particle's vertex. This causes a transition from $(i,j)$ to $(i,i)$ or $(j,j)$ with equal probabilities. This stochastic model seems to have various applications. Tainaka et al. [2] discussed it as a model of a speciation process caused by geography. Ben-Haim et al. [3] considered a slightly modified version, in which a transition from $(i-1,i+1)$ to $(i,i)$, $i\in\{2,\ldots,r-1\}$, occurs on the one-dimensional graph, where $i\sim i+1$, $i\in\{1,\ldots,r-1\}$. They called it a “compromise process” because if we regard the vertices as political positions, then a transition is a compromise. The process $x_N(s):=n(s)/N$ converges weakly to the diffusion $(x(t),P_x)$: if $x_N(0)\to x(0)$, then $x_N([\,\cdot\,N(N-1)/2])\Rightarrow x(\cdot)$ as $N\to\infty$ in the space of right continuous functions with left limits (see Theorem 10.3.5 of [4]).
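The discrete collision model described above is straightforward to simulate; the sketch below (our illustration, with hypothetical function names) implements the jump step only, ignoring the exponential waiting times, which is enough to observe the evolution of the occupation numbers.

```python
import random

def collision_chain(adj, counts, steps, seed=0):
    """Simulate the discrete collision model: repeatedly pick two of the N
    particles uniformly at random; if they sit on adjacent vertices i ~ j,
    one of the two (chosen by a fair coin) jumps to the other's vertex,
    i.e. (i, j) -> (i, i) or (j, j)."""
    rng = random.Random(seed)
    particles = [v for v, c in counts.items() for _ in range(c)]
    for _ in range(steps):
        a, b = rng.sample(range(len(particles)), 2)
        i, j = particles[a], particles[b]
        if j in adj[i]:                   # a collision happens only between neighbours
            if rng.random() < 0.5:
                particles[b] = i          # (i, j) -> (i, i)
            else:
                particles[a] = j          # (i, j) -> (j, j)
    out = {v: 0 for v in adj}
    for v in particles:
        out[v] += 1
    return out

C4 = {1: (2, 4), 2: (1, 3), 3: (2, 4), 4: (1, 3)}
out = collision_chain(C4, {1: 5, 2: 5, 3: 5, 4: 5}, steps=2000)
print(sum(out.values()))   # 20: the number of particles is conserved
```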
Apart from an approximation of the discrete stochastic model discussed above, the diffusion ( x ( t ) , P x ) seems to appear in various contexts. The diffusion on a complete graph appears as an approximation of a quite different discrete stochastic model, called the Wright–Fisher model in population genetics, which evolves by repetition of multinomial sampling of a fixed number of particles (see, e.g., Chapter 10 of [4], for details). The class of measure-valued diffusions is called Fleming–Viot processes. Such diffusions appear as prior distributions in Bayesian statistics.
As we will see below, the support of an extremal stationary state of the semigroup associated with the diffusion $(x(t),P_x)$ is a face of $\Delta_{r-1}$ that corresponds to an independent set of $\mathcal{G}$. In this sense, the diffusion can be regarded as finding an independent set in graph theory. Some problems related to finding independent sets, such as the maximum independent set problem, are known to be NP-hard, so it is believed that there is no efficient algorithm to solve them. Therefore, algorithms that find maximal independent sets are useful for obtaining practical solutions. For example, Luby [5] discussed a parallel algorithm for finding maximal independent sets that was derived from a stochastic algorithm; it is based on a step that finds an independent set from a random permutation, executed by $O(|E|)$ processors in time $O(1)$ for large $|E|$.
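The random-permutation idea mentioned above can be sketched sequentially (this is our illustrative analogue of the permutation step, not Luby's parallel algorithm itself): scan the vertices in a uniformly random order and take a vertex whenever none of its neighbours has been taken. The result is always a maximal, though not necessarily maximum, independent set.

```python
import random

def maximal_independent_set(adj, seed=0):
    """Greedy pass over a uniformly random permutation of the vertices;
    sequential analogue of the permutation step in Luby-style algorithms."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    chosen = set()
    for v in order:
        if not (adj[v] & chosen):   # no neighbour of v has been taken yet
            chosen.add(v)
    return chosen

C4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(maximal_independent_set(C4, seed=1))   # {1, 3} or {2, 4}, depending on the order
```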
Let us assume there exists a strongly continuous Markov semigroup $\{T_t:t\ge0\}$ associated with the diffusion $(x(t),P_x)$ governed by the generator (2) such that
$$T_t1=1\quad\text{and}\quad T_tf\ge0,\quad f\in C(\Delta_{r-1})\ \text{satisfying}\ f\ge0,$$
where $C(\Delta_{r-1})$ is the totality of continuous functions on $\Delta_{r-1}$, and
$$T_tf-f=\int_0^tT_sLf\,ds,\quad f\in C_0^2(\Delta_{r-1}).\tag{4}$$
The existence of such a semigroup for complete graphs was proven by Ethier [6]. For the solution of the stochastic differential Equation (1), we have
$$T_tf(x)=E_x\{f(x(t))\}.$$
Let us denote by $\{T_t^*:t\ge0\}$ the adjoint semigroup on $\mathcal{P}(\Delta_{r-1})$ induced by $\{T_t:t\ge0\}$. Consider the totality of all fixed points of $\{T_t^*\}$:
$$S=\{\mu\in\mathcal{P}(\Delta_{r-1}) : T_t^*\mu=\mu,\ t\ge0\}.$$
We call each element of $S$ a stationary state of $\{T_t^*\}$. A stationary state $\nu$ satisfies
$$\langle\nu,Lf\rangle=\int Lf(x)\,\nu(dx)=0,\quad f\in C_0^2(\Delta_{r-1}).$$
The set $S$ is non-empty and convex. The totality of the extremal elements of the stationary states is denoted by $S_{\mathrm{ext}}$. Namely, a stationary state $\nu$ is uniquely represented as
$$\nu=\sum_{i=1}^sp_i\nu_i,\quad \nu_1,\ldots,\nu_s\in S_{\mathrm{ext}}$$
for some $p\in\Delta_{s-1}$, $s\in\{2,3,\ldots\}$. In Theorem 1, we see that the support of an extremal stationary state of the diffusion $(x(t),P_x)$ is a face of $\Delta_{r-1}$ corresponding to an independent set of $\mathcal{G}$.
In this paper, we use the term support in a sloppy sense. Namely, positivity of a stationary state is not assumed unless otherwise stated. In fact, Proposition 1 implies that if the diffusion $(x(t),P_x)$ starts from any point $x$ in $\mathrm{int}(\mathrm{conv}\{e_i:i\in V_I\})$, the stationary state is $\delta_x$, an atom at $x$. In this situation, we say the support is $\mathrm{int}(\mathrm{conv}\{e_i:i\in V_I\})$. In other words, if a stationary state does not have probability mass anywhere in an open set, then we say the set is not the support of the stationary state. Some examples are as follows.
Example 1.
Let $\mathcal{G}=K_r$, the complete graph consisting of $r$ vertices. Since each single vertex is a maximal independent set, a stationary state is represented as $\sum_{i=1}^rp_i\delta_{e_i}$, where $S_{\mathrm{ext}}=\{\delta_{e_i}:i\in\{1,\ldots,r\}\}$. For the solution of the stochastic differential Equation (1), $p_i$ is the absorption probability at the vertex $i\in V$. Since $x_i(t)$ is a martingale, we know $p_i=x_i(0)$.
Example 2.
Let $\mathcal{G}=C_r$ for an even positive integer $r$, the cycle graph consisting of $r$ vertices, i.e., $i\sim i+1\bmod r$. The maximal independent sets are the set of all even-integer vertices and that of all odd-integer vertices. When $r=4$, we have the independent sets $\{1\}$, $\{2\}$, $\{3\}$, $\{4\}$, $\{1,3\}$, and $\{2,4\}$. The supports of extremal stationary states are the faces $\{e_i\}$, $i\in\{1,\ldots,4\}$, $\mathrm{int}(\mathrm{conv}\{e_1,e_3\})$, and $\mathrm{int}(\mathrm{conv}\{e_2,e_4\})$. The totality of the extremal stationary states is $S_{\mathrm{ext}}=\{\delta_{e_1},\delta_{e_2},\delta_{e_3},\delta_{e_4},\nu_{1,3},\nu_{2,4}\}$, where $\nu_{1,3}$ and $\nu_{2,4}$ have densities (not necessarily strictly positive) on $\mathrm{int}(\mathrm{conv}\{e_1,e_3\})$ and $\mathrm{int}(\mathrm{conv}\{e_2,e_4\})$, respectively. Therefore, a stationary state is represented as $\sum_{i=1}^4p_i\delta_{e_i}+p_{1,3}\nu_{1,3}+p_{2,4}\nu_{2,4}$.
Example 3.
Let $\mathcal{G}=K_{r,s}$ for positive integers $r$ and $s$, the complete bipartite graph consisting of two disjoint and maximal independent sets of $r$ and $s$ vertices. For the graph $K_{3,2}$ whose maximal independent sets are $\{1,2,3\}$ and $\{4,5\}$, a stationary state may be represented as $\sum_{i=1}^5p_i\delta_{e_i}+p_{1,2}\nu_{1,2}+p_{1,3}\nu_{1,3}+p_{2,3}\nu_{2,3}+p_{4,5}\nu_{4,5}+p_{1,2,3}\nu_{1,2,3}$.
Obtaining an explicit expression for the stationary states is a challenging problem. Itoh et al. [1] successfully obtained an explicit expression for the stationary states of a star graph $S_2$, where a star graph $S_{r-1}$, $r\ge3$, is the complete bipartite graph $K_{1,r-1}$, and the vertices of $S_{r-1}$ are numbered such that $1\sim i$, $i\in\{2,\ldots,r\}$. A stationary state may be represented as $\sum_{i=1}^3p_i\delta_{e_i}+p_{2,3}\nu_{2,3}$. If we identify the vertices $\{2,3\}$, the star graph $S_2$ reduces to the complete graph $K_2$. Using the arguments for a complete graph in Example 1, we know $p_1=x_1$ and $p_2+p_{2,3}+p_3=x_2+x_3$. Itoh et al. [1] obtained an explicit expression for the diffusion starting from $x\notin\{e_1,e_2,e_3\}\cup\mathrm{int}(\mathrm{conv}\{e_2,e_3\})$:
$$p_i=\frac{x_1}{2}\left(\frac{2-x_1}{\sqrt{(2-x_1)^2-4x_i}}-1\right),\quad i\in\{2,3\}\tag{6}$$
by using the martingales introduced in Section 2. This result is for a specific graph but can be applied to other graphs reducible to $S_2$. For example, the four-cycle graph $C_4$ discussed in Example 2 can be reduced to $S_2$. Explicit expressions for $p_i$, $i\in\{1,2,3,4\}$, are immediately obtained.
This paper is organized as follows. In Section 2, the martingales used by Itoh et al. [1] are revisited in a slightly generalized form. An interpretation of the martingales is presented in Section 3. A duality relation between the Markov semigroup associated with the diffusion and a Markov chain on the set of ordered integer partitions is established. The dual Markov chain is studied and used to show that the support of an extremal stationary state of the adjoint semigroup is an independent set of the graph. In Section 4, we investigate the diffusion with a linear drift, which gives a killing of the dual Markov chain on a finite integer lattice. The Markov chain is studied and used to study the unique stationary state of the diffusion, which generalizes the Dirichlet distribution. In Section 5, two applications of the diffusions are discussed: analysis of an algorithm to find an independent set of a graph and Bayesian graph selection based on computation of the probability of a sample by using coupling from the past. Section 6 is devoted to discussion of open problems.

2. Invariants among Moments

For a graph $\mathcal{G}=(V,E)$, $r=|V|$, an element $a\in\mathbb{N}_0^V$ with $|a|:=a_1+\cdots+a_r<\infty$ is denoted by $a=a_1e_1+\cdots+a_re_r$. We use multi-index notation; a monomial $\prod_ix_i^{a_i}$ is simply written as $x^a$.
For star graphs $S_{r-1}$, Itoh et al. [1] noticed that the following homogeneous polynomials of arbitrary order $n\ge r+1$:
$$\sum_{a_2+\cdots+a_r=n,\ a\ge1}\binom{n-r+1}{a-1}\binom{n}{a}(c\circ x(t))^a,\qquad \binom{n}{a}=\frac{n!}{a_2!\cdots a_r!}$$
are martingales, where the sum is taken over all ordered positive integer partitions of $n$ satisfying $a_2+\cdots+a_r=n$ with $a_i\ge1$, $i\in\{2,\ldots,r\}$, the product $c\circ x$ is componentwise, and $c_2+\cdots+c_r=0$ with $c_i\in\mathbb{R}$, $i\in\{2,\ldots,r\}$. This result is generalized to a generic graph; an example is a reducible graph for which vertices in an independent set can be identified (a reduced graph is defined in Section 1).
Proposition 2.
Let $V_I\subset V$ be an independent set of a graph $\mathcal{G}$ sharing an adjacent vertex. The homogeneous polynomial of any order $n\ge|V_I|+1$:
$$\sum_{\sum_{i\in V_I}a_i=n,\ a\ge1}\binom{n-|V_I|}{a-1}\binom{n}{a}(c\circ x(t))^a,\qquad \binom{n}{a}=\frac{n!}{\prod_{i\in V_I}a_i!}$$
is a martingale with respect to the natural filtration generated by the solution $(x(t),P_x)$ of the stochastic differential Equation (1), where $\sum_{i\in V_I}c_i=0$ with $c_i\in\mathbb{R}$, $i\in V_I$.
Proof. 
Applying Itô’s formula to the monomials $x^a=\prod_{i\in V_I}x_i^{a_i}$, we have
$$dx^a=\frac{1}{2}\sum_{i\in V_I}a_i(a_i-1)x^{a-e_i+e_j}\,dt+(\text{martingale terms}),$$
where the vertex $j$ is adjacent to all vertices of $V_I$. Then,
$$d\sum_{\sum_{k\in V_I}a_k=n,\ a\ge1}f(n,a)x^a=\frac{x_j}{2}\sum_{\sum_{k\in V_I}a_k=n,\ a\ge1}\sum_{i\in V_I}a_i(a_i-1)x^{a-e_i}f(n,a)\,dt+(\text{martingale terms}),\tag{7}$$
where
$$f(n,a)=\binom{n-|V_I|}{a-1}\binom{n}{a}c^a,\qquad \binom{n-|V_I|}{a-1}=\frac{(n-|V_I|)!}{\prod_{i\in V_I}(a_i-1)!}.$$
The drift term on the right side of Equation (7) is proportional to
$$\sum_{i\in V_I}c_i\sum_{\sum_{k\in V_I}a_k=n,\ a\ge1,\ a_i\ge2}x^{a-e_i}f(n-1,a-e_i),$$
and it vanishes because the second summation does not depend on the index $i$ and $\sum_{i\in V_I}c_i=0$.    □
Proposition 2 gives invariants among the $n$-th order moments of the marginal distribution of the solution of the stochastic differential Equation (1) at a given time. More precisely, such a moment is represented as
$$m_a(t):=E_x\{(x(t))^a\}=T_tx^a,\quad |a|=a_1+\cdots+a_r=n,\ a\in\mathbb{N}_0^V.$$
Itoh et al. [1] used the invariants to derive the expression (6) for masses on atoms in the star graph S 2 .
Corollary 1.
Let $V_I\subset V$ be an independent set of a graph $\mathcal{G}$ sharing an adjacent vertex. For moments of each order $n\ge|V_I|+1$, we have
$$\sum_{\sum_{i\in V_I}a_i=n,\ a\ge1}\binom{n-|V_I|}{a-1}\binom{n}{a}c^am_a(t)=\sum_{\sum_{i\in V_I}a_i=n,\ a\ge1}\binom{n-|V_I|}{a-1}\binom{n}{a}(c\circ x)^a,\quad t>0,$$
where $\sum_{i\in V_I}c_i=0$ with $c_i\in\mathbb{R}$, $i\in V_I$.
A small example follows.
Example 4.
Let $\mathcal{G}=C_4$, the cycle graph consisting of four vertices discussed in Example 2. The maximal independent set $V_I=\{2,4\}$ of $C_4$ shares the adjacent vertices 1 and 3. The ordered positive integer partitions of four are $(a_2,a_4)=(1,3)$, $(2,2)$, and $(3,1)$, which correspond to the fourth-order moments $m_{e_2+3e_4}(t)$, $m_{2e_2+2e_4}(t)$, and $m_{3e_2+e_4}(t)$, respectively. They constitute an invariant:
$$m_{e_2+3e_4}(t)-3m_{2e_2+2e_4}(t)+m_{3e_2+e_4}(t)=x_2x_4(x_4^2-3x_2x_4+x_2^2),\quad t\ge0.$$
The existence of invariants among same-order moments is interesting, but we are also interested in the computation of each moment. They can be computed by simple algebra, since the moments of each order satisfy a system of differential equations:
$$\frac{d}{dt}m_a(t)=\sum_{i\in V}\sum_{j\in N(i)}\frac{a_i(a_i-1)}{2}m_{a-e_i+e_j}-\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}m_a,\quad m_a(0)=x^a\tag{9}$$
for each $a\in\Pi_{n,r}^0$, where $\Pi_{n,r}^0$ is the totality of the ordered non-negative integer partitions of an integer $n$ into $r$ parts:
$$\Pi_{n,r}^0:=\{a\in\mathbb{N}_0^r : a_1+\cdots+a_r=n\}.$$
However, it is obvious that solving the system (9) becomes prohibitive as the cardinality of the set $\Pi_{n,r}^0$ grows. Computation of moments via stochastic simulation is discussed in Section 5.
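For small instances, the linear system of moment equations can be assembled and integrated directly. The sketch below (our illustration; the state enumeration and the Runge-Kutta integrator are generic choices, not the paper's method) does this for the fourth-order moments on the cycle $C_4$ and checks the invariant of Example 4 numerically.

```python
import itertools
import numpy as np

# Cycle graph C4: vertices 1..4 with i ~ i+1 (mod 4).
ADJ = {1: (2, 4), 2: (1, 3), 3: (2, 4), 4: (1, 3)}
n, r = 4, 4

# All ordered non-negative integer partitions of n into r parts.
states = [a for a in itertools.product(range(n + 1), repeat=r) if sum(a) == n]
idx = {a: k for k, a in enumerate(states)}

# Assemble dm/dt = M m from the moment system:
# dm_a/dt = sum_{i, j~i} a_i(a_i-1)/2 m_{a-e_i+e_j} - sum_{i, j~i} a_i a_j/2 m_a.
M = np.zeros((len(states), len(states)))
for a in states:
    row = idx[a]
    for i in ADJ:
        for j in ADJ[i]:
            ai, aj = a[i - 1], a[j - 1]
            if ai >= 2:
                b = list(a)
                b[i - 1] -= 1
                b[j - 1] += 1
                M[row, idx[tuple(b)]] += ai * (ai - 1) / 2.0
            M[row, row] -= ai * aj / 2.0

def solve(m0, T=1.0, steps=2000):
    """Integrate dm/dt = M m with classical Runge-Kutta (RK4)."""
    m, h = m0.copy(), T / steps
    for _ in range(steps):
        k1 = M @ m
        k2 = M @ (m + h / 2 * k1)
        k3 = M @ (m + h / 2 * k2)
        k4 = M @ (m + h * k3)
        m = m + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return m

x = np.array([0.1, 0.4, 0.2, 0.3])   # an arbitrary starting point on the simplex
m0 = np.array([np.prod(x ** np.array(a)) for a in states])
mT = solve(m0)

# Invariant of Example 4: m_{e2+3e4} - 3 m_{2e2+2e4} + m_{3e2+e4} is constant.
inv = lambda m: m[idx[(0, 1, 0, 3)]] - 3 * m[idx[(0, 2, 0, 2)]] + m[idx[(0, 3, 0, 1)]]
print(inv(m0), inv(mT))   # both approximately -0.0132
```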

3. Dual Process on Integer Partitions

To study diffusions ( x ( t ) , P x ) governed by the generator (2), we employ a tool called duality, which is a familiar tool in the study of interacting particle systems (see, e.g., Chapter 2 of [7]).
Consider a graph $\mathcal{G}=(V,E)$ and let $(a(t),P_a)$, $a(0)=a$, be a continuous-time Markov chain on the set of ordered non-negative integer partitions of $n$ into $r=|V|$ parts, which is denoted by $\Pi_{n,r}^0$, defined by the rate matrix $\{R_{a,b}\}$:
$$R_{a,a-e_i+e_j}=\frac{a_i(a_i-1)}{2},\quad i\in V,\ j\in N(i),\qquad R_{a,b}=0\ \text{for all other}\ b\ne a,\qquad R_{a,a}=-\sum_{b\ne a}R_{a,b},\tag{10}$$
where
$$R_{a,b}=\lim_{t\downarrow0}\frac{P_a(a(t)=b)-\delta_{a,b}}{t},\quad a,b\in\Pi_{n,r}^0.$$
The backward equation for the transition probability $P_a(a(t)=\cdot)$ is
$$\frac{d}{dt}P_a(a(t))=\sum_{i\in V}\sum_{j\in N(i)}\frac{a_i(a_i-1)}{2}P_{a-e_i+e_j}(a(t))+R_{a,a}P_a(a(t)).\tag{11}$$
We have the following duality relation between the Markov semigroup { T t } and the Markov chain ( a ( t ) , P a ) .
Lemma 1.
The Markov semigroup $\{T_t\}$ associated with the generator (2) and the Markov chain $(a(t),P_a)$ with the rate matrix (10) satisfy
$$T_tx^a=E_x\{(x(t))^a\}=E_a\left[x^{a(t)}\exp\left(-\int_0^tk(a(s))\,ds\right)\right],\quad t\ge0\tag{12}$$
for each $a\in\Pi_{n,r}^0$, where the killing rate is
$$k(a)=\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}-\sum_{i\in V}\frac{d_ia_i(a_i-1)}{2}.\tag{13}$$
Proof. 
Noting that
$$\begin{aligned}Lx^a&=\sum_{i\in V}\sum_{j\in N(i)}\frac{a_i(a_i-1)}{2}x^{a-e_i+e_j}-\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}x^a\\&=\sum_{i\in V}\sum_{j\in N(i)}\frac{a_i(a_i-1)}{2}\left(x^{a-e_i+e_j}-x^a\right)-\left\{\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}-\sum_{i\in V}\frac{d_ia_i(a_i-1)}{2}\right\}x^a\\&=\sum_{b\in\Pi_{n,r}^0}R_{a,b}x^b-k(a)x^a\end{aligned}$$
and the operation (4), we see that $g(t,a)=T_tx^a$ satisfies the differential equation
$$\frac{d}{dt}g(t,a)=\sum_{b\in\Pi_{n,r}^0}R_{a,b}\,g(t,b)-k(a)g(t,a),\quad g(0,a)=x^a$$
for each $a\in\Pi_{n,r}^0$. This is uniquely solved by means of the Feynman–Kac formula, and the assertion follows.    □
Since the total number of particles is kept, i.e., $|a(t)|=a_1(t)+\cdots+a_r(t)=|a|$ for all $t$, the killing rate (13) is bounded. The killing rate does not have a definite sign; however, a key observation is that if the support of a monomial $a$, denoted by $\mathrm{supp}(a):=\{i\in V : a_i>0\}$, is an independent set of $\mathcal{G}$, then the killing rate is non-positive: $k(a)\le0$. The converse is not always true.
To illustrate the Markov chain $(a(t),P_a)$, let us ask specific questions. What is the moment of $2e_2+e_4$ for the cycle graph $C_4$ discussed in Example 2? For a chain that starts from $2e_2+e_4$, there are two possible transitions: one is absorbed into the state $e_1+e_2+e_4$ and the other is absorbed into the state $e_2+e_3+e_4$, where the rates are unities (see Figure 1).
The waiting time for the occurrence of either of these two transitions follows the exponential distribution of rate two. Since $k(a)=-2$ and $k(e_1+e_2+e_4)=k(e_2+e_3+e_4)=2$, the right side of the duality relation (12) can be computed as
$$x_2^2x_4e^{-2t}e^{2t}+\frac{1}{2}(x_1x_2x_4+x_2x_3x_4)\int_0^te^{2s-2(t-s)}2e^{-2s}\,ds=x_2^2x_4+\frac{(x_1+x_3)x_2x_4}{2}\left(1-e^{-2t}\right),\quad t\ge0,$$
where $s$ is the time at which one of the two possible transitions occurs. The first term corresponds to the case in which no transition occurs until time $t$. Let us call such a transition event a collision.
Remark 1.
An analogous dual Markov chain of the diffusion approximation of a kind of Wright–Fisher model was effectively used by Shiga [8,9], where a transition $a\to a-e_i$ occurs with the same rate as in (10). Such a transition event is called a “coalescence”. In contrast to a collision, a coalescence decreases the total number of particles.
Here, we have a simple observation about the invariants among moments discussed in Section 2. If a chain $(a(t),P_a)$ starts from a state $a$ such that $\mathrm{supp}(a)$ is a maximal independent set $V_I$ of the graph $\mathcal{G}$, then the killing rate is non-positive, and the duality relation (12) reduces to
$$m_a(t)=x^a+E_a\left[x^{a(t)}\exp\left(-\int_0^tk(a(s))\,ds\right);\ \text{collisions occur}\right].$$
Corollary 1 implies cancellation of the second term among moments. Moreover, considering the case in which the diffusion $(x(t),P_x)$ starts from a point $x$ in the face corresponding to $V_I$, by Proposition 1 we know that after a collision, $\mathrm{supp}(a(t))$ must contain a vertex that is not contained in $V_I$.
Let us ask another question. What is the moment of $a=2e_1+e_2+e_3$ for the star graph $S_2$ discussed in Section 1? Some consideration reveals that a chain $(a(t),P_a)$ starting from $a$ will never be absorbed, and transitions occur among the three states $a$, $b=e_1+2e_2+e_3$, and $c=e_1+e_2+2e_3$. Since $k(a)=2$ and $k(b)=k(c)=1$, the duality relation (12) gives
$$m_a(t)=e^{-t}E_a\left[x^{a(t)}\exp\left(-\int_0^t\mathbf{1}(a(s)=a)\,ds\right)\right],\tag{14}$$
where $\mathbf{1}(A)$ is 1 if the argument $A$ is true and zero otherwise. Solving the backward Equation (11), we immediately obtain
$$P_a(a(t)=a)=\frac{1}{3}+\frac{2}{3}e^{-3t},\qquad P_a(a(t)=b)=P_a(a(t)=c)=\frac{1}{3}-\frac{1}{3}e^{-3t}.\tag{15}$$
However, computation of the right side of Equation (14) does not seem easy because the expectation depends on the sample path of the chain $(a(s):0\le s\le t)$. Nevertheless, the moments can be obtained easily by solving the system of differential equations (9). In fact, we have
$$\begin{pmatrix}m_a(t)\\m_b(t)\\m_c(t)\end{pmatrix}=\left\{e^{-(3-\sqrt3)t}(A+B)+e^{-2t}C+e^{-(3+\sqrt3)t}(A-B)\right\}\begin{pmatrix}x_1^2x_2x_3\\x_1x_2^2x_3\\x_1x_2x_3^2\end{pmatrix},$$
where
$$A=\begin{pmatrix}\frac12&0&0\\0&\frac14&\frac14\\0&\frac14&\frac14\end{pmatrix},\qquad B=\frac{\sqrt3}{6}\begin{pmatrix}-1&1&1\\1&\frac12&\frac12\\1&\frac12&\frac12\end{pmatrix},\qquad C=\begin{pmatrix}0&0&0\\0&\frac12&-\frac12\\0&-\frac12&\frac12\end{pmatrix}.$$
The observations above lead to the following proposition on the fate of the Markov chain ( a ( t ) , P a ) .
Proposition 3.
Consider the Markov chain $(a(t),P_a)$ on the set of the ordered non-negative integer partitions $\Pi_{n,r}^0$.
(i) If a chain starts from a state $a$ satisfying $1\le|a|\le r$, then it is absorbed into an element of $\{0,1\}^V$;
(ii) If a chain starts from a state $a$ satisfying $n=|a|>r$, then the transition probability $P_a(a(t)=\cdot)$ converges to the uniform distribution on the set of ordered positive integer partitions
$$\Pi_{n,r}:=\{b\in\mathbb{N}^r : b_1+\cdots+b_r=n\}\subset\Pi_{n,r}^0.$$

Proof. 

(i) A state $a$ is absorbing if and only if the corresponding row of the rate matrix (10) is zero, which implies $a\in\{0,1\}^V$. Consider the set of vertices $V_0(a(t))=\{i\in V : a_i(t)=0\}$. Then, $z(t):=|V_0(a(t))|$, $t\ge0$, is a death process and is absorbed into the state $r-n$ at a Markov time $\tau_0<\infty$ with respect to the Markov chain $(a(t),P_a)$, where $a_j(\tau_0)=1$ for $j\in V\setminus V_0(a(\tau_0))$. Let us show that the process eventually decreases if $z(t)>r-n$. If $z(t)>r-n$, at least one vertex, say $j\in V\setminus V_0(a(t))$, satisfies $a_j(t)\ge2$. If the vertex $j$ is connected with a vertex in $V_0(a(t))$, say $k$, the transition $a(t)\to a(t+)=a(t)-e_j+e_k$ occurs with positive probability, and it makes $z(t+)=z(t)-1$. Otherwise, the vertex $j$ should be connected with at least one vertex in $V\setminus V_0(a(t))$, say $l$. The transition $a(t)\to a(t+)=a(t)-e_j+e_l$ makes $a_l(t+)\ge2$. If the vertex $l$ is connected with a vertex in $V_0(a(t))$, the assertion is shown. Otherwise, the vertex $l$ should be connected with at least one vertex in $V\setminus\{V_0(a(t))\cup\{j\}\}$. The proof is completed by repeating this procedure until we reach a vertex in $V\setminus V_0(a(t))$ that is connected with a vertex in $V_0(a(t))$.
(ii) If a chain starts from a state in the set $\Pi_{n,r}^0\setminus\Pi_{n,r}$, the argument for (i), i.e., that $z(t)$ is a death process, shows that in a finite time the chain reaches a state, say $a$, in the set $\Pi_{n,r}$ and never exits from $\Pi_{n,r}$. For such a case, consider restarting the chain from the state $a$. For simplicity, we consider the case of $n=r+1$. Then, a state $a(t)$ can be identified with the unique vertex $i\in V$ satisfying $a_i(t)=2$. Since we are considering a connected graph, there exists a path $i=v_0,v_1,v_2,\ldots,v_s=j$ for any $i\ne j\in V$ with some $s\in\mathbb{N}$, where $v_{k-1}\sim v_k$, $k=1,\ldots,s$. Since all of the transitions occur with rate unity, the sample path has a positive probability. Hence, the Markov chain is irreducible and ergodic, and there exists a unique stationary distribution on the set $\Pi_{n,r}$. Since the uniform distribution on $\Pi_{n,r}$ satisfies the backward Equation (11), it is the stationary distribution. Cases of $n>r+1$ can be shown in a similar manner by showing that up to $n-r$ particles can be moved from one vertex to another vertex.
   □
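The absorption behaviour of part (i) can be observed directly with a standard Gillespie simulation of the dual chain (our sketch; the data representation is a hypothetical convenience): starting with $|a|\le r$ particles, the chain ends in a $\{0,1\}^V$ configuration.

```python
import random

def gillespie_dual(adj, a0, seed=0):
    """Gillespie simulation of the dual chain with collision rates
    R_{a, a-e_i+e_j} = a_i(a_i-1)/2 for each neighbour j ~ i."""
    rng = random.Random(seed)
    a = dict(a0)
    t = 0.0
    while True:
        moves = [(i, j, a[i] * (a[i] - 1) / 2.0)
                 for i in adj if a[i] >= 2 for j in adj[i]]
        total = sum(rate for _, _, rate in moves)
        if total == 0.0:          # all a_i <= 1: the chain is absorbed in {0,1}^V
            return a, t
        t += rng.expovariate(total)
        u = rng.random() * total
        for i, j, rate in moves:
            u -= rate
            if u <= 0.0:
                a[i] -= 1
                a[j] += 1
                break

C4 = {1: (2, 4), 2: (1, 3), 3: (2, 4), 4: (1, 3)}
final, t_abs = gillespie_dual(C4, {1: 3, 2: 0, 3: 0, 4: 0})
print(final)   # an absorbed configuration with entries in {0, 1}
```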
Proposition 3 shows that if the Markov chain $(a(t),P_a)$ starts from a state $a$ satisfying $|a|>|V|$, convergence of the chain can be divided into two phases: (1) exit from the set $\Pi_{n,r}^0\setminus\Pi_{n,r}$ and (2) convergence to the uniform distribution on the set $\Pi_{n,r}$. The largest eigenvalue of the rate matrix is zero. To consider the mixing, we have to know the second-largest eigenvalue, say $\lambda_2$. By spectral decomposition of the transition probability, the mixing time of Phase 2, or the infimum of $t$ satisfying
$$\max_{a\in\Pi_{n,r}}\big\|P_a(a(t)=\cdot)-|\Pi_{n,r}|^{-1}\big\|_{\mathrm{TV}}<\epsilon$$
for $\epsilon>0$, is less than $c|\lambda_2|^{-1}\log\epsilon^{-1}$ for some constant $c>0$, where $\|\mu-\nu\|_{\mathrm{TV}}$ is the total variation distance between probability measures $\mu$ and $\nu$.
Example 5.
Consider the case of $n=r+1$ (see the proof of Proposition 3 (ii)). It can be shown that the rate matrix $\{R_{a,b}\}$ reduces to the negative of the Laplacian matrix of $\mathcal{G}$; the second-smallest eigenvalue of the Laplacian is called the algebraic connectivity of $\mathcal{G}$. For a connected graph, it is known that the smallest eigenvalue of the Laplacian is zero and the second-smallest eigenvalue is bounded below by $4/(r\,\mathrm{diam}(\mathcal{G}))$ [10], where $\mathrm{diam}(\mathcal{G})$ is the diameter of $\mathcal{G}$, or the maximum of the shortest path lengths between any pair of vertices in $\mathcal{G}$. For the star graph $S_2$ and the fourth-order monomials discussed above, the rate matrix is the negative of the Laplacian matrix:
$$R=\begin{pmatrix}-2&1&1\\1&-1&0\\1&0&-1\end{pmatrix},$$
and the eigenvalues are $0$, $-1\ (<-2/3)$, and $-3$, where $r=3$ and $\mathrm{diam}(S_2)=2$. The two eigenvalues appear in the transition probabilities (15). For a generic graph with large $r$, the mixing time is $O(r\,\mathrm{diam}(\mathcal{G})\log\epsilon^{-1})$.
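The spectrum of the rate matrix in this example is easy to verify numerically (NumPy here is our choice of tool):

```python
import numpy as np

# Rate matrix of the dual chain for the star S_2 in the n = r + 1 = 4 example,
# acting on the three states a, b, c; it equals minus the graph Laplacian of S_2.
R = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 1.0,  0.0, -1.0]])
eig = np.sort(np.linalg.eigvalsh(R))   # R is symmetric, so eigvalsh applies
print(eig)                              # approximately [-3., -1., 0.]
```

The magnitude of the second-largest eigenvalue, $1$, indeed exceeds the algebraic-connectivity bound $4/(r\,\mathrm{diam})=2/3$.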
Assessment of Phase 1 seems harder. The death process $z(t)=|V_0(a(t))|$, $t>0$, with $z(0)\le r-1$ is not a Markov process. The death rate is bounded below:
$$\sum_{i\in V}\sum_{j\in N(i)\cap V_0(a(t))}\frac{a_i(t)(a_i(t)-1)}{2}\ge\frac{a_k(t)(a_k(t)-1)}{2}\quad\text{for some}\ k\in V\setminus V_0(a(t)).$$
To obtain a rough estimate of the right side, let us suppose the state $a(t)$ follows the uniform distribution on the set of ordered positive integer partitions $\Pi_{n,r-z}$, where
$$P(a_k=i)=\binom{n-i-1}{r-z-2}\binom{n-1}{r-z-1}^{-1},\quad i\in\{1,\ldots,n-r+z+1\}.$$
Then, the expectation of the lower bound is $(n-r+z)_2/(r-z)_3$, where $(n)_i=n(n+1)\cdots(n+i-1)$. When $n$ is large, the dominant contribution to the expectation of the waiting time for the exit comes from the period $\{t:z(t)=z(0)\}$, and it is $O(n\,z(0)\,r^2)$. Hence, the expectation of the waiting time for the exit would be $O(n^3)$.
Shiga [8,9] and Shiga and Uchiyama [11] studied structures of extremal stationary states of the diffusion approximation of a kind of Wright–Fisher model in $[0,1]^S$ for a countable set $S$ by using its dual Markov chain. Extremal states of the adjoint Markov semigroup $\{T_t^*\}$ on $\mathcal{P}(\Delta_{r-1})$ induced by $\{T_t\}$ associated with the diffusion $(x(t),P_x)$ can be studied by using the dual Markov chain $(a(t),P_a)$. Note that positivity of a stationary state is not assumed, as explained in Section 1.
Theorem 1.
The support of an extremal stationary state of the adjoint Markov semigroup $\{T_t^*\}$ is a face of the simplex $\Delta_{r-1}$ corresponding to an independent set $V_I$ of the graph $\mathcal{G}$, namely, $\mathrm{int}(\mathrm{conv}\{e_i:i\in V_I\})$.
Proof. 
Consider a Markov chain $(a(t),P_a)$ with the rate matrix (10) starting from a state $a(0)=a\in\{0,1\}^V$. According to Proposition 3 (i), such an $a$ is an absorbing state and the chain stays there. Lemma 1 gives
$$T_tx^a=x^a\exp\left(-t\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}\right),\quad t\ge0.$$
A stationary state $\nu\in\mathcal{P}(\Delta_{r-1})$ satisfies $\langle\nu,T_tx^a\rangle=\langle T_t^*\nu,x^a\rangle=\langle\nu,x^a\rangle$ for any $x^a$. If $a\in\{0,1\}^V$, this condition reduces to
$$\left\langle\nu,\,x^a\left\{1-\exp\left(-t\sum_{i\in V}\sum_{j\in N(i)}\frac{a_ia_j}{2}\right)\right\}\right\rangle=0,\quad t\ge0.$$
Therefore, if $\mathrm{supp}(a)$ is not an independent set, then $\langle\nu,x^a\rangle=0$. Since we are considering a connected graph $\mathcal{G}$, $V$ is not an independent set of $\mathcal{G}$. The condition reduces to $\langle\nu,x_1\cdots x_r\rangle=0$. Since
$$\{x\in\Delta_{r-1} : x_1\cdots x_r>0\}=\mathrm{int}(\Delta_{r-1})=\mathrm{int}(\mathrm{conv}\{e_i:i\in V\}),$$
the condition $\langle\nu,x_1\cdots x_r\rangle=0$ excludes $\mathrm{int}(\mathrm{conv}\{e_i:i\in V\})$ from the support of $\nu$. Suppose there exists a vertex $j_1$ such that $V^{(1)}=V\setminus\{j_1\}$ is still not an independent set. Set $a$ such that $a_i=1$ if $i\in V^{(1)}$ and $a_i=0$ otherwise for each $i\in\{1,\ldots,r\}$. Since
$$\{x\in\Delta_{r-1} : x^a>0\}=\mathrm{int}(\mathrm{conv}\{e_i:i\in V^{(1)}\}),$$
the condition $\langle\nu,x^a\rangle=0$ excludes $\mathrm{int}(\mathrm{conv}\{e_i:i\in V^{(1)}\})$ from the support of $\nu$. Repeating this procedure yields an independent set $V_I=V\setminus\{j_1,\ldots,j_s\}$ for some $s\in\mathbb{N}$, and the face $\mathrm{int}(\mathrm{conv}\{e_i:i\in V_I\})$ is not excluded from the support of $\nu$.    □
The steps in the above proof appear in the following example.
Example 6.
Let $\mathcal{G}=C_4$, the cycle graph consisting of four vertices. The support of an extremal stationary state appearing in Example 2 is confirmed as follows. Remove the vertex $j_1=2$ from the vertex set $V$. Since $V^{(1)}=V\setminus\{2\}=\{1,3,4\}$ is not an independent set, the face $\mathrm{int}(\mathrm{conv}\{e_1,e_3,e_4\})$ is excluded from the support of extremal stationary states. Then, remove $j_2=4$ from $V^{(1)}$. Since $V^{(2)}=V\setminus\{2,4\}=\{1,3\}$ is an independent set, the face $\mathrm{int}(\mathrm{conv}\{e_1,e_3\})$ is the support of an extremal stationary state.
A direct consequence of Theorem 1 on the moments is as follows.
Corollary 2.
For each $n\in\mathbb{N}$, if the limit of an $n$-th order moment of the diffusion $(x(t),P_x)$ on a graph $\mathcal{G}$ is positive, namely,
$$\lim_{t\to\infty}m_a(t)=\lim_{t\to\infty}E_x\{(x(t))^a\}>0\quad\text{for}\ a\ \text{satisfying}\ |a|=n,$$
then $\mathrm{supp}(a)$ is an independent set of $\mathcal{G}$.

4. Diffusion with Linear Drift

In this section, we consider the diffusion ( x ˜ ( t ) , P ˜ x ) taking value in probability measures on a graph G = ( V , E ) , r = | V | satisfying the following stochastic differential equation with linear drift:
d x i = j N ( i ) x i x j d B i j + α 2 ( 1 r x i ) d t , i V
for α R > 0 . The drift term, α ( 1 r x i ) d t / 2 , gives a killing of the dual process with a linear rate. As shown below, behaviours of the diffusion and the dual Markov chain are significantly different from those without drift discussed in previous sections.
In Itoh et al.’s discrete stochastic model described in Section 1, this drift corresponds to adding the following dynamics: at each instant of time, one of the N particles is chosen uniformly at random and reassigned to another vertex chosen uniformly at random, with rate α ( r − 1 ) / ( N − 1 ) . In the Wright–Fisher model, this drift corresponds to a mutation mechanism [4].
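To make the dynamics concrete, the stochastic differential equation above can be integrated numerically. The following Euler–Maruyama sketch (illustrative code; the function name, step size, and parameter values are not from the paper) simulates the diffusion with linear drift on the cycle graph C 4. Since B j i = − B i j , the noise contributions cancel in pairs, so the total mass is conserved along the discretized path:

```python
import math
import random

def euler_maruyama(adj, x0, alpha, dt=1e-4, steps=5000, seed=0):
    """Euler-Maruyama sketch of dx_i = sum_{j in N(i)} sqrt(x_i x_j) dB_ij
    + (alpha/2)(1 - r x_i) dt, with skew-symmetric noise B_ji = -B_ij."""
    rng = random.Random(seed)
    r = len(x0)
    x = list(x0)
    sd = math.sqrt(dt)
    for _ in range(steps):
        # one Gaussian increment per undirected edge; the reverse direction
        # carries the opposite sign (skew symmetry)
        dB = {(i, j): rng.gauss(0.0, sd)
              for i in range(r) for j in adj[i] if i < j}
        new = []
        for i in range(r):
            noise = 0.0
            for j in adj[i]:
                b = dB[(i, j)] if i < j else -dB[(j, i)]
                # clamp inside the square root only, so the pairwise
                # cancellation (hence the total mass) stays exact
                noise += math.sqrt(max(x[i], 0.0) * max(x[j], 0.0)) * b
            new.append(x[i] + noise + 0.5 * alpha * (1.0 - r * x[i]) * dt)
        x = new
    return x

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # cycle graph C4
x = euler_maruyama(adj, [0.25] * 4, alpha=1.0)
print(sum(x))  # total mass is conserved by the skew symmetry
```

The clamping convention inside the square root is one possible choice near the boundary of the simplex; it keeps the conservation of mass exact while preventing a negative argument.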
Let ( a ˜ ( t ) , P ˜ a ) be a continuous-time Markov chain on a finite integer lattice, namely, the set of non-negative integer vectors
I : = { a N 0 V : a 1 + + a r < }
defined by the rate matrix { R ˜ a , b } :
R ˜ a , a e i + e j = a i ( a i 1 ) 2 , i V , j N ( i ) , R ˜ a , a e i = α 2 a i , i V , R ˜ a , b = 0 , for all other b a , R ˜ a , a = b a R ˜ a , b ,
where
R ˜ a , b = lim t 0 P ˜ a ( a ˜ ( t ) = b ) δ a , b t , a , b I
The backward equation for the transition probability P ˜ a ( a ˜ ( t ) = · ) is
d d t P ˜ a ( a ˜ ( t ) ) = i V j N ( i ) a i ( a i 1 ) 2 P ˜ a e i + e j ( a ˜ ( t ) ) + α 2 i V P ˜ a e i ( a ˜ ( t ) ) R ˜ a , a P ˜ a ( a ˜ ( t ) ) .
The following duality relation between the Markov semigroup associated with the diffusion ( x ˜ ( t ) , P ˜ x ) denoted by { T ˜ t : t 0 } and the Markov chain ( a ˜ ( t ) , P ˜ a ) can be shown in the same manner as Lemma 1.
Lemma 2.
The Markov semigroup { T ˜ t } and the Markov chain ( a ˜ ( t ) , P ˜ a ) with the rate matrix (17) satisfy
T ˜ t x a = E ˜ x ( x a ( t ) ) = E ˜ a x a ˜ ( t ) exp 0 t k ˜ ( a ˜ ( s ) ) d s , t 0
for each a I , where the killing rate is
k ˜ ( a ) = i V j N ( i ) a i a j 2 + α 2 ( r 1 ) | a | i V d i a i ( a i 1 ) 2 .
In contrast to the dynamics under the rate matrix { R a , b } in (10), particles are erased, which makes the total number of particles | a ˜ ( t ) | = a ˜ 1 ( t ) + ⋯ + a ˜ r ( t ) decrease. It is clear from the rate matrix (17) that 0 is the unique absorbing state.
Proposition 4.
Let τ : = inf { t > 0 : a ˜ ( t ) = 0 } , which is a Markov time with respect to the Markov chain ( a ˜ ( t ) , P ˜ a ) with the rate matrix (17). Then,
P ˜ a ( τ < ∞ ) = 1 and E ˜ a τ = ( 2 / α ) ∑ i = 1 | a | 1 / i .
Proof. 
Since a ˜ ( t ) = 0 if and only if | a ˜ ( t ) | = 0 , we consider the Markov chain of the cardinality | a ˜ ( t ) | . According to the rate matrix (17), it is a linear death process with rate α | a ˜ ( t ) | / 2 . Noting that τ is the sum of independent exponential random variables of rates α i / 2 , i ∈ { 1 , … , | a | } , we have the assertion.    □
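The expectation in Proposition 4 is easy to check by simulation. The sketch below (illustrative code) samples the absorption time of the linear death process and compares the Monte Carlo average with ( 2 / α ) ∑ i = 1 | a | 1 / i :

```python
import random

def absorption_time(n, alpha, rng):
    """Sample tau for the death process |a(t)|, which jumps m -> m - 1 at
    rate alpha * m / 2 until it hits zero (Proposition 4)."""
    t, m = 0.0, n
    while m > 0:
        t += rng.expovariate(alpha * m / 2.0)
        m -= 1
    return t

rng = random.Random(42)
n, alpha, trials = 3, 1.0, 20000
est = sum(absorption_time(n, alpha, rng) for _ in range(trials)) / trials
exact = (2.0 / alpha) * sum(1.0 / i for i in range(1, n + 1))  # 2 * H_3 = 11/3
print(est, exact)
```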
To illustrate the Markov chain ( a ˜ ( t ) , P ˜ a ) , let us ask a specific question: what is the moment corresponding to a = e 1 + e 2 on the cycle graph C 4 ? For a chain that starts from a, there are four possible sample paths (Figure 2):
(i)
Neither particle is erased;
(ii)
Particle 1 is erased, but Particle 2 survives;
(iii)
Particle 2 is erased, but Particle 1 survives;
(iv)
Both particles are erased.
The waiting time for either of the two particles to be erased follows the exponential distribution of rate α . Since k ˜ ( a ) = 1 + 3 α and k ˜ ( e 1 ) = k ˜ ( e 2 ) = 3 α / 2 , the right side of the duality relation (18) can be computed as
x 1 x 2 e α t e ( 1 + 3 α ) t + x 1 + x 2 2 0 t α e α s e α ( t s ) / 2 e ( 1 + 3 α ) s 3 α ( t s ) / 2 d s + 0 t 0 u α e α s α 2 e α ( u s ) / 2 e ( 1 + 3 α ) s 3 α ( u s ) / 2 d s d u = x 1 x 2 e t ( 1 + 4 α ) + ( x 1 + x 2 ) α 2 ( 1 + 2 α ) ( e 2 α t e ( 1 + 4 α ) t ) + α 4 ( 1 + 4 α ) α e 2 α t 4 ( 1 + 2 α ) + α 2 e ( 1 + 4 α ) t 2 ( 1 + 2 α ) ( 1 + 4 α ) ,
where s > 0 and u > s are the times at which the first and second particles are erased.
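The closed form above can be verified numerically. In the sketch below (illustrative code), the left side is evaluated by midpoint quadrature of the single and double integrals and compared with the right side:

```python
import math

def lhs(x1, x2, t, a):
    # first term: neither particle is erased up to time t
    val = x1 * x2 * math.exp(-(1 + 4 * a) * t)
    # one particle erased at time s, the other survives
    def f1(s):
        return (a * math.exp(-a * s) * math.exp(-a * (t - s) / 2)
                * math.exp(-(1 + 3 * a) * s - 3 * a * (t - s) / 2))
    n = 800
    h = t / n
    val += (x1 + x2) / 2 * sum(f1((k + 0.5) * h) for k in range(n)) * h
    # both particles erased, at times s < u
    def f2(s, u):
        return (a * math.exp(-a * s) * (a / 2) * math.exp(-a * (u - s) / 2)
                * math.exp(-(1 + 3 * a) * s - 3 * a * (u - s) / 2))
    m = 300
    hu = t / m
    for ku in range(m):
        u = (ku + 0.5) * hu
        hs = u / m
        val += sum(f2((ks + 0.5) * hs, u) for ks in range(m)) * hs * hu
    return val

def rhs(x1, x2, t, a):
    return (x1 * x2 * math.exp(-(1 + 4 * a) * t)
            + (x1 + x2) * a / (2 * (1 + 2 * a))
            * (math.exp(-2 * a * t) - math.exp(-(1 + 4 * a) * t))
            + a / (4 * (1 + 4 * a))
            - a * math.exp(-2 * a * t) / (4 * (1 + 2 * a))
            + a ** 2 * math.exp(-(1 + 4 * a) * t)
            / (2 * (1 + 2 * a) * (1 + 4 * a)))

L = lhs(0.3, 0.2, 1.0, 0.7)
R = rhs(0.3, 0.2, 1.0, 0.7)
print(L, R)  # the two sides agree up to quadrature error
```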
The adjoint Markov semigroup { T ˜ t * } on P ( Δ r − 1 ) induced by { T ˜ t } has a unique stationary state.
Theorem 2.
For the adjoint Markov semigroup { T ˜ t * } , there exists the unique stationary state ν α P ( Δ r 1 ) satisfying
lim t T ˜ t * δ x = ν α
for every x Δ r 1 .
Proof. 
Since the Markov chain ( a ˜ ( t ) , P ˜ a ) with the rate matrix (17) is absorbed into 0, Lemma 2 and Proposition 4 give
lim t T ˜ t x a = lim t E ˜ a x a ˜ ( t ) exp 0 t k ˜ ( a ˜ ( s ) ) d s ; t τ + lim t E ˜ a x a ˜ ( t ) exp 0 t k ˜ ( a ˜ ( s ) ) d s ; t > τ = E ˜ a exp 0 τ k ˜ ( a ˜ ( s ) ) d s ; τ < = E ˜ a exp 0 τ k ˜ ( a ˜ ( s ) ) d s
for all a I . Since lim t δ x , T ˜ t x a = lim t T ˜ t * δ x , x a for each x Δ r 1 , there exists a unique probability measure ν α satisfying lim t T ˜ t f = ν α , f for all f C ( Δ r 1 ) .    □
The stationary state ν α converges weakly to a common limit as α → ∞ , irrespective of the graph.
Corollary 3.
The stationary state ν α of the adjoint Markov semigroup { T ˜ t * } satisfies
lim α ν α = δ ( 1 / r , , 1 / r ) .
Proof. 
Since the killing rate (19) is bounded and k ˜ ( a ) = α ( r 1 ) | a | / 2 + O ( 1 ) for large α , the leading contribution to the expression (20) can be evaluated by the death process | a ˜ ( t ) | considered in Proposition 4, whose waiting time follows the exponential distribution of rate α | a ˜ ( t ) | / 2 . Let n = | a ˜ ( 0 ) | = | a | . We have
ν α , x a = E ˜ a exp 0 τ k ˜ ( a ˜ ( s ) ) d s 0 d s 1 d s n i = 1 n e α 2 ( r 1 ) i s i + c n α 2 i e α 2 i s i = α 2 n n ! i = 1 n 2 α r i + 2 c n , a I ,
where c n is a constant satisfying k ˜ ( b ) ≤ α ( r − 1 ) | b | / 2 + c n for all b satisfying | b | ∈ { 1 , … , n } . In the same way, ν α , x a is bounded below. The assertion follows by taking the limit α → ∞ of these bounds.    □
Moreover, the stationary state ν α has a continuous and strictly positive density.
Theorem 3.
For the adjoint Markov semigroup { T ˜ t * } , the unique stationary state ν α P ( Δ r 1 ) is absolutely continuous with respect to the Lebesgue measure on Δ r 1 and admits a probability density that is strictly positive in int ( Δ r 1 ) and is of C ( Δ r 1 ) -class.
Proof. 
We first show that ν α has a density of C ( Δ r 1 ) -class. By Theorem 2, we have
ν α , e 1 i V x i n i = j = 0 ( 1 ) j j ! a Π j , r 0 j a ν α , ( x n ) a = j = 0 ( 1 ) j j ! a Π j , r 0 j a n a E ˜ a exp 0 τ k ˜ ( a ˜ ( s ) ) d s
for each n Z V . Therefore, ν α has a C ( Δ r 1 ) -density represented as
p α ( x ) = n Z V e 1 i V x i n i j = 0 ( 1 ) j j ! a Π j , r 0 j a n a E ˜ a exp 0 τ k ˜ ( a ˜ ( s ) ) d s .
We next show that the density p α is strictly positive in int ( Δ r 1 ) . Consider an approximation of p α ( x ) by polynomials:
p α ( n ) ( x ) = i 1 = 0 n i r = 0 n p α i 1 n , , i r n n i x i
satisfying lim n → ∞ p α ( n ) ( x ) = p α ( x ) . Suppose there exists a point x ¯ ∈ int ( Δ r − 1 ) satisfying p α ( x ¯ ) = 0 . Since the polynomials x ¯ i are strictly positive, for any small positive constants ϵ 1 and ϵ 2 , there exists an N such that
p α i 1 n , , i r n < ϵ 1 , i { 0 , , n } V
and int ( Δ r 1 ) is covered by open balls:
B n ( i ) = { x : | x i j i j / n | < ϵ 2 } , i { 0 , , n } V
for all n > N . Since p α is smooth, for every point x ∈ int ( Δ r − 1 ) , we can find a ball containing x such that p α ( x ) < ϵ 1 + c ϵ 2 for some constant c. This implies p α ( x ) = 0 for all x ∈ int ( Δ r − 1 ) , but this contradicts the fact that ν α , x a > 0 , a ∈ I , which follows from the expression (20) because the killing rate k ˜ ( a ˜ ) is bounded and the Markov time satisfies P ˜ ( τ < ∞ ) = 1 by Proposition 4.    □
An immediate consequence is the following corollary, which is an analogous result to Corollary 2.
Corollary 4.
The moments of the stationary state ν α of the adjoint Markov semigroup { T ˜ t * } are positive, namely,
m a ( α ) : = ν α , x a = E ˜ ν α x a > 0 , for each a I .
The moments of the stationary state can be obtained by the condition for the stationary state (5). It gives a system of recurrence relations:
0 = i V j N ( i ) a i ( a i 1 ) 2 m a e i + e j i V j N ( i ) a i a j 2 m a + α 2 i V a i ( m a e i r m a )
for each a ∈ I with the boundary condition m 0 = 1 . In contrast to the system of ordinary differential equations (9), this system is not closed among the moments of the same order. Before solving the system for the moment of a given monomial a, we must solve the systems for all moments of order lower than a. Solving System (22) therefore seems a formidable task; diffusion on a complete graph is an exception.
Example 7.
Let G = K r , which is the complete graph consisting of r vertices discussed in Example 1. The unique solution of the system of recurrence relations (22) is
m a ( α ) = i = 1 r ( α ) a i ( r α ) n , a I .
Moreover, since this expression gives the moments of the symmetric Dirichlet distribution of parameter α, the stationary state is the Dirichlet distribution:
ν α ( x 1 , , x r ) = Γ ( r α ) { Γ ( α ) } r i = 1 r x i α 1 d x 1 d x r .
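For the complete graph, the claim that (23) solves the recurrence relations (22) can be verified directly: substitute the moments (23) into the right side of (22) with N ( i ) = V \ { i } and check that it vanishes. The helper names below are illustrative:

```python
def rising(x, n):
    """Rising factorial (x)_n = x (x + 1) ... (x + n - 1)."""
    out = 1.0
    for k in range(n):
        out *= x + k
    return out

def moment(a, alpha):
    """The stationary moment (23) on the complete graph K_r."""
    r, n = len(a), sum(a)
    m = 1.0
    for ai in a:
        m *= rising(alpha, ai)
    return m / rising(r * alpha, n)

def residual(a, alpha):
    """Right side of the recurrence (22) on K_r, where N(i) is every other
    vertex; it should vanish when evaluated at the moments (23)."""
    r = len(a)
    m = moment(a, alpha)
    res = 0.0
    for i in range(r):
        for j in range(r):
            if j == i:
                continue
            if a[i] >= 2:  # collision term a -> a - e_i + e_j
                b = list(a)
                b[i] -= 1
                b[j] += 1
                res += a[i] * (a[i] - 1) / 2.0 * moment(b, alpha)
            res -= a[i] * a[j] / 2.0 * m
        if a[i] >= 1:  # drift term a -> a - e_i
            c = list(a)
            c[i] -= 1
            res += alpha / 2.0 * a[i] * (moment(c, alpha) - r * m)
    return res

print(residual([2, 1, 0], 0.7), residual([3, 1, 2], 1.3))  # both ~ 0
```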
Remark 2.
The moments (23) are, up to multiplication by the multinomial coefficient, those of the Dirichlet-multinomial distribution. Renumbering the set supp ( a ) by { 1 , … , l } and taking the limit α → 0 and r → ∞ with r α = θ > 0 , the expression (23) reduces to the form
θ l ( θ ) n ∏ i = 1 l ( a i − 1 ) ! , a ∈ Π n , l ,
which is known as the Ewens sampling formula, or the exchangeable partition probability function of the Dirichlet process prior in Bayesian statistics (see, e.g., [12] for an introduction). Karlin and McGregor [13] derived this formula by using a system of recurrence relations based on coalescents mentioned in Remark 1. In this sense, we have found an alternative system of recurrence relations (22), based on collisions, that the Formula (24) satisfies.

5. Applications

In this section, we present applications of the results developed in previous sections.

5.1. Finding Independent Sets of Graphs

Itoh et al.’s discrete stochastic model described in Section 1 stops when the set of vertices occupied by at least one particle constitutes an independent set of a graph. The model is summarized as the following procedure.
The cardinality of the set of vertices to which at least one particle is assigned decreases, and the set eventually reduces to an independent set of G . The integer M is needed to confirm that we cannot choose particles from neighboring vertices. If M is sufficiently large, Algorithm 1 provides an independent set with high probability.
Algorithm 1 Finding an independent set of a graph
  • Input: A graph G = ( V , E ) with vertices V = { 1 , … , r } , the number of particles N ≥ r , and an integer M ∈ N to stop the iteration.
  • Output: A candidate of an independent set of G .
  • Step 1: Assign particles to vertices such that at least one particle is assigned to each vertex.
  • Step 2: Set c = 0 .
  • Step 3: Choose two distinct particles uniformly at random. Let i ∈ V and j ∈ V be the vertices to which the particles are assigned.
  • Step 4: If j ∈ N ( i ) , then assign both particles to i or j with probability 1 / 2 and go to Step 2. Otherwise, c ← c + 1 .
  • Step 5: If c < M , go to Step 3.
  • Step 6: Output the list of vertices to which at least one particle is assigned.
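A minimal implementation of Algorithm 1 may look as follows (illustrative code; as described above, a collision occurs only when the two chosen particles sit on neighbouring vertices):

```python
import random

def find_independent_set(adj, num_particles, M, seed=0):
    """Sketch of Algorithm 1: when two particles on neighbouring vertices are
    chosen, both move to one of the two vertices with probability 1/2."""
    rng = random.Random(seed)
    r = len(adj)
    # Step 1: at least one particle on each vertex
    vertex_of = list(range(r)) + [rng.randrange(r)
                                  for _ in range(num_particles - r)]
    c = 0                                                # Step 2
    while c < M:                                         # Step 5
        p, q = rng.sample(range(num_particles), 2)       # Step 3
        i, j = vertex_of[p], vertex_of[q]
        if j in adj[i]:                                  # Step 4: collision
            v = i if rng.random() < 0.5 else j
            vertex_of[p] = vertex_of[q] = v
            c = 0
        else:
            c += 1
    return sorted(set(vertex_of))                        # Step 6

adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}       # cycle graph C4
out = find_independent_set(adj, num_particles=12, M=2000)
print(out)  # occupied vertices; an independent set of C4 with high probability
```

With M = 2000, reaching the stopping condition while two occupied vertices are still adjacent requires 2000 consecutive misses of that pair, which is vanishingly unlikely.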
A natural question is how many steps are needed to find an independent set. Answering this question seems hard, but by regarding the diffusion satisfying the stochastic differential Equation (1) as an approximation of the procedure of Algorithm 1, we can obtain a rough idea. Because of the scaling in the diffusion limit, a unit of time in the diffusion corresponds to N ( N − 1 ) / 2 iterations of Steps 3 and 4 of Algorithm 1.
According to the argument of Proposition 1, a sample path of the diffusion ( x ( t ) , P x ) starting from a point x int ( Δ r 1 ) is absorbed into lower dimensional faces and is eventually absorbed into a face corresponding to an independent set.
Proposition 5.
Let U V be a set of vertices that is not an independent set of a graph G . For a sample path of the diffusion ( x ( t ) , P x ) starting from a point x int ( conv { e i : i U } ) , the Markov time
τ U = inf { t > 0 : x ( t ) Bd ( conv { e i : i U } ) }
satisfies
P x ( τ U > t ) c x e t | E U | , t > 0 ,
where E U is the edge set of the induced subgraph of G consisting of U, Bd ( conv { e i : i U } ) is the boundary of conv { e i : i U } , and c x ( 0 , 1 ] is a constant depending on x.
Proof. 
By the argument in the proof of Theorem 1, we have
E x ( x ( t ) ) = E x { x ( t ) ; x ( t ) > 0 } max ( x ) E x { 1 ( x ( t ) > 0 ) } = max ( x ) P x { x ( t ) int ( conv { e i : i U } ) } = max ( x ) P x ( τ U > t ) , x = i U x i ,
while E x ( x ( t ) ) = x e t | E U | . Choose c x = x { max ( x ) } 1 = x | U | | U | .    □
The author has not found other properties of the Markov time τ U for generic graphs, but the diffusion on a complete graph is an exception; the probability distribution function can be obtained exactly.
Proposition 6.
Let G be a complete graph K r . For a face Δ s 1 = conv { e i : i U } , 2 s r , the distribution of the Markov time (25) is represented as
P x ( τ U > t ) = i s ( 2 i 1 ) ( 1 ) i e i ( i 1 ) 2 t × l = s 2 i 2 ( 2 i ) l ( i + 1 ) l l ! ( l + 1 ) ! a Π l + 2 , s l + 2 a x a b Π l + 1 , s l + 1 b x b .
Proof. 
The inclusion–exclusion argument shows the following expression:
P x ( τ U > t ) = p ¯ 1 , , s ( t ) i 1 , , i s 1 p ¯ i 1 , , i s 1 ( t ) + i 1 , , i s 2 p ¯ i 1 , , i s 2 ( t ) + ( 1 ) s 1 i = 1 s p ¯ i ( t ) ,
where p ¯ i 1 , , i u ( t ) is the total mass on conv { e i 1 , , e i u } and the summations are taken over the totality of distinct indices chosen from { 1 , , s } . For K 2 , an explicit expression can be obtained by solving a backward equation for the diffusion (Equation (4.15) in [14]):
p ¯ 1 ( t ) = x 1 + i 2 ( 2 i 1 ) ( 1 ) i e i ( i 1 ) t / 2 l 0 ( 2 i ) l ( i + 1 ) l l ! ( l + 1 ) ! ( x 1 l + 2 x 1 l + 1 ) .
A complete graph can be reduced to K 2 by any partition of the vertex set into two vertex sets (the reduction is defined in Section 1). For example, K 3 is reducible to the K 2 consisting of Vertices { 1 + 2 , 3 } , where the Vertex 1 + 2 is obtained by identifying Vertices 1 and 2. In the same way, K 3 is also reducible to { 1 + 3 , 2 } and { 2 + 3 , 1 } . Therefore, an expression of p ¯ i 1 , … , i u ( t ) is obtained from the right side of (27) by replacing x 1 with x i 1 + ⋯ + x i u . The inclusion–exclusion argument gives
a Π l , s l a x a = ( x 1 + + x s ) l i 1 , , i s 1 ( x i 1 + + x i s 1 ) l + + ( 1 ) s 1 i = 1 s ( x 1 + + x s ) ,
where both sides are zero if l < s . Substituting the expression obtained by the expression (27) into (26) and collecting terms by using the equality (28), the assertion follows.    □
According to Proposition 6, the probability distribution function of the exit time from int ( Δ s − 1 ) is asymptotically 1 − ( 2 s − 1 ) ! ! 2 s − 1 ⟨ x ⟩ e − s ( s − 1 ) t / 2 for large t, where ( 2 s − 1 ) ! ! = ( 2 s − 1 ) ( 2 s − 3 ) ⋯ 1 . Let the sequence of vertex sets occupied by at least one particle in Algorithm 1 be denoted by V, U ( r − 1 ) , U ( r − 2 ) , … , U ( 1 ) = { i } for a vertex i ∈ V . If the exit time for U ( s ) followed the exponential distribution of rate | E U ( s ) | = s ( s − 1 ) / 2 (of course, this is not exactly true), the expected waiting time until a sample path is absorbed into a single vertex would be 2 ( 1 − 1 / r ) for large r. This rough argument suggests that the expected computation cost of Algorithm 1 would be O ( r N 2 ) for a complete graph, because an iteration of Steps 3 and 4 can be executed in O ( r ) . Luby’s algorithm for finding an independent set described in Section 1 demands O ( 1 ) time using O ( r 2 ) processors.

5.2. Bayesian Graph Selection

Consider a sample of particles of size n from the unique stationary state of the adjoint Markov semigroup { T ˜ t * } on P ( Δ r − 1 ) induced by the Markov semigroup { T ˜ t } associated with the diffusion ( x ˜ ( t ) , P ˜ x ) that appeared in Theorem 2, such that a i particles are taken from the vertex i ∈ V of a graph G = ( V , E ) . We assume that the probability of taking a sample does not depend on the order of the particles; namely, they are exchangeable. Such probabilities constitute the multinomial distribution, namely, a probability measure on ordered non-negative integer partitions of n as
q a : = n a x a , a Π n , r 0 , r = | V | ,
satisfying ∑ a q a = 1 . The moment m a defined by (21) is the expectation of the sample probability  q a up to the multinomial coefficient:
E ˜ ν α q a = n a E ˜ ν α x a = n a m a ( α ) .
Before proceeding to discuss computational issues, we present a motivating problem in Bayesian statistics. The expected sample probability (29) is a mixture of multinomial distributions of parameters x over the stationary state ν α of the adjoint Markov semigroup { T ˜ * } . In statistical terminology, the sample probability q a and the expectation (29) are called the likelihood and the marginal likelihood, respectively, and ν α is called the prior distribution for the parameter x Δ r 1 .
Suppose we are interested in selecting a graphical model consisting of four vertices from three candidate models: a star graph S 3 , a cycle graph C 4 , and a complete graph K 4 (Figure 3).
For this purpose, we employ stationary states ν α as the prior distributions. As we have seen in Example 7, the prior distribution for K 4 is the Dirichlet distribution, but closed-form expressions are not known for the prior distributions of the other graphs. Suppose we have a sample consisting of two particles. If it is e 1 + e 3 , by solving the recurrence relation (22), we obtain the expected sample probabilities under S 3 , C 4 , and K 4 as
α 2 ( 1 + 4 α ) , 1 8 , and α 2 ( 1 + 4 α ) ,
respectively. On the other hand, if the sample is e 2 + e 4 , they are
1 8 , 1 8 , and α 2 ( 1 + 4 α ) .
If α is small, e 1 + e 3 supports C 4 , while e 2 + e 4 does not support K 4 . This is reasonable because the vertex set { 1 , 3 } is an independent set of C 4 but not an independent set of S 3 and K 4 . On the other hand, the set { 2 , 4 } is not an independent set of K 4 , but it is an independent set of S 3 and C 4 . The ratio of marginal likelihoods is called the Bayes factor, which is a standard model-selection criterion in Bayesian statistics (see, e.g., Section 6.1 of [15]). If a sample is e 1 + e 3 , the Bayes factor of C 4 to S 3 or K 4 is 1 + 1 / ( 4 α ) . Therefore, C 4 is supported if α is small, while all graphs are equivalent if α is large. We do not discuss details of statistical aspects, including how to choose α , but it is worthwhile to mention that positive α improves the stability of model selection, especially for small samples. In fact, if α is small, the Bayes factor drastically changes by adding a sample unit. Suppose we have a sample e 1 + e 3 and take an additional sample unit. If it is e 2 , the expected sample probabilities for the sample e 1 + e 2 + e 3 under S 3 , C 4 , and K 4 are
3 α ( 1 + 12 α ) 32 ( 1 + 3 α ) ( 1 + 4 α ) , 3 α ( 1 + 12 α ) 32 ( 1 + 3 α ) ( 1 + 4 α ) , and 3 α 2 4 ( 1 + 2 α ) ( 1 + 4 α ) ,
respectively. The Bayes factor of C 4 to S 3 is unity, which means that the graphs C 4 and S 3 are equivalent. This conclusion is quite different from that deduced from the sample e 1 + e 3 . In the limit of a large α , by Corollary 3, the expected sample probability of any graph follows the unique limit distribution—the multinomial distribution.
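The Bayes factor computations above follow from the displayed expected sample probabilities by exact rational arithmetic. The sketch below (illustrative code; the helper name is not from the paper) reproduces the Bayes factor 1 + 1 / ( 4 α ) of C 4 against S 3 for the sample e 1 + e 3 :

```python
from fractions import Fraction

def expected_sample_probability(graph, alpha):
    """Expected sample probabilities of the sample e1 + e3, copied from the
    closed forms in the text for the three candidate graphs."""
    return {"S3": alpha / (2 * (1 + 4 * alpha)),
            "C4": Fraction(1, 8),
            "K4": alpha / (2 * (1 + 4 * alpha))}[graph]

alpha = Fraction(1, 10)
bayes_factor = (expected_sample_probability("C4", alpha)
                / expected_sample_probability("S3", alpha))
print(bayes_factor)  # 7/2, i.e., 1 + 1/(4*alpha)
```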
Now let us discuss computation of expected sample probabilities. A closed-form expression of the stationary state ν α of the adjoint Markov semigroup { T ˜ * } is not known for generic graphs; nevertheless, we can compute the expected sample probabilities of any graph by solving the system of recurrence relations (22). Solving the system becomes prohibitive as the sample size n grows, but the following algorithm, which is a byproduct of Theorem 2, provides an unbiased estimator of the expected sample probability.
By Corollary 4, the output of Algorithm 2 is an unbiased estimator of m a ( α ) , which gives the expected sample probability (29) by multiplying the multinomial coefficient.
Algorithm 2 Estimating the sample probability, or the marginal likelihood
  • Input: A sample taken from a graph G = ( V , E ) consisting of a vector a Π n , r , r = | V | and the parameter value α > 0 .
  • Output: A random variable distributed as exp { − ∫ 0 τ k ˜ ( a ˜ ( s ) ) d s } , which appeared in (20).
  • Step 1: Set k = 0 .
  • Step 2: Get a random number T following the exponential distribution of rate ∑ i = 1 r d i a i ( a i − 1 ) / 2 + α | a | / 2 .
  • Step 3: k k + k ˜ ( a ) T .
  • Step 4: Divide the interval [ 0 , 1 ] with the ratio
    a 1 ( a 1 1 ) : : a 1 ( a 1 1 ) : a 2 ( a 2 1 ) : : a 2 ( a 2 1 ) : : a r ( a r 1 ) : : a r ( a r 1 ) : α a 1 : : α a r
    such that a i ( a i 1 ) , i { 1 , , r } appears d i times.
  • Step 5: Get a random number U following the uniform distribution on [ 0 , 1 ] .
  • Step 6: If U falls in the j-th interval of a i ( a i 1 ) , let a i a i 1 , a j a j + 1 . Otherwise, if U falls in the interval of α a i , let a i a i 1 .
  • Step 7: If | a | = 0 , output exp ( − k ) . Otherwise, go to Step 2.
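A sketch of Algorithm 2 in code is as follows (illustrative; the waiting time in Step 2 is sampled with the total jump rate of the chain as the exponential rate). For the cycle graph C 4 and the sample e 1 + e 3 , the text gives the expected sample probability 1 / 8 , i.e., m a = 1 / 16 after dividing by the multinomial coefficient 2, which the Monte Carlo average approaches:

```python
import math
import random

def sample_estimator(adj, a0, alpha, rng):
    """One draw of exp{-int_0^tau k(a(s)) ds}, following Algorithm 2."""
    a = list(a0)
    r = len(adj)
    deg = [len(adj[i]) for i in range(r)]
    k = 0.0                                                    # Step 1
    while sum(a) > 0:
        # transitions: collisions a -> a - e_i + e_j, erasures a -> a - e_i
        events = []
        for i in range(r):
            for j in adj[i]:
                events.append((a[i] * (a[i] - 1) / 2.0, i, j))
            events.append((alpha * a[i] / 2.0, i, None))
        total = sum(e[0] for e in events)
        T = rng.expovariate(total)                             # Step 2
        kill = (sum(a[i] * a[j] / 2.0 for i in range(r) for j in adj[i])
                + alpha * (r - 1) * sum(a) / 2.0
                - sum(deg[i] * a[i] * (a[i] - 1) / 2.0 for i in range(r)))
        k += kill * T                                          # Step 3
        u = rng.random() * total                               # Steps 4-6
        for rate, i, j in events:
            u -= rate
            if u < 0.0:
                a[i] -= 1
                if j is not None:
                    a[j] += 1
                break
    return math.exp(-k)                                        # Step 7

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}             # cycle graph C4
rng = random.Random(7)
a0, alpha, trials = [1, 0, 1, 0], 1.0, 40000                   # sample e1 + e3
est = sum(sample_estimator(adj, a0, alpha, rng) for _ in range(trials)) / trials
print(est)  # unbiased estimate of m_a; solving (22) gives m_a = 1/16 here
```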
An attractive property of Algorithm 2 as a Markov chain Monte Carlo method is that it is a direct sampler; namely, it generates independent random variables that exactly follow the target distribution. In fact, this algorithm can be regarded as a variant of a direct sampling algorithm called coupling from the past (see, e.g., Chapter 25 of [16] for a concise summary). Regarding the sample as being generated from the infinite past with time running backward, the time when all particles are erased is the time from which the sample path can be regarded as having come from the infinite past, because the sample path does not depend on any events older than that time. We have the following estimate of the steps needed to complete the procedure.
Proposition 7.
For a sample of size n, the steps needed to complete Algorithm 2 to obtain an unbiased estimator of the expected sample probability (29) are O { | E | ( n + n ( n 1 ) / ( 2 α ) ) } for large | E | .
Proof. 
For a state a satisfying | a | = m , the probability that a particle is erased at the next step is bounded below:
α m / { ∑ i = 1 r a i 2 + ( α − 1 ) m } ≥ α / ( m + α − 1 ) .
Therefore, the waiting time to erase a particle is stochastically smaller than the waiting time of an event following the geometric distribution of mean ( m + α − 1 ) / α . The sum of the waiting times from m = 1 to m = n is O ( n + n ( n − 1 ) / ( 2 α ) ) . Steps 4 and 6 demand d 1 + ⋯ + d r + r = 2 | E | + r = O ( | E | ) steps for large | E | . Therefore, the assertion follows. □
We have focused on the diffusion with drift, but moments of the marginal distribution of the diffusion without drift at a given time, i.e., (8), can be computed by an analogue to Algorithm 2. The problem reduces to solving the system of differential equations (9). We omit the discussion, but a similar problem for a complete graph K 4 was discussed in [17].

6. Discussion

We have studied diffusions taking values in probability measures on a graph whose masses on each vertex satisfy the stochastic differential equations of the forms (1) and (16), by using their dual Markov chains on integer partitions and on finite integer lattices, respectively. Many problems remain to be solved, especially for (1). First of all, a formal proof of the existence of the semigroup { T t } associated with the generator (2) should be established, which demands pathwise uniqueness of the solution of (1). As we have emphasized in the text, some arguments, especially those after Propositions 3 and 6, are rough and restrictive; they could be improved. Stationary states of the Markov semigroup need further study. A counterpart of Theorem 1.5 of [11] or Theorem 3 on regularity of stationary states could be established by detailed analysis of the diffusion. Further properties of the diffusion, such as absorption probabilities into a stationary state and the associated waiting times, are interesting. Obtaining explicit expressions for them is challenging, but such expressions would be helpful for further understanding of the diffusion.
Two applications of the diffusions are discussed: analysis of an algorithm to find an independent set of a graph, and Bayesian graph selection based on computation of expected sample probability by using coupling from the past. Further applications and targets for modelling may exist.
For the coalescents mentioned in Remark 1, the properties of a “genealogy” of a sample, which is a sample path of the dual Markov chain, are intensively studied because a genealogy itself is used as a stochastic model of DNA sequence variation (see, e.g., [18]). Random graphs such as Figure 1 and Figure 2 are counterparts of such genealogies. Study of the properties of such random graphs would be interesting.

Funding

The author is supported in part by JSPS KAKENHI Grant 20K03742 and was supported in part by a JST PRESTO Grant, Japan.

Data Availability Statement

Not applicable.

Acknowledgments

The author thanks Yoshiaki Itoh for introducing their work [1] to him. An earlier version of this work was presented at a workshop on coalescent theory at the Centre de Recherches Mathématiques, University of Montreal, in October 2013.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Itoh, Y.; Mallows, C.; Shepp, L. Explicit sufficient invariants for an interacting particle system. J. Appl. Prob. 1998, 35, 633–641.
  2. Tainaka, K.; Itoh, Y.; Yoshimura, J.; Asami, T. A geographical model of high species diversity. Popul. Ecol. 2006, 48, 113–119.
  3. Ben-Naim, E.; Krapivsky, P.L.; Redner, S. Bifurcations and patterns in compromise processes. Physica D 2003, 183, 190–204.
  4. Ethier, S.N.; Kurtz, T.G. Markov Processes: Characterization and Convergence; Wiley: Hoboken, NJ, USA, 1986.
  5. Luby, M. A simple parallel algorithm for the maximal independent set problem. SIAM J. Comput. 1986, 15, 1036–1053.
  6. Ethier, S.N. A class of degenerate diffusion processes occurring in population genetics. Comm. Pure Appl. Math. 1976, 29, 483–493.
  7. Liggett, T.M. Interacting Particle Systems; Springer: Berlin/Heidelberg, Germany, 1985.
  8. Shiga, T. An interacting system in population genetics. J. Math. Kyoto Univ. 1980, 20, 213–242.
  9. Shiga, T. An interacting system in population genetics, II. J. Math. Kyoto Univ. 1980, 20, 723–733.
  10. Mohar, B. The Laplacian Spectrum of Graphs. In Graph Theory, Combinatorics, and Applications; Alavi, Y., Chartrand, G., Oellermann, O.R., Schwenk, A.J., Eds.; Wiley: Hoboken, NJ, USA, 1991; Volume 2, pp. 871–898.
  11. Shiga, T.; Uchiyama, K. Stationary states and their stability of the stepping stone model involving mutation and selection. Prob. Theor. Rel. Fields 1986, 73, 87–117.
  12. Mano, S. Partitions, Hypergeometric Systems, and Dirichlet Processes in Statistics; Springer: Tokyo, Japan, 2018.
  13. Karlin, S.; McGregor, J. Addendum to a paper of W. Ewens. Theor. Popul. Biol. 1972, 3, 113–116.
  14. Kimura, M. Diffusion models in population genetics. J. Appl. Prob. 1964, 1, 177–232.
  15. Bernardo, J.M.; Smith, A.F.M. Bayesian Theory; Wiley: Chichester, UK, 1994.
  16. Levin, D.A.; Peres, Y. Markov Chains and Mixing Times, 2nd ed.; American Mathematical Society: Providence, RI, USA, 2017.
  17. Mano, S. Duality between the two-locus Wright–Fisher diffusion model and the ancestral process with recombination. J. Appl. Probab. 2013, 50, 256–271.
  18. Durrett, R. Probability Models for DNA Sequence Evolution, 2nd ed.; Springer: New York, NY, USA, 2008.
Figure 1. Possible transitions of the chain ( a ( t ) , P a ) on the cycle graph C 4 starting from a = 2 e 2 + e 4 .
Figure 2. Possible sample paths of the chain ( a ˜ ( t ) , P ˜ a ) on the cycle graph C 4 starting from e 1 + e 2 .
Figure 3. Three candidate graphical models: a star graph S 3 , a cycle graph C 4 , and a complete graph K 4 .