Article

Single-Block Recursive Poisson–Dirichlet Fragmentations of Normalized Generalized Gamma Processes

by
Lancelot F. James
Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Mathematics 2022, 10(4), 561; https://doi.org/10.3390/math10040561
Submission received: 1 January 2022 / Revised: 1 February 2022 / Accepted: 9 February 2022 / Published: 11 February 2022

Abstract

Dong, Goldschmidt and Martin (2006) (DGM) showed that, for $0<\alpha<1$ and $\theta>-\alpha$, the repeated application of independent single-block fragmentation operators based on mass partitions following a two-parameter Poisson–Dirichlet distribution with parameters $(\alpha,1-\alpha)$ to a mass partition having a Poisson–Dirichlet distribution with parameters $(\alpha,\theta)$ leads to a remarkable nested family of Poisson–Dirichlet distributed mass partitions with parameters $(\alpha,\theta+r)$ for $r=0,1,2,\ldots$. Furthermore, these generate a Markovian sequence of $\alpha$-diversities following Mittag-Leffler distributions, whose ratios lead to independent Beta-distributed variables. These Markov chains are referred to as Mittag-Leffler Markov chains and arise in the broader literature involving Pólya urn and random tree/graph growth models. Here we obtain explicit descriptions of properties of these processes when conditioned on a mixed Poisson process when it equates to an integer $n$, which has interpretations in a species sampling context. This is equivalent to obtaining properties of the fragmentation operations of (DGM) when applied to mass partitions formed by the normalized jumps of a generalized gamma subordinator and its generalizations. We focus primarily on the case where $n=0,1$.

1. Introduction

Let $Z = (Z_r, r \ge 0)$ denote a Markov chain characterized by a stationary transition density $Z_r \,|\, Z_{r-1} = z$ given, for $y > z$ and $0 < \alpha < 1$, by
$$P(Z_r \in dy \,|\, Z_{r-1} = z)/dy = \frac{\alpha\,(y-z)^{\frac{1-\alpha}{\alpha}-1}\, y\, g_\alpha(y)}{\Gamma\left(\frac{1-\alpha}{\alpha}\right) g_\alpha(z)}, \tag{1}$$
where $g_\alpha(s) := f_\alpha(s^{-1/\alpha})\, s^{-1/\alpha - 1}/\alpha$ is the density of a variable $T_\alpha^{-\alpha}$ with a Mittag-Leffler distribution, $T_\alpha := T_{\alpha,0}$ is a positive stable variable with density denoted as $f_\alpha(t)$ and Laplace transform $E[e^{-\lambda T_\alpha}] = e^{-\lambda^\alpha}$. More generally, as in [1,2,3,4], for $\theta > -\alpha$, let $T_{\alpha,\theta}$ denote a variable with density $f_{\alpha,\theta}(t) = t^{-\theta} f_\alpha(t)/E[T_\alpha^{-\theta}]$; then, $T_{\alpha,\theta}^{-\alpha}$ is said to have a generalized Mittag-Leffler distribution with parameters $(\alpha,\theta)$, with distribution denoted as $\mathrm{ML}(\alpha,\theta)$. In the cases where $Z_0 = T_{\alpha,\theta}^{-\alpha} \sim \mathrm{ML}(\alpha,\theta)$, the marginal distributions of each $Z_r$ are $\mathrm{ML}(\alpha,\theta+r)$. Furthermore, there is a sequence of random variables $(B_j, j \ge 1)$ defined for each integer $j$ as $B_j = Z_{j-1}/Z_j$; hence, there is the exact point-wise relation $Z_{j-1} = Z_j \times B_j$, where, remarkably, the $B_j$ are independent $\mathrm{Beta}\left(\frac{\theta+\alpha+j-1}{\alpha}, \frac{1-\alpha}{\alpha}\right)$ variables, and $(B_1, \ldots, B_j)$ is independent of $Z_j$, for $j = 1, 2, \ldots$. Note further that by setting $Z_r = T_{\alpha,\theta+r}^{-\alpha}$, there is the point-wise equality $T_{\alpha,\theta} = T_{\alpha,\theta+r} \times \prod_{j=1}^{r} B_j^{-1/\alpha}$, where all the variables on the right-hand side are independent. In these cases, the sequence may be referred to as a Mittag-Leffler Markov chain with law denoted as $Z \sim \mathrm{MLMC}(\alpha,\theta)$, as in [5] and, subsequently, [6]. The Markov chain is described prominently in various generalities, that is, ranges of $\alpha$ and $\theta$, in [5,6,7,8,9]. See for example [5,6,10,11,12,13,14,15] for more references concerning Pólya urn and random tree/graph growth models.
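As a consistency check, the transition density (1) can be recovered from the Beta-ratio description with $r = 1$. The following is only a sketch: it assumes the standard moment formula $E[T_\alpha^{-\theta}] = \Gamma(\theta/\alpha+1)/\Gamma(\theta+1)$ for $\theta > -\alpha$, which is not stated in the text above, and the labels $f_{Z_0}$, $f_{Z_1}$, $f_{Z_0,Z_1}$ are introduced here for illustration.

```latex
% Marginal densities of Z_0 ~ ML(alpha, theta) and Z_1 ~ ML(alpha, theta + 1)
f_{Z_0}(z) = \frac{\Gamma(\theta+1)}{\Gamma(\frac{\theta}{\alpha}+1)}\, z^{\theta/\alpha} g_\alpha(z),
\qquad
f_{Z_1}(y) = \frac{\Gamma(\theta+2)}{\Gamma(\frac{\theta+1}{\alpha}+1)}\, y^{(\theta+1)/\alpha} g_\alpha(y).

% Z_0 = Z_1 B_1, with B_1 ~ Beta((theta+alpha)/alpha, (1-alpha)/alpha) independent of Z_1
f_{Z_0,Z_1}(z,y) = f_{Z_1}(y)\,\frac{1}{y}\,
\frac{\Gamma(\frac{\theta+1}{\alpha})}{\Gamma(\frac{\theta}{\alpha}+1)\,\Gamma(\frac{1-\alpha}{\alpha})}
\Big(\frac{z}{y}\Big)^{\theta/\alpha}\Big(1-\frac{z}{y}\Big)^{\frac{1-\alpha}{\alpha}-1},
\qquad 0 < z < y.

% Powers of y:  (theta+1)/alpha - 1 - theta/alpha - (1-alpha)/alpha + 1 = 1
% Constants:    (theta+1) * alpha/(theta+1) / Gamma((1-alpha)/alpha) = alpha / Gamma((1-alpha)/alpha)
\frac{f_{Z_0,Z_1}(z,y)}{f_{Z_0}(z)}
= \frac{\alpha\,(y-z)^{\frac{1-\alpha}{\alpha}-1}\, y\, g_\alpha(y)}{\Gamma(\frac{1-\alpha}{\alpha})\, g_\alpha(z)} .
```

The right-hand side does not depend on $\theta$, which is consistent with the transition density being stationary.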
Now, let $\mathrm{PD}(\alpha,\theta)$ denote a two-parameter Poisson–Dirichlet distribution over the space of mass partitions summing to 1, say $\mathcal{P}_\infty := \{\mathbf{s} = (s_1, s_2, \ldots) : s_1 \ge s_2 \ge \cdots \ge 0 \text{ and } \sum_{i=1}^{\infty} s_i = 1\}$, as described in [3,4,16]. Let $(P_\ell) := (P_\ell, \ell \ge 1) \sim \mathrm{PD}(\alpha,\theta)$ correspond in distribution to the ranked lengths of excursion of a generalized Bessel bridge on $[0,1]$, as described and defined in [1,4]. In particular, $\mathrm{PD}(1/2,0)$ and $\mathrm{PD}(1/2,1/2)$ correspond to excursion lengths of standard Brownian motion and Brownian bridge on $[0,1]$, respectively. As noted in [6], the single-block $\mathrm{PD}(\alpha,1-\alpha)$ fragmentation results for $\mathrm{PD}(\alpha,\theta)$ mass partitions by [17], which we shall describe in more detail in Section 1.2, allow one to couple a version of $Z \sim \mathrm{MLMC}(\alpha,\theta)$ with a nested family of mass partitions $((P_{\ell,r}), r \ge 0)$, where each $(P_{\ell,r}) := (P_{\ell,r}, \ell \ge 1)$ takes its values in $\mathcal{P}_\infty$, the initial $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,\theta)$ has $\alpha$-diversity $Z_0 = T_{\alpha,\theta}^{-\alpha}$, and each successive $(P_{\ell,r}) \sim \mathrm{PD}(\alpha,\theta+r)$ has $\alpha$-diversity $Z_r = T_{\alpha,\theta+r}^{-\alpha}$. The distribution of this family is denoted as $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$.
Recall from [2] that for $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,0)$, $(P_{\ell,0}) \,|\, T_\alpha = t$ has distribution $\mathrm{PD}(\alpha|t)$, and for a probability measure $\nu$ on $(0,\infty)$, one may generate the general class of Poisson–Kingman distributions generated by an $\alpha$-stable subordinator with mixing $\nu$, by forming $\mathrm{PK}_\alpha(\nu) = \int_0^\infty \mathrm{PD}(\alpha|t)\,\nu(dt)$. Some prominent examples of interest in this work are $\mathrm{PD}(\alpha,\theta) = \int_0^\infty \mathrm{PD}(\alpha|t)\, f_{\alpha,\theta}(t)\,dt$ and $\mathbb{P}_\alpha^{[n]}(\lambda) = \int_0^\infty \mathrm{PD}(\alpha|t)\, f_\alpha^{[n]}(t|\lambda)\,dt$, where $f_\alpha^{[n]}(t|\lambda) \propto t^n e^{-\lambda t} f_\alpha(t)$. Hence, $\mathbb{P}_\alpha^{[0]}(\lambda)$ corresponds to the law of the ranked normalized jumps of a generalized gamma subordinator, say $(\tau_\alpha(y); y \ge 0)$, where $\tau_\alpha(\lambda^\alpha)/\lambda$ has density $f_\alpha^{[0]}(t|\lambda) = e^{-\lambda t} e^{\lambda^\alpha} f_\alpha(t)$. In [6], we obtained some general distributional properties of $((P_{\ell,r}), Z_r; r \ge 0)$ formed by repeated application of the fragmentation operations in [17] to the case where $(P_{\ell,0}) \sim \mathrm{PK}_\alpha(\nu)$. Furthermore, letting $(e_\ell)$ denote a sequence of iid $\mathrm{Exp}(1)$ variables forming the arrival times, say $(\Gamma_\ell = \sum_{j=1}^{\ell} e_j; \ell \ge 1)$, of a standard Poisson process, we ([6], Section 4.3) focused in more detail on the special case of $((P_{\ell,r}), Z_r; r \ge 0) \,|\, N_{T_{\alpha,\theta}^{-\alpha}}(\lambda) = j$ for $j = 0, 1, 2, \ldots$, when $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$ and $(N_{T_{\alpha,\theta}^{-\alpha}}(t) = \sum_{\ell=1}^{\infty} I\{\Gamma_\ell / T_{\alpha,\theta}^{-\alpha} \le t\}, t \ge 0)$ is a mixed Poisson process with random intensity depending on $T_{\alpha,\theta}^{-\alpha}$. That is to say, $(P_{\ell,0}) \,|\, N_{T_{\alpha,\theta}^{-\alpha}}(\lambda) = j$ corresponds in distribution to $(P_{\ell,0}(\lambda))$ following a $\mathrm{PK}_\alpha(\nu)$ distribution, where $\nu$ corresponds to the distribution of $T_{\alpha,\theta}^{-\alpha} \,|\, N_{T_{\alpha,\theta}^{-\alpha}}(\lambda) = j$.
In this work, we obtain results for the case where $((P_{\ell,r}), Z_r; r \ge 0)$ is such that $(P_{\ell,0}) \sim \mathbb{P}_\alpha^{[n]}(\lambda)$, which is when $(P_{\ell,0})$ corresponds to the ranked normalized jumps of a generalized gamma process, $(\tau_\alpha(y); y \ge 0)$, and its size-biased generalizations. Interestingly, our results equate in distribution to the following setup involving $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$. Let $N_{T_\alpha}$ be a mixed Poisson process defined by replacing $T_{\alpha,\theta}^{-\alpha}$ in $N_{T_{\alpha,\theta}^{-\alpha}}$ with $T_\alpha$. Using the mixed Poisson framework in the manuscript of Pitman [18] (see also [6,19] for more details), we obtain some explicit distributional properties of $((P_{\ell,r}), Z_r; r \ge 0) \,|\, N_{T_\alpha}(\lambda) = n$ and the corresponding variables $(B_1, \ldots, B_r, T_{\alpha,r}) \,|\, N_{T_\alpha}(\lambda) = n$ for $n = 0, 1, 2, \ldots$, when $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$, that is, when $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,0)$. The equivalence in distribution to the fragmentation operations of [17] applied in the generalized gamma cases may be deduced from [18], which shows that when $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,0)$, $(P_{\ell,0}) \,|\, N_{T_\alpha}(\lambda) = n$ corresponds to the distribution of $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[n]}(\lambda)$. We shall primarily focus on the case of $n = 0, 1$, corresponding to the generalized gamma density and its size-biased distribution, which yields the most explicit results. The fragmentation operations (6) applied to $(P_{\ell,0}) \sim \mathbb{P}_\alpha^{[1]}(\lambda)$ allow one to recover the entire range of $\mathrm{PD}(\alpha,\theta)$ distributions for $\theta > -\alpha$ by gamma randomization, whereas the case for $(P_{\ell,0}) \sim \mathbb{P}_\alpha^{[0]}(\lambda)$ only applies to $\theta \ge 0$. We note that descriptions of our results for $n = 0, 1$, albeit less refined ones, appear in the unpublished manuscript ([9], Section 6). See also [20] for an application of $\mathbb{P}_\alpha^{[0]}(\lambda)$ for randomized $\lambda$.
We close this section by recalling the definition of the first size-biased pick from a random mass partition $(P_\ell) \in \mathcal{P}_\infty$ (see [2,3,16]). Specifically, $\tilde{P}_1$ is referred to as the first size-biased pick from $(P_\ell)$ if it satisfies, for $k = 1, 2, \ldots$,
$$P(\tilde{P}_1 = P_k \,|\, (P_\ell)) = P_k. \tag{2}$$
Hereafter, let $(P_\ell)_1 := (P_\ell) \setminus \tilde{P}_1$ denote the remainder, such that $(P_\ell) = \mathrm{Rank}(((P_\ell)_1, \tilde{P}_1))$, where $\mathrm{Rank}(\cdot)$ denotes the operation corresponding to ranked re-arrangement. From [1], $\tilde{P}_1$ may be interpreted as the length of excursion (i.e., one of the $(P_\ell)$) first discovered by dropping a uniformly distributed random variable onto the interval $[0,1]$. The fragmentation operation of [17] may be interpreted as shattering/fragmenting that interval by the excursion lengths of a process on $[0,1]$ with distribution $\mathrm{PD}(\alpha,1-\alpha)$, and then re-ranking. For clarity and comparison, in the next section we first recall some details of the more well-known Markovian size-biased deletion operation leading to stick-breaking representations, as described in [1,2,3], and related notions arising in a Bayesian nonparametric context in the $\mathrm{PD}(\alpha,\theta)$ setting.
Remark 1.
Although we acknowledge the influence and contributions of the manuscript [18], the pertinent distributional results we use from that work are re-derived at the beginning of Section 2. Otherwise, the interpretation of $N_{T_\alpha}$ from that work is briefly mentioned in Section 1.3.

1.1. $\mathrm{PD}(\alpha,\theta)$ Markovian Sequences Obtained from Successive Size-Biased Deletion

Following [1], we may define $\mathrm{SBD}(\cdot)$ to be a size-biased deletion operator on $\mathcal{P}_\infty$, as $\mathrm{SBD}((P_\ell)) := \mathrm{Rank}((P_\ell)_1/(1-\tilde{P}_1))$, where it can be recalled from (2) that $(P_\ell) = \mathrm{Rank}(((P_\ell)_1, \tilde{P}_1))$. Now, let $(\mathrm{SBD}^{(j)}(\cdot), j \ge 1)$ be a collection of such operators. From [1], as per the description in ([4], Proposition 34, p. 881), it follows that for $(P_{\ell,0}) := (\hat{P}_{\ell,0}) \sim \mathrm{PD}(\alpha,\theta)$, $\mathrm{SBD}^{(1)}((\hat{P}_{\ell,0})) := (\hat{P}_{\ell,1}) \sim \mathrm{PD}(\alpha,\theta+\alpha)$ and is independent of the first size-biased pick $\tilde{P}_1 := V_1 \sim \mathrm{Beta}(1-\alpha,\theta+\alpha)$, and hence, for $r = 2, 3, \ldots$,
$$(\hat{P}_{\ell,r}) := \mathrm{SBD}^{(r)}((\hat{P}_{\ell,r-1})) = \mathrm{SBD}^{(r)} \circ \cdots \circ \mathrm{SBD}^{(1)}((\hat{P}_{\ell,0})) \sim \mathrm{PD}(\alpha,\theta+r\alpha). \tag{3}$$
This leads to a nested Markovian family of mass partitions $((\hat{P}_{\ell,r}), r \ge 0)$, where $(P_{\ell,0}) := (\hat{P}_{\ell,0}) \sim \mathrm{PD}(\alpha,\theta)$ with inverse local time at time 1, $T_{\alpha,\theta}$ (see ([3], Equation (4.20), p. 83)), and for each $r$, $(\hat{P}_{\ell,r}) \sim \mathrm{PD}(\alpha,\theta+r\alpha)$ with inverse local time at time 1, $T_{\alpha,\theta+r\alpha}$. Furthermore, $(T_{\alpha,\theta+r\alpha}, r \ge 0)$ form a Markov chain with pointwise equality $T_{\alpha,\theta+(j-1)\alpha} = T_{\alpha,\theta+j\alpha}/(1-V_j)$, where the $V_j$ are independent $\mathrm{Beta}(1-\alpha,\theta+j\alpha)$ variables and are the respective first size-biased picks from $(\hat{P}_{\ell,j-1})$ for $j \ge 1$. Furthermore, $(V_1, \ldots, V_r)$ is independent of $T_{\alpha,\theta+r\alpha}$ and, more generally, of $(\hat{P}_{\ell,r})$ for $r = 1, 2, \ldots$.
From this, one obtains the size-biased re-arrangement of a $\mathrm{PD}(\alpha,\theta)$ mass partition, say $(\tilde{P}_\ell) \sim \mathrm{GEM}(\alpha,\theta)$, satisfying $\tilde{P}_1 = V_1 \sim \mathrm{Beta}(1-\alpha,\theta+\alpha)$, and for $\ell \ge 2$, $\tilde{P}_\ell = V_\ell \prod_{j=1}^{\ell-1}(1-V_j)$. Refs. [3,21] discuss the $\mathrm{GEM}(\alpha,\theta)$ distribution and these other concepts in a species sampling and Bayesian context. We mention the roles of corresponding random distribution functions as priors in a Bayesian non-parametric context. Let $(U_\ell)$ denote a sequence of iid $\mathrm{Uniform}[0,1]$ variables independent of $(P_\ell) \sim \mathrm{PD}(\alpha,\theta)$; then, the random distribution $F_{\alpha,\theta}(y) = \sum_{\ell=1}^{\infty} P_\ell I\{U_\ell \le y\}$ is said to follow a Pitman–Yor distribution with parameters $(\alpha,\theta)$ (see [21,22]). $F_{\alpha,\theta}$ is a two-parameter extension of the Dirichlet process [23] (which corresponds to $F_{0,\theta}$) and has been applied extensively as a more flexible prior in a Bayesian context, but it also arises in a variety of areas involving combinatorial stochastic processes [3,21]. An attractive feature of $F_{\alpha,\theta}$ is that it may be represented as $F_{\alpha,\theta}(y) = \sum_{\ell=1}^{\infty} \tilde{P}_\ell I\{\tilde{U}_\ell \le y\}$, where $(\tilde{U}_\ell)$ are the iid $\mathrm{Uniform}[0,1]$ concomitants of the $(\tilde{P}_\ell)$, as exploited in [22] (see also [21]). This constitutes the stick-breaking representation of $F_{\alpha,\theta}$. Furthermore, we can describe $\tilde{P}_1$ as follows: let $X_1 \,|\, F_{\alpha,\theta}$ have distribution $F_{\alpha,\theta}$, and denote the first value drawn from $F_{\alpha,\theta}$; then, $\tilde{P}_1$ is the mass in $(P_\ell)$ corresponding to that atom of $F_{\alpha,\theta}$. The size-biased deletion operation described above, as in (3), leads to the following decomposition of $F_{\alpha,\theta}$:
$$F_{\alpha,\theta}(y) = (1-\tilde{P}_1)\, F_{\alpha,\theta+\alpha}(y) + \tilde{P}_1 I\{\tilde{U}_1 \le y\}, \tag{4}$$
where $(\tilde{P}_1, \tilde{U}_1)$ are independent of $F_{\alpha,\theta+\alpha}(y) \overset{d}{=} \sum_{k=1}^{\infty} \hat{P}_{k,1} I\{U_{k,1} \le y\}$, with $(\hat{P}_{\ell,1}) \sim \mathrm{PD}(\alpha,\theta+\alpha)$ and, independent of this, $(U_{\ell,1}) \overset{iid}{\sim} \mathrm{Uniform}[0,1]$. See [1,4,24] and references therein for various interpretations of (4).
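The stick-breaking description above can be illustrated with a short simulation. This is a minimal sketch, not the author's code: the function name `gem_weights` and the truncation level `n_sticks` are hypothetical choices made here, and only the first `n_sticks` size-biased weights $\tilde{P}_\ell = V_\ell \prod_{j<\ell}(1-V_j)$ with $V_\ell \sim \mathrm{Beta}(1-\alpha,\theta+\ell\alpha)$ are generated.

```python
import random


def gem_weights(alpha, theta, n_sticks, rng):
    """First n_sticks weights of a GEM(alpha, theta) sequence, plus the unassigned mass."""
    if not (0.0 <= alpha < 1.0 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    weights = []
    remaining = 1.0  # prod_{j < l} (1 - V_j): mass not yet assigned to a block
    for l in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - alpha, theta + l * alpha)  # V_l
        weights.append(v * remaining)                        # P~_l = V_l * prod_{j<l}(1 - V_j)
        remaining *= 1.0 - v
    return weights, remaining


rng = random.Random(0)
alpha, theta = 0.5, 1.0
w, leftover = gem_weights(alpha, theta, 200, rng)
```

Ranking `w` in decreasing order gives a truncated approximation to a $\mathrm{PD}(\alpha,\theta)$ mass partition; the value `leftover` is the mass lost to truncation.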

1.2. DGM Fragmentation

The single-block $\mathrm{PD}(\alpha,1-\alpha)$ fragmentation operator of [17] is defined over the space $\mathcal{P}_\infty$. However, for further clarity we start with an explanation at the level of random distribution functions involving the representation in (4). Suppose that $G_{\alpha,1-\alpha}(y) := \sum_{k=1}^{\infty} Q_k I\{U_{k,1} \le y\}$, with $(Q_\ell) \sim \mathrm{PD}(\alpha,1-\alpha)$ and, independent of this, $(U_{\ell,1}) \overset{iid}{\sim} \mathrm{Uniform}[0,1]$; hence, $G_{\alpha,1-\alpha} \overset{d}{=} F_{\alpha,1-\alpha}$. Suppose that $G_{\alpha,1-\alpha}$ is chosen independent of $F_{\alpha,\theta}$ in (4); then, it follows from [17] that
$$F_{\alpha,\theta+1}(y) \overset{d}{=} (1-\tilde{P}_1)\, F_{\alpha,\theta+\alpha}(y) + \tilde{P}_1\, G_{\alpha,1-\alpha}(y),$$
and it is evident that the mass partition $(Q_\ell)$ shatters/fragments $\tilde{P}_1$ into a countably infinite number of pieces $(\tilde{P}_1 (Q_\ell)) := (\tilde{P}_1 Q_\ell, \ell \ge 1) = (\tilde{P}_1 Q_1, \tilde{P}_1 Q_2, \ldots)$. It follows that, in this case, $\mathrm{Rank}((P_\ell)_1, \tilde{P}_1 (Q_\ell)) \sim \mathrm{PD}(\alpha,\theta+1)$, which is the featured case of the $\mathrm{PD}(\alpha,1-\alpha)$ fragmentation described in [17]. Hence, for general $(P_\ell) = \mathrm{Rank}(((P_\ell)_1, \tilde{P}_1)) \in \mathcal{P}_\infty$, a $\mathrm{PD}(\alpha,1-\alpha)$ fragmentation of $(P_\ell)$ is defined as
$$\widehat{\mathrm{Frag}}_{\alpha,1-\alpha}((P_\ell)) := \mathrm{Rank}((P_\ell)_1, \tilde{P}_1 (Q_\ell)) \in \mathcal{P}_\infty,$$
where, independent of $(P_\ell)$, $(Q_\ell) \sim \mathrm{PD}(\alpha,1-\alpha)$. Let $((Q_\ell^{(j)}); j \ge 1)$ denote an independent collection of $\mathrm{PD}(\alpha,1-\alpha)$ mass partitions defining a sequence of independent fragmentation operators $(\widehat{\mathrm{Frag}}^{(j)}_{\alpha,1-\alpha}(\cdot); j \ge 1)$. It follows from [17] that a version of the family $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$ may be constructed by the recursive fragmentation, for $r = 1, 2, \ldots$:
$$(P_{\ell,r}) = \widehat{\mathrm{Frag}}^{(r)}_{\alpha,1-\alpha}((P_{\ell,r-1})).$$
In particular, $(P_{\ell,r}) \sim \mathrm{PD}(\alpha,\theta+r)$ when $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,\theta)$.
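The recursive fragmentation can be mimicked numerically on truncated partitions. The sketch below is only an approximation under stated assumptions: the helper names `ranked_pd` and `fragment_once` are hypothetical, both the partition being fragmented and the shattering partition $(Q_\ell)$ are truncated stick-breaking draws renormalized to sum to 1, and the size-biased pick is made among the finitely many retained blocks.

```python
import random


def ranked_pd(alpha, theta, n_sticks, rng):
    """Truncated, renormalized, ranked stick-breaking approximation to PD(alpha, theta)."""
    weights, remaining = [], 1.0
    for l in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - alpha, theta + l * alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    total = sum(weights)
    return sorted((w / total for w in weights), reverse=True)


def fragment_once(partition, alpha, n_sticks, rng):
    """One single-block PD(alpha, 1 - alpha) fragmentation of a ranked partition."""
    # Size-biased pick: block k is selected with probability partition[k].
    k = rng.choices(range(len(partition)), weights=partition, k=1)[0]
    picked = partition[k]
    rest = partition[:k] + partition[k + 1:]
    # Independent shattering partition (Q_l), approximately PD(alpha, 1 - alpha).
    q = ranked_pd(alpha, 1.0 - alpha, n_sticks, rng)
    # Replace the picked block by its fragments and re-rank.
    return sorted(rest + [picked * qi for qi in q], reverse=True)


rng = random.Random(42)
alpha, theta = 0.5, 0.5
p0 = ranked_pd(alpha, theta, 300, rng)    # approximately PD(alpha, theta)
p1 = fragment_once(p0, alpha, 300, rng)   # approximately PD(alpha, theta + 1)
p2 = fragment_once(p1, alpha, 300, rng)   # approximately PD(alpha, theta + 2)
```

Each step preserves total mass and replaces exactly one block by `n_sticks` smaller ones, which mirrors the single-block nature of the operator; the distributional statements are only approximate because of the truncation.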

1.3. Remarks

We close this section with remarks related to some relevant work of Eugenio Regazzini and his students, arising in a Bayesian context. From [18], in regard to a species sampling context using $F_{\alpha,\theta}$ (see [21]), $N_{T_{\alpha,\theta}}(\lambda)$ is interpreted as the number of animals trapped and tagged up until time $\lambda$, and hence, $\Gamma_j/T_{\alpha,\theta}$ is interpreted as the time when the $j$-th animal is trapped, for $j = 1, 2, \ldots$. Ref. [18] indicates that this gives further interpretation to such types of quantities arising in [25,26]. Using a Chinese restaurant process metaphor, the animals may be replaced by customers arriving sequentially to a restaurant. More generically, $N_{T_{\alpha,\theta}}(\lambda)$ is the number of exchangeable samples drawn from $F_{\alpha,\theta}$ up until time $\lambda$. Furthermore, $F_{\alpha,n}(y) \,|\, N_{T_{\alpha,n}}(\lambda) = n$ for each $n = 0, 1, 2, \ldots$ is equivalent in distribution to $F_\alpha(y|\lambda) \overset{d}{=} \tau_\alpha(\lambda^\alpha y)/\tau_\alpha(\lambda^\alpha)$, which is now referred to in the Bayesian literature as a normalized generalized gamma process. While, according to [2], $F_\alpha(y|\lambda)$ appears in a relevant species sampling context in the 1965 thesis of McCloskey [27], and certainly elsewhere, the paper by Regazzini, Lijoi, and Prünster [28] and subsequent works by Regazzini's students (see [29]) helped to popularize the usage of $F_\alpha(y|\lambda)$ in the modern literature on Bayesian non-parametrics. Our work presents a view of $F_\alpha(y|\lambda)$ subjected to the fragmentation operations in [17]. Although we do not consider specific Bayesian statistical applications in this work, we note that other types of fragmentation/coagulation of $\mathrm{PD}(\alpha,\theta)$ models have been applied, for instance, in [30]. We anticipate the same will be true of the operations considered here.

2. Results

Hereafter, we shall focus on the case of $\mathrm{PD}(\alpha,0)$, as we will recover the general $(\alpha,\theta)$ cases by applying gamma randomization as in ([4], Proposition 21) for $\theta \ge 0$ or ([19], Corollary 2.1) for $\theta > -\alpha$, and other results. See also ([6], Section 2.2.1). We first re-derive some relevant properties related to $N_{T_\alpha}$ that are easily verified by first conditioning on $T_\alpha$ and otherwise can be found in [18]. First, for fixed $\lambda$, and for $j = 0, 1, \ldots$,
$$P(N_{T_\alpha}(\lambda) = j, T_\alpha \in ds) = \frac{\lambda^j}{j!}\, s^j e^{-\lambda s} f_\alpha(s)\, ds, \tag{7}$$
and for $j = 1, 2, \ldots$,
$$P\left(\frac{\Gamma_j}{T_\alpha} \in d\lambda, T_\alpha \in ds\right)\Big/ d\lambda = \frac{\lambda^{j-1}}{(j-1)!}\, s^j e^{-\lambda s} f_\alpha(s)\, ds. \tag{8}$$
Note these simple results hold for any variable $T$ with density $f_T$ in place of $T_\alpha$ and $f_\alpha$. It follows from (7) and (8) that $T_\alpha \,|\, N_{T_\alpha}(\lambda) = 0$ has the generalized gamma density $f_\alpha^{[0]}(t|\lambda) = e^{-\lambda t} e^{\lambda^\alpha} f_\alpha(t)$. Furthermore, for $j = 1, 2, \ldots$, $T_\alpha \,|\, N_{T_\alpha}(\lambda) = j$ has the same distribution as $T_\alpha \,|\, \Gamma_j/T_\alpha = \lambda$, with density $f_\alpha^{[j]}(t|\lambda)$. Since it is assumed that $(\Gamma_\ell; \ell \ge 1)$ is independent of $(P_\ell)$, it follows that for $(P_\ell) \sim \mathrm{PD}(\alpha,0)$, the conditional distribution of $(P_\ell) \,|\, T_\alpha = t, N_{T_\alpha}(\lambda) = n$ is $\mathrm{PD}(\alpha|t)$, and hence, $(P_\ell) \,|\, N_{T_\alpha}(\lambda) = n$ has distribution $\mathbb{P}_\alpha^{[n]}(\lambda)$ for $n = 0, 1, \ldots$, as mentioned previously.
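Integrating (7) over $s$ gives $P(N_{T_\alpha}(\lambda) = 0) = e^{-\lambda^\alpha}$ and $P(N_{T_\alpha}(\lambda) = 1) = \alpha\lambda^\alpha e^{-\lambda^\alpha}$, the two probabilities used later in Section 3.3. The following Monte Carlo sketch checks them for $\alpha = 1/2$ only. It relies on the classical representation $T_{1/2} = 1/(4G)$ with $G \sim \mathrm{Gamma}(1/2, 1)$, which is an assumption brought in here and not part of the text; the function names are hypothetical.

```python
import math
import random


def inverse_stable_half(rng):
    """Draw 1/T, where T is positive stable with E[exp(-s T)] = exp(-sqrt(s)).

    Uses T = 1 / (4 G) with G ~ Gamma(1/2, 1), so 1/T = 4 G.
    """
    return 4.0 * rng.gammavariate(0.5, 1.0)


def count_capped(lam, inv_t, rng, cap=2):
    """Return min(N_T(lam), cap), where N_T(lam) counts arrivals Gamma_j / T <= lam."""
    arrival, count = 0.0, 0
    while count < cap:
        arrival += rng.expovariate(1.0)   # Gamma_j = e_1 + ... + e_j
        if arrival * inv_t > lam:         # Gamma_j / T > lam: no further points before lam
            break
        count += 1
    return count


def estimate_probs(lam, n_draws, seed=0):
    """Monte Carlo frequencies of {N = 0}, {N = 1}, {N >= 2}."""
    rng = random.Random(seed)
    tally = [0, 0, 0]
    for _ in range(n_draws):
        tally[count_capped(lam, inverse_stable_half(rng), rng)] += 1
    return [c / n_draws for c in tally]


lam, alpha = 1.0, 0.5
p0_hat, p1_hat, _ = estimate_probs(lam, 100_000)
p0_exact = math.exp(-lam ** alpha)                          # P(N = 0)
p1_exact = alpha * lam ** alpha * math.exp(-lam ** alpha)   # P(N = 1)
print(p0_hat, p0_exact, p1_hat, p1_exact)
```

Capping the count at 2 avoids simulating very long arrival sequences when a heavy-tailed draw of $T$ is large; only the events $\{N=0\}$ and $\{N=1\}$ are needed for the comparison.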
Remark 2.
For the next results, which are extensions to $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$ conditioned on $N_{T_\alpha}(\lambda) = n$, we note, as in [19], that the densities $f_\alpha^{[n]}(t|\lambda)$ are well-defined for any real number $\varrho$ in place of $[n]$, with density $f_\alpha^{[\varrho]}(t|\lambda)$, provided that $\lambda > 0$, and for $\lambda = 0$ only in the case where $\varrho = -\theta < \alpha$, which corresponds to $f_{\alpha,\theta}(t)$. Ref. ([19], Corollary 2.1) shows that distributions for $\varrho$ can be expressed as randomized (over $\lambda$) distributions for any $n > \varrho$.
For clarity, with respect to $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$, the variables $B_j = Z_{j-1}/Z_j$ are independent $\mathrm{Beta}\left(\frac{\alpha+j-1}{\alpha}, \frac{1-\alpha}{\alpha}\right)$ variables for $j = 1, 2, \ldots$, and $(B_1, \ldots, B_r)$ is independent of $Z_r = T_{\alpha,r}^{-\alpha}$ and $(P_{\ell,r})$ for each $r = 1, 2, \ldots$.
Proposition 1.
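Consider $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$, formed by the fragmentation operations in (6), when $(P_{\ell,0}) \sim \mathrm{PD}(\alpha,0)$. Denote the conditional distribution of $((P_{\ell,r}), Z_r; r \ge 0) \,|\, N_{T_\alpha}(\lambda) = n$ as $\mathrm{MLMC}^{[n]}_{\mathrm{frag}}(\alpha|\lambda)$ and its corresponding component values as $((P_{\ell,r}(\lambda)), Z_r(\lambda); r \ge 0)$. Then, the distribution has the following properties, where $\mathbf{b}_r$ denotes a value of the product $\prod_{i=1}^{r} B_i$.
(i) $(P_{\ell,0}) \,|\, N_{T_\alpha}(\lambda) = n$ is equivalent in distribution to $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[n]}(\lambda) = \int_0^\infty \mathrm{PD}(\alpha|t)\, f_\alpha^{[n]}(t|\lambda)\, dt$.
(ii) $(P_{\ell,r}) \,|\, N_{T_\alpha}(\lambda) = n, \prod_{i=1}^{r} B_i = \mathbf{b}_r$ has distribution $\mathbb{P}_\alpha^{[n-r]}(\lambda \mathbf{b}_r^{-1/\alpha})$, for $r = 1, 2, \ldots$.
(iii) $(P_{\ell,r}) \,|\, N_{T_\alpha}(\lambda) = n, \prod_{i=1}^{r} B_i = \mathbf{b}_r$ has the same distribution as $(P_{\ell,r}) \,|\, N_{T_{\alpha,r}}(\lambda \mathbf{b}_r^{-1/\alpha}) = n$.
Proof. 
Statement (i) has already been established. For (ii) and equivalently (iii), we use $T_\alpha = T_{\alpha,r} \times \prod_{i=1}^{r} B_i^{-1/\alpha}$ to obtain $N_{T_\alpha}(\lambda) = N_{T_{\alpha,r}}(\lambda \prod_{i=1}^{r} B_i^{-1/\alpha})$. Use (7) and (8) with $T_{\alpha,r}$, with density $f_{\alpha,r}(t)$, in place of $T_\alpha$, to conclude that $T_{\alpha,r} \,|\, N_{T_{\alpha,r}}(\lambda \mathbf{b}_r^{-1/\alpha}) = n, \prod_{i=1}^{r} B_i = \mathbf{b}_r$ has density $f_\alpha^{[n-r]}(t|\lambda \mathbf{b}_r^{-1/\alpha})$. Then, apply the fact that $(P_{\ell,r}) \,|\, T_{\alpha,r} = t, N_{T_\alpha}(\lambda) = n, \prod_{i=1}^{r} B_i = \mathbf{b}_r$ is $\mathrm{PD}(\alpha|t)$ for $(P_{\ell,r}) \sim \mathrm{PD}(\alpha,r)$. □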
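The conditioning step in the proof can be written out explicitly. This is a sketch that uses only (7) with $T_{\alpha,r}$ in place of $T_\alpha$; the shorthand $\lambda'$ for the rescaled time, with $\mathbf{b}_r$ the conditioned value of $\prod_{i=1}^{r} B_i$, is introduced here for brevity.

```latex
% (7) applied to T_{alpha,r}, whose density is f_{alpha,r}(t) \propto t^{-r} f_alpha(t)
P\big(N_{T_{\alpha,r}}(\lambda') = n,\; T_{\alpha,r} \in dt\big)
  = \frac{(\lambda' t)^{n}}{n!}\, e^{-\lambda' t}\, f_{\alpha,r}(t)\, dt
  \;\propto\; t^{\,n-r}\, e^{-\lambda' t}\, f_\alpha(t)\, dt ,
\qquad \lambda' = \lambda\, \mathbf{b}_r^{-1/\alpha} .

% Normalizing in t identifies the conditional law
T_{\alpha,r} \,\big|\, N_{T_{\alpha,r}}(\lambda') = n \;\sim\; f_\alpha^{[n-r]}(t \,|\, \lambda') .
```

Because $(B_1, \ldots, B_r)$ is independent of $T_{\alpha,r}$, conditioning on the product of the $B_i$ only enters through the rescaled time $\lambda'$.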

3. Results for $n = 0, 1$

We will now focus on results for $(B_1, \ldots, B_r, T_{\alpha,r})$, given $N_{T_\alpha}(\lambda) = n$, in the cases where $n = 0, 1$ and $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$. This is equivalent to providing more explicit distributional results than Proposition 1 for the generalized gamma and its size-biased case, where $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[n]}(\lambda)$, for $n = 0, 1$, subjected to the fragmentation operations in (6). We first highlight a class of random variables that will play an important role in our descriptions.
Throughout, we define $\gamma_\theta \sim \mathrm{Gamma}(\theta, 1)$ for $\theta \ge 0$, with $\gamma_0 := 0$. Let $(e^{(\ell)})$ and $(\gamma^{(\ell)}_{\frac{1-\alpha}{\alpha}})$ denote, respectively, iid collections of $\mathrm{exponential}(1)$ and $\mathrm{Gamma}(\frac{1-\alpha}{\alpha}, 1)$ random variables that are mutually independent. Use this to form iid sums $\gamma^{(k)}_{\frac{1}{\alpha}} := e^{(k)} + \gamma^{(k)}_{\frac{1-\alpha}{\alpha}} \sim \mathrm{Gamma}(\frac{1}{\alpha}, 1)$, and construct increasing sums $\Gamma_{\alpha,k} := \sum_{j=1}^{k} \gamma^{(j)}_{\frac{1}{\alpha}} \sim \mathrm{Gamma}(\frac{k}{\alpha}, 1)$ for $k = 1, 2, \ldots$.
Lemma 1.
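For $k = 1, 2, \ldots$, set $Y_k(\lambda) = (\Gamma_{\alpha,k-1} + \lambda^\alpha)/(\Gamma_{\alpha,k} + \lambda^\alpha)$, with $\Gamma_{\alpha,0} = 0$, and hence $Y_1(\lambda) = \lambda^\alpha/(\Gamma_{\alpha,1} + \lambda^\alpha)$. Then, for any $r = 1, 2, \ldots$ and $\lambda > 0$, the joint density of $(Y_1(\lambda), \ldots, Y_r(\lambda))$ can be expressed as
$$\vartheta^{[0]}_{\alpha,r}(y_1, \ldots, y_r \,|\, \lambda) = \frac{\lambda^r}{[\Gamma(\frac{1}{\alpha})]^r}\, e^{-\lambda^\alpha/(\prod_{j=1}^{r} y_j)}\, e^{\lambda^\alpha} \prod_{l=1}^{r} y_l^{-\frac{(r-l+1)}{\alpha}-1} (1-y_l)^{\frac{1}{\alpha}-1}.$$
Furthermore, $\lambda^\alpha/\prod_{j=1}^{r} Y_j(\lambda) = \Gamma_{\alpha,r} + \lambda^\alpha$.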
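The ratios in Lemma 1 are straightforward to simulate, and the closing identity is a telescoping product that can be confirmed numerically. The sketch below assumes nothing beyond the definitions above; the function name `lemma1_ratios` and the parameter values are illustrative choices.

```python
import math
import random


def lemma1_ratios(alpha, lam, r, rng):
    """Return (Y_1(lam), ..., Y_r(lam)) and the final partial sum Gamma_{alpha, r}."""
    shift = lam ** alpha
    partial = 0.0                                   # Gamma_{alpha, 0} = 0
    ys = []
    for _ in range(r):
        nxt = partial + rng.gammavariate(1.0 / alpha, 1.0)   # add a Gamma(1/alpha, 1) increment
        ys.append((partial + shift) / (nxt + shift))         # Y_k = (G_{k-1} + lam^a) / (G_k + lam^a)
        partial = nxt
    return ys, partial


rng = random.Random(7)
alpha, lam, r = 0.6, 2.0, 5
ys, gamma_r = lemma1_ratios(alpha, lam, r, rng)
lhs = lam ** alpha / math.prod(ys)      # lambda^alpha / prod_j Y_j(lambda)
rhs = gamma_r + lam ** alpha            # Gamma_{alpha, r} + lambda^alpha
```

The two quantities `lhs` and `rhs` agree up to floating-point error for every draw, since the numerator of each ratio cancels the denominator of the previous one.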

3.1. Results for $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[0]}(\lambda)$, the Generalized Gamma Case

Let $(\beta^{(k)}_{(\frac{1-\alpha}{\alpha},1)})$ denote a collection of iid $\mathrm{Beta}(\frac{1-\alpha}{\alpha}, 1)$ variables, and independent of this, let $(\tau_\alpha^{(r)}(y))$ denote, for each fixed $y \ge 0$, a collection of iid variables such that $\tau_\alpha^{(r)}(y) \overset{d}{=} \tau_\alpha(y)$. In addition, for each $r$, $(\beta^{(1)}_{(\frac{1-\alpha}{\alpha},1)}, \ldots, \beta^{(r)}_{(\frac{1-\alpha}{\alpha},1)}, \tau_\alpha^{(r)}(\lambda))$ is independent of $(Y_1(\lambda), \ldots, Y_r(\lambda))$.
Proposition 2.
Consider $((P_{\ell,r}), Z_r; r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,0)$; then, for each $r$, the joint distribution of the random variables $(B_1, \ldots, B_r, T_{\alpha,r}) \,|\, N_{T_\alpha}(\lambda) = 0$ is equivalent component-wise and jointly to the distribution of $(B_1^{[0]}(\lambda), \ldots, B_r^{[0]}(\lambda), T_{\alpha,r}^{[0]}(\lambda))$, where:
(i) $B_k^{[0]}(\lambda) \overset{d}{=} 1 - \beta^{(k)}_{(\frac{1-\alpha}{\alpha},1)}[1 - Y_k(\lambda)]$, with conditional density given $Y_k(\lambda) = y_k$,
$$\frac{1-\alpha}{\alpha}\,(1-b_k)^{\frac{1-\alpha}{\alpha}-1}(1-y_k)^{1-\frac{1}{\alpha}}\, I\{y_k \le b_k \le 1\},$$
for $k = 1, 2, \ldots$.
(ii) The conditional distribution of $T_{\alpha,r} \,|\, N_{T_\alpha}(\lambda) = 0$ is equivalent to that of
$$T_{\alpha,r}^{[0]}(\lambda) \overset{d}{=} \frac{\tau_\alpha^{(r)}(\Gamma_{\alpha,r} + \lambda^\alpha)}{(\Gamma_{\alpha,r} + \lambda^\alpha)^{1/\alpha}},$$
where we recall that $\lambda^\alpha/\prod_{j=1}^{r} Y_j(\lambda) = \Gamma_{\alpha,r} + \lambda^\alpha$.
(iii) The conditional density of $T_{\alpha,r}^{[0]}(\lambda) \,|\, \prod_{i=1}^{r} Y_i(\lambda) = \mathbf{y}_r$ is $f_\alpha^{[0]}(t|\lambda \mathbf{y}_r^{-1/\alpha})$.
(iv) Hence, $(P_{\ell,r}) \,|\, N_{T_\alpha}(\lambda) = 0 \sim E[\mathbb{P}_\alpha^{[0]}((\Gamma_{\alpha,r} + \lambda^\alpha)^{1/\alpha})]$.
(v) $(B_1^{[0]}(\lambda), \ldots, B_r^{[0]}(\lambda), T_{\alpha,r}^{[0]}(\lambda)) \,|\, Y_1(\lambda), \ldots, Y_r(\lambda)$ are independent.
Corollary 1.
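Suppose that $(P_{\ell,0}(\lambda)) \overset{d}{=} (P_\ell^{[0]}(\lambda)) \sim \mathbb{P}_\alpha^{[0]}(\lambda) = \int_0^\infty \mathrm{PD}(\alpha|t)\, e^{-\lambda t} e^{\lambda^\alpha} f_\alpha(t)\, dt$; then, for $r = 1, 2, \ldots$,
$$(P_{\ell,r}(\lambda)) = \widehat{\mathrm{Frag}}^{(r)}_{\alpha,1-\alpha}((P_{\ell,r-1}(\lambda))) \overset{d}{=} (P_\ell^{[0]}((\Gamma_{\alpha,r} + \lambda^\alpha)^{1/\alpha})),$$
where $\Gamma_{\alpha,r} = \sum_{j=1}^{r} \gamma^{(j)}_{\frac{1}{\alpha}} \sim \mathrm{Gamma}(\frac{r}{\alpha}, 1)$.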
Proof. 
This follows from statement (iv) of Proposition 2. □
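The corollary shows that the fragmentation operations in (6) lead to a nested family of (mixed) normalized generalized gamma distributed mass partitions, with $\lambda^\alpha$ replaced by the random quantities $\lambda^\alpha/\prod_{j=1}^{r} Y_j(\lambda) = \Gamma_{\alpha,r} + \lambda^\alpha$. In other words, $(P_{\ell,r}) \,|\, N_{T_{\alpha,0}}(\lambda) = 0$ equates in distribution to the ranked masses of the random distribution function, for $v \in [0,1]$:
$$F_\alpha(v \,|\, (\Gamma_{\alpha,r} + \lambda^\alpha)^{1/\alpha}) \overset{d}{=} \frac{\tau_\alpha([\Gamma_{\alpha,r} + \lambda^\alpha]\, v)}{\tau_\alpha(\Gamma_{\alpha,r} + \lambda^\alpha)}.$$
Now, in order to recover $\mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$ for $\theta \ge 0$ when $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[0]}(\lambda)$, set, for $\theta \ge 0$, $\tilde{G}_{\alpha,\theta} \overset{d}{=} G_{\frac{\theta}{\alpha}}^{1/\alpha} \overset{d}{=} \gamma_\theta/T_{\alpha,\theta}$, where $G_{\frac{\theta}{\alpha}} \sim \mathrm{Gamma}(\frac{\theta}{\alpha}, 1)$. When $(P_{\ell,0}(\lambda)) \overset{d}{=} (P_\ell^{[0]}(\lambda)) \sim \mathbb{P}_\alpha^{[0]}(\lambda)$, as in Corollary 1, it follows from ([4], Proposition 21) that $(P_{\ell,0}(\tilde{G}_{\alpha,\theta})) \sim \mathrm{PD}(\alpha,\theta)$. Hence, $((P_{\ell,r}(\tilde{G}_{\alpha,\theta})), Z_r(\tilde{G}_{\alpha,\theta}); r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$. It follows from Proposition 2 that $B_k^{[0]}(\tilde{G}_{\alpha,\theta}) \overset{ind}{\sim} \mathrm{Beta}\left(\frac{\theta+\alpha+k-1}{\alpha}, \frac{1-\alpha}{\alpha}\right)$ for $k = 1, 2, \ldots$. Notably, $(Y_1(\tilde{G}_{\alpha,\theta}), \ldots, Y_r(\tilde{G}_{\alpha,\theta}))$ are independent variables, such that $1 - Y_r(\tilde{G}_{\alpha,\theta}) \sim \mathrm{Beta}\left(\frac{1}{\alpha}, \frac{\theta+r-1}{\alpha}\right)$ for $r = 1, 2, \ldots$. When $\theta = 0$, or equivalently $\lambda = 0$, $Y_1(0) = 0$, and $1 - Y_r(0) \sim \mathrm{Beta}\left(\frac{1}{\alpha}, \frac{r-1}{\alpha}\right)$ for $r = 2, 3, \ldots$.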
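The gamma randomization above can be probed by simulation: replacing $\lambda^\alpha$ with an independent $\mathrm{Gamma}(\theta/\alpha, 1)$ variable in the construction of Proposition 2 (i) should return the unconditional Beta law of $B_k$. The sketch below compares the first two moments only, for one illustrative parameter choice with $\theta > 0$; the function name is hypothetical.

```python
import random


def sample_b0_randomized(alpha, theta, k, rng):
    """Draw B_k^{[0]} with lambda^alpha replaced by an independent Gamma(theta/alpha, 1)."""
    shift = rng.gammavariate(theta / alpha, 1.0)                          # plays the role of lambda^alpha
    prev = sum(rng.gammavariate(1.0 / alpha, 1.0) for _ in range(k - 1))  # Gamma_{alpha, k-1}
    cur = prev + rng.gammavariate(1.0 / alpha, 1.0)                       # Gamma_{alpha, k}
    y_k = (prev + shift) / (cur + shift)
    beta = rng.betavariate((1.0 - alpha) / alpha, 1.0)
    return 1.0 - beta * (1.0 - y_k)


rng = random.Random(3)
alpha, theta, k = 0.5, 1.0, 2
draws = [sample_b0_randomized(alpha, theta, k, rng) for _ in range(50_000)]

# Target: Beta(a, b) with a = (theta + alpha + k - 1) / alpha and b = (1 - alpha) / alpha.
a, b = (theta + alpha + k - 1) / alpha, (1 - alpha) / alpha
mean_hat = sum(draws) / len(draws)
second_hat = sum(x * x for x in draws) / len(draws)
mean_beta = a / (a + b)
second_beta = a * (a + 1) / ((a + b) * (a + b + 1))
```

Agreement of `mean_hat` with `mean_beta` and of `second_hat` with `second_beta` is a necessary check rather than a proof of the distributional identity.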

3.2. Results for $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[1]}(\lambda)$

Proposition 3.
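Consider $((P_{\ell,r}), Z_r; r \ge 0) \,|\, N_{T_\alpha}(\lambda) = 1 \sim \mathrm{MLMC}^{[1]}_{\mathrm{frag}}(\alpha|\lambda)$; then, for each $r$, the joint distribution of the random variables $(B_1, \ldots, B_r, T_{\alpha,r}) \,|\, N_{T_\alpha}(\lambda) = 1$ is equivalent component-wise and jointly to the distribution of $(B_1^{[1]}(\lambda), \ldots, B_r^{[1]}(\lambda), T_{\alpha,r}^{[1]}(\lambda))$, where:
(i) $B_1^{[1]}(\lambda) \overset{d}{=} \lambda^\alpha/(\gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)$, where $\gamma_{\frac{1-\alpha}{\alpha}} \sim \mathrm{Gamma}(\frac{1-\alpha}{\alpha}, 1)$.
(ii) $B_k^{[1]}(\lambda) \overset{d}{=} B_{k-1}^{[0]}((\gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha})$ for $k = 2, 3, \ldots$, component-wise and jointly.
(iii) $T_{\alpha,r}^{[1]}(\lambda)$ is equivalent in distribution to $T_{\alpha,r} \,|\, N_{T_\alpha}(\lambda) = 1$ and equivalent in distribution to
$$T_{\alpha,r-1}^{[0]}((\gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha}) \overset{d}{=} \frac{\tau_\alpha^{(r-1)}(\Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)}{(\Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha}},$$
for $r = 1, 2, \ldots$.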
Corollary 2.
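The distributions of the components of $((P_{\ell,r}(\lambda)), Z_r(\lambda); r \ge 0) \sim \mathrm{MLMC}^{[1]}_{\mathrm{frag}}(\alpha|\lambda)$, where $(P_{\ell,0}(\lambda)) \overset{d}{=} (P_\ell^{[1]}(\lambda)) \sim \mathbb{P}_\alpha^{[1]}(\lambda)$, for $\lambda > 0$, satisfy, for $r = 1, 2, \ldots$,
$$(P_{\ell,r}(\lambda)) = \widehat{\mathrm{Frag}}^{(r)}_{\alpha,1-\alpha}((P_{\ell,r-1}(\lambda))) \overset{d}{=} (P_\ell^{[1]}((\Gamma_{\alpha,r} + \lambda^\alpha)^{1/\alpha})), \tag{11}$$
where $(P_\ell^{[1]}((e_1 + \Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha})) \overset{d}{=} (P_\ell^{[0]}((\Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha}))$ for $e_1 \sim \mathrm{exponential}(1)$ independent of the other variables. In this case, $\Gamma_{\alpha,r} \overset{d}{=} e_1 + \Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}}$.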
Proof. 
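$(P_{\ell,r}) \,|\, N_{T_\alpha}(\lambda) = 1$ has the same distribution as $(P_{\ell,r}(\lambda))$ in (11), and (iii) of Proposition 3 shows that they are equivalent in distribution to $(P_\ell^{[0]}((\Gamma_{\alpha,r-1} + \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)^{1/\alpha}))$. From ([19], Corollary 2.1, Proposition 3.2), there is the equivalence $(P_\ell^{[1]}((e_1 + \lambda^\alpha)^{1/\alpha})) \overset{d}{=} (P_\ell^{[0]}(\lambda))$ for any $\lambda \ge 0$, which yields (11). □
Now, in order to recover $\mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$ for $\theta > -\alpha$ when $(P_{\ell,0}(\lambda)) \sim \mathbb{P}_\alpha^{[1]}(\lambda)$, use $\hat{G}_{\alpha,\theta} \overset{d}{=} G_{\frac{\theta+\alpha}{\alpha}}^{1/\alpha} \overset{d}{=} \gamma_{1+\theta}/T_{\alpha,\theta}$, where $G_{\frac{\theta+\alpha}{\alpha}} \sim \mathrm{Gamma}(\frac{\theta+\alpha}{\alpha}, 1)$, and $((P_{\ell,r}(\lambda)), Z_r(\lambda); r \ge 0) \sim \mathrm{MLMC}^{[1]}_{\mathrm{frag}}(\alpha|\lambda)$. It follows from ([19], Corollary 2.1) that $((P_{\ell,r}(\hat{G}_{\alpha,\theta})), Z_r(\hat{G}_{\alpha,\theta}); r \ge 0) \sim \mathrm{MLMC}_{\mathrm{frag}}(\alpha,\theta)$, for $\theta > -\alpha$.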
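The identity $G_{\frac{\theta+\alpha}{\alpha}}^{1/\alpha} \overset{d}{=} \gamma_{1+\theta}/T_{\alpha,\theta}$ used above can be checked at the level of moments. This is a sketch, for $s \ge 0$, assuming the standard formula $E[T_\alpha^{-\theta}] = \Gamma(\theta/\alpha+1)/\Gamma(\theta+1)$ (not stated in the text) and the independence of $\gamma_{1+\theta}$ and $T_{\alpha,\theta}$.

```latex
E\Big[\Big(\frac{\gamma_{1+\theta}}{T_{\alpha,\theta}}\Big)^{s}\Big]
 = \frac{\Gamma(1+\theta+s)}{\Gamma(1+\theta)} \cdot
   \frac{E[T_\alpha^{-(\theta+s)}]}{E[T_\alpha^{-\theta}]}
 = \frac{\Gamma(1+\theta+s)}{\Gamma(1+\theta)} \cdot
   \frac{\Gamma(\frac{\theta+s}{\alpha}+1)\,\Gamma(\theta+1)}
        {\Gamma(\theta+s+1)\,\Gamma(\frac{\theta}{\alpha}+1)}
 = \frac{\Gamma(\frac{\theta+\alpha}{\alpha}+\frac{s}{\alpha})}{\Gamma(\frac{\theta+\alpha}{\alpha})}
 = E\Big[G_{\frac{\theta+\alpha}{\alpha}}^{\,s/\alpha}\Big].
```

The same computation with $\gamma_\theta$ in place of $\gamma_{1+\theta}$ gives $\Gamma(\frac{\theta+s}{\alpha})/\Gamma(\frac{\theta}{\alpha})$, matching the moments of $G_{\frac{\theta}{\alpha}}^{1/\alpha}$ in the $n = 0$ case for $\theta > 0$.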

3.3. Proofs of Propositions 2 and 3

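Although the joint conditional density of $(B_1, \ldots, B_r, T_{\alpha,r}) \,|\, N_{T_\alpha}(\lambda) = 0$ in the $\mathrm{MLMC}(\alpha,0)$ setting can be easily obtained from ([6], p. 324), with $h(t) = e^{-\lambda t} e^{\lambda^\alpha}$, for clarity, we derive it here. Write $\mathbf{b}_r = \prod_{i=1}^{r} b_i$. Since $P(N_{T_\alpha}(\lambda) = 0 \,|\, T_{\alpha,r} = s, \prod_{i=1}^{r} B_i = \mathbf{b}_r) = e^{-\lambda s/\mathbf{b}_r^{1/\alpha}}$ and $P(N_{T_\alpha}(\lambda) = 0) = e^{-\lambda^\alpha}$, it follows that the desired conditional density of $(B_1, \ldots, B_r, T_{\alpha,r}) \,|\, N_{T_\alpha}(\lambda) = 0$ can be expressed as
$$\frac{\alpha^r}{[\Gamma(\frac{1-\alpha}{\alpha})]^r} \prod_{i=1}^{r} b_i^{\frac{\alpha+i-1}{\alpha}-1}(1-b_i)^{\frac{1-\alpha}{\alpha}-1} \times s^{-r} f_\alpha(s)\, e^{-\lambda s/\mathbf{b}_r^{1/\alpha}}\, e^{\lambda^\alpha}. \tag{12}$$
Now, a joint density of $(B_1^{[0]}(\lambda), \ldots, B_r^{[0]}(\lambda), T_{\alpha,r}^{[0]}(\lambda), Y_1(\lambda), \ldots, Y_r(\lambda))$ follows from the descriptions in Proposition 2 and Lemma 1 and can be expressed, for $0 \le y_k \le b_k \le 1$, $k = 1, \ldots, r$, as
$$\frac{e^{\lambda^\alpha} f_\alpha(s)\, \lambda^r}{[\Gamma(\frac{1-\alpha}{\alpha})]^r} \prod_{k=1}^{r} (1-b_k)^{\frac{1-\alpha}{\alpha}-1} \times e^{-\lambda s/\mathbf{y}_r^{1/\alpha}} \prod_{l=1}^{r} y_l^{-\frac{(r-l+1)}{\alpha}-1}, \tag{13}$$
for $\mathbf{y}_r = \prod_{i=1}^{r} y_i$. Proposition 2 is verified by showing that integrating over $(y_1, \ldots, y_r)$ in (13) leads to (12). This is equivalent to showing that
$$\int_0^{b_1} \cdots \int_0^{b_r} e^{-\lambda s/\mathbf{y}_r^{1/\alpha}} \prod_{l=1}^{r} y_l^{-\frac{(r-l+1)}{\alpha}-1}\, dy_r \cdots dy_1 = \alpha^r \lambda^{-r} s^{-r} e^{-\lambda s/\mathbf{b}_r^{1/\alpha}} \prod_{i=1}^{r} b_i^{\frac{i-1}{\alpha}},$$
which follows by elementary calculations involving the change of variable $v_i = y_i^{-1/\alpha}$, for $i = 1, \ldots, r$, and exponential integrals. For instance, when $r = 1$ the substitution $v = y_1^{-1/\alpha}$ turns the left-hand side into $\alpha \int_{b_1^{-1/\alpha}}^{\infty} e^{-\lambda s v}\, dv = \alpha \lambda^{-1} s^{-1} e^{-\lambda s/b_1^{1/\alpha}}$, and the general case repeats this step starting from the innermost integral. Now, to establish Proposition 3, first note that since $P(N_{T_\alpha}(\lambda) = 1 \,|\, T_{\alpha,1} = s, B_1 = b_1) = \lambda s\, b_1^{-1/\alpha} e^{-\lambda s/b_1^{1/\alpha}}$ and $P(N_{T_\alpha}(\lambda) = 1) = \alpha \lambda^\alpha e^{-\lambda^\alpha}$, the joint density of $B_1, T_{\alpha,1} \,|\, N_{T_\alpha}(\lambda) = 1$ can be expressed as
$$\frac{\lambda^{1-\alpha}}{\Gamma(\frac{1-\alpha}{\alpha})}\, b_1^{-\frac{1}{\alpha}} (1-b_1)^{\frac{1-\alpha}{\alpha}-1} \times e^{-\lambda s/b_1^{1/\alpha}}\, e^{\lambda^\alpha} f_\alpha(s). \tag{14}$$
Hence, the conditional density of $B_1 \,|\, N_{T_\alpha}(\lambda) = 1$ can be expressed as
$$\frac{\lambda^{1-\alpha}}{\Gamma(\frac{1-\alpha}{\alpha})}\, b_1^{-\frac{1}{\alpha}} (1-b_1)^{\frac{1-\alpha}{\alpha}-1} \times e^{-\lambda^\alpha/b_1}\, e^{\lambda^\alpha}, \tag{15}$$
which corresponds to $B_1^{[1]}(\lambda) \overset{d}{=} \lambda^\alpha/(\gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)$, verifying statement (i) of Proposition 3. Equations (14) and (15) show that the density of $T_{\alpha,1} \,|\, N_{T_\alpha}(\lambda) = 1, B_1 = b_1$ is $f_\alpha^{[0]}(s|\lambda b_1^{-1/\alpha})$, which leads to $(P_{\ell,1}) \,|\, N_{T_\alpha}(\lambda) = 1, B_1 = b_1$ having distribution $\mathbb{P}_\alpha^{[0]}(\lambda b_1^{-1/\alpha})$. This agrees with statement (ii) of Proposition 1, with $n = r = 1$. Using $\lambda^\alpha/B_1^{[1]}(\lambda) \overset{d}{=} \gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha$ and applying Proposition 2 starting with $(P_{\ell,1}) \,|\, N_{T_\alpha}(\lambda) = 1, B_1 = b_1$ subject to (6) concludes the proof of Proposition 3.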
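The match between the density (15) and the representation $B_1^{[1]}(\lambda) \overset{d}{=} \lambda^\alpha/(\gamma_{\frac{1-\alpha}{\alpha}} + \lambda^\alpha)$ can be checked numerically. The sketch below is illustrative only: it integrates (15) with a plain midpoint rule and compares its mean with a Monte Carlo mean from the gamma representation. The function names and parameter values are choices made here, and $\alpha \le 1/2$ is used so that the density stays bounded near $b_1 = 1$.

```python
import math
import random


def density_b1(b, alpha, lam):
    """Density (15) of B_1 given N_{T_alpha}(lam) = 1, evaluated at 0 < b < 1."""
    a = (1.0 - alpha) / alpha
    c = lam ** alpha
    return (lam ** (1.0 - alpha) / math.gamma(a)) * b ** (-1.0 / alpha) \
        * (1.0 - b) ** (a - 1.0) * math.exp(c - c / b)


def midpoint_moments(alpha, lam, n_cells=200_000):
    """Midpoint-rule estimates of the total mass and the mean of density (15)."""
    h = 1.0 / n_cells
    mass = mean = 0.0
    for i in range(n_cells):
        b = (i + 0.5) * h
        d = density_b1(b, alpha, lam)
        mass += d * h
        mean += b * d * h
    return mass, mean


alpha, lam = 0.4, 1.3
mass, mean_density = midpoint_moments(alpha, lam)

# Monte Carlo mean from B = lam^alpha / (gamma + lam^alpha), gamma ~ Gamma((1-alpha)/alpha, 1).
rng = random.Random(11)
c = lam ** alpha
shape = (1.0 - alpha) / alpha
sims = [c / (rng.gammavariate(shape, 1.0) + c) for _ in range(100_000)]
mean_sim = sum(sims) / len(sims)
```

A total mass close to 1 and agreement between `mean_density` and `mean_sim` are consistent with statement (i) of Proposition 3 for these parameter values.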

Funding

This research was supported in part by grants RGC-GRF 16301521, 16300217 and 601712 of the Research Grants Council (RGC) of the Hong Kong SAR. This research also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 817257.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

This article is dedicated to Eugenio Regazzini on the occasion of his 75th birthday.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Perman, M.; Pitman, J.; Yor, M. Size-biased sampling of Poisson point processes and excursions. Probab. Theory Relat. Fields 1992, 92, 21–39.
  2. Pitman, J. Poisson-Kingman partitions. In Science and Statistics: A Festschrift for Terry Speed; Goldstein, D.R., Ed.; Institute of Mathematical Statistics: Hayward, CA, USA, 2003; pp. 1–34.
  3. Pitman, J. Combinatorial Stochastic Processes. In Lectures from the 32nd Summer School on Probability Theory Held in Saint-Flour, July 7–24, 2002. With a Foreword by Jean Picard. Lecture Notes in Mathematics, 1875; Springer: Berlin/Heidelberg, Germany, 2006.
  4. Pitman, J.; Yor, M. The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Ann. Probab. 1997, 25, 855–900.
  5. Rembart, F.; Winkel, M. A binary embedding of the stable line-breaking construction. arXiv 2016, arXiv:1611.02333.
  6. Ho, M.-W.; James, L.F.; Lau, J.W. Gibbs Partitions, Riemann-Liouville Fractional Operators, Mittag-Leffler Functions, and Fragmentations derived from stable subordinators. J. Appl. Prob. 2021, 58, 314–334.
  7. Goldschmidt, C.; Haas, B. A line-breaking construction of the stable trees. Electron. J. Probab. 2015, 20, 1–24.
  8. Haas, B.; Miermont, G.; Pitman, J.; Winkel, M. Continuum tree asymptotics of discrete fragmentations and applications to phylogenetic models. Ann. Probab. 2008, 36, 1790–1837.
  9. James, L.F. Stick-breaking PG(α,ζ)-Generalized Gamma Processes. Unpublished manuscript. arXiv 2013, arXiv:1308.6570.
  10. Aldous, D. The continuum random tree. I. Ann. Probab. 1991, 19, 1–28.
  11. Aldous, D. The continuum random tree III. Ann. Probab. 1993, 21, 248–289.
  12. Móri, T.F. The maximum degree of the Barabási-Albert random tree. Combin. Probab. Comput. 2005, 14, 339–348.
  13. Peköz, E.; Röllin, A.; Ross, N. Generalized gamma approximation with rates for urns, walks and trees. Ann. Probab. 2016, 44, 1776–1816.
  14. Peköz, E.; Röllin, A.; Ross, N. Joint degree distributions of preferential attachment random graphs. Adv. Appl. Probab. 2017, 49, 368–387.
  15. van der Hofstad, R. Random Graphs and Complex Networks; Cambridge University Press: New York, NY, USA, 2016; Volume I.
  16. Bertoin, J. Random Fragmentation and Coagulation Processes; Cambridge University Press: Cambridge, UK, 2006.
  17. Dong, R.; Goldschmidt, C.; Martin, J. Coagulation-fragmentation duality, Poisson-Dirichlet distributions and random recursive trees. Ann. Appl. Probab. 2006, 16, 1733–1750.
  18. Pitman, J. Mixed Poisson and negative binomial models for clustering and species sampling. Unpublished manuscript. 2017.
  19. James, L.F. Stick-breaking Pitman-Yor processes given the species sampling size. arXiv 2019, arXiv:1908.07186.
  20. Favaro, S.; James, L.F. A note on nonparametric inference for species variety with Gibbs-type priors. Electron. J. Statist. 2015, 9, 2884–2902.
  21. Pitman, J. Some Developments of the Blackwell-MacQueen urn Scheme; Statistics, Probability and Game Theory, IMS Lecture Notes Monogr. Ser. 30; Institute of Mathematical Statistics: Hayward, CA, USA, 1996; pp. 245–267.
  22. Ishwaran, H.; James, L.F. Gibbs sampling methods for stick-breaking priors. J. Am. Statist. Assoc. 2001, 96, 161–173.
  23. Ferguson, T.S. A Bayesian analysis of some nonparametric problems. Ann. Statist. 1973, 1, 209–230.
  24. James, L.F. Lamperti type laws. Ann. Appl. Probab. 2010, 20, 1303–1340.
  25. James, L.F.; Lijoi, A.; Prünster, I. Posterior analysis for normalized random measures with independent increments. Scand. J. Stat. 2009, 36, 76–97.
  26. Zhou, M.; Favaro, S.; Walker, S.G. Frequency of Frequencies Distributions and Size-Dependent Exchangeable Random Partitions. J. Am. Statist. Assoc. 2017, 112, 1623–1635. [Google Scholar] [CrossRef] [Green Version]
  27. McCloskey, J.W. A Model for the Distribution of Individuals by Species in an Environment. Ph.D. Thesis, Michigan State University, East Lansing, MI, USA, 1965. [Google Scholar]
  28. Regazzini, E.; Lijoi, A.; Prünster, I. Distributional results for means of normalized random measures with independent increments. Ann. Statist. 2003, 31, 560–585. [Google Scholar] [CrossRef]
  29. Lijoi, A.; Prünster, I. Models Beyond the Dirichlet Process. In Bayesian Nonparametrics; Hjort, N.L., Holmes, C., Müller, P., Walker, S., Eds.; Cambridge University Press: Cambridge, UK, 2010; pp. 80–136. [Google Scholar]
  30. Wood, F.; Gasthaus, J.; Archambeau, C.; James, L.F.; Teh, Y.W. The Sequence Memoizer. Commun. ACM 2011, 54, 91–98. [Google Scholar] [CrossRef]