**1. Introduction**

A rather large number of indexes have been proposed in the literature that supposedly measure the significance of an author's scientific publications. Among the most popular of them are the following:


It is these two indexes that we consider in the proposed work.

The definition of the numerical value of the index (i1) is clear from its name.

Recall the definition of the Hirsch index (see [4]). The Hirsch index *h* of an author is the largest number *h* such that *h* of the author's articles have been cited at least *h* times each. This index was introduced in [4], where its properties were explained. In our opinion, these properties do not correspond to the purpose of the index. However, we postpone the description of both the positive and the negative sides of the Hirsch index until citation models for scientific articles have been constructed. One of them has already been stated by us in the preprint [6].

#### **2. Citation Model Construction**

We now turn to the construction of the author's citation model. It will be considered as a composite of two models. The first of them describes the process by which an author publishes an article that will be cited, and the second describes the process of citing such an article.

Let us make some assumptions, which we discuss later.

**Assumption 1.** *Let the probability that a manuscript is rejected or is never cited be q, and let the decisions on publication of different manuscripts be taken independently.*

Then it is clear that the probability that the scientist has exactly *k* cited papers equals *q*(1 − *q*)<sup>*k*</sup>, *k* = 0, 1, .... In other words, the number of publications of a scientist has a geometric distribution with parameter *q*. This distribution allows the number of an author's publications to be arbitrarily large. However, (1 − *q*)<sup>*k*</sup> tends to zero rather fast as *k* → ∞ and, therefore, the mean number of publications is not too large. The generating function of this distribution has the form

$$Q(z) = \frac{q}{1 - (1 - q)z}.\tag{1}$$
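As an illustration, the closed form (1) can be compared numerically with the truncated power series of the geometric probabilities. The following Python sketch is not part of the original derivation; the values of `q` and `z` and all function names are illustrative assumptions.

```python
def geometric_pmf(k: int, q: float) -> float:
    """P{number of cited papers = k} = q * (1 - q)**k for k = 0, 1, ..."""
    return q * (1.0 - q) ** k


def generating_function_series(z: float, q: float, terms: int = 2000) -> float:
    """Truncated power series sum_{k < terms} P{K = k} * z**k."""
    return sum(geometric_pmf(k, q) * z ** k for k in range(terms))


def generating_function_closed(z: float, q: float) -> float:
    """Closed form (1): Q(z) = q / (1 - (1 - q) z)."""
    return q / (1.0 - (1.0 - q) * z)


q, z = 0.3, 0.8  # illustrative values, not taken from the paper
series_value = generating_function_series(z, q)
closed_value = generating_function_closed(z, q)
mean_papers = sum(k * geometric_pmf(k, q) for k in range(2000))

print(series_value, closed_value)  # the two values should agree closely
print(mean_papers, (1.0 - q) / q)  # the mean of this geometric law is (1 - q) / q
```

The mean (1 − *q*)/*q* printed in the last line quantifies the remark that the expected number of publications is moderate unless *q* is close to zero.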

Of course, here we assume that all the journals to which the author sends manuscripts have the same review system, i.e., all of them accept the manuscripts of this author with the same probability 1 − *q*. More realistic is the situation with a random parameter *q*:

$$Q(z) = \int\_0^1 \frac{q}{1 - (1 - q)z} d\,\Xi(q).$$

where Ξ is a probability distribution on the interval [0, 1], and then ℙ{*X* = *k*} = 𝔼(*q*(1 − *q*)<sup>*k*</sup>).

Let us go back to (1). How much time may a scientist spend to publish the corresponding number of papers? Of course, this time is a random variable *T*, and we are interested in its distribution. The usual assumption is that the working time has an exponential distribution with parameter *λ* = 𝔼*T* and Laplace transform *ϕ*(*t*) = 1/(1 + *λt*). Suppose that the time needed for the publication of the *j*-th paper is *T*<sub>*j*</sub>, and that *T*<sub>1</sub>, *T*<sub>2</sub>, ... are independent random variables distributed identically to *T*. Then the time needed for all publications has the Laplace transform

$$\sum\_{k=1}^{\infty} \varphi^k(t) q (1-q)^{k-1} = \frac{1}{1 + \lambda t / q}.$$

That is, the total time again has an exponential distribution, now with parameter *λ*/*q*.
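The identity for the Laplace transform of the total publication time can be checked numerically by summing the geometric mixture directly. The sketch below is only an illustration; the function names and the values of `lam`, `q`, and `t` are assumptions chosen for the example.

```python
def laplace_single_paper(t: float, lam: float) -> float:
    """Laplace transform of an exponential time with mean lam: 1 / (1 + lam * t)."""
    return 1.0 / (1.0 + lam * t)


def laplace_total_series(t: float, lam: float, q: float, terms: int = 500) -> float:
    """Truncated sum over k >= 1 of phi(t)**k * q * (1 - q)**(k - 1)."""
    phi = laplace_single_paper(t, lam)
    return sum(phi ** k * q * (1.0 - q) ** (k - 1) for k in range(1, terms + 1))


def laplace_total_closed(t: float, lam: float, q: float) -> float:
    """Closed form 1 / (1 + lam * t / q), i.e. an exponential law with mean lam / q."""
    return 1.0 / (1.0 + lam * t / q)


lam, q = 2.0, 0.25  # illustrative mean time per paper and rejection probability
for t in (0.1, 0.7, 3.0):
    print(t, laplace_total_series(t, lam, q), laplace_total_closed(t, lam, q))
```

Both columns should coincide up to rounding, which reflects the fact that a geometric sum of independent exponential times is again exponential.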

It is natural to assume that each cited publication will produce some number of citations. Of course, the likelihood that the article will be quoted again depends on the number of previous citations.

**Assumption 2.** *Assume that the probability that an article having k* − 1 *(k* ≥ 1*) citations receives no new citations equals p*/*k<sup>γ</sup>, where p is the probability that the article is not cited even once. The parameter γ is responsible for the speed of convergence of this probability to zero.*

Consequently, the likelihood that the article will be quoted exactly *k* times equals (*p*/*k*<sup>*γ*</sup>) ∏<sub>*j*=1</sub><sup>*k*−1</sup> (1 − *p*/*j*<sup>*γ*</sup>). For the case of *γ* = 1, the probability generating function for the number of citations of this article is 1 − (1 − *z*)<sup>*p*</sup>. The corresponding distribution is named after Sibuya [7]. Below we consider the case of arbitrary positive *γ*. The corresponding study is of general mathematical interest. Therefore, we present it in a number of sections below.
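The probabilities in Assumption 2 are easy to tabulate recursively, and for *γ* = 1 the result can be compared with the generating function 1 − (1 − *z*)<sup>*p*</sup>. The following Python sketch is an illustration only; the function name `citation_pmf` and the parameter values are assumptions.

```python
from typing import List


def citation_pmf(n_max: int, p: float, gamma: float) -> List[float]:
    """Return [P_1, ..., P_{n_max}] with P_k = p / k**gamma * prod_{j<k} (1 - p / j**gamma)."""
    pmf = []
    still_cited = 1.0  # running product of (1 - p / j**gamma) over j < k
    for k in range(1, n_max + 1):
        stop = p / k ** gamma
        pmf.append(stop * still_cited)
        still_cited *= 1.0 - stop
    return pmf


p, z = 0.4, 0.5  # illustrative values
pmf = citation_pmf(200, p, gamma=1.0)
series_value = sum(prob * z ** k for k, prob in enumerate(pmf, start=1))
closed_value = 1.0 - (1.0 - z) ** p

print(series_value, closed_value)  # should agree closely for gamma = 1
```

The same function accepts any positive `gamma`, which is the general case studied in the following sections.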

#### **3. Distribution of Citation Number of a Paper**

Let us consider an ordered sequence of experiments {E<sub>*n*</sub>; *n* = 1, 2, ...}, where an event *A* may appear in the *n*-th experiment with probability *p*<sub>*n*</sub>. Define a random variable *X* as the number of the first experiment in which *A* appears. We suppose that *X* is an improper random variable in the sense that it may take an infinite value (that is, the event *A* never appears). If ℙ{*X* = ∞} = 0, we say that *X* is a proper random variable. It is clear that, with the convention that any product from 1 to 0 equals 1,

$$\mathbb{P}\{X=n\} = p\_n \cdot \prod\_{k=1}^{n-1} (1 - p\_k) \tag{2}$$

and

$$\mathbb{P}\{X=\infty\} = \lim\_{n\to\infty} \prod\_{k=1}^{n-1} (1-p\_k).$$
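Formula (2) translates directly into a short routine that accumulates the running product. The Python sketch below is an illustration under our own naming; it returns the probabilities ℙ{*X* = *n*} for *n* up to a cutoff together with the mass that has not yet been assigned, which approximates ℙ{*X* = ∞} when the cutoff is large.

```python
from typing import Callable, List, Tuple


def first_occurrence_distribution(
    success_prob: Callable[[int], float], n_max: int
) -> Tuple[List[float], float]:
    """Return ([P{X=1}, ..., P{X=n_max}], P{X > n_max}) for success probabilities p_n."""
    pmf = []
    no_success_yet = 1.0  # empty product for n = 1
    for n in range(1, n_max + 1):
        p_n = success_prob(n)
        pmf.append(p_n * no_success_yet)  # formula (2)
        no_success_yet *= 1.0 - p_n
    return pmf, no_success_yet


# Illustrative choice: constant probabilities p_n = 0.2
pmf, remaining = first_occurrence_distribution(lambda n: 0.2, 50)
print(pmf[:3], remaining)
```

By construction the returned probabilities and the remaining mass always add up to one, whatever sequence of probabilities is supplied.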

Particular cases are:

1. The probabilities *p*<sub>*n*</sub> = *p* are constant. Then (2) becomes

$$\mathbb{P}\{X=n\} = p \cdot (1-p)^{n-1}, \quad \mathbb{P}\{X=\infty\} = 0 \tag{3}$$

corresponding to the classical geometric distribution. Its tail is

$$\mathbb{P}\{X \ge n\} = (1 - p)^{n - 1}, \quad n = 1, 2, \dots$$

Clearly, the tail and probabilities (3) decrease exponentially fast as *n* tends to infinity.

2. The probabilities are given by *p*<sub>*n*</sub> = *p*/*n*, where *p* is a number from the interval (0, 1). Equation (2) is transformed to

$$\mathbb{P}\{X=n\} = \frac{p}{n} \cdot \prod\_{k=1}^{n-1} (1 - \frac{p}{k}).\tag{4}$$

According to (4) *X* is a proper random variable and has, in this case, the Sibuya distribution with parameter *p* ∈ (0, 1) with the following tail

$$\mathbb{P}\{X \ge n\} = \frac{\Gamma(n-p)}{\Gamma(n) \cdot \Gamma(1-p)} \sim \frac{1}{\Gamma(1-p) \cdot n^p},$$

which has a heavy power-law asymptotic as *n* → ∞. Such a distribution does not have a finite mean value. It is not difficult to see that

$$\mathbb{P}\{X=n\} \sim p/(n^{p+1} \cdot \Gamma(1-p)), \quad n \to \infty.$$
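The Gamma-function expression for the Sibuya tail and its power-law asymptotic can be verified numerically. The sketch below is illustrative; the function names and the values of `n` and `p` are assumptions, and `math.lgamma` is used to avoid overflow of the Gamma function for large arguments.

```python
import math


def sibuya_tail_product(n: int, p: float) -> float:
    """P{X >= n} as the product prod_{k=1}^{n-1} (1 - p / k)."""
    tail = 1.0
    for k in range(1, n):
        tail *= 1.0 - p / k
    return tail


def sibuya_tail_gamma(n: int, p: float) -> float:
    """Closed form Gamma(n - p) / (Gamma(n) * Gamma(1 - p)), via log-Gamma."""
    return math.exp(math.lgamma(n - p) - math.lgamma(n) - math.lgamma(1.0 - p))


def sibuya_tail_asymptotic(n: int, p: float) -> float:
    """Leading term 1 / (Gamma(1 - p) * n**p) for large n."""
    return 1.0 / (math.gamma(1.0 - p) * n ** p)


n, p = 1000, 0.3  # illustrative values
print(sibuya_tail_product(n, p), sibuya_tail_gamma(n, p), sibuya_tail_asymptotic(n, p))
```

The first two values should coincide up to rounding, while the third differs by a relative error of order 1/*n*.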

The distributions presented above can be regarded as a kind of "extreme points" from the perspective of the tail behavior of a proper random variable *X*. Hence, it is natural to study, roughly speaking, the cases "between them", namely the situations in which *p*<sub>*n*</sub> = *p*/*n*<sup>*γ*</sup> with *p* ∈ (0, 1) and *γ* > 0. As mentioned above, the parameter *γ* is responsible for the speed of convergence of the rejection probability to zero.

#### **4. Main Result on Citation Number Distribution**

The subject of our study is the asymptotic behavior of the probabilities (2) for *p*<sub>*n*</sub> = *p*/*n*<sup>*γ*</sup> with *γ* ≥ 0. In addition to the values *γ* = 0 and *γ* = 1 discussed earlier, we distinguish the following two cases:

(A) 0 < *γ* < 1;

(B) *γ* > 1.

Let us consider the case (A). We have

$$\mathbb{P}\{X=n\} = \frac{p}{n^{\gamma}} \cdot \prod\_{k=1}^{n-1} (1 - \frac{p}{k^{\gamma}}).\tag{5}$$

Consider the product on the right-hand side of (5) in more detail:

$$\prod\_{k=1}^{n-1} (1 - \frac{p}{k^{\gamma}}) = \exp\left\{ \sum\_{k=1}^{n-1} \log(1 - p/k^{\gamma}) \right\} = \exp\left\{ -\sum\_{k=1}^{n-1} \sum\_{j=1}^{\infty} \frac{p^j}{j k^{\gamma j}} \right\}$$

$$= \exp\left\{-\sum\_{j=1}^{\infty} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}}\right\} = \exp\left\{-\sum\_{j=1}^{\lfloor 1/\gamma \rfloor + 1} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}}\right\} \exp\left\{-\sum\_{j=\lfloor 1/\gamma \rfloor + 2}^{\infty} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}}\right\}.\tag{6}$$

Here [1/*γ*] is the integer part of 1/*γ*. It is not difficult to see that

$$\exp\left\{-\sum\_{j=\lfloor 1/\gamma \rfloor + 2}^{\infty} \frac{p^{j}}{j} \sum\_{k=1}^{n-1} k^{-\gamma j} \right\}$$

has a finite positive limit as *n* → ∞. This limit may depend on *p* and *γ*. Let us denote it by *C*<sub>1</sub> = *C*<sub>1</sub>(*γ*, *p*). Therefore,

$$\prod\_{k=1}^{n-1} (1 - \frac{p}{k^{\gamma}}) \sim C\_1 \exp \left\{ -\sum\_{j=1}^{\lfloor 1/\gamma \rfloor + 1} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}} \right\} \quad \text{as} \quad n \to \infty. \tag{7}$$

Relations (5) and (7) give us

$$\mathbb{P}\{X=n\} \sim C\_1 \cdot \frac{p}{n^{\gamma}} \cdot \exp\left\{-\sum\_{j=1}^{\lfloor 1/\gamma \rfloor + 1} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}}\right\} \quad \text{as} \quad n \to \infty. \tag{8}$$

For 0 < *γj* < 1 the following asymptotic representation is known

$$\sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}} = \frac{n^{1-\gamma j}}{1-\gamma j} + \zeta(\gamma j) + o(1) \quad \text{as} \quad n \to \infty,\tag{9}$$

where *ζ*(*u*) is the Riemann zeta function. Further considerations depend on the properties of the number *γ*.
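The representation (9) can be illustrated numerically: subtracting the growing term from the partial sum should leave a quantity that stabilizes near the zeta value. The Python sketch below is an illustration with our own function name; the exponent 1/2 is an arbitrary choice, and the reference value *ζ*(1/2) ≈ −1.4603545 is a standard tabulated constant rather than something computed here.

```python
def zeta_estimate(s: float, n: int) -> float:
    """Estimate zeta(s) for 0 < s < 1 as sum_{k<n} k**(-s) - n**(1-s) / (1-s), cf. (9)."""
    partial_sum = sum(k ** (-s) for k in range(1, n))
    return partial_sum - n ** (1.0 - s) / (1.0 - s)


s = 0.5  # plays the role of gamma * j in (9)
estimates = {n: zeta_estimate(s, n) for n in (10**3, 10**4, 10**5, 10**6)}
for n, value in estimates.items():
    print(n, value)  # values should approach zeta(1/2), about -1.4603545
```

The slow drift of the printed values reflects the *o*(1) term in (9), which for this exponent decays only like *n*<sup>−1/2</sup>.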

(i) Suppose that 1/*γ* is not an integer. Then *γ* · [1/*γ*] < 1 and

$$\sum\_{j=1}^{\lfloor 1/\gamma \rfloor + 1} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}} = \sum\_{j=1}^{\lfloor 1/\gamma \rfloor} \frac{n^{1-\gamma j}}{1-\gamma j} \frac{p^j}{j} + \sum\_{j=1}^{\lfloor 1/\gamma \rfloor} \zeta(\gamma j) \frac{p^j}{j} + \frac{p^{\lfloor 1/\gamma \rfloor + 1}}{\lfloor 1/\gamma \rfloor + 1} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma(\lfloor 1/\gamma \rfloor + 1)}} + o(1). \tag{10}$$

However, *γ*([1/*γ*] + 1) > 1 and, therefore,

$$\lim\_{n \to \infty} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma([1/\gamma]+1)}} = \sum\_{k=1}^{\infty} \frac{1}{k^{\gamma([1/\gamma]+1)}} < \infty.$$

From this and (10) it follows

$$\mathbb{P}\{X=n\} \sim C\_2 \cdot \frac{p}{n^{\gamma}} \cdot \exp\left\{ -\sum\_{j=1}^{\lfloor 1/\gamma \rfloor} \frac{n^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j} \right\},\tag{11}$$

where *C*<sub>2</sub> depends on *p* and *γ* only.
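A simple numerical experiment can illustrate (11) in the case 1/2 < *γ* < 1, where [1/*γ*] = 1 and the exponent contains a single term. The sketch below (function names and parameter values are assumptions) divides the exact probability by the leading factor and checks that the quotient levels off at a positive constant.

```python
import math


def log_survival(n: int, p: float, gamma: float) -> float:
    """log of prod_{k=1}^{n-1} (1 - p / k**gamma)."""
    return sum(math.log1p(-p / k ** gamma) for k in range(1, n))


def ratio_to_leading_term(n: int, p: float, gamma: float) -> float:
    """P{X = n} divided by (p / n**gamma) * exp(-p * n**(1-gamma) / (1-gamma)).

    Valid as a check of (11) only for 1/2 < gamma < 1, where [1/gamma] = 1.
    """
    leading_exponent = -p * n ** (1.0 - gamma) / (1.0 - gamma)
    # The common factor p / n**gamma cancels, so only the product remains.
    return math.exp(log_survival(n, p, gamma) - leading_exponent)


p, gamma = 0.5, 0.75  # illustrative values
ratios = {n: ratio_to_leading_term(n, p, gamma) for n in (10**3, 10**4, 10**5)}
for n, value in ratios.items():
    print(n, value)  # should stabilize near a constant playing the role of C_2
```

Working with logarithms keeps the computation stable even though both the product and the leading factor are extremely small for large *n*.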

(ii) Suppose that 1/*γ* is a positive integer. Then *γ*[1/*γ*] = 1 and

$$\begin{aligned}\sum\_{j=1}^{\lfloor 1/\gamma \rfloor + 1} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}} ={}& \sum\_{j=1}^{\lfloor 1/\gamma \rfloor - 1} \frac{n^{1 - \gamma j}}{1 - \gamma j} \frac{p^j}{j} + \sum\_{j=1}^{\lfloor 1/\gamma \rfloor - 1} \zeta(\gamma j) \frac{p^j}{j} \\ &+ \frac{p^{\lfloor 1/\gamma \rfloor}}{\lfloor 1/\gamma \rfloor} \sum\_{k=1}^{n-1} \frac{1}{k} + \frac{p^{\lfloor 1/\gamma \rfloor + 1}}{\lfloor 1/\gamma \rfloor + 1} \sum\_{k=1}^{n-1} \frac{1}{k^{1+\gamma}} + o(1).\end{aligned}\tag{12}$$

Here *γ*([1/*γ*] + 1) = 1 + *γ* > 1, so that

$$\lim\_{n \to \infty} \sum\_{k=1}^{n-1} \frac{1}{k^{1+\gamma}} = \sum\_{k=1}^{\infty} \frac{1}{k^{1+\gamma}} < \infty$$

and

$$\sum\_{k=1}^{n-1} \frac{1}{k} = \log(n) + \gamma\_e + o(1),$$

where *γ*<sub>*e*</sub> is Euler's constant. Therefore,

$$\mathbb{P}\{X=n\} \sim C\_3 \cdot \frac{p}{n^{\gamma+p^{\lfloor 1/\gamma \rfloor}/\lfloor 1/\gamma \rfloor}} \cdot \exp\left\{ -\sum\_{j=1}^{\lfloor 1/\gamma \rfloor-1} \frac{n^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j} \right\} \text{ as } n \to \infty. \tag{13}$$
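The additional power factor in (13) can also be observed numerically. The following sketch treats *γ* = 1/*m* with an integer *m* ≥ 2 (here *m* = 2 as an illustrative choice) and checks that the exact probability divided by the leading factor of (13) stabilizes; all names and parameter values are assumptions.

```python
import math


def log_survival(n: int, p: float, gamma: float) -> float:
    """log of prod_{k=1}^{n-1} (1 - p / k**gamma)."""
    return sum(math.log1p(-p / k ** gamma) for k in range(1, n))


def log_leading_factor(n: int, p: float, m: int) -> float:
    """log of n**(-p**m / m) * exp(-sum_{j=1}^{m-1} n**(1-j/m) / (1-j/m) * p**j / j)."""
    gamma = 1.0 / m
    exponent = sum(
        n ** (1.0 - gamma * j) / (1.0 - gamma * j) * p ** j / j for j in range(1, m)
    )
    return -(p ** m / m) * math.log(n) - exponent


def ratio_integer_case(n: int, p: float, m: int) -> float:
    """P{X = n} over the leading factor of (13); the factor p / n**gamma cancels."""
    return math.exp(log_survival(n, p, 1.0 / m) - log_leading_factor(n, p, m))


p, m = 0.5, 2  # illustrative values, gamma = 1/2
ratios = {n: ratio_integer_case(n, p, m) for n in (10**3, 10**4, 10**5)}
for n, value in ratios.items():
    print(n, value)  # should stabilize near a constant playing the role of C_3
```

Dropping the factor `n**(-p**m / m)` from the comparison would make the quotient drift like a power of *n*, which is one way to see why the logarithmic term in (12) matters.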

Now we see that the asymptotic behavior of the probability ℙ{*X* = *n*} in the case (A) is given by (11) and (13). From the relations (11) and (13) it follows that

$$\mathbb{P}\{X=\infty\} = \lim\_{n\to\infty} \prod\_{k=1}^{n-1} (1 - p/k^{\gamma}) = 0,$$

so that *X* is a proper random variable.

Denote by

$$b\_m = \prod\_{k=1}^{m-1} (1 - p/k^{\gamma}).$$

For the distribution tail *T*<sub>*m*</sub> we have

$$T\_m = \sum\_{n=m}^{\infty} \mathbb{P}\{X = n\} = (b\_m - b\_{m+1}) + \dots + (b\_s - b\_{s+1}) + \dots = b\_m.$$

Particularly,

$$\sum\_{n=1}^{\infty} \mathbb{P}\{X = n\} = 1.$$
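The telescoping argument behind the tail formula rests on the identity ℙ{*X* = *n*} = *b*<sub>*n*</sub> − *b*<sub>*n*+1</sub>, and it can be checked directly on a finite range. The Python sketch below is an illustration; the helper name, the cutoff, and the parameter values are assumptions.

```python
from typing import List


def survival_products(n_max: int, p: float, gamma: float) -> List[float]:
    """Return b with b[n] = prod_{k=1}^{n-1} (1 - p / k**gamma), n = 1..n_max; b[0] is unused."""
    b = [0.0, 1.0]  # b[1] is the empty product
    for k in range(1, n_max):
        b.append(b[-1] * (1.0 - p / k ** gamma))
    return b


p, gamma = 0.5, 0.5  # illustrative values from case (A)
n_cut, m = 2000, 10
b = survival_products(n_cut + 1, p, gamma)
pmf = [0.0] + [p / n ** gamma * b[n] for n in range(1, n_cut + 1)]  # pmf[n] = P{X = n}

partial_tail = sum(pmf[m:n_cut + 1])  # sum of P{X = n} for n = m..n_cut
telescoped = b[m] - b[n_cut + 1]      # what the telescoping sum predicts
total_mass = sum(pmf[1:])

print(partial_tail, telescoped, total_mass)
```

Since *b*<sub>*n*</sub> tends to zero in case (A), the truncated tail is essentially *b*<sub>*m*</sub> and the total mass is essentially one.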

If 1/*γ* is not a positive integer, then

$$T\_m = \prod\_{k=1}^{m-1} (1 - p/k^\gamma) \sim C\_4 \cdot \exp\left\{ -\sum\_{j=1}^{\lfloor 1/\gamma \rfloor} \frac{m^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j} \right\} \quad \text{as} \quad m \to \infty,\tag{14}$$

where *C*<sub>4</sub> depends on *p* and *γ*. Similarly, for the case of integer 1/*γ*,

$$T\_m \sim C\_5 \cdot \frac{p}{m^{p^{\lfloor 1/\gamma \rfloor}/\lfloor 1/\gamma \rfloor}} \cdot \exp\left\{ -\sum\_{j=1}^{\lfloor 1/\gamma \rfloor - 1} \frac{m^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j} \right\} \text{ as } m \to \infty. \tag{15}$$

Let us consider the case (B). We have

$$\mathbb{P}\{X=n\} = \frac{p}{n^{\gamma}} \cdot \prod\_{k=1}^{n-1} (1 - \frac{p}{k^{\gamma}}),\tag{16}$$

where *γ* > 1. Transform the product in the right-hand-side:

$$b\_n = \prod\_{k=1}^{n-1} (1 - \frac{p}{k^{\gamma}}) = \exp\left\{ \sum\_{k=1}^{n-1} \log(1 - p/k^{\gamma}) \right\},$$

$$=\exp\left\{-\sum\_{j=1}^{\infty}\sum\_{k=1}^{n-1}p^{j}/(jk^{\gamma j})\right\}=\exp\left\{-\sum\_{k=1}^{n-1}\sum\_{j=1}^{\infty}p^{j}/(jk^{\gamma j})\right\}$$

$$=\exp\left\{-\sum\_{k=1}^{n-1}p/(k^{\gamma}-p)\right\} \underset{n\to\infty}{\longrightarrow} \exp\left\{-\sum\_{k=1}^{\infty}p/(k^{\gamma}-p)\right\}.$$

The series under the exponential sign converges because *γ* > 1. From the last relation we see that

$$\mathbb{P}\{X=\infty\} = \exp\left\{-\sum\_{k=1}^{\infty} p/(k^\gamma - p)\right\} > 0,\tag{17}$$

and *X* is an improper random variable.

Therefore, for conditional probabilities we have

$$\mathbb{P}\{X = n | X < \infty\} \sim C\_6 \frac{p}{n^{\gamma}} \quad \text{as} \quad n \to \infty,\tag{18}$$

where *C*<sub>6</sub> depends on *p* and *γ* only.
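The behavior in case (B) can be illustrated by computing the partial products *b*<sub>*n*</sub> directly: they level off at a positive value, and the conditional probabilities multiplied by *n*<sup>*γ*</sup>/*p* level off as in (18). The sketch below is an illustration under assumed names and parameter values; the mass at infinity is only estimated by a long partial product.

```python
from typing import List


def survival_products(n_max: int, p: float, gamma: float) -> List[float]:
    """Return b with b[n] = prod_{k=1}^{n-1} (1 - p / k**gamma), n = 1..n_max; b[0] is unused."""
    b = [0.0, 1.0]  # b[1] is the empty product
    for k in range(1, n_max):
        b.append(b[-1] * (1.0 - p / k ** gamma))
    return b


p, gamma = 0.5, 2.0  # illustrative values from case (B)
n_max = 100_000
b = survival_products(n_max, p, gamma)
prob_never = b[n_max]  # partial product used as an estimate of P{X = infinity}


def scaled_conditional_pmf(n: int) -> float:
    """n**gamma / p times P{X = n | X < infinity}, which should level off by (18)."""
    pmf_n = p / n ** gamma * b[n]
    return n ** gamma / p * pmf_n / (1.0 - prob_never)


print(prob_never)
for n in (10, 100, 1000, 10_000):
    print(n, scaled_conditional_pmf(n))
```

In this sketch the limiting value of the scaled conditional probabilities is the estimated ratio `prob_never / (1 - prob_never)`, which plays the role of the constant in (18).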

Summarizing, we obtain the following theorem.

**Theorem 1.** *For the considered experiment scheme with probabilities given in (5) the following statements are true:*

• *If* 0 < *γ* < 1 *and* 1/*γ is not an integer then*

$$\mathbb{P}\{X=n\} \sim C\_2 \cdot \frac{p}{n^{\gamma}} \cdot \exp\left\{-\sum\_{j=1}^{\lfloor 1/\gamma \rfloor} \frac{n^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j}\right\} \text{ as } n \to \infty. \tag{19}$$

• *If* 0 < *γ* < 1 *and* 1/*γ is a positive integer then*

$$\mathbb{P}\{X=n\} \sim C\_3 \cdot \frac{p}{n^{\gamma+p^{[1/\gamma]}/[1/\gamma]}} \cdot \exp\left\{-\sum\_{j=1}^{[1/\gamma]-1} \frac{n^{1-\gamma j}}{1-\gamma j} \cdot \frac{p^j}{j}\right\} \text{ as } n \to \infty. \tag{20}$$

• *If γ* = 1 *then*

$$\mathbb{P}\{X=n\} \sim p/(n^{p+1}\Gamma(1-p)), \quad n \to \infty. \tag{21}$$

• *If γ* > 1 *then*

$$\mathbb{P}\{X = n | X < \infty\} \sim C\_6 \frac{p}{n^{\gamma}} \quad \text{as} \quad n \to \infty,\tag{22}$$

*and*

$$\mathbb{P}\{X=\infty\} = \exp\left\{-\sum\_{k=1}^{\infty} p/(k^{\gamma}-p)\right\} > 0.\tag{23}$$

*All constants C*<sub>1</sub>, ..., *C*<sub>6</sub> *depend on the parameters p and γ only.*

One of the reviewers of the first version of the paper advised us to study the form of the constants in some particular cases. We are very grateful to him for this advice. Below we consider the case *γ* ∈ (1/2, 1). In this case [1/*γ*] = 1, so that the sum under the exponential sign in (19) contains only one summand. Calculations similar to those given above lead to the following expression:

$$\mathbb{P}\{X=n\} = \frac{p}{n^{\gamma}} \exp\left\{ -\frac{p}{1-\gamma}n^{1-\gamma} - \sum\_{k=1}^{\infty} \frac{p^k}{k} \zeta(k\gamma) + o(1) \right\}.$$

In other words, the constant *C*<sub>2</sub> has the form

$$C\_2 = \exp\left\{-\sum\_{k=1}^{\infty} \frac{p^k}{k} \zeta(k\gamma)\right\} > 0.$$
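The steps leading to this constant can be sketched as follows, under the assumption that the calculations referred to are the expansion (6) combined with the representation (9); here *b*<sub>*n*</sub> denotes the product in (5). For 1/2 < *γ* < 1 only the term *j* = 1 has *γj* < 1, so we separate it:

$$-\log b\_n = p \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma}} + \sum\_{j=2}^{\infty} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}}, \qquad b\_n = \prod\_{k=1}^{n-1} \left(1 - \frac{p}{k^{\gamma}}\right).$$

By (9) with *j* = 1, the first term equals

$$p \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma}} = \frac{p}{1-\gamma} n^{1-\gamma} + p\,\zeta(\gamma) + o(1).$$

For *j* ≥ 2 we have *γj* > 1, so the inner sums increase to *ζ*(*γj*) ≤ *ζ*(2*γ*) as *n* → ∞, and by monotone convergence

$$\sum\_{j=2}^{\infty} \frac{p^j}{j} \sum\_{k=1}^{n-1} \frac{1}{k^{\gamma j}} \longrightarrow \sum\_{j=2}^{\infty} \frac{p^j}{j} \zeta(\gamma j) \le \zeta(2\gamma) \sum\_{j=2}^{\infty} \frac{p^j}{j} < \infty.$$

Adding the two parts and multiplying *b*<sub>*n*</sub> by *p*/*n*<sup>*γ*</sup> gives the displayed expression for ℙ{*X* = *n*}, with the *n*-independent terms collected into *C*<sub>2</sub>.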

However, precise calculation of all the other constants is rather difficult. We do not need these constants for the aims of this paper and therefore omit any further calculations of constants.
