**3. Estimation Methods**

The estimation of the unknown parameters of a distribution is critical to accurately characterizing its behaviour. Here, we use classical methods of estimation, namely maximum likelihood estimation (MLE) and weighted least squares (WLS) estimation, for this purpose.

#### *3.1. Maximum Likelihood Estimation*

Let *X*1, *X*2, ... , *Xn* be a random sample taken from the DPsL (*θ*, *β*) distribution, and *x*1, *x*2, ..., *xn* be observations of this random sample. The likelihood function is given by:

$$\mathcal{L} = \left(\frac{1}{\beta}\right)^n \left\{ \prod\_{i=1}^n \left[ (\beta + \theta x\_i) e^{-\theta x\_i} - (\beta + \theta(x\_i + 1)) e^{-\theta(x\_i + 1)} \right] \right\}$$

and the log likelihood function is given by:

$$\log \mathcal{L} = -n \log \beta + \sum\_{i=1}^{n} \log \left[ (\beta + \theta x\_i) e^{-\theta x\_i} - (\beta + \theta (x\_i + 1)) e^{-\theta (x\_i + 1)} \right].$$

Then, the maximum likelihood estimates (MLEs) of *θ* and *β* are obtained by maximizing $\mathcal{L}$ or $\log \mathcal{L}$ with respect to these parameters. They can also be determined as the solutions of the normal equations given by:

$$\frac{\partial \log \mathcal{L}}{\partial \theta} = 0 \implies \sum\_{i=1}^{n} \frac{e^{-\theta(2x\_{i}+1)} \left[e^{\theta x\_{i}} (x\_{i}+1)(\theta x\_{i}+\theta + \beta - 1) - e^{\theta(x\_{i}+1)} x\_{i}(\theta x\_{i}+\beta - 1)\right]}{(\beta + \theta x\_{i})e^{-\theta x\_{i}} - (\beta + \theta(x\_{i}+1))e^{-\theta(x\_{i}+1)}} = 0 \tag{9}$$

and

$$\frac{\partial \log \mathcal{L}}{\partial \beta} = 0 \implies -\frac{n}{\beta} + \sum\_{i=1}^{n} \frac{e^{-\theta x\_{i}} - e^{-\theta(x\_{i} + 1)}}{(\beta + \theta x\_{i})e^{-\theta x\_{i}} - (\beta + \theta(x\_{i} + 1))e^{-\theta(x\_{i} + 1)}} = 0. \tag{10}$$

Equations (9) and (10) can be solved by numerical optimization techniques using mathematical software such as MATHEMATICA, MATHCAD and R.
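Equivalently, one can maximize the log-likelihood directly by numerical search. The following is a minimal Python sketch (the paper's computations use Mathematica, Mathcad or R); the function names `dpsl_pmf`, `neg_log_lik` and `mle_grid` are hypothetical, and a crude grid search stands in for a proper optimizer:

```python
import numpy as np

def dpsl_pmf(x, theta, beta):
    """PMF of the DPsL(theta, beta) distribution, x = 0, 1, 2, ..."""
    x = np.asarray(x, dtype=float)
    return ((beta + theta * x) * np.exp(-theta * x)
            - (beta + theta * (x + 1)) * np.exp(-theta * (x + 1))) / beta

def neg_log_lik(params, data):
    """Negative log-likelihood of a DPsL sample (to be minimized)."""
    theta, beta = params
    if theta <= 0 or beta <= 0:
        return np.inf
    p = dpsl_pmf(data, theta, beta)
    if np.any(p <= 0):  # invalid parameter region for this sample
        return np.inf
    return -np.sum(np.log(p))

def mle_grid(data, thetas, betas):
    """Crude grid search for the MLE over a (theta, beta) grid."""
    best, best_val = None, np.inf
    for th in thetas:
        for be in betas:
            val = neg_log_lik((th, be), data)
            if val < best_val:
                best, best_val = (th, be), val
    return best
```

In practice the grid point returned by `mle_grid` would be refined with a gradient-based optimizer, but the sketch suffices to show the structure of the likelihood.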

#### *3.2. Weighted Least Squares Estimation*

Let *X*(1), *X*(2), ..., *X*(*n*) be the order statistics of a random sample taken from the DPsL (*θ*, *β*) distribution, and *x*(1), *x*(2), ... , *x*(*n*) be observations of these random variables. The weighted least squares estimates (WLSEs) of the parameters *θ* and *β* of the DPsL distribution are obtained by minimizing the following function with respect to *θ* and *β*:

$$\mathcal{W} = \sum\_{i=1}^{n} \frac{(n+1)^2 (n+2)}{i(n-i+1)} \left[ F\_{\text{DPsL}} \left( x\_{(i)}; \theta, \beta \right) - \frac{i}{n+1} \right]^2.$$
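Since the DPsL pmf telescopes, the CDF has the closed form $F(x) = 1 - (\beta + \theta(x+1))e^{-\theta(x+1)}/\beta$, which makes the criterion $\mathcal{W}$ straightforward to evaluate. A minimal Python sketch follows (the function names `dpsl_cdf` and `wls_objective` are hypothetical); the WLSEs are obtained by minimizing `wls_objective` numerically:

```python
import numpy as np

def dpsl_cdf(x, theta, beta):
    """CDF of DPsL(theta, beta); the pmf telescopes, giving a closed form."""
    x = np.asarray(x, dtype=float)
    return 1.0 - (beta + theta * (x + 1)) * np.exp(-theta * (x + 1)) / beta

def wls_objective(params, sorted_data):
    """Weighted least-squares criterion W(theta, beta) for an ordered sample."""
    theta, beta = params
    n = len(sorted_data)
    i = np.arange(1, n + 1)
    # weights (n+1)^2 (n+2) / (i (n - i + 1)) from the WLS criterion
    w = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
    F = dpsl_cdf(sorted_data, theta, beta)
    return np.sum(w * (F - i / (n + 1)) ** 2)
```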

#### *3.3. Simulation Study*

This section examines, via simulation, the efficiency of the two estimation methods for estimating the parameters of the DPsL distribution. We generated *N* = 1000 samples from the DPsL distribution, using the methods discussed in Section 2.7, for two sets of parameter values ((*θ* = 0.5, *β* = 1) and (*θ* = 2.2, *β* = 1.5)) and various sample sizes (25, 50, 75, 100), and computed the estimates under both methods for comparison. The values of the estimates, mean square errors (MSEs), average absolute biases (Bias) and mean relative errors (MREs) were calculated in R software using the following formulas:

$$\text{MSE} = \frac{1}{N} \sum\_{i=1}^{N} (\hat{\zeta}\_i - \zeta)^2, \quad \text{Bias} = \frac{1}{N} \sum\_{i=1}^{N} |\hat{\zeta}\_i - \zeta|, \quad \text{MRE} = \frac{1}{N} \sum\_{i=1}^{N} \frac{|\hat{\zeta}\_i - \zeta|}{\zeta},$$

where *ζ* = *θ* or *β*, and *ζ̂i* denotes the estimate of *ζ* obtained from the *i*th sample. Simulation results, including values of estimates, Bias, MSEs and MREs for the two parameters *θ* and *β* of the DPsL distribution under the two estimation approaches, are reported in Tables 6 and 7.
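The three performance measures can be computed directly from the replicate estimates. A minimal Python sketch (the study itself was carried out in R; the helper name `simulation_metrics` is hypothetical):

```python
import numpy as np

def simulation_metrics(estimates, true_value):
    """Average absolute bias, MSE and MRE over N replicate estimates."""
    est = np.asarray(estimates, dtype=float)
    bias = np.mean(np.abs(est - true_value))          # average absolute bias
    mse = np.mean((est - true_value) ** 2)            # mean square error
    mre = np.mean(np.abs(est - true_value) / true_value)  # mean relative error
    return bias, mse, mre
```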


**Table 6.** Simulation results of our estimation approaches for the DPsL distribution with *θ* = 0.5, *β* = 1.

**Table 7.** Simulation results of our estimation approaches for the DPsL distribution with *θ* = 2.2, *β* = 1.5.


From the above tables, it is clear that, for estimating *θ*, the MLE outperformed the WLSE, whereas for *β*, the WLSE outperformed the MLE.

**4. INAR(1) Process with DPsL Innovations**

Numerous fields, such as agriculture, epidemiology, actuarial science and finance, give rise to time series of counts. The analysis of such datasets with the INAR(1) process was first carried out, using Poisson innovations, by [12,13]. Suppose that the innovations {*εt*}*t*∈<sup>Z</sup> are independent and identically distributed (iid) random variables with mean E(*εt*) = *με* and variance Var(*εt*) = *σε*<sup>2</sup>. A stochastic process {*Xt*}*t*∈<sup>Z</sup> defined as:

$$X\_t = p \circ X\_{t-1} + \varepsilon\_t,$$

with 0 ≤ *p* < 1, is said to be an INAR(1) process. The symbol ◦ denotes the binomial thinning operator, defined as:

$$p \circ X\_{t-1} = \sum\_{j=1}^{X\_{t-1}} U\_j,$$

where {*Uj*}*j*∈<sup>Z</sup> is a sequence of iid Bernoulli random variables with success probability *p*. The one-step transition probability of the INAR(1) process is given by:
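Operationally, the thinning $p \circ X\_{t-1}$ amounts to drawing a Binomial($X\_{t-1}$, *p*) count: each of the $X\_{t-1}$ units survives independently with probability *p*. A minimal Python sketch (the helper name `thinning` is hypothetical):

```python
import numpy as np

def thinning(p, x, rng):
    """Binomial thinning p ∘ x: each of the x counts survives w.p. p."""
    return int(rng.binomial(x, p)) if x > 0 else 0
```

One step of the INAR(1) recursion is then simply `thinning(p, x_prev, rng) + eps_t`.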

$$\Pr(X\_t = k \mid X\_{t-1} = l) = \sum\_{i=0}^{\min(k,l)} \Pr(B = i) \Pr(\varepsilon\_t = k - i), \quad k, l \ge 0,$$

where *B* denotes a random variable following the Binomial (*l*, *p*) distribution. The mean, variance and dispersion index (DI) of {*Xt*}*t*∈<sup>Z</sup>, given in [21], are:

$$\mathrm{E}(X\_t) = \frac{\mu\_\varepsilon}{1 - p}, \tag{11}$$

$$\text{Var}(X\_t) = \frac{p\mu\_\varepsilon + \sigma\_\varepsilon^2}{1 - p^2} \tag{12}$$

and

$$\text{DI}(X\_t) = \frac{\text{DI}\_{\varepsilon} + p}{1 + p},\tag{13}$$

where *με*, *σε*<sup>2</sup> and DI*ε* are the mean, variance and DI of the innovation distribution, respectively. The results of [12,13] motivated us to propose a new INAR(1) process with DPsL innovations, which is capable of modelling over- as well as under-dispersed count datasets. Suppose that {*εt*}*t*∈<sup>Z</sup> follow a DPsL distribution; then, the one-step transition probability of the corresponding process is:

$$\Pr(X\_t = k \mid X\_{t-1} = l) = \sum\_{i=0}^{\min(k,l)} \binom{l}{i} p^i (1-p)^{l-i} \times \frac{(\beta + \theta(k-i))e^{-\theta(k-i)} - (\beta + \theta(k-i+1))e^{-\theta(k-i+1)}}{\beta},$$

which is hereafter called the INAR(1)DPsL process. Substituting *με*, *σε*<sup>2</sup> and DI*ε* in (11)–(13) with (6)–(8) yields the mean, variance and DI of the INAR(1)DPsL process. The conditional expectation and variance of the INAR(1)DPsL process are given by:

$$\mathrm{E}(X\_t \mid X\_{t-1}) = pX\_{t-1} + \mu\_{\varepsilon}, \tag{14}$$

$$\text{Var}(X\_t \mid X\_{t-1}) = p(1-p)X\_{t-1} + \sigma\_\varepsilon^2, \tag{15}$$

respectively, where *με* and *σε*<sup>2</sup> are given in (6) and (7) (see [13,21]).
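A realization of the INAR(1)DPsL process can be simulated by combining binomial thinning with DPsL innovations drawn by inverse-transform sampling from the closed-form CDF $F(x) = 1 - (\beta + \theta(x+1))e^{-\theta(x+1)}/\beta$ (obtained by telescoping the pmf). A minimal Python sketch, with hypothetical function names:

```python
import numpy as np

def dpsl_sample(n, theta, beta, rng, x_max=1000):
    """Inverse-transform sampling from DPsL(theta, beta):
    X = min{x : F(x) >= U}, with the CDF tabulated up to x_max."""
    xs = np.arange(x_max + 1)
    cdf = 1.0 - (beta + theta * (xs + 1)) * np.exp(-theta * (xs + 1)) / beta
    u = rng.random(n)
    return np.searchsorted(cdf, u)

def inar1_dpsl_path(T, p, theta, beta, rng):
    """Simulate X_t = p ∘ X_{t-1} + eps_t with DPsL(theta, beta) innovations."""
    eps = dpsl_sample(T, theta, beta, rng)
    x = np.empty(T, dtype=int)
    x[0] = eps[0]
    for t in range(1, T):
        survivors = rng.binomial(x[t - 1], p) if x[t - 1] > 0 else 0
        x[t] = survivors + eps[t]
    return x
```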

#### *4.1. Estimation*

Here, the inference of the INAR(1)DPsL process was examined using two estimation methods: the conditional maximum likelihood (CML) and Yule–Walker (YW) methods. A simulation study was performed to assess the efficiency of the two methods.

#### 4.1.1. Conditional Maximum Likelihood

Let *X*1, *X*2, ... , *XT* be a random sample taken from the INAR(1)DPsL process, and *x*1, *x*2,..., *xT* be observations of this random sample. Then, the conditional log likelihood function of the INAR(1)DPsL process is given by:

$$\begin{split} \ell(\Theta) &= \sum\_{t=2}^{T} \log[\Pr(X\_t = x\_t \mid X\_{t-1} = x\_{t-1})] \\ &= \sum\_{t=2}^{T} \log \left[ \sum\_{i=0}^{\min(x\_t, x\_{t-1})} \binom{x\_{t-1}}{i} p^i (1-p)^{x\_{t-1}-i} \right. \\ &\qquad \left. \times \frac{(\beta + \theta(x\_t - i)) e^{-\theta(x\_t - i)} - (\beta + \theta(x\_t - i + 1)) e^{-\theta(x\_t - i + 1)}}{\beta} \right], \end{split} \tag{16}$$

where Θ = (*θ*, *β*, *p*) is the vector of unknown parameters to be estimated. Maximizing (16) with respect to Θ yields the CML estimates (CMLEs). In this regard, we used the optim function in R for this purpose. In addition, the fdHess function in R was used to obtain the observed information matrix and, hence, the standard errors (SEs) of the parameter estimates of the INAR(1)DPsL process.
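For illustration, the conditional log-likelihood (16) can be evaluated directly and handed to a generic numerical optimizer (such as optim in R). A minimal Python sketch with hypothetical function names; the inner sum is taken from *i* = 0 so that the no-survivor case of the thinning is included:

```python
import math
import numpy as np

def dpsl_pmf_scalar(x, theta, beta):
    """PMF of DPsL(theta, beta) at a single integer x >= 0."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def cond_log_lik(params, x):
    """Conditional log-likelihood of an observed INAR(1)DPsL series x."""
    theta, beta, p = params
    if theta <= 0 or beta <= 0 or not 0 <= p < 1:
        return -np.inf
    ll = 0.0
    for t in range(1, len(x)):
        xt, xp = x[t], x[t - 1]
        # one-step transition probability: thinning survivors i + innovation
        prob = sum(math.comb(xp, i) * p**i * (1 - p)**(xp - i)
                   * dpsl_pmf_scalar(xt - i, theta, beta)
                   for i in range(min(xt, xp) + 1))
        if prob <= 0:
            return -np.inf
        ll += math.log(prob)
    return ll
```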

#### 4.1.2. Yule–Walker

The YW estimates (YWEs) of the parameters of the INAR(1)DPsL process were computed by solving simultaneous equations of sample and theoretical moments. Since the autocorrelation function (ACF) of the INAR(1) process at lag *h* is *ρX*(*h*) = *p*<sup>*h*</sup>, the YWE of *p* is given by:

$$\hat{p}\_{YW} = \frac{\sum\_{t=2}^{T} (x\_t - \bar{x})(x\_{t-1} - \bar{x})}{\sum\_{t=1}^{T} (x\_t - \bar{x})^2}.$$

Now, the YWEs of *θ* and *β* were obtained by equating the sample mean to the theoretical mean and the sample dispersion index to the theoretical dispersion index of the process. Denoting by *θ̂YW* and *β̂YW* the YWEs of *θ* and *β*, respectively, the following relationship holds:

$$\hat{\beta}\_{YW} = \frac{\hat{\theta}\_{YW} e^{\hat{\theta}\_{YW}}}{\bar{x} (1 - \hat{p}\_{YW}) (e^{\hat{\theta}\_{YW}} - 1)^2 - (e^{\hat{\theta}\_{YW}} - 1)},\tag{17}$$

where $\bar{x} = \sum\_{t=1}^{T} x\_t / T$ is the sample mean. Substituting *β̂YW* from (17) into (13) and equating (13) to the sample dispersion index, we obtained *θ̂YW*.
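The YWE of *p* is simply the lag-1 sample autocorrelation. A minimal Python sketch (the helper name `yw_p_hat` is hypothetical):

```python
import numpy as np

def yw_p_hat(x):
    """Yule–Walker estimate of p: lag-1 sample autocorrelation of the series."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    num = np.sum((x[1:] - xbar) * (x[:-1] - xbar))  # lag-1 autocovariance sum
    den = np.sum((x - xbar) ** 2)                   # variance sum
    return num / den
```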

#### *4.2. Simulation of INAR(1)DPsL Process*

Here, a simulation study was conducted to assess the performance of the CMLEs and YWEs of the parameters of the INAR(1)DPsL process. In this regard, we generated *N* = 1000 samples, each of sizes *n* = 25, 50, 100, from the proposed process for two sets of parameter values ((*θ* = 0.1, *β* = 1.1) and (*θ* = 3, *β* = 4)). For each *n*, the average absolute bias, MSE and MRE of the parameter estimates were calculated for the two methods. The simulation results are presented in Table 8.


**Table 8.** Simulation results of the INAR(1)DPsL process.

From the above table, we observed that the average absolute biases, MSEs and MREs of the CMLEs tended to zero more quickly than those of the YWEs, making the CMLEs efficient for small as well as large sample sizes. Therefore, CML estimation is preferred for estimating the unknown parameters of the INAR(1)DPsL process.
