*5.3. Numbers of Claims*

In this part, the performance of the INAR(1)DPsL process was compared with that of the INAR(1)DTPL (see [7]), INAR(1)NPWE (see [16]), INAR(1)DPLi (see [15]) and INAR(1)G (see [14]) processes. The one-step transition probabilities of the competing INAR(1) processes are given as follows:

1. For the INAR(1)DPLi process:

$$\Pr(X\_t = k \mid X\_{t-1} = l) = \sum\_{i=0}^{\min(k,l)} \binom{l}{i} p^i (1-p)^{l-i} \frac{\theta^2 (k - i + \theta + 2)}{(\theta + 1)^{k - i + 3}}, \ \theta > 0.$$

2. For the INAR(1)DTPL process:

$$\begin{aligned} \Pr(X\_t = k \mid X\_{t-1} = l) &= \sum\_{i=0}^{\min(k,l)} \binom{l}{i} p^i (1-p)^{l-i} \\ &\quad \times \frac{\lambda^{k-i} \{\beta(\lambda(\log(\lambda)-1)+1) + (\lambda-1)\log(\lambda)(a+\beta(k-i))\}}{\beta-a\log(\lambda)}, \\ &\quad 0 < \lambda < 1, \ a\theta + \beta > 0, \ \theta = -\log(\lambda). \end{aligned}$$

3. For the INAR(1)NPWE process:

$$\begin{aligned} \Pr(X\_t = k \mid X\_{t-1} = l) &= \sum\_{i=0}^{\min(k,l)} \binom{l}{i} p^i (1-p)^{l-i} \alpha (1+\theta) (1+\alpha+\alpha\theta)^{-(k-i)-1}, \\ &\quad \alpha > 0, \ \theta > 0. \end{aligned}$$

4. For the INAR(1)G process:

$$\begin{aligned} \Pr(X\_t = k \mid X\_{t-1} = l) &= \sum\_{i=0}^{\min(k,l)} \binom{l}{i} p^i (1-p)^{l-i} \left[ \alpha (1-\alpha)^{k-i} \right], \\ &\quad 0 < \alpha < 1. \end{aligned}$$
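All four transition probabilities share the same structure: a binomial thinning term for the survivors of the previous count, convolved with the innovation probability mass function evaluated at the remainder. The following Python sketch illustrates this structure for the geometric and DPLi innovations listed above. The function names and the numerical parameter values are illustrative assumptions and are not taken from the paper.

```python
from math import comb


def geometric_pmf(x, alpha):
    """Innovation pmf of the INAR(1)G process: alpha * (1 - alpha)**x."""
    return alpha * (1.0 - alpha) ** x if x >= 0 else 0.0


def dpli_pmf(x, theta):
    """Innovation pmf of the INAR(1)DPLi process, as written in item 1."""
    if x < 0:
        return 0.0
    return theta**2 * (x + theta + 2.0) / (theta + 1.0) ** (x + 3)


def transition_prob(k, l, p, innov_pmf):
    """Pr(X_t = k | X_{t-1} = l) under binomial thinning with parameter p.

    i counts the survivors of the l previous units; the innovation
    must then contribute the remaining k - i units.
    """
    total = 0.0
    for i in range(min(k, l) + 1):
        survivors = comb(l, i) * p**i * (1.0 - p) ** (l - i)
        total += survivors * innov_pmf(k - i)
    return total


# Illustrative (hypothetical) parameter values, not estimates from the paper.
p, alpha, theta = 0.4, 0.3, 1.5

# Each row of the transition matrix should sum to one (up to truncation).
row_g = sum(transition_prob(k, 3, p, lambda x: geometric_pmf(x, alpha))
            for k in range(400))
row_dpli = sum(transition_prob(k, 3, p, lambda x: dpli_pmf(x, theta))
               for k in range(400))
print(row_g, row_dpli)
```

Passing the innovation pmf as an argument keeps the thinning part unchanged when a different innovation distribution is substituted.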

The third dataset was used to illustrate the application of the DPsL distribution in the INAR(1) process. The data, originally studied by [25], consist of 67 monthly counts of claims for short-term disability benefits made by injured workers to the B.C. Workers' Compensation Board (WCB). These data were reported from the BC Center, Richmond, for the 10-year period from 1985 to 1994. The mean, variance and DI of the dataset were 8.6042, 11.2392 and 1.3062, respectively. To check whether the data had statistically significant over-dispersion, the hypothesis test proposed by [26] was applied. The value of the test statistic was 51.971 with a *p*-value less than 0.001, which showed that the data had significant over-dispersion. Figure 6 displays the autocorrelation function (ACF), partial ACF (PACF), histogram and time series plots. In the PACF plot, only the first lag was significant, which indicated that these data could be modelled by an INAR(1) process.

**Figure 6.** PACF, ACF, histogram and time series plot for the number of claims dataset.
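The DI reported above is the ratio of the variance to the mean, with values above one indicating over-dispersion. As a minimal sketch, the following Python snippet checks the reported ratio and shows how the DI could be computed from raw counts. The helper name `dispersion_index` and the toy counts are hypothetical, and the use of the sample variance (divisor n - 1) is an assumption, since the text does not state which variance estimator was used.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio; uses the sample variance (assumption)."""
    return variance(counts) / mean(counts)


# Ratio of the summary statistics reported in the text.
reported_di = 11.2392 / 8.6042
print(round(reported_di, 4))

# Toy counts for illustration only; these are NOT the claims data.
toy_counts = [5, 7, 9, 12, 8, 10]
print(dispersion_index(toy_counts))
```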

The parameter estimates, modelling adequacy criteria, theoretical mean, variance and DI of the fitted INAR(1) processes are recorded in Table 13. Since the INAR(1)DPsL process had smaller values of the -L, AIC and BIC statistics than the INAR(1)DTPL, INAR(1)NPWE, INAR(1)DPLi and INAR(1)G processes, the INAR(1)DPsL process provided a better fit than its competitors. Additionally, the DI value obtained for the INAR(1)DPsL process was very near the empirical one. We conclude that the INAR(1)DPsL process explained the characteristics of the dataset well.
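For reference, the information criteria used in this comparison can be computed from the negative log-likelihood as sketched below. The sketch assumes that -L denotes the negative maximised log-likelihood and uses the standard definitions AIC = 2(-L) + 2k and BIC = 2(-L) + k log(n), where k is the number of estimated parameters and n the number of observations. The numerical value of `neg_loglik` is hypothetical and is not an entry of Table 13.

```python
from math import log


def aic(neg_loglik, n_params):
    """Akaike information criterion from the negative log-likelihood."""
    return 2.0 * neg_loglik + 2.0 * n_params


def bic(neg_loglik, n_params, n_obs):
    """Bayesian information criterion from the negative log-likelihood."""
    return 2.0 * neg_loglik + n_params * log(n_obs)


# Hypothetical value for illustration only; not taken from Table 13.
neg_loglik = 100.0
n_params = 3   # thinning parameter p plus two innovation parameters
n_obs = 67     # number of observations stated in the text

print(aic(neg_loglik, n_params), bic(neg_loglik, n_params, n_obs))
```

Smaller values of either criterion indicate a better trade-off between fit and the number of parameters.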


**Table 13.** The estimates and modelling adequacy statistics of the fitted distributions for the number of claims dataset.

A residual analysis was conducted to check whether the fitted INAR(1)DPsL process was adequate. To this end, the Pearson residuals for the INAR(1)DPsL process were calculated through the following formula:

$$r\_t = \frac{x\_t - \operatorname{E}(X\_t \mid X\_{t-1} = x\_{t-1})}{\operatorname{Var}(X\_t \mid X\_{t-1} = x\_{t-1})^{1/2}},$$

where $\operatorname{E}(X\_t \mid X\_{t-1} = x\_{t-1})$ and $\operatorname{Var}(X\_t \mid X\_{t-1} = x\_{t-1})$ were derived from (14) and (15), respectively. When the fitted INAR(1) process is statistically valid, the Pearson residuals should be uncorrelated with zero mean and unit variance [27]. Here, the mean and variance of the Pearson residuals of the INAR(1)DPsL process were 0.035 and 0.967, respectively, which are very close to the desired values. According to the results of [28], the INAR(1)DPsL process for the data was

$$X\_t = 0.5620 \circ X\_{t-1} + \varepsilon\_t,$$

where the innovations $\varepsilon\_t$ follow the DPsL(0.4835, 1.9214) distribution. The predicted values of the monthly number of claims and the ACF plot of the Pearson residuals under this process are displayed in Figure 7.

Based on this figure, the ACF plot of the Pearson residuals showed no evidence of autocorrelation in the residuals.
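The residual check described above can be sketched in Python as follows. The sketch assumes the standard conditional moments of an INAR(1) process with binomial thinning, namely a conditional mean of p x_{t-1} + mu_eps and a conditional variance of p(1 - p) x_{t-1} + var_eps, where mu_eps and var_eps are the innovation mean and variance; these are assumed to agree with (14) and (15), which are not reproduced in this section. The function names and the toy series are hypothetical.

```python
from statistics import mean


def pearson_residuals(x, p, mu_eps, var_eps):
    """Pearson residuals of an INAR(1) fit with binomial thinning.

    mu_eps and var_eps are the mean and variance of the innovation
    distribution (supplied by the caller; assumption of this sketch).
    """
    residuals = []
    for prev, cur in zip(x[:-1], x[1:]):
        cond_mean = p * prev + mu_eps
        cond_var = p * (1.0 - p) * prev + var_eps
        residuals.append((cur - cond_mean) / cond_var**0.5)
    return residuals


def lag1_autocorrelation(r):
    """Sample lag-1 autocorrelation of the residual series."""
    m = mean(r)
    num = sum((a - m) * (b - m) for a, b in zip(r[:-1], r[1:]))
    den = sum((a - m) ** 2 for a in r)
    return num / den


# Toy series and parameter values for illustration only.
toy_series = [3, 5, 4, 6, 2, 4]
res = pearson_residuals(toy_series, p=0.5, mu_eps=2.0, var_eps=1.0)
print(mean(res), lag1_autocorrelation(res))
```

For an adequate fit, the residual mean should be near zero, the residual variance near one, and the sample autocorrelations near zero at all lags.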

**Figure 7.** The predicted values of the number of claims dataset (**left**) and the ACF plot of the Pearson residuals (**right**).
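The fitted recursion can also be simulated directly, since binomial thinning amounts to keeping each of the previous units independently with probability p. The following Python sketch illustrates the mechanism with the fitted thinning parameter 0.5620. Because the DPsL probability mass function is not reproduced in this section, a geometric innovation sampler is used purely as a stand-in; the function names, the value of `alpha`, the starting value and the seed are all assumptions made for illustration.

```python
import random


def binomial_thinning(x, p, rng):
    """p o x: each of the x units survives independently with probability p."""
    return sum(1 for _ in range(x) if rng.random() < p)


def geometric_innovation(rng, alpha=0.3):
    """Stand-in innovation sampler (NOT the DPsL distribution).

    Returns the number of failures before the first success, so that
    P(eps = x) = alpha * (1 - alpha)**x.
    """
    count = 0
    while rng.random() >= alpha:
        count += 1
    return count


def simulate_inar1(n, p, innovation, x0=0, seed=0):
    """Simulate n values of X_t = p o X_{t-1} + eps_t, starting at x0."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n - 1):
        path.append(binomial_thinning(path[-1], p, rng) + innovation(rng))
    return path


path = simulate_inar1(67, 0.5620, geometric_innovation, x0=8)
print(path[:10])
```

Replacing `geometric_innovation` with a sampler for the DPsL distribution would give paths from the fitted model itself.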

#### **6. Concluding Remarks**

In this paper, a two-parameter discrete distribution, namely the discrete Pseudo Lindley (DPsL) distribution, was proposed. Its primary motivation is the ability to model phenomena with under- and over-dispersed observed values. Various statistical properties, almost all of which have a closed form, revealed the flexibility and simplicity of the distribution. The unknown parameters were estimated using two different methods, and an extensive simulation study was conducted to examine the finite-sample performance of these methods. Crucially, a new INAR(1) process with DPsL innovations was developed and studied in detail. Three real-life datasets were considered to demonstrate the usefulness of the proposed distribution. As future work, other discretization methods for the PsL distribution could be considered, which might provide better properties than the survival discretization method. Furthermore, the distribution could be extended to bivariate models. We hope that the DPsL distribution, together with the related modelling strategy, will be an interesting alternative for modelling count data, especially over-dispersed count data.

**Author Contributions:** Conceptualization, M.R.I. and R.M.; methodology, M.R.I., C.C., V.D. and R.M.; software, V.D.; validation, M.R.I., C.C., V.D. and R.M.; investigation, M.R.I., C.C., V.D. and R.M.; data curation, V.D.; writing—original draft preparation, V.D.; writing—review and editing, M.R.I., C.C., V.D. and R.M.; visualization, M.R.I., C.C., V.D. and R.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** We are grateful to the three reviewers for their helpful suggestions on the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

