Article

Asymmetric Probability Mass Function for Count Data Based on the Binomial Technique: Synthesis and Analysis with Inference

by
Afrah Al-Bossly
1,* and
Mohamed S. Eliwa
2,3
1
Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
2
Department of Statistics and Operation Research, College of Science, Qassim University, Buraydah 51482, Saudi Arabia
3
Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
*
Author to whom correspondence should be addressed.
Symmetry 2022, 14(4), 826; https://doi.org/10.3390/sym14040826
Submission received: 22 March 2022 / Revised: 4 April 2022 / Accepted: 11 April 2022 / Published: 15 April 2022
(This article belongs to the Section Mathematics)

Abstract

In this article, a new probability mass function for count data is proposed based on the binomial technique. After introducing the methodology of the new model, some of its distributional characteristics are discussed in detail. It is found that the new model has explicit mathematical expressions for its statistical and reliability properties, which is not the case with many well-known discrete models. Moreover, it can be used as an effective probability tool for modeling asymmetric over-dispersed data with leptokurtic shapes. The parameters are estimated using the maximum likelihood technique (the classical point of view) and Bayesian approaches. An MCMC simulation study is carried out to examine the performance of the estimators. Finally, two distinct real data sets are analyzed to illustrate the flexibility and notability of the new model.
MSC:
60E05; 62E10; 62F10

1. Introduction

In today’s competitive era, the data generated in fields such as engineering, economics, and the medical sciences are becoming more complex day by day. As a result, modeling such data requires distributions that are well suited to analytical studies of multidimensional and complex data. For these reasons, over the past three decades, the development of new probability distributions has become central to statistical research. A large part of this research has been devoted to continuous probability distributions. However, there are situations where discrete distributions are more appropriate for data modeling or where the generated data are naturally discrete; for example, in the field of reliability analysis, the lifetime of an on/off switching device is a discrete random variable, see [1], among others. These circumstances demand suitable discrete distributions that can adequately model such data. Therefore, in recent years, the derivation of discrete distributions has also received the attention of researchers.
The first contribution in this line of work is due to [2], who gave a discretized version of the continuous Weibull distribution. Afterward, ref. [3] obtained the discrete inverse Weibull distribution. The paper by [4] provided an excellent review of the development of discrete distributions until 2014. Since then, many important discrete distributions have appeared in the literature. Ref. [5] derived the discrete generalized Rayleigh distribution, ref. [6] suggested the discrete Weibull geometric distribution, and the discrete additive Perks–Weibull distribution was proposed by [7]. Recently, to fit discrete increasing failure and count data, ref. [8] developed a discrete Perks distribution, ref. [9] introduced a new two-parameter exponentiated discrete Lindley distribution with bathtub-shaped hazard rate characteristics, ref. [10] developed a discrete Gompertz-G family for over- and under-dispersed data, ref. [11] presented a discrete Burr–Hatke distribution with the associated count regression model, ref. [12] discussed the discrete Bilal distribution with properties and applications on integer-valued autoregressive processes, and [13] proposed a new three-parameter discrete Lindley distribution with an associated INAR(1) process, among others.
An important feature of the above-cited articles is that most of them are discretized versions of continuous probability distributions, and they have evolved from the techniques described in [4]. Apart from these methods, another important approach to obtaining discrete models is due to Hu et al. [14]. This approach allows us to generate new discrete distributions by compounding two probability functions. According to this methodology, if $X$ and $M$ are two discrete random variables with probability mass functions (PMFs) $f(x)$ and $w(m)$, respectively, then the PMFs of these two random variables are connected by the binomial decay transformation, i.e.,
$$\Pr[X = x] = \sum_{m=x}^{\infty} \binom{m}{x} p^{x} (1-p)^{m-x} w(m); \quad x = 0, 1, 2, \ldots \tag{1}$$
Here, it is notable that the expression in Equation (1) is a proper PMF with an attenuating coefficient $p \in [0, 1]$. Ref. [14] considered $M$ as a Poisson variate with parameter $\lambda > 0$, and then, using Equation (1), they obtained $\Pr[X = x]$ as a Poisson distribution with parameter $\lambda p > 0$. Ref. [15] derived the uniform Poisson distribution by applying a similar idea to that of [14]. Ref. [16] proposed the uniform-geometric distribution by replacing the binomial distribution with the uniform distribution and setting $w(m)$ as the geometric distribution in Equation (1). Recently, Ref. [17] introduced the binomial-discrete Lindley distribution by substituting the discrete Lindley distribution for $w(m)$ in Equation (1). In this paper, the authors propose a new three-parameter binomial new Poisson-weighted exponential (BNPWE) distribution for modeling count data by substituting a new Poisson-weighted exponential (NPWE) distribution for $w(m)$. This procedure has been followed by a very large number of researchers looking for new distributions that capture particular properties of particular data sets. The abundance of such proposals has made the procedure commonplace, often with uninteresting results. However, in this paper the procedure is applied to discrete distributions, which is much less frequent, so there is still room for progress here, and this adds to the interest of the present paper. The proposed model is developed by deducing formulas and studying the properties of its parameters. These properties are then used to justify the suitability of the model for certain types of data, in particular for medical applications.
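The Poisson result attributed to [14] can be checked numerically. The following is a minimal sketch, assuming illustrative values $\lambda = 3$ and $p = 0.4$ and a truncation point `m_max` for the infinite sum in Equation (1); the function names are hypothetical and not taken from the cited works.

```python
import math

def poisson_pmf(k, lam):
    """Poisson PMF, computed on the log scale for numerical stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def binomial_decay(x, p, w, m_max=200):
    """Equation (1): sum over m >= x of C(m, x) p^x (1-p)^(m-x) w(m), truncated at m_max."""
    return sum(
        math.comb(m, x) * p**x * (1 - p) ** (m - x) * w(m)
        for m in range(x, m_max + 1)
    )

lam, p = 3.0, 0.4  # illustrative values (assumption)

# Left side: binomial decay transformation of a Poisson(lam) mixing PMF.
thinned = [binomial_decay(x, p, lambda m: poisson_pmf(m, lam)) for x in range(8)]
# Right side: Poisson PMF with parameter lam * p.
direct = [poisson_pmf(x, lam * p) for x in range(8)]

for x, (a, b) in enumerate(zip(thinned, direct)):
    print(x, round(a, 10), round(b, 10))
```

The two columns agree up to floating-point error, which is the sense in which Equation (1) maps a Poisson($\lambda$) mixing distribution to a Poisson($\lambda p$) distribution.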

2. The BNPWE Distribution

The NPWE distribution has recently been introduced by [18]. He obtained this discrete model by compounding Poisson and a new weighted exponential distribution of [19] and showed its applicability in a first-order integer-valued autoregressive process. The survival function (SF) of NPWE distribution can be expressed as
$$S(x; \alpha, \theta) = (1 + \alpha + \alpha\theta)^{-(x+1)}; \quad x \in \mathbb{N}_0, \tag{2}$$
where $\mathbb{N}_0 = \{0, 1, 2, \ldots, \nu\}$ for $0 < \nu < \infty$, $\alpha > 0$, and $\theta > 0$. The PMF corresponding to Equation (2) is
$$p(x; \alpha, \theta) = \alpha(1+\theta)(1 + \alpha + \alpha\theta)^{-(x+1)}; \quad x \in \mathbb{N}_0. \tag{3}$$
The PMF in Equation (1) can be represented as
$$\Pr[X = x] = \sum_{m=x}^{\infty} \Pr(X = x \mid M = m)\, w(m), \tag{4}$$
where $X \mid M = m$ has the binomial$(m, p)$ distribution. Now, let us consider that $X \mid M = m$ follows the binomial$(m, \beta)$ distribution and $w(m)$ has the NPWE distribution given in Equation (3). Then, using Equation (4), the PMF of $X$ can be expressed as
$$\Pr[X = x; \alpha, \beta, \theta] = \frac{\alpha(1+\theta)\beta^{x}}{(\alpha + \beta + \alpha\theta)^{x+1}}, \tag{5}$$
where $x \in \mathbb{N}_0$, $\alpha > 0$, $0 < \beta < 1$, and $\theta > 0$. If a random variable $X$ has the PMF in Equation (5), then it is known as the BNPWE distribution and it is symbolized by $X \sim \mathrm{BNPWE}(\alpha, \beta, \theta)$. The CDF corresponding to Equation (5) is
$$F(x; \alpha, \theta, \beta) = 1 - \left(\frac{\beta}{\alpha + \beta + \alpha\theta}\right)^{x+1}; \quad x \in \mathbb{N}_0. \tag{6}$$
Based on Equation (6), the quantile function can be formulated as
$$x_q = \frac{\ln(1-q)}{\ln\left(\dfrac{\beta}{\alpha + \beta + \alpha\theta}\right)} - 1; \quad 0 < q < 1. \tag{7}$$
Figure 1 shows the PMF plots for various values of the BNPWE parameters.
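As a sanity check on Equations (3) to (7), the sketch below codes the PMF, CDF, and quantile function and compares the closed-form PMF in Equation (5) with a direct (truncated) evaluation of the compounding sum in Equation (4). The function names, the truncation point `m_max`, and the parameter values $(0.5, 0.4, 0.5)$ are illustrative assumptions.

```python
import math

def npwe_pmf(m, alpha, theta):
    """NPWE PMF, Equation (3)."""
    return alpha * (1 + theta) * (1 + alpha + alpha * theta) ** (-(m + 1))

def bnpwe_pmf(x, alpha, beta, theta):
    """BNPWE PMF, Equation (5)."""
    d = alpha + beta + alpha * theta
    return alpha * (1 + theta) * beta**x / d ** (x + 1)

def bnpwe_cdf(x, alpha, beta, theta):
    """BNPWE CDF, Equation (6)."""
    return 1 - (beta / (alpha + beta + alpha * theta)) ** (x + 1)

def bnpwe_quantile(q, alpha, beta, theta):
    """Quantile function, Equation (7); real-valued, not rounded to an integer."""
    return math.log(1 - q) / math.log(beta / (alpha + beta + alpha * theta)) - 1

def compound_pmf(x, alpha, beta, theta, m_max=300):
    """Equation (4) with binomial(m, beta) weights and NPWE mixing, truncated at m_max."""
    return sum(
        math.comb(m, x) * beta**x * (1 - beta) ** (m - x) * npwe_pmf(m, alpha, theta)
        for m in range(x, m_max + 1)
    )

params = (0.5, 0.4, 0.5)  # (alpha, beta, theta), illustrative values
for x in range(6):
    print(x, bnpwe_pmf(x, *params), compound_pmf(x, *params))
```

The two printed columns coincide up to rounding, and `bnpwe_cdf(bnpwe_quantile(q, ...), ...)` returns `q`, as expected from inverting Equation (6).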

2.1. Moments, Skewness, Kurtosis and Index of Dispersion

Let $X$ be a BNPWE random variable. Then, the probability-generating function (PrGF) can be expressed in closed form as
$$\Pi_X(s) = \frac{\alpha(1+\theta)}{\alpha\theta - \beta s + \alpha + \beta}. \tag{8}$$
On replacing $s$ by $e^{s}$ in Equation (8), we get the moment-generating function (MGF). The first four derivatives of the MGF, with respect to $s$ at $s = 0$, yield the first four moments (FFM) about the origin. Thus, the FFM values of the BNPWE distribution, respectively, are
$$E(X) = \frac{\beta}{\alpha(1+\theta)},$$
$$E(X^2) = \frac{\beta(\alpha\theta + 2\beta + \alpha)}{\alpha^2(1+\theta)^2},$$
$$E(X^3) = \frac{6\beta\left[\tfrac{1}{6}\alpha^2(1+\theta)^2 + \beta\alpha(1+\theta) + \beta^2\right]}{\alpha^3(1+\theta)^3}$$
and
$$E(X^4) = \frac{24\beta\left[\tfrac{1}{12}\alpha^2(1+\theta)^2 + \beta\alpha(1+\theta) + \beta^2\right]\left[\tfrac{1}{2}\alpha(1+\theta) + \beta\right]}{\alpha^4(1+\theta)^4}.$$
Based on the FFM of the BNPWE distribution about the origin, the skewness, kurtosis, and index of dispersion (IxOD) can be derived in explicit forms. The IxOD is defined as the variance-to-mean ratio; it indicates whether a distribution is suitable for over- or under-dispersed data. If IxOD $> (<)\ 1$, the model is over- (under-)dispersed. Table 1 lists some descriptive statistics for the BNPWE distribution for various values of the model parameters.
It is clear that the BNPWE distribution is appropriate for modeling over-dispersed data. Furthermore, it is capable of modeling leptokurtic and positively skewed data.
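The descriptive measures can be reproduced from the four raw moments above through the usual raw-to-central moment conversions. The sketch below is illustrative (the function names are assumptions) and evaluates the $\alpha = 1.0$ column of Table 1, with $\beta = 0.5$ and $\theta = 0.7$.

```python
def bnpwe_raw_moments(alpha, beta, theta):
    """First four raw moments of the BNPWE distribution, as given above."""
    a = alpha * (1 + theta)
    m1 = beta / a
    m2 = beta * (alpha * theta + 2 * beta + alpha) / a**2
    m3 = 6 * beta * (a**2 / 6 + beta * a + beta**2) / a**3
    m4 = 24 * beta * (a**2 / 12 + beta * a + beta**2) * (a / 2 + beta) / a**4
    return m1, m2, m3, m4

def bnpwe_summary(alpha, beta, theta):
    """Mean, variance, IxOD, skewness, and kurtosis from the raw moments."""
    m1, m2, m3, m4 = bnpwe_raw_moments(alpha, beta, theta)
    var = m2 - m1**2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1**3                    # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4   # fourth central moment
    return {
        "mean": m1,
        "variance": var,
        "ixod": var / m1,
        "skewness": mu3 / var**1.5,
        "kurtosis": mu4 / var**2,
    }

summary = bnpwe_summary(alpha=1.0, beta=0.5, theta=0.7)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The printed values agree with the corresponding Table 1 entries (0.2941, 0.3806, 1.2941, 2.5743, 11.6272) to about three decimal places. Substituting the first two moments also gives IxOD $= 1 + E(X)$, which is why every IxOD entry in Table 1 exceeds 1.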

2.2. Mean Residual Life (MRL) and Mean Past Life (MPL)

The MRL is a helpful tool for analyzing burn-in and maintenance policies. In the discrete setting, the MRL is defined as
$$\Lambda(i) = E(X - i \mid X \ge i) = \frac{1}{1 - F(i-1; \alpha, \theta, \beta)} \sum_{j=i+1}^{\infty} \left[1 - F(j-1; \alpha, \theta, \beta)\right]; \quad i \in \mathbb{N}_0.$$
Let $X$ be a BNPWE random variable; then, the MRL can be reported in a closed form as
$$\Lambda(i) = \frac{\beta + \alpha(1+\theta)}{\alpha(1+\theta)} \cdot \frac{\beta}{\alpha + \beta + \alpha\theta}.$$
The variance residual life (VRL) function can be defined in a closed form as
$$\Omega_{\mathrm{VRL}}(i) = E(X^2 \mid X \ge i) - \left[E(X \mid X \ge i)\right]^2 = \frac{2\left[\beta + i\alpha(1+\theta)\right]\left[\beta + \alpha(1+\theta)\right]}{\alpha^2(1+\theta)^2} \cdot \frac{\beta}{\alpha + \beta + \alpha\theta} - (2i - 1)\Lambda(i) - \Lambda^2(i).$$
Thus, $X$ has increasing (decreasing) VRL if
$$\Omega_{\mathrm{VRL}}(i+1) \ge (\le)\ \Lambda(i)\left[1 + \Lambda(i+1)\right].$$
The residual coefficient of variation (RCV) can be listed in a closed form, where $\mathrm{RCV}(i) = \sqrt{\Omega_{\mathrm{VRL}}(i)} \,/\, \Lambda(i)$.
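The closed forms for the MRL, VRL, and RCV can be cross-checked against the defining sum. The following sketch is illustrative: the function names and the truncation point `j_max` are assumptions, and the parameter values $\alpha = 0.7$, $\beta = 0.5$, $\theta = 0.7$, $i = 10$ correspond to one column of Table 2.

```python
import math

def survival(k, alpha, beta, theta):
    """P(X > k) = 1 - F(k), from Equation (6)."""
    return (beta / (alpha + beta + alpha * theta)) ** (k + 1)

def mrl_by_definition(i, alpha, beta, theta, j_max=2000):
    """Discrete MRL definition, with the infinite sum truncated at j_max."""
    tail = sum(survival(j - 1, alpha, beta, theta) for j in range(i + 1, j_max + 1))
    return tail / survival(i - 1, alpha, beta, theta)

def mrl_closed(alpha, beta, theta):
    """Closed-form MRL; it does not depend on i."""
    a = alpha * (1 + theta)
    return (beta + a) / a * beta / (alpha + beta + alpha * theta)

def vrl_closed(i, alpha, beta, theta):
    """Closed-form VRL as stated above."""
    a = alpha * (1 + theta)
    lam = mrl_closed(alpha, beta, theta)
    first = 2 * (beta + i * a) * (beta + a) / a**2 * beta / (alpha + beta + alpha * theta)
    return first - (2 * i - 1) * lam - lam**2

def rcv(i, alpha, beta, theta):
    """Residual coefficient of variation: sqrt(VRL) / MRL."""
    return math.sqrt(vrl_closed(i, alpha, beta, theta)) / mrl_closed(alpha, beta, theta)

p = (0.7, 0.5, 0.7)  # (alpha, beta, theta)
print(mrl_by_definition(10, *p), mrl_closed(*p))
print(vrl_closed(10, *p), rcv(10, *p))
```

The output matches the Table 2 entries for this column (MRL 0.4201, VRL 0.5967, RCV 1.8384) to about three decimal places. The square root in the RCV is what reproduces the tabulated values.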
Another measure of interest in reliability theory is the MPL. It measures the time elapsed since the failure of $X$, given that the system/component has failed sometime before $i$. In the discrete setting, the MPL is defined as
$$\delta(i) = E(i - X \mid X < i) = \frac{1}{F(i-1; \alpha, \theta, \beta)} \sum_{m=1}^{i} F(m-1; \alpha, \theta, \beta); \quad i \in \mathbb{N}_0 - \{0\},$$
where $\delta(0) = 0$. Let $X$ be a BNPWE random variable; then, the MPL can be represented in a closed form as
$$\delta(i) = \left[1 - \left(\frac{\beta}{\alpha + \beta + \alpha\theta}\right)^{i}\right]^{-1} \left[i + \frac{\dfrac{\beta^{i+1}}{(\alpha + \beta + \alpha\theta)^{i}} - \beta}{\alpha(1+\theta)}\right].$$
For $i \in \mathbb{N}_0$, we get $\delta(i) \le i$. The mean of the model can be listed as
$$\mathrm{Mean} = i - \delta(i) F(i-1) + \Lambda(i)\left[1 - F(i-1)\right]; \quad i \in \mathbb{N}_0 - \{0\}.$$
The reversed hazard rate function (RHRF) and the MPL are related as
$$r(i; \theta) = \frac{1 - \delta(i+1) + \delta(i)}{\delta(i)}; \quad i \in \mathbb{N}_0 - \{0\}.$$
If $X$ is a BNPWE random variable, then the CDF can be recovered from the MPL as
$$F(k; \alpha, \theta, \beta) = F(0) \prod_{i=1}^{k} \left[\frac{\delta(i)}{\delta(i+1) - 1}\right]; \quad k \in \mathbb{N}_0 - \{0\},$$
where $F(0) = \left[\prod_{i=1}^{\infty} \dfrac{\delta(i)}{\delta(i+1) - 1}\right]^{-1}$. Table 2, Table 3 and Table 4 list some numerical computations of reliability concepts for different values of the model parameters $\alpha$, $\beta$, and $\theta$, respectively, at time $i = 10$ h.
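The MPL relations above can be verified numerically. In this illustrative sketch (function names are assumptions), $F(0)$ is taken directly from Equation (6) rather than from the infinite product, and the parameter values again correspond to the $\alpha = 0.7$ column of Table 2.

```python
def cdf(k, alpha, beta, theta):
    """BNPWE CDF, Equation (6); F(-1) evaluates to 0."""
    return 1 - (beta / (alpha + beta + alpha * theta)) ** (k + 1)

def mpl_by_definition(i, alpha, beta, theta):
    """delta(i) = sum_{m=1}^{i} F(m-1) / F(i-1), for i >= 1."""
    total = sum(cdf(m - 1, alpha, beta, theta) for m in range(1, i + 1))
    return total / cdf(i - 1, alpha, beta, theta)

def mpl_closed(i, alpha, beta, theta):
    """Closed-form MPL as stated above."""
    a = alpha * (1 + theta)
    d = alpha + beta + alpha * theta
    return (i + (beta ** (i + 1) / d**i - beta) / a) / (1 - (beta / d) ** i)

def rhrf_from_mpl(i, alpha, beta, theta):
    """Reversed hazard rate recovered from the MPL."""
    d_i = mpl_closed(i, alpha, beta, theta)
    d_next = mpl_closed(i + 1, alpha, beta, theta)
    return (1 - d_next + d_i) / d_i

def cdf_from_mpl(k, alpha, beta, theta):
    """F(k) = F(0) * prod_{i=1}^{k} delta(i) / (delta(i+1) - 1)."""
    value = cdf(0, alpha, beta, theta)
    for i in range(1, k + 1):
        value *= mpl_closed(i, alpha, beta, theta) / (mpl_closed(i + 1, alpha, beta, theta) - 1)
    return value

p = (0.7, 0.5, 0.7)  # (alpha, beta, theta)
print(mpl_by_definition(10, *p), mpl_closed(10, *p))  # Table 2 reports 9.5798
print(cdf_from_mpl(5, *p), cdf(5, *p))
```

Each pair of printed values agrees up to rounding, and `rhrf_from_mpl(i, ...)` equals $\Pr[X = i] / F(i)$, the usual definition of the reversed hazard rate.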
From Table 2, Table 3 and Table 4, it is clear that:
  • The MRL and VRL decrease as $\alpha$ grows (with $\beta$ and $\theta$ fixed) and as $\theta$ grows (with $\alpha$ and $\beta$ fixed), whereas for fixed $\alpha$ and $\theta$ with $\beta \to 1$, the MRL and VRL increase.
  • The RCV and MPL increase as $\alpha$ grows (with $\beta$ and $\theta$ fixed) and as $\theta$ grows (with $\alpha$ and $\beta$ fixed), whereas for fixed $\alpha$ and $\theta$ with $\beta \to 1$, the RCV and MPL decrease.

3. Parameter Estimation

3.1. Classical Estimation Using Method of Maximum Likelihood

In this section, we determine the MLEs of the model parameters based on a complete sample. Assume that $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from the BNPWE distribution. The likelihood function ($L$) can be expressed as follows
$$L = \frac{\alpha^{n}(1+\theta)^{n}\beta^{\sum_{i=1}^{n} x_i}}{(\alpha + \beta + \alpha\theta)^{\sum_{i=1}^{n}(x_i + 1)}}. \tag{20}$$
The log-likelihood function (LLF) corresponding to Equation (20) is
$$l = \log L = n \ln\left[\alpha(1+\theta)\right] + \ln\beta \sum_{i=1}^{n} x_i - \ln(\alpha + \beta + \alpha\theta) \sum_{i=1}^{n} (x_i + 1).$$
By differentiating the log-likelihood function with respect to the parameters $\alpha$, $\beta$, and $\theta$, we get the non-linear likelihood equations. These equations cannot be solved analytically; therefore, an iterative procedure such as Newton–Raphson is required to solve these non-linear equations numerically.
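To make the likelihood equations concrete, the sketch below codes the log-likelihood and its three partial derivatives (obtained here by differentiating the expression above term by term) and checks them against central finite differences. The function names and the small data vector are hypothetical; the sketch only evaluates the score and does not attempt the iterative solution.

```python
import math

def log_likelihood(alpha, beta, theta, x):
    """Log-likelihood of a BNPWE sample x, as given above."""
    n, sx = len(x), sum(x)
    d = alpha + beta + alpha * theta
    return n * math.log(alpha * (1 + theta)) + sx * math.log(beta) - (sx + n) * math.log(d)

def score(alpha, beta, theta, x):
    """Partial derivatives of the log-likelihood with respect to alpha, beta, theta."""
    n, sx = len(x), sum(x)
    d = alpha + beta + alpha * theta
    s = sx + n  # sum of (x_i + 1)
    return (
        n / alpha - (1 + theta) * s / d,
        sx / beta - s / d,
        n / (1 + theta) - alpha * s / d,
    )

x = [0, 2, 1, 0, 3, 1, 0, 5]   # hypothetical counts, for illustration only
point = (0.5, 0.4, 0.5)        # (alpha, beta, theta)
h = 1e-6
numeric = []
for k in range(3):
    up = list(point); up[k] += h
    down = list(point); down[k] -= h
    numeric.append((log_likelihood(*up, x) - log_likelihood(*down, x)) / (2 * h))

print(score(*point, x))
print(tuple(numeric))
```

Setting the three score components to zero gives the non-linear likelihood equations referred to in the text.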

3.2. Bayesian Estimation

The significance of Bayesian analysis has grown enormously over the last several decades not only because Bayesian estimators have become much simpler to compute, but also because it is one of the most acceptable methods of computing estimates for complicated models. Given this, the current section is dedicated to the Bayesian estimation of unknown parameters of the BNPWE distribution under informative priors (IPs) and non-informative priors (NIPs).
Suppose that the prior distributions of the parameters $\alpha$, $\beta$, and $\theta$ are Gamma$(a_1, a_2)$, Beta$(b_1, b_2)$, and Gamma$(c_1, c_2)$, respectively. Then, the joint prior distribution of $\alpha$, $\beta$, and $\theta$ is given by
$$g(\alpha, \beta, \theta) \propto \alpha^{a_1 - 1} \beta^{b_1 - 1} \theta^{c_1} \exp(-\alpha a_1 - \beta b_2 - \theta c_2),$$
where $(\alpha, \theta) > 0$, $0 < \beta < 1$, and $(a_1, b_1, a_2, b_2, c_1, c_2) > 0$ are the hyperparameters of the prior densities, which can be fixed based on the amount of available prior information. If we fix $a_1 = b_1 = a_2 = b_2 = c_1 = c_2 = 0$, the joint prior density becomes a non-informative prior density.
Applying Bayes' theorem, the unnormalized joint posterior distribution of the parameters $\alpha$, $\beta$, and $\theta$ given the data can be obtained from the likelihood function and the joint prior density as
$$P(\alpha, \beta, \theta \mid \underline{x}) \propto \frac{\alpha^{n + a_1 - 1} \beta^{\sum_{i=1}^{n} x_i + b_1 - 1} \theta^{c_1} (1+\theta)^{n} \exp(-\alpha a_1 - \beta b_2 - \theta c_2)}{(\alpha + \beta + \alpha\theta)^{\sum_{i=1}^{n}(x_i + 1)}}.$$
Loss functions (LFs) play a vital role when researchers are not only interested in choosing the right decision but also consider the economic consequences that arise through a Bayes estimate of the unknown parameter. In our case, we use a well-known symmetric loss function called squared error function (SELF). In this loss function, positive and negative errors are equally penalized. Under SELF, the Bayes estimator of a parameter is simply the expectation of that parameter with respect to its posterior distribution.
Now, the Bayes estimator of a function of the parameters $\alpha$, $\beta$, and $\theta$, say $w(\alpha, \beta, \theta)$, is
$$\hat{w}(\alpha, \beta, \theta) = \int_{\alpha} \int_{\beta} \int_{\theta} w(\alpha, \beta, \theta)\, P(\alpha, \beta, \theta \mid \underline{x})\, d\alpha\, d\beta\, d\theta.$$
The integral is difficult to obtain in closed form due to the complex form of the posterior distribution. Therefore, we use a mixture of two well-known Monte Carlo simulation techniques under a hybrid algorithm. These techniques are the Gibbs sampler, see [20], and the Metropolis–Hastings algorithm, see [21,22]. The first technique allows us to simulate posterior samples from the full conditional posterior densities instead of the joint posterior distribution, whereas the latter enables us to draw the required samples when conventional methods fail to generate them. For this purpose, the full conditional posterior densities of the parameters $\alpha$, $\beta$, and $\theta$ are, respectively,
$$P_{11}(\alpha \mid \beta, \theta, \underline{x}) \propto \frac{\alpha^{n + a_1 - 1} \exp(-\alpha a_1)}{(\alpha + \beta + \alpha\theta)^{\sum_{i=1}^{n}(x_i + 1)}},$$
$$P_{12}(\beta \mid \alpha, \theta, \underline{x}) \propto \frac{\beta^{\sum_{i=1}^{n} x_i + b_1 - 1} \exp(-\beta b_2)}{(\alpha + \beta + \alpha\theta)^{\sum_{i=1}^{n}(x_i + 1)}},$$
$$P_{13}(\theta \mid \alpha, \beta, \underline{x}) \propto \frac{\theta^{c_1} (1+\theta)^{n} \exp(-\theta c_2)}{(\alpha + \beta + \alpha\theta)^{\sum_{i=1}^{n}(x_i + 1)}}.$$
The steps of the above-mentioned hybrid algorithm are as follows:
  • Start with the initial values of $\alpha$, $\beta$, and $\theta$ as $(\alpha^{(0)}, \beta^{(0)}, \theta^{(0)})$ and set $j = 1$.
  • Generate $\alpha^{(j)}$ from $P_{11}(\alpha \mid \beta^{(j-1)}, \theta^{(j-1)}, \underline{x})$ using the following steps:
    (a) Generate the proposal point $\alpha^{*}$ from the proposal distribution $N(\alpha^{(j-1)}, \sigma_{\alpha}^{2})$ and $u_1$ from the Uniform(0,1) distribution. Here, the variance $\sigma_{\alpha}^{2}$ is suitably chosen.
    (b) Calculate the acceptance probability (AP) $\rho_{\alpha} = \min\left\{1, \dfrac{P_{11}(\alpha^{*} \mid \beta^{(j-1)}, \theta^{(j-1)}, \underline{x})}{P_{11}(\alpha^{(j-1)} \mid \beta^{(j-1)}, \theta^{(j-1)}, \underline{x})}\right\}$, and if $u_1 \le \rho_{\alpha}$, then record $\alpha^{(j)} = \alpha^{*}$; otherwise, retain $\alpha^{(j)} = \alpha^{(j-1)}$.
  • Generate $\beta^{(j)}$ from $P_{12}(\beta \mid \alpha^{(j)}, \theta^{(j-1)}, \underline{x})$ using the following steps:
    (a) Generate the proposal value $\beta^{*}$ from $N(\beta^{(j-1)}, \sigma_{\beta}^{2})$ and $u_2$ from the Uniform(0,1) distribution. The variance $\sigma_{\beta}^{2}$ is appropriately selected.
    (b) Compute the AP $\rho_{\beta} = \min\left\{1, \dfrac{P_{12}(\beta^{*} \mid \alpha^{(j)}, \theta^{(j-1)}, \underline{x})}{P_{12}(\beta^{(j-1)} \mid \alpha^{(j)}, \theta^{(j-1)}, \underline{x})}\right\}$, and if $u_2 \le \rho_{\beta}$, then record $\beta^{(j)} = \beta^{*}$; otherwise, retain $\beta^{(j)} = \beta^{(j-1)}$.
  • Generate $\theta^{(j)}$ from $P_{13}(\theta \mid \alpha^{(j)}, \beta^{(j)}, \underline{x})$ using the following steps:
    (a) Generate the proposal point $\theta^{*}$ from $N(\theta^{(j-1)}, \sigma_{\theta}^{2})$ and $u_3$ from the Uniform(0,1) distribution. The variance $\sigma_{\theta}^{2}$ is suitably chosen here.
    (b) Calculate the AP $\rho_{\theta} = \min\left\{1, \dfrac{P_{13}(\theta^{*} \mid \alpha^{(j)}, \beta^{(j)}, \underline{x})}{P_{13}(\theta^{(j-1)} \mid \alpha^{(j)}, \beta^{(j)}, \underline{x})}\right\}$, and if $u_3 \le \rho_{\theta}$, then record $\theta^{(j)} = \theta^{*}$; otherwise, retain $\theta^{(j)} = \theta^{(j-1)}$.
  • Set $j = j + 1$.
  • Rerun steps 2–5 a large number of times, say $N$ times, and obtain $\alpha^{(j)}$, $\beta^{(j)}$, and $\theta^{(j)}$, $j = 1, 2, \ldots, N$.
To avoid the effect of the initial values, the first $m$ draws are discarded. After the convergence of the generated chains has been checked through various graphical and statistical tests, the values $\alpha^{(j)}$, $\beta^{(j)}$, and $\theta^{(j)}$, $j = m+1, m+2, \ldots, N$, represent the required posterior samples, which can be utilized to compute the Bayes estimators of the unknown parameters. Hence, the Bayes estimators of $\alpha$, $\beta$, and $\theta$ under SELF are, respectively, obtained as
$$\hat{\alpha}_B = \frac{1}{N - m} \sum_{j=m+1}^{N} \alpha^{(j)}, \quad \hat{\beta}_B = \frac{1}{N - m} \sum_{j=m+1}^{N} \beta^{(j)}, \quad \text{and} \quad \hat{\theta}_B = \frac{1}{N - m} \sum_{j=m+1}^{N} \theta^{(j)}.$$
Notably, by fixing $a_1 = b_1 = a_2 = b_2 = c_1 = c_2 = 0$ and applying the above-mentioned procedure, we can obtain the Bayes estimators under NIPs.
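The hybrid algorithm above can be sketched as a Metropolis-within-Gibbs sampler with Gaussian random-walk proposals. This is a minimal illustration, not the authors' implementation: the data vector, hyperparameter values, proposal standard deviations, and function names are all assumptions, and the log posterior follows the unnormalized joint posterior exactly as printed in this section. Because each update changes a single coordinate, the ratio of joint posteriors equals the ratio of the corresponding full conditionals.

```python
import math
import random

def log_posterior(alpha, beta, theta, x, hyper):
    """Log of the unnormalized joint posterior as printed above; -inf outside the support."""
    if alpha <= 0 or theta <= 0 or not 0 < beta < 1:
        return -math.inf
    a1, b1, b2, c1, c2 = hyper
    n, sx = len(x), sum(x)
    d = alpha + beta + alpha * theta
    return (
        (n + a1 - 1) * math.log(alpha)
        + (sx + b1 - 1) * math.log(beta)
        + c1 * math.log(theta)
        + n * math.log(1 + theta)
        - alpha * a1 - beta * b2 - theta * c2
        - (sx + n) * math.log(d)
    )

def mh_within_gibbs(x, hyper, start, sigmas, n_iter, burn_in, seed=0):
    """Update alpha, beta, theta in turn with random-walk Metropolis steps."""
    rng = random.Random(seed)
    state = list(start)  # must lie inside the support
    draws = []
    for _ in range(n_iter):
        for k in range(3):  # k = 0: alpha, 1: beta, 2: theta
            proposal = list(state)
            proposal[k] = rng.gauss(state[k], sigmas[k])
            log_ratio = log_posterior(*proposal, x, hyper) - log_posterior(*state, x, hyper)
            # Accept with probability min(1, ratio); out-of-support proposals have ratio 0.
            if rng.random() < math.exp(min(0.0, log_ratio)):
                state = proposal
        draws.append(tuple(state))
    kept = draws[burn_in:]  # discard the first burn_in draws
    means = [sum(d[k] for d in kept) / len(kept) for k in range(3)]  # Bayes estimates under SELF
    return kept, means

x = [0, 2, 1, 0, 3, 1, 0, 5, 2, 1]   # hypothetical counts
hyper = (1.0, 1.0, 1.0, 1.0, 1.0)    # (a1, b1, b2, c1, c2), illustrative values
kept, means = mh_within_gibbs(x, hyper, start=(0.5, 0.4, 0.5),
                              sigmas=(0.2, 0.1, 0.2), n_iter=2000, burn_in=500)
print(means)
```

In practice, the proposal standard deviations would be tuned and the chains checked with convergence diagnostics before the posterior means are used.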

4. A Monte Carlo Simulation Study

To study the behavior of the considered estimation methods for various values of $n$ and true parameter combinations, we perform an MCMC simulation study. This assessment consists of the following steps: Draw 2000 samples of sizes $n = 25$, 50, 100, and 200 from the new model with $(\alpha, \beta, \theta) = (0.5, 0.4, 0.5)$ and $(0.5, 0.4, 1.01)$. Calculate the MLE and BE (under IP and NIP with SELF) for the 2000 samples, say $\hat{\tau}_{\varphi}^{\,j}$; $\tau = \alpha, \beta, \theta$; $j = 1, 2, \ldots, 2000$; $\varphi$ = ML and Bayes. Compute the mean squared error (MSE) and average absolute error (ABE) for all point estimates, where
$$\mathrm{MSE} = \frac{1}{2000} \sum_{j=1}^{2000} \left(\hat{\tau}_{\varphi}^{\,j} - \tau\right)^2 \quad \text{and} \quad \mathrm{ABE} = \frac{1}{2000} \sum_{j=1}^{2000} \left|\hat{\tau}_{\varphi}^{\,j} - \tau\right|.$$
The empirical results are shown in Table 5, Table 6 and Table 7. To assess the convergence of the generated chains, we inspect the MCMC trace, histogram, and auto-correlation plots. These plots show that all chains reach their stationary distributions. Due to space constraints, we have included convergence diagnostic plots only for the parametric combination (0.5, 0.4, 0.5) under IPs and NIPs, and they can be viewed in Figure 2 and Figure 3, respectively.
From Table 5, Table 6 and Table 7, it is noted that:
  • Based on MSE, we observe that the parameter β is more sensitive as compared to α and θ .
  • The MSE and ABE decrease toward zero as $n$ increases, which is in line with the consistency of the estimators.
  • Both estimation procedures perform satisfactorily. However, in overall comparison, Bayes estimators perform better in comparison to MLE, especially in case of IPs.
  • For higher values of the parameters α , β , and θ , the generated random samples produce a large number of 0s and 1s, due to which sometimes we face the convergence issue, and also the estimated error associated with an estimate becomes larger.
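The MSE and ABE criteria used in this section can be sketched as follows. To keep the example short, the estimator being assessed is the sample mean as an estimator of $E(X) = \beta / [\alpha(1+\theta)]$, which is a stand-in chosen for illustration and not one of the estimators studied in the paper; the sampler inverts the CDF in Equation (6), and the replication count of 500 and the seed are likewise assumptions.

```python
import math
import random

def mse(estimates, true_value):
    """Mean squared error over replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def abe(estimates, true_value):
    """Average absolute error over replications."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)

def bnpwe_sample(n, alpha, beta, theta, rng):
    """Draw n BNPWE variates by inverting the CDF in Equation (6)."""
    log_rho = math.log(beta / (alpha + beta + alpha * theta))
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [max(0, math.ceil(math.log(1 - rng.random()) / log_rho) - 1) for _ in range(n)]

alpha, beta, theta = 0.5, 0.4, 0.5
true_mean = beta / (alpha * (1 + theta))
rng = random.Random(2022)

results = {}
for n in (25, 50, 100, 200):
    estimates = [sum(bnpwe_sample(n, alpha, beta, theta, rng)) / n for _ in range(500)]
    results[n] = (mse(estimates, true_mean), abe(estimates, true_mean))

for n, (m, a) in results.items():
    print(n, round(m, 5), round(a, 5))
```

Both criteria shrink as $n$ grows, which is the pattern the bullet points above describe for the estimators in Table 5, Table 6 and Table 7.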

5. Data Analysis

In this section, we illustrate the importance of the BNPWE distribution by utilizing data from different areas. We compare the fit of the BNPWE distribution with that of some competitive models, namely the discrete Weibull inverse Weibull (DWIW), discrete inverse Weibull (DIW), discrete log-logistic (DLogL), discrete Burr type II (DB-II), discrete Rayleigh (DR), discrete Bilal (DBL), discrete Pareto (DPa), and discrete Burr–Hatke (DBH) distributions. The statistical criteria used are the negative log-likelihood ($-L$), the corrected Akaike information criterion (CAIC), the Bayesian information criterion (BIC), the Hannan–Quinn information criterion (HQIC), and the Chi-square ($\chi^2$) test with its corresponding p-value.
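The paper does not state the formulas behind these criteria. A minimal sketch using the standard textbook definitions, with $k$ parameters, sample size $n$, and maximized log-likelihood `log_lik`, is given below; treating CAIC as the small-sample corrected AIC is an assumption, and the function name and input values are hypothetical.

```python
import math

def information_criteria(log_lik, k, n):
    """Standard definitions (assumed here): smaller values indicate a better fit."""
    aic = 2 * k - 2 * log_lik
    return {
        "AIC": aic,
        "CAIC": aic + 2 * k * (k + 1) / (n - k - 1),   # small-sample corrected AIC
        "BIC": k * math.log(n) - 2 * log_lik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * log_lik,
    }

# Hypothetical inputs, for illustration only.
criteria = information_criteria(log_lik=-100.0, k=3, n=50)
for name, value in criteria.items():
    print(f"{name}: {value:.3f}")
```

All four criteria penalize the negative log-likelihood by a term that grows with the number of parameters, so a three-parameter model such as the BNPWE is only preferred when its likelihood gain outweighs that penalty.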

5.1. Data Set I: COVID-19 Data

The data are reported in (https://www.worldometers.info/coronavirus/country/southkorea/ (accessed on 21 March 2022)) and represent the daily new deaths in South Korea from 15 February to 13 June 2020. The initial MS is reported using the nonparametric KME method in Figure 4, and it is noted that the mass function is asymmetric and multimodal. Moreover, Figure 4 shows that some extreme observations were listed.
The MLEs and Std-er for the parameter(s) as well as GOF measures for data set I are reported in Table 8 and Table 9.
It is clear that the DWIW and DLogL distributions work quite well in addition to the BNPWE distribution. However, the BNPWE distribution is the best model among all tested models. Figure 5 and Figure 6 support our empirical results, showing that the BNPWE distribution fits these data better, whereas Figure 7 shows that the estimator is unique.
According to the MLEs, the EDS for mean, variance, IxOD, skewness, and kurtosis are 2.37710, 8.02773, 3.37710, 2.03090, and 9.12456, respectively. The data exhibit over-dispersion. Moreover, they are moderately skewed to the right and leptokurtic.

5.2. Data Set II: Biological Data

This data set is the biological experiment data which represent the number of European corn borer (No. ECB) larvae pyrausta in the field, see [23]. It was an experiment conducted randomly on 8 hills in 15 replications, and the experimenter counts the number of borers per hill of corn. For these data, the mean, variance, skewness, and kurtosis equal 1.326, 3.669, 1.976, and 8.984, respectively. The initial MS is reported using the nonparametric KME method in Figure 8, and it is noted that the mass function is asymmetric and multimodal. Moreover, Figure 8 shows that some extreme observations were reported.
The MLEs and Std-er for the parameter(s) as well as GOF measures for data set II are reported in Table 10 and Table 11.
It is clear that the DWIW and DLogL distributions work quite well in addition to the BNPWE distribution. However, the BNPWE distribution is the best model among all tested models. Figure 9 and Figure 10 support our empirical results, showing that the BNPWE distribution fits these data better, whereas Figure 11 shows that the estimator is unique.
According to the MLEs, the EDS for mean, variance, IxOD, skewness, and kurtosis are 1.48258, 3.68063, 2.48258, 2.06680, and 9.27169, respectively. The data exhibit over-dispersion. Furthermore, they are moderately skewed to the right and leptokurtic.

5.3. Bayesian Analysis of Real Data Sets I and II

In this section, we derive Bayes estimates for the unknown parameters of the proposed distribution with their posterior standard errors (PSEs). Here, it is worth noting that in this estimation process, we use NIPs because there is no prior knowledge available about the model parameters for the data sets under study. These estimates can be viewed in Table 12. From this table, we can conclude that Bayesian estimation captures real data sets more accurately than the considered classical estimation procedure in terms of estimation errors.

6. Conclusions

In this article, a binomial new Poisson weighted exponential distribution is developed for analysis of count data. Its various impressive statistical properties, including moments, skewness, kurtosis, index of dispersion, mean residual life, and mean past life, are derived. One of the main features of the proposed model is that it has closed-form expressions for various important distributional characteristics, which is not the case with many well-known discrete distributions. Furthermore, it may be used to model over-dispersed, leptokurtic, and positively skewed data sets. Under classical estimation approach, the method of maximum likelihood is used, whereas in the Bayesian paradigm, we have used informative and non-informative priors with squared error loss function to estimate the unknown parameters. To evaluate the performance of the estimators with respect to the sample size, a thorough simulation study is performed. Finally, two different real data sets are examined to demonstrate the adaptability of the suggested distribution.
A future plan of action regarding the current study might be an examination of the censored data using the proposed model. We may investigate the load share model where the component failure time follows the BNPWE distribution. The stress-strength parameter may be examined using various censored data. In addition, a bivariate extension of the BNPWE distribution can be developed.

Author Contributions

A.A.-B. Software, Resources, Writing—review & editing, Funding; and M.S.E. Methodology, Supervision, Data curation, Formal analysis, Software, Writing—review & editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia grant number IF-PSAU-2021/01/18784.

Acknowledgments

The authors extend their appreciations to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research work through the project number (IF-PSAU-2021/01/18784).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Krishna, H.; Pundir, P.S. Discrete Burr and discrete Pareto distributions. Stat. Methodol. 2009, 6, 177–188.
  2. Nakagawa, T.; Osaki, S. The discrete Weibull distribution. IEEE Trans. Reliab. 1975, 24, 300–301.
  3. Stein, W.E.; Dattero, R. A new discrete Weibull distribution. IEEE Trans. Reliab. 1984, 33, 196–197.
  4. Chakraborty, S. Generating discrete analogues of continuous probability distributions: A survey of methods and constructions. J. Stat. Distrib. Appl. 2015, 2, 6.
  5. Alamatsaz, M.H.; Dey, S.; Dey, T.; Harandi, S.S. Discrete generalized Rayleigh distribution. Pak. J. Stat. 2016, 32, 1–20.
  6. Jayakumar, K.; Babu, M.G. Discrete Weibull geometric distribution and its properties. Commun. Stat. Theory Methods 2018, 47, 1767–1783.
  7. Tyagi, A.; Choudhary, N.; Singh, B. Discrete additive Perks-Weibull distribution: Properties and applications. Life Cycle Reliab. Saf. Eng. 2019, 8, 183–199.
  8. Tyagi, A.; Choudhary, N.; Singh, B. A new discrete distribution: Theory and applications to discrete failure lifetime and count data. J. Appl. Probab. Stat. 2020, 15, 119–145.
  9. El-Morshedy, M.; Eliwa, M.S.; Nagy, H. A new two-parameter exponentiated discrete Lindley distribution: Properties, estimation and applications. J. Appl. Stat. 2020, 47, 354–375.
  10. Eliwa, M.S.; Alhussain, Z.A.; El-Morshedy, M. Discrete Gompertz-G family of distributions for over-and under-dispersed data with properties, estimation, and applications. Mathematics 2020, 8, 358.
  11. El-Morshedy, M.; Eliwa, M.S.; Altun, E. Discrete Burr-Hatke distribution with properties, estimation methods and regression model. IEEE Access 2020, 8, 74359–74370.
  12. Altun, E.; El-Morshedy, M.; Eliwa, M.S. A study on discrete Bilal distribution with properties and applications on integer-valued autoregressive process. Revstat Stat. J. 2020, 18, 70–99.
  13. Eliwa, M.S.; Altun, E.; El-Dawoody, M.; El-Morshedy, M. A new three-parameter discrete distribution with associated INAR (1) process and applications. IEEE Access 2020, 8, 91150–91162.
  14. Hu, Y.; Peng, X.; Li, T.; Guo, H. On the Poisson approximation to photon distribution for faint lasers. Phys. Lett. 2007, 367, 173–176.
  15. Déniz, E.G. A new discrete distribution: Properties and applications in medical care. J. Appl. Stat. 2013, 40, 2760–2770.
  16. Akdogan, Y.; Kus, C.; Asgharzadeh, A.; Kinaci, I.; Sharafi, F. Uniform-geometric distribution. J. Stat. Comput. Simul. 2016, 86, 1754–1770.
  17. Coşkun, K.U.Ş.; Akdoğan, Y.; Asgharzadeh, A.; Kinaci, İ.; Karakaya, K. Binomial-discrete Lindley distribution. Commun. Fac. Sci. Univ. Ank. Ser. A1 Math. Stat. 2018, 68, 401–411.
  18. Altun, E. A new generalization of geometric distribution with properties and applications. Commun. Stat. Simul. Comput. 2020, 49, 793–807.
  19. Oguntunde, P.E.; Owoloko, E.A.; Balogun, O.S. On a new weighted exponential distribution: Theory and application. Asian J. Appl. Sci. 2016, 9, 1–12.
  20. Geman, S.; Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984, 6, 721–741.
  21. Metropolis, N.; Ulam, S. The Monte Carlo method. J. Am. Stat. Assoc. 1949, 44, 335–341.
  22. Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57, 97–109.
  23. Bodhisuwan, W.; Sangpoom, S. The discrete weighted Lindley distribution. In Proceedings of the 2016 12th International Conference on Mathematics, Statistics, and Their Applications (ICMSA), Banda Aceh, Indonesia, 4–6 October 2016; pp. 99–103.
Figure 1. The PMF plots of the BNPWE model.
Figure 2. The MCMC diagnostic plots under NIPs.
Figure 3. The MCMC diagnostic plots under IPs.
Figure 4. The KME, Q-Q, and box plots for data set I.
Figure 5. The estimated PMFs for data set I.
Figure 6. The P-P plot for data set I.
Figure 7. The L profiles for the model parameters for data set I.
Figure 8. The KME, Q-Q, and box plots for data set II.
Figure 9. The estimated PMFs for data set II.
Figure 10. The P-P plot for data set II.
Figure 11. The L profiles for the model parameters for data set II.
Table 1. Some descriptive measures of the BNPWE distribution for β = 0.5 and θ = 0.7.
Table 1. Some descriptive measures of the BNPWE distribution for β = 0.5 and θ = 0.7.
Parameter ⟶ α
Measure ↓ 0.01 0.1 0.7 1.0 1.5 2.0 2.5 3.5
Mean 29.4117 2.9411 0.4201 0.2941 0.1960 0.1470 0.1176 0.0840
Variance 894.4636 11.5916 0.5967 0.3806 0.2345 0.1686 0.1314 0.0910
IxOD 30.4117 3.9411 1.4201 1.2941 1.1960 1.1470 1.1176 1.0840
Skewness 2.0002 2.0214 2.3824 2.5743 2.8747 3.1509 3.4066 3.8700
Kurtosis 9.0011 9.0862 10.6758 11.6272 13.2639 14.9282 16.6052 19.9775
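As a quick consistency check on Table 1, the IxOD row appears to equal the ratio Variance/Mean, the usual index of dispersion. That reading is an assumption here, since the definition is given elsewhere in the article. A minimal Python sketch, with the tabulated values copied in and illustrative variable names:

```python
# Values copied from Table 1 (beta = 0.5, theta = 0.7).
# Variable and function names are illustrative, not taken from the article.
alphas = [0.01, 0.1, 0.7, 1.0, 1.5, 2.0, 2.5, 3.5]
means = [29.4117, 2.9411, 0.4201, 0.2941, 0.1960, 0.1470, 0.1176, 0.0840]
variances = [894.4636, 11.5916, 0.5967, 0.3806, 0.2345, 0.1686, 0.1314, 0.0910]
ixod_reported = [30.4117, 3.9411, 1.4201, 1.2941, 1.1960, 1.1470, 1.1176, 1.0840]


def index_of_dispersion(mean, variance):
    """Variance-to-mean ratio (assumed definition of IxOD)."""
    return variance / mean


ixod_computed = [index_of_dispersion(m, v) for m, v in zip(means, variances)]

for a, calc, rep in zip(alphas, ixod_computed, ixod_reported):
    print(f"alpha={a:<5} computed={calc:.4f} reported={rep:.4f}")
```

Under this assumed definition, every ratio exceeds 1, which is consistent with the over-dispersion claimed for the model, and the ratio approaches 1 as α grows.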
Table 2. Some numerical computations of reliability concepts for various values of α.
Parameter ⟶ α | β = 0.5, θ = 0.7
Measure ↓ 0.01 0.1 0.7 1.0 1.5 2.0 2.5 3.5
MRL 29.4117 2.9411 0.4201 0.2941 0.1960 0.1470 0.1176 0.0840
VRL 894.4636 11.5916 0.5967 0.3806 0.2345 0.1686 0.1314 0.0910
RCV 1.0168 1.1575 1.8384 2.0976 2.4698 2.7928 3.0822 3.5916
MPL 5.7753 7.6248 9.5798 9.7058 9.8039 9.8529 9.8823 9.9159
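The RCV row of Table 2 seems to follow from the two rows above it as the square root of VRL divided by MRL, that is, a coefficient of variation of the residual life. This relation is inferred from the numbers rather than quoted from the article, so the sketch below should be read as a check under that assumption:

```python
import math

# Values copied from Table 2 (beta = 0.5, theta = 0.7); names are illustrative.
mrl = [29.4117, 2.9411, 0.4201, 0.2941, 0.1960, 0.1470, 0.1176, 0.0840]
vrl = [894.4636, 11.5916, 0.5967, 0.3806, 0.2345, 0.1686, 0.1314, 0.0910]
rcv_reported = [1.0168, 1.1575, 1.8384, 2.0976, 2.4698, 2.7928, 3.0822, 3.5916]


def residual_cv(mrl_value, vrl_value):
    """Assumed definition: standard deviation of residual life over its mean."""
    return math.sqrt(vrl_value) / mrl_value


rcv_computed = [residual_cv(m, v) for m, v in zip(mrl, vrl)]

for calc, rep in zip(rcv_computed, rcv_reported):
    print(f"computed={calc:.4f} reported={rep:.4f}")
```

The small differences in the last decimal places are what one would expect from recomputing with inputs already rounded to four decimals.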
Table 3. Some numerical computations of reliability concepts for various values of β.
Parameter ⟶ β | α = 0.5, θ = 0.7
Measure ↓ 0.01 0.1 0.2 0.3 0.5 0.7 0.9 0.99
MRL 0.0117 0.1176 0.2352 0.3529 0.5882 0.8235 1.0588 1.1647
VRL 0.0119 0.1314 0.2906 0.4775 0.9342 1.5017 2.1799 2.5212
RCV 9.2736 3.0822 2.2912 1.9578 1.6431 1.4880 1.3944 1.3632
MPL 9.9882 9.8823 9.7647 9.6470 9.4122 9.1800 8.9541 8.8556
Table 4. Some numerical computations of reliability concepts for various values of θ.
Parameter ⟶ θ | α = 0.5, β = 0.5
Measure ↓ 0.01 0.1 0.7 1.0 1.5 2.0 2.5 3.5
MRL 0.9900 0.9090 0.5882 0.4999 0.4000 0.3333 0.2857 0.2222
VRL 1.9703 1.7355 0.9342 0.7500 0.5600 0.4444 0.3673 0.2716
RCV 1.4177 1.4491 1.6431 1.7320 1.8708 1.9999 2.1213 2.3452
MPL 9.0192 9.0969 9.4122 9.5001 9.6000 9.6666 9.7142 9.7777
Table 5. The MSE, ABE, and bootstrap CI for MLEs.
Schema n | α: MSE ABE [bootstrap CI] | β: MSE ABE [bootstrap CI] | θ: MSE ABE [bootstrap CI]
I 25 | 0.00481 0.05600 [0.363,0.566] | 0.00604 0.05995 [0.310,0.487] | 0.00189 0.02690 [0.455,0.591]
I 50 | 0.00275 0.04192 [0.399,0.545] | 0.00285 0.04099 [0.319,0.470] | 0.00085 0.02495 [0.457,0.547]
I 100 | 0.00155 0.03100 [0.413,0.531] | 0.00132 0.02855 [0.354,0.451] | 0.00078 0.02257 [0.461,0.536]
I 200 | 0.00088 0.02215 [0.437,0.512] | 0.00078 0.02069 [0.366,0.432] | 0.00069 0.02069 [0.467,0.529]
II 25 | 0.00651 0.06331 [0.325,0.576] | 0.00785 0.06791 [0.316,0.491] | 0.00309 0.04407 [0.876,1.43]
II 50 | 0.00377 0.04699 [0.377,0.558] | 0.00405 0.04731 [0.324,0.486] | 0.00215 0.03758 [0.894,1.39]
II 100 | 0.00214 0.03464 [0.395,0.534] | 0.00205 0.03328 [0.357,0.461] | 0.00182 0.03740 [0.911,1.30]
II 200 | 0.00149 0.02702 [0.428,0.510] | 0.00111 0.02404 [0.364,0.435] | 0.00171 0.03719 [0.951,1.25]
Table 6. The MSE, ABE, and bootstrap CI for Bayes estimates under IPs.
Schema n | α: MSE ABE [bootstrap CI] | β: MSE ABE [bootstrap CI] | θ: MSE ABE [bootstrap CI]
I 25 | 0.00121 0.02702 [0.311,0.568] | 0.00222 0.03811 [0.326,0.507] | 0.00029 0.01427 [0.387,0.583]
I 50 | 0.00092 0.02570 [0.334,0.557] | 0.00191 0.03623 [0.337,0.501] | 0.00019 0.01121 [0.394,0.561]
I 100 | 0.00077 0.02243 [0.356,0.543] | 0.00140 0.03038 [0.342,0.487] | 0.00018 0.01106 [0.398,0.553]
I 200 | 0.00048 0.01760 [0.377,0.531] | 0.00079 0.02311 [0.359,0.423] | 0.00012 0.00948 [0.417,0.528]
II 25 | 0.00178 0.03197 [0.339,0.547] | 0.00220 0.03847 [0.334,0.489] | 0.00023 0.01297 [0.872,1.624]
II 50 | 0.00127 0.02933 [0.350,0.541] | 0.00217 0.03779 [0.351,0.478] | 0.00018 0.01111 [0.911,1.426]
II 100 | 0.00081 0.02334 [0.384,0.530] | 0.00135 0.03057 [0.369,0.451] | 0.00015 0.01009 [0.937,1.340]
II 200 | 0.00052 0.01878 [0.390,0.522] | 0.00099 0.02197 [0.388,0.430] | 0.00012 0.00959 [0.970,1.229]
Table 7. The MSE, ABE, and bootstrap CI for Bayes estimates under NIPs.
Schema n | α: MSE ABE [bootstrap CI] | β: MSE ABE [bootstrap CI] | θ: MSE ABE [bootstrap CI]
I 25 | 0.00245 0.03993 [0.366,0.571] | 0.00422 0.04865 [0.334,0.586] | 0.00077 0.02617 [0.327,0.560]
I 50 | 0.00191 0.03470 [0.387,0.562] | 0.00266 0.04242 [0.373,0.553] | 0.00082 0.02691 [0.335,0.551]
I 100 | 0.00186 0.03332 [0.391,0.553] | 0.00210 0.03745 [0.382,0.531] | 0.00072 0.02604 [0.361,0.531]
I 200 | 0.00159 0.03118 [0.399,0.532] | 0.00120 0.02791 [0.389,0.511] | 0.00065 0.02491 [0.380,0.510]
II 25 | 0.00232 0.0371 [0.353,0.560] | 0.00524 0.05610 [0.321,0.558] | 0.00022 0.01313 [0.901,1.141]
II 50 | 0.00197 0.03627 [0.361,0.555] | 0.00291 0.04235 [0.336,0.546] | 0.00019 0.01228 [0.903,1.100]
II 100 | 0.00180 0.03443 [0.374,0.536] | 0.00206 0.03250 [0.371,0.523] | 0.00017 0.01146 [0.924,1.090]
II 200 | 0.00117 0.02694 [0.394,0.510] | 0.00114 0.02308 [0.387,0.514] | 0.00013 0.01032 [0.949,1.087]
Table 8. The MLEs and Std-er for data set I.
Parameter → α β θ λ
Model ↓ MLE Std-er MLE Std-er MLE Std-er MLE Std-er
BNPWE 0.1387 0.0148 0.4573 0.0347 0.3870 0.1488
DWIW 0.9524 0.0931 0.3953 0.4810 0.0112 0.0166 2.8785 3.5181
DIW 0.2338 0.0381 1.2658 0.1134
DLogL 2.0210 0.1890 1.7457 0.1523
DB-II 0.6225 0.0487 2.3359 0.3772
DR 0.9306 0.0061
DBL 0.7487 0.0143
DPa 0.4151 0.0332
DBH 0.9315 0.0269
Table 9. The GOF measures test for data set I.
Expected Frequency
X OF BNPWE DWIW DIW DLogL DB-II DR DBL DBH DPa
0 32 36.1276 30.4229 28.5245 27.6305 34.1655 8.4645 19.2478 65.1759 55.6679
1 27 25.4291 26.7554 38.1383 32.81328 35.8539 22.0300 30.7382 21.5347 19.8889
2 17 17.8988 20.0874 18.3045 20.7934 17.0802 27.6341 25.5994 10.6342 10.3779
3 14 12.5986 14.4039 9.9187 12.3449 9.0893 25.2606 17.8606 6.2813 6.4241
4 8 8.8678 10.0466 6.0537 7.6045 5.5023 18.3967 11.4844 4.1105 4.3897
5 7 6.2418 6.8713 4.0199 4.9354 3.6500 11.0489 7.0553 2.8746 3.2002
6 6 4.3934 4.6288 2.8368 3.3619 2.5837 5.5663 4.2138 2.1058 2.4424
7 5 3.0924 3.0797 2.0949 2.3864 1.9183 2.3750 2.4707 1.5963 1.9288
8 5 2.1767 2.0277 1.6026 1.7534 1.4770 0.8635 1.4305 1.2423 1.564
9 1 5.1736 3.6763 10.5061 8.37632 10.6798 0.3604 1.8993 6.4444 16.1161
Total 122 122 122 122 122 122 122 122 122 122
L 250.3056 250.5345 262.3222 256.7394 263.5383 279.9239 255.5355 277.0495 279.8059
AIC 506.6112 509.0690 528.6444 517.4788 531.0766 561.8477 513.0710 556.099 561.6119
CAIC 506.8146 509.4109 528.7453 517.5796 531.1774 561.8811 513.1043 556.1323 561.6452
BIC 515.0232 520.28508 534.2525 523.0868 536.6846 564.6518 515.875 558.903 564.4159
HQIC 510.0279 513.6246 530.9222 519.7566 533.3544 562.9867 514.2099 557.2379 562.7508
χ² 2.8444 2.4099 12.3012 5.5044 14.1272 89.7303 18.5571 43.5311 64.2454
d.f 4 3 4 4 4 5 5 4 5
p.value 0.5842 0.49179 0.0152 0.2393 0.0069 ≤0.001 0.0023 ≤0.001 ≤0.001
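The information criteria in Table 9 can be recomputed from the L row, read here as the negative log-likelihood, together with the number of parameters k and the sample size n. The formulas below are the standard textbook definitions; that the article uses exactly these is an assumption, although they do reproduce the BNPWE column (k = 3, n = 122). Function and argument names are illustrative:

```python
import math


def information_criteria(neg_loglik, k, n):
    """Standard criteria from a negative log-likelihood (assumed definitions).

    neg_loglik: negative maximized log-likelihood (the L row of the table)
    k: number of estimated parameters
    n: sample size
    """
    two_l = 2.0 * neg_loglik
    aic = two_l + 2 * k
    return {
        "AIC": aic,
        "CAIC": aic + 2 * k * (k + 1) / (n - k - 1),  # small-sample corrected AIC
        "BIC": two_l + k * math.log(n),
        "HQIC": two_l + 2 * k * math.log(math.log(n)),
    }


# BNPWE on data set I: L = 250.3056, three parameters, 122 observations.
bnpwe = information_criteria(neg_loglik=250.3056, k=3, n=122)
for name, value in bnpwe.items():
    print(f"{name}: {value:.4f}")
```

Smaller values indicate a better trade-off between fit and complexity, which is how the table ranks BNPWE ahead of the competing models.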
Table 10. The MLEs and Std-er for data set II.
Parameter → β α θ
Model ↓ MLE Std-er MLE Std-er MLE Std-er
BNPWE 0.586 0.059 0.258 0.030 0.532 0.180
DIvW 0.345 0.043 1.541 0.156
DLLc 1.401 0.121 1.943 0.188
DBXII 0.519 0.051 2.358 0.366
DBHk 0.865 0.039
DBl 0.657 0.019
DRh 0.867 0.012
DIRh 0.319 0.042
DPo 0.329 0.034
Table 11. The GOF measures test for data set II.
EF
No. ECB OF BNPWE DIW DLLc DBX-II DBH DBL DR DIR DPo
0 43 48.318 41.400 41.019 43.836 68.099 32.669 15.960 38.280 64.470
1 35 28.863 41.844 38.940 39.614 21.971 39.558 36.236 51.904 20.149
2 17 17.241 15.418 17.779 15.622 10.513 24.294 34.588 15.509 9.684
3 11 10.299 7.166 8.434 7.204 5.980 12.534 20.985 6.036 5.645
4 5 6.152 3.942 4.486 3.908 3.751 5.991 8.846 2.909 3.679
5 4 3.675 2.422 2.631 2.374 2.504 2.751 2.681 1.612 2.578
6 1 2.195 1.607 1.664 1.561 1.746 1.234 0.594 0.983 1.903
7 2 1.311 1.128 1.116 1.088 1.256 0.546 0.097 0.642 1.459
8 2 1.946 5.073 3.931 4.793 4.18 0.423 0.013 2.125 10.433
Total 120 120 120 120 120 120 120 120 120 120
L 200.877 204.812 202.630 204.293 214.053 204.675 235.227 208.439 220.618
AIC 407.755 413.624 409.261 412.587 430.106 411.351 472.453 418.878 443.236
CAIC 407.962 413.727 409.363 412.689 430.139 411.384 472.487 418.912 443.270
BIC 416.117 419.199 414.836 418.162 432.893 414.138 475.241 421.665 446.024
HQIC 411.151 415.888 411.525 414.851 431.238 412.483 473.585 420.010 444.368
χ² 2.159 5.497 3.844 4.652 27.061 7.023 59.805 14.281 35.513
d.f 2 3 3 3 2 3 3 3 4
p.value 0.339 0.139 0.279 0.199 <0.001 0.071 <0.001 <0.001 <0.001
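The BNPWE p-values in Tables 9 and 11 can be reproduced from the χ² statistic and its degrees of freedom. For an even number of degrees of freedom the chi-square upper-tail probability has a closed form (a Poisson partial sum), so the sketch below needs only the standard library. It is a check under the assumption that the reported p-values are ordinary chi-square tail probabilities; the function name is illustrative, and odd d.f. would call for a numerical routine instead:

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df.

    Uses the identity P(X > x) = exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even d.f.")
    half = x / 2.0
    partial_sum = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * partial_sum


# BNPWE fits: data set I (chi-square 2.8444, d.f 4) and data set II (2.159, d.f 2).
p_set_1 = chi2_sf_even(2.8444, 4)
p_set_2 = chi2_sf_even(2.159, 2)
print(f"data set I:  p = {p_set_1:.4f}")
print(f"data set II: p = {p_set_2:.4f}")
```

Both values are well above common significance levels, in line with the tables' indication that the BNPWE model is not rejected for either data set.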
Table 12. Bayes estimates and PSE for real data sets.
Data Sets Parameters Bayes Estimates PSE
I α 0.1384 0.0125
β 0.4562 0.0197
θ 0.3874 0.0196
II α 0.2427 0.0291
β 0.5778 0.0564
θ 0.5293 0.1170
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
