Article

Comparing Estimation Methods for the Power–Pareto Distribution

by Frederico Caeiro 1,* and Mina Norouzirad 2
1 Department of Mathematics and Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology (NOVA FCT), 2829-516 Caparica, Portugal
2 Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology (NOVA FCT), 2829-516 Caparica, Portugal
* Author to whom correspondence should be addressed.
Econometrics 2024, 12(3), 20; https://doi.org/10.3390/econometrics12030020
Submission received: 14 May 2024 / Revised: 21 June 2024 / Accepted: 1 July 2024 / Published: 11 July 2024

Abstract

Non-negative distributions are important tools in various fields. Given the importance of achieving a good fit, the literature offers hundreds of different models, from the very simple to the highly flexible. In this paper, we consider the power–Pareto model, which is defined by its quantile function. This distribution has three parameters, allowing the model to take different shapes, including symmetrical and left- and right-skewed. We provide different distributional characteristics and discuss parameter estimation. In addition to the already-known Maximum Likelihood and Least Squares of the logarithm of the order statistics estimation methods, we propose several additional methods. A simulation study and an application to two datasets are conducted to illustrate the performance of the estimation methods.

1. Introduction

Univariate continuous distributions play a crucial role in modeling real-world phenomena. While well-known distributions like the normal, exponential, and Pareto distributions are commonly used, there is often a need for specialized distributions to model specific data patterns. One established practice for defining more flexible distributions is through the quantile function (QF)
$$Q(p) = \inf\{x : F(x) \geq p\}, \quad 0 < p < 1,$$
where $F(x) = P(X \leq x)$ represents the cumulative distribution function (CDF) of the random variable $X$. Thus, if $F$ is continuous and strictly increasing, $Q$ and $F$ are inverse functions of each other, and $F(Q(p)) = p$. QFs possess numerous distinct characteristics that are absent in CDFs. We highlight that new and more flexible QFs can be easily constructed from the combination of existing QFs. For example, the product of two positive QFs is still a valid QF.
Since the QF provides all valuable information about the distribution’s shape, several QF models have been proposed in the literature. The symmetric Tukey lambda distribution (Tukey 1960) and its asymmetric version, known as the generalized lambda distribution (Ramberg and Schmeiser 1972), are both defined in terms of their QF. Similarly, the quantile-based skew logistic distribution introduced by Gilchrist (2000) is also defined through its QF. More recently, Sankaran et al. (2016) introduced a new QF resulting from the sum of the QFs of the generalized Pareto and Weibull distributions.
In some cases, the density and distribution functions for distributions expressed through QFs are not available in closed form, except for specific parameter values. However, those functions can be easily computed by numerically inverting the corresponding QF. One significant advantage of these distributions is the simplicity of their QF, which facilitates the generation of random values through the use of uniform random variables and the application of inference procedures based on quantiles.
In this article, we are interested in the power–Pareto distribution introduced in Gilchrist (2000) and further studied in Hankin and Lee (2006). This is a versatile family of distributions for a non-negative random variable, such as income and wealth. This model is formed through the product of the power and Pareto QFs as
$$Q(p \mid c, \lambda_1, \lambda_2) = c\, p^{\lambda_1} (1 - p)^{-\lambda_2}, \quad 0 < p < 1, \tag{1}$$
where $c > 0$, $\min(\lambda_1, \lambda_2) \geq 0$, and $\max(\lambda_1, \lambda_2) > 0$. We write $X \sim PP(c, \lambda_1, \lambda_2)$ whenever $X$ has the QF in Equation (1). The parameter $c$ is related to the scale, while $\lambda_1$ and $\lambda_2$ control the shape of the distribution. By fixing or restricting some of this distribution’s parameters, we obtain well-known reduced versions. More precisely, if $\lambda_1 = 0$ and $\lambda_2 > 0$, then $X$ follows a Pareto (type I) distribution with the QF
$$Q(p) = c\, (1 - p)^{-\lambda_2}, \quad 0 < p < 1,$$
and if $\lambda_1 > 0$ and $\lambda_2 = 0$, $X$ has a scaled power distribution with the QF
$$Q(p) = c\, p^{\lambda_1}, \quad 0 < p < 1.$$
Furthermore, it can be observed that when $\lambda_1 = \lambda_2 > 0$, $X$ has the well-known log-logistic distribution, which is a special case of the Burr (1942) type XII and Dagum (1977) families of distributions (for further details, see Caeiro and Mateus 2024). The case $\lambda_1 = \lambda_2 = 0$ is not considered here, as it results in a degenerate distribution at $c$. In the literature, the power–Pareto model in Equation (1) is also known as the Davies distribution (Hankin and Lee 2006) or Hankin–Lee distribution (Nair and Vineshkumar 2010). Hankin and Lee (2006) proposed two inference procedures to estimate the parameters $c$, $\lambda_1$, and $\lambda_2$ in Equation (1), namely the maximum likelihood and the least squares method for the logged order statistics. Additionally, the authors compare the efficiency of those two estimation methods by comparing their variance. Since maximum likelihood estimators are often biased for small sample sizes, we argue that solely considering the variance of the estimators may not provide a comprehensive assessment of their performance and could lead to misleading conclusions. Therefore, the primary goal of this paper is to discuss a broader set of estimation techniques and consider alternative criteria for a more precise and unbiased comparison of the estimators.
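As a minimal sketch of how the QF in Equation (1) can be evaluated and used to generate random values by inversion, consider the following Python code. The function names `power_pareto_qf` and `sample_power_pareto` are illustrative choices, not part of the paper or of any package.

```python
import random

def power_pareto_qf(p, c, lam1, lam2):
    """Quantile function Q(p) = c * p**lam1 * (1 - p)**(-lam2), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if c <= 0 or min(lam1, lam2) < 0 or max(lam1, lam2) <= 0:
        raise ValueError("need c > 0, min(lam1, lam2) >= 0 and max(lam1, lam2) > 0")
    return c * p**lam1 * (1.0 - p) ** (-lam2)

def sample_power_pareto(n, c, lam1, lam2, rng=None):
    """Draw n values by inversion: X = Q(U) with U uniform on (0, 1)."""
    rng = rng or random.Random()
    draws = []
    while len(draws) < n:
        u = rng.random()  # uniform on [0, 1); the endpoint 0 is rejected
        if u > 0.0:
            draws.append(power_pareto_qf(u, c, lam1, lam2))
    return draws

# Illustrative call with arbitrary parameter values and a fixed seed.
sample = sample_power_pareto(5, c=1.0, lam1=0.4, lam2=0.4, rng=random.Random(1))
```

Setting `lam1 = 0` or `lam2 = 0` in this sketch gives the Pareto and power special cases, respectively.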
The remainder of the paper is organized as follows. In Section 2, we describe various known properties of the power–Pareto model, like probability density and distribution functions, moments, and quantile-based measures. Several inferential procedures for the parameters of the power–Pareto distribution are discussed in Section 3. In Section 4, we conduct Monte Carlo simulations to analyze the performance of the different inferential procedures. In Section 5, we apply the inferential methods to two real datasets, and Section 6 concludes the article.

2. Statistical Properties of the Power–Pareto Distribution

2.1. Functions

From now on, we use $\theta = (c, \lambda_1, \lambda_2)$ to denote the three parameters of the power–Pareto model. The derivative of $Q(p \mid \theta)$, denoted as $q(p \mid \theta) = \partial Q(p \mid \theta) / \partial p$, is known as the quantile density function. For the model in Equation (1), this function is given by
$$q(p \mid \theta) = Q(p \mid \theta) \left( \frac{\lambda_1}{p} + \frac{\lambda_2}{1 - p} \right), \quad 0 < p < 1. \tag{2}$$
Note that the quantile density function in Equation (2) satisfies the identity
$$f(Q(p \mid \theta))\, q(p \mid \theta) = 1,$$
where $f(\cdot)$ is the probability density function.
If we exclude the cases where the power–Pareto reduces to the power, the Pareto, or the log-logistic distributions, neither the distribution function nor the density function can be expressed in closed form. Thus, these functions have to be computed through numerical inversion of the QF. Suppose that $u = u(x \mid \theta)$ is the solution of the equation $x = Q(u \mid \theta)$. Then, the CDF can be expressed as $F(x \mid \theta) = u$, and the density function can be derived from the inverse function rule as
$$f(x \mid \theta) = \frac{\partial F(x \mid \theta)}{\partial x} = \left( \frac{\partial Q(u \mid \theta)}{\partial u} \right)^{-1} = \left[ Q(u \mid \theta) \left( \frac{\lambda_1}{u} + \frac{\lambda_2}{1 - u} \right) \right]^{-1}.$$
The density at the left tail can be approximated by
$$f(x \mid \theta) \approx \frac{1}{c \lambda_1} \left( \frac{x}{c} \right)^{\frac{1}{\lambda_1} - 1}. \tag{4}$$
Similarly, the right tail density can be approximated by
$$f(x \mid \theta) \approx \frac{1}{c \lambda_2} \left( \frac{c}{x} \right)^{\frac{1}{\lambda_2} + 1}. \tag{5}$$
In addition, we have
$$1 - F(x \mid \theta) \approx \left( \frac{x}{c} \right)^{-\alpha}, \tag{6}$$
for large $x$, where $\alpha = 1 / \lambda_2$ is the upper tail index (Finkelstein et al. 2006; Schluter 2018). Hence, the power–Pareto model belongs to the class of heavy-tailed distributions. In numerous applications, it is crucial to estimate accurately the tail index $\alpha$ in Equation (6). We refer the reader to Beirlant et al. (2012, 2004); Mehta and Yang (2022); Ndlovu and Chikobvu (2023); Reiss and Thomas (2007), among others. As noted in Hankin and Lee (2006), Equations (4) and (5) show that $\lambda_1$ controls the behavior of the left-hand tail, while $\lambda_2$ governs the right-hand tail. A larger value of $\lambda_1$ results in a shorter left tail, whereas a larger value of $\lambda_2$ leads to a longer right tail. This relationship is illustrated in Figure 1, where different parameter values are used to depict the probability density function.
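The numerical inversion described in this subsection can be sketched as follows. Bisection is used here only because it is simple and needs no derivatives; it is an assumption of this example, not necessarily the root finder used by the authors, and the names `qf`, `cdf`, and `pdf` are illustrative.

```python
def qf(p, c, lam1, lam2):
    """Power-Pareto quantile function of Equation (1)."""
    return c * p**lam1 * (1.0 - p) ** (-lam2)

def cdf(x, c, lam1, lam2, tol=1e-13):
    """F(x) = u, where u solves x = Q(u); Q is increasing in u on (0, 1)."""
    # Values outside the support (which depends on the special case).
    if x <= 0.0 or (lam1 == 0.0 and x <= c):
        return 0.0
    if lam2 == 0.0 and x >= c:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if qf(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pdf(x, c, lam1, lam2):
    """Density from the inverse function rule: f(x) = 1 / q(u), with u = F(x)."""
    u = cdf(x, c, lam1, lam2)
    if not 0.0 < u < 1.0:
        return 0.0
    return 1.0 / (qf(u, c, lam1, lam2) * (lam1 / u + lam2 / (1.0 - u)))
```

In the log-logistic case $\lambda_1 = \lambda_2 = \lambda$, the result can be checked against the closed form $F(x) = 1 / (1 + (x/c)^{-1/\lambda})$.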

2.2. Moments

The $k$-th moment can be expressed in an explicit form as follows:
$$E(X^k) = \int_0^1 \left( Q(p \mid \theta) \right)^k dp = c^k B(1 + k \lambda_1, 1 - k \lambda_2), \quad \lambda_2 < \frac{1}{k}, \tag{7}$$
where $B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1} dx$, with $a > 0$ and $b > 0$, represents the Beta function. Using the notation $b(k, \lambda_1, \lambda_2) = B(1 + k \lambda_1, 1 - k \lambda_2)$, the mean ($\mu$) and the variance ($\sigma^2$) are
$$\mu = c\, b(1, \lambda_1, \lambda_2), \qquad \sigma^2 = c^2 \left[ b(2, \lambda_1, \lambda_2) - b^2(1, \lambda_1, \lambda_2) \right],$$
and exist if $\lambda_2 < 1$ and $\lambda_2 < \frac{1}{2}$, respectively. Some other measures, like the coefficient of variation (CV), Pearson’s skewness ($S_p$), and kurtosis ($K_p$), can also be easily obtained in explicit forms. Writing $b_k = b(k, \lambda_1, \lambda_2)$ for brevity,
$$\mathrm{CV} = \frac{\sqrt{b_2 - b_1^2}}{b_1}, \quad \lambda_2 < \frac{1}{2},$$
$$S_p = \frac{b_3 - 3 b_1 b_2 + 2 b_1^3}{\left( b_2 - b_1^2 \right)^{3/2}}, \quad \lambda_2 < \frac{1}{3},$$
$$K_p = \frac{b_4 - 4 b_1 b_3 + 6 b_1^2 b_2 - 3 b_1^4}{\left( b_2 - b_1^2 \right)^{2}}, \quad \lambda_2 < \frac{1}{4}.$$
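A short sketch of Equation (7) and the derived mean, variance, and CV is given below. The Beta function is computed through `math.lgamma`, and the function names are hypothetical helpers introduced for this example.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def raw_moment(k, c, lam1, lam2):
    """E(X^k) = c^k * B(1 + k*lam1, 1 - k*lam2); finite only if lam2 < 1/k."""
    if k * lam2 >= 1.0:
        raise ValueError("moment of order k does not exist when lam2 >= 1/k")
    return c**k * beta_fn(1.0 + k * lam1, 1.0 - k * lam2)

def mean_var_cv(c, lam1, lam2):
    """Mean, variance, and coefficient of variation (requires lam2 < 1/2)."""
    m1 = raw_moment(1, c, lam1, lam2)
    m2 = raw_moment(2, c, lam1, lam2)
    var = m2 - m1**2
    return m1, var, math.sqrt(var) / m1
```

As a sanity check, $c = 2$, $\lambda_1 = 1$, $\lambda_2 = 0$ corresponds to a uniform distribution on $(0, 2)$, whose mean is 1 and variance is $1/3$.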

2.3. Quantile Measures

Quantile-based measures of distributional characteristics, including location, dispersion, skewness, and kurtosis, exhibit less sensitivity to outliers when compared to conventional moments. For the power–Pareto distribution, the median ($M$) and the interquartile range (IQR) are, respectively, given by
$$M = Q(1/2 \mid \theta) = c\, 2^{\lambda_2 - \lambda_1}, \qquad \mathrm{IQR} = Q(3/4 \mid \theta) - Q(1/4 \mid \theta) = c\, 4^{\lambda_2 - \lambda_1} \left( 3^{\lambda_1} - 3^{-\lambda_2} \right).$$
The asymmetry and peakedness of the distribution can be analyzed using the Bowley (1901) skewness ($S_B$) and Moors (1988) kurtosis ($K_M$) quantile-based coefficients,
$$S_B = \frac{Q(3/4 \mid \theta) - 2 Q(1/2 \mid \theta) + Q(1/4 \mid \theta)}{\mathrm{IQR}} = \frac{3^{\lambda_1} - 2^{1 + \lambda_1 - \lambda_2} + 3^{-\lambda_2}}{3^{\lambda_1} - 3^{-\lambda_2}},$$
and
$$K_M = \frac{Q(7/8 \mid \theta) - Q(5/8 \mid \theta) + Q(3/8 \mid \theta) - Q(1/8 \mid \theta)}{\mathrm{IQR}} = \frac{2^{\lambda_2 - \lambda_1} \left( 7^{\lambda_1} - 5^{\lambda_1} 3^{-\lambda_2} + 3^{\lambda_1} 5^{-\lambda_2} - 7^{-\lambda_2} \right)}{3^{\lambda_1} - 3^{-\lambda_2}}.$$
Unlike the moments, all the aforementioned quantile-based measures exist over the complete parameter space, which makes them more widely applicable and more robust.
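The closed forms above can be cross-checked against a direct evaluation of the QF. The sketch below computes the four measures both ways; `quantile_measures` and `closed_forms` are names invented for this illustration.

```python
def qf(p, c, lam1, lam2):
    """Power-Pareto quantile function of Equation (1)."""
    return c * p**lam1 * (1.0 - p) ** (-lam2)

def quantile_measures(c, lam1, lam2):
    """Median, IQR, Bowley skewness, and Moors kurtosis from the QF itself."""
    q = lambda p: qf(p, c, lam1, lam2)
    median = q(0.5)
    iqr = q(0.75) - q(0.25)
    bowley = (q(0.75) - 2.0 * q(0.5) + q(0.25)) / iqr
    moors = (q(0.875) - q(0.625) + q(0.375) - q(0.125)) / iqr
    return median, iqr, bowley, moors

def closed_forms(c, lam1, lam2):
    """The same four measures from the explicit expressions in the text."""
    d = 3.0**lam1 - 3.0 ** (-lam2)
    median = c * 2.0 ** (lam2 - lam1)
    iqr = c * 4.0 ** (lam2 - lam1) * d
    bowley = (3.0**lam1 - 2.0 ** (1.0 + lam1 - lam2) + 3.0 ** (-lam2)) / d
    moors = 2.0 ** (lam2 - lam1) * (
        7.0**lam1 - 5.0**lam1 * 3.0 ** (-lam2)
        + 3.0**lam1 * 5.0 ** (-lam2) - 7.0 ** (-lam2)
    ) / d
    return median, iqr, bowley, moors
```

Note that the scale parameter $c$ cancels in both $S_B$ and $K_M$, so these two coefficients depend only on $\lambda_1$ and $\lambda_2$.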

2.4. Order Statistics

Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a population with the QF defined in Equation (1), and let $X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}$ be the corresponding ascending order statistics. Order statistics play a crucial role in statistical inference due to their ability to provide valuable insights into the distribution of $X$, as well as in estimation procedures for parameters of the model. The density function of $X_{(i)}$ is
$$f_{(i)}(x) = \frac{1}{B(i, n - i + 1)} \left( F(x) \right)^{i - 1} \left( 1 - F(x) \right)^{n - i} f(x).$$
Note that $f_{(i)}(x)$ does not have a closed form, since neither the CDF nor the density function can be expressed in closed form. However, the single moments of the order statistics, $\mu_{(i)} = E(X_{(i)})$, can be easily obtained from the corresponding QF in Equation (1). For the class of distributions in Equation (1), $\mu_{(i)}$ can be expressed as follows:
$$\mu_{(i)} = \frac{1}{B(i, n - i + 1)} \int_0^1 Q(p \mid \theta)\, p^{i - 1} (1 - p)^{n - i} dp = c\, \frac{B(i + \lambda_1, n - i + 1 - \lambda_2)}{B(i, n - i + 1)}, \quad n - i + 1 - \lambda_2 > 0. \tag{8}$$
Thus, as explicit formulas for moments of order statistics exist, several mathematical quantities associated with order statistics can be derived from Equation (8).
Additional properties can be found in Giorgi and Nadarajah (2010); Nair et al. (2013); Sunoj and Sankaran (2012).
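Equation (8) is straightforward to evaluate numerically. The following sketch works on the log scale to avoid overflow in the Beta functions for larger $n$; the function names are assumptions made for this example.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def expected_order_stat(i, n, c, lam1, lam2):
    """mu_(i) = c * B(i + lam1, n - i + 1 - lam2) / B(i, n - i + 1)."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    if n - i + 1 - lam2 <= 0:
        raise ValueError("mu_(i) does not exist unless n - i + 1 - lam2 > 0")
    return c * math.exp(
        log_beta(i + lam1, n - i + 1 - lam2) - log_beta(i, n - i + 1)
    )
```

For $c = 1$, $\lambda_1 = 1$, $\lambda_2 = 0$ (the standard uniform case), the formula reduces to the familiar $\mu_{(i)} = i / (n + 1)$.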

3. Estimation Methods for the Power–Pareto Distribution

In this section, we discuss the parameter estimation methods employed in this paper. For the estimation of the parameters of the aforementioned reduced versions of the power–Pareto model, we refer to Bhatti et al. (2018); Caeiro et al. (2015); Caeiro and Mateus (2023); Lu and Tao (2007); Mateus and Caeiro (2022); Rytgaard (1990); Shakeel et al. (2016); Zaka et al. (2013). Concerning the three-parameter power–Pareto model in Equation (1), Hankin and Lee (2006) proposed the estimation of the parameters by two methods: maximum likelihood and quantile least squares. The variance–covariance matrix of the estimators from those two methods is also provided in Hankin and Lee (2006). The maximum likelihood estimators possess desirable asymptotic properties. However, in the case of small samples, this method may exhibit lower efficiency when compared to other estimation methods. Therefore, in this paper, we consider not only the estimation methods in Hankin and Lee (2006), but also new estimation methods. In the following, let $x_1, x_2, \ldots, x_n$ represent a sample of size $n$ from the power–Pareto distribution with all three parameters assumed unknown.

3.1. Maximum Likelihood (ML)

The maximum likelihood (ML) estimators of the three parameters are obtained by solving an optimization problem, which involves maximizing the likelihood function, or equivalently, minimizing the negative log-likelihood function. This can be expressed as follows:
$$\hat{\theta}_{\mathrm{ML}} = \underset{\theta}{\operatorname{argmin}} \sum_{i=1}^{n} \left[ \log\left( Q(u_i \mid \theta) \right) + \log\left( \frac{\lambda_1}{u_i} + \frac{\lambda_2}{1 - u_i} \right) \right], \tag{9}$$
where $u_i$ represents the solution of the equation $x_i = Q(u_i \mid \theta)$. Here, $\hat{\theta}_{\mathrm{ML}} = (\hat{c}_{\mathrm{ML}}, \hat{\lambda}_{1,\mathrm{ML}}, \hat{\lambda}_{2,\mathrm{ML}})$ denotes the ML estimate of $\theta = (c, \lambda_1, \lambda_2)$.
While the ML method provides estimators that are asymptotically unbiased and efficient for large sample sizes, the lack of a closed-form expression for the probability density function requires $\hat{\theta}_{\mathrm{ML}}$ to be obtained through a three-dimensional numerical search. This makes the ML method computationally intensive, and convergence of the negative log-likelihood to the global minimum can be sensitive to the initial values. Thus, this estimation method for the parameters of the power–Pareto can be computationally complex and challenging, especially for large datasets. Additionally, the ML method can be impacted by model misspecification. Therefore, it is crucial to consider alternative methods, potentially with closed-form expressions for the estimators.
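The objective in Equation (9) can be sketched as below. This is only the negative log-likelihood, evaluated with a bisection solve for each $u_i$; the three-dimensional minimization itself (performed in the paper with the Davies package in R) is left to whichever optimizer the reader prefers. The names `solve_u` and `neg_log_lik` are hypothetical, and the code assumes every observation lies inside the support implied by $\theta$.

```python
import math

def qf(p, c, lam1, lam2):
    """Power-Pareto quantile function of Equation (1)."""
    return c * p**lam1 * (1.0 - p) ** (-lam2)

def solve_u(x, c, lam1, lam2, tol=1e-13):
    """Solve x = Q(u) for u in (0, 1) by bisection (x assumed in the support)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if qf(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def neg_log_lik(theta, data):
    """Sum over the sample of log Q(u_i) + log(lam1/u_i + lam2/(1 - u_i))."""
    c, lam1, lam2 = theta
    total = 0.0
    for x in data:
        u = solve_u(x, c, lam1, lam2)
        total += math.log(qf(u, c, lam1, lam2))
        total += math.log(lam1 / u + lam2 / (1.0 - u))
    return total
```

Each evaluation of the objective costs one root-finding step per observation, which illustrates why the ML method is slower than the closed-form alternatives.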

3.2. Log Quantile Least Squares (LQLS)

Hankin and Lee (2006) proposed a regression method for estimating the parameters of the power–Pareto distribution using order statistics. To achieve a simple linear relation involving the parameters, a log transformation is applied, yielding the sum of squares
$$\sum_{i=1}^{n} \left( \log x_{(i)} - E\left[ \log X_{(i)} \right] \right)^2 \tag{10}$$
that needs to be minimized with respect to the parameter vector $\theta$. Since $X$ is continuous, the inverse probability integral transform guarantees $X \overset{d}{=} Q(U \mid \theta)$, where $U$ denotes a uniform random variable on the interval $(0, 1)$. Consequently,
$$X_{(i)} \overset{d}{=} Q(U_{(i)} \mid \theta), \quad i = 1, \ldots, n, \tag{11}$$
where $U_{(i)}$ denotes the $i$th order statistic from a sample of size $n$ from a uniform distribution on $(0, 1)$. Note that $U_{(i)}$ has a Beta distribution with parameters $i$ and $n - i + 1$. Using Equation (11), we have
$$\log(X_{(i)}) \overset{d}{=} \lambda_0 + \lambda_1 \log(U_{(i)}) - \lambda_2 \log(1 - U_{(i)}), \quad i = 1, \ldots, n,$$
with $\lambda_0 = \log(c)$. The expectations of the two random terms are
$$E\left[ \log U_{(i)} \right] = \psi(i) - \psi(n + 1) = -\sum_{k=i}^{n} \frac{1}{k}, \qquad E\left[ \log(1 - U_{(i)}) \right] = \psi(n - i + 1) - \psi(n + 1) = -\sum_{k=n-i+1}^{n} \frac{1}{k},$$
where $\psi$ is the digamma function, the derivative of the log gamma function. For integer $n$,
$$\psi(n) = -\gamma + \sum_{i=1}^{n-1} \frac{1}{i},$$
where $\gamma$ is Euler’s constant. Then, by introducing the notation $\lambda = (\lambda_0, \lambda_1, \lambda_2)^\top$, Equation (10) can be expressed in matrix form as
$$S(\lambda) = (Y - X \lambda)^\top (Y - X \lambda),$$
where $Y$ is a column matrix with the logarithm of the order statistics from the sample, $\log x_{(i)}$, and $X$ is an $n \times 3$ matrix where the $i$th row is given by $(1, -a_i, a_{n-i+1})$, with $a_i = \sum_{k=i}^{n} \frac{1}{k}$. Applying the least squares method, the parameter vector is estimated by
$$\hat{\lambda}_{\mathrm{LQLS}} = (X^\top X)^{-1} X^\top Y.$$
Consequently,
$$\hat{\theta}_{\mathrm{LQLS}} = \left( \exp(\hat{\lambda}_{0,\mathrm{LQLS}}), \hat{\lambda}_{1,\mathrm{LQLS}}, \hat{\lambda}_{2,\mathrm{LQLS}} \right).$$
The LQLS method offers several advantages. Firstly, it is more robust against outliers, as the logarithmic transformation reduces the influence of those values. Secondly, unlike the ML method, estimates are based on the order statistics and require straightforward calculations, leading to computational efficiency. However, the LQLS method may exhibit lower efficiency when compared to the ML method and can be sensitive to small sample sizes.
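Because the LQLS estimator is in closed form, it can be sketched in a few lines. The example below assumes the sign convention $a_i = \sum_{k=i}^{n} 1/k$ with design rows $(1, -a_i, a_{n-i+1})$, and solves the $3 \times 3$ normal equations with a small hand-written Gaussian elimination so that no external library is needed. All function names are illustrative.

```python
import math

def harmonic_tail(n):
    """Return a list a with a[i] = sum_{k=i}^{n} 1/k for i = 1..n (a[0] unused)."""
    a = [0.0] * (n + 2)
    for i in range(n, 0, -1):
        a[i] = a[i + 1] + 1.0 / i
    return a

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for k in range(col, m + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        x[r] = (M[r][m] - sum(M[r][k] * x[k] for k in range(r + 1, m))) / M[r][r]
    return x

def lqls(data):
    """Least squares fit of log order statistics on their expected values."""
    xs = sorted(data)
    n = len(xs)
    a = harmonic_tail(n)
    rows = [(1.0, -a[i], a[n - i + 1]) for i in range(1, n + 1)]
    y = [math.log(x) for x in xs]
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(3)]
    lam0, lam1, lam2 = solve_linear(xtx, xty)
    return math.exp(lam0), lam1, lam2
```

Nothing in this sketch enforces $\hat{\lambda}_1, \hat{\lambda}_2 \geq 0$, so estimates outside the parameter space are possible in small samples and would need separate handling.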

3.3. Percentile (P)

Percentile points were first used for the determination of parameters of the Weibull model (Kao 1959). This method is nowadays popular due to its simplicity. Estimators are found from the relation, through the CDF or the QF, between probabilities and percentile values. To estimate the parameters, one must consider as many percentiles as there are parameters. Therefore, given three distinct cumulative probability levels $p_1$, $p_2$, and $p_3$ ($0 < p_1 < p_2 < p_3 < 1$), the corresponding $100 p_i\%$ percentiles, $i = 1, 2, 3$, are the values $q_1$, $q_2$, and $q_3$ such that
$$F(q_i \mid \theta) = p_i \iff q_i = Q(p_i \mid \theta), \quad i = 1, 2, 3,$$
with $Q$ the QF in Equation (1). Next, applying a log transformation to the ratio between two consecutive percentiles, we obtain
$$\log\frac{q_2}{q_1} = \lambda_1 \log\frac{p_2}{p_1} + \lambda_2 \log\frac{1 - p_1}{1 - p_2},$$
and
$$\log\frac{q_3}{q_2} = \lambda_1 \log\frac{p_3}{p_2} + \lambda_2 \log\frac{1 - p_2}{1 - p_3}.$$
Solving the above two equations for $\lambda_1$ and $\lambda_2$, we obtain
$$\lambda_1 = \frac{\log\frac{1 - p_2}{1 - p_3} \log\frac{q_2}{q_1} - \log\frac{1 - p_1}{1 - p_2} \log\frac{q_3}{q_2}}{\log\frac{p_2}{p_1} \log\frac{1 - p_2}{1 - p_3} - \log\frac{p_3}{p_2} \log\frac{1 - p_1}{1 - p_2}}, \tag{13}$$
and
$$\lambda_2 = \frac{-\log\frac{p_3}{p_2} \log\frac{q_2}{q_1} + \log\frac{p_2}{p_1} \log\frac{q_3}{q_2}}{\log\frac{p_2}{p_1} \log\frac{1 - p_2}{1 - p_3} - \log\frac{p_3}{p_2} \log\frac{1 - p_1}{1 - p_2}}. \tag{14}$$
Next, we use the following equation for the second percentile:
$$q_2 = c\, p_2^{\lambda_1} (1 - p_2)^{-\lambda_2} \iff c = q_2\, p_2^{-\lambda_1} (1 - p_2)^{\lambda_2}. \tag{15}$$
The estimators are obtained by replacing, in Equations (13)–(15), the percentiles $q_i$ by the corresponding sample percentiles. A possible choice for the probabilities is $(p_1, p_2, p_3) = (0.1, 0.5, 0.9)$. Alternatively, let $I$ be a set of three distinct values from the first $n$ positive integers, $\{1, 2, \ldots, n\}$, where $n$ denotes the sample size. Another possible choice of percentiles is then $q_i = x_{(i)}$, $i \in I$, associated with the cumulative probabilities $p_i = (i - a) / (n + b)$, where $a$ and $b$ are real constants. A popular choice of the constants is $a = 0$ and $b = 1$.
The P method offers simplicity in computation and robustness against outliers. This makes it straightforward to implement and suitable for exploratory analysis and initial estimation, providing a quick and effective way to estimate parameters. However, it may be less efficient and less accurate compared to other methods.
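Equations (13)–(15) translate directly into code. The sketch below takes three percentiles and their probability levels as input; how the sample percentiles are obtained is left open, and the function name `percentile_estimates` is an assumption of this example.

```python
import math

def percentile_estimates(q, p=(0.1, 0.5, 0.9)):
    """Closed-form percentile estimates (c, lam1, lam2) from three percentiles.

    q: the three (sample) percentiles q1 < q2 < q3.
    p: the matching cumulative probabilities p1 < p2 < p3.
    """
    q1, q2, q3 = q
    p1, p2, p3 = p
    a1 = math.log(p2 / p1)
    a2 = math.log(p3 / p2)
    b1 = math.log((1.0 - p1) / (1.0 - p2))
    b2 = math.log((1.0 - p2) / (1.0 - p3))
    c1 = math.log(q2 / q1)
    c2 = math.log(q3 / q2)
    det = a1 * b2 - a2 * b1                 # common denominator of (13) and (14)
    lam1 = (b2 * c1 - b1 * c2) / det        # Equation (13)
    lam2 = (a1 * c2 - a2 * c1) / det        # Equation (14)
    c = q2 * p2 ** (-lam1) * (1.0 - p2) ** lam2  # Equation (15)
    return c, lam1, lam2
```

When the inputs are exact population percentiles, the true parameters are recovered exactly, which gives a convenient correctness check.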

3.4. Least Squares (LS) and Weighted Least Squares (WLS)

Here we consider the difference between the empirical and the theoretical CDF. Then, the least squares (LS) estimator of $\theta$, denoted by $\hat{\theta}_{\mathrm{LS}} = (\hat{c}_{\mathrm{LS}}, \hat{\lambda}_{1,\mathrm{LS}}, \hat{\lambda}_{2,\mathrm{LS}})$, can be obtained as
$$\hat{\theta}_{\mathrm{LS}} = \underset{\theta}{\operatorname{argmin}} \sum_{i=1}^{n} \left( F(x_{(i)} \mid \theta) - \frac{i}{n + 1} \right)^2. \tag{16}$$
Furthermore, the weighted least squares (WLS) estimator, denoted by $\hat{\theta}_{\mathrm{WLS}} = (\hat{c}_{\mathrm{WLS}}, \hat{\lambda}_{1,\mathrm{WLS}}, \hat{\lambda}_{2,\mathrm{WLS}})$, can be determined by
$$\hat{\theta}_{\mathrm{WLS}} = \underset{\theta}{\operatorname{argmin}} \sum_{i=1}^{n} \frac{(n + 1)^2 (n + 2)}{i (n - i + 1)} \left( F(x_{(i)} \mid \theta) - \frac{i}{n + 1} \right)^2. \tag{17}$$
The LS method involves minimizing the squared difference between the empirical and theoretical CDFs. This method is straightforward to implement and interpret, making it accessible for various applications. However, LS assumes homoscedasticity, which is not valid, since the variance of $F(X_{(i)} \mid \theta)$ depends on the index $i$. This violation does not affect the bias of the estimators, but may increase their variance. On the other hand, the WLS weights are the reciprocals of the variance of $F(X_{(i)} \mid \theta)$, namely $i (n - i + 1) / [(n + 1)^2 (n + 2)]$. The weighting scheme therefore addresses heteroscedasticity by assigning smaller weights to observations that are closer to the center of the sample, where this variance is largest, and larger weights to observations that are closer to the edges of the sample. Additionally, both the LS and WLS methods are computationally intensive, since both depend on the CDF, which needs to be computed numerically.
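The two objective functions in Equations (16) and (17) can be sketched as follows, again using bisection to evaluate the CDF. The minimization step is omitted (the paper uses R's `optim`); `wls_weight` and `ls_objective` are hypothetical names, and the CDF helper assumes both shape parameters are positive so that the support is $(0, \infty)$.

```python
def qf(p, c, lam1, lam2):
    """Power-Pareto quantile function of Equation (1)."""
    return c * p**lam1 * (1.0 - p) ** (-lam2)

def cdf(x, c, lam1, lam2, tol=1e-13):
    """F(x) by bisection on x = Q(u); assumes lam1 > 0 and lam2 > 0."""
    if x <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if qf(mid, c, lam1, lam2) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def wls_weight(i, n):
    """Reciprocal of the variance of the i-th uniform order statistic."""
    return (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))

def ls_objective(theta, data, weighted=False):
    """Sum of (weighted) squared gaps between F(x_(i)) and i / (n + 1)."""
    c, lam1, lam2 = theta
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        w = wls_weight(i, n) if weighted else 1.0
        total += w * (cdf(x, c, lam1, lam2) - i / (n + 1)) ** 2
    return total
```

The weights are symmetric in $i$ and $n - i + 1$ and are largest at $i = 1$ and $i = n$, so the extreme order statistics contribute most per unit of squared error.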

3.5. Quantile Least Squares (QLS)

The quantile least squares (QLS) estimator of the distribution parameters, denoted by $\hat{\theta}_{\mathrm{QLS}} = (\hat{c}_{\mathrm{QLS}}, \hat{\lambda}_{1,\mathrm{QLS}}, \hat{\lambda}_{2,\mathrm{QLS}})$, can be derived by
$$\hat{\theta}_{\mathrm{QLS}} = \underset{\theta}{\operatorname{argmin}} \sum_{i=1}^{n} \left( x_{(i)} - \mu_{(i)} \right)^2, \tag{18}$$
with $\mu_{(i)}$ defined in Equation (8).
The QLS estimator minimizes the squared difference between the order statistics and their expected value, which can be easily obtained from Equation (8). A limitation of this method is that $\mu_{(n)}$ only exists if $\lambda_2 < 1$; therefore, the QLS should only be considered if $\lambda_2$ is a small positive value. Furthermore, the accuracy of parameter estimates can be affected by the presence of large outliers.
A weighted version of this method was not considered because it would further restrict its domain of validity.
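A sketch of the QLS objective in Equation (18) is shown below. Returning infinity when $\lambda_2 \geq 1$ is one simple way to keep an optimizer inside the region where $\mu_{(n)}$ exists; this guard is a choice made for the example, not something described in the paper, and the function names are illustrative.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def expected_order_stat(i, n, c, lam1, lam2):
    """mu_(i) from Equation (8); requires n - i + 1 - lam2 > 0."""
    return c * math.exp(
        log_beta(i + lam1, n - i + 1 - lam2) - log_beta(i, n - i + 1)
    )

def qls_objective(theta, data):
    """Sum of squared gaps between order statistics and their expectations."""
    c, lam1, lam2 = theta
    if lam2 >= 1.0:
        return math.inf  # mu_(n) does not exist, so the objective is undefined
    xs = sorted(data)
    n = len(xs)
    return sum(
        (x - expected_order_stat(i, n, c, lam1, lam2)) ** 2
        for i, x in enumerate(xs, start=1)
    )
```

Because the largest order statistic enters on the original (not log) scale, a single extreme observation can dominate this sum, which is consistent with the sensitivity to large outliers noted above.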

4. Comparison of the Estimation Methods by Monte Carlo Simulation

In this section, a Monte Carlo simulation study is carried out to compare the performance of the proposed P, LS, WLS, and QLS estimation methods, and to compare them with the ML and LQLS methods proposed by Hankin and Lee (2006). The Davies package was used for the ML method. Parameter estimation with the LS, WLS, and QLS methods was performed with the optimization function optim of the R software, version 4.0.0, using the starting values provided by the davies.start function in the Davies package. The power–Pareto distribution was used to generate $r = 1000$ samples with sizes $n = 10$, 20, 50, 75, and 100. Sample values are generated using the inversion method. In the simulation study, the following parameter combinations were considered:
  • Case 1: $(c, \lambda_1, \lambda_2) = (1, 0.1, 0.1)$;
  • Case 2: $(c, \lambda_1, \lambda_2) = (1, 0.1, 0.4)$;
  • Case 3: $(c, \lambda_1, \lambda_2) = (1, 0.4, 0.4)$;
  • Case 4: $(c, \lambda_1, \lambda_2) = (1, 0.4, 0.9)$;
  • Case 5: $(c, \lambda_1, \lambda_2) = (1, 0.9, 0.4)$.
All the parameter combinations provide a power–Pareto distribution with finite mean value and different levels of positive skewness and kurtosis. Both measures increase with respect to $\lambda_2$ and decrease with respect to $\lambda_1$. The corresponding densities, for all five cases, are presented in Figure 2.
For each of the three parameters in $\theta$, denoted generically by $\theta$ for simplicity, we computed the simulated average bias (ABias), median bias (MBias), and root mean squared error (RMSE) of the corresponding estimator $\hat{\theta}$. The statistics are defined by
$$\mathrm{ABias}(\hat{\theta}) = \frac{1}{r} \sum_{i=1}^{r} (\hat{\theta}_i - \theta), \qquad \mathrm{MBias}(\hat{\theta}) = \operatorname{median}(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_r) - \theta, \qquad \mathrm{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{r} \sum_{i=1}^{r} (\hat{\theta}_i - \theta)^2},$$
where $\hat{\theta}_i$ is the estimate of $\theta$ computed using the $i$th sample.
As a global criterion of comparison, we also computed the average absolute difference between the true and the estimated CDFs,
$$D_{\mathrm{abs}} = \frac{1}{r} \sum_{i=1}^{r} \frac{1}{n} \sum_{j=1}^{n} \left| F(x_{ij} \mid \theta) - F(x_{ij} \mid \hat{\theta}) \right|$$
and the average of the maximum absolute difference between the true and estimated CDFs,
$$D_{\mathrm{max}} = \frac{1}{r} \sum_{i=1}^{r} \max_{1 \leq j \leq n} \left| F(x_{ij} \mid \theta) - F(x_{ij} \mid \hat{\theta}) \right|,$$
where $x_{ij}$ represents the $j$th observation in the $i$th sample. The smaller the values of $D_{\mathrm{abs}}$ and $D_{\mathrm{max}}$, the better the fit to the data.
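The comparison criteria defined in this section can be computed with a few helper functions. The sketch below is generic: it takes a list of estimates (or samples and CDF callables) and makes no assumption about which estimation method produced them. All names are illustrative.

```python
import math
import statistics

def abias(estimates, true_value):
    """Average bias over the Monte Carlo replicates."""
    return sum(e - true_value for e in estimates) / len(estimates)

def mbias(estimates, true_value):
    """Median bias over the Monte Carlo replicates."""
    return statistics.median(estimates) - true_value

def rmse(estimates, true_value):
    """Root mean squared error over the Monte Carlo replicates."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

def cdf_distances(samples, cdf_true, cdf_fits):
    """Return (D_abs, D_max); cdf_fits[i] is the CDF fitted to samples[i]."""
    d_abs = 0.0
    d_max = 0.0
    for sample, cdf_fit in zip(samples, cdf_fits):
        gaps = [abs(cdf_true(x) - cdf_fit(x)) for x in sample]
        d_abs += sum(gaps) / len(gaps)
        d_max += max(gaps)
    r = len(samples)
    return d_abs / r, d_max / r
```

Reporting both ABias and MBias is useful here because a few extreme estimates in small samples can move the mean far more than the median.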
The ABias, MBias, and RMSE are presented in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7, while the related Table A1, Table A2, Table A3, Table A4 and Table A5, with the corresponding values, are given in Appendix A. It is important to note that it was impossible to obtain estimates provided by the QLS method for a few samples. This was due to the non-convergence of the optimization method used to solve Equation (18). The number of cases where convergence was achieved is indicated beneath each table. This issue is not critical, as the QLS method generally demonstrates the poorest performance. Thus, we do not advise its use.
Moreover, since only small to moderate sample sizes were considered, the simulated mean and median biases do not always decrease steadily toward zero, which is likely due to sampling error. However, it is evident that if $n = 75$ or $n = 100$, the simulated bias is usually closer to zero than if $n = 10$ or $n = 20$. In almost all cases, the RMSE of the estimators of the parameters $c$, $\lambda_1$, and $\lambda_2$ decreases toward zero when the sample size increases.
Based on the RMSE values in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7, it is evident that the performance of the estimation methods depends on the values of $\lambda_1$ and $\lambda_2$, and also on the sample size ($n$). Regarding the RMSE, we also have the following additional comments:
  • For small sample sizes, such as $n = 10$, the P method generally demonstrates the highest efficiency. However, it is not a recommended method for larger sample sizes.
  • The WLS method consistently outperforms the LS estimator in estimating each of the three parameters.
  • The LQLS method always has a good performance for samples of size $n \geq 20$. The WLS has a similar performance to the LQLS method if $\lambda_1 \neq \lambda_2$. If $\lambda_1 = \lambda_2$, LQLS and WLS methods have a similar performance for $n \geq 50$.
  • The ML method shows strong performance when $\lambda_1 = 0.1$ and $\lambda_2 \leq 0.4$ and when the sample size is equal to or larger than 100. Thus, we do not recommend its use for samples of size smaller than $n = 100$.
Table 1 and Table 2 provide a comparative analysis of Monte Carlo simulated mean absolute difference and mean maximum absolute difference between true and estimated CDFs. The best values are highlighted in bold. The insights derived from the analysis of these tables can be summarized as follows:
  • The performance rankings across different methods are consistent between the two tables.
  • The WLS method demonstrates a very good performance, typically yielding the smallest or second smallest values of $D_{\mathrm{abs}}$ and $D_{\mathrm{max}}$.
  • The LS method consistently performs slightly worse than WLS, and LQLS shows similar performance to WLS when $n \geq 50$, except when $\lambda_1 = 0.1$ and $\lambda_2 = 0.4$. The ML method is never the best performer, but it shows good performance if $n \geq 50$ and $\lambda_1 < \lambda_2$ or $\lambda_1 = \lambda_2 = 0.4$.
  • The remaining methods exhibit poor performance. Both P and QLS methods provide generally the largest absolute differences. The exception is the QLS method, for small sample sizes and $\lambda_1 = \lambda_2 = 0.1$.

5. Application

In this section, we use two real datasets to illustrate the behavior of the estimators described in Section 3. To compare the fitted power–Pareto models, we computed the Kolmogorov–Smirnov (K-S) statistic and associated p-value for each method. Since parameters are estimated, the p-value of the K-S test is obtained using Monte Carlo simulation. To measure the goodness-of-fit, we also computed the empirical correlation coefficient $r_Q$ between the empirical quantiles $x_{(i)}$ and the corresponding estimated quantiles $q_i = Q(p_i \mid \hat{\theta})$, $i = 1, 2, \ldots, n$, where $p_i$ is the cumulative probability associated with $x_{(i)}$ (Beirlant et al. 2004). Since both vectors have monotonically increasing values, $r_Q$ will be non-negative.
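The two goodness-of-fit summaries mentioned above can be sketched as follows. The K-S statistic is computed for any fitted CDF passed in as a callable, and $r_Q$ is the ordinary Pearson correlation between observed and fitted quantiles. The Monte Carlo p-value step, which requires re-estimating the parameters on each simulated sample, is deliberately not included; the function names are illustrative.

```python
import math

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x_(i).
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def quantile_correlation(observed, fitted):
    """Pearson correlation between observed and fitted quantiles (r_Q)."""
    n = len(observed)
    mx = sum(observed) / n
    my = sum(fitted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, fitted))
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in fitted)
    return sxy / math.sqrt(sxx * syy)
```

Both inputs to `quantile_correlation` should be sorted in ascending order so that each observed value is paired with the fitted quantile of the same rank.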

5.1. Household Income by State in USA

The U.S. Census Bureau defines “household income” as the gross income of all people aged 15 years or older who live in the same housing unit, regardless of their relationship. Household income reflects the standard of living in distinct households and is an important indicator of the local and national economies. Table 3 presents a dataset comprising the median household income in 2016 in the United States, in dollars, of n = 52 states, as available on the website data.world.1
The histogram and the boxplot of these observations, in Figure 8, are compatible with the power–Pareto distribution.
Table 4 summarizes the estimated parameters, K-S statistics, associated p-values, and the empirical correlation coefficient for various statistical methods applied to the household income dataset.
Table 4 shows that all estimation techniques produce p-values exceeding 0.05, so the power–Pareto distribution is not rejected at the 5% level for any of the fits. Considering that a lower K-S statistic and a higher p-value signify a better fit, and a higher $r_Q$ implies a stronger relationship between observed and expected quantiles, the P method stands out with a notably high p-value and $r_Q$, indicating a good fit. Moreover, the LQLS method achieves the highest $r_Q$, further supporting its efficacy. Although the QLS method has a large $r_Q$ value, its p-value is the lowest.
Figure 9 depicts Q-Q plots, comparing the observed data with the estimated quantiles provided from various methods. If the points in the Q-Q plots align closely along the diagonal line, it indicates that the estimated distribution provides an adequate statistical fit. Figure 10 provides the empirical CDF vs. the fitted CDF, for the six different estimation methods.
Figure 9 shows a good similarity between empirical and fitted quantiles in the body of the distribution, although there are discrepancies in the right tail. All methods provide a good correspondence in the body of the distribution. But the LQLS and QLS methods provide the best correspondence in the right tail. Similar conclusions can be drawn from Figure 10.

5.2. Peak Concentrations

For the examination of accidental releases of hazardous gases, a method commonly employed is the instantaneous release of a finite volume of gas into a surrounding flow field. Concentration measurements are then taken at a fixed location downwind. In a series of experiments conducted by Hall (1991) involving 100 repetitions, a key parameter for risk assessment was the peak concentrations achieved. The dataset, studied by Hankin and Lee (2006), is provided in Table 5.
In Figure 11, we present the histogram and the boxplot of the dataset. Both plots are compatible with the power–Pareto distribution.
Table 6 provides the estimated parameters, K-S statistics, the associated p-values, and the empirical correlation coefficient for various statistical methods for the peak concentration dataset.
It is observed that the data conform well to the distribution for all estimation methods, with all associated p-values exceeding 0.05 and empirical correlation coefficients close to 1. Results for the different estimation methods are similar, except for the QLS, which presents a much higher K-S value. Furthermore, the P, LS, and WLS methods demonstrate favorable outcomes, as indicated by the low K-S statistic, high p-value, and high empirical correlation coefficient, $r_Q$.
Figure 12 presents Q-Q plots, contrasting the observed data with the estimated quantiles derived from the fitted power–Pareto distribution. Both the P and WLS methods demonstrate a good correspondence, with similar patterns and some discrepancies in the right tail. The QLS again evidences overfitting in the right tail.
Figure 13 displays the empirical and fitted CDFs. All methods work quite well for analyzing this dataset. However, the P and WLS are the ones that provide the best correspondence between CDFs.

6. Conclusions

This study examines the power–Pareto model for non-negative variables. The model has three parameters and can exhibit various shapes, making it suitable for modelling both symmetrical and skewed data. The paper explores distributional characteristics, with a particular focus on different parameter estimation techniques, some of them introduced in this work.
The numerical analysis reveals the importance of selecting an appropriate estimation method based on both sample size and the values of the power–Pareto distribution parameters. Our results indicate that for very small sample sizes, the P method performs well in terms of RMSE. However, for larger sample sizes, the LQLS and WLS methods emerge as adequate choices and are recommended for practical applications.
Additionally, it is worth noting that the ML method also exhibits good performance for larger sample sizes, typically with at least 100 observations. However, it is essential to consider the computational time associated with this method, which is longer when compared to other methods, a factor to weigh in the decision-making process.

Author Contributions

Conceptualization, F.C.; methodology, F.C. and M.N.; software, F.C. and M.N.; validation, F.C.; formal analysis, F.C.; investigation, F.C. and M.N.; data curation, F.C. and M.N.; writing—original draft preparation, M.N.; writing—review and editing, F.C. and M.N.; visualization, F.C. and M.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work is funded by national funds through the FCT—Fundação para a Ciência e a Tecnologia, I.P., under the scope of the projects UIDB/00297/2020 (https://doi.org/10.54499/UIDB/00297/2020, accessed on 6 June 2024) and UIDP/00297/2020 (https://doi.org/10.54499/UIDP/00297/2020, accessed on 6 June 2024) (Center for Mathematics and Applications).

Data Availability Statement

The data supporting the findings in Section 5 of this study are available within the article.

Acknowledgments

The authors thank the referees for their comments and suggestions that led to an improvement of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
QF: Quantile function
CDF: Cumulative distribution function
IQR: Interquartile range
ML: Maximum Likelihood
LQLS: Log quantile least squares
P: Percentile
LS: Least squares
WLS: Weighted least squares
QLS: Quantile least squares
ABias: Average bias
MBias: Median bias
RMSE: Root mean squared error

Appendix A. Monte Carlo Simulation Results

Tables A1–A5 report the ABias, MBias, and RMSE for the cases in Section 4, with the best values highlighted in bold. Figures 3–7 correspond to these tables.
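As a reading aid for the tables that follow, the sketch below shows how the three summary measures can be computed from Monte Carlo replicates. It uses the standard definitions (mean error, median error, and root mean squared error); the paper's exact definitions are given in Section 4 and are not reproduced here. Signed errors, the sample median as a stand-in estimator, the 1000 replications, and all function names are assumptions chosen for illustration. With $\lambda_1 = \lambda_2$, the assumed quantile function $Q(u) = c\,u^{\lambda_1}/(1-u)^{\lambda_2}$ gives $Q(0.5) = c$, so the sample median is a simple estimator of $c$ in that case.

```python
import math
import random
import statistics


def simulate_power_pareto(n, c, lam1, lam2, rng):
    """Inverse-transform sampling: X = Q(U) with U uniform on [0, 1)."""
    us = [rng.random() for _ in range(n)]
    return [c * u ** lam1 / (1.0 - u) ** lam2 for u in us]


def summarize(estimates, true_value):
    """Average bias, median bias, and RMSE of a list of replicate estimates."""
    errors = [e - true_value for e in estimates]
    return {
        "ABias": statistics.fmean(errors),
        "MBias": statistics.median(errors),
        "RMSE": math.sqrt(statistics.fmean([e * e for e in errors])),
    }


rng = random.Random(2024)
c, lam1, lam2, n, reps = 1.0, 0.1, 0.1, 50, 1000  # illustrative settings
# Stand-in estimator (not one of the paper's methods): the sample median.
estimates = [statistics.median(simulate_power_pareto(n, c, lam1, lam2, rng))
             for _ in range(reps)]
result = summarize(estimates, c)
print(result)
```

Replacing the stand-in estimator with any of the fitting methods, and repeating over the sample sizes and parameter cases, would produce summaries with the same layout as the tables below.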
Table A1. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = \lambda_2 = 0.1$.

| $n$ | | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|
| 10 | ABias($\hat{c}$) | 0.0155 | 0.0060 | 0.0051 | 0.0141 | 0.0127 | **0.0036** |
| | MBias($\hat{c}$) | 0.0042 | 0.0039 | 0.0055 | 0.0073 | **0.0001** | 0.0036 |
| | RMSE($\hat{c}$) | 0.1576 | 0.1232 | **0.1164** | 0.1432 | 0.1360 | 0.1756 |
| | ABias($\hat{\lambda}_1$) | **0.0004** | 0.0005 | 0.0165 | 0.0174 | 0.0172 | 0.0055 |
| | MBias($\hat{\lambda}_1$) | 0.0131 | 0.0065 | 0.0204 | 0.0077 | 0.0089 | **0.0005** |
| | RMSE($\hat{\lambda}_1$) | 0.0841 | 0.0669 | **0.0641** | 0.0890 | 0.0836 | 0.0671 |
| | ABias($\hat{\lambda}_2$) | 0.0039 | **0.0016** | 0.0148 | 0.0116 | 0.0125 | 0.0062 |
| | MBias($\hat{\lambda}_2$) | 0.0123 | 0.0070 | 0.0219 | **0.0025** | 0.0049 | 0.0164 |
| | RMSE($\hat{\lambda}_2$) | 0.0812 | 0.0684 | **0.0642** | 0.0883 | 0.0832 | 0.0644 |
| 20 | ABias($\hat{c}$) | 0.0094 | **0.0027** | 0.0070 | 0.0094 | 0.0061 | 0.0199 |
| | MBias($\hat{c}$) | 0.0041 | 0.0012 | **0.0010** | 0.0051 | 0.0043 | 0.0066 |
| | RMSE($\hat{c}$) | 0.1047 | **0.0789** | 0.0906 | 0.0972 | 0.0880 | 0.1859 |
| | ABias($\hat{\lambda}_1$) | **0.0003** | 0.0007 | 0.0086 | 0.0098 | 0.0074 | 0.0004 |
| | MBias($\hat{\lambda}_1$) | 0.0059 | 0.0064 | 0.0115 | 0.0063 | 0.0051 | **0.0013** |
| | RMSE($\hat{\lambda}_1$) | 0.0570 | **0.0435** | 0.0513 | 0.0596 | 0.0526 | 0.0475 |
| | ABias($\hat{\lambda}_2$) | 0.0047 | **0.0008** | 0.0111 | 0.0034 | 0.0043 | 0.0069 |
| | MBias($\hat{\lambda}_2$) | 0.0096 | 0.0051 | 0.0142 | 0.0014 | **0.0010** | 0.0121 |
| | RMSE($\hat{\lambda}_2$) | 0.0559 | **0.0438** | 0.0518 | 0.0587 | 0.0528 | 0.0472 |
| 50 | ABias($\hat{c}$) | **0.0006** | 0.0014 | 0.0008 | 0.0011 | 0.0017 | 0.0322 |
| | MBias($\hat{c}$) | 0.0032 | 0.0048 | 0.0020 | 0.0022 | 0.0035 | **0.0011** |
| | RMSE($\hat{c}$) | 0.0527 | **0.0491** | 0.0598 | 0.0581 | 0.0518 | 0.1917 |
| | ABias($\hat{\lambda}_1$) | 0.0019 | 0.0012 | 0.0046 | 0.0019 | **0.0011** | 0.0024 |
| | MBias($\hat{\lambda}_1$) | 0.0029 | 0.0034 | 0.0056 | 0.0013 | **0.0002** | 0.0004 |
| | RMSE($\hat{\lambda}_1$) | 0.0299 | **0.0279** | 0.0347 | 0.0349 | 0.0305 | 0.0342 |
| | ABias($\hat{\lambda}_2$) | 0.0009 | **0.0004** | 0.0031 | 0.0037 | 0.0032 | 0.0048 |
| | MBias($\hat{\lambda}_2$) | 0.0026 | 0.0011 | 0.0052 | 0.0008 | **0.0004** | 0.0058 |
| | RMSE($\hat{\lambda}_2$) | 0.0310 | **0.0283** | 0.0346 | 0.0362 | 0.0316 | 0.0354 |
| 75 | ABias($\hat{c}$) | 0.0014 | **0.0001** | 0.0020 | 0.0018 | 0.0009 | 0.0125 |
| | MBias($\hat{c}$) | **0.0008** | 0.0016 | 0.0022 | 0.0013 | 0.0010 | 0.0020 |
| | RMSE($\hat{c}$) | 0.0405 | **0.0401** | 0.0493 | 0.0470 | 0.0416 | 0.1304 |
| | ABias($\hat{\lambda}_1$) | 0.0008 | **0.0006** | 0.0021 | 0.0023 | 0.0014 | **0.0006** |
| | MBias($\hat{\lambda}_1$) | 0.0016 | 0.0026 | 0.0029 | 0.0020 | 0.0016 | **0.0005** |
| | RMSE($\hat{\lambda}_1$) | 0.0230 | **0.0222** | 0.0274 | 0.0275 | 0.0234 | 0.0268 |
| | ABias($\hat{\lambda}_2$) | 0.0017 | **0.0003** | 0.0031 | 0.0010 | 0.0009 | 0.0032 |
| | MBias($\hat{\lambda}_2$) | 0.0026 | 0.0010 | 0.0053 | 0.0010 | **0.0007** | 0.0047 |
| | RMSE($\hat{\lambda}_2$) | 0.0239 | **0.0233** | 0.0282 | 0.0282 | 0.0243 | 0.0287 |
| 100 | ABias($\hat{c}$) | 0.0017 | **0.0001** | 0.0015 | 0.0018 | 0.0010 | 0.0286 |
| | MBias($\hat{c}$) | 0.0010 | 0.0016 | 0.0005 | **0.0002** | 0.0015 | 0.0011 |
| | RMSE($\hat{c}$) | **0.0350** | 0.0353 | 0.0432 | 0.0404 | 0.0354 | 0.1805 |
| | ABias($\hat{\lambda}_1$) | **0.0002** | 0.0005 | 0.0015 | 0.0020 | 0.0012 | 0.0022 |
| | MBias($\hat{\lambda}_1$) | 0.0008 | 0.0015 | 0.0017 | 0.0015 | 0.0008 | **0.0000** |
| | RMSE($\hat{\lambda}_1$) | 0.0204 | **0.0200** | 0.0243 | 0.0236 | 0.0201 | 0.0277 |
| | ABias($\hat{\lambda}_2$) | 0.0019 | **0.0003** | 0.0025 | 0.0005 | 0.0005 | 0.0047 |
| | MBias($\hat{\lambda}_2$) | 0.0024 | 0.0013 | 0.0046 | 0.0007 | **0.0004** | 0.0048 |
| | RMSE($\hat{\lambda}_2$) | 0.0208 | **0.0200** | 0.0247 | 0.0236 | 0.0204 | 0.0286 |

* The numbers of convergence cases are 983 (n = 10), 978 (n = 20), 966 (n = 50), 985 (n = 75), and 969 (n = 100).
Table A2. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.1$, $\lambda_2 = 0.4$.

| $n$ | | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|
| 10 | ABias($\hat{c}$) | 0.0853 | **0.0482** | 0.1121 | 0.1063 | 0.0957 | 0.2065 |
| | MBias($\hat{c}$) | 0.0579 | 0.0199 | 0.0572 | 0.0064 | **0.0063** | 0.1475 |
| | RMSE($\hat{c}$) | 0.4478 | **0.3366** | 0.3707 | 0.4333 | 0.4001 | 0.6011 |
| | ABias($\hat{\lambda}_1$) | 0.0019 | **0.0008** | 0.0048 | 0.0474 | 0.0433 | 0.1051 |
| | MBias($\hat{\lambda}_1$) | 0.0912 | **0.0043** | 0.0052 | 0.0057 | 0.0108 | 0.0669 |
| | RMSE($\hat{\lambda}_1$) | 0.1462 | 0.1343 | **0.1303** | 0.1696 | 0.1546 | 0.2374 |
| | ABias($\hat{\lambda}_2$) | 0.0196 | **0.0061** | 0.0652 | 0.0133 | 0.0174 | 0.1057 |
| | MBias($\hat{\lambda}_2$) | 0.0181 | 0.0487 | 0.1005 | **0.0012** | 0.0031 | 0.1360 |
| | RMSE($\hat{\lambda}_2$) | 0.2231 | 0.2375 | **0.2217** | 0.2490 | 0.2405 | 0.2128 |
| 20 | ABias($\hat{c}$) | **0.0267** | 0.0280 | 0.0786 | 0.0514 | 0.0415 | 0.1414 |
| | MBias($\hat{c}$) | 0.0238 | 0.0277 | 0.0497 | **0.0007** | 0.0049 | 0.1462 |
| | RMSE($\hat{c}$) | 0.2725 | **0.2142** | 0.2671 | 0.2535 | 0.2207 | 0.4967 |
| | ABias($\hat{\lambda}_1$) | 0.0028 | **0.0014** | 0.0026 | 0.0236 | 0.0186 | 0.0901 |
| | MBias($\hat{\lambda}_1$) | 0.0220 | 0.0068 | **0.0010** | 0.0059 | 0.0084 | 0.0626 |
| | RMSE($\hat{\lambda}_1$) | 0.1058 | **0.0833** | 0.1016 | 0.1090 | 0.0901 | 0.2491 |
| | ABias($\hat{\lambda}_2$) | 0.0079 | 0.0056 | 0.0506 | **0.0032** | 0.0042 | 0.0928 |
| | MBias($\hat{\lambda}_2$) | 0.0059 | 0.0299 | 0.0700 | **0.0044** | 0.0046 | 0.1111 |
| | RMSE($\hat{\lambda}_2$) | 0.1693 | **0.1553** | 0.1754 | 0.1741 | 0.1587 | 0.1800 |
| 50 | ABias($\hat{c}$) | **0.0017** | 0.0065 | 0.0224 | 0.0099 | 0.0075 | 0.0880 |
| | MBias($\hat{c}$) | 0.0110 | **0.0019** | 0.0074 | 0.0050 | 0.0024 | 0.1169 |
| | RMSE($\hat{c}$) | 0.1232 | 0.1342 | 0.1616 | 0.1359 | **0.1173** | 0.3460 |
| | ABias($\hat{\lambda}_1$) | 0.0052 | **0.0011** | 0.0022 | 0.0054 | 0.0037 | 0.0656 |
| | MBias($\hat{\lambda}_1$) | 0.0087 | 0.0020 | 0.0034 | **0.0006** | 0.0009 | 0.0656 |
| | RMSE($\hat{\lambda}_1$) | 0.0496 | 0.0528 | 0.0686 | 0.0613 | **0.0495** | 0.1350 |
| | ABias($\hat{\lambda}_2$) | 0.0014 | **0.0011** | 0.0175 | 0.0050 | 0.0037 | 0.0644 |
| | MBias($\hat{\lambda}_2$) | 0.0068 | 0.0120 | 0.0264 | **0.0035** | 0.0047 | 0.0804 |
| | RMSE($\hat{\lambda}_2$) | 0.0963 | 0.1020 | 0.1163 | 0.1059 | **0.0946** | 0.1496 |
| 75 | ABias($\hat{c}$) | **0.0002** | 0.0080 | 0.0216 | 0.0117 | 0.0085 | 0.0765 |
| | MBias($\hat{c}$) | 0.0046 | 0.0072 | 0.0143 | 0.0046 | **0.0024** | 0.1187 |
| | RMSE($\hat{c}$) | **0.0911** | 0.1110 | 0.1336 | 0.1095 | 0.0941 | 0.3288 |
| | ABias($\hat{\lambda}_1$) | 0.0035 | **0.0003** | 0.0012 | 0.0059 | 0.0037 | 0.0611 |
| | MBias($\hat{\lambda}_1$) | 0.0045 | 0.0022 | **0.0018** | 0.0040 | 0.0026 | 0.0621 |
| | RMSE($\hat{\lambda}_1$) | **0.0356** | 0.0427 | 0.0557 | 0.0485 | 0.0385 | 0.1240 |
| | ABias($\hat{\lambda}_2$) | 0.0027 | 0.0029 | 0.0145 | 0.0005 | **0.0002** | 0.0595 |
| | MBias($\hat{\lambda}_2$) | **0.0030** | 0.0078 | 0.0202 | 0.0066 | 0.0041 | 0.0714 |
| | RMSE($\hat{\lambda}_2$) | 0.0746 | 0.0838 | 0.0943 | 0.0832 | **0.0740** | 0.1423 |
| 100 | ABias($\hat{c}$) | **0.0011** | 0.0065 | 0.0167 | 0.0088 | 0.0063 | 0.0579 |
| | MBias($\hat{c}$) | 0.0060 | **0.0008** | 0.0092 | 0.0029 | 0.0010 | 0.1192 |
| | RMSE($\hat{c}$) | **0.0774** | 0.0966 | 0.1161 | 0.0916 | 0.0785 | 0.3323 |
| | ABias($\hat{\lambda}_1$) | 0.0021 | **0.0004** | 0.0012 | 0.0047 | 0.0028 | 0.0565 |
| | MBias($\hat{\lambda}_1$) | 0.0023 | 0.0009 | **0.0003** | 0.0019 | 0.0017 | 0.0619 |
| | RMSE($\hat{\lambda}_1$) | **0.0307** | 0.0376 | 0.0490 | 0.0408 | 0.0324 | 0.1217 |
| | ABias($\hat{\lambda}_2$) | 0.0031 | 0.0026 | 0.0117 | 0.0004 | **0.0001** | 0.0567 |
| | MBias($\hat{\lambda}_2$) | 0.0043 | 0.0076 | 0.0175 | **0.0009** | 0.0030 | 0.0650 |
| | RMSE($\hat{\lambda}_2$) | 0.0636 | 0.0721 | 0.0832 | 0.0696 | **0.0622** | 0.1411 |

* The numbers of convergence cases are 949 (n = 10), 962 (n = 20), 960 (n = 50), 959 (n = 75), and 952 (n = 100).
Table A3. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.4$, $\lambda_2 = 0.4$.

| $n$ | | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|
| 10 | ABias($\hat{c}$) | 0.2517 | 0.1202 | **0.0984** | 0.1953 | 0.1829 | 0.4519 |
| | MBias($\hat{c}$) | **0.0054** | 0.0156 | 0.0176 | 0.0262 | 0.0081 | 0.2224 |
| | RMSE($\hat{c}$) | 0.8934 | 0.6031 | **0.5528** | 0.7712 | 0.7186 | 1.1709 |
| | ABias($\hat{\lambda}_1$) | 0.0182 | **0.0019** | 0.0730 | 0.0710 | 0.0719 | 0.2361 |
| | MBias($\hat{\lambda}_1$) | 0.0427 | **0.0259** | 0.0932 | 0.0327 | 0.0355 | 0.1309 |
| | RMSE($\hat{\lambda}_1$) | 0.3683 | 0.2677 | **0.2599** | 0.3592 | 0.3393 | 0.6460 |
| | ABias($\hat{\lambda}_2$) | 0.0277 | **0.0063** | 0.0476 | 0.0442 | 0.0449 | 0.1261 |
| | MBias($\hat{\lambda}_2$) | 0.0690 | 0.0282 | 0.0808 | **0.0086** | 0.0186 | 0.1559 |
| | RMSE($\hat{\lambda}_2$) | 0.3239 | 0.2735 | 0.2605 | 0.3529 | 0.3380 | **0.2464** |
| 20 | ABias($\hat{c}$) | 0.1758 | **0.0492** | 0.0769 | 0.0988 | 0.0739 | 0.2839 |
| | MBias($\hat{c}$) | 0.0412 | 0.0046 | **0.0001** | 0.0216 | 0.0204 | 0.1953 |
| | RMSE($\hat{c}$) | 0.5886 | **0.3440** | 0.4030 | 0.4443 | 0.3911 | 0.9170 |
| | ABias($\hat{\lambda}_1$) | 0.0414 | **0.0029** | 0.0361 | 0.0405 | 0.0304 | 0.1716 |
| | MBias($\hat{\lambda}_1$) | **0.0131** | 0.0256 | 0.0478 | 0.0255 | 0.0212 | 0.1106 |
| | RMSE($\hat{\lambda}_1$) | 0.2776 | 0.1741 | 0.2055 | 0.2389 | 0.2108 | **0.1106** |
| | ABias($\hat{\lambda}_2$) | 0.0454 | **0.0031** | 0.0423 | 0.0126 | 0.0164 | 0.1047 |
| | MBias($\hat{\lambda}_2$) | 0.0504 | 0.0206 | 0.0536 | 0.0060 | **0.0012** | 0.1269 |
| | RMSE($\hat{\lambda}_2$) | 0.2387 | **0.1751** | 0.2069 | 0.2350 | 0.2119 | 0.2008 |
| 50 | ABias($\hat{c}$) | 0.0290 | **0.0090** | 0.0181 | 0.0161 | 0.0093 | 0.1586 |
| | MBias($\hat{c}$) | 0.0126 | 0.0192 | **0.0086** | 0.0091 | 0.0153 | 0.1755 |
| | RMSE($\hat{c}$) | 0.2590 | **0.2000** | 0.2475 | 0.2393 | 0.2112 | 0.4555 |
| | ABias($\hat{\lambda}_1$) | **0.0022** | 0.0048 | 0.0186 | 0.0075 | 0.0042 | 0.1206 |
| | MBias($\hat{\lambda}_1$) | 0.0093 | 0.0135 | 0.0225 | **0.0051** | 0.0174 | 0.1279 |
| | RMSE($\hat{\lambda}_1$) | 0.1430 | **0.1116** | 0.1389 | 0.1396 | 0.1221 | 0.3026 |
| | ABias($\hat{\lambda}_2$) | 0.0085 | **0.0017** | 0.0119 | 0.0149 | 0.0131 | 0.0750 |
| | MBias($\hat{\lambda}_2$) | 0.0110 | 0.0043 | 0.0206 | 0.0033 | **0.0022** | 0.0941 |
| | RMSE($\hat{\lambda}_2$) | 0.1279 | **0.1132** | 0.1385 | 0.1450 | 0.1265 | 0.1603 |
| 75 | ABias($\hat{c}$) | 0.0251 | **0.0102** | 0.0219 | 0.0225 | 0.0192 | 0.1271 |
| | MBias($\hat{c}$) | **0.0001** | 0.0064 | 0.0085 | 0.0058 | 0.0040 | 0.1567 |
| | RMSE($\hat{c}$) | 0.2098 | **0.1634** | 0.2018 | 0.1991 | 0.1993 | 0.4106 |
| | ABias($\hat{\lambda}_1$) | 0.0036 | **0.0025** | 0.0088 | 0.0103 | 0.0075 | 0.0994 |
| | MBias($\hat{\lambda}_1$) | 0.0072 | 0.0105 | 0.0117 | 0.0085 | **0.0057** | 0.1114 |
| | RMSE($\hat{\lambda}_1$) | 0.1168 | **0.0886** | 0.1096 | 0.1124 | 0.0995 | 0.2683 |
| | ABias($\hat{\lambda}_2$) | 0.0091 | **0.0013** | 0.0120 | 0.0032 | 0.0016 | 0.0652 |
| | MBias($\hat{\lambda}_2$) | 0.0106 | 0.0039 | 0.0208 | 0.0051 | **0.0033** | 0.0783 |
| | RMSE($\hat{\lambda}_2$) | 0.0968 | **0.0930** | 0.1127 | 0.1146 | 0.1044 | 0.1518 |
| 100 | ABias($\hat{c}$) | 0.0238 | **0.0079** | 0.0171 | 0.0235 | 0.0162 | 0.1031 |
| | MBias($\hat{c}$) | 0.0016 | 0.0064 | 0.0018 | **0.0006** | 0.0052 | 0.1563 |
| | RMSE($\hat{c}$) | 0.2038 | **0.1432** | 0.1764 | 0.2003 | 0.1706 | 0.4092 |
| | ABias($\hat{\lambda}_1$) | 0.0051 | **0.0020** | 0.0060 | **0.0111** | 0.0020 | 0.0888 |
| | MBias($\hat{\lambda}_1$) | **0.0027** | 0.0059 | 0.0068 | 0.0063 | 0.0033 | 0.1110 |
| | RMSE($\hat{\lambda}_1$) | 0.1122 | **0.0800** | 0.0972 | 0.1047 | 0.0859 | 0.2651 |
| | ABias($\hat{\lambda}_2$) | 0.0088 | 0.0012 | 0.0099 | **0.0001** | **0.0001** | 0.0645 |
| | MBias($\hat{\lambda}_2$) | 0.0117 | 0.0051 | 0.0185 | 0.0044 | **0.0021** | 0.0752 |
| | RMSE($\hat{\lambda}_2$) | 0.0825 | **0.0798** | 0.0987 | 0.0993 | 0.0892 | 0.1525 |

* The numbers of convergence cases are 977 (n = 10), 958 (n = 20), 962 (n = 50), 960 (n = 75), and 949 (n = 100).
Table A4. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.4$, $\lambda_2 = 0.9$.

| $n$ | | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|
| 10 | ABias($\hat{c}$) | 0.8308 | **0.3513** | 0.4594 | 0.7081 | 0.6520 | 6.5721 |
| | MBias($\hat{c}$) | 0.0791 | 0.0156 | 0.0484 | **0.0029** | 0.0105 | 1.1266 |
| | RMSE($\hat{c}$) | 3.3311 | **1.2160** | 1.5800 | 2.9448 | 2.6376 | 19.7535 |
| | ABias($\hat{\lambda}_1$) | 0.0247 | **0.0001** | 0.0694 | 0.1238 | 0.1205 | 2.1121 |
| | MBias($\hat{\lambda}_1$) | 0.0987 | **0.0106** | 0.0639 | 0.0315 | 0.0551 | 0.5679 |
| | RMSE($\hat{\lambda}_1$) | 0.4876 | **0.3658** | 0.3765 | 0.4962 | 0.4657 | 11.4752 |
| | ABias($\hat{\lambda}_2$) | 0.0515 | **0.0138** | 0.0797 | 0.0504 | 0.0574 | 0.5457 |
| | MBias($\hat{\lambda}_2$) | 0.0566 | 0.1060 | 0.1754 | 0.0317 | **0.0047** | 0.5764 |
| | RMSE($\hat{\lambda}_2$) | 0.5814 | **0.5502** | 0.5653 | 0.6279 | 0.6095 | 0.6298 |
| 20 | ABias($\hat{c}$) | 0.5669 | **0.1522** | 0.2916 | 0.2796 | 0.2231 | 11.8500 |
| | MBias($\hat{c}$) | 0.0053 | 0.0482 | 0.0824 | 0.0234 | **0.0045** | 1.5789 |
| | RMSE($\hat{c}$) | 1.9100 | **0.6254** | 0.8795 | 0.9625 | 0.8223 | 39.5185 |
| | ABias($\hat{\lambda}_1$) | 0.0690 | **0.0007** | 0.0200 | 0.0645 | 0.0508 | 2.9156 |
| | MBias($\hat{\lambda}_1$) | 0.0317 | **0.0013** | 0.0364 | 0.0266 | 0.0245 | 0.8483 |
| | RMSE($\hat{\lambda}_1$) | 0.4436 | **0.2305** | 0.2819 | 0.3250 | 0.2808 | 7.1807 |
| | ABias($\hat{\lambda}_2$) | 0.0734 | **0.0111** | 0.0995 | 0.0143 | 0.0180 | 0.5095 |
| | MBias($\hat{\lambda}_2$) | 0.0675 | 0.0597 | 0.1403 | 0.0028 | **0.0007** | 0.5279 |
| | RMSE($\hat{\lambda}_2$) | 0.4729 | **0.3577** | 0.4106 | 0.4349 | 0.3980 | 0.5904 |
| 50 | ABias($\hat{c}$) | 0.0958 | **0.0449** | 0.0913 | 0.0702 | 0.0549 | 25.8752 |
| | MBias($\hat{c}$) | 0.0172 | 0.0112 | **0.0076** | 0.0230 | 0.0146 | 2.2231 |
| | RMSE($\hat{c}$) | 0.6234 | **0.3466** | 0.4536 | 0.4785 | 0.3971 | 122.0664 |
| | ABias($\hat{\lambda}_1$) | 0.0068 | **0.0047** | 0.0151 | 0.0140 | 0.0106 | 6.5544 |
| | MBias($\hat{\lambda}_1$) | 0.0163 | 0.0017 | 0.0172 | **0.0001** | 0.0071 | 1.5314 |
| | RMSE($\hat{\lambda}_1$) | 0.2183 | **0.1468** | 0.1905 | 0.1929 | 0.1629 | 24.7446 |
| | ABias($\hat{\lambda}_2$) | 0.0206 | **0.0008** | 0.0347 | 0.0183 | 0.0132 | 0.4825 |
| | MBias($\hat{\lambda}_2$) | 0.0293 | 0.0215 | 0.0544 | **0.0038** | 0.0095 | 0.4759 |
| | RMSE($\hat{\lambda}_2$) | 0.2510 | **0.2342** | 0.2728 | 0.2703 | 0.2429 | 0.5576 |
| 75 | ABias($\hat{c}$) | 0.0677 | **0.0390** | 0.0775 | 0.0662 | 0.0511 | 36.4554 |
| | MBias($\hat{c}$) | 0.0080 | 0.0073 | 0.0264 | 0.0080 | **0.0030** | 2.6556 |
| | RMSE($\hat{c}$) | 0.4865 | **0.2835** | 0.3631 | 0.4389 | 0.3886 | 155.8396 |
| | ABias($\hat{\lambda}_1$) | 0.0080 | **0.0010** | 0.0038 | 0.0177 | 0.0118 | 8.4604 |
| | MBias($\hat{\lambda}_1$) | 0.0041 | **0.0002** | 0.0017 | 0.0134 | 0.0057 | 1.9628 |
| | RMSE($\hat{\lambda}_1$) | 0.1639 | **0.1177** | 0.1527 | 0.1578 | 0.1301 | 19.2592 |
| | ABias($\hat{\lambda}_2$) | 0.0250 | 0.0057 | 0.0298 | 0.0022 | **0.0011** | 0.4697 |
| | MBias($\hat{\lambda}_2$) | 0.0270 | 0.0155 | 0.0410 | 0.0112 | **0.0063** | 0.4659 |
| | RMSE($\hat{\lambda}_2$) | 0.1905 | 0.1925 | 0.2215 | 0.2117 | **0.1885** | 0.5442 |
| 100 | ABias($\hat{c}$) | 0.0541 | **0.0304** | 0.0604 | 0.0476 | 0.0457 | 41.6837 |
| | MBias($\hat{c}$) | 0.0076 | 0.0034 | 0.0175 | **0.0019** | 0.0015 | 2.8660 |
| | RMSE($\hat{c}$) | 0.4540 | **0.2453** | 0.3105 | 0.2947 | 0.4184 | 173.9902 |
| | ABias($\hat{\lambda}_1$) | 0.0048 | **0.0005** | 0.0017 | 0.0147 | 0.0109 | 10.6657 |
| | MBias($\hat{\lambda}_1$) | 0.0032 | **0.0007** | 0.0028 | 0.0071 | 0.0029 | 2.3627 |
| | RMSE($\hat{\lambda}_1$) | 0.1220 | **0.1049** | 0.1346 | 0.1308 | 0.1192 | 25.2684 |
| | ABias($\hat{\lambda}_2$) | 0.0221 | 0.0050 | 0.0251 | **0.0004** | 0.0011 | 0.4648 |
| | MBias($\hat{\lambda}_2$) | 0.0265 | 0.0133 | 0.0385 | 0.0053 | **0.0042** | 0.4600 |
| | RMSE($\hat{\lambda}_2$) | **0.1584** | 0.1652 | 0.1948 | 0.1762 | 0.1627 | 0.5403 |

* The numbers of convergence cases are 913 (n = 10), 936 (n = 20), 941 (n = 50), 946 (n = 75), and 944 (n = 100).
Table A5. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.9$, $\lambda_2 = 0.4$.

| $n$ | | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|
| 10 | ABias($\hat{c}$) | 0.5116 | 0.4565 | **0.1674** | 0.4133 | 0.4064 | 0.7991 |
| | MBias($\hat{c}$) | 0.1972 | 0.0579 | 0.1179 | 0.0812 | **0.0219** | 0.3257 |
| | RMSE($\hat{c}$) | 1.3260 | 2.0416 | 1.0804 | 1.3088 | **1.2627** | 1.8824 |
| | ABias($\hat{\lambda}_1$) | 0.0642 | **0.0064** | 0.1922 | 0.1072 | 0.1207 | 0.4752 |
| | MBias($\hat{\lambda}_1$) | **0.0193** | 0.0777 | 0.2576 | 0.0638 | 0.0787 | 0.2136 |
| | RMSE($\hat{\lambda}_1$) | 0.6308 | 0.5373 | **0.5156** | 0.6451 | 0.6222 | 1.3140 |
| | ABias($\hat{\lambda}_2$) | 0.0569 | **0.0068** | 0.0180 | 0.0733 | 0.0634 | 0.1531 |
| | MBias($\hat{\lambda}_2$) | 0.1791 | **0.0025** | 0.0488 | 0.0034 | 0.0088 | 0.2005 |
| | RMSE($\hat{\lambda}_2$) | 0.4170 | 0.3701 | 0.3542 | 0.4941 | 0.4647 | **0.2817** |
| 20 | ABias($\hat{c}$) | 0.4242 | 0.1479 | **0.1295** | 0.2003 | 0.1596 | 0.5729 |
| | MBias($\hat{c}$) | 0.2188 | **0.0407** | 0.0539 | 0.0691 | 0.0439 | 0.2837 |
| | RMSE($\hat{c}$) | 0.9908 | 0.7706 | 0.7506 | 0.7503 | **0.6785** | 1.4994 |
| | ABias($\hat{\lambda}_1$) | 0.1136 | **0.0102** | 0.1035 | 0.0593 | 0.0472 | 0.3848 |
| | MBias($\hat{\lambda}_1$) | 0.0605 | 0.0647 | 0.1349 | **0.0448** | 0.0341 | 0.1899 |
| | RMSE($\hat{\lambda}_1$) | 0.4821 | **0.3566** | 0.4080 | 0.4346 | 0.3968 | 1.1138 |
| | ABias($\hat{\lambda}_2$) | 0.0939 | **0.0010** | 0.0279 | 0.0249 | 0.0268 | 0.1224 |
| | MBias($\hat{\lambda}_2$) | 0.1189 | **0.0011** | 0.0344 | 0.0014 | 0.0070 | 0.1507 |
| | RMSE($\hat{\lambda}_2$) | 0.3044 | 0.2324 | 0.2815 | 0.3192 | 0.2837 | **0.2304** |
| 50 | ABias($\hat{c}$) | 0.2037 | 0.0335 | 0.0343 | 0.0426 | **0.0254** | 0.3695 |
| | MBias($\hat{c}$) | 0.0297 | 0.0351 | 0.0480 | **0.0083** | 0.0221 | 0.2526 |
| | RMSE($\hat{c}$) | 0.6502 | 0.3682 | 0.4320 | 0.3960 | **0.3338** | 2.0387 |
| | ABias($\hat{\lambda}_1$) | 0.0820 | 0.0109 | 0.0466 | 0.0126 | **0.0071** | 0.3530 |
| | MBias($\hat{\lambda}_1$) | 0.0179 | 0.0284 | 0.0670 | **0.0015** | 0.0021 | 0.2278 |
| | RMSE($\hat{\lambda}_1$) | 0.3711 | 0.2296 | 0.2731 | 0.2602 | **0.2292** | 2.1254 |
| | ABias($\hat{\lambda}_2$) | 0.0493 | 0.0063 | **0.0023** | 0.0246 | 0.0208 | 0.0894 |
| | MBias($\hat{\lambda}_2$) | 0.0285 | **0.0062** | 0.0084 | 0.0088 | 0.0138 | 0.1094 |
| | RMSE($\hat{\lambda}_2$) | 0.1973 | **0.1466** | 0.1899 | 0.1941 | 0.1620 | 0.1792 |
| 75 | ABias($\hat{c}$) | 0.1406 | **0.0268** | 0.0366 | 0.0424 | 0.0327 | 0.2482 |
| | MBias($\hat{c}$) | 0.0149 | 0.0173 | 0.0043 | 0.0078 | **0.0016** | 0.2125 |
| | RMSE($\hat{c}$) | 0.5084 | 0.2902 | 0.3427 | 0.3067 | **0.2891** | 1.1944 |
| | ABias($\hat{\lambda}_1$) | 0.0569 | **0.0073** | 0.0265 | 0.0143 | 0.0098 | 0.2395 |
| | MBias($\hat{\lambda}_1$) | **0.0028** | 0.0245 | 0.0336 | 0.0126 | 0.0048 | 0.2033 |
| | RMSE($\hat{\lambda}_1$) | 0.2913 | 0.1814 | 0.2134 | 0.2039 | **0.1808** | 1.2468 |
| | ABias($\hat{\lambda}_2$) | 0.0364 | **0.0014** | 0.0075 | 0.0074 | 0.0055 | 0.0800 |
| | MBias($\hat{\lambda}_2$) | 0.0185 | 0.0029 | 0.0138 | 0.0033 | **0.0020** | 0.0916 |
| | RMSE($\hat{\lambda}_2$) | 0.1591 | **0.1202** | 0.1547 | 0.1522 | 0.1290 | 0.1688 |
| 100 | ABias($\hat{c}$) | 0.1248 | **0.0207** | 0.0300 | 0.0321 | 0.0263 | 0.2184 |
| | MBias($\hat{c}$) | 0.0117 | 0.0137 | 0.0154 | **0.0015** | 0.0061 | 0.2094 |
| | RMSE($\hat{c}$) | 0.4667 | 0.2570 | 0.3008 | 0.2611 | **0.2446** | 0.5693 |
| | ABias($\hat{\lambda}_1$) | 0.0501 | **0.0060** | 0.0182 | 0.0112 | 0.0085 | 0.2250 |
| | MBias($\hat{\lambda}_1$) | **0.0027** | 0.0190 | 0.0266 | 0.0119 | 0.0043 | 0.2164 |
| | RMSE($\hat{\lambda}_1$) | 0.2663 | 0.1645 | 0.1900 | 0.1731 | **0.1542** | 0.9415 |
| | ABias($\hat{\lambda}_2$) | 0.0307 | **0.0012** | 0.0069 | 0.0055 | 0.0027 | 0.0693 |
| | MBias($\hat{\lambda}_2$) | 0.0149 | 0.0026 | 0.0121 | 0.0031 | **0.0010** | 0.0826 |
| | RMSE($\hat{\lambda}_2$) | 0.1415 | **0.1040** | 0.1350 | 0.1275 | 0.1101 | 0.1561 |

* The numbers of convergence cases are 971 (n = 10), 974 (n = 20), 963 (n = 50), 955 (n = 75), and 967 (n = 100).

Figure 1. The density function in (3) for fixed parameters $c = 1$, $\lambda_1 = 0.1$ (left), $\lambda_1 = 0.4$ (right), and selected values for $\lambda_2$.
Figure 2. The density function for cases 1–5.
Figure 3. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.1$, $\lambda_2 = 0.1$.
Figure 4. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.1$, $\lambda_2 = 0.4$.
Figure 5. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.4$, $\lambda_2 = 0.4$.
Figure 6. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.4$, $\lambda_2 = 0.9$. The QLS method is omitted because its values fall outside the range of the plot.
Figure 7. Monte Carlo simulated ABias, MBias, and RMSE from the power–Pareto distribution with $c = 1$, $\lambda_1 = 0.9$, $\lambda_2 = 0.4$.
Figure 8. Histogram and boxplot for the household income dataset.
Figure 9. Q-Q plots of the household income dataset.
Figure 10. Empirical vs. fitted CDFs using different estimators for the household income dataset.
Figure 11. Histogram and boxplot for the peak concentration dataset.
Figure 12. Q-Q plots of the peak concentration dataset.
Figure 13. Empirical vs. fitted CDFs using different estimators for the peak concentration dataset.
Table 1. Monte Carlo simulated average absolute difference $D_{\text{abs}}$ in Equation (19).

| $\lambda_1$ | $\lambda_2$ | $n$ | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 0.10 | 10 | 0.0996 | 0.0935 | 0.1041 | 0.0927 | 0.0912 | **0.0877** |
| | | 20 | 0.0668 | 0.0633 | 0.0717 | 0.0656 | 0.0642 | **0.0617** |
| | | 50 | 0.0405 | 0.0399 | 0.0437 | 0.0410 | 0.0399 | **0.0395** |
| | | 75 | 0.0330 | **0.0318** | 0.0352 | 0.0329 | 0.0319 | 0.0319 |
| | | 100 | 0.0290 | 0.0278 | 0.0307 | 0.0283 | **0.0276** | 0.0277 |
| 0.10 | 0.40 | 10 | 0.0983 | 0.1176 | 0.1287 | 0.0921 | **0.0905** | 0.0965 |
| | | 20 | 0.0675 | 0.0723 | 0.0841 | 0.0653 | **0.0635** | 0.0728 |
| | | 50 | 0.0403 | 0.0430 | 0.0493 | 0.0409 | **0.0395** | 0.0538 |
| | | 75 | **0.0315** | 0.0338 | 0.0373 | 0.0328 | 0.0316 | 0.0474 |
| | | 100 | **0.0271** | 0.0292 | 0.0320 | 0.0282 | 0.0273 | 0.0456 |
| 0.40 | 0.40 | 10 | 0.1000 | 0.0935 | 0.1056 | 0.0926 | **0.0912** | 0.0954 |
| | | 20 | 0.0686 | **0.0633** | 0.0717 | 0.0656 | 0.0642 | 0.0694 |
| | | 50 | 0.0405 | **0.0399** | 0.0437 | 0.0410 | **0.0399** | 0.0483 |
| | | 75 | 0.0320 | **0.0318** | 0.0352 | 0.0330 | 0.0320 | 0.0412 |
| | | 100 | 0.0278 | 0.0278 | 0.0307 | 0.0284 | **0.0276** | 0.0383 |
| 0.40 | 0.90 | 10 | 0.0991 | 0.1053 | 0.1237 | 0.0923 | **0.0909** | 0.1129 |
| | | 20 | 0.0688 | 0.0663 | 0.0775 | 0.0655 | **0.0640** | 0.1199 |
| | | 50 | 0.0406 | 0.0408 | 0.0451 | 0.0411 | **0.0398** | 0.1458 |
| | | 75 | 0.0325 | 0.0325 | 0.0356 | 0.0329 | **0.0318** | 0.1612 |
| | | 100 | 0.0276 | 0.0283 | 0.0310 | 0.0283 | **0.0275** | 0.1736 |
| 0.90 | 0.40 | 10 | 0.0996 | 0.0922 | 0.1015 | 0.0924 | **0.0910** | 0.0995 |
| | | 20 | 0.0690 | **0.0640** | 0.0708 | 0.0654 | 0.0641 | 0.0749 |
| | | 50 | 0.0440 | 0.0403 | 0.0440 | 0.0410 | **0.0398** | 0.0527 |
| | | 75 | 0.0349 | 0.0321 | 0.0354 | 0.0329 | **0.0318** | 0.0441 |
| | | 100 | 0.0296 | 0.0282 | 0.0310 | 0.0283 | **0.0275** | 0.0413 |

* Convergence of this estimation method is not achieved in all cases.
Table 2. Monte Carlo simulated average of the maximum absolute difference $D_{\max}$ in Equation (20).

| $\lambda_1$ | $\lambda_2$ | $n$ | ML | LQLS | P | LS | WLS | QLS* |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 0.10 | 10 | 0.1876 | 0.1658 | 0.1961 | 0.1592 | 0.1559 | **0.1535** |
| | | 20 | 0.1234 | 0.1115 | 0.1347 | 0.1157 | 0.1122 | **0.1089** |
| | | 50 | 0.0723 | 0.0698 | 0.0802 | 0.0734 | 0.0705 | **0.0693** |
| | | 75 | 0.0584 | **0.0557** | 0.0630 | 0.0587 | 0.0561 | 0.0562 |
| | | 100 | 0.0515 | 0.0487 | 0.0554 | 0.0507 | **0.0484** | 0.0488 |
| 0.10 | 0.40 | 10 | 0.1807 | 0.2246 | 0.2555 | 0.1562 | **0.1528** | 0.1682 |
| | | 20 | 0.1239 | 0.1353 | 0.1722 | 0.1143 | **0.1098** | 0.1304 |
| | | 50 | 0.0715 | 0.0786 | 0.0982 | 0.0727 | **0.0692** | 0.0982 |
| | | 75 | 0.0552 | 0.0605 | 0.0712 | 0.0580 | **0.0550** | 0.0882 |
| | | 100 | 0.0475 | 0.0519 | 0.0600 | 0.0500 | **0.0473** | 0.0855 |
| 0.40 | 0.40 | 10 | 0.1894 | 0.1656 | 0.1990 | 0.1592 | **0.1561** | 0.1654 |
| | | 20 | 0.1278 | **0.1115** | 0.1348 | 0.1159 | 0.1122 | 0.1234 |
| | | 50 | 0.0722 | **0.0698** | 0.0802 | 0.0734 | 0.0705 | 0.0878 |
| | | 75 | 0.0567 | **0.0557** | 0.0630 | 0.0590 | 0.0567 | 0.0766 |
| | | 100 | 0.0491 | **0.0487** | 0.0554 | 0.0511 | 0.0491 | 0.0714 |
| 0.40 | 0.90 | 10 | 0.1855 | 0.1919 | 0.2394 | 0.1578 | **0.1542** | 0.2110 |
| | | 20 | 0.1272 | 0.1179 | 0.1514 | 0.1156 | **0.1115** | 0.2456 |
| | | 50 | 0.0727 | 0.0718 | 0.0845 | 0.0735 | **0.0706** | 0.3158 |
| | | 75 | 0.0574 | 0.0572 | 0.0642 | 0.0586 | **0.0558** | 0.3550 |
| | | 100 | 0.0488 | 0.0498 | 0.0561 | 0.0506 | **0.0484** | 0.3816 |
| 0.90 | 0.40 | 10 | 0.1860 | 0.1614 | 0.1896 | 0.1586 | **0.1553** | 0.1704 |
| | | 20 | 0.1289 | 0.1126 | 0.1335 | 0.1152 | **0.1119** | 0.1323 |
| | | 50 | 0.0809 | 0.0708 | 0.0817 | 0.0732 | **0.0701** | 0.0954 |
| | | 75 | 0.0630 | 0.0565 | 0.0640 | 0.0590 | **0.0563** | 0.0815 |
| | | 100 | 0.0527 | 0.0495 | 0.0561 | 0.0506 | **0.0485** | 0.0771 |

* Convergence of this estimation method is not achieved in all cases.
Table 3. Household income by state dataset.
60,309 48,237 77,351 58,328 46,894 68,070 72,084 77,556 59,294 72,508
52,277 54,678 73,684 57,780 62,706 57,300 60,365 58,032 46,345 43,103
51,950 75,346 73,820 58,319 71,728 41,983 56,199 58,302 60,651 56,623
77,900 69,940 49,493 62,758 54,920 61,478 55,146 52,039 60,407 62,290
62,851 55,505 58,685 52,448 59,396 68,932 62,145 67,880 71,822 45,308
61,103 59,073
Table 4. Parameter estimates under all methods, K-S statistics, and the associated values for the household income data.

| Method | $\hat{c}$ | $\hat{\lambda}_1$ | $\hat{\lambda}_2$ | K-S | p-Value | $r_Q$ |
|---|---|---|---|---|---|---|
| ML | 59,636.68 | 0.0855 | 0.0909 | 0.0965 | 0.6819 | 0.9838 |
| LQLS | 61,322.05 | 0.0981 | 0.0723 | 0.1064 | 0.5624 | 0.9868 |
| P | 58,936.56 | 0.0904 | 0.1004 | 0.0858 | 0.8067 | 0.9822 |
| LS | 59,604.21 | 0.0893 | 0.0937 | 0.0980 | 0.6639 | 0.9836 |
| WLS | 59,578.92 | 0.0890 | 0.0934 | 0.0965 | 0.6823 | 0.9837 |
| QLS | 62,520.99 | 0.1093 | 0.0635 | 0.1222 | 0.3880 | 0.9858 |
Table 5. Peak concentration dataset.
12.100 1.701 9.074 7.056 7.025 4.777 8.870 7.656 10.920 6.806
8.757 5.670 12.890 7.119 2.523 9.055 7.341 3.938 10.460 11.050
6.678 3.026 6.806 11.750 5.742 4.007 7.340 2.849 6.418 8.456
5.702 7.262 6.086 7.568 7.941 14.030 7.844 3.150 7.818 8.554
5.796 3.497 7.087 15.800 4.316 7.591 13.990 9.185 6.286 11.040
11.280 6.804 5.292 6.273 10.840 6.587 8.757 9.344 5.513 11.040
16.160 11.500 5.072 9.041 8.927 7.560 4.694 6.832 15.380 10.250
10.550 7.655 5.229 14.900 7.087 2.646 3.704 9.293 6.117 13.650
5.072 6.045 6.458 4.993 7.403 13.480 11.530 9.926 3.451 16.910
9.010 3.215 5.859 10.020 6.962 11.440 5.765 6.928 5.171 7.825
Table 6. Parameter estimates under all methods, K-S statistics, and the associated values for the peak concentration dataset.

| Method | $\hat{c}$ | $\hat{\lambda}_1$ | $\hat{\lambda}_2$ | K-S | p-Value | $r_Q$ |
|---|---|---|---|---|---|---|
| ML | 8.3220 | 0.3189 | 0.1812 | 0.0634 | 0.7924 | 0.9928 |
| LQLS | 8.4020 | 0.3213 | 0.1729 | 0.0651 | 0.7649 | 0.9934 |
| P | 7.9760 | 0.3182 | 0.1984 | 0.0590 | 0.8565 | 0.9908 |
| LS | 7.7014 | 0.2740 | 0.2293 | 0.0509 | 0.9459 | 0.9834 |
| WLS | 8.1498 | 0.3142 | 0.1980 | 0.0584 | 0.8647 | 0.9908 |
| QLS | 9.1620 | 0.3842 | 0.1383 | 0.0854 | 0.4349 | 0.9930 |
