
New Goodness-of-Fit Tests for the Kumaraswamy Distribution

by David E. Giles
Department of Economics, University of Victoria, 3800 Finnerty Rd, Victoria, BC V8W 2Y2, Canada
Retired. Current address: 58 Rock Lake Court, RR1, Bancroft, ON K0L 1C0, Canada
Stats 2024, 7(2), 373-388; https://doi.org/10.3390/stats7020023
Submission received: 7 March 2024 / Revised: 13 April 2024 / Accepted: 17 April 2024 / Published: 22 April 2024
(This article belongs to the Section Statistical Methods)

Abstract

The two-parameter distribution known as the Kumaraswamy distribution is a very flexible alternative to the beta distribution with the same (0,1) support. Originally proposed in the field of hydrology, it has subsequently received a good deal of positive attention in both the theoretical and applied statistics literatures. Interestingly, the problem of testing formally for the appropriateness of the Kumaraswamy distribution appears to have received little or no attention to date. To fill this gap, in this paper, we apply a “biased transformation” methodology to several standard goodness-of-fit tests based on the empirical distribution function. A simulation study reveals that these (modified) tests perform well in the context of the Kumaraswamy distribution, in terms of both their low size distortion and respectable power. In particular, the “biased transformation” Anderson–Darling test dominates the other tests that are considered.

1. Introduction

The two-parameter distribution introduced by Kumaraswamy [1] is a very flexible alternative to the beta distribution with the same (0,1) support. Originally proposed for the analysis of hydrological data, it has subsequently received a good deal of attention in both the theoretical and applied statistics literature. For example, Sundar and Subbiah [2], Seifi et al. [3], Ponnambalam et al. [4], Ganji et al. [5], and Courard-Hauri [6] provide applications in various fields, and theoretical extensions are implemented by Cordeiro and Castro [7], Bayer et al. [8], and Cordeiro et al. [9], among others.
The distribution function for a random variable, X, that follows the Kumaraswamy distribution is as follows:
F(x) = 1 - (1 - x^a)^b,    a, b > 0; 0 < x < 1,        (1)
which can be inverted to give the quantile function as follows:
Q(y) ≡ F^{-1}(y) = [1 - (1 - y)^{1/b}]^{1/a},    0 < y < 1.        (2)
The corresponding density function is as follows:
f(x) = a b x^{a-1} (1 - x^a)^{b-1},        (3)
where a and b are both shape parameters. Some examples of the forms that this density can take are illustrated in Figure 1. In particular, similarly to the beta density, f(x) is unimodal if a > 1 and b > 1; uniantimodal if a < 1 and b < 1; increasing (decreasing) in x if a > 1 and b ≤ 1 (a ≤ 1 and b > 1); and constant if a = b = 1. Nadarajah [10] notes that the Kumaraswamy distribution is in fact a special case of the generalized beta distribution proposed by McDonald [11].
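For concreteness, (1)-(3) can be coded directly in base R. The following is a minimal sketch; the function names (pkumaraswamy, qkumaraswamy, dkumaraswamy, rkumaraswamy) are illustrative only, and packaged equivalents (for example, those in the 'VGAM' package used later in the paper) could be used instead.

pkumaraswamy <- function(x, a, b) 1 - (1 - x^a)^b                        # F(x) in (1)
qkumaraswamy <- function(y, a, b) (1 - (1 - y)^(1/b))^(1/a)              # Q(y) in (2)
dkumaraswamy <- function(x, a, b) a * b * x^(a - 1) * (1 - x^a)^(b - 1)  # f(x) in (3)
rkumaraswamy <- function(n, a, b) qkumaraswamy(runif(n), a, b)           # random variates by inversion of (2)

curve(dkumaraswamy(x, a = 2, b = 5), from = 0.001, to = 0.999)           # one of the unimodal shapes in Figure 1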
Jones [12] notes that the r-th moment of the Kumaraswamy distribution exists if r > -a and is given by the following:
E[X^r] = b B(1 + r/a, b),        (4)
where B(·, ·) is the complete beta function; and, from (2), the median of the distribution is as follows:
md = (1 - 0.5^{1/b})^{1/a}.        (5)
See Jones [12], Garg [13], and Mitnik [14] for a detailed discussion of the additional properties of the Kumaraswamy distribution.
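Both (4) and (5) are easy to evaluate numerically. A brief base-R sketch follows (the function names are illustrative; beta() is the complete beta function in base R):

kumar_moment <- function(r, a, b) b * beta(1 + r/a, b)    # r-th raw moment, equation (4)
kumar_median <- function(a, b) (1 - 0.5^(1/b))^(1/a)      # median, equation (5)

kumar_moment(1, a = 2, b = 5)    # the mean
kumar_median(2, 5)               # agrees with qkumaraswamy(0.5, 2, 5) from the previous sketch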
These properties, compared with those of the beta distribution, are considered by many to give the Kumaraswamy distribution a competitive edge. For example, compared with the formula for the cumulative distribution function of the beta distribution, the invertible closed-form expression in (1) is seen by some as being advantageous in the context of computer-intensive simulation analyses and modelling based on quantiles. The latter consideration is of particular interest in the context of regression analyses. Beta regression, based on the closed-form mean of that distribution, is well established (e.g., Ferrari and Cribari-Neto [15]), but robust regression based on the median is impractical because the beta median has no simple closed form. In contrast, the median of the Kumaraswamy distribution has the simple form given in (5), so robust regression based on this distribution is straightforward. See Mitnik and Baek [16] and Hamedi-Shahraki et al. [17], for example.
Interestingly, the problem of testing formally for the appropriateness of the Kumaraswamy distribution appears to have received little or no attention in the literature. Goodness-of-fit tests based on the empirical distribution function (EDF) are obvious candidates, but their properties are unexplored for this distribution. Raschke [18] observed that such tests were unavailable for the beta distribution, and he proposed a “biased transformation” that he then applied to the test of Anderson and Darling [19,20] to fill this gap. He also used this approach to construct an EDF test for the gamma distribution. Subsequently, Raschke [21] provided extensive simulation results that favoured the use of the “bias-transformed” Anderson–Darling test over various other tests based on the EDF, such as those of Kuiper [22] and Watson [23], the Cramér–von Mises test (Cramér [24]; von Mises [25]), and the Kolmogorov–Smirnov test (Kolmogorov [26]; Smirnov [27]).
In this paper, we apply Raschke’s methodology to the problem of constructing EDF goodness-of-fit tests for the Kumaraswamy distribution, and we compare the performances of several such standard tests in terms of both size and power. We find that Raschke’s method performs well in this context, with the Kolmogorov–Smirnov and Cramér–von Mises tests exhibiting the least size distortion, and the Anderson–Darling test being a clear choice in terms of power against a wide range of alternatives.
In the next section, we introduce the “biased transformation” testing strategy suggested by Raschke and describe the five well-known EDF tests that we consider in this paper. Section 3 provides the results of a simulation experiment that evaluates the sizes and powers of the tests, and an empirical application is included in Section 4. Some concluding remarks are presented in Section 5.

2. Raschke’s “Biased Transformation” Testing

In very simple terms, the procedure proposed by Raschke involves a transformation that converts the problem of testing the null hypothesis that the data follow the Kumaraswamy distribution into one of testing the null hypothesis of normality. The latter, of course, is readily performed using standard EDF tests. More specifically, the steps involved are as follows (Raschke [21]); an illustrative code sketch of these steps is given after the list:
(i)
Assuming that the data, X, follow the Kumaraswamy distribution, estimate the shape parameters, a and b, using maximum likelihood (ML) estimation. See Lemonte [28] and Jones [12] for details of the ML estimator for this distribution;
(ii)
Using these parameter estimates, generate a sample of Y, where Y = Φ^{-1}(F(X)), Φ is the distribution function for the standard normal distribution, and F(·) is given in (1);
(iii)
Obtain the ML estimates of the parameters of the normal distribution for Y;
(iv)
Apply an EDF test for normality to the Y data;
(v)
For a chosen significance level, α, reject H0: “X is Kumaraswamy” if H0: “Y is Normal” is rejected.
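The following base-R sketch illustrates steps (i)-(iii) for a single simulated sample. It is not the author's code: the ML estimates are obtained by direct numerical maximization with optim() rather than with the 'univariateML' package used in Section 3, and all object names are illustrative.

# Step (i): ML estimation of the Kumaraswamy shape parameters.
negloglik <- function(par, x) {
  a <- exp(par[1]); b <- exp(par[2])          # log-parameterization keeps a, b > 0
  -sum(log(a) + log(b) + (a - 1) * log(x) + (b - 1) * log(1 - x^a))
}
fit_kumar <- function(x) {
  opt <- optim(c(0, 0), negloglik, x = x)     # Nelder-Mead, starting from a = b = 1
  exp(opt$par)                                # returns c(a_hat, b_hat)
}

set.seed(123)
x  <- (1 - (1 - runif(200))^(1/4))^(1/3)      # Kumaraswamy(a = 3, b = 4) variates by inversion of (2)
ab <- fit_kumar(x)

# Steps (ii)-(iii): transform to Y = Phi^{-1}(F(X)), then fit a normal distribution to Y.
y     <- qnorm(1 - (1 - x^ab[1])^ab[2])       # F(x) from (1), followed by the probit transform
mu    <- mean(y)
sigma <- sqrt(mean((y - mu)^2))               # normal ML estimates for step (iii)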
We consider five standard EDF tests for normality at step (iv), with the n values of the Y data placed in ascending order, Y_(1) ≤ ... ≤ Y_(n); in what follows, F(·) denotes the fitted normal distribution function for Y. See Stephens [29] for more details. The first two of these tests are based on the quantities D+ = max_i [i/n - F(Y_(i))], D- = max_i [F(Y_(i)) - (i - 1)/n], and D = max[D+, D-]. The Kolmogorov–Smirnov test statistic is D* = D(√n - 0.01 + 0.85/√n), and Kuiper's test statistic is V* = V(√n + 0.05 + 0.82/√n), where V = (D+ + D-). In each case, H0 is rejected if the test statistic exceeds the appropriate critical value.
Further, defining W² = Σ_{i=1}^{n} [F(Y_(i)) - (2i - 1)/(2n)]² + 1/(12n), the Cramér–von Mises test statistic is given by W²* = W²(1.0 + 0.5/n). Similarly, if U² = W² - n[Σ_{i=1}^{n} F(Y_(i))/n - 0.5]², the Watson test statistic is defined as U²* = U²(1.0 + 0.5/n). Finally, the Anderson–Darling test statistic is defined as A²* = A²(1.0 + 0.75/n + 2.25/n²), where A² = -n - (1/n) Σ_{i=1}^{n} (2i - 1)[ln F(Y_(i)) + ln(1 - F(Y_(n+1-i)))]. Again, for these last three tests, the null hypothesis is rejected if the test statistic exceeds the appropriate critical value. In the next section, we consider nominal significance levels of α = 5% and α = 10%. The critical values for the five tests are from Table 4.7 of Stephens [29]. They are reported in the last row of Table 1 in the next section, to the degree of accuracy provided by Stephens.
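The five modified statistics can be computed from the transformed sample y and the fitted normal parameters obtained in the previous sketch. The code below is an illustrative base-R version, not the author's implementation; in practice, each statistic is compared with the appropriate critical value in the last row of Table 1.

# Stephens' modified EDF statistics for a normality test on y, with both
# normal parameters estimated; z is the fitted-normal probability integral
# transform of the ordered data.
edf_stats <- function(y, mu, sigma) {
  n <- length(y)
  z <- pnorm(sort(y), mean = mu, sd = sigma)
  i <- seq_len(n)
  Dp <- max(i/n - z)
  Dm <- max(z - (i - 1)/n)
  D  <- max(Dp, Dm)
  V  <- Dp + Dm
  W2 <- sum((z - (2 * i - 1)/(2 * n))^2) + 1/(12 * n)
  U2 <- W2 - n * (mean(z) - 0.5)^2
  A2 <- -n - mean((2 * i - 1) * (log(z) + log(1 - rev(z))))
  c(KS     = D  * (sqrt(n) - 0.01 + 0.85/sqrt(n)),
    Kuiper = V  * (sqrt(n) + 0.05 + 0.82/sqrt(n)),
    CvM    = W2 * (1 + 0.5/n),
    Watson = U2 * (1 + 0.5/n),
    AD     = A2 * (1 + 0.75/n + 2.25/n^2))
}
edf_stats(y, mu, sigma)   # reject "X is Kumaraswamy" if a statistic exceeds its critical value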

3. A Simulation Study

Using Raschke’s “biased transformation”, each of the five EDF tests for the Kumaraswamy null hypothesis has been evaluated in a simulation experiment, using R (R Core Team [30]). In all parts of the Monte Carlo study, 10,000 Monte Carlo replications were used. The ‘univariateML’ package (Moss and Nagler [31]) was used for obtaining the ML estimates of the Kumaraswamy distribution in step (i), and the ‘GoFKernel’ package (Pavia [32]) was used to invert the distribution in step (ii) in the last section. Random numbers for the truncated log-normal and triangular distribution were generated using the ‘EnvStats’ package (Millard and Kowarik [33]), while those for the Kumaraswamy distribution itself were generated using the ‘VGAM’ package (Yee [34]). The ‘trapezoid’ package (Hetzel [35]) and the ‘truncnorm’ package (Mersmann et al. [36]) were used to generate random variates from the trapezoidal and truncated normal distributions, respectively, and the R base ‘stats’ package was used for the beta variates. Finally, random variates from the truncated gamma distribution were generated using the ‘cascsim’ package (Bear et al. [37]), and those for the truncated Weibull distribution were obtained using the ‘ReIns’ package (Reynkens [38]). The R code that was used for both parts of the simulation experiment is available for download from https://github.com/DaveGiles1949/r-code/blob/master/Kumaraswmay%20Paper%20EDF%20Power%20Study.R (accessed on 5 March 2024).
In the first part of the experiment, we investigate the true “size” of each of the five EDF tests for various sample sizes (n) and a selection of values of the parameters (a and b) of the null distribution. As noted above, the tests are applied using nominal significance levels of both 5% and 10%, and we are concerned here with the extent of any “size distortion” that may arise.
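As a rough indication of how such a size experiment can be organized, the following sketch estimates the rejection rate of the modified Anderson–Darling test for one (a, b, n) design point, reusing the illustrative helper functions defined in the earlier sketches. It is not the author's code (which is linked above and uses the packages listed in the previous paragraph).

# Simulated size of the "biased transformation" A-D test for the Kumaraswamy(a, b) null,
# at the nominal 5% level (asymptotic critical value 0.752, from Table 1).
simulate_size <- function(a, b, n, reps = 10000, crit_AD = 0.752) {
  rejections <- 0
  for (r in seq_len(reps)) {
    x  <- qkumaraswamy(runif(n), a, b)                        # data generated under the null
    ab <- fit_kumar(x)                                        # step (i)
    y  <- qnorm(pkumaraswamy(x, ab[1], ab[2]))                # step (ii)
    s  <- edf_stats(y, mean(y), sqrt(mean((y - mean(y))^2)))  # steps (iii)-(iv)
    rejections <- rejections + unname(s["AD"] > crit_AD)
  }
  rejections / reps
}
# simulate_size(3, 4, 50)   # should be close to the nominal 5% level (cf. Table 1)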
The results obtained with seven representative (a,b) pairs and sample sizes ranging from n = 10 to n = 100 are shown in Table 1. The corresponding Kumaraswamy densities appear in Figure 1. The simulated sizes of all of the tests are very close to the nominal significance levels in all cases. This result is very encouraging and provides initial support for adopting the “biased transformation” EDF testing strategy for the Kumaraswamy distribution.
Of the five tests considered, the Kolmogorov–Smirnov test performs best, in terms of the smallest absolute difference between the nominal and simulated sizes, in 16 of the 36 cases at the 5% nominal level and 10 of the 36 cases at the 10% nominal level, as shown in Table 1. In the latter case, it is outperformed by the Cramér–von Mises test, which dominates in 14 of the 36 cases that are considered. Further, there is a general tendency for the simulated sizes of all of the tests to exceed the nominal significance levels when n ≤ 25, while the converse is true (in general) when n ≥ 50. An exception is when both of the distribution's parameters equal 0.5, in which case the density is uniantimodal. These size distortions are generally small, but their direction has implications for the results relating to the powers of the tests.
The second part of the Monte Carlo experiment investigates the powers of the five tests against a range of alternative hypotheses. The latter all involve distributions on the (0,1) interval, with some distributions truncated accordingly. It should be noted that the simulated powers that are reported are “raw powers” and are not “size-adjusted”. That is, the various critical values that are used are those reported at the end of Table 1. In practical applications, this is how a researcher would proceed.
The results of this part of the study are reported in Table 2. The sample sizes range from n = 10 to n = 1000. A wide range of parameter values was considered for each of the alternative distributions, and a representative selection of the results that were obtained are reported here. For the truncated log-normal distribution, “meanlog” is the mean of the distribution of the non-truncated random variable on the log scale, and “sdlog” is the standard deviation of the distribution of the non-truncated random variable on the log scale. For the trapezoidal distribution, m1 and m2 are the first and second modes, n1 is the growth parameter, and n3 is the decay parameter.
One immediate result that emerges is that, with only two exceptions, all of the tests are “unbiased” in all of the settings considered. That is, the power of the test exceeds the nominal significance level. The only exceptions that were encountered are when the alternative distribution is truncated log-normal, with both parameters equal to 0.5 and with a sample size of n = 10. This is a very encouraging result. A test that is “biased” has the unfortunate property that it rejects the null hypothesis less frequently when it is false than when it is true. Moreover, as the various tests are “consistent”, their powers increase as the sample size increases, for any given case.
The results in Table 2 also provide overwhelming support for the Anderson–Darling test in terms of power. Interestingly, this result is entirely consistent with the conclusion reached by Raschke [21] for the same “biased transformation” EDF tests in the context of the beta distribution. This may reflect the fact that the beta and Kumaraswamy distributions have densities that can take very similar shapes, depending on the values of the associated parameters. Moreover, Stephens [29] recommends the Anderson–Darling test over other EDF tests in general.
The Anderson–Darling test has the highest power among all five tests in all cases, except for very small samples when the alternative distribution is trapezoidal with parameters m1 = 1/4, m2 = 3/4, and n1 = n3 = 3, and for the truncated Weibull alternative with n = 10. Of the other tests under study, the Cramér–von Mises test ranks second in terms of power, followed by Watson's test and the Kolmogorov–Smirnov test. We find that Kuiper's test is the least powerful, in general.
As was noted in Section 1, the density for the Kumaraswamy distribution can take shapes very similar to those of the beta density, as the values of the two shape parameters vary in each case. The densities for the alternative beta distributions that are considered in the power analysis are depicted in Figure 2 and may be compared with the Kumaraswamy densities in Figure 1. This similarity suggests that there may be instances in which the proposed EDF tests have relatively low power. If the data are generated by a beta distribution whose characteristics can be mimicked extremely closely by a Kumaraswamy distribution with the same, or similar, shape parameters, the tests may fail to reject the latter distribution. An obvious case in point is when the values of both of these shape parameters are 0.5, and the densities of both distributions are uniantimodal, though not identical. As can be seen in Figure 1, the density for the Kumaraswamy distribution is slightly asymmetric in this case, while its beta distribution counterpart is symmetric. The relatively low power of all of the EDF tests, even for n = 1000, in this case can be seen in the last section of Table 2.
In view of these observations, we have considered a wide range of different values for the shape parameters associated with the beta distributions that are taken as alternative hypotheses in the power analysis of the EDF tests. A representative selection of the results appears in Table 2. There, we see that although the various tests have modest power when the data are generated by beta (2,4), beta (4,2), and beta (3,3) distributions, they perform extremely well against several other beta alternatives.
Although the degree of size distortion associated with the use of Stephens' critical values exhibited in Table 1 is generally quite small, a researcher may choose to simulate exact (bias-corrected) critical values for the various EDF tests. It is important to note that such values will depend on the sample size, n, and on the values of the shape parameters, a and b, associated with the Kumaraswamy distribution. In a practical application, the first of these values would be known, and estimates of the shape parameters could be obtained from the sample values in question.
However, the powers of the various tests will also depend on these three values, as well as on the characteristics of the distribution associated with the alternative hypothesis. This complicates the task of illustrating these powers when the critical values are simulated, but Table 3 provides a limited set of results. These results focus on relatively small sample sizes, as the size distortion becomes negligible for large n. A selection of the alternative hypotheses covered in Table 2 is chosen for further investigation, and the values of the Kumaraswamy shape parameters are chosen so that the null densities are similar in shape to those of the associated alternative distributions. The critical values themselves are obtained as the 90th and 95th percentiles of 10,000 simulated values of each test statistic under the null distribution. There is a different critical value for every entry in Table 3, so they are not reported individually. The powers themselves are then simulated from a further 10,000 replications under the alternative hypothesis, as was the case for the results in Table 2.
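A simulated critical value of this kind can be obtained along the following lines, again reusing the illustrative helper functions from the earlier sketches (this is a hedged sketch, not the author's code); the 95th (or 90th) percentile of the simulated null distribution of each statistic replaces the corresponding Stephens value.

# Simulate the 95th-percentile critical value of the modified A-D statistic
# under the Kumaraswamy(a, b) null, for a given sample size n.
simulate_crit <- function(a, b, n, reps = 10000, level = 0.95) {
  stat <- numeric(reps)
  for (r in seq_len(reps)) {
    x  <- qkumaraswamy(runif(n), a, b)
    ab <- fit_kumar(x)
    y  <- qnorm(pkumaraswamy(x, ab[1], ab[2]))
    stat[r] <- edf_stats(y, mean(y), sqrt(mean((y - mean(y))^2)))["AD"]
  }
  quantile(stat, probs = level)
}
# e.g., crit <- simulate_crit(2, 20, 25); the observed A-D statistic is then compared with crit.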
Ranking the various tests in terms of the power results in Table 3 leaves our previous conclusions unchanged. The Anderson–Darling test emerges as the preferred choice. The powers based on the simulated critical values tend to be smaller than their counterparts in Table 2. This is consistent with the earlier observation that the size distortion emerging in that table was positive for small sample sizes.

4. Empirical Applications

To illustrate the effectiveness of the “biased transformation” Anderson–Darling test, we present two applications with actual (economic) data. The R code and associated data files can be downloaded from https://github.com/DaveGiles1949/r-code/blob/master/Kumaraswamy%20Paper%20Applications.R (accessed on 5 March 2024).

4.1. The Hidden Economy

The first application uses data for the size of the so-called “hidden economy” or “underground economy” for 158 countries in each of the years from 1991 to 2017. These data measure the size of the hidden economy (HE) relative to the value of gross domestic product (GDP) in each country and are reported by Medina and Schneider [39]. These ratios range from 0.0543 for Switzerland to 0.5578 for Bolivia, with a mean of 0.2741 and a standard deviation of 0.1120. A random sample of size n = 250 was obtained from this population of 2329 values, using the “sample” command in R with replacement.
When a Kumaraswamy distribution is fitted to the sample data, the estimates of the two shape parameters are 2.8832 and 15.4084. See Figure 3a,b. The value for the “biased transformation” Anderson–Darling statistic is 0.6313, which is less than both the asymptotic and simulated 5% critical values of 0.7520 and 0.7298, respectively. So, at this significance level, we would not reject the hypothesis that the data follow a Kumaraswamy distribution. If a beta distribution is fitted to the data, the estimates of the two shape parameters are 4.3809 and 8.5369. The corresponding Anderson–Darling statistic (using the “biased transformation” and the beta distribution) is 1.3585. This exceeds both the asymptotic and simulated 5% critical values of 0.7520 and 0.7638, respectively, leading us to reject the hypothesis that the data follow a beta distribution. These two test results support each other and allow us to discriminate between the potential distributions.

4.2. Food Expenditure

The second application uses data for the fraction of household income that is spent on food. A random sample of 38 households is available in the “FoodExpenditure” data-set in the ‘betareg’ package for R (Zeileis et al. [40]). In our sample, the observations range from 0.1075 to 0.5612 in value, and the sample mean and standard deviation are 0.2897 and 0.1014, respectively. When a Kumaraswamy distribution is fitted to the data, the estimates of the two shape parameters are 2.9546 and 26.9653. See Figure 4a,b. The Anderson–Darling statistic is 0.8521, which exceeds the asymptotic and simulated 5% critical values of 0.7520 and 0.7415, respectively. This supports the rejection of the hypothesis that the data are Kumaraswamy-distributed. Fitting a beta distribution to the data yields estimates of 6.0721 and 14.8224 for the shape parameters. The corresponding Anderson–Darling statistic is 0.5114. The asymptotic and simulated 5% critical values are 0.7520 and 0.7700, respectively, suggesting that the hypothesis that the data are beta-distributed cannot be rejected at this significance level.

5. Conclusions

The Kumaraswamy distribution is an alternative to the beta distribution and has been applied in statistical studies in a wide range of disciplines. Its theoretical properties are well established, but the literature lacks a discussion of formal goodness-of-fit tests for this distribution. In this paper, we have applied the “biased transformation” methodology suggested by Raschke [18] to various standard tests based on the empirical distribution function and investigated their performance for the Kumaraswamy distribution.
The results of our simulation experiment, which focuses on both the size and power of these tests, can be summarized as follows. The “biased transformation” EDF goodness-of-fit testing strategy performs well for the Kumaraswamy distribution against a wide range of possible alternatives, though it needs to be treated with caution against certain beta distribution alternatives. In all cases, the Anderson–Darling test clearly emerges as the most powerful test of those considered and is recommended for practitioners.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

See the links referenced within the text of the paper.

Acknowledgments

I would like to thank the two anonymous reviewers whose helpful comments led to several improvements to the paper. I am also most grateful to Friedrich Schneider for supplying the data from Medina and Schneider [39] in electronic format.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Kumaraswamy, P. A generalized probability density function for double-bounded random processes. J. Hydrol. 1980, 46, 79–88. [Google Scholar] [CrossRef]
  2. Sundar, V.; Subbiah, K. Application of double bounded probability density function for analysis of ocean waves. Ocean Eng. 1989, 16, 193–200. [Google Scholar] [CrossRef]
  3. Seifi, A.; Ponnambalam, K.; Vlach, J. Maximization of Manufacturing Yield of Systems with Arbitrary Distributions of Component Values. Ann. Oper. Res. 2000, 99, 373–383. [Google Scholar] [CrossRef]
  4. Ponnambalam, K.; Seifi, A.; Vlach, J. Probabilistic design of systems with general distributions of parameters. Int. J. Circuit Theory Appl. 2001, 29, 527–536. [Google Scholar] [CrossRef]
  5. Ganji, A.; Ponnambalam, K.; Khalili, D.; Karamouz, M. Grain yield reliability analysis with crop water demand uncertainty. Stoch. Environ. Res. Risk Assess. 2006, 20, 259–277. [Google Scholar] [CrossRef]
  6. Courard-Hauri, D. Using Monte Carlo analysis to investigate the relationship between overconsumption and uncertain access to one’s personal utility function. Ecol. Econ. 2007, 64, 152–162. [Google Scholar] [CrossRef]
  7. Cordeiro, G.M.; Castro, M. A new family of generalized distributions. J. Stat. Comput. Simul. 2010, 81, 883–898. [Google Scholar] [CrossRef]
  8. Bayer, F.M.; Bayer, D.M.; Pumi, G. Kumaraswamy autoregressive moving average models for double bounded environmental data. J. Hydrol. 2017, 555, 385–396. [Google Scholar] [CrossRef]
  9. Cordeiro, G.M.; Machado, E.C.; Botter, D.A.; Sandoval, M.C. The Kumaraswamy normal linear regression model with applications. Commun. Stat.-Simul. Comput. 2018, 47, 3062–3082. [Google Scholar] [CrossRef]
  10. Nadarajah, S. On the distribution of Kumaraswamy. J. Hydrol. 2008, 348, 568–569. [Google Scholar] [CrossRef]
  11. McDonald, J.B. Some Generalized Functions for the Size Distribution of Income. Econometrica 1984, 52, 647. [Google Scholar] [CrossRef]
  12. Jones, M. Kumaraswamy’s distribution: A beta-type distribution with some tractability advantages. Stat. Methodol. 2009, 6, 70–81. [Google Scholar] [CrossRef]
  13. Garg, M. On Distribution of Order Statistics from Kumaraswamy Distribution. Kyungpook Math. J. 2008, 48, 411–417. [Google Scholar] [CrossRef]
  14. Mitnik, P.A. New properties of the Kumaraswamy distribution. Commun. Stat.-Theory Methods 2013, 42, 741–755. [Google Scholar] [CrossRef]
  15. Ferrari, S.; Cribari-Neto, F. Beta Regression for Modelling Rates and Proportions. J. Appl. Stat. 2004, 31, 799–815. [Google Scholar] [CrossRef]
  16. Mitnik, P.A.; Baek, S. The Kumaraswamy distribution: Median-dispersion re-parameterizations for regression modeling and simulation-based estimation. Stat. Pap. 2013, 54, 177–192. [Google Scholar] [CrossRef]
  17. Hamedi-Shahraki, S.; Rasekhi, A.; Yekaninejad, M.S.; Eshraghian, M.R.; Pakpour, A.H. Kumaraswamy regression modeling for Bounded Outcome Scores. Pak. J. Stat. Oper. Res. 2021, 17, 79–88. [Google Scholar] [CrossRef]
  18. Raschke, M. The Biased Transformation and Its Application in Goodness-of-Fit Tests for the Beta and Gamma Distribution. Commun. Stat.-Simul. Comput. 2009, 38, 1870–1890. [Google Scholar] [CrossRef]
  19. Anderson, T.W.; Darling, D.A. Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann. Math. Stat. 1952, 23, 193–212. [Google Scholar] [CrossRef]
  20. Anderson, T.W.; Darling, D.A. A test for goodness of fit. J. Am. Stat. Assoc. 1954, 49, 300–310. [Google Scholar] [CrossRef]
  21. Raschke, M. Empirical behaviour of tests for the beta distribution and their application in environmental research. Stoch. Environ. Res. Risk Assess. 2011, 25, 79–89. [Google Scholar] [CrossRef]
  22. Kuiper, N.H. Tests concerning random points on a circle. Proc. K. Ned. Akad. Wet. A 1962, 63, 38–47. [Google Scholar] [CrossRef]
  23. Watson, G.S. Goodness-of-fit tests on a circle. I. Biometrika 1961, 48, 109–114. [Google Scholar] [CrossRef]
  24. Cramér, H. On the composition of elementary errors. Scand. Actuar. J. 1928, 1928, 13–74. [Google Scholar] [CrossRef]
  25. von Mises, R.E. Wahrscheinlichkeit, Statistik und Wahrheit; Julius Springer: Vienna, Austria, 1928. [Google Scholar]
  26. Kolmogorov, A. Sulla determinazione empirica di una legge di distribuzione. G. dell’Ist. Ital. Attuari 1933, 4, 83–91. [Google Scholar]
  27. Smirnov, N. Table for Estimating the Goodness of Fit of Empirical Distributions. Ann. Math. Stat. 1948, 19, 279–281. [Google Scholar] [CrossRef]
  28. Lemonte, A.J. Improved point estimation for the Kumaraswamy distribution. J. Stat. Comput. Simul. 2011, 81, 1971–1982. [Google Scholar] [CrossRef]
  29. Stephens, M.A. Tests based on EDF statistics. In Goodness-of-Fit Techniques; D’Augustino, R.B., Stephens, M.A., Eds.; Marcel Dekker: New York, NY, USA, 1986; pp. 97–194. [Google Scholar]
  30. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024; Available online: https://www.R-project.org/ (accessed on 5 March 2024).
  31. Moss, J.; Nagler, T. Package ‘univariateML’. 2022. Available online: https://cran.r-project.org/web/packages/univariateML/univariateML.pdf (accessed on 5 March 2024).
  32. Pavia, J.M. Package ‘GoFKernel’. 2022. Available online: https://cran.r-project.org/web/packages/GoFKernel/GoFKernel.pdf (accessed on 5 March 2024).
  33. Millard, S.P.; Kowarik, A. Package ‘EnvStats’. 2023. Available online: https://cran.r-project.org/web/packages/EnvStats/EnvStats.pdf (accessed on 5 March 2024).
  34. Yee, T. Package ‘VGAM’. 2023. Available online: https://cran.r-project.org/web/packages/VGAM/VGAM.pdf (accessed on 5 March 2024).
  35. Hetzel, J.T. Package ‘trapezoid’. 2022. Available online: https://cran.r-project.org/web/packages/trapezoid/trapezoid.pdf (accessed on 5 March 2024).
  36. Mersmann, O.; Trautmann, H.; Steuer, D.; Bornkamp, B. Package ‘Truncnorm’. 2023. Available online: https://cran.r-project.org/web/packages/truncnorm/truncnorm.pdf (accessed on 5 March 2024).
  37. Bear, R.; Shang, K.; You, H.; Fannin, B. Package ‘Cascsim’. 2022. Available online: https://cran.r-project.org/web/packages/cascsim/cascsim.pdf (accessed on 5 March 2024).
  38. Reynkens, T. Package ‘ReIns’. 2023. Available online: https://cran.r-project.org/web/packages/ReIns/ReIns.pdf (accessed on 5 March 2024).
  39. Medina, L.; Schneider, F. Shedding Light on the Shadow Economy: A Global Database and the Interaction with the Official One; CESifo Working Paper No. 7981; Ludwig Maximilian University of Munich: Munich, Germany, 2019. [Google Scholar] [CrossRef]
  40. Zeileis, A.; Cribari-Neto, F.; Gruen, B.; Kosmidis, I.; Simas, A.B.; Rocha, A.V. Package ‘betareg’. 2022. Available online: https://cran.r-project.org/web/packages/betareg/betareg.pdf (accessed on 5 March 2024).
Figure 1. Kumaraswamy densities.
Figure 2. Beta densities.
Figure 3. (a) Hidden economy densities. (b) Distribution functions.
Figure 4. (a) Food expenditure densities. (b) Distribution functions.
Table 1. Simulated sizes of the EDF tests for various shape parameter values *.
n | K-S | Kuiper | C-M | Watson | A-D | K-S | Kuiper | C-M | Watson | A-D
(the first five test columns are for α = 5%; the last five are for α = 10%)
a = 3, b = 4
10 | 0.0585 | 0.0597 | 0.0561 | 0.0550 | 0.0638 | 0.1193 | 0.1210 | 0.1133 | 0.1189 | 0.1277
25 | 0.0492 | 0.0464 | 0.0482 | 0.0486 | 0.0487 | 0.1053 | 0.0978 | 0.0990 | 0.1015 | 0.1037
50 | 0.0505 | 0.0475 | 0.0459 | 0.0451 | 0.0472 | 0.1007 | 0.0981 | 0.0941 | 0.0977 | 0.0982
100 | 0.0497 | 0.0467 | 0.0488 | 0.0480 | 0.0486 | 0.1002 | 0.0925 | 0.0932 | 0.0967 | 0.0949
a = 3, b = 8
10 | 0.0577 | 0.0593 | 0.0559 | 0.0551 | 0.0628 | 0.1174 | 0.1209 | 0.1130 | 0.1178 | 0.1255
25 | 0.0479 | 0.0460 | 0.0474 | 0.0481 | 0.0485 | 0.1040 | 0.0972 | 0.0978 | 0.1013 | 0.1033
50 | 0.0498 | 0.0470 | 0.0447 | 0.0434 | 0.0463 | 0.1002 | 0.0969 | 0.0924 | 0.0976 | 0.0972
100 | 0.0494 | 0.0462 | 0.0481 | 0.0475 | 0.0475 | 0.0986 | 0.0926 | 0.0922 | 0.0956 | 0.0927
a = 2, b = 2.5
10 | 0.0598 | 0.0590 | 0.0570 | 0.0560 | 0.0642 | 0.1184 | 0.1214 | 0.1133 | 0.1192 | 0.1267
25 | 0.0500 | 0.0470 | 0.0486 | 0.0486 | 0.0496 | 0.1048 | 0.0981 | 0.0991 | 0.1019 | 0.1034
50 | 0.0515 | 0.0479 | 0.0460 | 0.0460 | 0.0479 | 0.1021 | 0.0988 | 0.0951 | 0.0986 | 0.0985
100 | 0.0489 | 0.0468 | 0.0496 | 0.0494 | 0.0492 | 0.1010 | 0.0929 | 0.0946 | 0.0974 | 0.0957
a = 2.0, b = 5.0
10 | 0.0586 | 0.0597 | 0.0557 | 0.0550 | 0.0630 | 0.1184 | 0.1208 | 0.1130 | 0.1184 | 0.1269
25 | 0.0491 | 0.0463 | 0.0478 | 0.0486 | 0.0489 | 0.1055 | 0.0977 | 0.0983 | 0.1015 | 0.1032
50 | 0.0503 | 0.0472 | 0.0457 | 0.0445 | 0.0468 | 0.1003 | 0.0973 | 0.0931 | 0.0976 | 0.0976
100 | 0.0496 | 0.0464 | 0.0488 | 0.0474 | 0.0485 | 0.1000 | 0.0923 | 0.0930 | 0.0959 | 0.0941
a = 2.0, b = 20.0
10 | 0.0576 | 0.0593 | 0.0550 | 0.0550 | 0.0620 | 0.1170 | 0.1205 | 0.1106 | 0.1172 | 0.1248
25 | 0.0473 | 0.0455 | 0.0474 | 0.0481 | 0.0476 | 0.1034 | 0.0969 | 0.0978 | 0.1018 | 0.1022
50 | 0.0492 | 0.0464 | 0.0442 | 0.0430 | 0.0456 | 0.0999 | 0.0963 | 0.0911 | 0.0974 | 0.0964
100 | 0.0485 | 0.0463 | 0.0469 | 0.0466 | 0.0459 | 0.0978 | 0.0923 | 0.0913 | 0.0943 | 0.0923
a = 1.0, b = 3.0
10 | 0.0589 | 0.0592 | 0.0560 | 0.0553 | 0.0642 | 0.1189 | 0.1209 | 0.1131 | 0.1192 | 0.1273
25 | 0.0497 | 0.0464 | 0.0486 | 0.0486 | 0.0493 | 0.1051 | 0.0980 | 0.0993 | 0.1021 | 0.1037
50 | 0.0512 | 0.0477 | 0.0460 | 0.0455 | 0.0474 | 0.1014 | 0.0986 | 0.0948 | 0.0981 | 0.0985
100 | 0.0495 | 0.0470 | 0.0493 | 0.0487 | 0.0487 | 0.1002 | 0.0926 | 0.0940 | 0.0968 | 0.0955
a = 0.5, b = 0.5
10 | 0.0625 | 0.0603 | 0.0574 | 0.0582 | 0.0679 | 0.1260 | 0.1199 | 0.1179 | 0.1203 | 0.1346
25 | 0.0592 | 0.0527 | 0.0619 | 0.0590 | 0.0643 | 0.1154 | 0.1071 | 0.1158 | 0.1168 | 0.1235
50 | 0.0650 | 0.0571 | 0.0689 | 0.0624 | 0.0742 | 0.1267 | 0.1165 | 0.1285 | 0.1238 | 0.1366
100 | 0.0864 | 0.0730 | 0.0940 | 0.0819 | 0.1017 | 0.1563 | 0.1356 | 0.1605 | 0.1509 | 0.1706
Crit. | 0.895 | 1.489 | 0.126 | 0.117 | 0.752 | 0.819 | 1.386 | 0.104 | 0.096 | 0.631
* Crit. = Upper-tail critical values when the normal distribution's parameters are both estimated. Source: Stephens [29], Table 4.7. (Stephens reports these values to only 3 decimal places). α = nominal significance level at which the tests are applied. K-S = Kolmogorov and Smirnov; C-M = Cramér and von Mises; and A-D = Anderson and Darling.
Table 2. Simulated powers of the EDF tests against various alternative hypotheses.
n | K-S | Kuiper | C-M | Watson | A-D | K-S | Kuiper | C-M | Watson | A-D
(in each row, the first five entries after n are for α = 5% and the last five are for α = 10%)
Triangular (mode = 1/4)
α = 5% α = 10%
100.07840.07520.07650.07540.08780.14390.14090.14260.14630.1604
250.09780.08680.10970.09870.11750.17660.15500.18570.17850.1988
500.14220.12180.17070.14980.18060.23970.20610.25970.24220.2742
1000.24700.21210.30510.26390.31600.37400.31820.41570.37730.4333
2500.54910.48850.64900.58310.67230.67620.62240.75640.7030.7712
5000.85290.81780.92460.88650.93470.91960.89400.95730.93510.9637
10000.99290.99090.99780.99610.99860.99750.99630.99920.99860.9994
Triangular (mode = 7/8)
α = 5% α = 10%
100.06760.07110.06600.06440.07550.13140.13300.12830.13530.1446
250.07360.06800.08220.07680.08660.14220.12940.14550.14620.1538
500.09690.08840.11440.10520.12330.17970.16260.19660.18540.2124
1000.15750.13360.19280.16590.21910.25610.22940.29640.27030.3264
2500.35880.31920.45300.38830.51220.50530.45320.58530.52310.6366
5000.65680.61980.77710.69940.83860.78450.74460.86220.81270.9001
10000.93640.93100.97890.96030.98900.97300.96710.99040.98150.9954
Truncated Log-Normal (meanlog = 0, sdlog = 1)
α = 5% α = 10%
100.07740.07580.07620.07380.08830.14580.14090.14200.14520.1614
250.10110.08620.11470.10030.12600.18360.15600.19180.17960.2139
500.15360.12120.1836 0.15480.20730.25330.20370.28420.25280.3148
1000.27780.21930.34310.27850.39070.41530.33100.46720.40350.5182
2500.61430.53760.74360.63180.80540.73990.66460.83190.75190.8795
5000.91050.88150.97130.92900.98670.96000.93970.98700.96570.9936
10000.99810.99751.00000.99921.00000.99960.99941.00000.99991.0000
Truncated Log-Normal (meanlog= 0.5, sdlog = 0.5)
α = 5% α = 10%
100.03980.04000.03750.03820.04320.07860.07570.07510.07650.0841
250.06530.05890.06720.06350.07180.13030.11670.13130.12720.1382
500.07980.06710.08680.07600.09560.15030.12900.15390.14590.1673
1000.11640.09240.13490.11060.14740.19970.16790.21680.19460.2374
2500.24270.18340.29660.23650.33710.36680.29370.41200.3540.4568
5000.44740.35380.54220.44010.61560.58610.48800.66610.57420.7337
10000.75570.67840.86700.76780.91630.85940.79800.92550.86230.9582
Truncated Normal (mean = 0.5, sd = 0.1)
α = 5% α = 10%
100.07040.06480.06860.06290.07990.13480.12530.12600.12870.1476
250.08390.07700.09520.08560.10760.15400.13680.16440.15170.1798
500.11880.09860.13950.11660.16070.20310.16980.22090.19910.2457
1000.19280.15060.23090.18680.26820.30290.24970.33930.29450.3834
2500.42630.34930.51790.42060.58400.56250.47820.64090.55420.6987
5000.71510.65090.83450.73210.88220.82270.76610.89880.83360.9310
10000.95160.94080.98910.96750.99570.98130.97230.99540.98570.9984
Truncated Normal (mean = 0.8, sd = 0.8)
α = 5% α = 10%
100.05770.05860.05470.05590.06470.11700.11660.11370.11630.1306
250.05190.05000.05290.05170.05400.10400.09810.10450.10800.1096
500.05440.05070.05840.05590.05730.10810.10030.10440.10630.1088
1000.05580.05440.05580.05520.05740.11930.10750.11150.11260.1154
2500.06810.06570.07200.06920.07650.12930.12440.12840.12410.1349
5000.08730.08670.09990.09040.10990.16480.15780.16960.16280.1790
10000.14040.13000.15660.14100.17170.23180.21960.24760.23740.2695
Trapezoidal (m1 = 1/8, m2 = 3/8; n1 = n3 = 2)
α = 5% α = 10%
100.06360.06520.06060.06050.06910.12500.12490.11930.12540.1372
250.06320.05700.06510.06200.07300.12820.11690.12930.12610.1400
500.07710.06520.08300.07030.09680.14730.12750.15370.14170.1704
1000.11240.08770.12920.10540.15310.19340.16160.21460.18890.2477
2500.23060.17720.28850.21960.35370.35210.28730.41030.34330.4821
5000.42140.36130.54050.42480.65120.57050.49370.66860.56500.7640
10000.72580.70110.87370.76250.93710.84090.81480.93220.86390.9694
Trapezoidal (m1 = 5/8, m2 = 7/8; n1 = n3 = 2)
α = 5% α = 10%
100.05870.06180.05700.05750.06780.12210.12410.11720.12320.1331
250.05530.05180.05700.05720.06030.11360.10400.11210.11510.1176
500.06280.05770.06220.05800.06810.11850.11020.12310.12010.1309
1000.07260.06370.08220.07400.09360.14390.12220.14770.13900.1635
2500.12700.10210.14730.12030.17810.21810.18270.23430.20510.2822
5000.21830.17820.27350.21570.34520.34290.28560.39340.33290.4788
10000.39730.35600.53200.41800.65360.55430.50580.66740.56760.7692
Trapezoidal (m1 = 1/4, m2 = 3/4; n1 =n3 = 3)
α = 5% α = 10%
100.06260.07320.06370.06870.07190.12710.13890.12910.13950.1439
250.07110.08620.07810.08580.08090.14020.15600.15170.16630.1576
500.09730.12850.12240.13130.12740.18600.21720.21430.23020.2213
1000.15430.21680.21180.23170.22720.27070.33300.33060.35740.3496
2500.38160.50860.52250.55030.56070.54770.63860.65880.68630.6912
5000.71780.84020.86350.87910.88850.84470.90950.92380.93220.9389
10000.97190.99140.99380.99530.99690.99050.99710.99830.99890.9991
Truncated Gamma (2,3)
α = 5% α = 10%
100.22110.27280.23640.24980.31550.32660.39580.36260.37590.4511
250.50180.62930.58070.58130.72940.64890.73850.70250.70580.8269
500.87450.94420.90620.90230.97480.93910.97080.95160.95070.9890
1000.99850.99980.99890.99831.00000.99970.99990.99960.99961.0000
2501.00001.00001.00001.00001.00001.00001.00001.00001.00001.0000
Truncated Gamma (2,6)
α = 5% α = 10%
100.14230.13210.15500.14230.18340.22340.20820.23460.22220.2713
250.25560.23880.32220.28180.37170.36620.34260.41420.37880.4710
500.44070.42110.54280.47750.61160.56460.52510.63680.58720.7005
1000.71040.69540.81240.75160.86490.80720.77990.87030.82680.9083
2500.97400.97430.99100.98300.99580.98780.98600.99560.99060.9977
5000.99980.99971.00000.99971.00001.00000.99991.00000.99991.0000
10001.00001.00001.00001.00001.00001.00001.00001.00001.00001.0000
Truncated Weibull (shape = 2, scale = 1)
α = 5% α = 10%
100.05690.05530.05080.05240.05400.11380.1120.10520.10990.1086
250.05870.05370.06080.05880.06300.11920.10840.11560.11480.1266
500.06430.05800.06960.06440.07450.12430.11680.12750.12430.1345
1000.08520.07020.09350.08070.10040.15080.13060.15630.14550.1701
2500.14500.11080.16070.13450.17390.23540.19430.25350.22440.2744
5000.24780.19000.28740.23810.31430.36340.29430.39380.34560.4239
10000.43080.34570.51670.42600.56300.57170.47570.63330.55630.6767
Beta (3,3)
α = 5% α = 10%
100.05450.05730.05390.05280.06060.11280.11150.10650.11050.1215
250.05060.04840.05340.05130.05420.10470.09820.10360.10540.1088
500.05270.05010.05650.05420.05690.11140.09750.10750.10730.1149
1000.05770.05320.06000.05860.06290.12150.11120.11860.1160.1232
2500.07900.06410.08210.07060.08970.14920.12660.14920.13690.1571
5000.11510.08630.11480.09640.13020.19390.16030.19470.17140.2145
10000.18470.13390.20650.15860.23400.29190.22570.30780.26030.3401
Beta (20,20)
α = 5% α = 10%
100.07100.06700.06890.06640.07900.13440.12880.12960.13250.1493
250.08920.07650.09740.08580.11000.16080.13970.17380.16100.1941
500.13400.10480.16290.13470.18080.23130.18710.25010.22350.281
1000.23950.17840.29220.22870.33730.36820.28420.41570.34940.4655
2500.53800.44860.65980.53980.73550.68140.58580.77030.66770.8338
5000.85800.80400.94270.86910.96920.92950.88960.97190.92910.9875
10000.99250.99040.99950.99530.99990.99810.99680.99970.99870.9999
Beta (4,2)
α = 5% α = 10%
100.05620.05880.05620.05780.06300.11550.11530.11020.11380.1229
250.05090.05320.05070.05030.05220.10030.10190.10090.10460.1072
500.05420.05160.05550.05440.05740.10800.09820.10500.10710.1100
1000.05490.05400.05620.05450.05590.10790.10380.10740.10760.1076
2500.06720.06130.06410.05790.06490.12370.11340.11860.11560.1232
5000.07400.06470.07610.06740.07950.14140.12450.13460.12830.1433
10000.10480.07870.10850.09050.11670.18660.14890.18080.16350.1956
Beta (2,4)
α = 5% α = 10%
100.05670.05840.05330.05570.06010.11180.11270.10750.11310.1194
250.05210.05440.05530.05390.05750.10530.10320.10660.10690.1120
500.05710.05390.05790.05340.05870.11080.10360.11210.11170.1180
1000.05990.05630.06310.05990.06740.11680.10570.11640.11160.1238
2500.07990.06710.07820.06890.08400.14550.12620.14360.13350.1569
5000.10660.08350.11560.09730.12540.18960.15400.18970.16860.2060
10000.17200.12480.18550.14980.21180.27440.20680.28790.24260.3124
Beta (3,20)
α = 5% α = 10%
100.05800.05850.05620.05500.06510.11800.11700.11250.11470.1297
250.06620.05810.06880.06400.07380.12700.11820.12870.12430.1413
500.08330.06780.08790.07810.09530.15160.12910.15740.14520.1677
1000.11820.09320.13740.11170.15040.21040.16430.22110.19360.2384
2500.24260.17470.29230.22260.33650.36740.28250.41190.34310.4621
5000.44800.34500.54580.43310.61450.59230.48190.67180.56560.7343
10000.75860.67530.86640.75770.91390.86000.79280.92580.85460.9548
Beta (0.5,0.5)
α = 5% α = 10%
100.05470.05570.04820.04910.05770.11540.11210.10530.11290.1224
250.05000.05060.05290.05310.05420.10670.09970.10730.11200.1131
500.05000.05010.05320.05050.05410.10480.10420.10520.11020.1070
1000.05380.05330.05530.05420.05850.10640.10560.10700.10690.1086
2500.06630.06150.06750.06310.07020.12630.11650.12350.12100.1259
5000.08340.07260.08530.07720.08980.15510.13630.15300.14570.1600
10000.11920.09930.12750.11370.13630.19830.17500.20710.18950.2232
Table 3. Simulated powers of the EDF tests with simulated critical values *.
n | K-S | Kuiper | C-M | Watson | A-D | K-S | Kuiper | C-M | Watson | A-D
(the first five test columns are for α = 5%; the last five are for α = 10%)
Triangular (mode = 1/4); a = 2, b = 20
10 | 0.0649 | 0.0612 | 0.0647 | 0.0638 | 0.0662 | 0.1227 | 0.1124 | 0.1206 | 0.1162 | 0.1251
25 | 0.0912 | 0.0829 | 0.0482 | 0.1001 | 0.1021 | 0.1531 | 0.1443 | 0.1715 | 0.1606 | 0.1729
50 | 0.1243 | 0.1171 | 0.1527 | 0.1398 | 0.1535 | 0.2146 | 0.1931 | 0.2421 | 0.2224 | 0.2410
100 | 0.2078 | 0.1914 | 0.2537 | 0.2269 | 0.2600 | 0.3197 | 0.2906 | 0.3594 | 0.3308 | 0.3644
Truncated Log-Normal (meanlog = sdlog = 0.5); a = 2, b = 20
10 | 0.0507 | 0.0508 | 0.0502 | 0.0496 | 0.0511 | 0.1005 | 0.1015 | 0.1010 | 0.0098 | 0.1001
25 | 0.0691 | 0.0617 | 0.0729 | 0.0655 | 0.0757 | 0.1204 | 0.1177 | 0.1334 | 0.1281 | 0.1328
50 | 0.0817 | 0.0974 | 0.0970 | 0.0877 | 0.1009 | 0.1494 | 0.1364 | 0.1672 | 0.1531 | 0.1730
100 | 0.1253 | 0.1078 | 0.1454 | 0.1274 | 0.1603 | 0.2132 | 0.1856 | 0.2379 | 0.2108 | 0.2600
Trapezoidal (m1 = 1/4, m2 = 3/4; n1 = n3 = 3); a = 3, b = 8
10 | 0.0505 | 0.0574 | 0.0527 | 0.0552 | 0.0514 | 0.1073 | 0.1158 | 0.1045 | 0.1114 | 0.1049
25 | 0.0697 | 0.0882 | 0.0817 | 0.0874 | 0.0818 | 0.1330 | 0.1562 | 0.1507 | 0.1573 | 0.1530
50 | 0.0959 | 0.1328 | 0.1266 | 0.1415 | 0.1311 | 0.1796 | 0.2183 | 0.2218 | 0.2325 | 0.2254
100 | 0.1263 | 0.2346 | 0.2299 | 0.2503 | 0.2454 | 0.2820 | 0.3522 | 0.3567 | 0.3770 | 0.3742
Truncated Gamma (2,3); a = 3, b = 4
10 | 0.1924 | 0.2435 | 0.2263 | 0.2353 | 0.2767 | 0.3011 | 0.3435 | 0.3351 | 0.3425 | 0.4010
25 | 0.5145 | 0.6463 | 0.5869 | 0.5889 | 0.7374 | 0.6464 | 0.7506 | 0.7104 | 0.7113 | 0.8297
50 | 0.8733 | 0.9454 | 0.9143 | 0.9236 | 0.9756 | 0.9392 | 0.9706 | 0.9570 | 0.9528 | 0.9888
100 | 0.9978 | 0.9998 | 0.9986 | 0.9979 | 1.0000 | 0.9993 | 0.9999 | 0.9999 | 0.9998 | 1.0000
Beta (20,20); a = 3, b = 8
10 | 0.0623 | 0.0603 | 0.0651 | 0.0616 | 0.0678 | 0.1201 | 0.1127 | 0.1225 | 0.1201 | 0.1257
25 | 0.0942 | 0.0824 | 0.1033 | 0.0907 | 0.1154 | 0.1574 | 0.1457 | 0.1806 | 0.1647 | 0.1921
50 | 0.1357 | 0.1126 | 0.1692 | 0.1461 | 0.1934 | 0.2273 | 0.1908 | 0.2704 | 0.2272 | 0.2892
100 | 0.2399 | 0.1916 | 0.3063 | 0.2455 | 0.3511 | 0.3659 | 0.3040 | 0.4326 | 0.3610 | 0.4767
Beta (3,20); a = 2, b = 20
10 | 0.0572 | 0.0544 | 0.0568 | 0.0568 | 0.0565 | 0.1083 | 0.1073 | 0.1093 | 0.1072 | 0.1100
25 | 0.0692 | 0.0679 | 0.0755 | 0.0711 | 0.0777 | 0.1242 | 0.1239 | 0.1340 | 0.1254 | 0.1363
50 | 0.0822 | 0.0689 | 0.0940 | 0.0846 | 0.1015 | 0.1507 | 0.1312 | 0.1661 | 0.1474 | 0.1738
100 | 0.1209 | 0.0993 | 0.1425 | 0.1193 | 0.1595 | 0.2078 | 0.1778 | 0.2329 | 0.2002 | 0.2539
Beta (0.5,0.5); a = b = 0.5
10 | 0.0442 | 0.0491 | 0.0498 | 0.0501 | 0.0499 | 0.0954 | 0.0966 | 0.0958 | 0.0974 | 0.0938
25 | 0.0534 | 0.0552 | 0.0569 | 0.0562 | 0.0546 | 0.1002 | 0.1037 | 0.1026 | 0.1032 | 0.1048
50 | 0.0478 | 0.0529 | 0.0524 | 0.0519 | 0.0510 | 0.0965 | 0.0997 | 0.1074 | 0.1092 | 0.1058
100 | 0.0549 | 0.0539 | 0.0546 | 0.0532 | 0.0563 | 0.1099 | 0.1100 | 0.1114 | 0.1121 | 0.1141
* a and b are the shape parameters for the Kumaraswamy distribution.

