Article

Modifications to the Jarque–Bera Test

1. Department of Business Analytics, Accounting and Statistics and Research Laboratory of Sustainable Development of Socio-Economic Systems, Siberian Institute of Management—Branch of the Russian Presidential Academy of National Economy and Public Administration, 630102 Novosibirsk, Russia
2. Department of Statistics, Novosibirsk State University of Economics and Management, 630099 Novosibirsk, Russia
3. Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
4. Department of Computer Science in Economics, Novosibirsk State Technical University (NSTU), 630087 Novosibirsk, Russia
5. Department of Higher Mathematics, Siberian State University of Geosystems and Technologies (SSUGT), 630108 Novosibirsk, Russia
6. Department of Statistics, Institute of Mathematics and Statistics, University of São Paulo (USP), São Paulo CEP 05508-220, Brazil
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(16), 2523; https://doi.org/10.3390/math12162523
Submission received: 20 June 2024 / Revised: 29 July 2024 / Accepted: 13 August 2024 / Published: 15 August 2024
(This article belongs to the Special Issue Mathematical Modeling and Applications in Industrial Organization)

Abstract

The Jarque–Bera test is commonly used in statistics and econometrics to test the hypothesis that sample elements adhere to a normal distribution with an unknown mean and variance. This paper proposes several modifications to this test, allowing for testing hypotheses that the considered sample comes from: a normal distribution with a known mean (variance unknown); a normal distribution with a known variance (mean unknown); or a normal distribution with a known mean and variance. For the significance levels α = 0.05 and α = 0.01, we compare the power of our normality tests with the most well-known and popular tests using the Monte Carlo method: the Kolmogorov–Smirnov (KS), Anderson–Darling (AD), Cramér–von Mises (CVM), Lilliefors (LF), and Shapiro–Wilk (SW) tests. Under each alternative distribution, 1000 datasets were generated with the sample sizes n = 25, 50, 75, 100, 150, 200, 250, 500, and 1000. The simulation study showed that the suggested tests often have the best power properties. Our study is also methodological in nature, providing detailed proofs accessible to undergraduate students in statistics and probability, unlike the works of Jarque and Bera.

1. Introduction

C. Jarque and A.K. Bera proposed the following goodness-of-fit test (see [1,2,3]) to determine whether the empirical skewness and kurtosis match those of a normal distribution. The hypothesis to be tested is as follows:
H0. The population from which the sample is drawn is normally distributed;
against the alternative hypothesis
H1. The population from which the sample is drawn follows a distribution from the Pearson family that is not normal.
More precisely, the null hypothesis is formulated as follows: the sample comes from a population with a finite eighth moment, the odd central moments (up to the seventh) are equal to zero, and the kurtosis is equal to three, $K = 3$. Note that, within any reasonable family of distributions, only the normal distribution has these properties; in particular, this is true for the Pearson family. In practice, the Pearson family is typically not mentioned in the hypothesis.
The test statistic is a combination of the squares of the normalized skewness, S, and kurtosis, K:
$$ JB = n\left(\frac{S^2}{6} + \frac{(K-3)^2}{24}\right). $$
If the null hypothesis $H_0$ is true, then, as $n \to \infty$, the distribution of the random variable $JB$ converges to $\chi^2(2)$. Therefore, for a sufficiently large sample size, the following testing rule can be applied: given a significance level $\alpha$, if $JB < \chi^2_{1-\alpha}(2)$ (where $\chi^2_{1-\alpha}(2)$ is the $1-\alpha$ quantile of the $\chi^2(2)$ distribution), then the null hypothesis $H_0$ is accepted; otherwise, it is rejected.
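For concreteness, this rule can be written out in a few lines of R. The following is a minimal sketch mirroring the formulas above; the function name jb_test is ours, and ready-made implementations exist (e.g., jarque.bera.test() in the tseries package):

```r
# Minimal sketch of the classical Jarque-Bera test described above.
jb_test <- function(x, alpha = 0.05) {
  n <- length(x)
  sigma_hat <- sqrt(mean((x - mean(x))^2))   # divisor n, as in the text
  S <- mean((x - mean(x))^3) / sigma_hat^3   # empirical skewness
  K <- mean((x - mean(x))^4) / sigma_hat^4   # empirical kurtosis
  JB <- n * (S^2 / 6 + (K - 3)^2 / 24)
  crit <- qchisq(1 - alpha, df = 2)          # (1 - alpha) quantile of chi^2(2)
  list(statistic = JB, critical = crit, reject_H0 = JB >= crit)
}

set.seed(1)
jb_test(rnorm(200))   # under H0, rejection should be rare
```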
Note that the Pearson family of distributions is quite rich, including the exponential, gamma, beta, Student's t, and normal distributions. Suppose it is known that a random variable has a distribution from the Pearson family and has finite first four moments. In that case, its specific form is uniquely determined by the skewness, S, and kurtosis, K; see [4]. For illustration, we present this classification in Figure 1. Due to this property, the Jarque–Bera test is a goodness-of-fit test, i.e., if the alternative hypothesis $H_1$ holds for the sample elements, the statistic $JB$ converges in probability to ∞ as $n \to \infty$.
This article emerged as a result of addressing the following question: How does the J B statistic change when the researcher knows the following?
  • The mean of the population distribution (the variance is unknown);
  • The variance of the population distribution (the mean is unknown);
  • The mean and variance of the population distribution.
In the last case, the known mean and variance lead us to the known coefficient of variation. For a discussion on inference when the coefficient of variation is known, we refer the reader to [5,6], and the references therein.
In this paper, we adapt the J B statistic for cases where one or both normal distribution parameters are known. As simulations show, the proposed tests also demonstrate good power for many samples not belonging to the Pearson family of distributions.
To conclude this section, we note the following: In practical research, knowing the parameters of the normal distribution is crucial, as it allows us to estimate the probabilities of desired events. Testing the hypothesis of normality with specific parameters is significant because any deviation—whether in the form of outliers (deviations from normality) or change points in stochastic processes (a sudden change in the parameter)—can indicate the presence of unusual or catastrophic events. For example, small but significant parameter changes can signal a disturbance in the production process. Strong and rare deviations are of particular interest when studying stochastic processes with catastrophes. We believe a deeper connection exists between these seemingly distinct fields, which still awaits thorough investigation.
The rest of this paper is organized as follows. The following section, Section 2, presents the main results (the limit theorem and criteria for testing the corresponding statistical hypotheses). In Section 3, we present a Monte Carlo simulation to compare the suggested tests with some existing procedures. We prove Theorem 1 in Section 4. Finally, the last section contains tables of test power resulting from the Monte Carlo numerical simulations.

2. Definitions and Results

Let $X_1, X_2, \ldots, X_n$, $n \in \mathbb{N}$, be i.i.d. random variables on the same probability space $(\Omega, \mathcal{F}, \mathbf{P})$. We use $\mathbf{E}$ and $\mathbf{D}$ to denote expectation and variance with respect to the probability measure $\mathbf{P}$. Convergence in distribution is denoted by $\xrightarrow[n\to\infty]{d}$.
Recall the definition of empirical skewness, S, and kurtosis, K:
$$ S = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^3}{\hat\sigma^3}, \qquad K = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^4}{\hat\sigma^4}, $$
where, as usual, we have the following:
$$ \hat\sigma = \left(\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^2\right)^{1/2}, \qquad \bar X = \frac{1}{n}\sum_{i=1}^{n} X_i. $$
The main result is as follows:
Theorem 1.
Let $X, X_1, \ldots, X_n$, $n \in \mathbb{N}$, be i.i.d. random variables. Then, we have the following:
(1) If X has a non-degenerate normal distribution and $\mathbf{E}X = a$, then
$$ JB_a = n\left(\frac{S_a^2}{15} + \frac{(K_a-3)^2}{24}\right) \xrightarrow[n\to\infty]{d} Y \sim \chi^2(2), $$
where
$$ S_a = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-a)^3}{\hat\sigma_a^3}, \quad K_a = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-a)^4}{\hat\sigma_a^4}, \quad \hat\sigma_a = \left(\frac{1}{n}\sum_{i=1}^{n}(X_i-a)^2\right)^{1/2}; $$
(2) If X has a normal distribution and $\mathbf{D}X = \sigma^2 > 0$, then
$$ JB_{\sigma^2} = n\left(\frac{S_{\sigma^2}^2}{6} + \frac{(K_{\sigma^2}-3)^2}{96}\right) \xrightarrow[n\to\infty]{d} Y \sim \chi^2(2), $$
where
$$ S_{\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^3}{\sigma^3}, \quad K_{\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^4}{\sigma^4}; $$
(3) If X has a normal distribution with $\mathbf{E}X = a$ and $\mathbf{D}X = \sigma^2 > 0$, then
$$ JB_{a,\sigma^2} = n\left(\frac{S_{a,\sigma^2}^2}{15} + \frac{(K_{a,\sigma^2}-3)^2}{96}\right) \xrightarrow[n\to\infty]{d} Y \sim \chi^2(2), $$
where
$$ S_{a,\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-a)^3}{\sigma^3}, \quad K_{a,\sigma^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-a)^4}{\sigma^4}. $$
Theorem 1 yields the following asymptotic tests. Let $\alpha$ be the significance level. Recall that $\chi^2_{1-\alpha}(2)$ denotes the $1-\alpha$ quantile of the $\chi^2(2)$ distribution.
  • Suppose we test the null hypothesis $H_0$: $X_i \sim N(a, \sigma^2)$, $1 \le i \le n$ (where $a$ is known and $\sigma^2$ is unknown), against the alternative hypothesis $H_1$: $X_i$ follows a distribution from the Pearson family that is not normal but has a mean equal to $a$. Then, by statement (1) of Theorem 1, for a sufficiently large sample size the following rule can be used: if $JB_a < \chi^2_{1-\alpha}(2)$, then the null hypothesis $H_0$ is accepted; otherwise, it is rejected.
  • Suppose we test the null hypothesis $H_0$: $X_i \sim N(a, \sigma^2)$, $1 \le i \le n$ (where $a$ is unknown and $\sigma^2$ is known), against the alternative $H_1$: $X_i$ follows a distribution from the Pearson family that is not normal but has a variance equal to $\sigma^2$. Then, by statement (2) of Theorem 1, for a sufficiently large sample size the following rule can be used: if $JB_{\sigma^2} < \chi^2_{1-\alpha}(2)$, then the null hypothesis $H_0$ is accepted; otherwise, it is rejected.
  • Suppose the null hypothesis $H_0$: $X_i \sim N(a, \sigma^2)$, $1 \le i \le n$ (where both $a$ and $\sigma^2$ are known), is tested against the alternative $H_1$: $X_i$ follows a distribution from the Pearson family that is not $N(a, \sigma^2)$. Then, by statement (3) of Theorem 1, for a sufficiently large sample size the following testing rule can be applied: if $JB_{a,\sigma^2} < \chi^2_{1-\alpha}(2)$, then the null hypothesis $H_0$ is accepted; otherwise, it is rejected. (An R sketch of the three statistics is given after this list.)
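The following sketch implements the three modified statistics of Theorem 1; the function names are illustrative and may differ from the code in our repository:

```r
# Sketch of the three modified Jarque-Bera statistics of Theorem 1.
jb_known_mean <- function(x, a) {            # JB_a: mean a known
  n <- length(x)
  s_a <- sqrt(mean((x - a)^2))
  S_a <- mean((x - a)^3) / s_a^3
  K_a <- mean((x - a)^4) / s_a^4
  n * (S_a^2 / 15 + (K_a - 3)^2 / 24)
}

jb_known_var <- function(x, sigma2) {        # JB_{sigma^2}: variance known
  n <- length(x)
  S_s <- mean((x - mean(x))^3) / sigma2^(3/2)
  K_s <- mean((x - mean(x))^4) / sigma2^2
  n * (S_s^2 / 6 + (K_s - 3)^2 / 96)
}

jb_known_both <- function(x, a, sigma2) {    # JB_{a,sigma^2}: both known
  n <- length(x)
  S_as <- mean((x - a)^3) / sigma2^(3/2)
  K_as <- mean((x - a)^4) / sigma2^2
  n * (S_as^2 / 15 + (K_as - 3)^2 / 96)
}
```

In each case, $H_0$ is rejected at level α exactly when the statistic is at least `qchisq(1 - alpha, df = 2)`.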
It should also be noted that the above tests are goodness-of-fit tests, i.e., if the alternative hypothesis $H_1$ holds for the sample elements, then the values of the corresponding statistics $JB_a$, $JB_{\sigma^2}$, and $JB_{a,\sigma^2}$ converge in probability to ∞ as $n \to \infty$.

3. Simulation Study

In this section, we compare the power of various tests for normality using Monte Carlo simulations of alternative hypotheses. The simulations were performed in R software, version 4.2.3. We used the following sample sizes (small, moderate, and large): n = 25, 50, 75, 100, 150, 200, 250, 500, and 1000. The null hypothesis is N(0, 1) in almost all cases; we specify separately where this is not the case. As alternative hypotheses, we considered normal, log-normal, mixed normal, Student's t, gamma, and uniform distributions. Note that the log-normal and mixed normal distributions do not belong to the Pearson family of distributions, and the uniform distribution is a limit of Pearson type I distributions. All code is written in R and is available at https://github.com/KhrushchevSergey/Modified-Jarque-Bera-test, accessed on 1 June 2024.
Here, we consider the following tests for normality (an R sketch showing how each test can be invoked follows this list):
  • Kolmogorov–Smirnov (KS) test. The test statistic measures the maximum deviation between the theoretical cumulative distribution function and the empirical cumulative distribution function. When the parameters of the normal distribution are unknown, they are estimated from the sample and used in the test.
  • Anderson–Darling (AD) test. The Anderson–Darling test assesses whether a sample comes from a specific distribution, often the normal distribution. It gives more weight to the tails of the distribution compared to other tests, making it sensitive to deviations from normality in those areas.
  • Cramér–von Mises (CVM) test. The Cramér–von Mises test, like the KS test, is based on the distance between the empirical and specified theoretical distributions. It measures the cumulative squared differences between the empirical and theoretical cumulative distribution functions, providing a robust assessment of overall fit.
  • Lilliefors (LF) Kolmogorov–Smirnov test. The Lilliefors test is based on the Kolmogorov–Smirnov test. It tests the null hypothesis that data come from a normally distributed population without specifying the parameters.
  • Shapiro–Wilk (SW) test. The Shapiro–Wilk test is one of the most popular tests with good power. It is based on a correlation between given observations and associated normal scores.
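One possible R setup for these competitor tests is sketched below; the package choices (nortest for the composite versions, goftest for the fully specified null) are one reasonable configuration, not the only one:

```r
library(nortest)   # lillie.test, and ad.test / cvm.test with estimated parameters
library(goftest)   # ad.test / cvm.test against a fully specified null

x <- rnorm(100)

ks.test(x, "pnorm", 0, 1)      # KS against the fully specified N(0, 1)
goftest::ad.test(x, "pnorm")   # AD, simple hypothesis
goftest::cvm.test(x, "pnorm")  # CVM, simple hypothesis
nortest::lillie.test(x)        # LF: Lilliefors (parameters estimated)
nortest::ad.test(x)            # AD_c: composite hypothesis
nortest::cvm.test(x)           # CVM_c: composite hypothesis
shapiro.test(x)                # SW: Shapiro-Wilk
```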
We estimate the power in the following way. For a given n, we generate 1000 samples of size n according to the alternative hypothesis. The empirical power is the ratio of the number of rejections of the null hypothesis to 1000; a sketch of this loop is given below. We categorize our findings based on the following cases of the alternative hypothesis distribution.
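A minimal sketch of the power-estimation loop, reusing the illustrative jb_known_both() function from Section 2:

```r
# Empirical power = rejection rate over 1000 Monte Carlo replicates.
estimate_power <- function(rgen, stat_fun, n, alpha = 0.05, reps = 1000) {
  crit <- qchisq(1 - alpha, df = 2)
  mean(replicate(reps, stat_fun(rgen(n)) >= crit))
}

set.seed(2024)
# H0: N(0, 1) versus a t_5 alternative at n = 25:
estimate_power(function(n) rt(n, df = 5),
               function(x) jb_known_both(x, a = 0, sigma2 = 1),
               n = 25)
# should come out near the 0.56 reported for JB_{a,sigma^2} in Table A4
```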

3.1. Normal versus Normal

  • Different variances and the same means. We start by comparing the test powers when the alternative distribution is a normal distribution with a zero mean and a variance different from one, specifically N(0, 2); see Table A1. Since we consider two normal distributions with different variances but the same mean, we added a column with the power of the Fisher test, which checks the hypothesis that the variance is equal to one. As expected, in this situation the modified Jarque–Bera statistics exhibit the highest power. The KS and CVM statistics demonstrated similarly low power, while the power of the AD statistic falls between that of the KS and CVM statistics and that of the modified Jarque–Bera statistics.
  • Different means and the same variances. Here, we compare the test powers when the alternative distribution is a normal distribution with a mean of one and a variance of one, N(1, 1); see Table A2. Since two normal distributions with different means but the same variance are considered, we added two columns with the test powers of the Student and Welch tests, respectively. All statistics perform similarly well, except for the modified Jarque–Bera test with a known mean, $JB_a$, at small sample sizes.

3.2. Normal versus Student’s t with Degrees of Freedom 1 (Cauchy), 5, and 9

  • Cauchy. Here, we consider the case where the alternative distribution is the Cauchy distribution; see Table A3. Since the alternative distribution is not normal, we did not conduct additional tests such as the Student or Fisher tests. In this case, the $JB_{a,\sigma^2}$ test provided the best power, but all tests performed similarly well, except for the Kolmogorov–Smirnov and Cramér–von Mises tests. Since the Cauchy distribution differs significantly from the normal distribution, almost all tests provided good power.
  • Student’s t-distribution with 5 and 9 degrees of freedom. Here, we compare the test powers when the alternative distribution is the $t_\nu$ distribution with ν = 5. In contrast to the Cauchy distribution, the Student's t distribution is more similar to the normal distribution, so the powers are expected to be smaller than in the Cauchy case. Moreover, the $JB_{a,\sigma^2}$ statistic provided significantly better power; see Table A4. Since the Student's t distribution with ν = 9 is even more similar to the standard normal distribution, the powers are smaller still, with similar relationships between the different tests; we therefore omit the corresponding table. To give an idea of the magnitude of the change in power: for the $JB_{a,\sigma^2}$ statistic with n = 25 and α = 0.05, the power changed from 0.56 to 0.34. Note that the KS test exhibited the worst power. Additionally, observe that the performance of $JB_a$ is worse than those of $JB_{a,\sigma^2}$ and $JB_{\sigma^2}$; this, of course, is expected because the null and alternative distributions have the same zero mean. Note that the $AD_c$, $CVM_c$, and SW statistics exhibited power lower than but comparable to those of the Jarque–Bera statistics. For Student's t with 5 degrees of freedom, we plotted the test power for both values of α to provide a more detailed breakdown of these power comparisons; see Figure A1 and Figure A2 in Appendix A.

3.3. Normal versus Non-Symmetric and Non-Pearson Type Distributions

  • For non-symmetric alternative distributions, we considered (i) the gamma distribution γ(2, 1) and (ii) the log-normal distribution LN(0, 1); for non-Pearson-type alternative distributions, we considered (iii) the uniform distribution on the interval $[-\sqrt{3}, \sqrt{3}]$, $U[-\sqrt{3}, \sqrt{3}]$, and (iv) the mixture of the standard normal N(0, 1) and normal N(0, 9) distributions with equal mixture weights, denoted by Mix. Since all these alternative distributions are "significantly different" from the standard normal distribution, all tables "are similar" to that of the Cauchy distribution; see Table A5, Table A6, Table A7 and Table A8. As expected, the $JB_a$ statistic loses power under symmetric alternative distributions. Moreover, the AD statistic performs comparably well to the Jarque–Bera statistics $JB_{a,\sigma^2}$ and $JB_{\sigma^2}$; see Table A5 and Table A8. In the non-symmetric case (Table A6 and Table A7), the KS, AD, CVM, and SW statistics exhibit high power similar to that of the modified Jarque–Bera statistic $JB_{a,\sigma^2}$.

3.4. Normal versus Gamma Distribution with the Same Mean Value

  • Finally, we compare the test powers when the null hypothesis is not a standard normal distribution. Here, we test the normal distribution N(2, 2) against the gamma distribution γ(2, 1), which also has a mean of 2. The results are presented in Table A9. In this case, as before, the modified $JB_{a,\sigma^2}$, $JB_{\sigma^2}$, and AD statistics exhibit the highest power, while the KS, CVM, and SW statistics show a loss in power.

3.5. Robustness in the Presence of Outliers

  • To evaluate the performance of the modified Jarque–Bera tests in the presence of outliers, we generated data from a mixture of a standard normal distribution (weight 0.9) and a sum of two independent random variables, one with a standard normal distribution and the other with a Poisson distribution with a mean of 5 (weight 0.1). This type of mixture is rarely used in simulation studies, but such discrete-valued outliers can occur, for example, due to failures in production machines; see Table A10. In this case, the KS and CVM tests showed the lowest power. The modified Jarque–Bera tests showed the best power, while the other tests (JB, LF, AD, $AD_c$, $CVM_c$, and SW) had lower but similar power. We also refer the reader to [7] for robust modifications of the Jarque–Bera statistic. A sketch of this contamination scheme is given below.
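The contaminated sample can be generated as follows (a sketch; the helper name r_contaminated is ours):

```r
# With probability 0.9, a N(0, 1) observation; with probability 0.1,
# a N(0, 1) observation plus an independent Poisson(5) jump.
r_contaminated <- function(n, weight = 0.1, lambda = 5) {
  jump <- rbinom(n, size = 1, prob = weight)   # outlier-component indicator
  rnorm(n) + jump * rpois(n, lambda)
}

set.seed(3)
x <- r_contaminated(250)
```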

3.6. Application to Real Data

We tested the hypothesis that the mass of penguins, by species and sex, follows a normal distribution. We applied the modified Jarque–Bera test $JB_{a,\sigma^2}$, using the sample mean and variance as the known values. The observations were taken from a popular dataset of penguin characteristics from the study [8], in which sexual size dimorphism (SSD), i.e., ecological sexual dimorphism, was studied in penguin populations. The normal variability of penguin mass is well accepted; thus, acceptance of the null hypothesis is anticipated for this dataset. The corresponding p-values are provided in the table below.
| Species | Male | Female |
|---|---|---|
| Adelie | 0.7787 | 0.7361 |
| Chinstrap | 0.9194 | 0.5876 |
| Gentoo | 0.9598 | 0.7687 |
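The computation behind this table can be sketched in R. The penguin measurements of [8] are distributed in the palmerpenguins package; the group sample mean and variance are plugged in as the "known" parameters, and the p-value comes from the χ²(2) limit. Here, jb_known_both() is the illustrative function from Section 2, and exact values may differ slightly from the table depending on preprocessing:

```r
library(palmerpenguins)

mass <- subset(penguins, species == "Adelie" & sex == "male")$body_mass_g
mass <- mass[!is.na(mass)]
stat <- jb_known_both(mass,
                      a      = mean(mass),
                      sigma2 = mean((mass - mean(mass))^2))
pchisq(stat, df = 2, lower.tail = FALSE)   # compare with 0.7787 above
```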
In the next section, we provide the detailed proof of our main results.

4. Proof of Theorem 1

Proof. Let us prove statement (1) of the theorem. Consider the following sequence of random vectors:
$$ Z_n := (Z_{1,n}, Z_{2,n}) := \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-a)^3}{\hat\sigma_a^3},\; \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-a)^4 - 3\hat\sigma_a^4}{\hat\sigma_a^4}\right). $$
From the convergence
$$ \lim_{n\to\infty}\hat\sigma_a = \sigma > 0 \quad \text{a.s.} $$
and Slutsky's theorem [9], it follows that the limiting distribution of the sequence $Z_n$ coincides with the limiting distribution of the sequence
$$ Z'_n := (Z'_{1,n}, Z'_{2,n}) := \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-a)^3}{\sigma^3},\; \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-a)^4 - 3\hat\sigma_a^4}{\sigma^4}\right). $$
It is easy to see the following:
$$ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a)^4 - 3\hat\sigma_a^4\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^4 - 3\sqrt{n}\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)^2\right)^{2} \pm 3\sqrt{n}\,\sigma^4 $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a)^4 - 3\sigma^4\right) + 3\sqrt{n}\left(\sigma^2 - \frac{1}{n}\sum_{j=1}^{n}(X_j-a)^2\right)\left(\sigma^2 + \frac{1}{n}\sum_{j=1}^{n}(X_j-a)^2\right) $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a)^4 - 3\sigma^4\right) + \frac{6\sigma^2}{\sqrt{n}}\sum_{j=1}^{n}\left(\sigma^2 - (X_j-a)^2\right) - 3\sqrt{n}\left(\sigma^2 - \frac{1}{n}\sum_{j=1}^{n}(X_j-a)^2\right)^{2} $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a)^4 - 6\sigma^2(X_i-a)^2 + 3\sigma^4\right) - 3\left(\frac{1}{n^{3/4}}\sum_{j=1}^{n}\left(\sigma^2 - (X_j-a)^2\right)\right)^{2}. $$
The law of the iterated logarithm yields the following:
$$ \lim_{n\to\infty}\frac{1}{n^{3/4}}\sum_{j=1}^{n}\left(\sigma^2 - (X_j-a)^2\right) = 0 \quad \text{a.s.} $$
Therefore, applying Slutsky's theorem, we can conclude that the limiting distribution of the sequence $Z'_n$ coincides with the limiting distribution of the sequence
$$ Z''_n := (Z''_{1,n}, Z''_{2,n}) := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\frac{(X_i-a)^3}{\sigma^3},\; \frac{(X_i-a)^4 - 6\sigma^2(X_i-a)^2 + 3\sigma^4}{\sigma^4}\right). $$
By the central limit theorem, the sequence $Z''_n$ converges to a random vector $(W_1, W_2)$ whose coordinates have a joint normal distribution. Therefore, it suffices to show that these coordinates are uncorrelated and have the required variances.
It is easy to see that $\mathbf{D}(X_i-a)^3 = 15\sigma^6$ (thus, $\mathbf{D}W_1 = 15$) and $\mathbf{D}\left((X_i-a)^4 - 6\sigma^2(X_i-a)^2 + 3\sigma^4\right) = 24\sigma^8$ (thus, $\mathbf{D}W_2 = 24$). From the fact that the odd central moments of a normally distributed random variable are equal to zero, it follows that
$$ \mathbf{E}\left(\frac{(X_i-a)^3}{\sigma^3}\cdot\frac{(X_i-a)^4 - 6\sigma^2(X_i-a)^2 + 3\sigma^4}{\sigma^4}\right) = 0 $$
(therefore, $\mathbf{E}W_1W_2 = 0$).
Therefore, we have shown that $(W_1, W_2)$ has a joint normal distribution with the covariance matrix
$$ \Sigma = \begin{pmatrix} 15 & 0 \\ 0 & 24 \end{pmatrix}. $$
Thus,
$$ \frac{W_1^2}{15} + \frac{W_2^2}{24} \sim \chi^2(2). $$
Let us prove statement (2). Consider the following sequence of random vectors:
$$ Z_n := (Z_{1,n}, Z_{2,n}) := \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-\bar X)^3}{\sigma^3},\; \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{(X_i-\bar X)^4 - 3\sigma^4}{\sigma^4}\right). $$
It is easy to see the following:
$$ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\bar X)^3 = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a) - (\bar X-a)\right)^3 = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a) - \frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{3} $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^3 - \frac{3}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^2\cdot\frac{1}{n}\sum_{j=1}^{n}(X_j-a) + \frac{3}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)\cdot\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{2} - \sqrt{n}\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{3} $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^3 - \frac{1}{n}\sum_{i=1}^{n}(X_i-a)^2\cdot\frac{3}{\sqrt{n}}\sum_{j=1}^{n}(X_j-a) + \frac{3}{n}\sum_{i=1}^{n}(X_i-a)\cdot\left(\frac{1}{n^{3/4}}\sum_{j=1}^{n}(X_j-a)\right)^{2} - \left(\frac{1}{n^{5/6}}\sum_{j=1}^{n}(X_j-a)\right)^{3}. $$
Finally, introducing $g_{1,n}$, $g_{2,n}$, and $g_{3,n}$, we have the following:
$$ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\bar X)^3 =: \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^3 - g_{1,n}\cdot\frac{3}{\sqrt{n}}\sum_{j=1}^{n}(X_j-a) + g_{2,n} - g_{3,n}. $$
The law of large numbers and the law of the iterated logarithm yield the following:
$$ \lim_{n\to\infty} g_{1,n} = \sigma^2 \quad \text{a.s.} \qquad \text{and} \qquad \lim_{n\to\infty} g_{3,n} = 0 \quad \text{a.s.} $$
The sequences $\frac{3}{n}\sum_{i=1}^{n}(X_i-a)$ and $\frac{1}{n^{3/4}}\sum_{j=1}^{n}(X_j-a)$ converge almost surely to zero as $n\to\infty$, due to the law of large numbers and the law of the iterated logarithm, respectively; therefore, we have the following:
$$ \lim_{n\to\infty} g_{2,n} = 0 \quad \text{a.s.} $$
Let us now consider the numerator of the second coordinate of the random vector:
$$ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\bar X)^4 = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left((X_i-a) - \frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{4} $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^4 - \frac{4}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^3\cdot\frac{1}{n}\sum_{j=1}^{n}(X_j-a) + \frac{6}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^2\cdot\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{2} - \frac{4}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)\cdot\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{3} + \sqrt{n}\left(\frac{1}{n}\sum_{j=1}^{n}(X_j-a)\right)^{4} $$
$$ = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^4 - \frac{4}{n^{3/4}}\sum_{i=1}^{n}(X_i-a)^3\cdot\frac{1}{n^{3/4}}\sum_{j=1}^{n}(X_j-a) + \frac{6}{n}\sum_{i=1}^{n}(X_i-a)^2\cdot\left(\frac{1}{n^{3/4}}\sum_{j=1}^{n}(X_j-a)\right)^{2} - \frac{4}{n}\sum_{i=1}^{n}(X_i-a)\cdot\left(\frac{1}{n^{5/6}}\sum_{j=1}^{n}(X_j-a)\right)^{3} + \left(\frac{1}{n^{7/8}}\sum_{j=1}^{n}(X_j-a)\right)^{4}. $$
Denoting by $g_{4,n}$ the last four terms, we have the following:
$$ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-\bar X)^4 =: \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i-a)^4 + g_{4,n}. $$
From the law of large numbers and the law of the iterated logarithm, we have the following:
$$ \lim_{n\to\infty} g_{4,n} = 0 \quad \text{a.s.} $$
From these convergences and Slutsky's theorem, it follows that the limiting distribution of the sequence $Z_n$ coincides with the limiting distribution of the sequence
$$ Z''_n := (Z''_{1,n}, Z''_{2,n}) := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\frac{(X_i-a)^3 - 3\sigma^2(X_i-a)}{\sigma^3},\; \frac{(X_i-a)^4 - 3\sigma^4}{\sigma^4}\right). $$
By the central limit theorem, the sequence $Z''_n$ converges to a random vector $(W_1, W_2)$ whose coordinates have a joint normal distribution with the covariance matrix
$$ \Sigma = \begin{pmatrix} \mathbf{D}W_1 & \mathbf{E}W_1W_2 \\ \mathbf{E}W_1W_2 & \mathbf{D}W_2 \end{pmatrix}. $$
It is easy to see that $\mathbf{D}\left((X_i-a)^3 - 3\sigma^2(X_i-a)\right) = 6\sigma^6$ (thus, $\mathbf{D}W_1 = 6$) and $\mathbf{D}\left((X_i-a)^4 - 3\sigma^4\right) = 96\sigma^8$ (thus, $\mathbf{D}W_2 = 96$). From the fact that the odd central moments of a normally distributed random variable are equal to zero, we derive the following:
$$ \mathbf{E}\left(\frac{(X_i-a)^3 - 3\sigma^2(X_i-a)}{\sigma^3}\cdot\frac{(X_i-a)^4 - 3\sigma^4}{\sigma^4}\right) = 0 $$
(and, thus, $\mathbf{E}W_1W_2 = 0$). Therefore,
$$ \frac{W_1^2}{6} + \frac{W_2^2}{96} \sim \chi^2(2). $$
The proof of statement (3) is similar (in fact, simpler, since both $a$ and $\sigma^2$ are known), so we omit it. □
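Although it is not part of the proof, Theorem 1 is easy to check numerically. The following R sketch (reusing the illustrative jb_known_both() function from Section 2) compares the simulated distribution of $JB_{a,\sigma^2}$ under $H_0$ with $\chi^2(2)$:

```r
set.seed(4)
stats <- replicate(2000, jb_known_both(rnorm(1000), a = 0, sigma2 = 1))
ks.test(stats, "pchisq", df = 2)   # a large p-value is expected
```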

5. Conclusions

In this paper, new modifications to the Jarque–Bera statistics are proposed. Detailed proofs are provided, which are simple and accessible even to undergraduate students in probability and statistics.
A Monte Carlo study showed that the Jarque–Bera statistic and its new modifications perform well on the class of Pearson distributions. When the alternative distribution does not belong to the Pearson family, the Jarque–Bera statistic and its modifications perform well alongside other statistics, such as the Anderson–Darling, Cramér–von Mises, and Shapiro–Wilk statistics. Like any specific test, the Jarque–Bera test and its modifications have natural limitations in their application. Although the test performs well on classical distributions, a significant drawback is that it cannot distinguish the normal distribution from other symmetric distributions whose kurtosis equals 3.
Our goal was not to explore and compare all existing tests; therefore, we limited our comparison to the most widely used tests for normality. Comparative studies on a broader class of normality tests can be found in [10,11]. Note that the findings of [10,11] are aligned with our simulation results. For more comparative studies, we also refer to [12].
In this paper, we limited ourselves to the univariate case. Multivariate normality tests represent a curious and interesting area of research. For discussions of existing tests and the possibility of multivariate extensions of some known statistics, including the Jarque–Bera statistic, we refer the readers to [12,13] and the references therein.

Author Contributions

Conceptualization, V.G. and A.L.; methodology, A.L. and S.K.; software, data curation and validation, S.K., Y.I., O.L., L.S. and K.Z.; writing—original draft preparation, A.L. and A.Y.; writing—review and editing, Y.I., O.L., L.S. and K.Z.; visualization, S.K. and A.Y.; project administration, V.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by RSCF grant number 24-28-01047 and FAPESP 2023/13453-5.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

V. Glinskiy, Y. Ismayilova, A. Logachov, and K. Zaykov thank the RSCF for financial support; A. Yambartsev thanks FAPESP for financial support.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Tables and Figures

In this section, we report the power of various tests for normality using Monte Carlo simulations under alternative hypotheses. The simulations were performed in R. The sample sizes were small, moderate, and large, with n = 25, 50, 75, 100, 150, 200, 250, 500, and 1000. Although simulations were conducted for all stated sample sizes, each table includes rows only up to the first row in which all tests attain a power of 1, in order to keep the tables short.
The null hypothesis is N ( 0 , 1 ) in almost all cases; exceptions are specified separately. As alternative hypotheses, we considered normal, log-normal, mixed normal, Student’s t, gamma, and uniform distributions.
Recall that the following procedure to estimate the power was used: 1000 samples with a given sample size were generated from the alternative hypothesis with specific parameters, and the ratio of the number of rejections of the null hypothesis to 1000 was calculated.
We used the notations $AD_c$ and $CVM_c$ for the Anderson–Darling and Cramér–von Mises tests, respectively, in which the parameters were replaced with their estimates (i.e., modifications for testing composite hypotheses).
Table A1. The estimated power is reported when the null hypothesis H0: X ∼ N(0, 1) is tested against samples simulated from the normal distribution N(0, 2). The last column contains the test power of the Fisher test for the null hypothesis that the variance is equal to 1.

| α | n | JB_{a,σ²} | JB_{σ²} | KS | AD | CVM | Fisher |
|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.739 | 0.673 | 0.175 | 0.418 | 0.173 | 0.392 |
| | 50 | 0.907 | 0.882 | 0.265 | 0.655 | 0.276 | 0.680 |
| | 75 | 0.975 | 0.970 | 0.391 | 0.832 | 0.433 | 0.835 |
| | 100 | 0.991 | 0.991 | 0.510 | 0.924 | 0.607 | 0.929 |
| | 150 | 1 | 0.999 | 0.732 | 0.993 | 0.822 | 0.994 |
| | 200 | 1 | 1 | 0.857 | 1 | 0.933 | 1 |
| | 250 | 1 | 1 | 0.951 | 1 | 0.978 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.655 | 0.576 | 0.042 | 0.179 | 0.039 | 0.164 |
| | 50 | 0.852 | 0.831 | 0.077 | 0.348 | 0.074 | 0.401 |
| | 75 | 0.953 | 0.940 | 0.137 | 0.529 | 0.124 | 0.631 |
| | 100 | 0.988 | 0.986 | 0.191 | 0.752 | 0.204 | 0.801 |
| | 150 | 1 | 1 | 0.385 | 0.940 | 0.454 | 0.943 |
| | 200 | 1 | 1 | 0.539 | 0.982 | 0.669 | 0.991 |
| | 250 | 1 | 1 | 0.730 | 0.997 | 0.841 | 0.998 |
| | 500 | 1 | 1 | 0.996 | 1 | 1 | 1 |
| | 1000 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A2. The estimated power is reported when the null hypothesis H0: X ∼ N(0, 1) is tested against samples simulated from the normal distribution N(1, 1). The two last columns contain the test powers of the Student and Welch statistics used to test whether the difference between the two means is statistically significant.

| α | n | JB_{a,σ²} | JB_a | KS | AD | CVM | Student | Welch |
|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.946 | 0.015 | 0.993 | 0.997 | 0.997 | 0.929 | 0.929 |
| | 50 | 1 | 0.961 | 1 | 1 | 1 | 0.999 | 0.999 |
| | 75 | 1 | 0.999 | 1 | 1 | 1 | 1 | 1 |
| | 100 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.889 | 0.002 | 0.949 | 0.987 | 0.978 | 0.803 | 0.803 |
| | 50 | 0.997 | 0.030 | 0.999 | 1 | 1 | 0.988 | 0.988 |
| | 75 | 1 | 0.967 | 1 | 1 | 1 | 0.999 | 0.999 |
| | 100 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A3. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from the Cauchy distribution.

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.999 | 0.900 | 0.997 | 0.896 | 0.262 | 0.909 | 0.971 | 0.927 | 0.248 | 0.927 | 0.939 |
| | 50 | 1 | 0.994 | 1 | 0.993 | 0.478 | 0.992 | 0.999 | 0.995 | 0.478 | 0.997 | 0.997 |
| | 75 | 1 | 1 | 1 | 1 | 0.720 | 1 | 1 | 1 | 0.676 | 1 | 1 |
| | 100 | 1 | 1 | 1 | 1 | 0.864 | 1 | 1 | 1 | 0.853 | 1 | 1 |
| | 150 | 1 | 1 | 1 | 1 | 0.984 | 1 | 1 | 1 | 0.971 | 1 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 0.999 | 1 | 1 | 1 | 1 | 1 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.995 | 0.854 | 0.994 | 0.854 | 0.072 | 0.816 | 0.945 | 0.882 | 0.071 | 0.880 | 0.894 |
| | 50 | 1 | 0.988 | 1 | 0.986 | 0.186 | 0.986 | 0.996 | 0.995 | 0.161 | 0.996 | 0.994 |
| | 75 | 1 | 1 | 1 | 1 | 0.329 | 1 | 1 | 1 | 0.283 | 1 | 1 |
| | 100 | 1 | 1 | 1 | 1 | 0.512 | 1 | 1 | 1 | 0.463 | 1 | 1 |
| | 150 | 1 | 1 | 1 | 1 | 0.867 | 1 | 1 | 1 | 0.806 | 1 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 0.971 | 1 | 1 | 1 | 0.935 | 1 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 0.997 | 1 | 1 | 1 | 0.994 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A4. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from the Student's t distribution with 5 degrees of freedom.

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.560 | 0.215 | 0.531 | 0.216 | 0.069 | 0.135 | 0.168 | 0.200 | 0.078 | 0.178 | 0.265 |
| | 50 | 0.742 | 0.371 | 0.715 | 0.385 | 0.069 | 0.173 | 0.186 | 0.269 | 0.065 | 0.230 | 0.411 |
| | 75 | 0.857 | 0.523 | 0.828 | 0.527 | 0.065 | 0.27 | 0.254 | 0.387 | 0.078 | 0.341 | 0.531 |
| | 100 | 0.910 | 0.628 | 0.911 | 0.624 | 0.06 | 0.326 | 0.297 | 0.473 | 0.070 | 0.422 | 0.624 |
| | 150 | 0.976 | 0.764 | 0.974 | 0.758 | 0.073 | 0.443 | 0.420 | 0.605 | 0.086 | 0.555 | 0.750 |
| | 200 | 0.992 | 0.861 | 0.991 | 0.856 | 0.082 | 0.540 | 0.510 | 0.736 | 0.112 | 0.685 | 0.861 |
| | 250 | 0.996 | 0.904 | 0.997 | 0.914 | 0.102 | 0.634 | 0.625 | 0.829 | 0.124 | 0.771 | 0.907 |
| | 500 | 1 | 0.990 | 1 | 0.991 | 0.174 | 0.905 | 0.936 | 0.976 | 0.205 | 0.965 | 0.987 |
| | 1000 | 1 | 1 | 1 | 1 | 0.480 | 0.998 | 1 | 1 | 0.497 | 1 | 1 |
| 0.01 | 25 | 0.506 | 0.138 | 0.459 | 0.148 | 0.012 | 0.057 | 0.053 | 0.097 | 0.016 | 0.081 | 0.136 |
| | 50 | 0.66 | 0.285 | 0.647 | 0.288 | 0.011 | 0.078 | 0.066 | 0.153 | 0.013 | 0.129 | 0.235 |
| | 75 | 0.803 | 0.421 | 0.803 | 0.442 | 0.007 | 0.124 | 0.077 | 0.247 | 0.011 | 0.191 | 0.385 |
| | 100 | 0.882 | 0.504 | 0.875 | 0.517 | 0.008 | 0.158 | 0.094 | 0.287 | 0.013 | 0.248 | 0.450 |
| | 150 | 0.96 | 0.702 | 0.95 | 0.698 | 0.013 | 0.264 | 0.163 | 0.462 | 0.024 | 0.407 | 0.644 |
| | 200 | 0.986 | 0.789 | 0.983 | 0.793 | 0.020 | 0.346 | 0.240 | 0.575 | 0.023 | 0.512 | 0.744 |
| | 250 | 0.991 | 0.865 | 0.991 | 0.873 | 0.022 | 0.388 | 0.313 | 0.672 | 0.026 | 0.596 | 0.840 |
| | 500 | 1 | 0.990 | 1 | 0.992 | 0.035 | 0.750 | 0.716 | 0.940 | 0.040 | 0.902 | 0.986 |
| | 1000 | 1 | 1 | 1 | 1 | 0.128 | 0.970 | 0.990 | 1 | 0.122 | 0.997 | 1 |
Table A5. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from $U[-\sqrt{3}, \sqrt{3}]$.

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.989 | 0 | 0.976 | 0.001 | 0.677 | 0.125 | 0.972 | 0.230 | 0.708 | 0.178 | 0.128 |
| | 50 | 1 | 0 | 0.999 | 0.001 | 0.953 | 0.252 | 1 | 0.567 | 0.976 | 0.414 | 0.446 |
| | 75 | 1 | 0.112 | 1 | 0.084 | 0.995 | 0.435 | 1 | 0.844 | 1 | 0.681 | 0.837 |
| | 100 | 1 | 0.602 | 1 | 0.543 | 1 | 0.574 | 1 | 0.941 | 1 | 0.832 | 0.964 |
| | 150 | 1 | 0.987 | 1 | 0.985 | 1 | 0.839 | 1 | 0.997 | 1 | 0.974 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 1 | 0.947 | 1 | 1 | 1 | 0.996 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 1 | 0.989 | 1 | 1 | 1 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.977 | 0 | 0.957 | 0 | 0.339 | 0.024 | 0.874 | 0.058 | 0.323 | 0.043 | 0.014 |
| | 50 | 1 | 0 | 0.999 | 0 | 0.782 | 0.074 | 0.998 | 0.271 | 0.823 | 0.179 | 0.126 |
| | 75 | 1 | 0 | 1 | 0 | 0.959 | 0.147 | 1 | 0.553 | 0.982 | 0.373 | 0.444 |
| | 100 | 1 | 0.005 | 1 | 0.002 | 0.999 | 0.276 | 1 | 0.799 | 1 | 0.576 | 0.777 |
| | 150 | 1 | 0.531 | 1 | 0.483 | 1 | 0.524 | 1 | 0.976 | 1 | 0.887 | 0.990 |
| | 200 | 1 | 0.975 | 1 | 0.970 | 1 | 0.725 | 1 | 0.998 | 1 | 0.972 | 0.999 |
| | 250 | 1 | 0.999 | 1 | 0.999 | 1 | 0.891 | 1 | 1 | 1 | 0.996 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A6. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from γ(2, 1).

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 1 | 0.128 | 0.760 | 0.380 | 1 | 0.409 | 1 | 0.576 | 1 | 0.512 | 0.623 |
| | 50 | 1 | 1 | 0.950 | 0.772 | 1 | 0.730 | 1 | 0.896 | 1 | 0.843 | 0.934 |
| | 75 | 1 | 1 | 0.993 | 0.944 | 1 | 0.880 | 1 | 0.975 | 1 | 0.954 | 0.984 |
| | 100 | 1 | 1 | 0.999 | 0.992 | 1 | 0.948 | 1 | 0.999 | 1 | 0.994 | 0.999 |
| | 150 | 1 | 1 | 1 | 1 | 1 | 0.994 | 1 | 1 | 1 | 0.998 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 1 | 0.046 | 0.676 | 0.244 | 1 | 0.186 | 1 | 0.332 | 1 | 0.287 | 0.349 |
| | 50 | 1 | 0.307 | 0.930 | 0.640 | 1 | 0.444 | 1 | 0.755 | 1 | 0.676 | 0.805 |
| | 75 | 1 | 1 | 0.983 | 0.847 | 1 | 0.700 | 1 | 0.936 | 1 | 0.879 | 0.957 |
| | 100 | 1 | 1 | 0.994 | 0.935 | 1 | 0.825 | 1 | 0.982 | 1 | 0.955 | 0.994 |
| | 150 | 1 | 1 | 1 | 0.999 | 1 | 0.965 | 1 | 1 | 1 | 0.997 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 1 | 0.995 | 1 | 1 | 1 | 1 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 1 | 0.999 | 1 | 1 | 1 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A7. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from the log-normal distribution LN(0, 1).

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.994 | 0.760 | 0.907 | 0.860 | 1 | 0.889 | 1 | 0.960 | 1 | 0.947 | 0.963 |
| | 50 | 1 | 1 | 0.994 | 0.994 | 1 | 0.998 | 1 | 1 | 1 | 1 | 1 |
| | 75 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.996 | 0.574 | 0.850 | 0.737 | 1 | 0.727 | 1 | 0.881 | 1 | 0.853 | 0.888 |
| | 50 | 1 | 0.966 | 0.980 | 0.978 | 1 | 0.972 | 1 | 0.996 | 1 | 0.991 | 0.998 |
| | 75 | 1 | 1 | 0.998 | 0.999 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | 100 | 1 | 1 | 0.999 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | 150 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A8. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from Mix, a mixture of the standard normal distribution N(0, 1) and the normal distribution N(0, 9) with equal mixture weights.

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.997 | 0.228 | 0.997 | 0.212 | 0.277 | 0.271 | 0.955 | 0.339 | 0.292 | 0.345 | 0.366 |
| | 50 | 1 | 0.430 | 1 | 0.433 | 0.499 | 0.425 | 0.999 | 0.553 | 0.538 | 0.536 | 0.581 |
| | 75 | 1 | 0.599 | 1 | 0.575 | 0.791 | 0.626 | 1 | 0.756 | 0.831 | 0.742 | 0.737 |
| | 100 | 1 | 0.737 | 1 | 0.724 | 0.889 | 0.745 | 1 | 0.878 | 0.937 | 0.87 | 0.863 |
| | 150 | 1 | 0.884 | 1 | 0.889 | 0.991 | 0.893 | 1 | 0.972 | 0.996 | 0.967 | 0.97 |
| | 200 | 1 | 0.957 | 1 | 0.957 | 1 | 0.975 | 1 | 0.996 | 1 | 0.997 | 0.993 |
| | 250 | 1 | 0.988 | 1 | 0.986 | 1 | 0.991 | 1 | 0.999 | 1 | 0.999 | 0.999 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.998 | 0.155 | 0.993 | 0.154 | 0.089 | 0.105 | 0.840 | 0.147 | 0.077 | 0.139 | 0.170 |
| | 50 | 1 | 0.349 | 1 | 0.334 | 0.196 | 0.235 | 0.983 | 0.374 | 0.202 | 0.350 | 0.363 |
| | 75 | 1 | 0.446 | 1 | 0.444 | 0.420 | 0.355 | 0.998 | 0.537 | 0.436 | 0.525 | 0.505 |
| | 100 | 1 | 0.591 | 1 | 0.581 | 0.564 | 0.485 | 1 | 0.721 | 0.613 | 0.699 | 0.662 |
| | 150 | 1 | 0.789 | 1 | 0.791 | 0.896 | 0.738 | 1 | 0.909 | 0.934 | 0.899 | 0.883 |
| | 200 | 1 | 0.893 | 1 | 0.893 | 0.986 | 0.892 | 1 | 0.983 | 0.994 | 0.981 | 0.961 |
| | 250 | 1 | 0.958 | 1 | 0.952 | 1 | 0.961 | 1 | 0.996 | 1 | 0.995 | 0.991 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A9. The null hypothesis H0: X ∼ N(2, 2) is tested against data sampled from the gamma γ(2, 1) distribution.

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.292 | 0.274 | 0.353 | 0.410 | 0.136 | 0.400 | 0.134 | 0.558 | 0.131 | 0.505 | 0.620 |
| | 50 | 0.421 | 0.477 | 0.592 | 0.761 | 0.269 | 0.732 | 0.307 | 0.885 | 0.263 | 0.837 | 0.913 |
| | 75 | 0.523 | 0.652 | 0.765 | 0.942 | 0.370 | 0.856 | 0.482 | 0.975 | 0.382 | 0.949 | 0.991 |
| | 100 | 0.643 | 0.805 | 0.948 | 0.995 | 0.421 | 0.952 | 0.658 | 0.999 | 0.465 | 0.990 | 1 |
| | 150 | 0.815 | 0.935 | 1 | 1 | 0.619 | 0.996 | 0.913 | 1 | 0.679 | 1 | 1 |
| | 200 | 0.923 | 0.971 | 1 | 1 | 0.831 | 1 | 0.992 | 1 | 0.837 | 1 | 1 |
| | 250 | 0.989 | 0.994 | 1 | 1 | 0.979 | 1 | 1 | 1 | 0.928 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.219 | 0.176 | 0.260 | 0.272 | 0.045 | 0.171 | 0.034 | 0.316 | 0.040 | 0.270 | 0.360 |
| | 50 | 0.363 | 0.412 | 0.488 | 0.617 | 0.098 | 0.459 | 0.084 | 0.757 | 0.092 | 0.675 | 0.790 |
| | 75 | 0.451 | 0.581 | 0.665 | 0.832 | 0.173 | 0.671 | 0.179 | 0.928 | 0.169 | 0.872 | 0.962 |
| | 100 | 0.563 | 0.701 | 0.800 | 0.945 | 0.197 | 0.818 | 0.250 | 0.979 | 0.197 | 0.946 | 0.991 |
| | 150 | 0.684 | 0.882 | 0.967 | 0.997 | 0.371 | 0.969 | 0.571 | 1 | 0.407 | 0.997 | 1 |
| | 200 | 0.807 | 0.951 | 0.999 | 1 | 0.507 | 0.996 | 0.797 | 1 | 0.549 | 1 | 1 |
| | 250 | 0.904 | 0.991 | 1 | 1 | 0.667 | 1 | 0.951 | 1 | 0.724 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.992 | 1 | 1 |
| | 1000 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table A10. The null hypothesis H0: X ∼ N(0, 1) is tested against data sampled from a mixture of a standard normal distribution (weight 0.9) and a sum of independent random variables with a standard normal distribution and a Poisson distribution with λ = 5 (weight 0.1).

| α | n | JB_{a,σ²} | JB_a | JB_{σ²} | JB | KS | LF | AD | AD_c | CVM | CVM_c | SW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 25 | 0.994 | 0.923 | 0.985 | 0.884 | 0.111 | 0.808 | 0.878 | 0.890 | 0.160 | 0.878 | 0.920 |
| | 50 | 1 | 0.999 | 1 | 0.990 | 0.134 | 0.952 | 0.966 | 0.990 | 0.194 | 0.981 | 0.995 |
| | 75 | 1 | 1 | 1 | 1 | 0.235 | 0.991 | 0.999 | 0.999 | 0.319 | 0.999 | 1 |
| | 100 | 1 | 1 | 1 | 1 | 0.310 | 0.998 | 1 | 1 | 0.398 | 1 | 1 |
| | 150 | 1 | 1 | 1 | 1 | 0.570 | 1 | 1 | 1 | 0.609 | 1 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 0.823 | 1 | 1 | 1 | 0.770 | 1 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 0.974 | 1 | 1 | 1 | 0.887 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.999 | 1 | 1 |
| | 1000 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0.01 | 25 | 0.996 | 0.861 | 0.992 | 0.796 | 0.033 | 0.647 | 0.671 | 0.791 | 0.049 | 0.754 | 0.842 |
| | 50 | 1 | 0.992 | 1 | 0.985 | 0.043 | 0.884 | 0.854 | 0.966 | 0.065 | 0.951 | 0.984 |
| | 75 | 1 | 1 | 1 | 0.998 | 0.085 | 0.979 | 0.990 | 0.994 | 0.131 | 0.989 | 0.999 |
| | 100 | 1 | 1 | 1 | 1 | 0.108 | 0.994 | 0.998 | 1 | 0.169 | 0.998 | 1 |
| | 150 | 1 | 1 | 1 | 1 | 0.207 | 1 | 1 | 1 | 0.275 | 1 | 1 |
| | 200 | 1 | 1 | 1 | 1 | 0.379 | 1 | 1 | 1 | 0.424 | 1 | 1 |
| | 250 | 1 | 1 | 1 | 1 | 0.622 | 1 | 1 | 1 | 0.617 | 1 | 1 |
| | 500 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.972 | 1 | 1 |
| | 1000 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Figure A1. The power for α = 0.05 as a function of the sample size n ($H_0$: $X \sim N(0, 1)$ is tested against data sampled from the Student's t distribution with 5 degrees of freedom). Figure created by S. Khrushchev and A. Yambartsev.
Figure A2. The power for α = 0.01 as a function of the sample size n ($H_0$: $X \sim N(0, 1)$ is tested against data sampled from the Student's t distribution with 5 degrees of freedom). Figure created by S. Khrushchev and A. Yambartsev.

References

  1. Jarque, C.M.; Bera, A.K. Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Econ. Lett. 1980, 6, 255–259.
  2. Jarque, C.M.; Bera, A.K. Efficient tests for normality, homoscedasticity and serial independence of regression residuals: Monte Carlo evidence. Econ. Lett. 1981, 7, 313–318.
  3. Jarque, C.M.; Bera, A.K. A test for normality of observations and regression residuals. Int. Stat. Rev. 1987, 55, 163–172.
  4. Pearson, K. Mathematical contributions to the theory of evolution, XIX: Second supplement to a memoir on skew variation. Philos. Trans. R. Soc. A 1916, 216, 429–457.
  5. Searls, D.T. The utilization of a known coefficient of variation in the estimation procedure. J. Am. Stat. Assoc. 1964, 59, 1225–1226.
  6. Fu, Y.; Wang, H.; Wong, A. Inference for the normal mean with known coefficient of variation. Open J. Stat. 2013, 3, 41368.
  7. Rana, S.; Eshita, N.N.; Al Mamun, A.S.M. Robust normality test in the presence of outliers. J. Phys. Conf. Ser. 2021, 1863, 012009.
  8. Gorman, K.B.; Williams, T.D.; Fraser, W.R. Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus Pygoscelis). PLoS ONE 2014, 9, e90081.
  9. Slutsky, E. Über stochastische Asymptoten und Grenzwerte. Metron 1925, 5, 3–89.
  10. Yap, B.W.; Sim, C.H. Comparisons of various types of normality tests. J. Stat. Comput. Simul. 2011, 81, 2141–2155.
  11. Razali, N.M.; Wah, Y.B. Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. J. Stat. Model. Anal. 2011, 2, 21–33.
  12. Khatun, N. Applications of normality test in statistical analysis. Open J. Stat. 2021, 11, 113.
  13. Chen, W.; Genton, M.G. Are you all normal? It depends! Int. Stat. Rev. 2023, 91, 114–139.
Figure 1. Some distributions from the Pearson family in the (S, K) plot. Figure created by A. Logachov and A. Yambartsev.