Article

Implementation of a Parallel Algorithm to Simulate the Type I Error Probability

by
Francisco Novoa-Muñoz
Departamento de Enfermería, Facultad de Ciencias de la Salud y de los Alimentos, Universidad del Bío-Bío, Chillán 3800708, Chile
Mathematics 2024, 12(11), 1686; https://doi.org/10.3390/math12111686
Submission received: 10 April 2024 / Revised: 7 May 2024 / Accepted: 15 May 2024 / Published: 29 May 2024
(This article belongs to the Special Issue Mathematical Modeling for Parallel and Distributed Processing)

Abstract
Simulating the probability of type I error is a powerful statistical tool that makes it possible to confirm whether a statistical test attains the established nominal level. However, its computational implementation has the drawback of significantly long execution times. Therefore, this article analyzes the performance of two parallel implementations (parRapply and boot) which significantly reduce the execution time of simulations of the type I error probability for a goodness-of-fit test for the bivariate Poisson distribution. The results obtained demonstrate how the parallelization strategies accelerate the simulations, reducing the time by 50% to 90% when using 2 to 12 processors running in parallel. This reduction is graphically evidenced as the execution time of the analyzed parallel versions fits almost perfectly ($R^2 \approx 0.999$) the power model $y = a p^{b}$, where $p$ is the number of processors used, and $a > 0$ and $b < 0$ are the constants of the model. Furthermore, it is shown that the parallelization strategies used scale as the number of processors increases. All algorithms were implemented in the R programming language, and their code is included at the end of this article.

1. Introduction

In today’s world, it is essential to anticipate events that may occur, and to do so, having models that predict such events with the highest possible accuracy is vital. Statistics provides tools for making predictions or decisions based on the information contained in the available data sample. Typically, these tools consist of models based on probability distributions. However, before using such models, it is crucial to determine whether the sample data behave according to the probability distribution on which the probabilistic model is based. This is achieved by applying a goodness-of-fit test.
Regarding the latter, it is necessary to specify that an essential part of the development of a goodness-of-fit test is the simulation under the null hypothesis, which confirms whether the test reaches the established nominal level. In [1], two goodness-of-fit tests were generated, and the parametric bootstrap method was used to simulate the type I error probability. However, the excessive execution time of each simulation led the authors to consider only a few variations in both the parameter vector θ and the sample sizes n.
In this regard, exploring more and new possible scenarios makes it fundamental and necessary to reduce the computational execution time of the simulation processes involved. Many areas pursue this goal and employ parallel programming for it; see, for instance, [2,3,4,5,6,7].
Therefore, the interest of this research is to use parallelization strategies to accelerate the simulation processes employed in the first of the statistical tests developed in [1]. This will allow examining new parameters θ and testing larger sample sizes n, contributing to reducing computation time and to confirming that, as sample sizes grow, the estimates converge to their nominal value.
This article is organized as follows: Section 2 presents the theoretical aspects and details the first test proposed in [1]. Section 3 reviews the fundamental aspects to take into account when implementing a parallel algorithm and discusses the facilities offered by the R language for implementations of this type. Section 4 presents the experimental results. In Section 5, the obtained results are discussed, and Section 6 exhibits the conclusions.

2. Background

From here on, the following notation is considered:
  • $\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}$.
  • $\Theta = \{\theta = (\theta_1, \theta_2, \theta_3) \in \mathbb{R}^3 : \theta_1 > \theta_3,\ \theta_2 > \theta_3,\ \theta_3 > 0\}$.

2.1. Bivariate Poisson Distribution

There have been several definitions of the bivariate Poisson distribution. This article adopts the definition given in [8], which is the one that has received the most attention in the statistical literature.
Definition 1.
Let
$$X_1 = Y_1 + Y_3 \quad \text{and} \quad X_2 = Y_2 + Y_3,$$
where $Y_1$, $Y_2$ and $Y_3$ are mutually independent Poisson random variables with means given by $\tilde{\theta}_1 = \theta_1 - \theta_3 > 0$, $\tilde{\theta}_2 = \theta_2 - \theta_3 > 0$ and $\theta_3 > 0$, respectively.
The joint distribution of the vector $(X_1, X_2)$ is called the bivariate Poisson distribution with parameter $\theta = (\theta_1, \theta_2, \theta_3)$, which will be denoted by $(X_1, X_2) \sim BP(\theta)$.
The joint probability mass function of $X_1$ and $X_2$ is given by
$$P_\theta(X_1 = x_1, X_2 = x_2) = \exp(\theta_3 - \theta_1 - \theta_2) \sum_{i=0}^{\min\{x_1, x_2\}} \frac{(\theta_1 - \theta_3)^{x_1 - i} (\theta_2 - \theta_3)^{x_2 - i} \theta_3^{i}}{(x_1 - i)!\,(x_2 - i)!\,i!},$$
where $x_1, x_2 \in \mathbb{N}_0$.
Also, the joint probability generating function (pgf) of $X_1$ and $X_2$ is
$$g(u; \theta) = \exp\{\theta_1(u_1 - 1) + \theta_2(u_2 - 1) + \theta_3(u_1 - 1)(u_2 - 1)\},$$
$u = (u_1, u_2) \in \mathbb{R}^2$, $\theta \in \Theta$.
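As a quick illustration of Definition 1 (a minimal sketch, mirroring the gmpb function of Appendix A), a large BP(θ) sample can be generated by this construction and its empirical means and covariance compared with $\theta_1$, $\theta_2$ and $\theta_3$:
# Generate a BP(theta) sample by the representation X1 = Y1 + Y3, X2 = Y2 + Y3
set.seed(1)
n  = 1e5
th = c(1.5, 1, 0.31)              # (theta1, theta2, theta3)
y1 = rpois(n, th[1] - th[3])      # Y1 ~ Poisson(theta1 - theta3)
y2 = rpois(n, th[2] - th[3])      # Y2 ~ Poisson(theta2 - theta3)
y3 = rpois(n, th[3])              # Y3 ~ Poisson(theta3)
x1 = y1 + y3; x2 = y2 + y3
c(mean(x1), mean(x2), cov(x1, x2))  # approximately 1.5, 1 and 0.31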

2.2. Goodness-of-Fit Test of Novoa-Muñoz and Jiménez-Gamero (2014)

This subsection summarizes the first test proposed by these authors, hereinafter the test Q.
To do this, let $X_1 = (X_{11}, X_{21}), \ldots, X_n = (X_{1n}, X_{2n})$ be independent and identically distributed random vectors distributed as $X = (X_1, X_2) \in \mathbb{N}_0^2$.
The hypotheses to be contrasted are
$$H_0: (X_1, X_2) \sim BP(\theta_1, \theta_2, \theta_3), \text{ for some } (\theta_1, \theta_2, \theta_3) \in \Theta,$$
$$H_1: (X_1, X_2) \nsim BP(\theta_1, \theta_2, \theta_3), \text{ for all } (\theta_1, \theta_2, \theta_3) \in \Theta.$$
According to the definition established in [9], if  θ Θ but the hypothesis test incorrectly decides to reject H 0 , then the test has made a type I error.
The test statistic is of the Cramér–von Mises type and is given by
$$Q_{n,w}(\hat{\theta}_n) = n \int_0^1 \int_0^1 \left\{ g_n(u) - g(u; \hat{\theta}_n) \right\}^2 w(u)\, du,$$
where $\hat{\theta}_n = \hat{\theta}_n(X_1, X_2, \ldots, X_n)$ is a consistent estimator of θ, $g(u; \hat{\theta}_n)$ is the pgf evaluated at $\hat{\theta}_n$, $g_n(u)$ is its empirical version, and $w(u) = u_1^{a_1} u_2^{a_2}$ is the weight function, in which $u \in [0, 1]^2$ and $(a_1, a_2)$, called the weight vector, is such that $a_1 > -1$, $a_2 > -1$.
The null hypothesis will be rejected for “large” values of Q n , w ( θ ^ n ) .
Since the null distribution of the statistic is unknown, the authors first considered estimating it by the asymptotic null distribution; however, as this depends on the true value of the parameter θ, which is unknown, it did not provide a useful solution, and they decided to estimate the null distribution of the statistic using the parametric bootstrap method, which is analyzed below.

2.3. Parametric Bootstrap

In the case under study, the distribution of the statistic depends on the parameter; such a model is called parametric, and therefore the statistical methods based on this model are parametric methods [10].
In the real world, a sample is obtained from a given population, and with this sample the population parameter is estimated, which is expected to be an interior point of the parameter space. The parametric bootstrap method is a resampling technique that emulates this situation: with the parameter estimator, a bivariate Poisson distribution is generated, from which a sample is drawn, called the bootstrap sample; with it, the parameter is estimated again, yielding the bootstrap parameter. With the bootstrap sample and the bootstrap parameter, the statistic Q* is computed, which is the bootstrap version of the statistic Q.
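The following sketch condenses this scheme into a single p-value computation; it reuses the gmpb, EstimatorML and R functions listed in Appendix A and assumes, as there, that the sample size n and the weight vector (a1, a2) are available in the global environment.
# Parametric bootstrap p-value for the statistic Q (a sketch built on the
# functions of Appendix A; n, a1 and a2 are assumed to be global, as there)
bootstrap_pvalue = function(X, B = 500){
  th    = EstimatorML(X)                      # estimate theta from the data
  q_obs = R(X, th[1], th[2], th[3])           # observed statistic
  q_boot = numeric(B)
  for (b in 1:B){
    Xb  = gmpb(ncol(X), th[1], th[2], th[3])  # resample from BP(theta-hat)
    thb = EstimatorML(Xb)                     # bootstrap parameter
    q_boot[b] = R(Xb, thb[1], thb[2], thb[3]) # bootstrap statistic Q*
  }
  mean(q_boot >= q_obs)                       # approximate p-value
}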
Although the bootstrap method has been highly useful in statistical inference, [11] shows that there are situations in which it is not consistent. In the said article, examples are shown where parametric bootstrap does not yield good results, and particularly, it is demonstrated that the method is inconsistent when the true parameter value is on the boundary of the parameter space or very close to it.

3. A Parallel Implementation

There are two parameters for analyzing the efficiency of an algorithm that solves a given problem: the memory space required to store the input data and results, and the execution time used in performing the task. Several technological advances allow many computers to collaborate with each other, either to increase the size of the problems to be solved through data distribution or through task distribution [12]. In the latter case, parallel programs are being used. Bootstrap resampling involves task distribution, as the iterations are independent of each other.

3.1. Parallelization in R Language

The R programming language offers the possibility of executing tasks in parallel, when the tasks allow it, through a series of packages that provide commands for this type of processing. Among the packages that help accomplish this task, snow, multicore, and parallel can be mentioned.
Currently, one of the packages that has gained great prominence in R programming is parallel [13], which takes the best of its two predecessors, multicore and snow, and is complemented by random number generation for parallel streams [14,15].
The parallel package provides parallel versions of the apply family of functions (such as parApply and parRapply), which manipulate data directly from vectors, matrices, lists or data frames, avoiding the use of explicit loops. It also allows finding out how many processors are available.
Thus, the parallel package allows distributing the data among the different available cores, manipulating them there simultaneously according to the indicated function, and then gathering all the information and returning it as a list [16].
In general terms, two ways of implementing parallelism using this package can be distinguished: through forking or sockets.
The forking method is based on the full duplication of the master process, whose environment (including objects or variables defined before the start of the parallel threads) is shared with each of the parallel processes. This method has the advantage of being very fast, but it cannot run under the Windows operating system.
On the other hand, in the sockets method each thread runs separately, without sharing objects or variables, which can only be passed from the master process explicitly. As a result, it runs more slowly due to communication overhead, but it has the advantage of working on any operating system [16].
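A minimal sketch of both mechanisms (the task shown is illustrative, not the simulation of this article):
library(parallel)
p = detectCores()
shift = 10                                # object needed by the workers
# Sockets (PSOCK): portable, but objects must be exported explicitly
cl = makeCluster(p, type = "PSOCK")
clusterExport(cl, "shift")
res_sock = parLapply(cl, 1:100, function(i) sqrt(i) + shift)
stopCluster(cl)
# Forking: the master environment is inherited; Unix-like systems only
res_fork = mclapply(1:100, function(i) sqrt(i) + shift, mc.cores = p)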
Just as the parallel package offers tools that allow parallelization in R, the boot package [15] incorporates functions that facilitate the application of the bootstrap method. This package also offers the possibility of performing its operations in parallel, using forking or sockets.
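As a hedged illustration (the statistic and resampling model below are simple stand-ins, not the exact code of this study), a parametric bootstrap run with boot can be parallelized directly through its parallel and ncpus arguments:
library(boot)
# Parametric bootstrap of the mean of a Poisson(2) sample, with the
# R replications distributed over 4 cores
out = boot(data = rpois(50, 2),
           statistic = function(d) mean(d),                   # statistic of interest
           R = 500, sim = "parametric",
           ran.gen = function(d, mle) rpois(length(d), mle),  # resampling model
           mle = 2,                                           # fitted parameter
           parallel = "multicore", ncpus = 4)                 # "snow" under Windows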
When programming in parallel in the R language [13] with the aim of reducing computation times, it is important to take into account some experts' recommendations. In many cases, simply "quite fast" is good enough: the additional speed that could be achieved by transitioning from R to C/C++ does not justify the significant amount of time required to write, debug, and maintain code at that level [17].

3.2. Performance Evaluation of a Parallel Program

According to [18], one way to measure the performance of a parallel program is to calculate its speedup $S$, which is the ratio between the execution time of the sequential version ($T_s$) and the execution time of the parallel version with $p$ processors ($T_p$), that is, $S = T_s / T_p$.
Furthermore, as the number of processors increases, there is more additional coordination work (and time) than expected; that is, the ratio between $S$ and $p$ decreases. Thus, the efficiency of a parallel program is defined as $E = S / p$.
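For instance, taking from Section 4 the sequential time of Table 1 and the two-processor parRapply time of Table A1, for θ = (1.5, 1, 0.31), n = 30 and (a1, a2) = (0, 0):
Ts = 8209            # sequential time in seconds (Table 1)
Tp = 4049            # parallel time with p = 2 processors (Table A1)
p  = 2
S  = Ts / Tp         # speedup: approximately 2.03
E  = S / p           # efficiency: approximately 1.01 (superlinear)
c(speedup = S, efficiency = E)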

3.3. Methodology

To showcase the results of this work, three aspects are considered:
(a)
Preparation of the programs.
Once it was verified that the process was parallelizable, the parts of the program that could be modified to reduce computation time were identified; then, the R instructions that allow parallelizing the task were analyzed, finding parRapply from the parallel package and boot from the boot package. Finally, the new codes that parallelize the simulation process were developed.
(b)
Simulation with the same components used in [1]. The simulations made by these authors were replicated considering both the sequential and the parallel code. For the parallel program, the number of cores was p = 2, 3, …, 12.
(c)
Simulations with new components.
-
New weight vectors $(a_1, a_2) \in \{(1, 0), (1, 1)\}$, with $p = 1, \ldots, 12$.
-
Sample sizes $n = 100(50)300, 500$ (that is, $n = 100, 150, 200, 250, 300$ and $500$), with the same original population parameters and $p = 10$.
-
Population parameters θ, with $\theta_1 = 1, 1.5$, $\theta_2 = 1$ and $\theta_3$ chosen so that the correlation coefficient $\rho = \theta_3 / \sqrt{\theta_1 \theta_2}$ is less than 0.25 or greater than 0.75, with $n = 30(20)70, 100(50)300, 500$ and $p = 10$; the resulting values of ρ are checked in the short sketch after this list.
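As a quick check, the correlation coefficients of these parameter vectors can be computed directly; the values below correspond to the vectors used later in Tables 9, 12, 13 and 15:
rho = function(th) th[3] / sqrt(th[1] * th[2])
rho(c(1, 1, 0.1))     # 0.100
rho(c(1, 1, 0.9))     # 0.900
rho(c(1.5, 1, 0.12))  # approximately 0.098
rho(c(1.5, 1, 0.98))  # approximately 0.800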
The simulations described in items (b) and (c) were carried out in a cluster of computers with 12 processors with a processing speed of 2.00 GHz, 32 GB of RAM and operating system Centos 7 (Linux).

4. Experimental Results

This section presents the results of the simulations of type I error considering the three parts detailed in  Section 3.3.

4.1. Simulations of the Type I Error Probability with the Components Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)

Firstly, it was verified that when using parallel programming, there was no significant difference between the simulated type I error probability and its respective nominal value. Due to space constraints, the results are not presented in this paper and can be obtained from the author upon request.
The next step was to analyze the execution times and the efficiency of the two parallel versions (the parRapply and boot commands). For this, it is essential to know the sequential execution time, which is presented in Table 1 and Table 2. Table 1 presents the results using the algorithm implemented in this research (described in Appendix A), and Table 2 records the results of running the boot command of R version 4.0.0. These times serve as a basis for calculating the performance of parallel programming.
The execution times and efficiency of the parallelized programs using the parRapply and boot commands, which were the two parallel versions analyzed in this study, are shown graphically below.
Figures 1–6 present the average execution times versus the number of processors used. To obtain these results, and for comparison, the same parameters θ as in [1] were used. Tables A1–A12, displayed in Appendix B, contain the data from which these graphs were constructed.
Figures 1–3 show the execution times when E(X1) = E(X2), and Figures 4–6 show the times when E(X1) ≠ E(X2).
Since two parallel implementations are analyzed in this research, the interest lies in knowing which of them is more efficient; therefore, the efficiencies of both were calculated. Figures 7–12 display the efficiencies E versus the number of processors used. Each figure shows the efficiency of both commands (parRapply and boot) for the same parameter θ; these parameters are the same as those used in [1].
Tables with these data are not shown, as they can be obtained directly from Table 1 and Table 2 (sequential times) together with Tables A1–A12 in Appendix B (parallel times).

4.2. Simulations of Type I Error Probability with Sample Sizes Larger than Those Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)

For simulations with sample sizes greater than n = 70, only the parRapply command was used because, as seen in the previous subsection, it is more efficient than the boot command. Furthermore, only ten processors were used.
The estimates of the type I error probability for the nominal values α = 0.01, 0.05, and 0.10 are presented in Tables 3–8.
Figures 13 and 14 present the average execution times versus the different sample sizes n for the simulations displayed in Tables 3–8.

4.3. Simulations of the Probability of a Type I Error with Parameter Vectors Different from Those Used in the First Test of Novoa-Muñoz and Jiménez-Gamero (2014)

The simulation of type I errors implemented in parallel allows evaluating the performance of the test Q when working with parameter vectors different from those already studied. In particular, it allows simulating the type I error probability in cases where the correlation coefficient ρ takes very small (≤0.1) or very large (≥0.8) values. The first of these cases takes on special relevance, since it involves parameters very close to the boundary of the parameter space, which according to [11] could lead to inconsistencies while increasing computation times.
For the same reasons given in Section 4.2, only the parRapply command and ten processors were used.
Tables 9–15 present the results of the type I error simulation for the parameter vectors that meet the characteristics already mentioned, considering the cases E(X1) = E(X2) and E(X1) ≠ E(X2), with sample sizes n = 30, 50, 70, 100, 150, and 200.
Figures 15 and 16 present the average execution times versus the different sample sizes n for the simulations displayed in Tables 9–15.

5. Discussion

From Tables 1 and 2, it is evident that the sequential version implemented in this research is faster (between 1% and 2%) than the boot version of the R package for almost all parameters θ, except for θ = (1.5, 1, 0.92). This could be attributed to the sensitivity of the maximum likelihood method, as θ2 = 1 is very close to θ3 = 0.92.
On the other hand, regardless of the parallel version (parRapply or boot) used, Figures 1–6 highlight the rapid decrease in program execution time as the number of processors increases. Additionally, it is observed that the parRapply command delivers results faster than the boot command. Furthermore, when E(X1) = E(X2), the execution time is lower than in cases where E(X1) ≠ E(X2). It is also observed that the execution time increases as the sample sizes grow.
It is also important to emphasize that the power model (faster than the exponential model) fits the relation between execution time and number of processors almost perfectly, as approximately 99.9% ($R^2 \approx 0.999$) of the variation in execution time is explained by the number of processors used.
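A minimal sketch of this fit, using the parRapply times of Table A1 for θ = (1.5, 1, 0.31), n = 30 and (a1, a2) = (0, 0); the model y = a p^b is linearized by taking logarithms:
p    = 2:12
time = c(4049, 2764, 2118, 1726, 1445, 1242, 1102, 1007, 919, 841, 764)
fit  = lm(log(time) ~ log(p))    # log y = log a + b log p
a = exp(coef(fit)[1]); b = coef(fit)[2]
summary(fit)$r.squared           # close to 1 (the text reports approx. 0.999)
c(a = a, b = b)                  # a > 0, b < 0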
Adding to the previous point, Figures 7–12 show that the two parallel implementations addressed in this research exhibit linear efficiency. In all analyzed cases, the parRapply command is more efficient than the boot command, with less dispersion among the sample sizes used. Furthermore, it is observed that when two processors are employed, the efficiency of the parRapply command is greater than 1.
In this line of discussion, and thanks to parallel programming, it was possible to obtain results for sample sizes greater than 70, which are displayed in Tables 3–8. From these tables, it can be observed that the estimates are close to the nominal values; for example, the estimates at the 10% level are very close to 0.1. This becomes clearer as the sample size increases, implying that the estimates converge to the nominal values.
In line with the previous paragraph, Figures 13 and 14 graphically illustrate the positive correlation between the execution time of the programs and the sample size. Clearly, the execution time fits a linear function of the sample size, and this fit is almost perfect ($R^2 \approx 0.99$). It is also evident that the execution time of the programs is lower when E(X1) = E(X2).
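A sketch of this linear fit follows; the execution times below are hypothetical placeholders, since the underlying data appear only graphically in Figures 13 and 14:
n    = c(100, 150, 200, 250, 300, 500)
time = c(1500, 2100, 2800, 3400, 4000, 6500)  # hypothetical times (in seconds)
fit  = lm(time ~ n)              # linear model y = a*n + b
coef(fit)                        # slope a > 0
summary(fit)$r.squared           # near 1 for an almost perfect linear fit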
Finally, from Tables 9–15, it can be appreciated that the estimates of the type I error probability are very close to the nominal values in all cases where the correlation coefficient ρ > 0.1. However, as could be expected according to what [11] suggested, in the cases in which θ3 = 0.1 (ρ = 0.1) and θ3 = 0.12 (ρ ≈ 0.098), the simulated probabilities are very far from the nominal values and, strangely, the situation worsens as the sample size grows.
In relation to the last paragraph, Figures 15 and 16 show that the execution time of the programs fits a linear function of the sample size ($R^2 > 0.94$) with a positive slope. It is also observed that these times decrease as the correlation coefficient ρ increases, and along this path the dependence becomes more linear ($R^2$ increases). Additionally, these graphs show a high dispersion between the lines, which is due to the large differences among the correlation coefficients ρ.

6. Conclusions

In this research, two parallel algorithmic implementations in the R language have been analyzed, which allow simulating the probability of type I error. The results obtained show that the R language is a good alternative for reducing the simulation time of the studied test. On the other hand, the estimates of the type I error probability of the studied test are close to the nominal value for both small sample sizes ($n = 30, 50, 70$) and large sample sizes ($n = 100(50)300, 500$), as long as the correlation coefficient ρ involved is greater than 0.1.
Moreover, the execution time when simulating the type I error probability of the studied test is significantly higher in cases where the correlation coefficient ρ is low (ρ ≈ 0.1 and ρ ≈ 0.2) than in those where this coefficient is high (ρ ≈ 0.8 and ρ ≈ 0.9).
The major mathematical finding of this research is that the execution time of the analyzed parallel versions (parRapply and boot) fits almost perfectly ($R^2 \approx 0.999$) the power-law model $y = a p^{b}$, where $p$ is the number of processors used, and $a > 0$ and $b < 0$ are the constants of the model.
The other mathematical result found is that the execution time of the analyzed parallel versions (parRapply and boot) fits the linear model $y = a n + b$, where $n$ is the sample size and $a > 0$ is the slope of the line.
As future work, there are plans to deepen the research on the efficiency of the studied test in cases where the correlation coefficient is less than 0.2. Additionally, implementing the algorithm developed in this research in the C programming language is planned. Furthermore, developing a package in the R language that incorporates the goodness-of-fit test studied and others proposed in [1,19,20,21] is also planned.

Funding

This paper is funded in part by the Universidad del Bío-Bío Project DIUBB 2220529 IF/R, Dirección de Investigación y Creación Artística (Fondo de Apoyo a la Participación a Eventos Internacionales) and Vicerrectoría Académica.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The author thanks the editor of this journal and the anonymous reviewers for their valuable time and their careful comments and suggestions, which have contributed to improving the quality of this work.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

This section presents the code generated in this research.
#Estimation with Maximum Likelihood Method (MLM)
#EstimatorML: estimates the parameters of the BP distribution by the MLM
#X=((X_{1i},X_{2i})), i=1,2,…,n
EstimatorML = function(X){
 n = dim(X)[2]
 t1 = mean(X[1,])
 t2 = mean(X[2,])
 t3 = min(t1,t2)/2  # initial value of parameter t3
 x_min = min(X[1,])
 x_max = max(X[1,])
 y_min = min(X[2,])
 y_max = max(X[2,])
 frec = matrix(0,x_max-x_min+1,y_max-y_min+1)
 for(i in x_min:x_max){
  p = i - x_min + 1
  for(j in y_min:y_max){
   q = j - y_min + 1
   for (k in 1:n){
    if(X[1,k]==i && X[2,k]==j) frec[p,q] = frec[p,q]+1
   }
  }
 }
 # Newton-Raphson Method
 # prev_t3 saves the previous value of t3
 # RMyDerRM_t3 calculates RM and the derivative of RM with respect to t3
 num = 1
 difm = 0.001
 diff = 100
 dif_min = 100
 t3_min <- t3
 while (diff>=difm && num<=200){
  prev_t3 = t3
  RM_Der = RMyDerRM_t3(n,x_min,x_max,y_min,y_max,frec,t1,t2,t3)
  t3 = t3 - (RM_Der[1] - 1)/RM_Der[2]
  diff = abs(prev_t3 - t3)
  if(is.na(diff)){
    diff = difm/10
    num = 250
    t3 = t3_min
  } else if(diff<dif_min){
       dif_min <- diff
       t3_min <- t3
  }
  num = num + 1  # the iteration counter must advance inside the loop
 }
 return(c(t1,t2,t3))
}
# probPB calculates the probability function of a PB(t1,t2,t3)
probPB = function(x,y,t1,t2,t3){
 s = 0
 if(x>=0 && y>=0){
  m = min(x,y)
  tt1 = t1 - t3
  tt2 = t2 - t3
  for(i in 0:m){
   s = s + tt1^(x-i)*tt2^(y-i)*t3^i/(gamma(x-i+1)*gamma(y-i+1)*gamma(i+1))
  }
  s = s*exp(-tt1 - tt2 - t3)
 }
 return(s)
}
RMyDerRM_t3 = function(n,x_min,x_max,y_min,y_max,frec,t1,t2,t3){
 der = rm = 0
 for(i in x_min:x_max){
  p = i - x_min + 1
  for(j in y_min:y_max){
   q = j - y_min + 1
   if(frec[p,q]>0){
    prob_ij = probPB(i,j,t1,t2,t3)
    prob_im1jm1 = probPB(i-1,j-1,t1,t2,t3)
    prob_im2jm2 = probPB(i-2,j-2,t1,t2,t3)
    prob_im2jm1 = probPB(i-2,j-1,t1,t2,t3)
    prob_im1jm2 = probPB(i-1,j-2,t1,t2,t3)
    prob_im1j = probPB(i-1,j,t1,t2,t3)
    prob_ijm1 = probPB(i,j-1,t1,t2,t3)
    rm = rm + frec[p,q]*prob_im1jm1/prob_ij
    sum1 = (prob_im2jm2 - prob_im2jm1 - prob_im1jm2)/prob_ij
    sum2 = prob_im1jm1*(prob_im1j + prob_ijm1 - prob_im1jm1)/(prob_ij^2)
    der = der + frec[p,q]*(sum1 + sum2)
   }
  }
 }
 return(c(rm/n,der/n))
}
f_n = function(i,j,n,X){
 ind1 = ind2 = rep(0,n)
 ind1[X[1,]==i] = ind2[X[2,]==j] = 1
 ss = sum(ind1*ind2)
 return(ss/n)
}
# Bivariate Poisson Sample Generation
gmpb = function(n,t1,t2,t3){
 if (t1>t3 && t2>t3 && t3>0){
  a = rpois(n,t1-t3)
  b = rpois(n,t2-t3)
  c = rpois(n,t3)
  return(rbind(a+c,b+c))
 } else { stop("The parameters do not meet the requirements"); on.exit}
}
# Calculation of the statistic R_n,a
# n, a1 and a2 are taken from the global environment (exported to the workers)
R = function(X,t1,t2,t3){
 m = 10
 inf = max(max(X[1,]),max(X[2,])) + m
 fp = Prob <- matrix(0,inf+1,inf+1)
 for(i in 0:inf){
  for(j in 0:inf){
    Prob[i+1,j+1] = probPB(i,j,t1,t2,t3)
    fp[i+1,j+1] = f_n(i,j,n,X) - Prob[i+1,j+1]  # reuse the stored probability
  }
 }
 k = l = 0:inf
 f = function(k,l){1/((ii + k + a1 + 1)*(jj + l + a2 + 1))}
 s = 0  # accumulator for the double sum
 for(ii in 0:inf){
  for(jj in 0:inf){
   s = s + fp[ii+1,jj+1]*sum(fp*outer(k,l,FUN=f))
  }
 }
 return(n*s)
}
# A function is defined to be introduced as an argument in parRapply
# Must be a function of a column
full_function = function(X){
 X = data.frame(matrix(unlist(X), ncol = 2, byrow = F))
 X = t(X)
 th_estim = EstimatorML(X)
 t1 = th_estim[1]
 t2 = th_estim[2]
 t3 = th_estim[3]
 r_obs = R(X,t1,t2,t3)
 r_boot = rep(0,B)
 for (b in 1:B){
  valid = FALSE  # 'next' is a reserved word in R, so a logical flag is used
  while(!valid){ # B bootstrap samples are generated
   X_PBboot = gmpb(n,th_estim[1],th_estim[2],th_estim[3])
   th_est_boot = EstimatorML(X_PBboot) # bootstrap theta estimator
   th_dif1 = th_est_boot[1] - th_est_boot[3]
   th_dif2 = th_est_boot[2] - th_est_boot[3]
   # keep the sample only if the bootstrap estimate lies inside Theta
   if(th_dif1 > 0 && th_dif2 > 0 && th_est_boot[3] > 0) valid = TRUE
  }
  # Statistics are evaluated in each bootstrap sample
  r_boot[b] = R(X_PBboot,th_est_boot[1],th_est_boot[2],th_est_boot[3])
 }
 # An approximation of the p-value is accumulated for each statistic
 ind_r = rep(0,B)
 ind_r[r_boot >= r_obs] = 1
 vp_b = sum(ind_r)/B
 return(vp_b)
}
M = 1000  # Monte Carlo iterations
B = 500   # Bootstrap iterations
va1 = c(0,0,1,1); va2 = c(0,1,0,1) # vector (a1,a2)
library(parallel)
nc = detectCores()
tm = c(30,50,70,100,150,200,250,300,500)  # Sample sizes
Theta = matrix(0,6,3)
Theta[1,] = c(1.5,1,0.31); Theta[2,] = c(1.5,1,0.62)
Theta[3,] = c(1.5,1,0.92); Theta[4,] = c(1,1,0.5)
Theta[5,] = c(1,1,0.25); Theta[6,] = c(1,1,0.75)
for(i in 1:6){
 th = Theta[i,]
 even = seq(2, 2*M, by = 2)      # rows holding the second coordinates
 odd  = seq(1, 2*M - 1, by = 2)  # rows holding the first coordinates
 for(t in 1:length(tm)){
  n = tm[t]
  X = Y = matrix(0,M,n)
  carpet = "folder where the samples are saved"
  file_L = paste0(carpet,"/Theta_",th[1],"_",th[2],"_",th[3],"_n",n,".txt")
  XY = matrix(scan(file_L,sep=","),nrow=2*M,byrow=TRUE)
  X = XY[odd,]
  Y = XY[even,]
  XY_col = matrix(0,2*n,M)
  for (m in 1:M){
   XY_col[,m] = c(X[m,],Y[m,])
  }
  # parallelize with parRapply with nc processors
  carp_file = "folder where the outputs will be saved"
  file = paste0(carp_file,th[1],"_",th[2],"_",th[3],"_n",n,".txt")
  for (a_12 in 1:4){
   a1 = va1[a_12]
   a2 = va2[a_12]
   for(nn in 1:nc){
    cl = parallel::makeCluster(nn)
    clusterExport(cl, "EstimatorML")
    clusterExport(cl, "RMyDerRM_t3")
    clusterExport(cl, "probPB")
    clusterExport(cl, "f_n")
    clusterExport(cl, "n")
    clusterExport(cl, "a1")
    clusterExport(cl, "a2")
    clusterExport(cl, "B")
    clusterExport(cl, "M")
    clusterExport(cl, "R")
    clusterExport(cl, "gmpb")
    SS = paste0("Outputs with ", nn, " processors, a1=", a1, ", a2=", a2)
    write(SS,file=file,append=TRUE)
    a = proc.time()
    res = parRapply(cl, t(XY_col), full_function)
    parallel::stopCluster(cl)
    execution_time = proc.time()-a
    write("",file=file,append=TRUE)
    write(c("vp : ",res),ncolumns=length(res)+1,file=file,append=TRUE)
    ind1_r = ind2_r = ind3_r = rep(0,M)
    ind1_r[res <= .01] = 1
    error1_r = sum(ind1_r)/M
    ind2_r[res <= .05] = 1
    error5_r = sum(ind2_r)/M
    ind3_r[res <= .1] = 1
    error10_r = sum(ind3_r)/M
    write("",file=file,append=TRUE)
    write(c(error1_r,error5_r,error10_r),file=file,ncolumns=4,append=TRUE)
    write("",file=file,append=TRUE)
    write("Execution time",file=arch,append=TRUE)
    write(execution_time,file=file,append=TRUE)
   }
  }
 }
}

Appendix B

In this section, the average execution times are presented as a function of the number of processors used when running the two parallel implementations (parRapply and boot) analyzed in this research.
Table A1. Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.31), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       4049   4058   4063   4088    4402   4428   4413   4440    4657   4662   4663   4674
3       2764   2762   2801   2793    2959   2979   2970   3006    3164   3142   3164   3162
4       2118   2100   2092   2112    2261   2259   2255   2248    2406   2399   2389   2404
5       1726   1761   1713   1724    1868   1888   1874   1870    2004   1970   1988   1969
6       1445   1457   1444   1461    1575   1551   1601   1567    1679   1658   1640   1655
7       1242   1246   1257   1263    1365   1350   1361   1356    1432   1457   1420   1447
8       1102   1125   1123   1090    1203   1193   1191   1196    1288   1261   1274   1264
9       1007   1007   1012   1012    1105   1099   1112   1097    1158   1162   1160   1171
10       919    916    914    918     988    981    976    986    1044   1039   1055   1037
11       841    840    835    845     905    931    913    933     963    975    960    981
12       764    765    764    776     823    820    820    823     869    875    882    877
Table A2. Average execution time (in seconds) using boot, θ = (1.5, 1, 0.31), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       4359   4377   4338   4351    4645   4660   4672   4650    4936   4943   4941   4948
3       3089   3078   3078   3079    3271   3280   3269   3258    3453   3454   3465   3443
4       2368   2368   2367   2380    2496   2489   2489   2478    2631   2626   2626   2605
5       2003   2010   1998   2010    2110   2114   2098   2107    2229   2225   2219   2209
6       1701   1707   1710   1706    1775   1772   1783   1771    1875   1852   1861   1863
7       1543   1533   1532   1530    1598   1602   1600   1598    1670   1670   1666   1666
8       1366   1360   1362   1362    1395   1400   1403   1393    1465   1471   1465   1476
9       1280   1270   1281   1280    1322   1322   1328   1328    1376   1379   1377   1382
10      1161   1177   1162   1169    1194   1193   1203   1196    1260   1255   1261   1254
11      1098   1088   1095   1099    1126   1125   1119   1126    1173   1175   1170   1178
12      1025   1020   1022   1027    1027   1029   1040   1034    1074   1072   1058   1071
Table A3. Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.62), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3897   3879   3969   3890    4257   4316   4291   4261    4529   4544   4530   4564
3       2658   2632   2626   2646    2889   2886   2915   2909    3107   3109   3091   3093
4       2037   2011   2015   2005    2178   2208   2195   2186    2330   2322   2343   2344
5       1669   1638   1672   1673    1797   1819   1823   1809    1925   1934   1949   1927
6       1391   1392   1375   1392    1495   1514   1524   1516    1609   1610   1610   1615
7       1200   1195   1205   1201    1311   1307   1317   1326    1388   1396   1380   1399
8       1064   1046   1053   1001    1149   1156   1149   1151    1218   1236   1212   1224
9        974    976    970    970    1058   1066   1059   1071    1117   1135   1124   1131
10       867    873    877    874     973    948    972    950    1021   1004   1016   1023
11       819    806    807    815     873    886    876    886     929    959    959    948
12       728    737    730    738     820    807    806    810     859    861    853    858
Table A4. Average execution time (in seconds) using boot, θ = (1.5, 1, 0.62), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       4133   4111   4110   4110    4542   4552   4529   4530    4840   4836   4844   4833
3       2883   2874   2891   2875    3169   3175   3162   3160    3381   3372   3395   3373
4       2184   2184   2188   2181    2394   2396   2388   2394    2549   2547   2564   2550
5       1857   1854   1853   1854    2034   2031   2027   2033    2163   2155   2174   2158
6       1555   1559   1554   1553    1697   1696   1697   1696    1810   1798   1810   1801
7       1403   1397   1401   1399    1528   1526   1524   1525    1621   1619   1622   1622
8       1223   1226   1223   1225    1335   1333   1335   1333    1420   1411   1419   1410
9       1147   1144   1145   1145    1259   1258   1256   1256    1340   1338   1341   1339
10      1037   1036   1036   1037    1132   1131   1132   1132    1211   1208   1212   1209
11       972    973    968    972    1058   1056   1055   1056    1133   1132   1133   1129
12       892    893    895    898     973    971    962    973    1035   1024   1032   1024
Table A5. Average execution time (in seconds) using parRapply, θ = (1.5, 1, 0.92), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3970   3971   4000   4004    4351   4331   4355   4326    4639   4619   4654   4599
3       2740   2713   2701   2708    2956   2963   2969   2947    3130   3178   3106   3130
4       2064   2049   2028   2042    2250   2251   2253   2251    2369   2391   2349   2354
5       1696   1677   1668   1686    1841   1865   1855   1845    1978   1962   1959   1973
6       1403   1411   1399   1438    1551   1549   1538   1537    1645   1630   1620   1617
7       1239   1221   1235   1224    1344   1339   1329   1328    1436   1414   1419   1409
8       1087   1059   1079   1076    1191   1172   1174   1192    1243   1238   1234   1234
9        981    986    998    989    1086   1080   1083   1086    1162   1144   1139   1138
10       896    881    886    882     983    975    972    989    1034   1037   1027   1030
11       832    819    841    820     906    893    884    912     948    943    944    948
12       751    742    753    751     820    823    825    823     866    868    873    871
Table A6. Average execution time (in seconds) using boot, θ = (1.5, 1, 0.92), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       4085   4083   4083   4082    4728   4551   4539   4535    4836   4839   4839   4830
3       2860   2850   2847   2856    3185   3175   3165   3166    3374   3372   3396   3376
4       2155   2159   2169   2155    2407   2396   2391   2401    2556   2543   2547   2547
5       1836   1838   1837   1835    2044   2034   2032   2035    2166   2156   2164   2164
6       1534   1530   1536   1527    1704   1700   1698   1698    1804   1806   1808   1811
7       1381   1382   1380   1381    1533   1531   1526   1528    1622   1622   1625   1626
8       1202   1203   1203   1206    1339   1337   1333   1333    1414   1417   1418   1413
9       1121   1125   1122   1126    1264   1264   1262   1260    1341   1341   1342   1339
10      1016   1018   1018   1018    1133   1132   1133   1135    1210   1210   1214   1208
11       956    955    955    954    1060   1059   1058   1058    1136   1134   1135   1134
12       876    881    886    877     975    976    965    966    1028   1028   1027   1027
Table A7. Average execution time (in seconds) using parRapply, θ = (1, 1, 0.5), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3457   3439   3509   3475    3727   3759   3767   3771    3960   3978   3952   3979
3       2341   2330   2326   2350    2498   2522   2539   2517    2679   2721   2676   2704
4       1782   1769   1769   1758    1929   1931   1914   1930    2053   2037   2031   2034
5       1466   1479   1468   1491    1591   1595   1602   1590    1679   1673   1690   1680
6       1222   1228   1222   1232    1326   1335   1325   1329    1415   1399   1413   1409
7       1070   1075   1063   1061    1150   1156   1138   1146    1229   1211   1210   1213
8        941    938    942    931    1003   1004   1015   1013    1075   1070   1072   1074
9        848    849    849    858     936    924    923    930     976   1011    987   1016
10       777    808    781    774     839    834    835    839     888    877    877    899
11       715    703    725    723     773    768    773    774     813    821    823    823
12       656    654    655    660     709    706    710    712     755    761    743    749
Table A8. Average execution time (in seconds) using boot, θ = (1, 1, 0.5), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3665   3673   3670   3682    4000   3983   3988   3992    4249   4255   4238   4241
3       2577   2572   2589   2586    2792   2786   2799   2806    2971   2984   2966   2967
4       1955   1947   1959   1961    2109   2103   2113   2110    2233   2247   2229   2235
5       1668   1663   1662   1667    1794   1790   1797   1795    1901   1908   1898   1903
6       1399   1400   1394   1399    1497   1498   1497   1498    1584   1591   1590   1594
7       1263   1260   1262   1256    1351   1353   1355   1352    1431   1432   1431   1433
8       1091   1092   1093   1092    1175   1170   1172   1174    1254   1250   1250   1252
9       1014   1014   1019   1015    1093   1091   1095   1095    1172   1172   1175   1169
10       938    935    933    932     989    989    988    989    1049   1048   1052   1050
11       875    878    880    877     931    929    929    930     985    986    988    986
12       809    810    809    810     859    862    861    860     904    904    913    911
Table A9. Average execution time (in seconds) using parRapply, θ = (1, 1, 0.25), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3589   3605   3583   3587    3925   3975   3867   3904    4113   4063   4154   4105
3       2436   2452   2442   2458    2660   2628   2630   2667    2792   2837   2796   2803
4       1844   1851   1859   1879    1998   1991   2018   1993    2105   2103   2139   2106
5       1528   1563   1545   1537    1667   1669   1652   1648    1760   1762   1748   1745
6       1280   1296   1277   1289    1375   1384   1382   1379    1445   1468   1449   1465
7       1125   1114   1111   1136    1200   1194   1203   1223    1260   1267   1253   1260
8        988    972    980    974    1062   1043   1049   1059    1108   1100   1109   1111
9        890    892    897    892     960    957    968    958    1054   1014   1038   1017
10       811    812    810    799     874    880    870    872     914    925    910    921
11       748    746    751    742     797    821    799    794     846    850    855    846
12       689    678    687    685     745    736    738    739     771    786    773    779
Table A10. Average execution time (in seconds) using boot, θ = (1, 1, 0.25), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3968   3950   3972   3954    4136   4150   4142   4141    4368   4354   4384   4386
3       2830   2809   2812   2813    2923   2913   2922   2907    3069   3086   3058   3065
4       2175   2166   2169   2164    2215   2222   2229   2222    2319   2328   2333   2323
5       1848   1845   1853   1856    1892   1892   1893   1888    1972   1980   1972   1972
6       1586   1570   1570   1580    1590   1590   1595   1586    1663   1659   1651   1664
7       1416   1421   1419   1424    1430   1438   1443   1434    1487   1484   1480   1483
8       1255   1251   1252   1256    1255   1261   1260   1255    1303   1299   1302   1310
9       1171   1180   1181   1178    1169   1174   1176   1176    1222   1224   1226   1219
10      1082   1086   1087   1084    1069   1070   1068   1074    1100   1096   1099   1102
11      1028   1020   1019   1025    1003   1008   1007   1002    1028   1031   1032   1027
12       954    952    959    952     931    935    925    927     950    952    946    953
Table A11. Average execution time (in seconds) using parRapply, θ = (1, 1, 0.75), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3383   3426   3400   3418    3731   3716   3725   3733    3946   3970   3981   3963
3       2321   2293   2321   2296    2511   2543   2525   2566    2697   2690   2705   2683
4       1742   1755   1752   1747    1899   1910   1931   1918    2064   2021   2031   2063
5       1478   1451   1462   1444    1610   1585   1599   1603    1701   1693   1702   1700
6       1204   1205   1214   1217    1320   1315   1338   1322    1410   1405   1416   1408
7       1053   1040   1036   1047    1150   1146   1145   1147    1238   1234   1225   1227
8        907    927    916    927    1002   1009   1009   1004    1077   1083   1090   1062
9        850    844    847    852     931    921    920    920     974    988    978    992
10       755    767    762    767     831    831    843    842     892    897    894    885
11       701    708    713    699     775    762    770    772     810    801    805    812
12       643    647    648    645     711    713    700    707     759    750    759    753
Table A12. Average execution time (in seconds) using boot, θ = (1, 1, 0.75), weight vectors (a1, a2), sample size n, for different numbers of processors p.

              n = 30                      n = 50                      n = 70
p      (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
2       3582   3601   3601   3596    3957   3978   3983   3966    4226   4631   4244   4254
3       2533   2524   2539   2532    2789   2790   2795   2783    2963   2957   2966   2972
4       1904   1900   1905   1900    2105   2099   2095   2088    2228   2335   2229   2241
5       1625   1623   1621   1622    1791   1791   1789   1785    1901   1904   1898   1897
6       1358   1356   1353   1351    1486   1492   1491   1484    1588   1589   1585   1585
7       1220   1219   1218   1218    1346   1347   1345   1347    1428   1432   1427   1427
8       1053   1050   1049   1052    1169   1166   1166   1161    1245   1249   1249   1247
9        976    978    978    979    1086   1087   1087   1084    1169   1170   1170   1169
10       895    895    894    896     983    984    985    982    1048   1050   1049   1051
11       843    844    843    846     927    925    925    924     984    986    986    985
12       782    777    775    784     856    849    860    857     903    911    914    906

References

  1. Novoa-Muñoz, F.; Jiménez-Gamero, M.D. Testing for the bivariate Poisson distribution. Metrika 2014, 77, 771–793.
  2. Bolis, A.; Cantwell, C.D.; Moxey, D.; Serson, D.; Sherwin, S.J. An adaptable parallel algorithm for the direct numerical simulation of incompressible turbulent flows using a Fourier spectral/hp element method and MPI virtual topologies. Comput. Phys. Commun. 2016, 206, 17–25.
  3. Macías-Díaz, J.E. An easy-to-implement parallel algorithm to simulate complex instabilities in three-dimensional (fractional) hyperbolic systems. Comput. Phys. Commun. 2020, 254, 51059.
  4. Laman, D.S.; Wieringen, W.N. A parallel algorithm for ridge-penalized estimation of the multivariate exponential family from data of mixed types. Stat. Comput. 2021, 31, 41.
  5. Yang, D.; Li, M.; Liu, H. A Parallel Computing Algorithm for the Emergency-Oriented Atmospheric Dispersion Model CALPUFF. Atmosphere 2022, 13, 2129.
  6. Trabes, G.; Wainer, G.; Gil-Costa, V. A Parallel Algorithm to Accelerate DEVS Simulations in Shared Memory Architectures. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 1609–1620.
  7. Wu, X.; Kolar, A.; Chung, J.; Jin, D.; Suchara, M.; Kettimuthu, R. Parallel Simulation of Quantum Networks with Distributed Quantum State Management. ACM Trans. Model. Comput. Simul. 2024, 34, 1–28.
  8. Kocherlakota, S.; Kocherlakota, K. Bivariate Discrete Distributions; CRC Press: Boca Raton, FL, USA, 2017.
  9. Rohatgi, V.K.; Saleh, A.K.M.E. An Introduction to Probability Theory and Mathematical Statistics, 3rd ed.; Wiley Series in Probability and Statistics; John Wiley & Sons: Hoboken, NJ, USA, 2015.
  10. Shao, J. Mathematical Statistics; Springer Science & Business Media: New York, NY, USA, 2008.
  11. Andrews, D.W.K. Inconsistency of the bootstrap when a parameter is on the boundary of the parameter space. Econometrica 2000, 68, 399–405.
  12. Pacheco, P. An Introduction to Parallel Programming; Morgan Kaufmann Publishers: Burlington, MA, USA, 2011.
  13. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020. Available online: https://www.R-project.org/ (accessed on 5 June 2021).
  14. Urbanek, S. multicore: Parallel Processing of R Code on Machines with Multiple Cores or CPUs; 2014. Available online: https://cran.r-project.org/package=multicore (accessed on 25 July 2021).
  15. Canty, A.; Ripley, B.D. boot: Bootstrap R (S-Plus) Functions. R Package Version 1.3-30; 2024. Available online: https://cran.r-project.org/web/packages/boot/boot.pdf (accessed on 12 May 2024).
  16. Chapple, S.R.; Troup, E.; Forster, T.; Sloan, T. Mastering Parallel Programming with R; Packt Publishing Ltd.: Birmingham, UK, 2016.
  17. Matloff, N. Parallel Computing for Data Science; Chapman and Hall/CRC: New York, NY, USA, 2016.
  18. Rauber, T.; Rünger, G. Parallel Programming, 3rd ed.; Springer: Cham, Switzerland, 2023.
  19. Novoa-Muñoz, F.; Jiménez-Gamero, M.D. A goodness-of-fit test for the multivariate Poisson distribution. SORT 2016, 40, 113–138.
  20. Novoa-Muñoz, F. Goodness-of-fit tests for the bivariate Poisson distribution. Commun. Stat. Simul. Comput. 2021, 50, 1998–2014.
  21. González-Albornoz, P.; Novoa-Muñoz, F. Goodness-of-Fit Test for the Bivariate Hermite Distribution. Axioms 2023, 12, 7.
Figure 1. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.75).
Figure 2. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.50).
Figure 3. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.25).
Figure 4. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.62).
Figure 5. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.92).
Figure 6. Average execution time (in seconds) using parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.31).
Figure 7. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.75).
Figure 8. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.50).
Figure 9. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1, 1, 0.25).
Figure 10. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.62).
Figure 11. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.92).
Figure 12. Efficiency of running the parRapply and boot commands versus number of processors p and different sample sizes n for θ = (1.5, 1, 0.31).
Figure 13. Average execution time (in seconds) versus sample sizes n for parameters θ when E(X1) = E(X2).
Figure 14. Average execution time (in seconds) versus sample sizes n for parameters θ when E(X1) ≠ E(X2).
Figure 15. Average execution time (in seconds) versus sample sizes n for parameters θ when E(X1) = E(X2).
Figure 16. Average execution time (in seconds) versus sample sizes n for parameters θ when E(X1) ≠ E(X2).
Table 1. Average execution time (in seconds) using the sequential version, weight vectors (a1, a2), sample sizes n and parameters θ.

                      n = 30                      n = 50                      n = 70
θ              (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
(1.0, 1, 0.75)  6851   6881   6892   6912    7558   7432   7657   7461    8108   8054   7952   8019
(1.0, 1, 0.50)  6923   7114   7006   6985    7554   7556   7513   7511    8008   8086   8055   8109
(1.0, 1, 0.25)  7451   7257   7247   7317    7856   7795   7854   7778    8264   8359   8270   8220
(1.5, 1, 0.62)  7937   7895   7969   7878    8630   8624   8495   8584    9174   9097   9382   9210
(1.5, 1, 0.92)  7995   7976   8051   8004    8804   8886   8819   8713    9398   9287   9270   9446
(1.5, 1, 0.31)  8209   8307   8175   8146    8907   8925   8913   8914    9427   9451   9398   9400
Table 2. Average execution time (in seconds) using the boot sequential version, weight vectors (a1, a2), sample sizes n and parameters θ.

                      n = 30                      n = 50                      n = 70
θ              (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)   (0,0)  (0,1)  (1,0)  (1,1)
(1.0, 1, 0.75)  6885   6868   6907   6962    7720   7688   7658   7691    8207   8164   8231   8243
(1.0, 1, 0.50)  7073   7042   7070   7059    7752   7687   7678   7690    8256   8239   8267   8237
(1.0, 1, 0.25)  7526   7536   7472   7485    7905   7960   8013   7993    8468   8380   8429   8385
(1.5, 1, 0.62)  7977   7926   7933   7952    8823   8789   8768   8782    9415   9403   9500   9391
(1.5, 1, 0.92)  7910   7865   7885   7889    8765   8812   8776   8817    9441   9362   9393   9492
(1.5, 1, 0.31)  8313   8335   8326   8343    9028   9138   8991   9029    9653   9553   9573   9587
Table 3. Results of simulating probability of type I error. Instance θ = (1, 1, 0.75), ρ = 0.75.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.009 0.039 0.086    0.013 0.042 0.088    0.007 0.044 0.102    0.010 0.040 0.093
150    0.013 0.054 0.111    0.009 0.048 0.107    0.010 0.054 0.109    0.010 0.052 0.114
200    0.013 0.053 0.112    0.014 0.058 0.108    0.014 0.056 0.115    0.015 0.060 0.113
250    0.016 0.060 0.109    0.014 0.065 0.112    0.014 0.057 0.115    0.014 0.060 0.108
300    0.013 0.051 0.101    0.013 0.050 0.089    0.014 0.052 0.100    0.011 0.048 0.101
500    0.010 0.048 0.099    0.009 0.050 0.098    0.011 0.047 0.099    0.010 0.049 0.099
Table 4. Results of simulating probability of type I error. Instance θ = (1, 1, 0.5), ρ = 0.5.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.009 0.054 0.092    0.009 0.053 0.099    0.015 0.048 0.087    0.009 0.052 0.085
150    0.017 0.051 0.109    0.009 0.055 0.107    0.017 0.061 0.108    0.012 0.054 0.096
200    0.013 0.068 0.116    0.011 0.055 0.115    0.018 0.070 0.114    0.013 0.064 0.114
250    0.009 0.047 0.095    0.009 0.050 0.093    0.014 0.038 0.096    0.010 0.050 0.100
300    0.018 0.048 0.098    0.012 0.048 0.099    0.022 0.049 0.095    0.015 0.050 0.092
500    0.011 0.051 0.102    0.010 0.052 0.100    0.011 0.049 0.099    0.011 0.056 0.100
Table 5. Results of simulating probability of type I error. Instance θ = (1, 1, 0.25), ρ = 0.25.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.006 0.052 0.100    0.008 0.051 0.109    0.011 0.042 0.105    0.009 0.046 0.096
150    0.011 0.047 0.097    0.012 0.052 0.101    0.007 0.047 0.099    0.008 0.043 0.093
200    0.017 0.057 0.090    0.011 0.044 0.091    0.018 0.065 0.110    0.018 0.053 0.102
250    0.013 0.056 0.108    0.009 0.051 0.097    0.013 0.068 0.124    0.012 0.056 0.115
300    0.010 0.057 0.113    0.013 0.056 0.108    0.011 0.049 0.106    0.011 0.057 0.111
500    0.009 0.049 0.100    0.011 0.051 0.101    0.009 0.049 0.099    0.010 0.048 0.100
Table 6. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.62), ρ ≈ 0.5.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.010 0.060 0.116    0.012 0.054 0.115    0.010 0.061 0.108    0.009 0.054 0.111
150    0.012 0.063 0.129    0.015 0.068 0.130    0.014 0.067 0.122    0.011 0.067 0.129
200    0.007 0.038 0.080    0.009 0.039 0.093    0.004 0.037 0.094    0.006 0.040 0.085
250    0.008 0.046 0.094    0.009 0.049 0.103    0.007 0.042 0.089    0.009 0.041 0.094
300    0.016 0.058 0.103    0.014 0.059 0.108    0.013 0.055 0.100    0.010 0.052 0.101
500    0.009 0.048 0.099    0.011 0.048 0.098    0.009 0.049 0.097    0.010 0.049 0.099
Table 7. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.92), ρ ≈ 0.75.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.007 0.039 0.088    0.008 0.037 0.090    0.009 0.043 0.094    0.006 0.035 0.091
150    0.011 0.066 0.121    0.012 0.054 0.131    0.011 0.067 0.123    0.011 0.056 0.118
200    0.017 0.060 0.110    0.021 0.052 0.101    0.016 0.057 0.110    0.018 0.057 0.106
250    0.019 0.053 0.111    0.021 0.059 0.111    0.011 0.054 0.098    0.022 0.055 0.098
300    0.009 0.054 0.111    0.013 0.053 0.110    0.011 0.060 0.103    0.012 0.052 0.101
500    0.011 0.051 0.099    0.011 0.051 0.101    0.011 0.051 0.101    0.011 0.051 0.101
Table 8. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.31), ρ ≈ 0.25.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
100    0.010 0.044 0.099    0.014 0.053 0.104    0.008 0.049 0.089    0.009 0.048 0.104
150    0.010 0.050 0.098    0.006 0.053 0.097    0.013 0.057 0.100    0.004 0.048 0.098
200    0.009 0.036 0.094    0.008 0.043 0.090    0.007 0.051 0.099    0.008 0.043 0.090
250    0.017 0.066 0.112    0.019 0.064 0.121    0.015 0.056 0.105    0.018 0.062 0.114
300    0.014 0.055 0.091    0.014 0.046 0.088    0.014 0.060 0.112    0.015 0.048 0.094
500    0.011 0.051 0.101    0.011 0.051 0.102    0.011 0.049 0.101    0.011 0.051 0.101
Table 9. Results of simulating probability of type I error. Instance θ = (1, 1, 0.1), ρ = 0.1.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.052 0.141 0.210    0.038 0.117 0.179    0.032 0.115 0.198    0.037 0.121 0.201
50     0.079 0.180 0.263    0.053 0.143 0.222    0.057 0.158 0.225    0.060 0.160 0.243
70     0.117 0.235 0.305    0.082 0.192 0.269    0.083 0.181 0.270    0.087 0.199 0.276
100    0.175 0.281 0.334    0.128 0.243 0.306    0.127 0.249 0.305    0.141 0.259 0.309
150    0.217 0.287 0.317    0.166 0.261 0.306    0.168 0.270 0.305    0.188 0.272 0.309
200    0.258 0.290 0.318    0.214 0.276 0.309    0.219 0.277 0.305    0.231 0.277 0.310
Table 10. Results of simulating probability of type I error. Instance θ = (1, 1, 0.2), ρ = 0.2.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.014 0.045 0.081    0.009 0.044 0.068    0.013 0.044 0.081    0.011 0.039 0.076
50     0.009 0.041 0.086    0.010 0.039 0.074    0.013 0.045 0.077    0.009 0.039 0.074
70     0.029 0.063 0.101    0.017 0.061 0.107    0.016 0.057 0.102    0.016 0.062 0.100
100    0.018 0.054 0.083    0.015 0.042 0.085    0.014 0.049 0.087    0.016 0.044 0.082
150    0.031 0.058 0.085    0.021 0.056 0.095    0.020 0.052 0.086    0.021 0.054 0.091
200    0.018 0.049 0.077    0.020 0.046 0.085    0.016 0.043 0.075    0.017 0.046 0.075
Table 11. Results of simulating probability of type I error. Instance θ = (1, 1, 0.8), ρ = 0.8.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.011 0.052 0.114    0.009 0.052 0.100    0.008 0.056 0.112    0.009 0.048 0.113
50     0.012 0.058 0.109    0.012 0.061 0.114    0.014 0.059 0.105    0.008 0.061 0.109
70     0.010 0.055 0.108    0.013 0.550 0.114    0.008 0.050 0.110    0.008 0.053 0.109
100    0.050 0.051 0.122    0.012 0.058 0.118    0.018 0.052 0.112    0.014 0.051 0.119
150    0.006 0.050 0.108    0.008 0.051 0.108    0.008 0.056 0.105    0.008 0.056 0.104
200    0.014 0.041 0.098    0.013 0.043 0.099    0.014 0.045 0.096    0.014 0.045 0.097
Table 12. Results of simulating probability of type I error. Instance θ = (1, 1, 0.9), ρ = 0.9.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.018 0.063 0.114    0.020 0.068 0.121    0.021 0.065 0.120    0.020 0.067 0.122
50     0.009 0.054 0.098    0.011 0.056 0.096    0.011 0.056 0.109    0.008 0.056 0.103
70     0.013 0.059 0.114    0.017 0.062 0.107    0.014 0.055 0.112    0.016 0.059 0.117
100    0.014 0.049 0.107    0.015 0.048 0.108    0.013 0.054 0.109    0.012 0.055 0.109
150    0.014 0.058 0.112    0.016 0.059 0.109    0.012 0.056 0.107    0.013 0.059 0.104
200    0.009 0.042 0.089    0.009 0.039 0.089    0.012 0.045 0.091    0.009 0.041 0.091
Table 13. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.12), ρ ≈ 0.098.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.015 0.091 0.161    0.010 0.072 0.141    0.020 0.080 0.146    0.011 0.086 0.145
50     0.051 0.142 0.219    0.028 0.109 0.175    0.042 0.139 0.219    0.043 0.140 0.211
70     0.069 0.172 0.240    0.035 0.118 0.200    0.052 0.161 0.227    0.059 0.157 0.237
100    0.095 0.183 0.255    0.051 0.135 0.213    0.088 0.184 0.251    0.081 0.177 0.242
150    0.134 0.233 0.279    0.072 0.179 0.260    0.123 0.228 0.281    0.116 0.221 0.278
200    0.192 0.259 0.299    0.129 0.220 0.273    0.169 0.255 0.295    0.168 0.250 0.290
Table 14. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.25), ρ ≈ 0.2.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.020 0.062 0.125    0.015 0.067 0.129    0.015 0.065 0.121    0.013 0.059 0.115
50     0.011 0.054 0.099    0.007 0.051 0.097    0.010 0.043 0.091    0.008 0.048 0.103
70     0.020 0.066 0.113    0.013 0.058 0.116    0.017 0.060 0.097    0.016 0.060 0.114
100    0.024 0.059 0.101    0.015 0.054 0.099    0.029 0.064 0.097    0.020 0.056 0.105
150    0.022 0.062 0.098    0.019 0.052 0.081    0.020 0.062 0.105    0.018 0.056 0.093
200    0.013 0.048 0.087    0.012 0.050 0.090    0.013 0.039 0.089    0.013 0.041 0.092
Table 15. Results of simulating probability of type I error. Instance θ = (1.5, 1, 0.98), ρ ≈ 0.8.

       a1 = a2 = 0          a1 = 0, a2 = 1       a1 = 1, a2 = 0       a1 = a2 = 1
n      1%    5%    10%      1%    5%    10%      1%    5%    10%      1%    5%    10%
30     0.017 0.070 0.138    0.016 0.073 0.130    0.021 0.068 0.129    0.018 0.072 0.127
50     0.015 0.062 0.119    0.016 0.063 0.119    0.014 0.060 0.108    0.013 0.060 0.115
70     0.012 0.043 0.091    0.013 0.043 0.090    0.010 0.049 0.092    0.010 0.049 0.097
100    0.005 0.050 0.102    0.008 0.056 0.107    0.006 0.041 0.102    0.006 0.046 0.098
150    0.009 0.049 0.106    0.011 0.052 0.098    0.012 0.050 0.096    0.009 0.047 0.102
200    0.010 0.047 0.094    0.007 0.041 0.101    0.009 0.048 0.096    0.008 0.043 0.093
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
