1. Introduction
INteger AutoRegressive (INAR) models are widely used for non-negative integer-valued time series. One well-known specification of INAR models assumes equidispersed Poisson model disturbances (hereafter P-INAR). Several contributions have focused on testing for the presence of (possibly unknown) serial dependence in stable P-INAR models, especially because conventional methods for continuous time series may fail.
A first contribution can be found in [1], where a test statistic based on the score function was proposed for the P-INAR(1) model. Score-based statistics were then compared with other proposals (e.g., the runs test and portmanteau-type statistics) in terms of empirical size and power [2], while [3] developed generalized score-based statistics to take into account under- and overdispersion in INAR models. Recently, [4] found that the (conditional) maximum likelihood ratio may be an efficient alternative to the score test in P-INAR(1) models. Under the null hypothesis of no serial dependence, i.e., a thinning parameter equal to zero in the P-INAR(1), these statistics can generally be approximated in large samples by well-known parameter-free distributions, such as the standard normal or the chi-squared.
In practice, asymptotic approximation issues can lead to poor performance in such tests. The simulations in [
2,
4] confirmed that the score-type tests are undersized in case of series of small or moderately small sample sizes (i.e., when
and
T is the sample size), also showing bias from the nominal level with moderately large samples (e.g.,
).
In terms of empirical power, the Monte Carlo results of [
3] evidenced a considerable gap between performance in small samples (e.g.,
) and moderately large samples (e.g.,
) under a set of local alternatives (
. The presence of underdispersion and overdispersion may play a relevant role in this context, and the tests can be more or less sensitive to deviations from the Poisson assumption. To overcome these issues, researchers have employed response surface regression to adjust the critical values of the tests. However, this method relies heavily on arbitrary choices for both the Monte Carlo setup and the model specification.
In this paper, we investigate, through both a simulation study and three empirical applications, whether and how bootstrap methods can improve inference in testing serial dependence for P-INAR models. Bootstrap methods for INAR models were introduced by [5,6] in the context of forecast confidence bounds. More recently, refs. [7,8] developed both parametric and semiparametric bootstrap methods to obtain more reliable inference in point estimation (via bootstrap-based bias correction) and confidence bounds, while [9] extended resampling methods to a more model-based forecasting perspective. From another point of view, ref. [10] proposed a parametric bootstrap procedure to test distributional assumptions for INAR innovations. All these authors pointed out the inconsistency of the conventional time series bootstrap and proposed methods that take the integer-valued nature of the data into account in the resampling scheme.
Starting from [7], we propose a straightforward semiparametric bootstrap that imposes the null hypothesis (of no serial dependence, i.e., a thinning parameter equal to zero) in the bootstrap data-generating process (DGP). The use of “restricted” methods, which appears quite novel in the INAR context, can be found in [11,12] for bootstrapping linear regressions, also with endogenous regressors. To the best of our knowledge, this is the first work proposing a semiparametric bootstrap algorithm to test for the presence of the INAR effect, especially for time series of small or moderately small length. Bootstrap algorithms for score-based statistics have been proposed to solve other econometric issues. For instance, ref. [13] considered bootstrap methods for score-based statistics in the case of instrumental variables with possibly weak instruments.
The remainder of this paper is structured as follows: the P-INAR(1) model and the score-based test statistic based on the Poisson assumption are presented in Section 2. Section 3 introduces the new semiparametric bootstrap algorithm, while the results of Monte Carlo simulations are shown in Section 4. Applications of these methods to real datasets are presented in Section 5, and a general discussion is provided in Section 6. Finally, Section 7 contains some conclusions and possible further developments.
2. The Model and Score Test Statistic
Consider the following stable INAR(
p) model, introduced in [
14,
15], defined as:
where
is an i.i.d. nonnegative integer-valued process with finite mean
and variance
. The processes
denote
p mutually independent binomial thinning operators, representing a stochastic sum of i.i.d. stochastic processes (see [
16] for further details).
This work focuses on testing serial dependence in one-lagged INAR models. Thus, without loss of generality, we consider throughout the rest of the paper the following stable INAR(1) [
15]:
where
is the parameter of interest, also denoted as the thinning parameter. In (
2), the symbol ⊙ is the binomial thinning operator, i.e., a random sum of i.i.d. random variables
, with
, independent of
, such that
and
.
The DGP of the marginal process varies according to the distribution of the innovations. In the case of i.i.d. Poisson innovations, the model is called P-INAR(1), which also implies equidispersion, i.e., an innovation mean equal to the innovation variance.
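To make the data-generating mechanism concrete, the following minimal Python sketch simulates a P-INAR(1) path by applying binomial thinning to the previous count and adding a Poisson innovation. The function names, the parameter names `alpha` (thinning parameter) and `lam` (Poisson mean), and the numerical values are illustrative assumptions and do not come from the paper; the pre-run of 500 observations mirrors the Monte Carlo setup of Section 4.1.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def binomial_thinning(rng, alpha, x):
    """Binomial thinning of x: the sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_pinar1(rng, alpha, lam, T, burn_in=500):
    """Simulate T observations of X_t = thinning(alpha, X_{t-1}) + eps_t."""
    x = poisson_draw(rng, lam)  # arbitrary starting value, discarded with the pre-run
    path = []
    for t in range(burn_in + T):
        x = binomial_thinning(rng, alpha, x) + poisson_draw(rng, lam)
        if t >= burn_in:
            path.append(x)
    return path


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(1)
series = simulate_pinar1(rng, alpha=0.3, lam=2.0, T=200)
```

Setting `alpha=0` removes the thinning term altogether, so the simulated series reduces to i.i.d. Poisson counts, which is the situation described by the null hypothesis of no serial dependence.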
Under such assumptions, parameter estimation can be conventionally carried out through Yule–Walker equations, conditional least squares and conditional maximum likelihood; see, e.g., [
14,
17]. In what follows, we consider the score-based test statistic for serial dependence in the P-INAR model, introduced in [1,2]. To test for the presence of the INAR(1) effect, the following system of hypotheses is considered:
where
, i.e., the parameter of interest, comes from Equation (
2).
The score statistic for testing the P-INAR(1) model, with parameters
, takes the following specification [
2,
3]:
where
. The statistic in (
4) converges in distribution to a standard normal [
1,
3].
3. Bootstrap Algorithm for Testing INAR
In this Section, a new semiparametric bootstrap method for the test statistic in Equation (
4) is presented. We remark that conventional approaches for continuous time series, e.g., the non-parametric block bootstrap [18] and the semiparametric autoregressive bootstrap [19], should not be applied because they do not take into account the true characteristics of integer-valued time series, leading to inconsistent results. The infeasibility of conventional time series methods has also been shown in [7].
We consider a semiparametric bootstrap for its suitability, employing a “restricted” algorithm, i.e., imposing the null hypothesis (a thinning parameter equal to zero) in the bootstrap DGP and obtaining the restricted residuals. This restriction ensures that the residuals have the same support as the innovations’ DGP. In practice, the pseudo-residuals are sampled from the empirical distribution function (EDF) of the restricted residuals (obtained under the null hypothesis).
The following algorithm summarizes the proposed semiparametric method.
Semiparametric Bootstrap Algorithm
Given a random sample of size T,
- Step 1.
Estimate the parameters and the test statistic . Residuals can be obtained imposing , i.e., ;
- Step 2.
Use to obtain bootstrap pseudo-residuals , i.e., ;
- Step 3.
Create , plugging the pseudo-residuals in the bootstrap DGP;
- Step 4.
Compute the bootstrapped score statistic
where
;
- Step 5.
Repeat Steps 2-4 B times, producing ;
- Step 6.
Obtain the bootstrap
p-value as:
Alternatively, the pseudo-residuals of Step 2 can also be obtained with a parametric method, where the bootstrap DGP is constructed under more specific assumptions. Specifically, for the P-INAR(1) model, the pseudo-residuals are sampled from a Poisson distribution with parameter equal to the estimated innovation mean; that is, the bootstrap innovations are assumed to be i.i.d. Poisson. A possible drawback of the parametric method in the P-INAR case is its sensitivity to deviations from the Poisson assumption, especially with regard to the degree of dispersion.
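The steps of the semiparametric algorithm can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the statistic coded in `score_statistic` is one commonly used form of the Poisson score statistic (the scaled lag-one cross-product divided by the sample mean) and is assumed here because Equation (4) is not reproduced in the text; the upper-tail p-value, the default `B=499`, and all function names are likewise assumptions. The sketch uses the fact that, once the thinning parameter is set to zero, each observation coincides with its innovation, so the restricted residuals are the observed counts.

```python
import math
import random


def score_statistic(x):
    """Assumed form: T^(-1/2) * sum_t (x_t - xbar) * (x_{t-1} - xbar) / xbar."""
    T = len(x)
    xbar = sum(x) / T
    if xbar == 0:
        return 0.0  # degenerate all-zero series: nothing to test
    cross = sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, T))
    return cross / (xbar * math.sqrt(T))


def semiparametric_bootstrap_pvalue(x, B=499, seed=0):
    """Restricted semiparametric bootstrap p-value (upper tail, assumed)."""
    rng = random.Random(seed)
    s_obs = score_statistic(x)       # Step 1: statistic on the observed series
    restricted_residuals = list(x)   # Step 1: under the null, X_t equals its innovation
    exceed = 0
    for _ in range(B):               # Step 5: B bootstrap replications
        # Steps 2-3: draw pseudo-residuals from the EDF of the restricted
        # residuals; under the null they are the bootstrap series itself.
        x_star = rng.choices(restricted_residuals, k=len(x))
        # Step 4: bootstrapped score statistic.
        if score_statistic(x_star) >= s_obs:
            exceed += 1
    return exceed / B                # Step 6: share of bootstrap statistics >= observed


# Toy series with obvious positive serial dependence (hypothetical data).
dependent = [1, 1, 1, 1, 5, 5, 5, 5] * 10
p_dependent = semiparametric_bootstrap_pvalue(dependent)
```

A parametric variant would replace the `rng.choices` call with draws from a Poisson distribution whose mean is the sample mean of the series, which is exactly where the sensitivity to the Poisson assumption enters.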
4. Simulation Study
In this Section, a simulation study is performed to assess the proposed methodology via the comparison between the semiparametric bootstrap and the parametric bootstrap, using the asymptotic test as a benchmark.
4.1. Setup
The finite sample behaviour of the bootstrap-based score test, illustrated in the previous Section, was analysed by generating M = 10,000 samples according to the following DGP:
where
considering the following alternative parameter settings:
. Different sample sizes were used for the simulations, such that
, while the considered nominal level for the test was
. To generate Monte Carlo samples, a pre-run of 500 observations was carried out. The empirical size of bootstrap-based statistic
was evaluated under
, and an increasing sequence of
by 0.05 (starting from
) was considered for the empirical power, stopping at
to avoid the near-unit root situation [
20]. The number of replications used to compute the bootstrap
p-values was set equal to
. We also computed empirical rejection frequencies for the parametric bootstrap illustrated in
Section 3 and for the asymptotic score test. Performance was evaluated through both the empirical size and the empirical power.
Finally, computational times were evaluated to show the straightforward applicability of the proposed bootstrap test. Time series of lengths ranging from 50 to 500, increasing by 50, were considered.
Deviations from Poisson Assumptions
We first evaluated the presence of overdispersion in the innovation process. In this regard, the innovations of the simulated DGP follow a negative binomial distribution, i.e.,
∼
. Given the (Fisher) index of dispersion, defined as the ratio between variance and mean of the series,
, we considered the following three cases inspired by the design of [
2]:
Small overdispersion, considering such that ;
Moderate overdispersion, with resulting in ;
High overdispersion, with and .
In all three cases, the expected value of the innovations is equal to 2.
Then, we considered three cases of underdispersion using a binomial distribution for the innovations, with the following three parametrisations:
Small underdispersion, considering such that ;
Moderate underdispersion, with and ;
High underdispersion, with and .
The number of Monte Carlo simulations and bootstrap iterations were equal to those considered for the Poisson-based DGP.
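The link between the distribution parameters and the index of dispersion can be illustrated with a short Python sketch. It assumes the negative binomial is parametrised as the number of failures before the r-th success with success probability p, so that the variance-to-mean ratio equals 1/p; for a binomial with n trials and success probability p the ratio equals 1 - p. The function names and the target index of 1.5 are hypothetical; only the innovation mean of 2 is taken from the design described above.

```python
def negative_binomial_params(mean, dispersion_index):
    """Return (r, p) of a negative binomial with the given mean and variance/mean ratio."""
    if dispersion_index <= 1:
        raise ValueError("overdispersion requires an index of dispersion above 1")
    p = 1.0 / dispersion_index            # variance / mean = 1 / p
    r = mean / (dispersion_index - 1.0)   # mean = r * (1 - p) / p
    return r, p


def binomial_dispersion_index(n, p):
    """Variance/mean ratio of a binomial: n*p*(1 - p) / (n*p) = 1 - p."""
    return 1.0 - p


def index_of_dispersion(x):
    """Sample (Fisher) index of dispersion: sample variance over sample mean."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    return variance / mean


# Hypothetical target: innovation mean 2 with an index of dispersion of 1.5.
r, p = negative_binomial_params(mean=2.0, dispersion_index=1.5)
```

With these values, r = 4 and p = 2/3, which gives a mean of 2 and a variance of 3.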
4.2. Main Results
We start from the Poisson case (equidispersion).
Table 1 summarizes the main results in terms of empirical size.
Even in the equidispersed case, the asymptotic rejection frequencies can be well below the nominal level, especially with series of moderately small length (i.e., and ). Nevertheless, the rejection frequencies obtained through the semiparametric bootstrap (hereafter SPB) show that the proposed method works well even for series of moderately small length. The good performance of the parametric bootstrap (hereafter PB), which outperforms the SPB in some simulation scenarios, may be due to the combination of (a) the imposition of the true DGP in the simulation setup and (b) the use of a score statistic that is specifically suited to equidispersed Poisson arrivals.
Figure 1 shows the performance of the bootstrap test in terms of empirical power, considering 15 different scenarios. The overall performances of SPB and PB are comparable with respect to the asymptotic test. Although the two bootstraps exhibit a conservative trend, especially with
when
, and
when
, the SPB outperforms the PB in all considered scenarios, especially when
and with moderately small
T (
). In addition, the PB outperforms the asymptotic test in the case of moderately small series (
) and for a reasonably large
(i.e.,
).
The considered tests do not appear to be particularly sensitive with respect to different choices of the parameter. To conclude, in the case of Poisson innovations, both SPB and PB are reasonable choices to improve inference in testing for the presence of INAR(1).
Figure 2 illustrates the computational cost in terms of the median over the Monte Carlo replications, together with the 95% quantile intervals. To summarize, the computational cost appears very satisfactory, ranging from 5 to 30 ms, and the semiparametric bootstrap is faster than the parametric one. The gap between the two methods grows as the sample size increases.
In the cases of DGPs deviating from Poisson assumption, the results of the tests show substantial differences. Empirical sizes of SPB and PB in the case of overdispersion are depicted in
Table 2 along with the asymptotic size. Even with a low value of overdispersion (
), the PB performs worse than the asymptotic test, exhibiting rejection frequencies that are double the considered nominal value. Furthermore, in the case of either a moderate or high degree of overdispersion (
), both the PB and the asymptotic test are severely oversized, appearing totally unreliable. Indeed, they exhibit an increasing trend of empirical sizes as the sample length of the series increases. Surprisingly, the SPB performs well throughout the three considered scenarios, especially when
T is sufficiently large.
Regarding the empirical power, illustrated in
Figure 3, when the overdispersion is low (
), the three tests show similar behaviour as the INAR parameter
increases. When
T is quite large (e.g.,
), the SPB, the PB, and the asymptotic test rapidly reach the unity for
. However, severe overdispersion (
) leads to a similar behaviour for the PB and the asymptotic test, producing unreliable over-rejections even for small values of
. Conversely, the SPB, which is generally dominated by both the PB and the asymptotic test, behaves comparably to the case of equidispersed Poisson innovations.
Considering underdispersed innovations (e.g., when they follow a binomial distribution), the PB and the asymptotic tests appear useless once again.
Table 3 illustrates how the PB and the asymptotic test are both quite undersized even with slight underdispersion (
. Therefore, when
and
, the rejection frequencies are practically equal to 0 for each considered
T. As in the case of overdispersed innovations, rejection frequencies of SPB are distributed around the nominal level of 0.05.
The empirical power in the case of binomial distribution of the innovations is summarized in
Figure 4. Considering slight underdispersion (
, the SPB, the PB, and the asymptotic test share a similar behaviour: when
, the rejection frequencies practically overlap. However, when the underdispersion is moderate or severe (
, the PB and the asymptotic test suffer from under-rejection, as already seen for the empirical size. Here, the PB seems to perform worse than the asymptotic test, while the SPB confirms its apparent insensitivity to deviations from equidispersion. In addition, the SPB is more powerful than both the PB and the asymptotic test.
6. Discussion
The proposed semiparametric bootstrap helps to improve the performance of the score-based statistic for the P-INAR model in terms of empirical size, also for series of moderately small length. Under the i.i.d. Poisson assumption for the innovations, the parametric bootstrap also exhibits excellent results due to the specific features of the simulation setup, while the satisfactory performance of the semiparametric method suggests its usefulness, especially in a more general context (e.g., under several possible distributions for the innovations). In terms of empirical power, the semiparametric bootstrap generally dominates the parametric one.
An analysis of the asymptotic theory will be carried out in further studies. For now, under i.i.d. Poisson disturbances, numerical exercises suggest that
may converge to a N(0, 1) in the bootstrap sense (i.e., conditionally on the data), which is also the limit distribution of the score-based statistic
[
1,
2,
3].
Table 5 shows the averaged estimated moments of
computed using a
bootstrap iterations and 10,000 Monte Carlo replications in the case of
and
, with a series of length
. The Jarque–Bera test is also used to check normality of
. The presented exercise shows how the averaged estimated moments of
are reasonably close to the moments of a standard Gaussian distribution, while the rejection frequencies of the Jarque–Bera test on the two statistics
slightly exceed the nominal value used for the normality test (0.05).
Moreover, previous simulation studies show that the
statistic can fail in case of different parametric arrivals [
2,
3]. This is confirmed by the simulations carried out in
Section 4, while Figure A1 in Appendix A shows how
is sensitive to both the degree and the type of dispersion. For instance, the (simulated) distribution of
under the null hypothesis is flatter under moderate overdispersion (
), and then it less rejects
. Under these situations, the parametric bootstrap fails since the degree of dispersion is not included in the bootstrap DGP. Thus, simulations suggest that the distribution of
in the case of the parametric bootstrap converges (conditionally on the data and under the null hypothesis) to a standard Gaussian distribution, even when
or
. Conversely, the semiparametric algorithm is able to include the level of dispersion in the bootstrap DGP. Thus, numerical exercises employing two-sample Kolmogorov–Smirnov test show that
reasonably mimics the asymptotic distribution of
under the null hypothesis for any (finite) value of
.
These arguments can be strengthened by looking at the distribution of the bootstrap
p-values. Indeed, conventional bootstrap validity can also be checked by verifying whether the bootstrap p-values are (asymptotically) uniformly distributed between 0 and 1 (see, e.g., [29]).
Figure 11 presents a comparison between (simulated) asymptotic and bootstrap
p-values in the case of i.i.d. Poisson innovations and
. For both algorithms, the bootstrap
p-values are close to the 45-degree line, suggesting that they are (asymptotically) uniformly distributed. Moreover, the other two subsequent figures illustrate the simulated distributions of bootstrap
p-values in case of deviations from Poisson assumptions under
. In the case of moderate overdispersion (
Figure 12), i.e.,
∼
and
, the parametric bootstrap
p-values are systematically lower than the asymptotic ones. In addition, numerical evidence shows that they are not uniformly distributed, e.g., the mean of the
p-values is not close to the expected value (i.e., 0.5), and the one sample Kolmogorov–Smirnov test rejects the null hypothesis of uniform distribution between 0 and 1. On the other hand,
p-values obtained through semiparametric bootstrap are distributed around the 45 degree line, and numerical evidence shows that they are uniformly distributed (estimated mean is close to 0.5, and the Kolmogorov–Smirnov test does not reject the null hypothesis). In the case of underdispersed innovations, i.e.,
∼
, an opposing behaviour can be observed (
Figure 13). The parametric bootstrap
p-values are always greater than the asymptotic ones and are not uniformly distributed, while the semiparametric
p-values are, again, uniformly distributed and close to the 45 degree line.
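The uniformity check on the bootstrap p-values described above can be sketched as follows. This minimal Python example computes the one-sample Kolmogorov–Smirnov distance between a set of p-values and the uniform distribution on [0, 1], and compares it with the usual large-sample 5% critical value of approximately 1.358 divided by the square root of the number of p-values. The function names and the toy inputs are assumptions made for illustration; the p-values used in the paper are not reproduced here.

```python
import math


def ks_uniform_statistic(pvalues):
    """One-sample Kolmogorov-Smirnov distance from the Uniform(0, 1) CDF."""
    u = sorted(pvalues)
    n = len(u)
    # Compare the empirical CDF just after and just before each ordered value.
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(u))


def looks_uniform(pvalues, critical_coefficient=1.358):
    """True when the KS distance is below the large-sample 5% critical value."""
    n = len(pvalues)
    return ks_uniform_statistic(pvalues) <= critical_coefficient / math.sqrt(n)


# Toy inputs (hypothetical): an evenly spread set and a set piled up near zero.
even_pvalues = [(i + 0.5) / 100 for i in range(100)]
skewed_pvalues = [v ** 2 for v in even_pvalues]
```

Because bootstrap p-values computed from B replications can only take values on a grid of width 1/B, the large-sample critical value should be read as a rough guide rather than an exact threshold.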
A final consideration concerns the generation of underdispersed innovations. We remark that the results for the INAR(1) with binomial innovations (both in terms of empirical size and power) may be partially influenced by the intrinsic characteristics of the series, which involve counts that can take only a few distinct values, especially for small values of the thinning parameter
. For this reason, the performance of the semiparametric bootstrap was also checked using the Good distribution (see, e.g., [25,30]), also known as the polylogarithmic distribution, which is more appropriate for modelling underdispersed counts. Details of the DGP used and the results of the simulation study are presented in Appendix B.
7. Concluding Remarks
The score-based statistic, formalized in [2,3], is a reasonable way to test for the presence of serial dependence in integer-valued time series. In the case of Poisson innovations (P-INAR model), a semiparametric bootstrap algorithm can represent a straightforward solution to improve the performance of the test in terms of empirical size, especially with series of short (or moderately short) length. The method also performs well in terms of empirical power, especially for a combination of reasonably large values of the time persistence parameter and of the sample size. The parametric bootstrap is also a possible competitor.
With non-equidispersed innovations, both the asymptotic test and the parametric bootstrap appear practically useless. Conversely, simulations and numerical exercises suggest that the semiparametric algorithm may be able to “restore” inference in the case of either overdispersion or underdispersion.
Further research will address the asymptotic theory, investigating the theoretical behaviour of the bootstrap-based score statistic (
) under both parametric and semiparametric approaches. The validity of semiparametric bootstrap in the case of dispersed innovations will be proved through a broader concept of validity occurring in the case of randomness of limit bootstrap measures [
29]. In addition, the proposed bootstrap algorithm can be extended to more generalized versions of the score statistic [
3], also considering other possible sources of misspecification (e.g., zero inflation) arising in discrete time series. The applicability of the score-based bootstrap test should also be investigated through the analysis of real integer-valued time series in many fields, such as finance, healthcare, and the environment.