
A Two-Step Estimation Method for a Time-Varying INAR Model

Yuxin Pang, Dehui Wang and Mark Goh

1 School of Mathematics, Jilin University, Changchun 130012, China
2 NUS Business School & The Logistics Institute-Asia Pacific, National University of Singapore, Singapore 119613, Singapore
3 School of Mathematics and Statistics, Liaoning University, Shenyang 110031, China
* Author to whom correspondence should be addressed.
Axioms 2024, 13(1), 19; https://doi.org/10.3390/axioms13010019
Submission received: 28 November 2023 / Revised: 21 December 2023 / Accepted: 22 December 2023 / Published: 27 December 2023
(This article belongs to the Special Issue Time Series Analysis: Research on Data Modeling Methods)

Abstract

This paper proposes a new time-varying integer-valued autoregressive (TV-INAR) model with a state vector following a logistic regression structure. Since the autoregressive coefficient in the model is time-dependent, the Kalman smoothing method is applicable. Some statistical properties of the model are established. To estimate the parameters of the model, a two-step estimation method is proposed. In the first step, Kalman smoothing, which is suited to handling time-dependent systems and nonstationary stochastic processes, is utilized to estimate the time-varying parameter. In the second step, conditional least squares is used to estimate the parameter in the error term. The proposed method makes it possible to estimate the parameters of this nonlinear model and to derive their analytical solutions. The performance of the estimation method is evaluated through simulation studies. The model is then validated using actual time series data.

1. Introduction

Time series models with an integer-valued structure are prevalent in fields such as economics, insurance, medicine and crime. Recent reviews of count time series models based on thinning operators, including modeling and numerous examples, can be found in refs. [1,2,3]. The most classic is the first-order integer-valued autoregressive INAR(1) model, introduced by ref. [4] and ref. [5] using the properties of the binomial thinning operator (ref. [6]). The class of INAR models is typically based on the assumption that the observations follow a Poisson distribution, which facilitates subsequent computations due to the equidispersion property of the Poisson distribution (mean and variance being equal). Parameter estimation for these models is usually achieved by Yule–Walker, conditional least squares and conditional maximum likelihood methods, as discussed in refs. [7,8,9], among others. Recently, there has been growing interest in the autoregressive coefficients of these models. Generally, the autoregressive coefficients can be treated as random variables, governed, for example, by a stationary process or by a specified mean and variance; see refs. [10,11,12]. Unlike the above models, this paper proposes a time-varying integer-valued autoregressive (TV-INAR) model in which the state equation takes a nonstationary form.
In our work, the model is characterized by a state equation and time-varying parameters. The concept of state-space time series analysis was first introduced by ref. [13] in the field of engineering. Over time, the term "state space" became entrenched in statistics and econometrics, as the state-space model provides an effective way to deal with a wide range of problems in time series analysis. Ref. [14] presented a comprehensive treatment of the state-space approach to time series analysis. Recently, some progress has been made on the use of state space in integer autoregressive and count time series models. Ref. [15] proposed a first-order random coefficient integer-valued autoregressive model by introducing a Markov chain with a finite state space and derived some statistical properties of the model in a random environment. Ref. [16] introduced a parameter-driven state-space model to analyze integer-valued time series data. To accommodate the features of overdispersion, zero inflation and temporal correlation in count time series, ref. [17] proposed a flexible class of dynamic models in the state-space framework. As noted in ref. [18], the popularity of these models stems in large part from the development of the Kalman recursion, providing a quick updating scheme for predicting, filtering and smoothing a time series. The traditional understanding of Kalman filtering can be found in the works of refs. [14,19,20]. Nevertheless, most of this research has focused on continuous time series models in an economic context, such as the analysis of stocks, trade and currency. For example, ref. [21] proposed a class of time-varying parameter autoregressive models and proved the equivalence of the Kalman-smoothed estimate and the generalized least squares estimate. Following this lead, ref. [22] developed a trade–growth relationship model with time-varying parameters and estimated the transition of the changing parameters with a Kalman filter. Both studies involved Gaussian and linear settings. Ref. [14] attested that the results obtained by Kalman smoothing remain equivalent when the model is non-Gaussian or nonlinear. Therefore, integer-valued time series models in the non-Gaussian case are also worth investigating.
Time series models with a state-space structure and time-varying parameters are common in economics. Many macroeconomists believe that time-varying parameter models can better predict and adapt to the data than fixed-parameter models. Early research mostly focused on time-varying continuous time series models, such as autoregressive (AR) models, vector autoregressive (VAR) models and autoregressive moving average (ARMA) models (see refs. [21,23,24]). In recent years, research on INAR models with time-varying parameters has attracted more attention and has been applied to natural disasters and medical treatment. Ref. [25] introduced a multivariate integer-valued autoregressive model of order one with periodic time-varying parameters and adopted a composite likelihood-based approach. Ref. [26] considered a change-point analysis of count time series data through a Poisson INAR(1) process with time-varying smoothing covariates. However, the time-varying parameters in the above models are not governed by a state equation. Additionally, time-varying parameter models are difficult to handle when there are unobserved variables that need to be estimated, and only a few effective methods are available. Ref. [27] proposed a Bayesian estimation method for time-varying parameters and claimed that the Bayesian method is superior to the maximum likelihood method. Both the Bayesian and maximum likelihood methods require Kalman filtering to estimate state vectors that contain time-varying parameters, and the Kalman filter becomes applicable once the model is put into state-space form. It is therefore worth exploring new methods for dealing with TV-INAR models.
Motivated by ref. [21], a new TV-INAR model with a state equation is presented in this paper. Since the model is in a state-space form, the Kalman smoothing method uses only known observation variables to estimate the parameters. Unlike traditional INAR models, which are limited to modeling stationary time series data, our TV-INAR model is equipped to handle nonstationary structures and time-varying dynamics, resulting in improved model fit and more accurate predictions. The Kalman-smoothed estimates of the time-varying unobserved state variables are derived. Furthermore, the mean of the Poisson error term is estimated through the estimates obtained in the previous step and conditional least squares methods.
The rest of this paper is organized as follows. A new TV-INAR model is presented and its basic properties are established in Section 2. In Section 3, the Kalman smoother is utilized to derive an estimate of the time-varying parameter; then, incorporating the conditional least squares (CLS) method, an estimate of the mean of the Poisson error term is established. Numerical simulations and results are discussed in Section 4. In Section 5, the proposed model is applied to an offense data set from Ballina, NSW, Australia. The conclusion is given in Section 6.

2. Poisson TV-INAR Model

In this section, the INAR(1) model is reviewed. A new TV-INAR model incorporating time-varying and nonstationary features is introduced. Then, some basic statistical properties are derived.
Suppose Y is a non-negative integer-valued random variable and $\beta \in (0,1)$. The binomial thinning operator $\circ$, introduced in ref. [6], is defined as $\beta \circ Y = \sum_{i=1}^{Y} B_i$, where $\{B_i\}$ is a sequence of independent and identically distributed (i.i.d.) Bernoulli random variables, independent of Y, satisfying $P(B_i = 1) = 1 - P(B_i = 0) = \beta$. Then, the INAR(1) model is given by

$$Y_t = \beta \circ Y_{t-1} + \varepsilon_t, \qquad t = 1, 2, \ldots, \tag{1}$$

where $\{\varepsilon_t\}$ is a sequence of uncorrelated non-negative integer-valued random variables with mean $\mu_\varepsilon$ and finite variance $\sigma_\varepsilon^2$, and $\varepsilon_t$ is independent of all $\{Y_{t-i}\}$, $i = 1, 2, \ldots$.

2.1. Definition of TV-INAR Process

It is very common to extend the autoregressive coefficient to a random parameter in time series models. However, it differs from the time-varying parameter introduced in this paper. In random coefficient (RC) models, the variable is usually assigned a definite distribution or given its expectation and variance. Although it is a random variable, its expectation and variance do not change with time. In contrast, in time-varying parameter models, the parameter does not have a fixed distribution, and its expectation and variance often change over time. This is also one of the challenges of such models compared with the ordinary RC models. Thus, based on the above INAR(1) process, we define the time-varying integer-valued autoregressive (TV-INAR) process as follows.
Definition 1.
Let $\{y_t\}_{t \in \mathbb{N}_0}$ be an integer-valued autoregressive process. It is a time-varying integer-valued autoregressive model if

$$y_t = g(\alpha_t) \circ Z_t + \varepsilon_t, \qquad \alpha_t = \alpha_{t-1} + \eta_t, \tag{2}$$

where $g(\cdot) \in (0,1)$ is a function of $\alpha_t$; $\{\varepsilon_t\}$ is a sequence of i.i.d. Poisson-distributed random variables with mean λ; $\{\eta_t\}$ is a sequence of i.i.d. standard normally distributed random variables; and $\varepsilon_t$ is independent of $Z_t$ and $\eta_t$ for $t \geq 1$.

In model (2), $y_t$ and $Z_t$ are observation variables, and $\alpha_t$ is an unobserved variable, often called a time-varying parameter. Our model generates the class of TV-INAR models. For example, $Z_t = y_{t-1}$ yields a TV-INAR(1) model. Model (2) becomes a TV-INAR(p) model when $Z_t = (y_{t-1}, \ldots, y_{t-p})$ and the autoregressive coefficient expands to $(g(\alpha_t), \ldots, g(\alpha_{t-p+1}))$. Furthermore, $Z_t$ can also be considered as a covariate of $y_t$.
In this paper, we focus on the case $g(\alpha_t) = \frac{e^{\alpha_t}}{1 + e^{\alpha_t}}$. The idea of this model is that the development of the system over time is determined by $\alpha_t$ in the second equation of (2). However, since $\alpha_t$ cannot be observed directly, we must analyze it through the observations $y_t$. The first equation of (2) is called the observation equation; the second is called the state equation [14]. Because the state equation is nonstationary, we assume initially that $\alpha_0 \sim N(\mu_0, \sigma_0^2)$, where $\mu_0$ and $\sigma_0^2$ are known. The variances of the error terms $\varepsilon_t$ and $\eta_t$ in (2) are time-invariant.

Studies on the class of TV-AR models have been extensive, especially in economics. Such models are flexible enough to capture the complexity of macroeconomic systems and fit the data better than models with constant parameters. Retaining the advantages of those models, the proposed TV-INAR model can deal with the integer-valued discrete data that arise in real life. Note that $\alpha_t$ is always nonstationary, since it follows a random walk; this justifies the form of the state equation above. Moreover, the binomial thinning operator $\circ$ acts through a success probability, so $g(\alpha_t)$ is given the logistic form $\frac{e^{\alpha_t}}{1 + e^{\alpha_t}}$ to ensure that this probability lies in $(0,1)$.
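To make the data-generating mechanism concrete, the following is a minimal simulation sketch of the TV-INAR(1) case ($Z_t = y_{t-1}$). The paper's experiments use Matlab; this sketch is in Python, and the function name, the seeding, and the Poisson draw used to initialize $y_0$ are our own choices, not taken from the paper.

```python
import numpy as np
from scipy.special import expit

def simulate_tv_inar1(T, lam, alpha0=0.0, seed=None):
    """Simulate the TV-INAR(1) model: y_t = g(alpha_t) o y_{t-1} + eps_t,
    alpha_t = alpha_{t-1} + eta_t, where g is the logistic function,
    eps_t ~ Poisson(lam) and eta_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    alpha = alpha0 + np.cumsum(rng.standard_normal(T))   # random-walk state
    beta = expit(alpha)                                  # g(alpha_t) in (0, 1)
    y = np.empty(T + 1, dtype=np.int64)
    y[0] = rng.poisson(lam)                              # arbitrary integer start value
    for t in range(T):
        # binomial thinning: g(alpha_t) o y_{t-1} ~ Binomial(y_{t-1}, beta_t)
        y[t + 1] = rng.binomial(y[t], beta[t]) + rng.poisson(lam)
    return y[1:], alpha

y, alpha = simulate_tv_inar1(T=200, lam=0.8, seed=1)
```

Using `expit` for the logistic map avoids overflow of $e^{\alpha_t}$ when the random-walk state wanders far from zero.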

2.2. Statistical Properties

To better understand the model and to apply it directly in parameter estimation, some statistical properties of the model are provided. The conditional mean, second-order conditional raw moment and conditional variance of the time-varying integer-valued autoregressive process are given in the following proposition; the corresponding results for the TV-INAR(1) case, including the unconditional mean and variance, are given in the corollary.
Proposition 1.
Suppose $\{Y_t\}$ is a process defined by (2), $\mathcal{F}_{Z,t}$ is the σ-field generated by $\{Z_1, \ldots, Z_t, \eta_1, \ldots, \eta_t, \alpha_0\}$, and write $\beta_t = \frac{e^{\alpha_t}}{1+e^{\alpha_t}}$. Then, for $t \geq 1$, we have

(1) $E[Y_t \mid \mathcal{F}_{Z,t}] = \beta_t Z_t + \mu_\varepsilon = \beta_t Z_t + \lambda$;

(2)
$$E[Y_t^2 \mid \mathcal{F}_{Z,t}] = E[(\beta_t \circ Z_t)^2 \mid \mathcal{F}_{Z,t}] + E[\varepsilon_t^2 \mid \mathcal{F}_{Z,t}] + 2\,E[\beta_t \circ Z_t \mid \mathcal{F}_{Z,t}]\,E[\varepsilon_t \mid \mathcal{F}_{Z,t}]$$
$$= \beta_t(1-\beta_t)Z_t + (\beta_t Z_t)^2 + \lambda + \lambda^2 + 2\beta_t Z_t \lambda = \beta_t Z_t\left(1 - \beta_t + \beta_t Z_t + 2\lambda\right) + \lambda^2 + \lambda;$$

(3)
$$Var[Y_t \mid \mathcal{F}_{Z,t}] = E[Y_t^2 \mid \mathcal{F}_{Z,t}] - E^2[Y_t \mid \mathcal{F}_{Z,t}] = \beta_t(1-\beta_t)Z_t + \lambda.$$
Corollary 1.
Suppose $\{Y_t\}$ satisfies the TV-INAR(1) model, i.e., $Z_t = Y_{t-1}$, and $\mathcal{F}_{Y,t-1} = \sigma(Y_1, \ldots, Y_{t-1}, \eta_1, \ldots, \eta_t, \alpha_0)$. Then, for $t \geq 1$,

(1) $E[Y_t \mid \mathcal{F}_{Y,t-1}] = \beta_t Y_{t-1} + \lambda$;

(2) $E[Y_t] = E[E(Y_t \mid \mathcal{F}_{Y,t-1})] = \dfrac{\lambda}{1 - \beta_t}$;

(3) $Var[Y_t \mid \mathcal{F}_{Y,t-1}] = \beta_t(1-\beta_t)Y_{t-1} + \lambda$;

(4) $Var[Y_t] = \dfrac{\beta_t \lambda + \lambda}{1 - \beta_t^2} = \lambda(1 + e^{\alpha_t})$.

Clearly, when $Z_t = Y_{t-1}$, $\{Y_t, \alpha_t\}_{t \in \mathbb{N}_0}$ is a bivariate Markov chain with the following transition probabilities:

$$P(Y_t = j, \alpha_t = n \mid Y_{t-1} = i, \alpha_{t-1} = m) = P\left(\frac{e^{\alpha_t}}{1+e^{\alpha_t}} \circ Y_{t-1} + \varepsilon_t = j,\; \alpha_{t-1} + \eta_t = n \;\Big|\; Y_{t-1} = i, \alpha_{t-1} = m\right)$$
$$= \sum_{l=0}^{\min(i,j)} P\left(\frac{e^{\alpha_t}}{1+e^{\alpha_t}} \circ Y_{t-1} = l \;\Big|\; Y_{t-1} = i, \alpha_t = n\right) P(\varepsilon_t = j - l)\, P(\eta_t = n - m)$$
$$= \sum_{l=0}^{\min(i,j)} \binom{i}{l} \left(\frac{e^n}{1+e^n}\right)^l \left(\frac{1}{1+e^n}\right)^{i-l} \frac{\lambda^{j-l} e^{-\lambda}}{(j-l)!} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(n-m)^2}{2}},$$

where the factor in n is understood as a transition density, since $\eta_t$ is continuous.
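As an illustration, the transition kernel above can be evaluated directly. The following Python sketch (our own helper, not from the paper) returns the probability in the count coordinate multiplied by the normal density in the state coordinate:

```python
import numpy as np
from scipy.stats import binom, norm, poisson

def transition_density(j, n, i, m, lam):
    """P(Y_t = j, alpha_t near n | Y_{t-1} = i, alpha_{t-1} = m): a probability
    in the count coordinate times a N(m, 1) density in the state coordinate."""
    beta = 1.0 / (1.0 + np.exp(-n))              # g(n) = e^n / (1 + e^n)
    l = np.arange(min(i, j) + 1)                  # survivors of the thinning
    count_part = np.sum(binom.pmf(l, i, beta) * poisson.pmf(j - l, lam))
    return count_part * norm.pdf(n, loc=m)        # eta_t ~ N(0, 1)
```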

3. Estimation Procedure

There are two parameters of interest in the model: the time-varying parameter $\alpha_t$ and the mean λ of $\varepsilon_t$. The main goal of this section is to estimate these two parameters using a new two-step estimation method. The first step gives the estimate $\hat\alpha_t$ via the Kalman-smoothed state. In the second step, the estimate $\hat\lambda$ is obtained from $\hat\alpha_t$ and the conditional least squares (CLS) method.

3.1. Kalman-Smoothed Estimate of State Vector α

In this subsection, the Kalman-smoothed method is used to estimate the time-varying parameter. In fact, the essence of the Kalman filter is the minimum variance linear unbiased estimation (MVLUE). From ref. [14], we know that when the model is nonlinear and non-Gaussian, the results obtained from the standpoint of MVLUE are equivalent to the linear and Gaussian cases. To find the Kalman-smoothed estimate of the unobserved state vector, we employ the matrix formulation of Equation (2) following ref. [14]. The required moments are calculated and the analytic expression of this estimate is given at the end.
For convenience, set $\beta_t = \frac{e^{\alpha_t}}{1+e^{\alpha_t}}$. Then $\alpha_t = \log\frac{\beta_t}{1-\beta_t} \equiv \operatorname{logit}(\beta_t)$ is a logit transformation. For $t = 1, \ldots, T$, Equation (2) can be written in matrix form, similarly to ref. [21]:

$$\mathbf{Y} = \boldsymbol{\beta} \circ \mathbf{Z} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\alpha} = (\boldsymbol{\mu} + \boldsymbol{\eta})\,C, \tag{3}$$

where

$$\mathbf{Y} = (y_1, \ldots, y_T), \quad \boldsymbol{\beta} = (\beta_1, \ldots, \beta_T), \quad \mathbf{Z} = \begin{pmatrix} Z_1 & & 0 \\ & \ddots & \\ 0 & & Z_T \end{pmatrix}, \quad \boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_T),$$
$$\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_T), \quad \boldsymbol{\mu} = (\mu_0, 0, \ldots, 0), \quad \boldsymbol{\eta} = (\eta_1, \ldots, \eta_T), \quad C = \begin{pmatrix} 1 & \cdots & 1 \\ & \ddots & \vdots \\ 0 & & 1 \end{pmatrix}.$$
Now, consider the problem of estimating the parameter. The following theorem adapted from ref. [14] provides the Kalman-smoothed estimate of the state vector α in model (3).
Theorem 1.
The Kalman-smoothed estimate of $\boldsymbol{\alpha}$ in model (3) is given by the conditional expectation of $\boldsymbol{\alpha}$ given all the observations $y_t$:

$$\hat{\boldsymbol{\alpha}} = E[\boldsymbol{\alpha} \mid \mathbf{Y}] = E(\boldsymbol{\alpha}) + (\mathbf{Y} - E(\mathbf{Y}))\, Var(\mathbf{Y})^{-1}\, Cov(\mathbf{Y}, \boldsymbol{\alpha}). \tag{4}$$
Proof. 
Let $\hat{\boldsymbol{\alpha}} = \mathbf{Y}A + B$, where $\mathbf{Y}$ and $B$ are $1 \times T$ vectors and $A$ is a $T \times T$ matrix. Denote

$$I = E[(\hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha})(\hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha})^\top] = E[(\mathbf{Y}A + B - \boldsymbol{\alpha})(\mathbf{Y}A + B - \boldsymbol{\alpha})^\top] = \operatorname{tr} E[(\mathbf{Y}A + B - \boldsymbol{\alpha})^\top(\mathbf{Y}A + B - \boldsymbol{\alpha})].$$

This is the error covariance of the state vector estimate, regarded as a function of A and B. To minimize I, we require

$$\frac{\partial I}{\partial A} = 2E[\mathbf{Y}^\top(\mathbf{Y}A + B - \boldsymbol{\alpha})] = 0, \qquad \frac{\partial I}{\partial B} = 2E[\mathbf{Y}A + B - \boldsymbol{\alpha}] = 0.$$

We obtain

$$A = \left(E[\mathbf{Y}^\top\mathbf{Y}] - E[\mathbf{Y}^\top]E[\mathbf{Y}]\right)^{-1}\left(E[\mathbf{Y}^\top\boldsymbol{\alpha}] - E[\mathbf{Y}^\top]E[\boldsymbol{\alpha}]\right) = Var(\mathbf{Y})^{-1}\, Cov(\mathbf{Y}, \boldsymbol{\alpha}),$$
$$B = E(\boldsymbol{\alpha}) - E(\mathbf{Y})A = E(\boldsymbol{\alpha}) - E(\mathbf{Y})\, Var(\mathbf{Y})^{-1}\, Cov(\mathbf{Y}, \boldsymbol{\alpha}).$$

This proves that $\hat{\boldsymbol{\alpha}} = \mathbf{Y}\, Var(\mathbf{Y})^{-1}\, Cov(\mathbf{Y}, \boldsymbol{\alpha}) + E(\boldsymbol{\alpha}) - E(\mathbf{Y})\, Var(\mathbf{Y})^{-1}\, Cov(\mathbf{Y}, \boldsymbol{\alpha})$. □
Theorem 1 involves some numerical characteristics of the random vectors. Following model (3), we denote $\Lambda := E(\boldsymbol{\varepsilon}) = (\lambda, \ldots, \lambda)$. Then, the means of $\boldsymbol{\alpha}$ and $\mathbf{Y}$, the variance of $\mathbf{Y}$ and the covariance between $\mathbf{Y}$ and $\boldsymbol{\alpha}$ are given in the following proposition.
Proposition 2.
Suppose $\{\mathbf{Y}\}$ is a TV-INAR process defined by (3). Then, for any time T,

(i) $E(\boldsymbol{\alpha}) = \boldsymbol{\mu} C$;

(ii) $E(\mathbf{Y}) = E(\boldsymbol{\beta})\mathbf{Z} + \Lambda$;

(iii) $Var(\mathbf{Y}) = \mathbf{Z}\, Var(\boldsymbol{\beta})\, \mathbf{Z} + E[\operatorname{diag}(1-\beta_1, \ldots, 1-\beta_T)\operatorname{diag}(\beta_1, \ldots, \beta_T)]\,\mathbf{Z} + \operatorname{diag}(\lambda, \ldots, \lambda)$;

(iv) $Cov(\mathbf{Y}, \boldsymbol{\alpha}) = \mathbf{Z}\, E(\boldsymbol{\beta}^\top\boldsymbol{\alpha}) - \mathbf{Z}\, E(\boldsymbol{\beta}^\top)\,\boldsymbol{\mu} C$.
The proofs of Proposition 2 are given in Appendix A. From Proposition 2, the Kalman-smoothed estimate of the time-varying parameter is

$$\hat{\boldsymbol{\alpha}} = E[\boldsymbol{\alpha} \mid \mathbf{Y}] = \boldsymbol{\mu} C + \left[\mathbf{Y} - E(\boldsymbol{\beta})\mathbf{Z} - \Lambda\right] \Big\{ \mathbf{Z}\, Var(\boldsymbol{\beta})\, \mathbf{Z} + E[\operatorname{diag}(1-\beta_1, \ldots, 1-\beta_T)\operatorname{diag}(\beta_1, \ldots, \beta_T)]\,\mathbf{Z} + \operatorname{diag}(\lambda, \ldots, \lambda) \Big\}^{-1} \left[\mathbf{Z}\, E(\boldsymbol{\beta}^\top\boldsymbol{\alpha}) - \mathbf{Z}\, E(\boldsymbol{\beta}^\top)\,\boldsymbol{\mu} C\right], \tag{5}$$

where $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are random vectors. Hence, in order to obtain a concrete expression of the estimate, we need to compute the numerical characteristics of the random variables in Equation (5), i.e., $E(\beta_k)$, $Var(\beta_k)$, $E[\beta_k(1-\beta_k)]$, $E(\beta_i \alpha_j)$ and $E(\beta_{k-l}\beta_k)$, where $k, i, j$ denote instances in time and $1 \leq l \leq k-1$.
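In practice, Equation (5) only requires these first and second moments. As a hedged illustration, the sketch below evaluates the smoother of Theorem 1 in Python, approximating the moments of $\boldsymbol{\beta}$ and $\boldsymbol{\alpha}$ by Monte Carlo draws of the random-walk state rather than by the series expressions derived next in Proposition 3; all function and variable names are ours.

```python
import numpy as np
from scipy.special import expit

def kalman_smooth_mc(y, Z, lam, mu0, sig0, n_mc=20000, seed=0):
    """Evaluate alpha_hat = E(alpha) + (y - E(Y)) Var(Y)^{-1} Cov(Y, alpha)
    (Theorem 1), with the moments of beta and alpha approximated by
    Monte Carlo draws of the random-walk state."""
    T = len(y)
    rng = np.random.default_rng(seed)
    # draws of the state path alpha_t = alpha_0 + sum eta_i, shape (n_mc, T)
    alpha = mu0 + sig0 * rng.standard_normal((n_mc, 1)) \
            + np.cumsum(rng.standard_normal((n_mc, T)), axis=1)
    beta = expit(alpha)
    E_alpha = alpha.mean(axis=0)
    # Y = beta o Z + eps: mean, covariance (thinning + Poisson noise on diagonal)
    EY = beta.mean(axis=0) * Z + lam
    bz = beta * Z                                    # beta_t Z_t per draw
    var_Y = np.cov(bz, rowvar=False) \
            + np.diag((beta * (1 - beta)).mean(axis=0) * Z + lam)
    # Cov(Y, alpha): (i, j) entry Cov(beta_i Z_i, alpha_j)
    cov_Y_alpha = (bz - bz.mean(axis=0)).T @ (alpha - E_alpha) / n_mc
    return E_alpha + (y - EY) @ np.linalg.solve(var_Y, cov_Y_alpha)
```

This treats the $Z_t$ as fixed by conditioning, consistent with the derivation above; the Monte Carlo moments converge to the analytic ones of Proposition 3 as the number of draws grows.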
First, we set $\alpha_0 \sim N(\mu_0, \sigma_0^2)$. Then we have

$$\alpha_k = \alpha_0 + \sum_{i=1}^{k}\eta_i \sim N(\mu_0, \sigma_0^2 + k), \qquad e^{\alpha_k} \sim LN(\mu_0, \sigma_0^2 + k).$$

Here, LN stands for the lognormal distribution. Denote $X = e^{\alpha_k}$ ($k = 1, \ldots, T$), and let μ and $\sigma^2$ be the mean and variance of $\ln X$, respectively. Then $X \sim LN(\mu, \sigma^2)$, and the probability density function of X is

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma x}\exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right).$$

Similarly, denote $Y = e^{\sum_{l=j+1}^{i}\eta_l}$ and let $\tau^2$ be the variance of its logarithm. Then $Y \sim LN(0, \tau^2)$. According to the logistic transformation, $\beta_t = \frac{e^{\alpha_t}}{1+e^{\alpha_t}}$, so $\beta_k$ can be expressed in terms of X, i.e., $\beta_k = 1 - \frac{1}{1+X}$. Thus, from the distributions of X and Y, the following proposition can be established.
Proposition 3.
Suppose $\beta_k$ and $\beta_i$ are elements of $\boldsymbol{\beta}$ in Proposition 2, and $\alpha_j$ is an element of $\boldsymbol{\alpha}$. Denote

$$e_{r,k} = e^{r\mu_0 + \frac{r^2}{2}(\sigma_0^2 + k)}, \qquad \Phi_{r,k} = \Phi\left(-\frac{\mu_0}{\sqrt{\sigma_0^2 + k}} - r\sqrt{\sigma_0^2 + k}\right), \qquad \Phi_{1,r,k} = \Phi\left(\frac{\mu_0}{\sqrt{\sigma_0^2 + k}} - r\sqrt{\sigma_0^2 + k}\right),$$

with $e_{-r,k} = e^{-r\mu_0 + \frac{r^2}{2}(\sigma_0^2 + k)}$ (i.e., $e_{r,k}$ with r replaced by $-r$). Then, for any $k, i, j \in (1, \ldots, T)$, $l \in (1, \ldots, k-1)$,

(1)
$$E(\beta_k) = 1 - \sum_{r=0}^{\infty}\left[(-1)^r e_{r,k}\,\Phi_{1,r,k} + (-1)^{r+1} e_{r+1,k}\,\Phi_{r+1,k}\right];$$

(2)
$$Var(\beta_k) = \sum_{r=0}^{\infty}(-1)^r (r+1)\left[e_{r,k}\,\Phi_{1,r,k} + e_{r+2,k}\,\Phi_{r+2,k}\right] - \left\{\sum_{r=0}^{\infty}\left[(-1)^r e_{r,k}\,\Phi_{1,r,k} + (-1)^{r+1} e_{r+1,k}\,\Phi_{r+1,k}\right]\right\}^2;$$

(3)
$$E[\beta_k(1-\beta_k)] = \sum_{r=1}^{\infty}(-1)^{r+1} r\left[e_{r,k}\,\Phi_{1,r,k} + e_{-r,k}\,\Phi_{r,k}\right];$$

(4) When $j < i$,
$$E(\beta_i \alpha_j) = \frac{1}{2}\left(\sqrt{\frac{\sigma_0^2 + j}{2\pi}} + \mu_0\right) - \sum_{r=1}^{\infty}(-1)^{r+1}\left[(e^{r\mu_0} - e^{-r\mu_0})\left(\sqrt{\frac{\sigma_0^2 + j}{2\pi}} + \mu_0\right) + (e^{r\mu_0} + e^{-r\mu_0})\, r(\sigma_0^2 + j)\right] e^{\frac{r^2}{2}(\sigma_0^2 + i)}\,\Phi\!\left(-r\sqrt{i-j}\right);$$

When $j \geq i$,
$$E(\beta_i \alpha_j) = \sum_{r=1}^{\infty}(-1)^{r-1}\, 2\left[(\mu_0 + r(\sigma_0^2 + i))\, e_{r,i}\,\Phi_{1,r,i} + (\mu_0 - (r-1)(\sigma_0^2 + i))\, e_{-(r-1),i}\,\Phi_{r-1,i}\right];$$

(5)
$$E(\beta_{k-l}\,\beta_k) = \frac{1}{2}\sum_{r=1}^{\infty}(-1)^{r-1}\left[e_{r,k-l}\,\Phi_{1,r,k-l} + e_{-(r-1),k-l}\,\Phi_{r-1,k-l}\right] + \sum_{r=1}^{\infty} 2\, e^{\frac{r^2}{2}l}\,\Phi\!\left(-r\sqrt{l}\right)\sum_{m=1}^{r}\left[(-1)^{m-1} e_{m,k-l} + (-1)^m e_{-(m-1),k-l}\right].$$
The proofs of Proposition 3 are given in Appendix B; they use the distribution function $\Phi(\cdot)$ of the standard normal distribution and Taylor's formula. Clearly, the exact solution $\hat{\boldsymbol{\alpha}}$ for the time-varying parameter $\boldsymbol{\alpha}$ is found by substituting (1)–(5) of Proposition 3 into Equation (5). Since $\lim_{u \to +\infty}\Phi(u) = 1$, as r tends to infinity all of the $\Phi(\cdot)$ factors in Proposition 3 converge to 0. The corresponding results are specified in Remark 1.
Remark 1.
The terms involving $\Phi(\cdot)$ in Proposition 3 fall into two categories, namely $\Phi(a - rb)$ and $\Phi(-a - rb)$, where a and b are constants greater than 0. According to the properties of the distribution function of the standard normal distribution, as r tends to infinity the following limits apply:

$$\lim_{r \to +\infty}\Phi(-a - rb) = 1 - \lim_{r \to +\infty}\Phi(rb + a) = 0, \qquad \lim_{r \to +\infty}\Phi(a - rb) = 1 - \lim_{r \to +\infty}\Phi(rb - a) = 0.$$

Therefore, (1)–(5) in Proposition 3 are convergent. Moreover, the Kalman-smoothed estimate $\hat{\boldsymbol{\alpha}}$ in (5) is also convergent.
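As a numerical cross-check on Proposition 3 and Remark 1, the moments $E(\beta_k)$, $Var(\beta_k)$ and $E[\beta_k(1-\beta_k)]$ can also be computed directly by quadrature, since $\alpha_k \sim N(\mu_0, \sigma_0^2 + k)$. The sketch below uses Gauss–Hermite quadrature in Python; it is an independent evaluation of the same moments, not the paper's series formulas.

```python
import numpy as np
from scipy.special import expit, roots_hermite

def beta_moments(k, mu0, sig0, n_nodes=80):
    """Moments of beta_k = expit(alpha_k), alpha_k ~ N(mu0, sig0^2 + k),
    via Gauss-Hermite quadrature: E[f(alpha)] ~ sum_i w_i f(mu + sqrt(2) s x_i) / sqrt(pi)."""
    nodes, weights = roots_hermite(n_nodes)
    s = np.sqrt(sig0**2 + k)
    b = expit(mu0 + np.sqrt(2.0) * s * nodes)
    w = weights / np.sqrt(np.pi)
    E_b = np.sum(w * b)
    E_b2 = np.sum(w * b**2)
    return E_b, E_b2 - E_b**2, E_b - E_b2   # E(beta), Var(beta), E[beta(1-beta)]
```

Comparing these quadrature values with truncated versions of the series gives a practical rule for choosing the truncation level in r.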

3.2. CLS Estimation of Parameter λ

Based on the proposed TV-INAR model and its statistical properties, we consider the estimation problem for λ. Here, the estimate $\hat\alpha_t$ obtained in the previous section is used directly. Owing to the characteristics of the model, namely that the state equation is nonstationary, the autoregressive coefficient in the observation equation is time-varying and the distribution is unknown, many estimation methods are not suitable for this model. Considering that the CLS estimation method requires no distributional assumption, only the corresponding moment information, and that its estimation accuracy is relatively high, we prefer the CLS method for estimating λ.
Suppose $\{Y_t\}$ is a TV-INAR process. Let

$$Q(\lambda) = \sum_{t=1}^{T}\left[Y_t - E(Y_t \mid Z_t)\right]^2$$

be the CLS criterion function. Then, the CLS estimator of λ is obtained by minimizing the criterion function, i.e.,

$$\hat\lambda_{CLS} = \arg\min_{\lambda} Q(\lambda).$$

Taking the derivative of $Q(\lambda)$ with respect to λ and setting it to zero, the minimizer is found to be

$$\hat\lambda_{CLS} = \frac{1}{T}\sum_{t=1}^{T} Y_t - \frac{1}{T}\sum_{t=1}^{T} Z_t\, \frac{e^{\hat\alpha_t}}{1 + e^{\hat\alpha_t}}.$$

This is the CLS estimate of λ.
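Because the minimizer is available in closed form, the second step amounts to one line of code. A minimal Python sketch (our own naming), with `alpha_hat` the first-step Kalman-smoothed estimates:

```python
import numpy as np
from scipy.special import expit

def cls_lambda(y, Z, alpha_hat):
    """Closed-form CLS estimate of lambda given the smoothed states:
    lambda_hat = mean(Y_t) - mean(Z_t * g(alpha_hat_t))."""
    y, Z, alpha_hat = map(np.asarray, (y, Z, alpha_hat))
    return np.mean(y) - np.mean(Z * expit(alpha_hat))
```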

4. Simulation

In this section, we perform simulations using Matlab R2018b to assess the performance of the two-step estimation approach of Section 3, namely how well it recovers the true time-varying parameters and how the CLS estimator performs.

For the data-generating process, using Equation (3), pseudodata are randomly generated for varying sample sizes $T \in \{50, 100, 200\}$ and values $\lambda \in \{0.05, 0.1, 0.8, 1.3, 4\}$ of the error-term parameter in the observation equation. The number of replications is 100. The choice of λ is based on the signal-to-noise ratio (SNR), defined as the variance of $\eta_t$ relative to that of $\varepsilon_t$, i.e., $\mathrm{SNR} = \sigma_{\eta_t}^2 / \sigma_{\varepsilon_t}^2$. In our model, the error variance of the state equation is constant, so we consider SNRs of 1/0.05, 1/0.1, 1/0.8, 1/1.3 and 1/4 by changing the error variance $\sigma_{\varepsilon_t}^2$ in the observation equation. Sample paths of the TV-INAR(1) model for these five settings are plotted in Figure 1. The sample paths of $y_t$ are unsteady, and varying the parameter combination changes the dispersion of the samples.
Next, consider the choice of the initial value of the parameter. The initial value of $\alpha_t$ follows a known normal distribution, as mentioned in Section 2.1. In practice, it is difficult to gauge the true mean and variance of this distribution. For simplicity, we assume $\alpha_0$ is known and nonstochastic, i.e., $\alpha_0 = 0$ ($\mu_0 = 0$, $\sigma_0^2 = 0$). This assumption brings great convenience in the simulation studies and does not affect the numerical results, as can be seen from Proposition 3 when the sample size is sufficiently large.
Our simulation concerns a first-order integer-valued autoregressive model with a time-varying parameter (TV-INAR(1)). For the first-step estimation, let $\alpha_{t,n}$ denote the true value of the parameter in the data-generating process and $\hat\alpha_{t,n}$ the Kalman-smoothed estimate in the nth replication. We compute the sample means and sample standard deviations of the true and estimated values, respectively:

$$\bar\alpha_n = \frac{1}{T}\sum_{t=1}^{T}\alpha_{t,n}, \qquad sd(\alpha_n) = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(\alpha_{t,n} - \bar\alpha_n\right)^2},$$
$$\bar{\hat\alpha}_n = \frac{1}{T}\sum_{t=1}^{T}\hat\alpha_{t,n}, \qquad sd(\hat\alpha_n) = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(\hat\alpha_{t,n} - \bar{\hat\alpha}_n\right)^2}.$$

After $N = 100$ replications, the averages of the above four indicators are computed:

$$\bar\alpha = \frac{1}{N}\sum_{n=1}^{N}\bar\alpha_n, \qquad \bar s = \frac{1}{N}\sum_{n=1}^{N} sd(\alpha_n), \qquad \bar{\hat\alpha} = \frac{1}{N}\sum_{n=1}^{N}\bar{\hat\alpha}_n, \qquad \bar{\hat s} = \frac{1}{N}\sum_{n=1}^{N} sd(\hat\alpha_n).$$

Let $bias_\alpha = |\bar\alpha - \bar{\hat\alpha}|$; we take the absolute difference between $\bar\alpha$ and $\bar{\hat\alpha}$ directly to evaluate the effectiveness of the Kalman-smoothed estimation approach. In addition, denote

$$rat = \frac{1}{N}\sum_{n=1}^{N}\frac{sd(\hat\alpha_n)}{sd(\alpha_n)}$$

as the mean ratio of the standard deviation of $\hat\alpha_n$ to that of the real process $\alpha_n$. The quality of the estimator is judged by whether $rat$ is close to one; this is similar to the criterion in ref. [21].
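These indicators are straightforward to compute from stored replication output. A small Python sketch follows (the paper's runs were in Matlab; the helper names are ours); it also covers the λ indicators bias, MAD and MSE defined in the next paragraph.

```python
import numpy as np

def alpha_metrics(alpha_true, alpha_est):
    """First-step summaries: alpha_true and alpha_est are (N, T) arrays of
    true and Kalman-smoothed paths over N replications."""
    sd_true = alpha_true.std(axis=1, ddof=1)      # sd(alpha_n) per replication
    sd_est = alpha_est.std(axis=1, ddof=1)        # sd(alpha_hat_n)
    bias_alpha = abs(alpha_true.mean() - alpha_est.mean())
    rat = np.mean(sd_est / sd_true)               # should be close to one
    return bias_alpha, rat

def lambda_metrics(lam_true, lam_hats):
    """Second-step summaries over N replications: bias, MAD and MSE."""
    lam_hats = np.asarray(lam_hats)
    bias_lambda = abs(lam_hats.mean() - lam_true)
    mad = np.mean(np.abs(lam_hats - lam_true))
    mse = np.mean((lam_hats - lam_true) ** 2)
    return bias_lambda, mad, mse
```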
Next, we evaluate the performance of the second-step estimation, which applies $\hat\alpha_t$ and the conditional least squares (CLS) method to estimate the parameter λ in model (2). As mentioned above, the true value of λ ranges over $\{0.05, 0.1, 0.8, 1.3, 4\}$. To evaluate the estimation performance, besides $bias_\lambda = |\lambda - \hat\lambda|$, two other indicators are used, the mean absolute deviation (MAD) and the mean square error (MSE), defined as

$$MAD = \frac{1}{N}\sum_{n=1}^{N}\left|\hat\lambda_n - \lambda\right|, \qquad MSE = \frac{1}{N}\sum_{n=1}^{N}\left(\hat\lambda_n - \lambda\right)^2,$$

where $\hat\lambda_n$ is the estimate of λ in the nth replication. Then, considering various sample sizes and initial parameter values, the simulation results of the two-step estimation process are listed in Table 1 and Table 2.
It is shown that the smaller the variance (λ) of the error term, the smaller the estimation biases $bias_\alpha$ and $bias_\lambda$. This implies that the Kalman smoothing and CLS approaches work better in the sense of bias. In the first-step estimation, the larger the variance (λ), the closer $rat$ is to one; in this sense, the Kalman smoothing method performs best when λ equals 1.3. This suggests that a single criterion is insufficient to measure the effectiveness of the estimation method. In addition, as the sample size increases, $bias_\alpha$ becomes smaller and $rat$ moves closer to one, showing that the Kalman-smoothed estimate approaches the true process. From the perspective of the SNR, the larger the SNR, the smaller the estimation bias, while the smaller the SNR, the closer the estimated sample dispersion is to that of the true process. In summary, when the SNR is around 1, the estimation is relatively good. In the second-step estimation, the values of MAD and MSE are small, suggesting a relatively acceptable estimation effect. The results as a whole show that the larger the SNR, the smaller the estimation error; consequently, the CLS estimation method works well. Additionally, as the sample size T increases, the corresponding MAD and MSE gradually decrease, and the estimates of the parameter converge to the true values. In conclusion, the proposed two-step estimation method is a credible approach.

5. Case Application

In this section, we apply the model and method of Section 3 to a real time series data set. The data set is a count time series of offense data, obtained from the NSW Bureau of Crime Statistics and Research and covering January 2000 to December 2009. The observations are monthly counts of sexual offenses in Ballina, NSW, Australia, comprising 120 monthly observations. Figure 2 shows the time plot and the partial autocorrelation function (PACF). The data are nonstationary and first-order autocorrelated, which indicates that it is reasonable to model this data set with our model. Descriptive statistics of the data are displayed in Table 3. Next, we compare our TV-INAR(1) model with the INAR(1) model in fitting the data set. In general, the better-fitting model presents smaller values of the −log-likelihood, AIC and BIC. From the results in Table 4, we conclude that the proposed model fits the data better.
For the prediction, the predicted values of the offense data are given by
$$E(Y_{t+k} \mid \mathcal{F}_{Y,t}) = \prod_{h=1}^{k}\frac{e^{\hat\alpha_{t+h}}}{1 + e^{\hat\alpha_{t+h}}}\, Y_t + \sum_{h=1}^{k-1}\prod_{m=0}^{h-1}\frac{e^{\hat\alpha_{t+k-m}}}{1 + e^{\hat\alpha_{t+k-m}}}\,\hat\lambda + \hat\lambda.$$
Specifically, the one-step-ahead conditional expectation point predictor is given by
$$E(Y_t \mid \mathcal{F}_{Y,t-1}) = \frac{e^{\hat\alpha_t}}{1 + e^{\hat\alpha_t}}\, Y_{t-1} + \hat\lambda.$$
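A hedged sketch of this predictor in Python (names ours): since the state equation is a random walk, a natural choice, and an assumption on our part, is to extrapolate the future states by the last smoothed value, $\hat\alpha_{t+h} = \hat\alpha_t$ for $h \geq 1$.

```python
import numpy as np
from scipy.special import expit

def forecast(y_t, alpha_hat_future, lam_hat):
    """k-step-ahead conditional expectation of the TV-INAR(1) model,
    iterating E(Y_{t+h}) = g(alpha_{t+h}) E(Y_{t+h-1}) + lambda."""
    pred = float(y_t)
    for b in expit(np.asarray(alpha_hat_future)):   # g(alpha) per horizon
        pred = b * pred + lam_hat
    return pred

# e.g. 6-month-ahead forecast using the last smoothed state (our assumption):
# forecast(y[-1], np.repeat(alpha_hat[-1], 6), lam_hat)
```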
We compute the root mean square error (RMSE) of the predictions for the last 6 months, with the RMSE defined as

$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \hat Y_t\right)^2}.$$
The estimators and RMSE results are also shown in Table 4. To assess the adequacy of the model, we analyze the standardized Pearson residuals, defined as

$$e_t = \frac{Y_t - E(Y_t \mid \mathcal{F}_{Y,t-1})}{\sqrt{Var(Y_t \mid \mathcal{F}_{Y,t-1})}},$$

with $t \in (1, \ldots, n)$. For our model, the mean and variance of the Pearson residuals are 0.8022 and 1.2000, respectively. As discussed in ref. [28], for an adequately chosen model the variance of the residuals should be close to 1, and 1.2 is reasonably close. Therefore, we conclude that the proposed model fits the data well.
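The residual check translates directly to code via the conditional moments of Corollary 1; a minimal Python sketch (our own helper) is:

```python
import numpy as np
from scipy.special import expit

def pearson_residuals(y, alpha_hat, lam_hat):
    """Standardized Pearson residuals for a fitted TV-INAR(1), using the
    conditional mean b_t y_{t-1} + lam and the conditional variance
    b_t (1 - b_t) y_{t-1} + lam from Corollary 1."""
    y = np.asarray(y, dtype=float)
    b = expit(np.asarray(alpha_hat))[1:]
    cond_mean = b * y[:-1] + lam_hat
    cond_var = b * (1.0 - b) * y[:-1] + lam_hat
    return (y[1:] - cond_mean) / np.sqrt(cond_var)

# adequacy check: the residual variance should be close to 1
# e = pearson_residuals(y, alpha_hat, lam_hat); print(e.mean(), e.var(ddof=1))
```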

6. Conclusions

This paper proposes a time-varying integer-valued autoregressive model whose autoregressive coefficient is driven by a logistic regression structure. The model is flexible and efficient in handling integer-valued discrete data, even when the data are nonstationary. Some statistical properties of the model are derived, such as the mean, variance, covariance, conditional mean and conditional variance. A two-step estimation method is introduced. Since the model is in state-space form, the Kalman smoothing method can be used to estimate the time-varying parameter. In the first step, the Kalman-smoothed estimate of the state vector is obtained using the information in the known observation variables. In the second step, the estimate from the previous step and the CLS method are used to obtain the estimate of the error-term parameter. Analytical formulae for the estimates in both steps are derived; the main challenge lies in calculating the covariance matrix and the correlations between the variables. The advantage of this method is that an approximate estimate of the unknown parameters can be obtained from all the observable variables, and the approach performs well in practical applications, even in the case of nonlinear and non-Gaussian errors. The proposed method also estimates the time-varying parameters well. In addition, an application to forecasting a real data set is presented; the results suggest that the TV-INAR(1) model is better suited to practical data sets. In model (3), if the variances of the error terms $\varepsilon_t$ and $\eta_t$ are allowed to be time-dependent, the model can be regarded as a stochastic volatility model; this is a topic for future discussion. Moreover, extending these results to the p-order model TV-INAR(p) is another direction of future research.

Author Contributions

Every author contributed equally to the development of this paper. Conceptualization, D.W.; methodology, Y.P.; software, Y.P.; investigation, Y.P.; writing—original draft preparation, Y.P.; writing—review and editing, D.W. and M.G.; supervision, D.W. and M.G.; funding acquisition, D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Nos. 12271231, 12001229, 11901053).

Data Availability Statement

Publicly available data sets were analyzed in this study. These data can be found here: [https://data.nsw.gov.au/data/dataset/nsw-criminal-court-statistics] (accessed on 3 July 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proof of Proposition 2

From models (3) and (2), we have:

(i)
$$E(\boldsymbol{\alpha}) = E[(\boldsymbol{\mu} + \boldsymbol{\eta})C] = \boldsymbol{\mu} C;$$

(ii)
$$E(\mathbf{Y}) = E[\boldsymbol{\beta} \circ \mathbf{Z} + \boldsymbol{\varepsilon}] = E(\boldsymbol{\beta})\mathbf{Z} + \Lambda;$$

(iii)
$$Var(\mathbf{Y}) = Var(\boldsymbol{\beta} \circ \mathbf{Z}) + \operatorname{diag}(\lambda, \ldots, \lambda),$$
$$Var(\boldsymbol{\beta} \circ \mathbf{Z}) = E\{[\boldsymbol{\beta} \circ \mathbf{Z} - E(\boldsymbol{\beta} \circ \mathbf{Z})]^\top[\boldsymbol{\beta} \circ \mathbf{Z} - E(\boldsymbol{\beta} \circ \mathbf{Z})]\} = \begin{pmatrix} Cov(\beta_1 \circ Z_1, \beta_1 \circ Z_1) & \cdots & Cov(\beta_1 \circ Z_1, \beta_T \circ Z_T) \\ \vdots & \ddots & \vdots \\ Cov(\beta_T \circ Z_T, \beta_1 \circ Z_1) & \cdots & Cov(\beta_T \circ Z_T, \beta_T \circ Z_T) \end{pmatrix}.$$

The main diagonal elements are

$$Var(\beta_k \circ Z_k) = Var(\beta_k)\, Z_k^2 + E[\beta_k(1-\beta_k)]\, Z_k, \qquad k = 1, \ldots, T,$$

and the off-diagonal elements are

$$Cov(\beta_{k-l} \circ Z_{k-l}, \beta_k \circ Z_k) = E\{E[(\beta_{k-l} \circ Z_{k-l})(\beta_k \circ Z_k) \mid \beta_{k-l}, \beta_k]\} - E[E(\beta_{k-l} \circ Z_{k-l} \mid \beta_{k-l})]\, E[E(\beta_k \circ Z_k \mid \beta_k)]$$
$$= E\{E[(B_1 + \cdots + B_{Z_{k-l}})(B_1' + \cdots + B_{Z_k}') \mid \beta_{k-l}, \beta_k]\} - E(\beta_{k-l}) Z_{k-l}\, E(\beta_k) Z_k = E(\beta_{k-l}\beta_k)\, Z_{k-l} Z_k - E(\beta_{k-l}) E(\beta_k)\, Z_{k-l} Z_k.$$

Therefore,

$$Var(\mathbf{Y}) = \mathbf{Z}\, Var(\boldsymbol{\beta})\, \mathbf{Z} + E[\operatorname{diag}(1-\beta_1, \ldots, 1-\beta_T)\operatorname{diag}(\beta_1, \ldots, \beta_T)]\,\mathbf{Z} + \operatorname{diag}(\lambda, \ldots, \lambda).$$

(iv)
$$Cov(\mathbf{Y}, \boldsymbol{\alpha}) = Cov(\boldsymbol{\beta} \circ \mathbf{Z}, \boldsymbol{\alpha}) = \begin{pmatrix} Cov(\beta_1 \circ Z_1, \alpha_1) & \cdots & Cov(\beta_1 \circ Z_1, \alpha_T) \\ \vdots & \ddots & \vdots \\ Cov(\beta_T \circ Z_T, \alpha_1) & \cdots & Cov(\beta_T \circ Z_T, \alpha_T) \end{pmatrix}.$$

For any $i, j \in (1, \ldots, T)$,

$$Cov(\beta_i \circ Z_i, \alpha_j) = E\{E[(\beta_i \circ Z_i)\,\alpha_j \mid \beta_i, \beta_j]\} - E[E(\beta_i \circ Z_i \mid \beta_i)]\, E[E(\alpha_j \mid \beta_j)]$$
$$= E\{E[(B_1 + \cdots + B_{Z_i})\,\alpha_j \mid \beta_i, \beta_j]\} - E(\beta_i)\, Z_i\, \mu_0 = E[\beta_i \alpha_j]\, Z_i - E(\beta_i)\, Z_i\, \mu_0.$$

Thus, $Cov(\mathbf{Y}, \boldsymbol{\alpha}) = \mathbf{Z}\, E(\boldsymbol{\beta}^\top\boldsymbol{\alpha}) - \mathbf{Z}\, E(\boldsymbol{\beta}^\top)\,\boldsymbol{\mu} C$.

Appendix B. Proof of Proposition 3

(1) $E(\beta_k) = E\left[\frac{e^{\alpha_k}}{1+e^{\alpha_k}}\right] = 1 - E\left[\frac{1}{1+e^{\alpha_k}}\right] = 1 - E\left[\frac{1}{1+X}\right]$, where $X \sim LN(\mu, \sigma^2)$. Using the substitution $\ln x = \sqrt{2}\sigma s + \mu$ and Taylor expansion, we obtain

$$E\left[\frac{1}{1+X}\right] = \int_0^{\infty}\frac{1}{1+x}\cdot\frac{1}{\sqrt{2\pi}\,\sigma x}\exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-s^2}\left(1 + e^{\sqrt{2}\sigma s + \mu}\right)^{-1} ds.$$

Splitting the integral at $s = -\mu/(\sqrt{2}\sigma)$, expanding $(1 + e^{\sqrt{2}\sigma s + \mu})^{-1}$ as an alternating geometric series on each half-line, and integrating the resulting Gaussian terms yields

$$E\left[\frac{1}{1+X}\right] = \sum_{r=0}^{\infty}\left[(-1)^r e^{r\mu + \frac{r^2\sigma^2}{2}}\,\Phi\left(\frac{\mu}{\sigma} - r\sigma\right) + (-1)^{r+1} e^{(r+1)\mu + \frac{(r+1)^2\sigma^2}{2}}\,\Phi\left(-\frac{\mu}{\sigma} - (r+1)\sigma\right)\right].$$

Then, with $\mu = \mu_0$ and $\sigma^2 = \sigma_0^2 + k$,

$$E(\beta_k) = 1 - \sum_{r=0}^{\infty}\left[(-1)^r e_{r,k}\,\Phi_{1,r,k} + (-1)^{r+1} e_{r+1,k}\,\Phi_{r+1,k}\right].$$
(2) $Var(\beta_k) = Var\left(1 - \frac{1}{1+e^{\alpha_k}}\right) = Var\left(\frac{1}{1+e^{\alpha_k}}\right) = E\left[\frac{1}{(1+X)^2}\right] - \left(E\left[\frac{1}{1+X}\right]\right)^2$. Proceeding as in (1), with the same substitution and the expansion of $(1 + e^{\sqrt{2}\sigma s + \mu})^{-2}$,

$$E\left[\frac{1}{(1+X)^2}\right] = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-s^2}\left(1 + e^{\sqrt{2}\sigma s + \mu}\right)^{-2} ds = \sum_{r=0}^{\infty}(-1)^r (r+1)\left[e^{r\mu + \frac{r^2\sigma^2}{2}}\,\Phi\left(\frac{\mu}{\sigma} - r\sigma\right) + e^{(r+2)\mu + \frac{(r+2)^2\sigma^2}{2}}\,\Phi\left(-\frac{\mu}{\sigma} - (r+2)\sigma\right)\right].$$

Then, with $\mu = \mu_0$ and $\sigma^2 = \sigma_0^2 + k$,

$$Var(\beta_k) = \sum_{r=0}^{\infty}(-1)^r(r+1)\left[e_{r,k}\,\Phi_{1,r,k} + e_{r+2,k}\,\Phi_{r+2,k}\right] - \left\{\sum_{r=0}^{\infty}\left[(-1)^r e_{r,k}\,\Phi_{1,r,k} + (-1)^{r+1} e_{r+1,k}\,\Phi_{r+1,k}\right]\right\}^2.$$
(3)

$$E[\beta_k(1-\beta_k)] = E\left[\frac{1}{1+X}\right] - E\left[\frac{1}{(1+X)^2}\right] = \sum_{r=1}^{\infty}(-1)^{r+1} r\left[e^{r\mu + \frac{r^2\sigma^2}{2}}\,\Phi\left(\frac{\mu}{\sigma} - r\sigma\right) + e^{-r\mu + \frac{r^2\sigma^2}{2}}\,\Phi\left(-\frac{\mu}{\sigma} - r\sigma\right)\right] = \sum_{r=1}^{\infty}(-1)^{r+1} r\left[e_{r,k}\,\Phi_{1,r,k} + e_{-r,k}\,\Phi_{r,k}\right].$$
(4) Write

$$\beta_i \alpha_j = \frac{e^{\alpha_i}}{1+e^{\alpha_i}}\,\alpha_j = \begin{cases} \dfrac{\exp\left(\alpha_0 + \sum_{l=1}^{j}\eta_l + \sum_{l=j+1}^{i}\eta_l\right)}{1 + \exp\left(\alpha_0 + \sum_{l=1}^{j}\eta_l + \sum_{l=j+1}^{i}\eta_l\right)}\left(\alpha_0 + \sum_{l=1}^{j}\eta_l\right), & j < i, \\[2ex] \dfrac{\exp\left(\alpha_0 + \sum_{l=1}^{i}\eta_l\right)}{1 + \exp\left(\alpha_0 + \sum_{l=1}^{i}\eta_l\right)}\left(\alpha_0 + \sum_{l=1}^{i}\eta_l + \sum_{l=i+1}^{j}\eta_l\right), & j \geq i, \end{cases}$$

where $\alpha_0 + \sum_{l=1}^{j}\eta_l = \alpha_j \sim N(\mu_0, \sigma_0^2 + j)$ and $\sum_{l=j+1}^{i}\eta_l \sim N(0, i-j)$ are independent of each other; the same applies when $j \geq i$. Let $Y = e^{\sum_{l=j+1}^{i}\eta_l}$ and let $\tau^2 = i - j$ denote the variance of its logarithm, so that $Y \sim LN(0, \tau^2)$.

When $j < i$, writing $X = e^{\alpha_j} \sim LN(\mu, \sigma^2)$ with $\mu = \mu_0$ and $\sigma^2 = \sigma_0^2 + j$,

$$E(\beta_i \alpha_j) = \int_0^{\infty}\!\!\int_0^{\infty}\frac{xy\,\ln x}{1+xy}\cdot\frac{1}{\sqrt{2\pi}\,\sigma x}\,e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\cdot\frac{1}{\sqrt{2\pi}\,\tau y}\,e^{-\frac{(\ln y)^2}{2\tau^2}}\,dy\,dx = \int_0^{\infty}\frac{\ln x}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\left[\int_0^{\infty}\frac{1}{\sqrt{2\pi}\,\tau\,(1+xy)}\,e^{-\frac{(\ln y)^2}{2\tau^2}}\,dy\right]dx,$$

where, by the same substitution and series expansion as in (1),

$$\frac{1}{\sqrt{2\pi}\,\tau}\int_0^{\infty}\frac{1}{1+xy}\,e^{-\frac{(\ln y)^2}{2\tau^2}}\,dy = e^{\frac{\tau^2}{2}}\Phi(-\tau) - x\,e^{2\tau^2}\Phi(-2\tau) + x^2 e^{\frac{9}{2}\tau^2}\Phi(-3\tau) - \cdots + x^{-1}\Phi(0) - x^{-2}e^{\frac{\tau^2}{2}}\Phi(-\tau) + x^{-3}e^{2\tau^2}\Phi(-2\tau) - \cdots. \tag{A1}$$

Substituting (A1) and integrating term by term over x yields

$$E(\beta_i \alpha_j) = \frac{1}{2}\left(\sqrt{\frac{\sigma_0^2 + j}{2\pi}} + \mu_0\right) - \sum_{r=1}^{\infty}(-1)^{r+1}\left[(e^{r\mu_0} - e^{-r\mu_0})\left(\sqrt{\frac{\sigma_0^2 + j}{2\pi}} + \mu_0\right) + (e^{r\mu_0} + e^{-r\mu_0})\, r(\sigma_0^2 + j)\right] e^{\frac{r^2}{2}(\sigma_0^2 + i)}\,\Phi(-r\sqrt{i-j}).$$

Similarly, when $j \geq i$, with $X = e^{\alpha_i}$ and $Y = e^{\sum_{l=i+1}^{j}\eta_l}$,

$$E(\beta_i \alpha_j) = E\left[\frac{X \ln X}{1+X}\right] + E\left[\frac{X \ln Y}{1+X}\right] = E\left[\frac{X \ln X}{1+X}\right],$$

since X and Y are independent and $E(\ln Y) = 0$. By the same technique, with $\mu = \mu_0$ and $\sigma^2 = \sigma_0^2 + i$,

$$E\left[\frac{X \ln X}{1+X}\right] = \int_0^{\infty}\frac{x \ln x}{1+x}\cdot\frac{1}{\sqrt{2\pi}\,\sigma x}\,e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\,dx = \sum_{r=1}^{\infty}(-1)^{r-1}\,2\left[(\mu + r\sigma^2)\,e^{r\mu + \frac{r^2\sigma^2}{2}}\,\Phi\left(\frac{\mu}{\sigma} - r\sigma\right) + (\mu - (r-1)\sigma^2)\,e^{-(r-1)\mu + \frac{(r-1)^2\sigma^2}{2}}\,\Phi\left(-\frac{\mu}{\sigma} - (r-1)\sigma\right)\right]$$
$$= \sum_{r=1}^{\infty}(-1)^{r-1}\,2\left[(\mu_0 + r(\sigma_0^2 + i))\,e_{r,i}\,\Phi_{1,r,i} + (\mu_0 - (r-1)(\sigma_0^2 + i))\,e_{-(r-1),i}\,\Phi_{r-1,i}\right].$$
(5) According to the state equation in (2),

$$\beta_{k-l}\beta_k = \frac{e^{\alpha_{k-l}}}{1+e^{\alpha_{k-l}}}\cdot\frac{e^{\alpha_k}}{1+e^{\alpha_k}} = \frac{e^{\alpha_{k-l}}}{1+e^{\alpha_{k-l}}}\cdot\frac{e^{\alpha_{k-l} + \sum_{i=k-l+1}^{k}\eta_i}}{1+e^{\alpha_{k-l} + \sum_{i=k-l+1}^{k}\eta_i}}.$$

Let $X = e^{\alpha_{k-l}}$ and $Y = e^{\sum_{i=k-l+1}^{k}\eta_i}$. Then $X \sim LN(\mu_0, \sigma_0^2 + k - l)$ and $Y \sim LN(0, l)$. For convenience, let μ and $\sigma^2$ denote the parameters of X in the computation. Thus,

$$E(\beta_{k-l}\beta_k) = E\left[\frac{X}{1+X}\cdot\frac{XY}{1+XY}\right] = \int_0^{\infty}\!\!\int_0^{\infty}\frac{x}{(1+x)(1+xy)}\cdot\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\cdot\frac{1}{\sqrt{2\pi l}}\,e^{-\frac{(\ln y)^2}{2l}}\,dy\,dx$$
$$= \int_0^{\infty}\frac{x}{\sqrt{2\pi}\,\sigma\,(1+x)}\,e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}\left[\int_0^{\infty}\frac{1}{\sqrt{2\pi l}\,(1+xy)}\,e^{-\frac{(\ln y)^2}{2l}}\,dy\right]dx,$$

where the inner integral is given by (A1) with $\tau^2 = l$. Substituting (A1) and integrating term by term yields

$$E(\beta_{k-l}\beta_k) = \frac{1}{2}\sum_{r=1}^{\infty}(-1)^{r-1}\left[e_{r,k-l}\,\Phi_{1,r,k-l} + e_{-(r-1),k-l}\,\Phi_{r-1,k-l}\right] + \sum_{r=1}^{\infty} 2\,e^{\frac{r^2}{2}l}\,\Phi(-r\sqrt{l})\sum_{m=1}^{r}\left[(-1)^{m-1} e_{m,k-l} + (-1)^m e_{-(m-1),k-l}\right].$$

References

1. Karlis, D.; Khan, N.M. Models for integer data. Annu. Rev. Stat. Appl. 2023, 10, 297–323.
2. Scotto, M.G.; Weiß, C.H.; Gouveia, S. Thinning-based models in the analysis of integer-valued time series: A review. Stat. Model. 2015, 15, 590–618.
3. Weiß, C.H. An Introduction to Discrete-Valued Time Series; John Wiley and Sons, Inc.: Chichester, UK, 2018.
4. McKenzie, E. Some simple models for discrete variate time series. Water Resour. Bull. 1985, 21, 645–650.
5. Al-Osh, M.A.; Alzaid, A.A. First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 1987, 8, 261–275.
6. Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab. 1979, 7, 893–899.
7. Freeland, R.K.; McCabe, B. Analysis of low count time series data by Poisson autoregression. J. Time Ser. Anal. 2004, 25, 701–722.
8. Freeland, R.K.; McCabe, B. Asymptotic properties of CLS estimators in the Poisson AR(1) model. Stat. Probab. Lett. 2005, 73, 147–153.
9. Yu, M.; Wang, D.; Yang, K.; Liu, Y. Bivariate first-order random coefficient integer-valued autoregressive processes. J. Stat. Plan. Inference 2020, 204, 153–176.
10. Deng, Z.L.; Gao, Y.; Mao, L.; Li, Y.; Hao, G. New approach to information fusion steady-state Kalman filtering. Automatica 2005, 41, 1695–1707.
11. Sbrana, G.; Silvestrini, A. Random coefficient state-space model: Estimation and performance in M3–M4 competitions. Int. J. Forecast. 2022, 38, 352–366.
12. You, J.; Chen, G. Parameter estimation in a partly linear regression model with random coefficient autoregressive errors. Commun. Stat. Theory Methods 2002, 31, 1137–1158.
13. Kalman, R.E. A new approach to linear filtering and prediction problems. J. Fluids Eng. 1960, 82, 35–45.
14. Durbin, J.; Koopman, S.J. Time Series Analysis by State Space Methods, 2nd ed.; Oxford Statistical Science Series; Oxford University Press: Oxford, UK, 2012.
15. Tang, M.T.; Wang, Y.Y. Asymptotic behavior of random coefficient INAR model under random environment defined by difference equation. Adv. Differ. Equ. 2014, 1, 99.
16. Koh, Y.B.; Bukhari, N.A.; Mohamed, I. Parameter-driven state-space model for integer-valued time series with application. J. Stat. Comput. Simul. 2019, 89, 1394–1409.
17. Yang, M.; Cavanaugh, J.E.; Zamba, G.K.D. State-space models for count time series with excess zeros. Stat. Model. 2015, 15, 70–90.
18. Davis, R.A.; Dunsmuir, W.T.M. State space models for count time series. In Handbook of Discrete-Valued Time Series; Davis, R., Holan, S., Lund, R., Ravishanker, N., Eds.; CRC Press: Boca Raton, FL, USA, 2016; pp. 121–144.
19. Duncan, D.B.; Horn, S.D. Linear dynamic recursive estimation from the viewpoint of regression analysis. J. Am. Stat. Assoc. 1972, 67, 815–821.
20. Maddala, G.S.; Kim, I. Unit Roots, Cointegration, and Structural Change; Cambridge University Press: Cambridge, UK, 1998.
21. Ito, M.; Noda, A.; Wada, T. An alternative estimation method for time-varying parameter models. Econometrics 2022, 10, 23.
22. Omorogbe, J.A. Kalman filter and structural change revisited: An application to foreign trade-economic growth nexus. Struct. Chang. Econom. Model. 2019, 808, 49–62.
23. Bernanke, B.S.; Mihov, I. Measuring monetary policy. Q. J. Econ. 1998, 113, 869–902.
24. Ito, M.; Noda, A.; Wada, T. International stock market efficiency: A non-Bayesian time-varying model approach. Appl. Econ. 2014, 46, 2744–2754.
25. Santos, C.; Pereira, I.; Scotto, M.G. On the theory of periodic multivariate INAR processes. Stat. Pap. 2021, 62, 1291–1348.
26. Chattopadhyay, S.; Maiti, R.; Das, S.; Biswas, A. Change-point analysis through INAR process with application to some COVID-19 data. Stat. Neerl. 2022, 76, 4–34.
27. Primiceri, G.E. Time varying structural vector autoregressions and monetary policy. Rev. Econ. Stud. 2005, 72, 821–852.
28. Aleksandrov, B.; Weiß, C.H. Testing the dispersion structure of count time series using Pearson residuals. AStA Adv. Stat. Anal. 2019, 104, 325–361.
Figure 1. Sample paths of the TV-INAR(1) model for $\lambda \in \{0.05, 0.1, 0.8, 1.3, 4\}$.

Figure 2. Time plot and PACF of sexual offense data for the period 2000–2009.
Table 1. Simulation results of $\alpha$.

λ       SNR     T      $\bar\alpha$    $\bar{\hat\alpha}$    $bias_\alpha$    rat
0.05    20      50     0.7132          0.7790                0.0658           0.4982
                100    0.5941          0.6414                0.0473           0.5305
                200    -0.8605         -0.8671               0.0066           0.7479
0.1     10      50     -0.2026         -0.1137               0.0889           0.6547
                100    -0.9860         -0.9333               0.0527           0.7899
                200    -0.6248         -0.6175               0.0073           0.9138
0.8     1.25    50     0.3801          0.4772                0.0971           0.8308
                100    0.4309          0.3666                0.0643           0.8965
                200    0.6438          0.6154                0.0284           0.9276
1.3     0.77    50     -0.5826         -0.4699               0.1127           0.8869
                100    0.3095          0.2246                0.0849           0.9304
                200    -0.2924         -0.3385               0.0461           1.0483
4       0.25    50     -0.3752         -0.2143               0.1609           0.9025
                100    0.7106          0.5856                0.1250           0.9514
                200    0.6634          0.5762                0.0872           1.1287
Table 2. Simulation results of λ.

λ       SNR     T      $\hat\lambda$    $bias_\lambda$    MAD       MSE
0.05    20      50     0.0621           0.0121            0.0256    0.0018
                100    0.0587           0.0087            0.0179    0.0015
                200    0.0441           0.0059            0.0103    0.0010
0.1     10      50     0.0803           0.0197            0.0407    0.0023
                100    0.0858           0.0142            0.0373    0.0021
                200    0.0906           0.0094            0.0211    0.0017
0.8     1.25    50     0.6632           0.1368            0.1014    0.0751
                100    0.6829           0.1171            0.0762    0.0657
                200    0.7157           0.0843            0.0399    0.0481
1.3     0.77    50     1.1347           0.1652            0.1729    0.1251
                100    1.1827           0.1173            0.1094    0.0727
                200    1.2090           0.0910            0.0529    0.0512
4       0.25    50     4.1880           0.1880            0.1931    0.1634
                100    3.8397           0.1603            0.1652    0.1252
                200    3.8903           0.1097            0.1140    0.0938
Table 3. Descriptive statistics for sexual offense data.

Sample Size    Minimum    Maximum    Median    Mean    Variance
120            0          32         4         5.1     14.225
Table 4. Estimates of the parameters and goodness-of-fit statistics for the offense data.

Model         $\bar{\hat\alpha}$ ($\hat\alpha$)    $\hat\lambda$    −log-likelihood    AIC       BIC       RMSE
TV-INAR(1)    0.1574                               2.5928           315.96             635.92    641.49    2.5278
INAR(1)       0.2299                               3.9236           317.51             639.01    644.59    3.2137
