*Article* **Monitoring Parameter Change for Time Series Models of Counts Based on Minimum Density Power Divergence Estimator**

#### **Sangyeol Lee \* and Dongwon Kim**

Department of Statistics, Seoul National University, Seoul 08826, Korea; dongwon.k@snu.ac.kr **\*** Correspondence: sylee@stats.snu.ac.kr; Tel.: +82-2-880-8814

Received: 28 October 2020; Accepted: 14 November 2020; Published: 16 November 2020

**Abstract:** In this study, we consider an online monitoring procedure to detect a parameter change for integer-valued generalized autoregressive heteroscedastic (INGARCH) models whose conditional density of present observations over past information follows one parameter exponential family distributions. For this purpose, we use the cumulative sum (CUSUM) of score functions deduced from the objective functions, constructed for the minimum power divergence estimator (MDPDE) that includes the maximum likelihood estimator (MLE), to diminish the influence of outliers. It is well-known that compared to the MLE, the MDPDE is robust against outliers with little loss of efficiency. This robustness property is properly inherited by the proposed monitoring procedure. A simulation study and real data analysis are conducted to affirm the validity of our method.

**Keywords:** time series of counts; INGARCH model; SPC; CUSUM monitoring; MDPDE

#### **1. Introduction**

In this paper we consider the cumulative sum (CUSUM) monitoring procedure for detecting a parameter change in integer-valued generalized autoregressive heteroscedastic (INGARCH) models. Integer-valued time series is a core area in time series analysis that includes diverse disciplines in social, physical, engineering, and medical sciences. Both integer-valued autoregressive (INAR) time series models and the integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models have been widely studied in the literature and applied to various practical problems. Refer to McKenzie [1], Al-Osh and Alzaid [2], Ferland, Latour and Oraichi [3], Fokianos, Rahbek and Tjøstheim [4], and Weiß [5] for a general review. Poisson, negative binomial (NB), and one-parameter exponential family distributions have been widely used as underlying distributions, as seen in Davis and Wu [6], Zhu [7], Zhu [8], Jazi, Jones and Lai [9], Christou and Fokianos [10], Davis and Liu [11], Lee, Lee and Chen [12], and Chen, Khamthong and Lee [13].

Since Page [14], the CUSUM test has been a conventional tool to detect a structural change in underlying models. For a history and background, we refer to Csörg˝o and Horváth [15], Chen and Gupta [16], Lee, Ha, Na and Na [17], and the papers cited therein. Several authors have studied the change point test for INGARCH models, including Fokianos and Fried [18], Fokianos and Fried [19], Franke, Kirch and Kamgaing [20], Fokianos, Gombay and Hussein [21], Hudecová [22], Hudecová, HuŠková and Meintanis [23], Kang and Lee [24], Lee, Lee and Chen [12], Lee, Lee and Tjøstheim [25], and Lee and Lee [26]. This CUSUM scheme has been applied not only to retrospective change point tests but also to on-line monitoring and statistical process control (SPC) problems, designed to monitoring abnormal phenomena in manufacturing processes and health care surveillance. The CUSUM control chart has been popular due to its considerable competency in early detection of anomalies. Refer to Weiß [27], Rakitzis, Maravelakis and Castagliola [28], Kim and Lee [29], and the papers cited therein. Meanwhile, Gombay and Serban [30] used the CUSUM approach based on the score vectors

for independent observations, and later extended it to autoregressive processes, wherein the Type I probability error is measured for obtaining control limits instead of the conventional average run length (ARL). Their CUSUM monitoring process is based on the asymptotic property of the partial sum process generated from score vectors. Later, Huh, Kim and Lee [31] adopted their method for analyzing Poisson INGARCH models, and compared its performance with the likelihood ratio (LR)-based control chart, originally considered by Weiss and Testik [32].

In this work, taking the approach of Gombay and Serban [30] and Huh, Kim and Lee [31], we designate a robust monitoring process based on the minimum distance power divergence estimator (MDPDE) proposed by Basu, Harris, Hjort and Jones [33]. We do this because the MDPDE is well-known to be suitable for robust inference in various models, having a trade-off between efficiency and robustness controlled through the tuning parameters with little loss in asymptotic efficiency relative to the maximum likelihood estimator (MLE) (Riani, Atkinson, Corbellini and Perrotta [34]). The MDPDE method has been successfully applied to various time series models, and in particular INGARCH models (Kim and Lee [35], Kim and Lee [36]). Recently, Lee and Lee [26] and Kim and Lee [37] considered the CUSUM tests based on score vectors for the MLE and MDPDE in exponential family distribution INGARCH models. See also Kang and Song [38]. Using their results within the framework of Gombay and Serban [30] and Huh, Kim and Lee [31], we design an MDPDE-based monitoring process to detect a model parameter change in INGARCH models. Monte Carlo simulations are conducted to assess the performance of the proposed monitoring procedure. A focus is made on comparing the MDPDE-based CUSUM test with the MLE-based CUSUM test for Poisson INGARCH models to demonstrate the superiority of the former over the latter in the presence of outliers. A real data analysis of the return times of extreme events of Goldman Sachs Group (GS) stock prices is also provided to illustrate the validity of the proposed test.

The rest of the paper is organized as follows. Section 2 reviews the MDPDE for INGARCH models and Section 3 constructs the monitoring procedure for these models and investigates its asymptotic properties. Section 4 presents a simulation study and Section 5 provides a real data analysis. Section 6 concludes the paper. The proof of the main theorem is provided in Appendix A.

#### **2. MDPDE for INGARCH Model: An Overview**

In this section, we briefly review the MDPDE for INGARCH models in [36]. Let *Y*1,*Y*2, . . . be the observations generated from integer-valued time series models with the conditional distribution of the one-parameter exponential family:

$$\mathcal{Y}\_t|\mathcal{F}\_{t-1} \sim p(y|\eta\_t), \quad \mathcal{X}\_t := E(\mathcal{Y}\_t|\mathcal{F}\_{t-1}) = f\_\theta(\mathcal{X}\_{t-1}, \mathcal{Y}\_{t-1}), \tag{1}$$

where F*t*−<sup>1</sup> is a *σ*-field generated by *Yt*−1,*Yt*−2, . . ., and *f<sup>θ</sup>* (*x*, *y*) is a non-negative bivariate function, depending on the parameter *<sup>θ</sup>* <sup>∈</sup> <sup>Θ</sup> <sup>⊂</sup> <sup>R</sup>*<sup>d</sup>* , and satisfies inf*θ*∈<sup>Θ</sup> *f<sup>θ</sup>* (*x*, *y*) ≥ *c*<sup>∗</sup> for some *c*<sup>∗</sup> > 0 for all *x*, *y*, and *p*(·|·) is a probability mass function given by

$$p(y|\eta) = \exp\{\eta y - A(\eta)\}h(y), \ y = 0, 1, \dots, \eta$$

where *η* is the natural parameter, *A*(*η*) and *h*(*y*) are known functions, and both *A* and *B* = *A* 0 are strictly increasing. In particular, *B*(*ηt*) = *X<sup>t</sup>* and *B* 0 (*ηt*) is the conditional variance of *Y<sup>t</sup>* . In what follows, symbols *Xt*(*θ*) and *ηt*(*θ*) = *B* −1 (*Xt*(*θ*)) are also utilized to stand for *X<sup>t</sup>* and *η<sup>t</sup>* , respectively.

Davis and Liu [11] demonstrated that the strict stationarity and ergodicity of {*Xt*}, and the expression of *Xt*(*θ*) = *f θ* <sup>∞</sup>(*Yt*−1,*Yt*−2, . . .) are allowed for some nonnegative measurable function *f θ* ∞ defined on N<sup>∞</sup> 0 under the contraction condition: for all *x*, *x* <sup>0</sup> ≥ 0 and *y*, *y* <sup>0</sup> ∈ N0,

$$\sup\_{\theta \in \Theta} |f\_{\theta}(\mathbf{x}, y) - f\_{\theta}(\mathbf{x'}, y')| \le \lambda\_1 |\mathbf{x} - \mathbf{x'}| + \lambda\_2 |y - y'|$$

with constants *λ*1, *λ*<sup>2</sup> ≥ 0 satisfying *λ*<sup>1</sup> + *λ*<sup>2</sup> < 1.

Meanwhile, Basu, Harris, Hjort and Jones [33] considered the minimum distance power divergence estimator (MDPDE) for model parameters using the density power divergence *d<sup>α</sup>* between two density functions *g* and *h*, defined by:

$$d\_a(\mathbf{g}, h) := \begin{cases} \int \{ \mathbf{g}^{1+a}(\mathbf{y}) - (1 + \frac{1}{a})h(\mathbf{y}) \mathbf{g}^a(\mathbf{y}) + \frac{1}{a}h^{1+a}(\mathbf{y}) \} d\mathbf{y}, & a > 0, \\\int h(\mathbf{y}) (\log h(\mathbf{y}) - \log \mathbf{g}(\mathbf{y})) d\mathbf{y}, & a = 0. \end{cases}$$

Kim and Lee [36] studied the MDPDE for one parameter exponential family distribution INGARCH models. Given *Y*1, . . . ,*Y<sup>n</sup>* generated from (1), the MDPDE is defined by

$$\delta\_{\mathfrak{a},n} = \operatorname\*{argmin}\_{\theta \in \Theta} \tilde{L}\_{\mathfrak{a},n}(\theta) = \operatorname\*{argmin}\_{\theta \in \Theta} \frac{1}{n} \sum\_{t=1}^{n} \tilde{I}\_{\mathfrak{a},t}(\theta) \,. \tag{2}$$

where

$$\tilde{I}\_{a,l}(\theta) \;= \begin{cases} \begin{array}{l} \sum\_{y=0}^{\infty} p^{1+a}(y|\dot{\eta}\_l(\theta)) - \left(1 + \frac{1}{a}\right) p^a(Y\_l|\dot{\eta}\_l(\theta)), & a > 0, \\ -\log p(Y\_l|\dot{\eta}\_l(\theta)), & a = 0, \end{array} \end{cases} \tag{3}$$

and *η*˜*t*(*θ*) = *B* −1 (*X*e*t*(*θ*)) is updated recursively through the equations: *X*e*t*(*θ*) = *f<sup>θ</sup>* (*X*e*t*−1(*θ*),*Yt*−1), *t* ≥ 2 with an initial value *X*e1(*θ*) := *X*e1.

Below, *θ*<sup>0</sup> denotes the true value of *θ* and is assumed to be an interior point in the compact parameter space <sup>Θ</sup> <sup>⊂</sup> <sup>R</sup>*<sup>d</sup>* . Moreover, it is assumed that *E* sup*θ*∈<sup>Θ</sup> *<sup>X</sup>*1(*θ*) <sup>4</sup> < ∞, *EY*<sup>4</sup> <sup>1</sup> < ∞, *Xt*(*θ*) = *Xt*(*θ*0) a.s. implies *θ* = *θ*0, and *ν T ∂Xt*(*θ*0) *∂θ* = 0 a.s. implies *ν* = 0. Furthermore, *θ* 7→ *Xt*(*θ*) is twice continuously differentiable with respect to *θ* and satisfies

$$E\left(\sup\_{\theta \in \Theta} \left\| \frac{\partial X\_t(\theta)}{\partial \theta} \right\| \right)^4 < \infty \quad \text{and} \quad E\left(\sup\_{\theta \in \Theta} \left\| \frac{\partial^2 X\_t(\theta)}{\partial \theta \partial \theta^T} \right\| \right)^2 < \infty.$$

Assuming

$$\inf\_{\theta \in \Theta} \inf\_{0 \le \delta \le 1} B'( (1 - \delta)\eta\_t(\theta) + \delta \vec{\eta}\_t(\theta)) \ge \underline{c}$$

for some *c* > 0, Kim and Lee [36] verified that the MDPDE is strongly consistent. Additionally, they showed that provided

$$\sup\_{\theta \in \Theta} \sup\_{0 \le \delta \le 1} \left\{ \left| \frac{\mathcal{B}^{\prime\prime}((1-\delta)\eta\_t(\theta) + \delta \tilde{\eta}\_t(\theta))}{\mathcal{B}^{\prime}((1-\delta)\eta\_t(\theta) + \delta \tilde{\eta}\_t(\theta))^{5/2}} \right| \le K \text{ for some } \mathcal{K} > 0\right\}$$

and

$$\sup\_{\theta \in \Theta} \left\| \frac{\partial \widetilde{X}\_t(\theta)}{\partial \theta} - \frac{\partial X\_t(\theta)}{\partial \theta} \right\| + \left\| \frac{\partial^2 \widetilde{X}\_t(\theta)}{\partial \theta \partial \theta^T} - \frac{\partial^2 X\_t(\theta)}{\partial \theta \partial \theta^T} \right\| \le V\rho^t \text{ a.s.} $$

where *V* and *ρ* ∈ (0, 1) denote a generic integrable random variable and a constant, respectively, the symbol k · k denotes the *L* 2 -norm for matrices and vectors, and expectation *E*(·) is taken under *θ*0, the MDPDE is asymptotically normal with asymptotic variance *J* −1 *<sup>α</sup> K<sup>α</sup> J* −1 *<sup>α</sup>* where

$$J\_a = -E\left(\frac{\partial^2 l\_{a,t}(\theta\_0)}{\partial \theta \partial \theta^T}\right), \; K\_a = E\left(\frac{\partial l\_{a,t}(\theta\_0)}{\partial \theta} \frac{\partial l\_{a,t}(\theta\_0)}{\partial \theta^T}\right),\tag{4}$$

and *lα*,*t*(*θ*) is the same as ˜ *lα*,*t*(*θ*) with *η*˜*t*(*θ*) in (3) replaced by *ηt*(*θ*). Moreover, additionally assuming

$$\sup\_{\theta \in \Theta} \sup\_{0 \le \delta \le 1} \left| \frac{B^{(3)}((1-\delta)\eta\_t(\theta) + \delta \mathfrak{H}(\theta))}{B'((1-\delta)\eta\_t(\theta) + \delta \mathfrak{H}(\theta))^4} \right| \le M \text{ for some } M > 0,$$

Kim and Lee [37] showed that the CUSUM test statistics designed for detecting a change in *θ* have the limiting null distribution of the sup of a Brownian bridge. In practice, *α* ∈ (0, 1] is often harnessed and an optimal *α* can be selected through the method of Warwick [39] and Warwick and Jones [40]; see Remark 1 of Kim and Lee [36].

In the literature, the following linear INGARCH model has been frequently used:

$$\mathcal{Y}\_t|\mathcal{F}\_{t-1} \sim p(y|\eta\_t), \quad \mathcal{X}\_t = \omega + a\mathcal{X}\_{t-1} + b\mathcal{Y}\_{t-1}\omega$$

where *X<sup>t</sup>* = *B*(*ηt*) = *E*(*Y<sup>t</sup>* |F*t*−1) and *θ* = (*ω*, *a*, *b*) *T* satisfy *ω* > 0 and *a* + *b* < 1. Here, we assume that *θ*<sup>0</sup> is an interior of a compact neighborhood Θ = {*θ* = (*ω*, *a*, *b*) *<sup>T</sup>* <sup>∈</sup> <sup>R</sup><sup>3</sup> <sup>+</sup> : 0 < *ω*<sup>1</sup> ≤ *ω* ≤ *ω*2, *e* ≤ *a* + *b* ≤ 1 − *e*} for some 0 < *ω*<sup>1</sup> < *ω*2, *e* > 0. Moreover, the Poisson INGARCH(1,1) model with *Yt* |F*t*−<sup>1</sup> ∼ Poisson(*Xt*) and the NB-INGARCH(1,1) model with *Y<sup>t</sup>* |F*t*−<sup>1</sup> ∼ NB(*r*, *pt*), *X<sup>t</sup>* = *r*(1−*pt*) *pt* , where NB(*r*, *p*) denotes a negative binomial (NB) distribution with parameters *r* ∈ N and *p* ∈ (0, 1), satisfy the aforementioned regularity conditions. Those conditions should be checked analytically when one aims to use a specific distribution as the conditional distribution of the INGARCH model. In this case, a goodness of fit test could be conducted to check the adequacy of the assumed underlying distribution (Fokianos and Neumann [41]).

#### **3. MDPDE-Based Monitoring Process**

In this section, we consider a monitoring process detecting a significant change in the underlying models based on sequentially observed time series *Y*1, . . . ,*Y<sup>n</sup>* following Model (1), given a training sample *Y* 0 1 , . . . ,*Y* 0 *<sup>m</sup>* from Model (1), where *m* = *m*(*n*) is a sequence of positive integers that diverges to ∞ as *n* tends to ∞. For this task, we set up the following hypotheses:

*H*<sup>0</sup> : *θ* does not change over *t* = 1, . . . , *n vs*. *H*<sup>1</sup> : not *H*0.

We first consider the case that *θ*<sup>0</sup> is known a priori from a past experience. Then we consider the monitoring process using the process *W*ˆ *<sup>k</sup>*,0 = *K*ˆ−1/2 *α* ∑ *k t*=1 *∂* ˜ *lα*,*t*(*θ*0) *∂θ* , *k* = 1, . . . , *n*, constructed as

$$\hat{T}^{\text{min}}\_{n,0} := \max\_{1 \le k \le n} \hat{T}^{\text{min}}\_{n,0}(k) = \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \left| \min\_{j \le k} \hat{\mathsf{W}}\_{j,0} - \hat{\mathsf{W}}\_{k,0} \right| \right|\_{\max},<\tag{5}$$
 
$$\hat{T}^{\text{max}}\_{n,0} := \max\_{1 \le k \le n} \hat{T}^{\text{max}}\_{n,0}(k) = \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \left| \max\_{j \le k} \hat{\mathsf{W}}\_{j,0} - \hat{\mathsf{W}}\_{k,0} \right| \right|\_{\max},<\tag{6}$$
 
$$\hat{T}^{\text{trans}}\_{n,0} := \max\_{1 \le k \le n} \hat{T}^{\prime}\_{n,0}(k) = \max\_{1 \le k \le n} \max\_{1 \le i < j \le k} \frac{1}{\sqrt{n}} \left| \left| \left( \frac{i}{j} \right) \hat{\mathsf{W}}\_{j,0} - \hat{\mathsf{W}}\_{i,0} \right| \right|.$$

where *<sup>∂</sup>* ˜ *lα*,*<sup>t</sup> ∂θ* is the score vector as in (3) based on *Y*1, . . . ,*Y<sup>n</sup>* and

$$
\hat{\mathcal{K}}\_{\mathfrak{a}} = \frac{1}{m} \sum\_{t=1}^{m} \frac{\partial \underline{\boldsymbol{I}}\_{\mathfrak{a},t}^{\sharp}(\boldsymbol{\theta}\_{0})}{\partial \boldsymbol{\theta}^{T}} \frac{\partial \underline{\boldsymbol{I}}\_{\mathfrak{a},t}^{\sharp}(\boldsymbol{\theta}\_{0})}{\partial \boldsymbol{\theta}^{T}},\tag{6}
$$

where *<sup>∂</sup>* ˜ *l* 0 *α*,*t ∂θ* is the score vector based on the training sample. Here, the notation max1≤*i*≤*<sup>k</sup>* **<sup>z</sup>***<sup>i</sup>* with **z***<sup>i</sup>* = (*zi*,1, . . . , *zi*,*<sup>d</sup>* ) *<sup>T</sup>* <sup>∈</sup> *<sup>R</sup> d* is defined to be the vector with the *<sup>j</sup>*th entry equal to max1≤*i*≤*<sup>k</sup> <sup>z</sup>j*,*<sup>i</sup>* for *<sup>j</sup>* = 1, . . . , *<sup>d</sup>*, and ||**z**||max = max1≤*i*≤*<sup>k</sup>* |*zi* | for **z** = (*z*1, . . . , *z<sup>d</sup>* ) *<sup>T</sup>* <sup>∈</sup> *<sup>R</sup> d* . Similar versions of *T*ˆ *max <sup>n</sup>*,0 and *T*ˆ *cusum <sup>n</sup>*,0 based on MLE have been considered by Gombay and Serban [30] and Huh, Kim and Lee [31] for the AR and Poisson INGARCH models, while *T*ˆ *min <sup>n</sup>*,0 is newly considered here. An anomaly is signaled at *k* when *T*ˆ *min <sup>n</sup>*,0 (*k*), *<sup>T</sup>*<sup>ˆ</sup> *max <sup>n</sup>*,0 (*k*), or *<sup>T</sup>*<sup>ˆ</sup> *cusum <sup>n</sup>*,0 (*k*) get out of a control limit for some *k* = 1, . . . , *n*, and the control limit can be determined using the convergence result in Theorem 1 addressed below.

Next, we consider the situation that *θ*<sup>0</sup> is unknown and must be estimated in the construction of the monitoring process in (5). We employ a monitoring process constructed based on *W*ˆ *<sup>k</sup>* = *K*ˆ−1/2 *α*,*m* ∑ *k t*=1 *∂* ˜ *lα*,*t*( ˆ*θα*,*m*) *∂θ* , where <sup>ˆ</sup>*θα*,*<sup>m</sup>* is the MDPDE of *<sup>θ</sup>*<sup>0</sup> obtained from the training sample and

$$
\hat{K}\_{\alpha,m} = \frac{1}{m} \sum\_{t=1}^{m} \frac{\partial \hat{l}'\_{\alpha,t}(\hat{\theta}\_{\alpha,m})}{\partial \theta} \frac{\partial \hat{l}'\_{\alpha,t}(\hat{\theta}\_{\alpha,m})}{\partial \theta^T},
$$

which is obtained by substituting *θ*<sup>0</sup> in *K<sup>α</sup>* in (6) with ˆ*θα*,*m*, namely,

$$\begin{split} \hat{T}\_{n}^{\min} &:= \max\_{1 \le k \le n} \hat{T}\_{n}^{\min}(k) = \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \min\_{j \le k} \hat{\mathsf{W}}\_{j} - \hat{\mathsf{W}}\_{k} \right| \Big|\_{\max} \, \mathsf{} \, \mathsf{} \, \mathsf{V} \\ \hat{T}\_{n}^{\max} &:= \max\_{1 \le k \le n} \hat{T}\_{n}^{\max}(k) = \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \max\_{j \le k} \hat{\mathsf{W}}\_{j} - \hat{\mathsf{W}}\_{k} \right| \Big|\_{\max} \, \mathsf{} \, \mathsf{V} \\ \hat{T}\_{n}^{\text{csum}} &:= \max\_{1 \le k \le n} \hat{T}\_{n}^{\text{csum}}(k) = \max\_{1 \le k \le n} \max\_{1 \le i < j \le k} \frac{1}{\sqrt{n}} \left| \left| \left( \frac{i}{j} \right) \hat{\mathsf{W}}\_{j,0} - \hat{\mathsf{W}}\_{i,0} \right| \right| \, \mathsf{V} \end{split}$$

An anomaly is detected at *k* when *T*ˆ *min n* (*k*), *T*ˆ *max n* (*k*), or *T*ˆ *cusum n* (*k*) get out of the control limit for some *k* = 1, . . . , *n*. The control limit can be determined theoretically using the asymptotic result in Theorem 1 addressed below. For this task, we investigate the asymptotic behavior of the monitoring processes *T*ˆ *min n* , *T*ˆ *max n* , and *T*ˆ *cusum <sup>n</sup>* defined below.

Let *W<sup>k</sup>* = *K* −1/2 *α* ∑ *k t*=1 *∂lα*,*t*(*θ*0) *∂θ* , where *<sup>K</sup><sup>α</sup>* and *<sup>∂</sup>lα*,*<sup>t</sup> ∂θ* are the ones in (4), and

$$\begin{aligned} T\_n^{\min} &= \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \min\_{j \le k} W\_j - W\_k \right| \Big|\_{\max}, \\ T\_n^{\max} &= \max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \max\_{j \le k} W\_j - W\_k \right| \Big|\_{\max}, \\ T\_n^{\text{cum}} &= \max\_{1 \le k \le n} \max\_{1 \le i < j \le k} \frac{1}{\sqrt{n}} \left| \left( \frac{i}{j} \right) W\_j - W\_i \right| \Big| \Big| \end{aligned}$$

Using Donsker's invariance principle for martingale differences (Billingsley [42]) and the fact that sup0≤*s*≤*<sup>t</sup> B*(*s*) − *B*(*t*) = |*B*(*t*)| in distribution for any standard Brownian motion *B*, we obtain

$$T\_n^{\max} \stackrel{d}{\rightarrow} T := \sup\_{0 \le s \le 1} ||\mathcal{B}\_d(s)||\_{\max} \tag{8}$$

where B*<sup>d</sup>* and denote a *d*-dimensional standard Brownian motion, so that

$$T\_n^{\min} \stackrel{d}{\rightarrow} T = \sup\_{0 \le s \le 1} ||\mathcal{B}\_d(s)||\_{\max}$$

as *T min <sup>n</sup>* behaves asymptotically similarly to *T max n* . Meanwhile, we can see that

$$T\_n^{\text{cumum}} \stackrel{d}{\rightarrow} T' = \sup\_{0 < s \le s' \le 1} \left| \left| \frac{s}{s'} \mathcal{B}\_d^\diamond(s') - \mathcal{B}\_d^\diamond(s) \right| \right| . \tag{9}$$

where B ◦ *d* is a *d*-dimensional Brownian bridge.

Using the above facts, we are led to attain the following theorem, whose proof is provided in the Appendix A.

**Theorem 1.** *Assume that* **(A.1)***–***(A.11)** *hold. Then, under <sup>H</sup>*0*, as <sup>n</sup>* <sup>→</sup> <sup>∞</sup>*, <sup>T</sup>*<sup>ˆ</sup> *min <sup>n</sup>*,0 *and <sup>T</sup>*<sup>ˆ</sup> *max <sup>n</sup>*,0 *converge to T in distribution, and the same holds for T*ˆ *min n and T*ˆ *max n if <sup>m</sup>*/*<sup>n</sup>* <sup>→</sup> <sup>∞</sup>*. Moreover, <sup>T</sup>*<sup>ˆ</sup> *cusum <sup>n</sup>*,0 *converges to T* 0 *in distribution as n* <sup>→</sup> <sup>∞</sup>*, and so does <sup>T</sup>*<sup>ˆ</sup> *cusum n if m*/*n* → *λ* ∈ (0, ∞)*.*

The result in Theorem 1 can be used to determine a control limit for the monitoring process. Given significance level 0 < *α* < 1, we take *c* and *c* 0 satisfying *P*(*T* ≥ *c*) = *P*(*T* 0 ≥ *c* 0 ) = *α*. In particular, *<sup>P</sup>*(*<sup>T</sup>* <sup>≥</sup> *<sup>c</sup>*) = <sup>1</sup> <sup>−</sup> (*P*(sup0≤*s*≤<sup>1</sup> <sup>|</sup>*B*(*s*)| ≤ *<sup>c</sup>*))*<sup>d</sup>* , so that *c* can be obtained from the fact that *<sup>P</sup>*(sup0≤*s*≤<sup>1</sup> |*B*(*s*)| ≥ *c*) = 1 − (1 − *α*) 1/*d* . The performance of the proposed CUSUM monitoring methods is evaluated in our simulation study, focusing on *T*ˆ *cusum n* , *T*ˆ *min <sup>n</sup>*,0 , and *<sup>T</sup>*<sup>ˆ</sup> *min n* . (We do not report the result for *T*ˆ *max <sup>n</sup>*,0 and *<sup>T</sup>*<sup>ˆ</sup> *max n* , as these do not perform well compared to the others in most cases). Therein, a parametric bootstrap is adopted in obtaining control limits to reduce the parameter estimation effect, which can be more problematic when *m* is not so large compared to *n*, and the MDPDE from the training sample is used to generate the bootstrap sample.

#### **4. Simulation Results**

In this section, we compare the performance of the CUSUM monitoring processes *T*ˆ *cusum n* , *T*ˆ *min <sup>n</sup>*,0 , and *T*ˆ *min n* in three different experimental environments for the Poisson INGARCH(1,1) model as follows:

$$\mathcal{Y}\_t \mid \mathcal{F}\_{t-1} \sim \text{Poisson}\left(X\_t\right), \quad X\_t = \omega + aX\_{t-1} + b\mathcal{Y}\_{t-1}.$$

For the comparison, we compute the empirical sizes and powers at the nominal level of 0.05 for *m* = *n* = 500, 1000 with 1000 implications. For the critical value of *T*ˆ *min <sup>n</sup>*,0 , we use 2.633, which is the 0.95th quantile of sup0≤*s*≤<sup>1</sup> kB3(*s*)kmax. However, for *<sup>T</sup>*<sup>ˆ</sup> *cusum <sup>n</sup>* and *T*ˆ *min n* , we use the critical values obtained from a parametric bootstrap method, as the MDPDE ˆ*θα*,*<sup>m</sup>* might cause some size distortions. In implementation, the warp-bootstrap method is utilized to save computing times (Giacomini, Politis, and White [43]).


Case 1: *ω*<sup>1</sup> = (1 + *δ*)*ω*0, *a*<sup>1</sup> = (1 + *δ*)*a*0, *b*<sup>1</sup> = (1 + *δ*)*b*0; that is, all parameters change;

Case 2: *ω*<sup>1</sup> = (1 + *δ*)*ω*0, *a*<sup>1</sup> = *a*0, *b*<sup>1</sup> = *b*0; that is, only *ω* changes;

Case 3: *ω*<sup>1</sup> = *ω*0, *a*<sup>1</sup> = (1 + *δ*)*a*0, *b*<sup>1</sup> = *b*0; that is, only *a* changes;

Case 4: *ω*<sup>1</sup> = *ω*0, *a*<sup>1</sup> = *a*0, *b*<sup>1</sup> = (1 + *δ*)*b*0; that is, only *b* changes.



Figure 1 shows how the parameter change affects the pattern of the Poisson INGARCH(1,1) time series (Case 3) with *θ*<sup>0</sup> = (2, 0.3, 0.3), *τ* = 500, and *δ* = 0 for the left panel and *δ* = 0.5 for the right panel. As *EY<sup>t</sup>* = *<sup>ω</sup>* 1−*a*−*b* , we can see that parameter change causes a mean shift. Tables 1–3 list the size and powers for Part 1 (*τ* therein stands for the location of the change point) and show no severe size distortions and reasonably good powers for *<sup>δ</sup>* <sup>≥</sup> 0.5. In particular, *<sup>T</sup>*<sup>ˆ</sup> *cusum <sup>n</sup>* and *T*ˆ *min <sup>n</sup>*,0 largely outperform *T*ˆ *min n* in terms of power. However, as seen in Tables 4–8, the power of *T*ˆ *min n* in Part 2 appears to increase up to that of *T*ˆ *min <sup>n</sup>*,0 . In both Part 1 and Part 2, different *α* do not affect the size much, but a larger *α* tends to diminish the power. This appeals to our intuition, as the MLE is more efficient in the presence of no outliers.


**Table 1.** Empirical sizes and powers in Case 1 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.1, 0.2).

t t **Figure 1.** Plots of the Poisson INGARCH(1,1) time series (Case 3) with *θ*<sup>0</sup> = (2, 0.3, 0.3), *τ* = 500 and *δ* = 0 for the left panel and *δ* = 0.5 for the right panel.

Meanwhile, Tables 9–12 show that the outliers undermine the performance of the MLE-based monitoring processes in terms of both size and power; namely, size distortions are notable and the power decreases to a certain extent. This result particularly indicates that *T*ˆ *cusum n* is improved when the MDPDE with *α* > 0 is used, which demonstrates the efficacy of the MDPDE in the monitoring process. By contrast, the size of *T*ˆ *min n* significantly increases when *α* > 0, indicating that *T*ˆ *min n* is unstable; see Figure 2. Although not reported here, we also examined the performance of the same monitoring

processes for NB INGARCH(1,1) models. The result for this case showed a similar pattern to the Poisson INGARCH(1,1) case. All our findings strongly affirm that *T*ˆ *cusum n* is the most favorable among the monitoring methods considered in this study.


**Table 2.** Empirical sizes and powers in Case 2 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.6, 0.2).

**Figure 2.** Plots of the sizes and powers in Table 10 (Part 3, Case 2) for *n* = 1000. The left panel is for *T*ˆ *min <sup>n</sup>* and the right panel is for *<sup>T</sup>*<sup>ˆ</sup> *cusum n* .


**Table 3.** Empirical sizes and powers in Case 3 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.3, 0.3).

**Table 4.** Empirical sizes and powers in Case 4 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (1, 0.4, 0.4).



**Table 5.** Empirical sizes and powers in Case 1 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.1, 0.2).

**Table 6.** Empirical sizes and powers Case 2 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.6, 0.2).



**Table 7.** Empirical sizes and powers Case 3 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (2, 0.3, 0.3).

**Table 8.** Empirical sizes and powers in Case 4 for the Poisson INGARCH(1,1) model when no outliers exist with *θ*<sup>0</sup> = (1, 0.4, 0.4).



**Table 9.** Empirical sizes and powers in Case 1 for the Poisson INGARCH(1,1) model when *θ*<sup>0</sup> = (2, 0.1, 0.2), *p* = 0.1 and *λ* = 10.

**Table 10.** Empirical sizes and powers in Case 2 for the Poisson INGARCH(1,1) model when *θ*<sup>0</sup> = (2, 0.6, 0.2), *p* = 0.1 and *λ* = 30.



**Table 11.** Empirical sizes and powers in Case 3 for the Poisson INGARCH(1,1) model when *θ*<sup>0</sup> = (2, 0.3, 0.3), *p* = 0.1 and *λ* = 30.

**Table 12.** Empirical sizes and powers in Case 4 for the Poisson INGARCH(1,1) model when *θ*<sup>0</sup> = (1, 0.4, 0.4), *p* = 0.1 and *λ* = 30.


#### **5. Real Data Analysis**

In this section, we apply *T*ˆ *cusum n* to a real dataset, using the extreme events of the daily log-returns of GS stock from 2 July 2007 to 28 February 2020. Davis and Liu [11] and Kim and Lee [37] used the GS stock datasets with different periods, but their works were focused on parameter estimation and the retrospective change point test. For the task of online monitoring, we first calculated the hitting times, *τ*1, *τ*2, . . . , for which the log-returns of GS stock fall outside the 0.05 and 0.95 quantiles of the data, and generated the time series of counts *Y<sup>t</sup>* = *τ<sup>t</sup>* − *τt*−<sup>1</sup> ≥ 0, *t* = 1, . . . , 319. Figure 3 plots *Y<sup>t</sup>* and exhibits the presence of a number of outliers.

t **Figure 3.** Plot of the return times of extreme events for Goldman Sachs Group stock.

Fitting the Poisson INGARCH(1,1) model to the whole observations, we have the MLE of (*ω*ˆ , *a*ˆ, ˆ*b*) = (1.969, 0.152, 0.664) and the MDPDE of (*ω*ˆ , *a*ˆ, ˆ*b*) = (1.213, 0.144, 0.472) when *α* = 0.1 is used. The significant difference between the two estimates is seemingly due to the presence of outliers. Using *Y<sup>t</sup>* , *t* = 1, . . . , 150 as a training sample and viewing *Y<sup>t</sup>* , *t* ≥ 151 as sequentially observed testing data, we implement the monitoring process *T*ˆ *cusum <sup>n</sup>* with *α* = 0, 0.1 to detect a parameter change. Subsequently, an anomaly is detected when *t* = 180 for *α* = 0 (blue vertical line) and *t* = 197 for *α* = 0.1 (red vertical line), which indicates that the monitoring process based on the MLE is more sensitive to relatively smaller outliers lying around *t* = 180, while that based on MDPDE is more robust to those outliers and detects a more significant change around *t* = 197, ignoring smaller ones. Obviously, we can see from Figure 3 that *Y<sup>t</sup>* has a pattern with more fluctuations after *t* = 180. Our finding affirms the adequacy of the MDPDE-based monitoring process in the presence of outliers.

#### **6. Concluding Remarks**

In this work, we studied the robust on-line monitoring process based on MDPDE for detecting a parameter change in INGARCH models. For this task, we adopted the CUSUM process based on the score functions, which were originally constructed for obtaining the MDPDE. Our simulation study and real data analysis confirmed the validity of the proposed method. Here, we focused on the monitoring process within the framework of Gombay and Serban [30] and Huh, Kim and Lee [31]. However, one can also consider a different monitoring scheme, for example as in Na, Lee and Lee [44], and conduct a comparison study, which is left as our future project.

**Author Contributions:** Conceptualization, S.L.; project administration, S.L.; methodology and derivation, S.L.; data curation, D.K.; empirical analysis, D.K.; writing, S.L.; funding acquisition, S.L. All authors read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning (Grant 2018R1A2A2A05019433).

**Acknowledgments:** We would like to thank the Editor and two anonymous referees for their careful reading and valuable comments.

**Conflicts of Interest:** The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A**

**Proof of Theorem 1.** We first verify that *T*ˆ *max n* converges to *T* in distribution; the cases of *T*ˆ *min <sup>n</sup>*,0 , *<sup>T</sup>*<sup>ˆ</sup> *max <sup>n</sup>*,0 , and *T*ˆ *min n* can be similarly handled and the proofs for these are omitted. As ˆ*θα*,*<sup>m</sup>* converges to *θ*<sup>0</sup> and

$$E\left(\sup\_{\theta \in \Theta} \left\| \frac{\partial^2 l\_{\mathfrak{a},t}(\theta)}{\partial \theta \partial \theta^T} - \frac{\partial^2 l\_{\mathfrak{a},t}(\theta\_0)}{\partial \theta \partial \theta^T} \right\|\right) < \infty.$$

we have that for any sequence *θ* ∗ *n* converging to *θ*<sup>0</sup> a.s.,

$$\frac{1}{n}\sum\_{t=1}^{n}\frac{\partial^2 l\_{\mathbf{a},t}(\theta\_n^\*)}{\partial\theta\partial\theta^T} \to -J\_{\mathbf{a}}\tag{A1}$$

in probability. Then, using the mean value theorem and ergodicity, owing to (A1), we have

$$\begin{split} & \max\_{1 \le k \le n} \max\_{1 \le j \le k} \left| \left| \frac{1}{\sqrt{n}} \sum\_{t=1}^{j} \frac{\partial l\_{a,t}(\hat{\theta}\_{a,m})}{\partial \theta} - \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial l\_{a,t}(\hat{\theta}\_{a,m})}{\partial \theta} \right| \\ & \qquad \quad - \left\{ \frac{1}{\sqrt{n}} \sum\_{t=1}^{j} \frac{\partial l\_{a,t}(\theta\_{0})}{\partial \theta} - \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial l\_{a,t}(\theta\_{0})}{\partial \theta} \right\} \Bigg| \Bigg| \\ & \le \sqrt{m} ||\dot{\theta}\_{a,m} - \theta\_{0}|| \sqrt{\frac{n}{m}} \max\_{1 \le j,k \le n} \left| \left( \frac{j}{n} \right) \frac{1}{j} \sum\_{t=1}^{j} \frac{\partial^{2} l\_{a,t}(\theta\_{n}^{\*})}{\partial \theta \partial \theta^{T}} - \left( \frac{k}{n} \right) \frac{1}{k} \sum\_{t=1}^{k} \frac{\partial^{2} l\_{a,t}(\theta\_{n}^{\*\*})}{\partial \theta \partial \theta^{T}} \right| \Bigg| \\ & = o\_{P}(1), \end{split} \tag{A2}$$

where ˆ*θ* ∗ *<sup>n</sup>* and <sup>ˆ</sup>*<sup>θ</sup>* ∗∗ *<sup>n</sup>* are intermediate points between *<sup>θ</sup>*<sup>0</sup> and <sup>ˆ</sup>*θα*,*m*. Hence, since *<sup>K</sup>*b*α*,*<sup>m</sup>* is a consistent estimator of *K<sup>α</sup>* (Lemma A5 of Kim and Lee [36]) and

$$\sup\_{\theta \in \Theta} \max\_{1 \le k \le n} \left| \left| \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial \check{l}\_{a,t}(\theta)}{\partial \theta} - \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial l\_{a,t}(\theta)}{\partial \theta} \right| \right| = o\_P(1) \tag{A3}$$

(Lemma 6 of Kim and Lee, 2019), we get *T*ˆ *max <sup>n</sup>* − *T max <sup>n</sup>* = *oP*(1) and *T*ˆ *max n* converges to *T* in distribution owing to (9).

Next, we deal with *T*ˆ *cusum n* . The case of *T*ˆ *cusum <sup>n</sup>*,0 can be similarly handled. Similarly to (A2), we can see that

$$\begin{split} \max\_{1 \le k \le n} \left| \left| \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial l\_{a,t}(\hat{\theta}\_{a,m})}{\partial \theta} - \frac{1}{\sqrt{n}} \left( \frac{k}{n} \right) \sum\_{t=1}^{n} \frac{\partial l\_{a,t}(\hat{\theta}\_{a,m})}{\partial \theta} \right| \\ - \frac{1}{\sqrt{n}} \sum\_{t=1}^{k} \frac{\partial l\_{a,t}(\theta\_0)}{\partial \theta} - \frac{1}{\sqrt{n}} \left( \frac{k}{n} \right) \sum\_{t=1}^{n} \frac{\partial l\_{a,t}(\theta\_0)}{\partial \theta} \right| \bigg|\_{} = o\_P(1). \end{split} \tag{A4}$$

Then, using the arguments as in (A3) and (A4), we can see that

$$\max\_{1 \le k \le n} \frac{1}{\sqrt{n}} \left| \left| \hat{\mathcal{W}}\_k - \left( \frac{k}{n} \right) \hat{\mathcal{W}}\_k - \mathcal{W}\_k + \left( \frac{k}{n} \right) \mathcal{W}\_k \right| \right| = o\_P(1),$$

which implies *T*ˆ *cusum <sup>n</sup>* − *T cusum <sup>n</sup>* = *oP*(1) and *T*ˆ *cusum n d* → *T* 0 holds owing to (9). This completes the proof.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### *Article*
