Article

A New Class of Weighted CUSUM Statistics

1
Department of Computer Science, Mathematics, Physics and Statistics, University of British Columbia, Kelowna, BC V1V 1V7, Canada
2
Department of Mathematics, University of Louisiana at Lafayette, Lafayette, LA 70503, USA
3
Department of Statistical Sciences, University of Toronto, Toronto, ON M5S 3G3, Canada
*
Author to whom correspondence should be addressed.
Entropy 2022, 24(11), 1652; https://doi.org/10.3390/e24111652
Submission received: 20 September 2022 / Revised: 9 November 2022 / Accepted: 11 November 2022 / Published: 14 November 2022
(This article belongs to the Special Issue Recent Advances in Statistical Theory and Applications)

Abstract

A change point is a location or time at which observations or data obey two different models: before and after. In real problems, we may know some prior information about the location of the change point, say at the right or left tail of the sequence. How does one incorporate the prior information into the current cumulative sum (CUSUM) statistics? We propose a new class of weighted CUSUM statistics with three different types of quadratic weights accounting for different prior positions of the change points. One interpretation of the weights is the mean duration in a random walk. Under the normal model with known variance, the exact distributions of these statistics are explicitly expressed in terms of eigenvalues. Theoretical results about the explicit difference of the distributions are valuable. The expansions of asymptotic distributions are compared with the expansion of the limit distributions of the Cramér-von Mises statistic and the Anderson and Darling statistic. We provide some extensions from independent normal responses to more interesting models, such as graphical models, the mixture of normals, Poisson, and weakly dependent models. Simulations suggest that the proposed test statistics have better power than the graph-based statistics. We illustrate their application to a detection problem with video data.

1. Introduction

A change point is a location or time at which observations or data obey two different models: before and after. Detecting change points is a nontrivial problem and has been studied by many authors; see a book treatment in [1] and recent advances in CUSUM-based change point tests [2,3,4]. In real problems, we may know some prior information about the location of the change point, say at the right or left tail of the sequence. How does one incorporate prior information into current CUSUM-based statistics? We consider a new class of weighted CUSUM statistics for a simple model and provide some extensions to more complicated models.
Given a series of univariate random variables $Y_1, \ldots, Y_n$, we consider the problem of testing whether there is a change in the mean of their distribution. The test statistic we use is:
$$S_n(Y; \tau, \gamma) = \sum_{k=1}^{n-1} w_k^{-1}(\tau) \left\{ \sum_{i=1}^{k} (Y_i - \bar{Y}) \right\}^{\gamma}, \tag{1}$$
where $Y = (Y_1, \ldots, Y_n)^\top$, $\bar{Y} = n^{-1} \sum_{j=1}^{n} Y_j$, $\gamma > 0$, and
$$w_k(\tau) = -(k - \tau)^2 + \max\{\tau^2, (n - \tau)^2\} = \begin{cases} (n + k)(n - k), & \text{if } \tau = 0, \\ k(n - k), & \text{if } \tau = n/2, \\ k(2n - k), & \text{if } \tau = n, \end{cases} \tag{2}$$
where $\tau = 0$, $n/2$, and $n$ account for three different prior positions of the change point, respectively. We call $S_n$ a weighted CUSUM (WC) statistic.
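As a minimal sketch of how the statistic in Equation (1) could be computed, the following NumPy code evaluates the quadratic weights and $S_n$ for a simulated sequence. The function names `quadratic_weights` and `wc_statistic`, and the simulated data, are illustrative assumptions and are not part of the paper.

```python
import numpy as np

def quadratic_weights(n, tau):
    """Quadratic weights w_k(tau) = -(k - tau)^2 + max(tau^2, (n - tau)^2), k = 1..n-1."""
    k = np.arange(1, n)
    return -(k - tau) ** 2 + max(tau ** 2, (n - tau) ** 2)

def wc_statistic(y, tau, gamma=2):
    """Weighted CUSUM statistic S_n(Y; tau, gamma) from Equation (1).

    gamma is assumed to be a positive integer here (gamma = 2 is the usual choice).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    cusum = np.cumsum(y - y.mean())[:-1]  # partial sums for k = 1, ..., n-1
    return float(np.sum(cusum ** gamma / quadratic_weights(n, tau)))

# Hypothetical example: a mean shift of size 1 after the 30th of 60 observations.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 30), rng.normal(1.0, 1.0, 30)])
for tau in (0, len(y) / 2, len(y)):
    print(tau, wc_statistic(y, tau))
```

The three values of `tau` only change the denominator $w_k(\tau)$; the centered partial sums are shared.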
Inspired by the change point literature, we consider these types of quadratic weights. The term $\max\{\tau^2, (n - \tau)^2\} = \max_{0 \le j \le n} (j - \tau)^2$ is introduced to ensure that the weight $w_k(\tau)$ is positive for any $0 < k < n$. Usually, we choose $\gamma = 2$ to capture the change in the mean. When $\tau = n/2$, the weight $w_k(n/2) = k(n - k)$ corresponds to the likelihood ratio test; see Csörgö and Horváth [1] and a related review in Jandhyala et al. [5]. If prior information indicates that the change point more likely occurs in the right or left tail of the sequence, we can set the weight $w_k(0) = (n + k)(n - k)$ (left drifted to the symmetry center point 0) or $w_k(n) = k(2n - k)$ (right drifted to the symmetry center point $n$) to improve the power of the test.
One interpretation of the weights is the mean duration in a random walk $\{X_i, i \ge 0\}$ on $N + 1$ states, $\{0, 1, \ldots, N\}$, whose transition probability is given by $P(X_{i+1} = k \pm 1 \mid X_i = k) = 1/2$ for $k = 1, \ldots, N - 1$, $P(X_{i+1} = 0 \mid X_i = 0) = 1$, and $P(X_{i+1} = N \mid X_i = N) = 1$. Let $T$ denote the random time at which the process first reaches 0 or $N$. Then, for $k = 1, \ldots, n - 1$, $E(T \mid X_0 = k) = k(n - k) = w_k(n/2)$ if $N = n$; $E(T \mid X_0 = k) = k(2n - k) = w_k(n)$ if $N = 2n$; and $E(T \mid X_0 = n - k) = (n + k)(n - k) = w_k(0)$ if $N = 2n$. Figure 1 depicts four vectors $w_k$ for $n = 10$. The centers of symmetry of these quadratic weights are at different positions.
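The mean-duration interpretation can be checked numerically by first-step analysis. The sketch below solves the standard linear system $m_k = 1 + (m_{k-1} + m_{k+1})/2$ with $m_0 = m_N = 0$; the function name `mean_duration` is an assumption made for illustration.

```python
import numpy as np

def mean_duration(N):
    """Expected absorption time E(T | X_0 = k), k = 1..N-1, for the symmetric walk on {0..N}."""
    # First-step analysis: m_k - (m_{k-1} + m_{k+1}) / 2 = 1 with m_0 = m_N = 0.
    M = np.eye(N - 1) - 0.5 * (np.eye(N - 1, k=1) + np.eye(N - 1, k=-1))
    return np.linalg.solve(M, np.ones(N - 1))

n = 10
k = np.arange(1, n)
m_n = mean_duration(n)        # N = n
m_2n = mean_duration(2 * n)   # N = 2n
print(np.allclose(m_n, k * (n - k)))                      # compare with w_k(n/2)
print(np.allclose(m_2n[k - 1], k * (2 * n - k)))          # compare with w_k(n)
print(np.allclose(m_2n[n - k - 1], (n + k) * (n - k)))    # compare with w_k(0), start at n - k
```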
The weights in (1) can be thought of as an inverse prior probability on the change point, giving S n a Bayesian flavor, as in Gardner [6], who used the uniform prior n 2 , or Perron [7], who devised a unit-root test for time series. From a frequentist perspective, the weighted sum statistic offers an alternative to the maximum statistic most commonly used Csörgö and Horváth [1], which we show (in small simulations omitted here) has higher power, especially when the change point is at the center of the sequence for any τ , in the right tail of the sequence for τ = n , and in the left tail of the sequence for τ = 0 .
For these types of quadratic weights, a couple of questions naturally arise: will different weights lead to different distributions of WC in Equation (1)? If so, how significant will the differences in the distribution be? If two different weights lead to the same distribution, are there any intrinsic reasons? Although one can estimate the distribution of WC by simulation, theoretical results about the explicit differences of the distributions are valuable. Moreover, simulations and computations of eigenvalues for large $n$ are computationally expensive. To answer the aforementioned questions, we study the distribution of the WC theoretically; we derive Karhunen–Loève expansions of the exact and asymptotic distributions of the WC statistics. The calculation of a Karhunen–Loève expansion is a nontrivial task, even under the normal model. Gardner [6] discussed the uniform weight under the normal assumption, but the quadratic weights we consider here increase the difficulty substantially. We present below a unified theory that enables us to establish the distribution of WC using dual Hahn polynomials. The asymptotic distributions for the quadratic weights $w_k(0)$ and $w_k(n)$ are identical, and the expansions of the asymptotic distributions for $w_k(0)$ and for the other quadratic weight $w_k(n/2)$ differ by the odd-indexed terms. We make a comparison with the expansion of the limit distributions of the Cramér-von Mises statistic and the Anderson and Darling statistic; see also MacNeill [8].
The WC has some variants in other models. For example, in the graphical model, γ can be 1 if we replace Y with a count of edges. Here, the main challenge is to approximate the covariance of edge-count statistics under the null permutations. In the normal mixed model, a variant of WC can be derived by considering a marginal likelihood function. In the Poisson mixed model, however, the calculation of the marginal likelihood function is hindered by an integral without a closed form. To approximate this integral, one may use Laplace, or saddle point approximation [9,10,11,12,13]. Here, we apply the saddle point approximation to the integral and provide a variant of WC related to the log link. For the classical change point Poisson model without latent variables, see [1] (p. 27); for the Poisson process with a change point, we refer readers to Akman and Raftery [14], Loader [15]. Moreover, to adopt the assumption of weak dependence in practice, we avoid the estimation of the variance and provide a randomized version of WC.
The structure of the paper is outlined as follows. In Section 2, we derive the explicit expansions of the distribution of the WC statistics and explore their connections with the Karhunen–Loève expansion. We derive extended versions of WC by considering the observations as nodes in the graphical model and allowing the observations from a normal or Poisson mixed model to be weakly dependent. In Section 3, we discuss the power of the proposed WC test. In Section 4, we use simulation to compare the performance of this test with that of a graph-based test statistic. In Section 5, we present an application for video data. In Section 6, we discuss the extension to multiple change points and suggest future work on other quadratic weights.

2. Exact and Asymptotic Distributions of the WC Statistics

2.1. Explicit Distribution for a Normal Model

We assume here that $\{Y_i\}$ are independent and follow a normal distribution with a common known variance $\sigma^2$. The case of unknown $\sigma^2$ is addressed in Remark 3, and an extension relaxing the independence assumption is given in Section 2.6.
Following the derivation in Gardner [6], we write (1) as a quadratic form
$$S_n(Y; \tau, 2) = \frac{1}{n^2} \sum_{k=1}^{n-1} p_k \left\{ \sum_{i=1}^{k} (n - k) Y_i - \sum_{i=k+1}^{n} k Y_i \right\}^2 = \frac{1}{n^2} Y^\top A A^\top Y = Y^\top Q Y, \tag{3}$$
where $p_k = p_k(\tau) = w_k^{-1}(\tau)$, and $n^2 Q = A A^\top$ with $A = (A_1, \ldots, A_{n-1})$. Here, $A_k = p_k^{1/2} (n - k, \ldots, n - k, -k, \ldots, -k)^\top$ such that the first $k$ entries of $A_k$ are $p_k^{1/2}(n - k)$ and the last $n - k$ entries are $-p_k^{1/2} k$.
By using the recurrence identity and the dual Hahn polynomial, we obtain a new exact result in terms of the eigenvalues of $Q$ in (3).
Theorem 1. 
Assume that $\{Y_i\}$ are independent normally distributed random variables with a common mean and known variance $\sigma^2$. The exact distribution of $S_n(Y; \tau, 2)$ is
$$\frac{S_n(Y; \tau, 2)}{\sigma^2} \overset{d}{=} \sum_{k=1}^{n} \lambda_k(\tau) Z_k^2, \tag{4}$$
where $Z_k$ are independent and identically distributed normal random variables with mean zero and variance 1, $\lambda_n(\tau) = 0$, and
$$\lambda_k(\tau) = \begin{cases} \dfrac{1}{k(k + 1)}, & k = 1, \ldots, n - 1, \text{ if } \tau = n/2, \\[2ex] \dfrac{1}{2k(2k + 1)}, & k = 1, \ldots, n - 1, \text{ if } \tau = 0 \text{ or } n. \end{cases}$$
The proof of Theorem 1 is given in Appendix A. We make the following remarks.
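Theorem 1 can be checked numerically for small $n$ by building $Q = n^{-2} A A^\top$ and comparing its eigenvalues with the stated formulas. This is a sketch for illustration only; `q_matrix` and `theoretical_eigenvalues` are hypothetical helper names.

```python
import numpy as np

def q_matrix(n, tau):
    """Build Q = A A^T / n^2, where column k of A is p_k^{1/2} (n-k, ..., n-k, -k, ..., -k)^T."""
    k = np.arange(1, n)
    w = -(k - tau) ** 2 + max(tau ** 2, (n - tau) ** 2)
    A = np.empty((n, n - 1))
    for j, kk in enumerate(k):
        # First kk entries equal n - kk, the remaining n - kk entries equal -kk.
        col = np.where(np.arange(1, n + 1) <= kk, n - kk, -kk)
        A[:, j] = col / np.sqrt(w[j])
    return A @ A.T / n ** 2

def theoretical_eigenvalues(n, tau):
    """Nonzero eigenvalues lambda_k(tau), k = 1..n-1, as stated in Theorem 1."""
    k = np.arange(1, n)
    if tau == n / 2:
        return 1.0 / (k * (k + 1))
    return 1.0 / (2 * k * (2 * k + 1))

n = 8
for tau in (0, n / 2, n):
    numeric = np.sort(np.linalg.eigvalsh(q_matrix(n, tau)))[::-1]
    # The last eigenvalue is (numerically) zero; the others should match the theorem.
    print(tau, np.allclose(numeric[:-1], theoretical_eigenvalues(n, tau)))
```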
Remark 1. 
It is interesting that $\lambda_{2k}(n/2) = \lambda_k(0)$ for all $0 < k < n/2$; namely, the eigenvalues for $w_k(n/2)$ with even indices coincide with the eigenvalues for $w_k(0)$ with indices less than $n/2$. As the sample size increases from $n$ to $n + 1$, the $n - 1$ nonzero eigenvalues are retained and the added nonzero eigenvalue must be $1/\{n(n + 1)\}$ for $w_k(n/2)$ or $1/\{2n(2n + 1)\}$ for $w_k(0)$ or $w_k(n)$. This interesting phenomenon has not been seen in the uniform weights of Gardner [6]. As far as we know, this recursive property of the eigenvalues for the non-uniform weights is new. Figure 2 depicts the pattern of eigenvalues (cross products of rows and columns) illustrated by dots for three weights $w_k(n/2)$ (blue), $w_k(0)$ (green), and $w_k(n)$ (purple) with the increase of $n$.
Remark 2. 
The distribution in (4) can be calculated numerically using Imhof’s method [16] or simulated by a Monte Carlo method, but accurate analytical approximations are potentially faster and more stable. A saddle point approximation to the distribution of quadratic forms in normal variates was studied in Kuonen [17], building on Daniels [9,18] and Lugannani and Rice [19].
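The Monte Carlo option mentioned in Remark 2 could look like the sketch below, which simulates $\sum_k \lambda_k(\tau) Z_k^2$ directly from the eigenvalues in Theorem 1. It is not Imhof's method, and the names `wc_eigenvalues` and `mc_critical_value`, the seed, and the number of draws are assumptions chosen for illustration.

```python
import numpy as np

def wc_eigenvalues(n, tau):
    """Nonzero eigenvalues lambda_k(tau), k = 1..n-1, from Theorem 1."""
    k = np.arange(1, n)
    if tau == n / 2:
        return 1.0 / (k * (k + 1))
    return 1.0 / (2 * k * (2 * k + 1))

def mc_critical_value(lam, p=0.95, n_sim=100_000, seed=0):
    """Monte Carlo estimate of c_p with P(sum_k lam_k Z_k^2 <= c_p) = p."""
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal((n_sim, lam.size)) ** 2 @ lam
    return float(np.quantile(draws, p))

n = 40
for tau in (0, n / 2, n):
    print(tau, mc_critical_value(wc_eigenvalues(n, tau)))
```

The Monte Carlo error shrinks only at rate $n_{\text{sim}}^{-1/2}$, which is one reason numerical inversion or saddle point approximations are attractive.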
Remark 3. 
When the variance $\sigma^2$ is unknown, we can, by Slutsky’s lemma, replace $\sigma^2$ with the consistent estimator
$$\hat{\sigma}^2 = (n - 1)^{-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2.$$
This also holds in Corollary 1. For dependent data, one issue is to give a valid estimate of the variance; see Section 2.6.

2.2. Karhunen–Loève Expansion

The squared integral of a Brownian bridge arises in the study of tests for goodness-of-fit. Given a sample of independent and identically distributed random variables with an empirical distribution function $F_n(x)$, the statistic
$$\omega_n^2(\psi) = n \int_{-\infty}^{\infty} \{F_n(t) - F(t)\}^2 \, \psi\{F(t)\} \, dF(t)$$
provides a test of the null hypothesis that the observations come from the distribution $F(\cdot)$. The Cramér-von Mises statistic has $\psi(t) \equiv 1$, and the Anderson-Darling statistic has $\psi(t) = 1/\{t(1 - t)\}$. Here, we shall discuss two new weights: $\psi(t) = 1/\{t(2 - t)\}$ and $\psi(t) = 1/(1 - t^2)$.
MacNeill [8] showed that
$$\int_0^1 \{B(t) - tB(1)\}^2 \, dt = \sum_{k=1}^{\infty} \frac{1}{k^2 \pi^2} Z_k^2,$$
using a Fourier expansion of $B(t) - tB(1) = \sum_{k=1}^{\infty} \sqrt{2} \sin(k \pi t)/(k \pi) \, Z_k$, where $\{\sqrt{2} \sin(k \pi t), k = 1, 2, \ldots\}$ is an orthonormal basis in $L^2(0, 1)$, $B(t)$ is a standard Brownian motion, and $B(t) - tB(1)$ is a Brownian bridge.
Anderson and Darling [20] showed that
$$\int_0^1 \frac{\{B(t) - tB(1)\}^2}{t(1 - t)} \, dt = \sum_{k=1}^{\infty} \frac{1}{k(k + 1)} Z_k^2.$$
In Appendix B, we use Jacobi polynomials to derive the Karhunen–Loève expansion for the integrals of the weighted square of the Brownian bridge with two new weights $\psi(t) = 1/\{t(2 - t)\}$ and $\psi(t) = 1/(1 - t^2)$. The results are stated in the following theorem.
Theorem 2. 
The two weights $\psi(t) = 1/\{t(2 - t)\}$ and $\psi(t) = 1/(1 - t^2)$ lead to the same Karhunen–Loève expansions:
$$\int_0^1 \frac{\{B(t) - tB(1)\}^2}{2t - t^2} \, dt = \sum_{k=1}^{\infty} \frac{1}{2k(2k + 1)} Z_k^2$$
and
$$\int_0^1 \frac{\{B(t) - tB(1)\}^2}{1 - t^2} \, dt = \sum_{k=1}^{\infty} \frac{1}{2k(2k + 1)} Z_k^2.$$
The proof of the above two equalities is provided in Appendix B. One can see the equivalence of these two equalities by using a change of variable.
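The change of variable can be sketched as follows. The shorthand $W(t) = B(t) - tB(1)$ is introduced here for convenience, and the argument assumes the standard time-reversal property of the Brownian bridge, namely that $\{W(1 - s)\}$ and $\{W(s)\}$ have the same law on $[0, 1]$. Substituting $t = 1 - s$ gives:

```latex
\int_0^1 \frac{W(t)^2}{1 - t^2}\,dt
  = \int_0^1 \frac{W(1 - s)^2}{1 - (1 - s)^2}\,ds
  = \int_0^1 \frac{W(1 - s)^2}{2s - s^2}\,ds
  \overset{d}{=} \int_0^1 \frac{W(s)^2}{2s - s^2}\,ds .
```

The last step is an equality in distribution only, which is enough for the two weights to share the same expansion.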
Table 1 presents, for several probabilities $p$ and sample sizes $n$, the critical values $c_p$ for which $p = P(\chi_n^2(\tau) \le c_p)$, where $\chi_n^2(\tau) = \sum_{k=1}^{n} \lambda_k(\tau) Z_k^2$. The critical values for finite $n$ are calculated by Imhof’s method [16], implemented in the R package CompQuadForm [21]. A few critical values are tabulated in Anderson and Darling [22] for $\chi_n^2(\tau)$ with $n = \infty$. One can see that the critical values converge very quickly as $n$ increases to $\infty$.
In fact, we can connect the limit distribution of the WC statistic and its functional limit distribution by the Karhunen–Loève expansion of the integral of the weighted square of the Brownian bridge in terms of the Jacobi polynomials. Theorem 1 immediately implies the following asymptotic distribution as $n \to \infty$.
Corollary 1. 
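Under the assumptions of Theorem 1, when $n \to \infty$,
$$\frac{S_n(Y; \tau, 2)}{\sigma^2} \xrightarrow{d} \sum_{k=1}^{\infty} \lambda_k(\tau) Z_k^2.$$
One can check that $\sum_{k=n}^{\infty} \lambda_k(\tau) Z_k^2 \xrightarrow{p} 0$ by Markov’s inequality. Hence, $\sum_{k=1}^{n-1} \lambda_k(\tau) Z_k^2$ converges to $\sum_{k=1}^{\infty} \lambda_k(\tau) Z_k^2$ in probability as $n \to \infty$. By the functional limit theorem,
$$\frac{S_n(Y; \tau, 2)}{\sigma^2} \xrightarrow{d} \begin{cases} \displaystyle\int_0^1 \frac{\{B(t) - tB(1)\}^2}{1 - t^2} \, dt, & \text{if } \tau = 0, \\[2ex] \displaystyle\int_0^1 \frac{\{B(t) - tB(1)\}^2}{t(1 - t)} \, dt, & \text{if } \tau = n/2, \\[2ex] \displaystyle\int_0^1 \frac{\{B(t) - tB(1)\}^2}{2t - t^2} \, dt, & \text{if } \tau = n. \end{cases}$$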
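The Markov-inequality step used for Corollary 1 can be spelled out as below. This is a sketch that relies only on $E(Z_k^2) = 1$ and on the bound $\lambda_k(\tau) \le 1/\{k(k + 1)\}$, which holds for all three choices of $\tau$ in Theorem 1. For any $\varepsilon > 0$:

```latex
P\left( \sum_{k=n}^{\infty} \lambda_k(\tau) Z_k^2 > \varepsilon \right)
  \le \frac{1}{\varepsilon} \sum_{k=n}^{\infty} \lambda_k(\tau)
  \le \frac{1}{\varepsilon} \sum_{k=n}^{\infty} \left( \frac{1}{k} - \frac{1}{k + 1} \right)
  = \frac{1}{n \varepsilon} \longrightarrow 0, \qquad n \to \infty .
```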

2.3. Graphical Model

Assume the $\{Y_{i,j}, 1 \le i \le n, 1 \le j \le q\}$ are independent and have common mean $E(Y_{i,j}) = \mu_i$ and variance $\mathrm{Var}(Y_{i,j}) = \sigma_i^2$. Consider testing
$$H_0: \mu_i \equiv \mu, \ \sigma_i^2 \equiv \sigma^2 \quad \text{vs} \quad H_a: \mu_i = \begin{cases} \mu_-, & \text{for } i \le k^*, \\ \mu_+, & \text{for } i > k^*, \end{cases} \ \text{ or } \ \sigma_i^2 = \begin{cases} \sigma_-^2, & \text{for } i \le k^*, \\ \sigma_+^2, & \text{for } i > k^*, \end{cases}$$
where $\mu_- \ne \mu_+$ or $\sigma_-^2 \ne \sigma_+^2$, and the parameters $\mu$, $\mu_-$, $\mu_+$, $\sigma^2$, $\sigma_-^2$, and $\sigma_+^2$ are unknown.
A graphical model can be established by treating each $q$-dimensional vector as a node and assigning the Euclidean distance between any two vectors. Here, we consider a path $P$ with an ordering of nodes $(v_1, \ldots, v_n)$ and edges $(v_i, v_{i+1})$ for $i = 1, \ldots, n - 1$. Associated with the path, the count of edges that connect nodes between the two disjoint sets $N_k = \{1, \ldots, k\}$ and $\bar{N}_k = \{k + 1, \ldots, n\}$ is defined to be:
$$C_P(N_k, \bar{N}_k) = \sum_{i=1}^{n-1} I\left[ \{(v_i \in N_k) \cap (v_{i+1} \in \bar{N}_k)\} \cup \{(v_{i+1} \in N_k) \cap (v_i \in \bar{N}_k)\} \right],$$
where $I(\cdot)$ is an indicator function that takes the value 1 if its argument is true and 0 otherwise. Thus $C_P(N_k, \bar{N}_k)$ counts the edges between the two groups $N_k$ and $\bar{N}_k$.
Denote the expectation and variance of $C_P(N_k, \bar{N}_k)$ under the $n!$ permutations of nodes as $E_{\mathrm{perm}} C_P(N_k, \bar{N}_k)$ and $\mathrm{Var}_{\mathrm{perm}} C_P(N_k, \bar{N}_k)$. By [23],
$$E_{\mathrm{perm}} C_P(N_k, \bar{N}_k) = \frac{2k(n - k)}{n} \quad \text{and} \quad \mathrm{Var}_{\mathrm{perm}} C_P(N_k, \bar{N}_k) = \frac{2k(n - k)\{2k(n - k) - n\}}{n^3 - n^2}.$$
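For a small path, the permutation moments of the edge count can be verified by brute-force enumeration. The sketch below is illustrative only; `edge_count` and `permutation_moments` are hypothetical names, and full enumeration is feasible only for small $n$.

```python
import itertools
import numpy as np

def edge_count(order, k):
    """Number of path edges (v_i, v_{i+1}) joining N_k = {1..k} and its complement."""
    in_left = np.asarray(order) <= k
    return int(np.sum(in_left[:-1] != in_left[1:]))

def permutation_moments(n, k):
    """Exact mean and variance of the edge count over all n! node permutations."""
    counts = np.array([edge_count(p, k)
                       for p in itertools.permutations(range(1, n + 1))])
    return counts.mean(), counts.var()

n, k = 6, 2
mean, var = permutation_moments(n, k)
print(mean, 2 * k * (n - k) / n)
print(var, 2 * k * (n - k) * (2 * k * (n - k) - n) / (n ** 3 - n ** 2))
```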
A WC statistic may be constructed as
$$S_n(P; \tau, \gamma) = \sum_{k=1}^{n-1} w_k^{-1}(\tau) \left\{ -C_P(N_k, \bar{N}_k) + \frac{2k(n - k)}{n} \right\}^{\gamma}.$$
A large value of the observed $S_n(P^*; \tau, \gamma)$ based on the shortest Hamiltonian path (SHP), $P^*$, indicates a rejection of the null hypothesis, i.e., there is a change point; see the heuristic algorithm of SHP in Biswas et al. [24] and the analysis of power and change point in Shi, Wu and Rao [25,26] for $\gamma = 2$ and $w_k(\tau) = \mathrm{Var}_{\mathrm{perm}} C_P(N_k, \bar{N}_k)$. Here, we will establish the asymptotic distribution of $S_n(P; \tau, \gamma)$ for $\gamma = 1, 2$. First, we give the following lemma.
Lemma 1. 
For k = t n with 0 < t < 1 ,
1 2 n C P ( N k , N ¯ k ) + 2 k ( n k ) n d { B ( t ) t B ( 1 ) } 2 t ( 1 t ) , n .
By the functional limit theorem,
n 2 S n ( P ; τ , 1 ) d 0 1 { B ( t ) t B ( 1 ) } 2 1 t 2 d t + log ( 2 ) 1 , if τ = 0 , 0 1 { B ( t ) t B ( 1 ) } 2 t ( 1 t ) d t 1 , if τ = n / 2 , 0 1 { B ( t ) t B ( 1 ) } 2 2 t t 2 d t + log ( 2 ) 1 , if τ = n ,
and
1 2 S n ( P ; τ , 2 ) d 0 1 { B ( t ) t B ( 1 ) } 2 t ( 1 t ) 2 1 t 2 d t , if τ = 0 , 0 1 { B ( t ) t B ( 1 ) } 2 t ( 1 t ) 2 t ( 1 t ) d t , if τ = n / 2 , 0 1 { B ( t ) t B ( 1 ) } 2 t ( 1 t ) 2 2 t t 2 d t , if τ = n ,
which solves an open problem in [25,26]. Different values of $\gamma$ lead to different rates of convergence and different "normings".

2.4. Normal Mixed Model

Assume $Y_{i,j} = \mu_i + U_i + e_{i,j}$, where $1 \le i \le n$, $1 \le j \le q$, the $e_{i,j}$ are independent and identically normally distributed with mean zero and variance $\sigma^2$, and the $U_i$ are independent latent variables following a normal distribution with mean zero and variance $\nu^2$.
Consider testing
$$H_0: \mu_i \equiv \mu \quad \text{vs} \quad H_a: \mu_i = \begin{cases} \mu_-, & \text{for } i \le k^*, \\ \mu_+, & \text{for } i > k^*, \end{cases}$$
where $\mu_- \ne \mu_+$, the parameters $\mu$, $\mu_-$ and $\mu_+$ are unknown, and we tentatively assume the time $k^*$, called the change point, and the variances $\sigma^2$ and $\nu^2$ to be known.
The marginal log-likelihood function of $\mu$ under $H_0$ is
$$\ell(\mu) = \ell_0 - \sum_{i=1}^{n} \frac{(\bar{Y}_i - \mu)^2}{2\nu^2 + 2\sigma^2/q},$$
where $\ell_0$ does not depend on $\mu$ and $\bar{Y}_i = \sum_{j=1}^{q} Y_{i,j}/q$.
Therefore,
$$\max_{\mu} \ell(\mu) = \ell_0 - \sum_{i=1}^{n} \frac{(\bar{Y}_i - \hat{\mu}_{1,n})^2}{2\nu^2 + 2\sigma^2/q},$$
where $\hat{\mu}_{t_1,t_2} = \sum_{i=t_1}^{t_2} \bar{Y}_i/(t_2 - t_1 + 1)$.
In a similar way, the marginal log-likelihood function of $\mu_-$ and $\mu_+$ under $H_a$ can be obtained. Then, the marginal log-likelihood ratio is
$$\sum_{i=1}^{n} \frac{(\bar{Y}_i - \hat{\mu}_{1,n})^2}{\nu^2 + \sigma^2/q} - \sum_{i=1}^{k^*} \frac{(\bar{Y}_i - \hat{\mu}_{1,k^*})^2}{\nu^2 + \sigma^2/q} - \sum_{i=k^*+1}^{n} \frac{(\bar{Y}_i - \hat{\mu}_{k^*+1,n})^2}{\nu^2 + \sigma^2/q},$$
which is equal to
$$\frac{n \left\{ \sum_{i=1}^{k^*} (\bar{Y}_i - \hat{\mu}_{1,n}) \right\}^2}{k^*(n - k^*)(\nu^2 + \sigma^2/q)}.$$
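The equality between the two forms of the log-likelihood ratio follows from a standard sum-of-squares decomposition. As a sketch, write $k = k^*$, $\hat{\mu}_1 = \hat{\mu}_{1,k}$, $\hat{\mu}_2 = \hat{\mu}_{k+1,n}$ and $\hat{\mu} = \hat{\mu}_{1,n}$ (shorthand introduced here only for this derivation), and use $n\hat{\mu} = k\hat{\mu}_1 + (n - k)\hat{\mu}_2$:

```latex
\sum_{i=1}^{n} (\bar{Y}_i - \hat{\mu})^2
  - \sum_{i=1}^{k} (\bar{Y}_i - \hat{\mu}_1)^2
  - \sum_{i=k+1}^{n} (\bar{Y}_i - \hat{\mu}_2)^2
  = k(\hat{\mu}_1 - \hat{\mu})^2 + (n - k)(\hat{\mu}_2 - \hat{\mu})^2
  = \frac{k(n - k)}{n} (\hat{\mu}_1 - \hat{\mu}_2)^2 ,
\qquad
\sum_{i=1}^{k} (\bar{Y}_i - \hat{\mu})
  = k(\hat{\mu}_1 - \hat{\mu})
  = \frac{k(n - k)}{n} (\hat{\mu}_1 - \hat{\mu}_2) .
```

Squaring the second identity and multiplying by $n/\{k(n - k)\}$ recovers the first, and dividing by $\nu^2 + \sigma^2/q$ gives the stated form.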
As the change point $k^*$ could be unknown in practice, we may sum over $k^* = 1, \ldots, n - 1$ and consider the average value, which leads to
$$S_n(\bar{Y}; n/2, 2) = \sum_{k=1}^{n-1} w_k^{-1}(n/2) \left\{ \sum_{i=1}^{k} (\bar{Y}_i - \hat{\mu}_{1,n}) \right\}^2,$$
where $\bar{Y} = (\bar{Y}_1, \ldots, \bar{Y}_n)^\top$.
By Theorem 1 and Remark 3, applied to the weighted version for any $\tau$, as $n \to \infty$,
$$\frac{S_n(\bar{Y}; \tau, 2)}{(n - 1)^{-1} \sum_{i=1}^{n} (\bar{Y}_i - \hat{\mu}_{1,n})^2} \xrightarrow{d} \sum_{k=1}^{\infty} \lambda_k(\tau) Z_k^2.$$

2.5. Poisson Mixed Model

Assume $Y_{i,j}$ follows a Poisson distribution with conditional mean $E(Y_{i,j} \mid U_i) = \exp(\rho_i + U_i)$. Consider testing
$$H_0: \rho_i \equiv \rho \quad \text{vs} \quad H_a: \rho_i = \begin{cases} \rho_-, & \text{for } 1 \le i \le k^*, \\ \rho_+, & \text{for } k^* < i \le n, \end{cases}$$
where $\rho_- \ne \rho_+$, and the parameters $\rho$, $\rho_-$ and $\rho_+$ are unknown. Under a normal distribution for $U_i$, the likelihood ratio contains an integral. Focusing on the simple Poisson mixed model without a change point, Hall et al. [27,28] applied the Gaussian variational approximation (GVA) to avoid evaluating the integral directly. We provide a saddle point approximation here.
The marginal log-likelihood function of $\rho$ under $H_0$ is
$$\ell(\rho) = \ell_1 + \sum_{i=1}^{n} \log I_i(\rho), \tag{16}$$
where $\ell_1$ does not depend on $\rho$ and $I_i(\rho) = \int_{-\infty}^{\infty} \exp\left\{ -q e^{\rho + u} + q \bar{Y}_i (\rho + u) - \frac{u^2}{2\nu^2} \right\} du$.
The calculation of $\ell(\rho)$ is hindered by the lack of a closed form of the integral $I_i(\rho)$. Here, we apply the saddle point approximation to the integral as shown in Lemma 2.
Lemma 2. 
For the integral $I(\rho; a, b, \nu^2) = \int_{-\infty}^{\infty} \exp\left\{ -b e^{u} + a u - \frac{(u - \rho)^2}{2\nu^2} \right\} du$,
$$I(\rho; a, b, \nu^2) \approx \left( \frac{a}{b e} \right)^{a} \sqrt{\frac{2\pi}{a}} \, e^{-\frac{(c - \rho)^2}{2\nu^2}},$$
where the symbol ≈ means asymptotic equivalence and the saddle point $c$ solves $\phi'(u) = 0$ with $\phi(u) = a u - b e^{u}$, i.e., $c = \log(a/b)$.
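To get a feel for the accuracy of Lemma 2, one can compare the approximation with a brute-force numerical integral on the log scale. The sketch below is an illustration under assumed parameter values ($a = 200$, $b = 100$, $\rho = 0.5$, $\nu^2 = 1$); the function names, the grid width, and the grid size are arbitrary choices, not part of the paper.

```python
import numpy as np

def log_integral_numeric(rho, a, b, nu2, half_width=3.0, n_grid=400_001):
    """log of I(rho; a, b, nu^2), computed by a Riemann sum on a fine grid around c."""
    c = np.log(a / b)
    u = np.linspace(c - half_width, c + half_width, n_grid)
    log_f = -b * np.exp(u) + a * u - (u - rho) ** 2 / (2 * nu2)
    m = log_f.max()                      # subtract the maximum to avoid overflow
    du = u[1] - u[0]
    return m + np.log(np.sum(np.exp(log_f - m)) * du)

def log_integral_saddle(rho, a, b, nu2):
    """log of the saddle point approximation in Lemma 2, with c = log(a / b)."""
    c = np.log(a / b)
    return a * (np.log(a / b) - 1.0) + 0.5 * np.log(2 * np.pi / a) - (c - rho) ** 2 / (2 * nu2)

# Assumed example values: a = q * Ybar_i and b = q with q = 100 and Ybar_i = 2.
print(log_integral_numeric(0.5, 200.0, 100.0, 1.0))
print(log_integral_saddle(0.5, 200.0, 100.0, 1.0))
```

Since the approximation is a leading-term expansion in $a$, the two log values should move closer together as $a$ grows.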
In (16), $I_i(\rho) = I(\rho; q\bar{Y}_i, q, \nu^2)$, so Lemma 2 gives the leading term as
$$\ell(\rho) \approx \ell_1 - \sum_{i=1}^{n} \frac{(\log \bar{Y}_i - \rho)^2}{2\nu^2},$$
and the leading term approximation to $\max_{\rho} \ell(\rho)$ is
$$\ell_1 - \sum_{i=1}^{n} \frac{(\log \bar{Y}_i - \hat{\rho}_{1,n})^2}{2\nu^2},$$
where $\hat{\rho}_{t_1,t_2} = \sum_{i=t_1}^{t_2} \log \bar{Y}_i/(t_2 - t_1 + 1)$.
In a similar way, $\max_{\rho_1, \rho_2} \ell(\rho_1, \rho_2)$ under $H_a$ can be approximated, giving the approximate log-likelihood ratio
$$\sum_{i=1}^{n} \frac{(\log \bar{Y}_i - \hat{\rho}_{1,n})^2}{\nu^2} - \sum_{i=1}^{k^*} \frac{(\log \bar{Y}_i - \hat{\rho}_{1,k^*})^2}{\nu^2} - \sum_{i=k^*+1}^{n} \frac{(\log \bar{Y}_i - \hat{\rho}_{k^*+1,n})^2}{\nu^2} = \frac{n \left\{ \sum_{i=1}^{k^*} (\log \bar{Y}_i - \hat{\rho}_{1,n}) \right\}^2}{k^*(n - k^*)\nu^2}. \tag{17}$$
Considering that the change point $k^*$ is unknown, we may sum (17) over $k^* = 1, \ldots, n - 1$ as shown in (1) and consider the average value,
$$S_n(\log \bar{Y}; n/2, 2) = \sum_{k=1}^{n-1} w_k^{-1}(n/2) \left\{ \sum_{i=1}^{k} (\log \bar{Y}_i - \hat{\rho}_{1,n}) \right\}^2.$$
Note that the term $w_k(n/2)$ is derived from the approximate likelihood ratio statistic, different from the classical Poisson change point statistic in Csörgö and Horváth [1] (p. 27).
By Theorem 1 and Remark 3, applied to the weighted version for any $\tau$, as $n \to \infty$ and $q \to \infty$,
$$\frac{S_n(\log \bar{Y}; \tau, 2)}{(n - 1)^{-1} \sum_{i=1}^{n} (\log \bar{Y}_i - \hat{\rho}_{1,n})^2} \xrightarrow{d} \sum_{k=1}^{\infty} \lambda_k(\tau) Z_k^2.$$

2.6. Weak Dependence

Now, we consider a space-time model for the distribution of $Y_{i,j}$, where $i$ indexes time and $j$ indexes space. First, we assume some weak dependence conditions on space by supposing the central limit theorem holds:
$$\frac{1}{\sqrt{q}} \sum_{j=1}^{q} (Y_{i,j} - \mu_i) \xrightarrow{d} N(0, \sigma^2), \tag{19}$$
where $\mu_i = E(Y_{i,j})$ and $\sigma^2 = \lim_{q \to \infty} q \, \mathrm{var}(\bar{Y}_i)$.
Next, we assume some weak dependence conditions on time by supposing that the following invariance principle, or functional central limit theorem, holds for any $t \in (0, 1)$ [29,30]:
$$\frac{1}{\sqrt{nq}} \sum_{i=1}^{[nt]} (Y_i - \bar{Y}) \Rightarrow \tilde{\sigma} \{B(t) - tB(1)\}, \tag{20}$$
where $\bar{Y} = \sum_{i=1}^{n} Y_i/n = \hat{\mu}_{1,n}$ and $\tilde{\sigma}^2 = \lim_{n \to \infty, \, q \to \infty} nq \, \mathrm{var}(\bar{Y})$.
The weak dependence conditions in (19) and (20) are satisfied if the series is m-dependent, mixing, or a linear process. Shao and Zhang [31] proposed a normalized change point statistic
$$M_{n,q}(Y) = \max_{k} \frac{n}{w_k} \left\{ \sum_{i=1}^{k} (Y_i - \bar{Y}) \right\}^2,$$
where $Y = (Y_1, \ldots, Y_n)^\top$ and $w_k = \sum_{i=1}^{k} \left\{ \sum_{j=1}^{i} Y_j - (i/k) \sum_{j=1}^{k} Y_j \right\}^2 + \sum_{i=k+1}^{n} \left\{ \sum_{j=i}^{n} Y_j - \frac{n - i + 1}{n - k} \sum_{j=k+1}^{n} Y_j \right\}^2$ is a random weight.
They showed that
$$\frac{M_{n,q}(Y)}{q} \xrightarrow{d} \max_{0 < t < 1} \frac{\{B(t) - tB(1)\}^2}{D_{1,0,t} + D_{2,t,1}},$$
where $D_{1,0,t} = \int_0^t \{B(s) - (s/t) B(t)\}^2 \, ds$ and $D_{2,t,1} = \int_t^1 \left[ B(1) - B(s) - \frac{1 - s}{1 - t} \{B(1) - B(t)\} \right]^2 ds$.
Similarly, with the same $w_k$ as above, we propose a randomized version of WC:
$$S_{n,q}(Y) = \sum_{k=1}^{n-1} \frac{1}{w_k} \left\{ \sum_{i=1}^{k} (Y_i - \bar{Y}) \right\}^2.$$
By the functional central limit theorem, when $n \to \infty$,
$$\frac{S_{n,q}(Y)}{q} \xrightarrow{d} \int_0^1 \frac{\{B(t) - tB(1)\}^2}{D_{1,0,t} + D_{2,t,1}} \, dt.$$

3. Power and Change Point Estimation

For the WC statistic $S_n(\bar{Y}; \tau, 2)$ in (14), we now consider the power of the change point test based on
$$\frac{S_n(\bar{Y}; \tau, 2)}{(n - 1)^{-1} \sum_{i=1}^{n} (\bar{Y}_i - \hat{\mu}_{1,n})^2}, \tag{23}$$
under the alternative hypothesis in Section 2.4. We assume the weak dependence conditions of Section 2.6. We note that (23) has the same asymptotic null distribution as (4) in Theorem 1. The asymptotic distribution is shown in Theorem 2. To establish the consistency of the test, we make a further assumption that the change point index $k^*$ is bounded away from the endpoints.
Theorem 3. 
Assume $E(Y_{i,j}) = \mu_i = \mu_-$ if $i \le k^*$, and $\mu_+$ otherwise. Under the alternative hypothesis, the change magnitude is $\Delta = \mu_+ - \mu_- \ne 0$. Under a weak dependence satisfying (19) and (20), if $0 < \tau_1 \le k^*/n \le \tau_2 < 1$, where $\tau_1$ and $\tau_2$ are two constants, $nq\Delta^2 \to \infty$, and $n^{3/2} q^{1/2} |\Delta| \to \infty$, then $q S_n(Y; \tau, 2) \xrightarrow{p} \infty$.
The proof of Theorem 3 is in Appendix E. As expected, the power of the test based on (23) increases with n, q, and the size of the change in the mean.
The estimated change point is
$$\hat{k}(\tau) = \arg\max_{1 \le k < n} \, w_k^{-1/2}(\tau) \left| \sum_{i=1}^{k} (Y_i - \bar{Y}) \right|. \tag{24}$$
We refer the reader to Bai [32,33] for some early works on the asymptotic distribution of k ^ ( n / 2 ) and [34] for a treatment on the convergence rate of k ^ ( n / 2 ) .
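A minimal sketch of the change point estimator in (24) is given below. It assumes that the maximized quantity is the absolute weighted partial sum, so that upward and downward shifts are treated alike; `estimate_change_point` is a hypothetical name and the step sequence is a noise-free toy example.

```python
import numpy as np

def estimate_change_point(y, tau):
    """Return the k in 1..n-1 maximizing w_k(tau)^(-1/2) * |sum_{i<=k} (y_i - mean(y))|."""
    y = np.asarray(y, dtype=float)
    n = y.size
    k = np.arange(1, n)
    w = -(k - tau) ** 2 + max(tau ** 2, (n - tau) ** 2)   # quadratic weights w_k(tau)
    cusum = np.cumsum(y - y.mean())[:-1]                  # centered partial sums, k = 1..n-1
    return int(k[np.argmax(np.abs(cusum) / np.sqrt(w))])

# Toy example: the mean jumps from 0 to 1 after observation 30 of 60.
y = np.r_[np.zeros(30), np.ones(30)]
for tau in (0, 30, 60):
    print(tau, estimate_change_point(y, tau))
```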

4. Simulations

The main purpose of this simulation is to assess the effect of different values of $w_k(\tau)$, $n$, $q$, and the change magnitude on the power of our test in (23) and on that of the graph-based tests [25,26,35]. Both approaches can handle high-dimensional data, and the distance used in the graph can be changed to test for changes in different parameters, which allows a fair comparison. For example, if we are not sure whether the mean or variance changes, the Euclidean distance can be used to measure the distance between any two nodes in the graph:
$$d_{i_1, i_2} = \left\{ \sum_{j=1}^{q} (Y_{i_1, j} - Y_{i_2, j})^2 \right\}^{1/2};$$
see Chen and Zhang [35], and Shi, Wu and Rao [25]. Another pseudo-distance can be used,
$$d^*_{i_1, i_2} = \left| Y_{i_1, \cdot} - Y_{i_2, \cdot} \right|,$$
if only the change in the mean needs to be detected; see Shi, Wu and Rao [26]. We denote the maximal test of Chen and Zhang based on Euclidean distance by MST and based on the pseudo-distance by MST*. The associated algorithm is in the R package gSeg [36]. Similarly, we denote Shi, Wu, and Rao’s test (Shi, Wu and Rao [25,26]) based on Euclidean distance by SHP and based on the pseudo-distance by SHP*, and the associated R package can be accessed from [37].
First, we simulate $\{Y_{i,j}, 1 \le i \le k^*, 1 \le j \le q\}$ as independent standard normal random variables and $\{Y_{i,j}, k^* + 1 \le i \le n, 1 \le j \le q\}$ as independent normal random variables with mean $\Delta$ and variance 1. The critical values for $\alpha = 0.05$ are given in Table 1 with $p = 1 - \alpha$. We use these critical values and generate 200 simulations with sample sizes $n = 40, 80$, dimensions $q = 50, 100$, change point locations $k^* = n/4, n/2, 3n/4$, and change magnitudes $\Delta = 0.1, 0.2$.
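The simulation design just described can be sketched in a few lines. The code below is an illustration, not the authors' implementation: it replaces the tabulated critical values with a Monte Carlo quantile of $\sum_k \lambda_k(\tau) Z_k^2$, and the function names, seeds, and number of null draws are assumptions.

```python
import numpy as np

def weights(n, tau):
    """Quadratic weights w_k(tau), k = 1..n-1."""
    k = np.arange(1, n)
    return -(k - tau) ** 2 + max(tau ** 2, (n - tau) ** 2)

def normalized_wc(ybar, tau):
    """S_n(Ybar; tau, 2) divided by the sample variance of the row means, as in (23)."""
    n = ybar.size
    cusum = np.cumsum(ybar - ybar.mean())[:-1]
    return np.sum(cusum ** 2 / weights(n, tau)) / ybar.var(ddof=1)

def null_critical_value(n, tau, alpha=0.05, n_sim=20_000, seed=1):
    """Monte Carlo (1 - alpha) quantile of sum_k lambda_k(tau) Z_k^2 (assumed stand-in for Table 1)."""
    k = np.arange(1, n)
    lam = 1.0 / (k * (k + 1)) if tau == n / 2 else 1.0 / (2 * k * (2 * k + 1))
    rng = np.random.default_rng(seed)
    return np.quantile(rng.standard_normal((n_sim, n - 1)) ** 2 @ lam, 1 - alpha)

def rejection_rate(n, q, k_star, delta, tau, n_rep=200, seed=0):
    """Fraction of n_rep simulated data sets in which the test rejects at level 0.05."""
    rng = np.random.default_rng(seed)
    crit = null_critical_value(n, tau)
    rejections = 0
    for _ in range(n_rep):
        y = rng.standard_normal((n, q))
        y[k_star:] += delta              # rows k*+1, ..., n get mean delta
        rejections += int(normalized_wc(y.mean(axis=1), tau) > crit)
    return rejections / n_rep

n, q = 40, 50
for tau in (0, n / 2, n):
    print(tau, rejection_rate(n, q, k_star=n // 4, delta=0.1, tau=tau))
```

Setting `delta=0.0` gives a rough check of the empirical size of the test under the same settings.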
In Table 2, we show the percentage of rejections of the null hypothesis at level 0.05 for each of the change point tests. The graph-based methods MST* and SHP*, which use the pseudo-distance for detecting changes in the mean, have higher power than MST and SHP. Interestingly, the power of the graph-based methods is still not as high as that of (23). We have not seen this comparison elsewhere in the literature, and it suggests that there is room for improvement in graph-based change point detection.
Now we look at the effect of the weights on the power. The weight $w_k(n/2)$ yields the highest power when the change point is in the middle; the weight $w_k(n)$ yields the highest power when the change point is near the beginning of the sequence; and, conversely, the weight $w_k(0)$ yields the highest power when the change point is near the end of the sequence. Moreover, the power increases with increasing $n$, $q$, and $\Delta$, which agrees with Theorem 3.
Now, we introduce a mixture distribution and slightly change the way the random variables are generated. We simulate $\{Y_{i,j}, k^* + 1 \le i \le n, 1 \le j \le q\}$ from a mixture of two normal distributions with mixture weights (0.5, 0.5) or (0.8, 0.2), means (0, 0.2) or (0, 1), and variances always (1, 1), which corresponds to $\Delta = 0.1$ or $\Delta = 0.2$. We keep the other settings from the previous comparison. As expected, the difference between Table 2 and Table 3 is very small.

5. Data Analysis

Here, we analyze the video data provided by Dr. Mathieu Lihoreau, which are available from [26]. In Lihoreau, Chittka and Raine [38], the authors used artificial pollen to attract bees and an automatic monitoring camera to capture the bee’s flight path. However, this automatic monitoring feature does not start recording exactly when the bee enters or stop exactly when the bee leaves; in this video, the recording starts before the bee enters and does not stop when the bee leaves. Since we only care about the part of the video with bees, detecting the arrival and departure of bees helps us to automatically cut the original video. Although the video contains the interference of ants, the bees are much larger than the ants, so it can be assumed that the arrival and departure of the bees cause a change in the mean of the pixel values of the image.
This video has a length of 49 seconds, a frame width of 352, a frame height of 288, and a frame rate of 29.97 frames per second. Shi, Wu and Rao [26] extracted the video into $n = 49$ images at a rate of one frame per second. From these 49 images, we find that the image positions corresponding to the bee entering and leaving are 4 and 40, respectively. Moreover, we can extract more images from this video at a rate of 2 or 5 frames per second, so that the number of images, $n$, increases to 98 or 245, and the positions of the images corresponding to the entry and exit of the bees change with $n$. If we call the image locations where the bee appears and leaves change points, $k^*$, we assume that $k^*/n$ is constant with respect to $n$ and close to 0 or 1, respectively. In Figure 3, the first row is four images located at 4 (change point), 5, 40 (change point), and 41 from the extracted 49 images; the second row is four images located at 7 (change point), 8, 79 (change point), and 80 from the extracted 98 images; and the third row is four images located at 19 (change point), 20, 198 (change point), and 199 from the extracted 245 images. Since the images contain R, G, and B components, we use a weighted average of the R, G, and B components and same-scale transformations on the weighted average as suggested by Shi, Wu and Rao [26].
Our quadratic weight test statistics are able to detect these two change points. We compared them to the graph-based change point estimates by applying the methods SHP* and MST* once to the whole sequence. As shown in Table 4, all tests are significant at level 0.05, except that the quadratic weight $w_k(0)$ returns a p-value of 0.067 for the sample size 49. The weights $w_k(0)$ and $w_k(n)$ give the estimates of the second and first change points, respectively; $w_k(n/2)$ gives the same estimates of change points as $w_k(n)$, and, like SHP* and MST*, neither can estimate the second change point. Thus, we recommend the two weights $w_k(0)$ and $w_k(n)$ for detecting the departure and arrival of the bee.

6. Discussion

This paper mainly focuses on single change point detection. However, it is possible to extend our method and apply the WC statistic to the detection of multiple change points. An approach recommended in the literature is to select data intervals where there is evidence for a single change point. Some researchers suggested penalty procedures based on either the adaptive lasso [39] or smoothly clipped absolute deviation [40,41,42]; others applied CUSUM statistics [4,43,44,45]. As long as the aforementioned intervals have been chosen, one could use tests based on WC. If the tests are rejected for some of the intervals, then the change point can be estimated by (24).
It would also be of interest, although challenging, to consider other quadratic weights, such as $w_k(n/4)$ and $w_k(3n/4)$, as these statistics may be more powerful to detect some change points that are close to the third-quarter and quarter positions of the sequence. The eigenvalues of these quadratic terms may not have recursive formulas.

Author Contributions

X.S., X.-S.W. and N.R. designed research; X.S., X.-S.W. and N.R. performed research; X.S. analyzed data; X.S., X.-S.W. and N.R. wrote the paper. All authors have read and agreed to the published version of the manuscript.

Funding

Shi’s work was supported by NSERC Discovery Grant RGPIN 2022-03264, the Interior Universities Research Coalition and the BC Ministry of Health, and the University of British Columbia Okanagan (UBC-O) Vice Principal Research in collaboration with UBC-O Irving K. Barber Faculty of Science.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We thank two anonymous reviewers for helpful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Theorem 1.
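The exact distribution of $S_n(Y;\tau,2)$ is determined by the eigenvalues of $Q$. Define the $n\times(n-1)$ matrix $B=(B_1,\dots,B_{n-1})$ with $B_k = p_k^{-1/2}(0,\dots,0,1,-1,0,\dots,0)^{\top}$, such that all entries of $B_k$ are zero except the $k$th entry $p_k^{-1/2}$ and the $(k+1)$th entry $-p_k^{-1/2}$. It is readily seen that $A^{\top}B = nI_{n-1}$ and thus $B^{\top}QB = I_{n-1}$. Note that
$$B^{\top}B = P^{-1/2}TP^{-1/2} = \begin{pmatrix} p_1^{-1/2} & & \\ & \ddots & \\ & & p_{n-1}^{-1/2} \end{pmatrix} \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix} \begin{pmatrix} p_1^{-1/2} & & \\ & \ddots & \\ & & p_{n-1}^{-1/2} \end{pmatrix},$$
where $P$ is a diagonal matrix with $P_{kk} = p_k$ and $T$ is a tridiagonal matrix with $T_{kk} = 2$ and $T_{k,k+1} = T_{k+1,k} = -1$.
We shall find the relationship between $Q$ and $B^{\top}B$. We diagonalize $B^{\top}B = R\Gamma R^{\top}$ with $R^{\top}R = I_{n-1}$ and $\Gamma$ a diagonal matrix with $\Gamma_{kk} = \gamma_k$. Set $C = BR\Gamma^{-1/2}$. We have $C^{\top}C = I_{n-1}$ and $C^{\top}QC = \Gamma^{-1}$. Finally, we introduce $u = (n^{-1/2},\dots,n^{-1/2})^{\top}$ such that $Qu = 0$, $C^{\top}u = 0$ and $u^{\top}u = 1$. Define $U = (C, u)$. It then follows that $U^{\top}U = I_n$ and
$$U^{\top}QU = \begin{pmatrix} C^{\top}QC & C^{\top}Qu \\ u^{\top}QC & u^{\top}Qu \end{pmatrix} = \begin{pmatrix} \Gamma^{-1} & 0 \\ 0 & 0 \end{pmatrix}.$$
This implies that the nonzero eigenvalues of $Q$ are the reciprocals of those of $B^{\top}B$.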
Let $(v_1,\dots,v_{n-1})^{\top}$ be the eigenvector corresponding to an eigenvalue $\lambda$ of $B^{\top}B$. We have the recurrence identity
$$-p_{k-1}^{-1/2}v_{k-1} + 2p_k^{-1/2}v_k - p_{k+1}^{-1/2}v_{k+1} = \lambda p_k^{1/2}v_k, \tag{A1}$$
where $v_0 = v_n = 0$, $p_0 = 1$, and $k = 1,\dots,n-1$. The above recurrence relation appears in Gardner [6] (1.7) as an eigenvalue equation for a forward difference operator. As mentioned in Gardner [6], it is difficult to find an explicit formula for the eigenvalues unless the prior distribution is uniform; i.e., $p_k$ is independent of $k$. To overcome this difficulty, we make use of the above recurrence relation and then apply the classical theory of orthogonal polynomials and special functions. To be more specific, we shall link the eigenvector in (A1) to the dual Hahn polynomial by making some transformations.
Let $v_k = (p_kp_{k-1})^{1/2}\cdots(p_1p_0)^{1/2}f_k$. The above recurrence relation becomes
$$-\frac{f_{k-1}}{p_kp_{k-1}} + \frac{2f_k}{p_k} - f_{k+1} = \lambda f_k.$$
We further denote $g_k = (-1)^kf_k$ and $\pi_k = g_{k+1}$. It is readily seen that
$$\frac{g_{k-1}}{p_kp_{k-1}} + \frac{2g_k}{p_k} + g_{k+1} = \lambda g_k,$$
and (by shifting the index $k$)
$$\frac{\pi_{k-1}}{p_kp_{k+1}} + \frac{2\pi_k}{p_{k+1}} + \pi_{k+1} = \lambda\pi_k,$$
where $\pi_{-1} = \pi_{n-1} = 0$ and $\pi_0 = 1$. By induction, $\pi_k$ is a monic polynomial of degree $k$ in $\lambda$.
Now, we consider the three quadratic weights in (2).
Case I. If $p_k = 1/\{k(n-k)\}$ for $k = 1,\dots,n-1$, then $\pi_k$ is related to the dual Hahn polynomial
$$\pi_k = \lim_{N\to n-2}(2)_k(-N)_kR_k(\lambda-2;1,1,N) = \lim_{N\to n-2}(2)_k(-N)_k\sum_{j=0}^{k}\frac{(-k)_j(-x)_j(x+3)_j}{(1)_j(2)_j(-N)_j},$$
where $(a)_j := \prod_{i=0}^{j-1}(a+i)$ is the Pochhammer symbol, which is commonly used in the field of orthogonal polynomials and special functions, $R_k$ is the dual Hahn polynomial of degree $k$, and $\lambda - 2 = x(x+3)$. In particular,
$$\pi_{n-1} = (-1)^{n-1}(-x)_{n-1}(x+3)_{n-1} = \prod_{j=0}^{n-2}(x-j)(x+3+j) = \prod_{j=0}^{n-2}[\lambda - (j+1)(j+2)].$$
This implies that the eigenvalues of $B^{\top}B$ are $(j+1)(j+2)$ for $j = 0,\dots,n-2$. Consequently, the eigenvalues of $Q$ are 0 and $1/\{k(k+1)\}$ for $k = 1,\dots,n-1$.
Case II. If $p_k = 1/\{k(2n-k)\}$ for $k = 1,\dots,n-1$, then $\pi_k$ is related to the dual Hahn polynomial as follows:
$$\pi_k = (2)_k(2-2n)_kR_k(\lambda-2;1,1,2n-2) = (2)_k(2-2n)_k\sum_{j=0}^{k}\frac{(-k)_j(-x)_j(x+3)_j}{(1)_j(2)_j(2-2n)_j},$$
where $R_k$ is the dual Hahn polynomial of degree $k$ and $\lambda - 2 = x(x+3)$. In particular,
$$\pi_{n-1} = (2)_{n-1}(2-2n)_{n-1}\sum_{j=0}^{n-1}\frac{(1-n)_j(-x)_j(x+3)_j}{(1)_j(2)_j(2-2n)_j}.$$
By Watson's sum, $\pi_{n-1} = 0$ when $x = 2k-1$ or $x = -2-2k$ with $k = 1,\dots,n-1$. This implies that the eigenvalues of $B^{\top}B$ are $(2k)(2k+1)$ for $k = 1,\dots,n-1$. Consequently, the eigenvalues of $Q$ are 0 and $1/\{2k(2k+1)\}$ with $k = 1,\dots,n-1$.
Case III. If $p_k = 1/\{(n+k)(n-k)\}$ for $k = 1,\dots,n-1$, then the eigenvalues of $Q$ are also 0 and $1/[2k(2k+1)]$ with $k = 1,\dots,n-1$, because the sequence $\{p_1,\dots,p_{n-1}\}$ is just the reverse of that in Case II.
Since $S_n(Y;\tau,2)$ is a quadratic form in normal random variables, the results follow. □
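The eigenvalue claims in the three cases can be checked for small $n$ with exact integer arithmetic. The sketch below evaluates $\pi_{n-1}(\lambda)$ through the three-term recurrence for $\pi_k$, written as $\pi_{k+1} = (\lambda - 2/p_{k+1})\pi_k - \pi_{k-1}/(p_kp_{k+1})$, and confirms that it vanishes at the stated eigenvalues of $B^{\top}B$. The helper names `pi_last` and `roots_vanish` are hypothetical and serve only as an illustration.

```python
from typing import Callable, Iterable


def pi_last(n: int, inv_p: Callable[[int], int], lam: int) -> int:
    """Evaluate pi_{n-1}(lam), where inv_p(k) returns the integer 1/p_k."""
    prev, cur = 0, 1                          # pi_{-1} = 0 and pi_0 = 1
    for k in range(n - 1):
        w_next = inv_p(k + 1)                 # 1 / p_{k+1}
        w_here = inv_p(k) if k >= 1 else 1    # 1 / p_k, with p_0 = 1
        prev, cur = cur, (lam - 2 * w_next) * cur - w_here * w_next * prev
    return cur


def roots_vanish(n: int, inv_p: Callable[[int], int], roots: Iterable[int]) -> bool:
    """True if pi_{n-1} is exactly zero at every candidate eigenvalue."""
    return all(pi_last(n, inv_p, r) == 0 for r in roots)


n = 7
case_1 = roots_vanish(n, lambda k: k * (n - k), [(j + 1) * (j + 2) for j in range(n - 1)])
case_2 = roots_vanish(n, lambda k: k * (2 * n - k), [2 * k * (2 * k + 1) for k in range(1, n)])
case_3 = roots_vanish(n, lambda k: (n + k) * (n - k), [2 * k * (2 * k + 1) for k in range(1, n)])
```

Since $\pi_{n-1}$ is monic of degree $n-1$, vanishing at $n-1$ distinct values identifies all of its roots.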

Appendix B

Proof of Theorem 2.
Case I. For the weight $2t - t^2$, we define $X_t = \{B(t) - tB(1)\}/\sqrt{2t - t^2}$. By the Karhunen–Loève expansion,
$$X_t = \sum_{k=1}^{\infty}Z_ke_k(t),$$
where the random variables $Z_k$ are stochastically independent normal and the $e_k(\cdot)$ form an orthonormal basis. Then, the integral of the square of $X_t$ becomes $\sum_{k=1}^{\infty}Z_k^2$, and we need the variance of $Z_k$. We consider the covariance of $X_t$, called the Mercer kernel:
$$K_X(t,s) = E(X_tX_s) = \frac{\min(t,s) - ts}{\sqrt{2t - t^2}\sqrt{2s - s^2}}.$$
By Mercer's theorem, there exists a set $\{\lambda_k, e_k(t)\}$ such that
$$K_X(t,s) = \sum_{k=1}^{\infty}\lambda_ke_k(t)e_k(s),$$
where $\lambda_k$ are eigenvalues and $e_k(t)$ are eigenfunctions satisfying the Fredholm integral equation $\int_0^1K_X(t,s)e_k(s)\,ds = \lambda_ke_k(t)$.
Thus, we have the eigenvalue problem
$$\int_0^1\frac{\min(t,s) - ts}{\sqrt{2t - t^2}\sqrt{2s - s^2}}\,e(s)\,ds = \lambda e(t).$$
Denote $e(t) = \sqrt{2t - t^2}\,f(t)$. After multiplying both sides by $\sqrt{2t - t^2}$, the above eigenvalue problem becomes
$$\int_0^1[\min(t,s) - ts]f(s)\,ds = \lambda(2t - t^2)f(t).$$
Let $f(t) = \sum_{k=0}^{\infty}f_kt^k$. It follows that
$$\sum_{k=0}^{\infty}f_k\int_0^1[\min(t,s) - ts]s^k\,ds = \lambda\sum_{k=0}^{\infty}f_k(2t - t^2)t^k.$$
Note that
$$\int_0^1[\min(t,s) - ts]s^k\,ds = \frac{t - t^{k+2}}{(k+1)(k+2)}.$$
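For completeness, the integral identity just stated can be verified by splitting the range of integration at $s = t$; the following short derivation is added as a worked step and uses only elementary integrals.

```latex
\begin{aligned}
\int_0^1 [\min(t,s) - ts]\, s^k \, ds
  &= \int_0^t s^{k+1}\, ds + t\int_t^1 s^k\, ds - t\int_0^1 s^{k+1}\, ds \\
  &= \frac{t^{k+2}}{k+2} + \frac{t - t^{k+2}}{k+1} - \frac{t}{k+2} \\
  &= (t - t^{k+2})\left(\frac{1}{k+1} - \frac{1}{k+2}\right)
   = \frac{t - t^{k+2}}{(k+1)(k+2)} .
\end{aligned}
```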
We then have
$$\sum_{k=0}^{\infty}\frac{f_kt}{(k+1)(k+2)} - \sum_{k=0}^{\infty}\frac{f_kt^{k+2}}{(k+1)(k+2)} = \lambda\sum_{k=0}^{\infty}2f_kt^{k+1} - \lambda\sum_{k=0}^{\infty}f_kt^{k+2}.$$
Obviously, $\lambda \neq 0$; otherwise $f_k = 0$ for all $k \ge 0$. Now, we compare the coefficients of $t^k$ with $k \ge 0$ on both sides of the above identity. It follows that
$$2\lambda f_0 = \sum_{k=0}^{\infty}\frac{f_k}{(k+1)(k+2)},$$
and
$$-\frac{f_k}{(k+1)(k+2)} = \lambda(2f_{k+1} - f_k).$$
We shall prove that $\lambda_n = 1/[(2n)(2n+1)]$, $n = 1, 2, \dots$, are eigenvalues of $K_X(t,s)$. To see this, we obtain from the above recurrence relation
$$\frac{f_{k+1}}{f_k} = \frac{(k+1)(k+2) - (2n)(2n+1)}{2(k+1)(k+2)} = \frac{(k+1-2n)(k+2n+2)}{2(k+1)(k+2)}.$$
Making use of the Pochhammer symbol $(a)_j := \prod_{i=0}^{j-1}(a+i)$, we obtain
$$\frac{f_k}{f_0} = \frac{(1-2n)_k(2n+2)_k}{2^k(1)_k(2)_k}.$$
Consequently,
$$2\lambda f_0 = \sum_{k=0}^{\infty}\frac{f_k}{(k+1)(k+2)} = f_0\sum_{k=0}^{\infty}\frac{(1-2n)_k(2n+2)_k}{2^k(1)_{k+1}(2)_{k+1}} = -\frac{2f_0}{(2n)(2n+1)}\sum_{j=1}^{\infty}\frac{(-2n)_j(2n+1)_j}{2^j(1)_j(2)_j}.$$
Since
$$\sum_{j=0}^{\infty}\frac{(-2n)_j(2n+1)_j}{2^j(1)_j(2)_j} = 0,$$
the sum over $j \ge 1$ equals $-1$, and we then have
$$2\lambda f_0 = -\frac{2f_0}{(2n)(2n+1)}\cdot(-1) = \frac{2f_0}{(2n)(2n+1)},$$
which agrees with $\lambda = 1/[(2n)(2n+1)]$. By normalizing $f_0 = 1$, we can express the eigenfunction as
$$f(t) = \sum_{k=0}^{\infty}\frac{(1-2n)_k(2n+2)_k}{2^k(1)_k(2)_k}t^k = \frac{1}{2n}P_{2n-1}^{(1,1)}(1-t),$$
where
$$P_n^{(\alpha,\beta)}(z) = \frac{(\alpha+1)_n}{n!}\sum_{k=0}^{n}\frac{(-n)_k(n+\alpha+\beta+1)_k}{(1)_k(\alpha+1)_k}\left(\frac{1-z}{2}\right)^k$$
is the Jacobi polynomial.
So, we have
$$\int_0^1\frac{\{B(t) - tB(1)\}^2}{2t - t^2}\,dt = \sum_{k=1}^{\infty}\frac{1}{2k(2k+1)}Z_k^2,$$
where $Z_k$ are independent normal random variables, each having mean zero and variance 1. That proves (5).
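The two algebraic facts used in Case I can be confirmed with exact rational arithmetic for small $n$: the terminating sum $\sum_{j}(-2n)_j(2n+1)_j/\{2^j(1)_j(2)_j\}$ vanishes, and the coefficients generated by the recurrence satisfy $2\lambda f_0 = \sum_k f_k/\{(k+1)(k+2)\}$ with $\lambda = 1/[(2n)(2n+1)]$. The sketch below is only an illustrative check; the function names are hypothetical.

```python
from fractions import Fraction
from typing import List


def pochhammer(a: int, j: int) -> int:
    """Rising factorial (a)_j = a (a + 1) ... (a + j - 1)."""
    out = 1
    for i in range(j):
        out *= a + i
    return out


def terminating_sum(n: int) -> Fraction:
    """Sum over j = 0..2n of (-2n)_j (2n+1)_j / (2^j (1)_j (2)_j)."""
    return sum(
        Fraction(pochhammer(-2 * n, j) * pochhammer(2 * n + 1, j),
                 2 ** j * pochhammer(1, j) * pochhammer(2, j))
        for j in range(2 * n + 1)
    )


def case1_coefficients(n: int) -> List[Fraction]:
    """f_k / f_0 for k = 0..2n-1 from f_{k+1}/f_k = (k+1-2n)(k+2n+2) / (2(k+1)(k+2))."""
    f = [Fraction(1)]
    for k in range(2 * n - 1):
        f.append(f[-1] * Fraction((k + 1 - 2 * n) * (k + 2 * n + 2), 2 * (k + 1) * (k + 2)))
    return f


def case1_identity_holds(n: int) -> bool:
    """Check 2 * lambda * f_0 == sum_k f_k / ((k+1)(k+2)) with lambda = 1/((2n)(2n+1))."""
    lam = Fraction(1, 2 * n * (2 * n + 1))
    f = case1_coefficients(n)
    return 2 * lam * f[0] == sum(fk / ((k + 1) * (k + 2)) for k, fk in enumerate(f))
```

The series for $f$ terminates because the factor $(k+1-2n)$ vanishes at $k = 2n-1$, so only $2n$ coefficients are needed.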
Case II. For the weight $1 - t^2$, we intend to solve the eigenvalue problem
$$\int_0^1\frac{\min(t,s) - ts}{\sqrt{1 - t^2}\sqrt{1 - s^2}}\,e(s)\,ds = \lambda e(t).$$
Denote $e(t) = \sqrt{1 - t^2}\,f(t)$. After multiplying both sides by $\sqrt{1 - t^2}$, the above eigenvalue problem becomes
$$\int_0^1[\min(t,s) - ts]f(s)\,ds = \lambda(1 - t^2)f(t).$$
Let $f(t) = \sum_{k=0}^{\infty}f_kt^k$. It follows that
$$\sum_{k=0}^{\infty}f_k\int_0^1[\min(t,s) - ts]s^k\,ds = \lambda\sum_{k=0}^{\infty}f_k(1 - t^2)t^k.$$
Note that
$$\int_0^1[\min(t,s) - ts]s^k\,ds = \frac{t - t^{k+2}}{(k+1)(k+2)}.$$
We then have
$$\sum_{k=0}^{\infty}\frac{f_kt}{(k+1)(k+2)} - \sum_{k=0}^{\infty}\frac{f_kt^{k+2}}{(k+1)(k+2)} = \lambda\sum_{k=0}^{\infty}f_kt^k - \lambda\sum_{k=0}^{\infty}f_kt^{k+2}.$$
Obviously, $\lambda \neq 0$; otherwise $f_k = 0$ for all $k \ge 0$. Now, we compare the coefficients of $t^k$ with $k \ge 0$ on both sides of the above identity. It follows that $f_0 = 0$, and
$$\lambda f_1 = \sum_{k=0}^{\infty}\frac{f_k}{(k+1)(k+2)},$$
and
$$-\frac{f_k}{(k+1)(k+2)} = \lambda(f_{k+2} - f_k).$$
On account of $f_0 = 0$, the above recurrence relation implies that $f_{2j} = 0$ for all $j \ge 0$. Now, we set $g_j = f_{2j+1}$ with $j \ge 0$. The above recurrence relation (with $k = 2j+1$) becomes
$$-\frac{g_j}{(2j+2)(2j+3)} = \lambda(g_{j+1} - g_j).$$
For convenience, we let $1/\lambda = 2\mu(2\mu+1)$. It is readily seen that
$$\frac{g_{j+1}}{g_j} = \frac{(2j+2)(2j+3) - (2\mu)(2\mu+1)}{(2j+2)(2j+3)} = \frac{(j+1-\mu)(j+3/2+\mu)}{(j+1)(j+3/2)}.$$
Making use of the Pochhammer symbol $(a)_j := \prod_{i=0}^{j-1}(a+i)$, we obtain
$$\frac{f_{2j+1}}{f_1} = \frac{g_j}{g_0} = \frac{(1-\mu)_j(3/2+\mu)_j}{(1)_j(3/2)_j}.$$
Consequently,
$$\sum_{k=0}^{\infty}\frac{f_k}{(k+1)(k+2)} = \sum_{j=0}^{\infty}\frac{f_{2j+1}}{(2j+2)(2j+3)} = \frac{f_1}{6}\sum_{j=0}^{\infty}\frac{(1-\mu)_j(3/2+\mu)_j}{(2)_j(5/2)_j}.$$
The left-hand side is $\lambda f_1$. When $\mu = n+1$ with $n = 0, 1, \dots$, by the Pfaff–Saalschütz identity, we calculate the right-hand side as
$$\frac{f_1}{6}\sum_{j=0}^{\infty}\frac{(1-\mu)_j(3/2+\mu)_j(1)_j}{(2)_j(5/2)_j(1)_j} = \frac{f_1}{6}\cdot\frac{(1)_n(-1/2-n)_n}{(2)_n(-3/2-n)_n} = \frac{f_1}{(2n+2)(2n+3)}.$$
Hence, $\lambda_n = \frac{1}{(2n+2)(2n+3)}$, $n = 0, 1, \dots$, are the eigenvalues. By normalizing $f_1 = 1$, we can express the eigenfunction as
$$f(t) = \sum_{j=0}^{\infty}f_{2j+1}t^{2j+1} = \sum_{j=0}^{n}\frac{(-n)_j(5/2+n)_j}{(1)_j(3/2)_j}t^{2j+1} = t\,\frac{n!}{(3/2)_n}P_n^{(1/2,1)}(1 - 2t^2).$$
So, we have
$$\int_0^1\frac{\{B(t) - tB(1)\}^2}{1 - t^2}\,dt = \sum_{k=1}^{\infty}\frac{1}{2k(2k+1)}Z_k^2,$$
where $Z_k$ are independent normal random variables, each having mean zero and variance 1. This gives (6). □
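As a sanity check on Case II, the sketch below verifies exactly that the coefficients $g_j$ with $\mu = n+1$ satisfy $\lambda g_0 = \sum_j g_j/\{(2j+2)(2j+3)\}$ for $\lambda = 1/\{(2n+2)(2n+3)\}$. It also compares the partial sums of $\sum_k 1/\{2k(2k+1)\}$ with $1 - \log 2$, which is the value of $\int_0^1 K_X(t,t)\,dt$ for both weights, since $(t - t^2)/(2t - t^2) = 1 - 1/(2-t)$ and $(t - t^2)/(1 - t^2) = 1 - 1/(1+t)$. The trace comparison assumes that the usual trace formula for Mercer kernels applies here; the function names are hypothetical.

```python
import math
from fractions import Fraction


def case2_identity_holds(n: int) -> bool:
    """Check lambda * g_0 == sum_j g_j / ((2j+2)(2j+3)) with lambda = 1/((2n+2)(2n+3))."""
    lam = Fraction(1, (2 * n + 2) * (2 * n + 3))
    g = [Fraction(1)]
    for j in range(n):
        # g_{j+1}/g_j = (j + 1 - mu)(j + 3/2 + mu) / ((j + 1)(j + 3/2)) with mu = n + 1
        g.append(g[-1] * Fraction((j - n) * (2 * j + 2 * n + 5), (j + 1) * (2 * j + 3)))
    total = sum(gj / ((2 * j + 2) * (2 * j + 3)) for j, gj in enumerate(g))
    return total == lam * g[0]


def partial_trace(terms: int) -> float:
    """Partial sum of the eigenvalues 1 / (2k(2k+1)) for k = 1..terms."""
    return sum(1.0 / (2 * k * (2 * k + 1)) for k in range(1, terms + 1))


trace_gap = abs(partial_trace(100_000) - (1.0 - math.log(2.0)))
```

The remaining gap after $K$ terms is bounded by $1/(4K)$, so `trace_gap` is small but not zero.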

Appendix C

Proof of Lemma 1.
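It is well known that $(1-t)B\left(\frac{t}{1-t}\right) \stackrel{d}{=} B(t) - tB(1)$. We first find the covariance $\operatorname{Cov}\left\{(1-s)^2B^2\left(\frac{s}{1-s}\right), (1-t)^2B^2\left(\frac{t}{1-t}\right)\right\}$ for $0 < s < t < 1$. Note that $B\left(\frac{s}{1-s}\right) \sim N\left(0, \frac{s}{1-s}\right)$.
Since $B^2\left(\frac{t}{1-t}\right) - B^2\left(\frac{s}{1-s}\right) = \left\{B\left(\frac{t}{1-t}\right) - B\left(\frac{s}{1-s}\right)\right\}^2 + 2\left\{B\left(\frac{t}{1-t}\right) - B\left(\frac{s}{1-s}\right)\right\}B\left(\frac{s}{1-s}\right)$, we have $E\left\{B^2\left(\frac{t}{1-t}\right) - B^2\left(\frac{s}{1-s}\right)\right\} = E\left\{B\left(\frac{t}{1-t}\right) - B\left(\frac{s}{1-s}\right)\right\}^2$ and $E\left[B^2\left(\frac{s}{1-s}\right)\left\{B^2\left(\frac{t}{1-t}\right) - B^2\left(\frac{s}{1-s}\right)\right\}\right] = E\left\{B^2\left(\frac{s}{1-s}\right)\right\}E\left\{B\left(\frac{t}{1-t}\right) - B\left(\frac{s}{1-s}\right)\right\}^2$ by the independence of increments. These lead to
$$E\left[B^2\left(\tfrac{s}{1-s}\right)\left\{B^2\left(\tfrac{t}{1-t}\right) - B^2\left(\tfrac{s}{1-s}\right)\right\}\right] = E\left\{B^2\left(\tfrac{s}{1-s}\right)\right\}E\left\{B^2\left(\tfrac{t}{1-t}\right) - B^2\left(\tfrac{s}{1-s}\right)\right\}.$$
Therefore, we have
$$\operatorname{Cov}\left\{(1-s)^2B^2\left(\tfrac{s}{1-s}\right), (1-t)^2B^2\left(\tfrac{t}{1-t}\right)\right\} = (1-s)^2(1-t)^2E\left\{B^2\left(\tfrac{s}{1-s}\right)B^2\left(\tfrac{t}{1-t}\right)\right\} - s(1-s)t(1-t),$$
where
$$\begin{aligned} E\left\{B^2\left(\tfrac{s}{1-s}\right)B^2\left(\tfrac{t}{1-t}\right)\right\} &= E\left[B^2\left(\tfrac{s}{1-s}\right)\left\{B^2\left(\tfrac{s}{1-s}\right) + B^2\left(\tfrac{t}{1-t}\right) - B^2\left(\tfrac{s}{1-s}\right)\right\}\right] \\ &= E\left\{B^4\left(\tfrac{s}{1-s}\right)\right\} + E\left\{B^2\left(\tfrac{s}{1-s}\right)\right\}E\left\{B^2\left(\tfrac{t}{1-t}\right) - B^2\left(\tfrac{s}{1-s}\right)\right\} \\ &= 3\left(\frac{s}{1-s}\right)^2 + \frac{s}{1-s}\left(\frac{t}{1-t} - \frac{s}{1-s}\right). \end{aligned}$$
So, $\operatorname{Cov}\left\{(1-s)^2B^2\left(\frac{s}{1-s}\right), (1-t)^2B^2\left(\frac{t}{1-t}\right)\right\} = 2s^2(1-t)^2$.
In the next step, we will show that $n^{-1}\operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell), C_P(N_m, \bar N_m)\} \to 4s^2(1-t)^2$ for $\ell = sn$, $m = tn$, and $\ell < m$.
We first decompose $\operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell), C_P(N_m, \bar N_m)\}$. Note that $C_P(N_\ell, \bar N_\ell) = C_P(N_\ell, \bar N_\ell \cap N_m) + C_P(N_\ell, \bar N_m)$ and $C_P(N_m, \bar N_m) = C_P(N_\ell, \bar N_m) + C_P(\bar N_\ell \cap N_m, \bar N_m)$. Then,
$$\begin{aligned} \operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell), C_P(N_m, \bar N_m)\} ={}& \operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell \cap N_m), C_P(N_\ell, \bar N_m)\} \\ &+ \operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell \cap N_m), C_P(\bar N_\ell \cap N_m, \bar N_m)\} \\ &+ \operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_m), C_P(N_\ell, \bar N_m)\} \\ &+ \operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_m), C_P(\bar N_\ell \cap N_m, \bar N_m)\}. \end{aligned}$$
To calculate the covariance, we need the following moments for any disjoint subsets $A_1$, $A_2$, and $A_3$ of $N_n$. Their sizes are denoted as $n_1$, $n_2$ and $n_3$ with $n_1 + n_2 + n_3 \le n$.
$$E_{\mathrm{perm}}C_P(A_1, A_2) = \sum_{i=1}^{n-1}P\{(v_i \in A_1) \cap (v_{i+1} \in A_2) \cup (v_{i+1} \in A_1) \cap (v_i \in A_2)\} = \frac{2n_1n_2}{n}.$$
The following calculations of second moments need to consider the three cases $i = j$, $|i - j| = 1$ and $|i - j| > 1$ for $1 \le i, j < n$, with $\#\{i = j \mid 1 \le i, j < n\} = n - 1$, $\#\{|i - j| = 1 \mid 1 \le i, j < n\} = 2(n-2)$, and $\#\{|i - j| > 1 \mid 1 \le i, j < n\} = (n-2)(n-3)$. Therefore, we have
$$\begin{aligned} E_{\mathrm{perm}}\{C_P(A_1, A_2)C_P(A_2, A_3)\} ={}& \sum_{i=1}^{n-1}\sum_{j=1}^{n-1}P\{(v_i \in A_1) \cap (v_{i+1}, v_j \in A_2) \cap (v_{j+1} \in A_3) \\ &\cup (v_i \in A_1) \cap (v_{i+1}, v_{j+1} \in A_2) \cap (v_j \in A_3) \\ &\cup (v_{i+1} \in A_1) \cap (v_i, v_j \in A_2) \cap (v_{j+1} \in A_3) \\ &\cup (v_{i+1} \in A_1) \cap (v_i, v_{j+1} \in A_2) \cap (v_j \in A_3)\} \\ ={}& \sum_{i = j} 0 + \sum_{|i-j| = 1}\frac{n_1n_2n_3}{n(n-1)(n-2)} + \sum_{|i-j| > 1}\frac{4n_1n_2(n_2-1)n_3}{n(n-1)(n-2)(n-3)} \\ ={}& \frac{2n_1n_3n_2(2n_2-1)}{n(n-1)}. \end{aligned} \tag{A3}$$
In a similar way, we have
$$\begin{aligned} E_{\mathrm{perm}}\{C_P^2(A_1, A_2)\} ={}& \sum_{i = j}\frac{2n_1n_2}{n(n-1)} + \sum_{|i-j| = 1}\frac{n_1n_2(n_1+n_2-2)}{n(n-1)(n-2)} + \sum_{|i-j| > 1}\frac{4n_1(n_1-1)n_2(n_2-1)}{n(n-1)(n-2)(n-3)} \\ ={}& \frac{2n_1n_2}{n} + \frac{2n_1n_2(n_1+n_2-2)}{n(n-1)} + \frac{4n_1(n_1-1)n_2(n_2-1)}{n(n-1)}, \end{aligned} \tag{A4}$$
where all sums run over $1 \le i, j < n$.
After tedious calculations using (A3) and (A4), we have
$$\operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell), C_P(N_m, \bar N_m)\} = 2\ell(n-m)\{2\ell(n-m) - n\}/(n^3 - n^2),$$
which leads to $n^{-1}\operatorname{Cov}_{\mathrm{perm}}\{C_P(N_\ell, \bar N_\ell), C_P(N_m, \bar N_m)\} \to 4s^2(1-t)^2$. The proof of Lemma 1 is finished. □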
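The permutation moments above can be checked by brute force for small $n$. The sketch below enumerates all orderings of the labels $1,\dots,n$ along a path, counts the adjacent pairs that join two given sets, and compares the exact moments with the closed forms. It assumes that $C_P(A_1,A_2)$ counts the edges of the path with one endpoint in $A_1$ and the other in $A_2$, and that $N_\ell = \{1,\dots,\ell\}$ with $\bar N_\ell$ its complement; these readings and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations
from typing import Sequence, Set, Tuple


def crossings(perm: Sequence[int], a: Set[int], b: Set[int]) -> int:
    """Number of adjacent pairs with one endpoint in a and the other in b."""
    return sum(1 for u, v in zip(perm, perm[1:])
               if (u in a and v in b) or (v in a and u in b))


def perm_moments(n: int, a1: Set[int], a2: Set[int], a3: Set[int]) -> Tuple[Fraction, Fraction, Fraction]:
    """Exact E[C(a1,a2)], E[C(a1,a2) C(a2,a3)] and E[C(a1,a2)^2] over all n! orderings."""
    perms = list(permutations(range(1, n + 1)))
    total = len(perms)
    first = Fraction(sum(crossings(p, a1, a2) for p in perms), total)
    cross = Fraction(sum(crossings(p, a1, a2) * crossings(p, a2, a3) for p in perms), total)
    square = Fraction(sum(crossings(p, a1, a2) ** 2 for p in perms), total)
    return first, cross, square


def closed_forms(n: int, n1: int, n2: int, n3: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Closed-form moments from the displays above."""
    first = Fraction(2 * n1 * n2, n)
    cross = Fraction(2 * n1 * n3 * n2 * (2 * n2 - 1), n * (n - 1))
    square = (Fraction(2 * n1 * n2, n)
              + Fraction(2 * n1 * n2 * (n1 + n2 - 2), n * (n - 1))
              + Fraction(4 * n1 * (n1 - 1) * n2 * (n2 - 1), n * (n - 1)))
    return first, cross, square


def perm_cov(n: int, ell: int, m: int) -> Fraction:
    """Exact permutation covariance of C(N_ell, complement) and C(N_m, complement)."""
    low_l, high_l = set(range(1, ell + 1)), set(range(ell + 1, n + 1))
    low_m, high_m = set(range(1, m + 1)), set(range(m + 1, n + 1))
    perms = list(permutations(range(1, n + 1)))
    total = len(perms)
    x = [crossings(p, low_l, high_l) for p in perms]
    y = [crossings(p, low_m, high_m) for p in perms]
    mean_xy = Fraction(sum(xi * yi for xi, yi in zip(x, y)), total)
    return mean_xy - Fraction(sum(x), total) * Fraction(sum(y), total)


def cov_formula(n: int, ell: int, m: int) -> Fraction:
    """Closed-form covariance 2 ell (n-m) {2 ell (n-m) - n} / (n^3 - n^2)."""
    return Fraction(2 * ell * (n - m) * (2 * ell * (n - m) - n), n ** 3 - n ** 2)
```

Full enumeration grows as $n!$, so this check is only practical for $n$ up to about 8.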

Appendix D

Proof of Lemma 2.
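Consider the integral
$$I(\rho; a, b, \nu^2) = \int_{-\infty}^{\infty}\exp\{au - be^u - (u-\rho)^2/(2\nu^2)\}\,du,$$
where $a > 0$ is large and $b > 0$. The saddle point for the phase function $au - be^u$ is $c = \log(a/b)$. We set $s = u - c$ and define
$$az^2/2 = be^u - au - (be^c - ac) = a(e^s - s - 1)$$
such that $z$ is analytic near $s = 0$ and $z \sim s$ as $s \to 0$. It is easily seen that
$$z = s + \frac{s^2}{6} + \frac{s^3}{36} + \frac{s^4}{270} + \cdots, \qquad s = z - \frac{z^2}{6} + \frac{z^3}{36} - \frac{z^4}{270} + \cdots.$$
Moreover,
$$\frac{du}{dz} = \frac{az}{be^u - a} = \frac{z}{e^s - 1} = \frac{z}{s + z^2/2}.$$
Now, we can rewrite the integral as
$$I(\rho; a, b, \nu^2) = \left(\frac{a}{be}\right)^a\int_{-\infty}^{\infty}e^{-az^2/2}\,\frac{z\,e^{-(u-\rho)^2/(2\nu^2)}}{s + z^2/2}\,dz.$$
A simple calculation gives
$$\begin{aligned} e^{-\frac{(u-\rho)^2}{2\nu^2}} &= e^{-\frac{(c-\rho)^2}{2\nu^2} - \frac{(c-\rho)s}{\nu^2} - \frac{s^2}{2\nu^2}} \\ &= e^{-\frac{(c-\rho)^2}{2\nu^2}}\left[1 - \frac{(c-\rho)s}{\nu^2} - \frac{s^2}{2\nu^2} + \frac{(c-\rho)^2s^2}{2\nu^4} + O(s^3)\right] \\ &= e^{-\frac{(c-\rho)^2}{2\nu^2}}\left[1 - \frac{(c-\rho)(z - z^2/6)}{\nu^2} - \frac{z^2}{2\nu^2} + \frac{(c-\rho)^2z^2}{2\nu^4} + O(z^3)\right] \\ &= e^{-\frac{(c-\rho)^2}{2\nu^2}}\left[1 - \frac{(c-\rho)}{\nu^2}z + \frac{(c-\rho)\nu^2 - 3\nu^2 + 3(c-\rho)^2}{6\nu^4}z^2 + O(z^3)\right], \end{aligned}$$
and
$$\frac{z}{s + z^2/2} = \frac{1}{1 + z/3 + z^2/36 + O(z^3)} = 1 - \frac{z}{3} + \frac{z^2}{12} + O(z^3).$$
Consequently,
$$\frac{z\,e^{-(u-\rho)^2/(2\nu^2)}}{s + z^2/2} = e^{-\frac{(c-\rho)^2}{2\nu^2}}\left[1 - \frac{3(c-\rho) + \nu^2}{3\nu^2}z + \left(\frac{(c-\rho)\nu^2 - \nu^2 + (c-\rho)^2}{2\nu^4} + \frac{1}{12}\right)z^2 + O(z^3)\right].$$
By Watson's lemma, we obtain
$$\begin{aligned} I(\rho; a, b, \nu^2) &= \left(\frac{a}{be}\right)^ae^{-\frac{(c-\rho)^2}{2\nu^2}}\left\{\int_{-\infty}^{\infty}e^{-az^2/2}\left[1 - \frac{3(c-\rho) + \nu^2}{3\nu^2}z + \left(\frac{(c-\rho)\nu^2 - \nu^2 + (c-\rho)^2}{2\nu^4} + \frac{1}{12}\right)z^2\right]dz + O(a^{-5/2})\right\} \\ &= \left(\frac{a}{be}\right)^ae^{-\frac{(c-\rho)^2}{2\nu^2}}\left\{\sqrt{2\pi}\,a^{-1/2} + \left(\frac{(c-\rho)\nu^2 - \nu^2 + (c-\rho)^2}{2\nu^4} + \frac{1}{12}\right)\sqrt{2\pi}\,a^{-3/2} + O(a^{-5/2})\right\} \\ &= \left(\frac{a}{be}\right)^a\sqrt{\frac{2\pi}{a}}\,e^{-\frac{(c-\rho)^2}{2\nu^2}}\left\{1 + \left(\frac{(c-\rho)\nu^2 - \nu^2 + (c-\rho)^2}{2\nu^4} + \frac{1}{12}\right)a^{-1} + O(a^{-2})\right\}. \end{aligned}$$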
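The expansion can be compared with direct numerical integration. To avoid very large numbers, the sketch below works with the normalized integral $I/(a/(be))^a = \int \exp\{-a(e^s - s - 1) - (s + c - \rho)^2/(2\nu^2)\}\,ds$ and evaluates it with a plain trapezoid rule; the truncation range, step count, parameter values, and function names are illustrative assumptions only.

```python
import math
from typing import Tuple


def normalized_integral(a: float, b: float, rho: float, nu2: float,
                        half_width: float = 3.0, steps: int = 60000) -> float:
    """Trapezoid rule for I / (a/(b e))^a over s in [-half_width, half_width]."""
    c = math.log(a / b)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        s = -half_width + i * h
        value = math.exp(-a * (math.exp(s) - s - 1.0) - (s + c - rho) ** 2 / (2.0 * nu2))
        total += value if 0 < i < steps else 0.5 * value
    return total * h


def normalized_expansion(a: float, b: float, rho: float, nu2: float) -> Tuple[float, float]:
    """Leading term and the term corrected to order 1/a, both divided by (a/(b e))^a."""
    d = math.log(a / b) - rho                       # c - rho
    kappa = (d * nu2 - nu2 + d * d) / (2.0 * nu2 * nu2) + 1.0 / 12.0
    lead = math.sqrt(2.0 * math.pi / a) * math.exp(-d * d / (2.0 * nu2))
    return lead, lead * (1.0 + kappa / a)


exact = normalized_integral(50.0, 1.0, 3.5, 1.0)
lead, corrected = normalized_expansion(50.0, 1.0, 3.5, 1.0)
relative_errors = (abs(lead / exact - 1.0), abs(corrected / exact - 1.0))
```

For $a = 50$ the integrand is negligible outside $|s| \le 3$, so the truncation does not affect the comparison at the accuracy considered here.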

Appendix E

Proof of Theorem 3.
We denote $\bar Y_i^* = \bar Y_i - E(\bar Y_i)$ and $\hat\mu_{1,n}^* = \hat\mu_{1,n} - E\hat\mu_{1,n}$. Under the alternative hypothesis, $E\hat\mu_{1,n} = \{k^*\mu_- + (n - k^*)\mu_+\}/n$.
We first find a lower bound for $S_n(\bar Y; \tau, 2)$, i.e., $S_n(\bar Y; \tau, 2) \ge S_{k^*}(\bar Y; \tau, 2)$. Then, we decompose the lower bound into three terms:
$$S_{k^*}(\bar Y; \tau, 2) = S_{k^*}(\bar Y^*; \tau, 2) + \frac{2(n - k^*)(\mu_- - \mu_+)}{n}S_{k^*}(\bar Y^*; \tau, 1) + \frac{(n - k^*)^2(\mu_- - \mu_+)^2}{n^2}\sum_{k=1}^{k^*}k^2w_k^{-1}(\tau),$$
where $S_n(\bar Y^*; \tau, \gamma) = \sum_{k=1}^{n}w_k^{-1}(\tau)\left\{\sum_{i=1}^{k}(\bar Y_i^* - \hat\mu_{1,n}^*)\right\}^{\gamma}$.
By the weak dependence, q S k * ( Y ¯ * ; τ , 2 ) = O p ( 1 ) and q S k * ( Y ¯ * ; τ , 1 ) = O p ( | Δ | q n ) . Furthermore, n 2 ( n k * ) 2 ( μ μ + ) 2 k = 1 k * k 2 w k 1 ( τ ) = O ( n Δ 2 ) .
$qS_n(\bar Y; \tau, 2) \to_p \infty$ holds because $nq\Delta^2 \to \infty$ and $n^{3/2}q^{1/2}|\Delta| \to \infty$. □

References

  1. Csörgö, M.; Horváth, L. Limit Theorems in Change-Point Analysis; Wiley: Chichester, UK, 1997.
  2. Jiang, F.; Zhao, Z.; Shao, X. Modeling the COVID-19 infection trajectory: A piecewise linear quantile trend model. J. R. Statist. Soc. B 2021, accepted.
  3. Liu, B.; Zhou, C.; Zhang, X.; Liu, Y. A unified data-adaptive framework for high dimensional change point detection. J. R. Statist. Soc. B 2020, 82, 933–963.
  4. Yu, M.; Chen, X. Finite sample change point inference and identification for high-dimensional mean vectors. J. R. Statist. Soc. B 2021, 83, 247–270.
  5. Jandhyala, V.; Fotopoulos, S.; MacNeill, I.; Liu, P. Inference for single and multiple change-points in time series. J. Time Ser. Anal. 2013, 34, 423–446.
  6. Gardner, J.A. On detecting changes in the mean of normal variates. Ann. Math. Statist. 1969, 40, 116–126.
  7. Perron, P. Dealing with structural breaks. In Palgrave Handbook of Econometrics: Volume 1, Econometric Theory; Mills, T.C., Patterson, K., Eds.; Palgrave Macmillan: London, UK, 2006; pp. 278–352.
  8. MacNeill, I. Properties of sequences of partial sums of polynomial regression residuals with applications to tests for change of regression at unknown times. Ann. Statist. 1978, 6, 422–433.
  9. Daniels, H.E. Saddlepoint approximations in statistics. Ann. Math. Statist. 1954, 25, 631–650.
  10. Reid, N. Saddlepoint methods and statistical inference (with discussion). Statist. Sci. 1988, 3, 213–238.
  11. Reid, N. Approximations and asymptotics. In Statistics Theory Model; Essays in Honor of D.R. Cox; Chapman and Hall: London, UK, 1991; pp. 287–334.
  12. Shi, X.; Wang, X.-S.; Reid, N. Saddlepoint approximation of nonlinear moments. Statist. Sinica 2014, 24, 1597–1611.
  13. Shi, X.; Reid, N.; Wu, Y. Approximation to the moments of ratios of cumulative sums. Can. J. Statist. 2014, 42, 325–336.
  14. Akman, V.E.; Raftery, A.E. Asymptotic inference for a change-point Poisson process. Ann. Statist. 1986, 14, 1583–1590.
  15. Loader, C.R. A log-linear model for a Poisson process change point. Ann. Statist. 1992, 20, 1391–1411.
  16. Imhof, J.P. Computing the distribution of quadratic forms in normal variables. Biometrika 1961, 48, 419–426.
  17. Kuonen, D. Saddlepoint approximations for distributions of quadratic forms in normal variables. Biometrika 1999, 86, 929–935.
  18. Daniels, H.E. Tail probability approximations. Int. Statist. Rev. 1987, 55, 37–48.
  19. Lugannani, R.; Rice, S.O. Saddlepoint approximations for the distribution of the sum of independent random variables. Adv. Appl. Probab. 1980, 12, 475–490.
  20. Anderson, T.; Darling, D. Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann. Math. Statist. 1952, 23, 193–212.
  21. de Micheaux, P.L. R Package CompQuadForm. 2017. Available online: https://cran.r-project.org/web/packages/CompQuadForm/index.html (accessed on 25 December 2020).
  22. Anderson, T.; Darling, D. A test of “goodness of fit”. J. Amer. Statist. Assoc. 1954, 49, 765–769.
  23. Wald, A.; Wolfowitz, J. On a test whether two samples are from the same distribution. Ann. Math. Statist. 1940, 11, 147–162.
  24. Biswas, M.; Mukhopadhyay, M.; Ghosh, A.K. A distribution-free two-sample run test applicable to high-dimensional data. Biometrika 2014, 101, 913–926.
  25. Shi, X.; Wu, Y.; Rao, C.R. Consistent and powerful graph-based change-point test for high-dimensional data. Proc. Natl. Acad. Sci. USA 2017, 114, 3969–3974.
  26. Shi, X.; Wu, Y.; Rao, C.R. Consistent and powerful non-Euclidean graph-based change-point test with applications to segmenting random interfered video data. Proc. Natl. Acad. Sci. USA 2018, 115, 5914–5919.
  27. Hall, P.; Ormerod, J.T.; Wand, M.P. Theory of Gaussian variational approximation for a Poisson mixed model. Statist. Sinica 2011, 21, 369–389.
  28. Hall, P.; Pham, T.; Wand, M.P.; Wang, S.S.J. Asymptotic normality and valid inference for Gaussian variational approximation. Ann. Statist. 2011, 39, 2502–2532.
  29. Peligrad, M. An invariance principle for ϕ-mixing sequences. Ann. Probab. 1985, 13, 1304–1313.
  30. Phillips, P.C.B.; Solo, V. Asymptotics for linear processes. Ann. Statist. 1992, 20, 971–1001.
  31. Shao, X.; Zhang, X. Testing for change points in time series. J. Am. Statist. Assoc. 2010, 105, 1228–1240.
  32. Bai, J. Least square estimation of a shift in linear processes. J. Time Ser. Anal. 1994, 15, 453–472.
  33. Bai, J. Estimation of a change point in multiple regressions. Rev. Econ. Stat. 1997, 79, 551–563.
  34. Kokoszka, P.; Leipus, R.D. Change-point in the mean of dependent observations. Statist. Probab. Lett. 1998, 40, 385–393.
  35. Chen, H.; Zhang, N. Graph-based change-point detection. Ann. Statist. 2015, 43, 139–176.
  36. Chen, H.; Zhang, N. gSeg: Graph-Based Change-Point Detection (G-Segmentation). R Package Version 0.1. 2014. Available online: https://cran.r-project.org/web/packages/gSeg/index.html (accessed on 27 December 2020).
  37. Chen, M.; Shi, X.; Li, H. GraphCpClust: Graph-Based Change-Point Detection and Clustering. R Package Version 0.1. 2021. Available online: https://github.com/Meiqian-Chen/GraphCpClust (accessed on 27 April 2021).
  38. Lihoreau, M.; Chittka, L.; Raine, N.E. Monitoring flower visitation networks and interactions between pairs of bumble bees in a large outdoor flight cage. PLoS ONE 2016, 11, e0150844.
  39. Zou, H. The adaptive Lasso and its oracle properties. J. Am. Statist. Assoc. 2006, 101, 1418–1429.
  40. Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Assoc. 2001, 96, 1348–1360.
  41. Jin, B.; Shi, X.; Wu, Y. A novel and fast methodology for simultaneous multiple structural break estimation and variable selection for non-stationary time series models. Statist. Comput. 2013, 23, 221–231.
  42. Zhang, C.H. Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 2010, 38, 894–942.
  43. Cho, H.; Fryzlewicz, P. Multiple-change-point detection for high dimensional time series via sparsified binary segmentation. J. R. Statist. Soc. B 2015, 77, 475–507.
  44. Fryzlewicz, P. Wild binary segmentation for multiple change-point detection. Ann. Statist. 2014, 42, 2243–2281.
  45. Wang, T.; Samworth, R.J. High dimensional change point estimation via sparse projection. J. R. Statist. Soc. B 2017, 80, 57–83.
Figure 1. Plot of weights: $n^2$ (uniform), $k(n-k)$ (centered), $(n+k)(n-k)$ (left shifted), and $k(2n-k)$ (right shifted).
Figure 2. The pattern of eigenvalues (cross products of rows and columns) illustrated by dots for three weights: $w_k(n/2)$ for $\tau = n/2$ (blue), $w_k(0)$ for $\tau = 0$ (green), and $w_k(n)$ for $\tau = n$ (purple), as $n$ increases.
Figure 3. Typical images in three different image sets extracted from the same video data with different frame rates. The first row contains four images located at 4 (change point), 5, 40 (change point), and 41 from the first set of extracted 49 images (1 frame per second); the second row contains four images located at 7 (change point), 8, 79 (change point), and 80 from the second set of extracted 98 images (2 frames per second); and the third row contains four images located at 19 (change point), 20, 198 (change point), and 199 from the third set of extracted 245 images (5 frames per second).
Table 1. Critical values of $\sum_{k=1}^{n}\lambda_k(\tau)Z_k^2$ for different weights in (4), sizes ($n$), and probabilities ($p$).

| Weight | $p$ | $n = 20$ | $n = 40$ | $n = 60$ | $n = 80$ | $n = 100$ | $n = 200$ | $n = 400$ | $n = 1000$ | $n = 10{,}000$ | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $w_k(n/2)$ | 0.90 | 1.883 | 1.908 | 1.916 | 1.920 | 1.923 | 1.928 | 1.930 | 1.932 | 1.933 | 1.933 |
| | 0.925 | 2.111 | 2.136 | 2.145 | 2.149 | 2.151 | 2.156 | 2.159 | 2.160 | 2.161 | |
| | 0.95 | 2.442 | 2.467 | 2.476 | 2.480 | 2.482 | 2.487 | 2.490 | 2.491 | 2.492 | 2.492 |
| | 0.975 | 3.027 | 3.052 | 3.061 | 3.065 | 3.067 | 3.072 | 3.075 | 3.076 | 3.077 | 3.070 |
| | 0.99 | 3.828 | 3.853 | 3.861 | 3.866 | 3.868 | 3.873 | 3.876 | 3.877 | 3.878 | 3.850 |
| $w_k(0)$ | 0.90 | 0.599 | 0.605 | 0.607 | 0.608 | 0.609 | 0.610 | 0.611 | 0.611 | 0.611 | |
| | 0.925 | 0.675 | 0.682 | 0.684 | 0.685 | 0.685 | 0.687 | 0.687 | 0.688 | 0.688 | |
| | 0.95 | 0.786 | 0.792 | 0.794 | 0.795 | 0.796 | 0.797 | 0.798 | 0.798 | 0.798 | |
| | 0.975 | 0.981 | 0.988 | 0.990 | 0.991 | 0.991 | 0.993 | 0.993 | 0.994 | 0.994 | |
| | 0.99 | 1.249 | 1.255 | 1.257 | 1.258 | 1.259 | 1.260 | 1.261 | 1.261 | 1.261 | |
Table 2. Estimated power (%) for $w_k(0)$, $w_k(n/2)$, and $w_k(n)$ in (23), MST, MST$^*$, SHP, and SHP$^*$, based on 200 simulations; $n$ is the sample size, $q$ is the dimension, $k^*$ is the change point location, and $\Delta$ is the size of the change in the mean of the normal random variables.
n40 80
q50 100 50 100
Δ 0.10.20.10.20.10.20.10.2
k * n 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4
w k ( 0 ) 234136 679184 397272 96100100 437467 97100100 759691 100100100
w k ( n / 2 ) 314331 829282 517362 9910099 577761 99100100 879789 100100100
w k ( n ) 324323 899264 597346 9910091 617449 10010097 919776 100100100
MST354 7610 355 6199 345 72110 584 133514
MST * 182416 443844 332938 757775 212121 658069 364335 959998
SHP465 7912 469 10166 386 92213 568 132416
SHP * 10138 333733 172318 677765 91213 497154 223221 909792
Table 3. Estimated power (%) for $w_k(0)$, $w_k(n/2)$, and $w_k(n)$ in (23), MST, MST$^*$, SHP, and SHP$^*$, based on 200 simulations; $n$ is the sample size, $q$ is the dimension, $k^*$ is the change point location, and $\Delta$ is the size of the change in the mean of the mixed normal distributions.
n4080
q50 100 50 100
Δ 0.1 0.2 0.1 0.2 0.1 0.2 0.1 0.2
k * n 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4 1 4 1 2 3 4
w k ( 0 ) 183540 649281 377462 8910098 437259 9410099 739890 100100100
w k ( n / 2 ) 283637 809474 507657 9810097 537452 9810099 849887 100100100
w k ( n ) 323827839556587542991009260724410010096899973100100100
MST456 454 666 5106 446 42420 562 82520
MST * 201124 414439 272827 707264 222124 607565 383933 9510095
SHP448 101513 6510 172821 865 102918 9116 285441
SHP * 9711264126162018557057121215445949222823929688
Table 4. Estimated change points for $w_k(0)$, $w_k(n/2)$, $w_k(n)$, MST$^*$, and SHP$^*$, based on the extracted 49, 98, and 245 images; $n$ is the sample size and $k^*$ is the change point location.

| Method | $n = 49$, $k^* = 4$ | $n = 49$, $k^* = 40$ | $n = 98$, $k^* = 7$ | $n = 98$, $k^* = 79$ | $n = 245$, $k^* = 19$ | $n = 245$, $k^* = 198$ |
|---|---|---|---|---|---|---|
| $w_k(0)$ | | 41 | | 82 | | 206 |
| $w_k(n/2)$ | 4 | | 8 | | 19 | |
| $w_k(n)$ | 4 | | 8 | | 19 | |
| MST$^*$ | 4 | | 7 | | 19 | |
| SHP$^*$ | 4 | | 7 | | 19 | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Shi, X.; Wang, X.-S.; Reid, N. A New Class of Weighted CUSUM Statistics. Entropy 2022, 24, 1652. https://doi.org/10.3390/e24111652
