3.2. Detection and Estimation
To illustrate the advantages of the new method, we consider three time-series models in simulation and compare the new method with the least-squares estimator (LSE) for sample sizes T of 80, 100, and 120 and panel sizes N of 100 and 120. The number of each group . Here, we take . In each case, 1000 replications are carried out, the evaluation indexes are averaged, and the final simulation results are obtained.
Following [13,34], if there are s common change-points, the statistic can be defined as
where
for .
First, we consider the AR(1) model, and define
and
where , for , and the true change-point , , and .
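Because the parameter values of this design were given in equations omitted here, the setup can only be sketched. The following is a minimal, hypothetical version of an AR(1) panel with a single common mean shift; the values phi = 0.3, delta = 1.0, and tau = 0.5 are illustrative placeholders, not the paper's settings.

```python
import numpy as np

def simulate_ar1_panel(N=100, T=100, phi=0.3, delta=1.0, tau=0.5, seed=0):
    """Simulate a panel of AR(1) series with one common mean shift.

    y[i, t] = mu_t + e[i, t],  e[i, t] = phi * e[i, t-1] + eps[i, t],
    where mu_t jumps from 0 to delta at k = floor(tau * T).
    phi, delta, and tau are illustrative, not the paper's values.
    """
    rng = np.random.default_rng(seed)
    k = int(tau * T)                      # common change-point location
    eps = rng.standard_normal((N, T))
    e = np.zeros((N, T))
    e[:, 0] = eps[:, 0]
    for t in range(1, T):                 # AR(1) recursion across time
        e[:, t] = phi * e[:, t - 1] + eps[:, t]
    mu = np.zeros(T)
    mu[k:] = delta                        # mean shift after the change-point
    return mu + e, k

y, k = simulate_ar1_panel()
```

Averaging an evaluation index over 1000 such draws reproduces the replication protocol described above.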
Table 1 presents the empirical probability of the number of groups G. The true number of groups can be correctly estimated using MDL. As N increases, the empirical probability of correct judgment increases. To illustrate this pattern, we plot the empirical probability of correct judgment in Figure 1 with T fixed.
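The MDL criterion itself appears in an equation not reproduced here, so the following sketch only illustrates the general recipe: score each candidate number of groups G by a goodness-of-fit term plus a complexity penalty and choose the minimizer. The quantile-based clustering rule and the penalty below are toy stand-ins, not the paper's criterion.

```python
import numpy as np

def toy_mdl(y, G):
    """Toy MDL-type score for G groups: split series into G quantile
    groups by their sample means, then add a fit term (log average
    residual) and a complexity penalty that grows with G.  This is a
    caricature of the recipe, not the paper's criterion."""
    N, T = y.shape
    order = np.argsort(y.mean(axis=1))
    labels = np.empty(N, dtype=int)
    for g, chunk in enumerate(np.array_split(order, G)):
        labels[chunk] = g                 # contiguous quantile groups
    rss = sum(((y[labels == g] - y[labels == g].mean(axis=0)) ** 2).sum()
              for g in range(G))
    return N * T * np.log(rss / (N * T)) + 0.5 * G * T * np.log(N * T)

# Two well-separated groups of series; the minimizer recovers G = 2.
rng = np.random.default_rng(1)
y = rng.standard_normal((100, 80))
y[50:] += 3.0
G_hat = min(range(1, 5), key=lambda G: toy_mdl(y, G))
```

The fit term rewards small within-group residuals, while the penalty discourages splitting the panel into more groups than the data support.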
For a given G, Table 2 shows the estimation results of the new method and LSE. The new method consistently outperforms LSE.
Figure 2 presents the curves of D and RMSE against the panel size N and the time-series length T. In Figure 2, for fixed N, D decreases as T increases, indicating that the grouping improves. Likewise, for fixed T, the RMSE becomes smaller as N increases, indicating that the estimates become more accurate.
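The evaluation indexes can be computed along the following lines. Interpreting D as the smallest misgrouping fraction over relabelings, and scaling the change-point RMSE by T, are assumptions on our part, since the exact definitions are in the omitted formulas.

```python
from itertools import permutations
import numpy as np

def cp_rmse(est, true, T):
    """RMSE of estimated change-point locations, scaled by T (an
    assumed convention; the paper's exact scaling is omitted)."""
    est, true = np.asarray(est, float), np.asarray(true, float)
    return np.sqrt(np.mean((est - true) ** 2)) / T

def group_discrepancy(est_labels, true_labels, G):
    """Smallest fraction of misgrouped series over all relabelings of
    the G estimated groups (group labels are only identified up to a
    permutation).  Reading the paper's D this way is an assumption."""
    est = np.asarray(est_labels)
    true = np.asarray(true_labels)
    return min(np.mean(np.asarray(perm)[est] != true)
               for perm in permutations(range(G)))
```

For example, `cp_rmse([48, 51, 50], [50, 50, 50], T=100)` averages the squared location errors of three replications before scaling.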
However, Table 1 shows that there is a small probability that is greater than G. When this happens, say , the three change-points can still be accurately estimated, and the fourth group consists of individual elements from the , and . Table 3 shows the RMSE of the estimates in this case, where we only consider the RMSE of the three change-points. The change-points can still be estimated well when the group number G is misestimated, although the results are worse than when G is estimated correctly.
Then, we consider the MA(2) model
and
where , for , and the true change-point and .
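As with the AR(1) case, the MA(2) design can only be sketched because the coefficient values were given in the omitted equations; theta = (0.4, 0.2), delta = 1.0, and tau = 0.5 below are illustrative placeholders.

```python
import numpy as np

def simulate_ma2_panel(N=100, T=100, theta=(0.4, 0.2), delta=1.0,
                       tau=0.5, seed=0):
    """Panel of MA(2) series with one common mean shift:
    y[i, t] = mu_t + eps[i, t] + th1*eps[i, t-1] + th2*eps[i, t-2],
    with mu_t jumping from 0 to delta at k = floor(tau * T).
    theta, delta, and tau are illustrative, not the paper's values."""
    rng = np.random.default_rng(seed)
    k = int(tau * T)
    th1, th2 = theta
    eps = rng.standard_normal((N, T + 2))  # two extra columns for the lags
    e = eps[:, 2:] + th1 * eps[:, 1:-1] + th2 * eps[:, :-2]
    mu = np.zeros(T)
    mu[k:] = delta
    return mu + e, k

y, k = simulate_ma2_panel()
```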
Table 4 shows the empirical probability of G taking and 5 in 1000 replications. It demonstrates that G is estimated correctly with more than 90 percent probability. Figure 3 shows that the empirical probability of correct group selection increases as N increases.
From Table 4, G is chosen as 3. Given , we use the SCIP solver to estimate the change-points, as shown in Table 5. The RMSE of the new method is smaller than that of LSE, which means that the new method performs better. Furthermore, the D of the new method is small, which means that the grouping is accurate.
In Figure 4, we show how D and RMSE change with T and N: D decreases as T increases, and RMSE decreases as N increases.
Finally, we consider a time-series model with a trend term, and define
where , , , and the true change-point , , and . We define the square loss function as
Table 6 shows the empirical probability of the estimated number of groups. The results indicate that MDL estimates the number of groups with a high empirical probability. Figure 5 shows the change in the empirical probability; notably, it approaches 1 as N increases.
In Table 7, we display the D and RMSE of the new estimator. The new method performs well on the time-series model with a trend. Figure 6 shows that D decreases as T increases, meaning that the grouping becomes more and more accurate. Further, RMSE decreases as N increases, meaning that the estimates improve.
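For the trend model, a plain squared-loss change-point search fits a separate intercept and slope on each side of every candidate split and keeps the split with the smallest total squared loss. The sketch below follows this generic recipe only, since the paper's loss function is in an omitted equation; the slope 0.05, jump 2.0, and noise level are illustrative values.

```python
import numpy as np

def estimate_cp_trend(y, min_seg=5):
    """Single change-point search under squared loss for a series with
    a linear trend: fit intercept + slope separately on each side of
    every candidate split and return the split minimising the total
    squared loss."""
    T = len(y)
    t = np.arange(T, dtype=float)

    def sse(seg_t, seg_y):
        # Ordinary least squares on one segment: y ~ a + b * t.
        X = np.column_stack([np.ones_like(seg_t), seg_t])
        beta, *_ = np.linalg.lstsq(X, seg_y, rcond=None)
        r = seg_y - X @ beta
        return r @ r

    best_k, best = min_seg, np.inf
    for k in range(min_seg, T - min_seg):
        total = sse(t[:k], y[:k]) + sse(t[k:], y[k:])
        if total < best:
            best_k, best = k, total
    return best_k

# Linear trend with a level shift of 2.0 at t = 60 (illustrative).
rng = np.random.default_rng(2)
tt = np.arange(120)
y = 0.05 * tt + 2.0 * (tt >= 60) + 0.3 * rng.standard_normal(120)
k_hat = estimate_cp_trend(y)
```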
In the case of one change-point, the results can be summarized as follows: the new method performs better than LSE. With fixed T, as N increases, the empirical probability of choosing the right number of groups approaches 1 and the RMSE becomes smaller. With fixed N, the set coverage becomes smaller as T increases.
For multiple change-points, consider the AR(1) model, and define
and
where , and changes with T: when , ; when , ; and when , .
Table 8 shows the empirical probability of the estimated number of groups. An accurate estimate of the group number can be obtained using the MDL criterion. To illustrate the convergence rate of Algorithm 1, we show the curve of coverage (D) versus s in Figure 7. The algorithm converges after five iterations. Following Ref. [13], when the number of change-points is unknown, we use LSE combined with an AIC or BIC penalty for detection. The statistic is defined as
where the number of change-points s is unknown; for the AIC penalty and for the BIC penalty.
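Since the penalized statistic itself is in an omitted equation, the following sketch only illustrates the generic approach: grow change-points greedily by least squares and select the number s minimizing a log-likelihood term plus a penalty. The penalty forms 2s (AIC-type) and s log T (BIC-type) are placeholders for the paper's definitions, and the greedy search stands in for the exact optimization.

```python
import numpy as np

def seg_sse(y, cps):
    """Total squared loss around segment means, given sorted interior
    change-points cps."""
    bounds = [0, *cps, len(y)]
    return sum(((y[a:b] - y[a:b].mean()) ** 2).sum()
               for a, b in zip(bounds, bounds[1:]))

def greedy_cps(y, s_max):
    """Add change-points one at a time (binary-segmentation style),
    recording the best configuration for each s = 0, ..., s_max."""
    cps, path = [], {0: ([], seg_sse(y, []))}
    for s in range(1, s_max + 1):
        best = None
        for k in range(2, len(y) - 1):
            if k in cps:
                continue
            sse = seg_sse(y, sorted(cps + [k]))
            if best is None or sse < best[1]:
                best = (k, sse)
        cps = sorted(cps + [best[0]])
        path[s] = (cps, best[1])
    return path

def detect_s(y, s_max=4, pen="bic"):
    """Choose s by minimising T*log(SSE/T) + penalty(s); 2s (AIC-type)
    and s*log(T) (BIC-type) are generic placeholder penalties."""
    T, path = len(y), greedy_cps(y, s_max)
    p = (lambda s: 2 * s) if pen == "aic" else (lambda s: s * np.log(T))
    s_hat = min(path, key=lambda s: T * np.log(path[s][1] / T) + p(s))
    return s_hat, path

# One series with two mean shifts at t = 40 and t = 80 (illustrative).
rng = np.random.default_rng(3)
y = np.concatenate([np.zeros(40), 2.0 * np.ones(40), np.zeros(40)])
y = y + 0.3 * rng.standard_normal(120)
s_hat, path = detect_s(y)
```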
Table 9 presents the
D,
F, and
of the new method and LSE. Clearly, the new method divides the groups accurately, and accurately obtains the number and position of change-points in each group. Using AIC penalty, the number of change-points can be obtained accurately, while the BIC penalty is less than the real number of change-points.
Although the method of Ref. [29] cannot be applied to the above model, we can set the means within a group to be the same and define the following model:
This model is equivalent to taking all the regression variables in Ref. [29] as 1, and
Then, we compare the new method with Ref. [29] under this model. The tuning parameter in Ref. [29] is selected by searching the interval [1, 10,000] with 100 evenly spaced logarithmic grid points. We present the results of the new method and Ref. [29] in Table 10 (in this case, given , the new method splits the panel into two groups with probability less than 1%; such replications are excluded from the results). The grouping of Ref. [29] is much better than that of the new method, possibly because Ref. [29] requires the same model parameters within a group and exploits this information. For the estimation of the number and position of the change-points, the two methods perform similarly. However, when the means within a group differ, the new method can still be applied, whereas Ref. [29] cannot.
Last, we implement the method of Ref. [29] in Python and report the computation times in Table 11 (each is the average over 100 replications; the CPU is an 11th Gen Intel Core i5-1135G7). The new method is faster than Ref. [29], possibly because the objective function of Ref. [29] is more complex and its parameters need to be tuned.
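A minimal harness for the average-of-100-replications timing protocol might look like this; `time.perf_counter` is one reasonable clock choice, though the text does not say which was used.

```python
import time

def average_runtime(fn, n_rep=100):
    """Average wall-clock time of fn() over n_rep replications, as in
    an average-of-100 timing protocol."""
    start = time.perf_counter()
    for _ in range(n_rep):
        fn()
    return (time.perf_counter() - start) / n_rep

# Example: time a cheap stand-in workload over 10 replications.
t_bar = average_runtime(lambda: sorted(range(10000)), n_rep=10)
```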