Next Article in Journal
Response-Based Sampling for Binary Choice Models With Sample Selection
Next Article in Special Issue
Econometric Fine Art Valuation by Combining Hedonic and Repeat-Sales Information
Previous Article in Journal
Top Incomes, Heavy Tails, and Rank-Size Regressions
Previous Article in Special Issue
Bayesian Analysis of Bubbles in Asset Prices
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Jackknife Bias Reduction in the Presence of a Near-Unit Root

by
Marcus J. Chambers
1,* and
Maria Kyriacou
2
1
Department of Economics, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
2
Department of Economics, University of Southampton, Southampton SO17 1BJ, UK
*
Author to whom correspondence should be addressed.
Econometrics 2018, 6(1), 11; https://doi.org/10.3390/econometrics6010011
Submission received: 29 September 2017 / Revised: 9 February 2018 / Accepted: 22 February 2018 / Published: 5 March 2018
(This article belongs to the Special Issue Celebrated Econometricians: Peter Phillips)

Abstract

:
This paper considers the specification and performance of jackknife estimators of the autoregressive coefficient in a model with a near-unit root. The limit distributions of sub-sample estimators that are used in the construction of the jackknife estimator are derived, and the joint moment generating function (MGF) of two components of these distributions is obtained and its properties explored. The MGF can be used to derive the weights for an optimal jackknife estimator that removes fully the first-order finite sample bias from the estimator. The resulting jackknife estimator is shown to perform well in finite samples and, with a suitable choice of the number of sub-samples, is shown to reduce the overall finite sample root mean squared error, as well as bias. However, the optimal jackknife weights rely on knowledge of the near-unit root parameter and a quantity that is related to the long-run variance of the disturbance process, which are typically unknown in practice, and so, this dependence is characterised fully and a discussion provided of the issues that arise in practice in the most general settings.
JEL Classification:
C22

1. Introduction

Throughout his career, Peter Phillips has made important contributions to knowledge across the broad spectrum of econometrics and statistics, providing inspiration to many other researchers along the way. This paper builds on two of the strands of Peter’s research, namely jackknife bias reduction and the analysis of nonstationary time series. Indeed, our own work on the jackknife (Chambers 2013, 2015; Chambers and Kyriacou 2013) was inspired by Peter’s work on this topic with Jun Yu, published as Phillips and Yu (2005), and the current contribution also extends the results on moment generating functions (MGFs) contained in Phillips (1987a).
The jackknife has been proven to be an easy-to-implement method of eliminating first-order estimation bias in a wide variety of applications in statistics and econometrics. Its genesis can be traced to Quenouille (1956) and Tukey (1958) in the case of independently and identically distributed (iid) samples, while it has been adapted more recently to accommodate more general time series settings. Within the class of stationary autoregressive time series models, Phillips and Yu (2005) show that the jackknife can effectively reduce bias in the pricing of bond options in finance, while Chambers (2013) analyses the performance of jackknife methods based on a variety of sub-sampling procedures. In subsequent work, Chambers and Kyriacou (2013) demonstrate that the usual jackknife construction in the time series case has to be amended when a unit root is present, while Chen and Yu (2015) show that a variance-minimising jackknife can be constructed in a unit root setting that also retains its bias reduction properties. In addition, Kruse and Kaufmann (2015) compare bootstrap, jackknife and indirect inference estimators in mildly explosive autoregressions, finding that the indirect inference estimator dominates in terms of root mean squared error, but that the jackknife excels for bias reduction in stationary and unit root situations.
The usual motivation for a jackknife estimator relies on the existence of a Nagar-type expansion of the original full-sample estimator’s bias. Its construction proceeds by finding a set of weights that, when applied to a full-sample estimator and a set of sub-sample estimators, is able to eliminate fully the first-order term in the resulting jackknife estimator’s bias expansion. In stationary time series settings, the bias expansions are common to both the full-sample and sub-sample estimators, but Chambers and Kyriacou (2013) pointed out that this property no longer holds in the case of a unit root. This is because the initial values in the sub-samples are no longer negligible in the asymptotics and have a resulting effect on the bias expansions, thereby affecting the optimal weights. Construction of a fully-effective jackknife estimator relies, therefore, on knowledge of the presence (or otherwise) of a unit root.
In this paper, we explore the construction of jackknife estimators that eliminate fully the first-order bias in the near-unit root setting. Near-unit root models have attracted a great deal of interest in time series owing, amongst other things, to their ability to capture better the effects of sample size in the vicinity of a unit root, to explore analytically the power properties of unit root tests and to allow the development of an integrated asymptotic theory for both stationary and non-stationary autoregressions; see Phillips (1987a) and Chan and Wei (1987) for details. We find that jackknife estimators can be constructed in the presence of a near-unit root that achieve this aim of bias reduction. Jackknife estimators have the advantage of incurring only a very slight additional computational burden, unlike alternative resampling and simulation-based methods such as the bootstrap and indirect inference. Furthermore, they are applicable in a wide variety of estimation frameworks and work well in finite sample situations in which the prime objective is bias reduction. Although the bootstrap is often a viable candidate for bias reduction, it was shown by Park (2006) that the bootstrap is inconsistent in the presence of a near-unit root, and hence, jackknife methods offer a useful alternative in these circumstances.
The development of a jackknife estimator that achieves bias reduction in the near-unit root case is not simply a straightforward application of previous results. While Chambers and Kyriacou (2013) first pointed out that under unit root non-stationarity, the effect of the sub-sample initial conditions does not vanish asymptotically, thereby affecting asymptotic expansions of sub-sample estimator bias and the resulting jackknife weights as compared to the stationary case, the extension of these results to a local-to-unity setting is not obvious. With a near-unit root, the autoregressive parameter plays an important role, and it is therefore necessary to derive the appropriate asymptotic expansion of sub-sample estimator bias for this more general case, as well as the MGFs of the relevant limiting distributions that can be used to construct the appropriate jackknife weights. The derivation of such results is challenging in itself and is a major reason why we focus on the bias-minimising jackknife, rather than attempting to derive results for the variance-minimising jackknife of Chen and Yu (2015).
The paper is organised as follows. Section 2 defines the near-unit root model of interest and focuses on the limit distributions of sub-sample estimators, demonstrating that these limit distributions are sub-sample dependent. An asymptotic expansion of these limit distributions demonstrates the source of the failure of the standard jackknife weights in a near-unit root setting by showing that the bias expansion is also sub-sample dependent. In order to define a successful jackknife estimator, it is necessary to compute the mean of these limit distributions, and so, Section 3 derives the moment generating function of two random variables that determine the limit distributions over an arbitrary sub-interval of the unit interval. Expressions for the computation of the mean of the ratio of the two random variables are derived using the MGF. Various properties of the MGF are established, and it is shown that results obtained in Phillips (1987a) arise as a special case, including those that emerge as the near-unit root parameter tends to minus infinity.
Based on the results in Section 2 and Section 3, the optimal weights for the jackknife estimator are defined in Section 4, which then goes on to explore, via simulations, the performance of the proposed estimator in finite samples. Consideration is given to the choice of the appropriate number of sub-samples to use when either bias reduction or root mean squared error (RMSE) minimisation is the objective. It is found that greatest bias reduction can be achieved using just two sub-samples, while minimisation of RMSE, which, it should be stressed, is not the objective of the jackknife estimator, requires a larger number of sub-samples, which increases with sample size. Section 5 contains some concluding comments, and all proofs are contained in the Appendix A.
The following notation will be used throughout the paper. The symbol = d denotes equality in distribution; d denotes convergence in distribution; p denotes convergence in probability; ⇒ denotes weak convergence of the relevant probability measures; W ( r ) denotes a Wiener process on C [ 0 , 1 ] , the space of continuous real-valued functions on the unit interval; and J c ( r ) = 0 r e ( r s ) c d W ( s ) denotes the Ornstein–Uhlenbeck process, which satisfies d J c ( r ) = c J c ( r ) d r + d W ( r ) for some constant parameter c. Functionals of W ( r ) and J c ( r ) , such as 0 1 J c ( r ) 2 d r , are denoted 0 1 J c 2 for notational convenience where appropriate, and in stochastic integrals of the form e c r J c , it is to be understood that integration is carried out with respect to r. Finally, L denotes the lag operator such that L j y t = y t j for a random variable y t .

2. Jackknife Estimation with a Near-Unit Root

2.1. The Model and the Standard Jackknife Estimator

The model with a near-unit root is defined as follows.
Assumption 1.
The sequence y 1 , , y n satisfies:
y t = ρ y t 1 + u t , t = 1 , , n ,
where ρ = e c / n = 1 + c / n + O ( n 2 ) for some constant c, y 0 is an observable O p ( 1 ) random variable and u t is the stationary linear process:
u t = δ ( L ) ϵ t = j = 0 δ j ϵ t j , t = 1 , , n ,
where ϵ t iid ( 0 , σ ϵ 2 ) , E ( ϵ t 4 ) < , δ ( z ) = j = 0 δ j z j , δ 0 = 1 and j = 0 j | δ j | < .
The parameter c controls the extent to which the near-unit root deviates from unity; when c < 0 , the process is (locally) stationary, whereas it is (locally) explosive when c > 0 . Strictly speaking, the autoregressive parameter should be denoted ρ n to emphasise its dependence on the sample size, n, but we use ρ for notational convenience. The linear process specification for the innovations is consistent with u t being a stationary ARMA(p, q) process of the form ϕ ( L ) u t = θ ( L ) ϵ t , where ϕ ( z ) = j = 0 p ϕ j z j , θ ( z ) = j = 0 q θ j z j , and all roots of the equation ϕ ( z ) = 0 lie outside the unit circle. In this case, δ ( z ) = θ ( z ) / ϕ ( z ) , but Assumption 1 also allows for more general forms of linear processes and is not restricted solely to the ARMA class. Under Assumption 1, u t satisfies the functional central limit theorem:
1 n t = 1 [ n r ] u t σ W ( r ) a s n
on C [ 0 , 1 ] , where σ 2 = σ ϵ 2 δ ( 1 ) 2 denotes the long-run variance.
Equations of the form (1) have been used extensively in the literature on testing for an autoregressive unit root (corresponding to c = 0 ) and for examining the power properties of the resulting tests (by allowing c to deviate from zero). In economic and financial time series, they offer a flexible mechanism of modelling highly persistent series whose autoregressive roots are generally close, but not exactly equal, to unity. Ordinary least squares (OLS) regression on (1) yields:
y t = ρ ^ y t 1 + u ^ t , t = 1 , , n ,
where u ^ t denotes the regression residual, and it can be shown see Phillips (1987a) that ρ ^ satisfies:
n ( ρ ^ ρ ) = 1 n t = 1 n y t 1 u t 1 n 2 t = 1 n y t 1 2 Z c ( η ) = 0 1 J c d W + 1 2 ( 1 η ) 0 1 J c 2 a s n ,
where η = σ u 2 / σ 2 , σ u 2 = E ( u t 2 ) = σ ϵ 2 j = 0 δ j 2 and the functional Z c ( η ) is implicitly defined. The limit distribution in (5) is skewed, and the estimator suffers from significant negative bias in finite samples; see Perron (1989) for properties of the limit distribution for the case where σ 2 = σ u 2 and (hence) η = 1 .
The jackknife estimator offers a computationally simple method of bias reduction by combining the full-sample estimator, ρ ^ , with a set of m sub-sample estimators, ρ ^ j ( j = 1 , , m ) , the weights assigned to these components depending on the type of sub-sampling method employed. Phillips and Yu (2005) find the use of non-overlapping sub-samples to perform well in reducing bias in the estimation of stationary diffusions, while the analysis of Chambers (2013) supports this result in the setting of stationary autoregressions. In this approach, the full sample of n observations is divided into m sub-samples, each of length , so that n = m × . The generic form of jackknife estimator is given by:
ρ ^ J = w 1 ρ ^ + w 2 1 m j = 1 m ρ ^ j ,
where the weights are determined so as to eliminate the first-order finite sample bias. Assuming that the full-sample estimator and each sub-sample estimator satisfy a (Nagar-type) bias expansion of the form:
E ( ρ ^ ρ ) = a n + O 1 n 2 , E ( ρ ^ j ρ ) = a + O 1 2 ,
it can be shown that the appropriate weights are given by w 1 = m / ( m 1 ) and w 2 = 1 / ( m 1 ) , in which case:
E ( ρ ^ J ρ ) = m m 1 E ( ρ ^ ρ ) 1 m 1 1 m j = 1 m E ( ρ ^ j ρ ) = m m 1 a n + O 1 n 2 1 m 1 a + O 1 2 = a m 1 m n 1 + O 1 n 2 = O 1 n 2 ,
using the fact that m / n = 1 / . Under such circumstances, the jackknife estimator is capable of completely eliminating the O ( 1 / n ) bias term in the estimator as compared to ρ ^ . However, in the pure unit root setting ( c = 0 ), Chambers and Kyriacou (2013) demonstrated that the sub-sample estimators do not share the same limit distribution as the full-sample estimator, which means that the expansions for the bias of the sub-sample estimators are incorrect, and hence, the weights defined above do not eliminate fully the first-order bias. It is therefore important to investigate this issue in the more general setting of a near-unit root with a view toward deriving the appropriate weights for eliminating the first-order bias term.

2.2. Sub-Sample Properties

In order to explore the sub-sample properties, let:
τ j = { ( j 1 ) + 1 , , j } , j = 1 , , m ,
denote the set of integers indexing the observations in each sub-sample. The sub-sample estimators can then be written, in view of (5), as:
ρ ^ j ρ = 1 t τ j y t 1 u t 1 2 t τ j y t 1 2 , j = 1 , , m .
Theorem 1 (below) determines the limiting properties of the quantities appearing in (7), as well as the limit distribution of ( ρ ^ j ρ ) itself.
Theorem 1.
Let y 1 , , y n satisfy Assumption 1. Then, if m is fixed as n (and hence, ):
(a) 
1 2 t τ j y t 2 σ 2 m 2 ( j 1 ) / m j / m J c 2 ;
(b) 
1 t τ j y t 1 u t σ 2 m ( j 1 ) / m j / m J c d W + 1 2 ( σ 2 σ u 2 ) ;
(c) 
( ρ ^ j ρ ) Z c , j ( η ) = ( j 1 ) / m j / m J c d W + 1 2 m ( 1 η ) m ( j 1 ) / m j / m J c 2 , j = 1 , , m ,
η = σ u 2 / σ 2 .
where the functional Z c , j ( η ) is implicitly defined.
The limit distribution in Part (c) of Theorem 1 is of the same form as that of the full-sample estimator in (5), except that the integrals are over the subset [ ( j 1 ) / m , j / m ] of [ 0 , 1 ] rather than the unit interval itself. Note, too, that the first component of the numerator of Z c , j ( η ) also has the representation:
( j 1 ) / m j / m J c d W = d 1 2 J c j m 2 J c j 1 m 2 2 c ( j 1 ) / m j / m J c 2 1 m ,
which follows from the Itô calculus and is demonstrated in the proof of Part (b) of Theorem 1 in the Appendix. The familiar result, 0 1 W d W = [ W ( 1 ) 2 1 ] / 2 , follows as a special case by setting j = m = 1 and c = 0 .
The fact that the distributions Z c , j ( η ) in Theorem 1 depend on j implies that the expansions for E ( ρ ^ j ρ ) that are used to derive the jackknife weights defined following (6) may not be correct under a near-unit root. When the process (1) has a near-unit root, we can expect the expansions for E ( ρ ^ j ρ ) to be of the form:
E ρ ^ j ρ = μ c , j + O 1 2 , j = 1 , , m ;
indeed, we later justify this expansion and characterise μ c , j precisely. Such expansions have been shown to hold in the unit root ( c = 0 ) case, as well as more generally when c 0 . For example, Phillips (1987b, Theorem 7.1) considered the Gaussian random walk (corresponding to (1) with c = 0 , δ ( z ) = 1 , y 0 = 0 and u t Gaussian) and demonstrated the validity of an asymptotic expansion for the normalised coefficient estimator; it is given by:
n ( ρ ^ 1 ) = d 0 1 W d W 0 1 W 2 ξ 2 n 0 1 W 2 + O p 1 n ,
where ξ is a standard normal random variable distributed independently of W. Taking expectations in (9), using the independence of ξ and W and noting that the expected value of the leading term is 1.7814 (see, for example, Table 7.1 of Tanaka 1996), the bias satisfies:
E ( ρ ^ 1 ) = 1 . 7814 n + o 1 n ;
see, also, Phillips (2012, 2014). In the more general setting of the model in Assumption 1, and assuming that u t is Gaussian, Theorem 1 of Perron (1996) established that:
n ( ρ ^ ρ ) = d 0 1 J c d W + 1 2 ( 1 η ) + y 0 σ n 0 1 e c r d W v f 2 σ 2 n ξ 0 1 J c 2 + 2 y 0 σ n 0 1 e c r J c + O p 1 n ,
where v f 2 = 2 π f u 2 ( 0 ) and f u 2 ( 0 ) denotes the spectral density of u t 2 σ u 2 at the origin. The following result extends the type of expansion in (11) to the sub-sample estimators.
Theorem 2.
Let y 1 , , y n satisfy Assumption 1 and, in addition, assume that u t is Gaussian. Then, for j = 1 , , m ,
( ρ ^ j ρ ) = d ( j 1 ) / m j / m J c d W + 1 2 m ( 1 η ) + 2 y 0 σ ξ 1 j v f σ 2 m ξ 2 j m ( j 1 ) / m j / m J c 2 + 2 m y 0 σ ( j 1 ) / m j / m e c r J c + O p 1 ,
where ξ i j N ( 0 , s j 2 ) , ξ 2 j N ( 0 , 1 ) and:
s j 2 = ( m 1 ) 2 2 c m e c j / m e c ( j 1 ) / m 2 + ( m 2 m 2 ) 2 c m e c j / m e c ( j 1 ) / m + 2 ( 1 + m ) m 2 e c j / m .
The form of the expansion for ( ρ ^ j ρ ) in Theorem 2 is similar to that for the full-sample estimator, but also depends on m and j. Use of these expansions to derive expressions for the biases of ρ ^ and ρ ^ j would be complicated due to the dependence on y 0 . We therefore take y 0 = 0 1, which results in the following expectations:
E ( ρ ^ ρ ) = E ( Z c ( η ) ) n + O 1 n 2 , E ( ρ ^ j ρ ) = E ( Z c , j ( η ) ) + O 1 2 ,
these results utilising the independence of the normally distributed random variables ( ξ 1 j and ξ 2 j ) and the Wiener process W. The next section provides the form of the moment generating function that enables expectations of the functionals Z c ( η ) and Z c , j ( η ) to be computed, thereby enabling the construction of a jackknife estimator that eliminates the first-order bias.

3. A Moment Generating Function and Its Properties

The following result provides the joint moment generating function (MGF) of two relevant functionals of J c defined over a subinterval [ a , b ] of [ 0 , b ] where 0 a < b . Although our focus is on sub-intervals of [ 0 , 1 ] , we leave b unconstrained for greater generality than is required for our specific purposes because the results may have more widespread use beyond our particular application.
Theorem 3.
Let N c = a b J c ( r ) d W ( r ) and D c = a b J c ( r ) 2 d r , where J c ( r ) is an Ornstein–Uhlenbeck process on r [ 0 , b ] with parameter c, and 0 a < b . Then:
(a) 
The joint MGF of N c and D c is given by:
M c ( θ 1 , θ 2 ) = E exp ( θ 1 N c + θ 2 D c ) = exp ( θ 1 + c ) 2 ( b a ) H c ( θ 1 , θ 2 ) 1 / 2 ,
where, defining λ = ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 and v 2 = ( e 2 a c 1 ) / ( 2 c ) ,
H c ( θ 1 , θ 2 ) = cosh ( b a ) λ 1 λ θ 1 + c + v 2 θ 1 2 + 2 θ 2 sinh ( b a ) λ .
(b) 
The individual MGFs for N c and D c are given by, respectively,
M N c ( θ 1 ) = exp ( θ 1 + c ) 2 ( b a ) × cosh ( ( b a ) λ 1 ) 1 λ 1 θ 1 + c + v 2 θ 1 2 sinh ( ( b a ) λ 1 ) 1 / 2
M D c ( θ 2 ) = exp c 2 ( b a ) cosh ( b a ) λ 2 1 λ 2 c + 2 v 2 θ 2 sinh ( b a ) λ 2 1 / 2
where λ 1 = ( c 2 + 2 c θ 1 ) 1 / 2 and λ 2 = ( c 2 2 θ 2 ) 1 / 2 .
(c) 
Let:
g ( θ 2 ) = cosh ( b a ) ( c 2 + 2 θ 2 ) 1 / 2 c 2 v 2 θ 2 sinh ( b a ) ( c 2 + 2 θ 2 ) 1 / 2 ( c 2 + 2 θ 2 ) 1 / 2 .
Then, the expectation of N c / D c is given by:
E N c D c = 0 M c ( θ 1 , θ 2 ) θ 1 θ 1 = 0 d θ 2 = I 1 ( a , b ) + I 2 ( a , b ) + I 3 ( a , b ) + I 4 ( a , b ) ,
where:
I 1 ( a , b ) = ( b a ) 2 exp c ( b a ) 2 0 1 g ( θ 2 ) 1 / 2 d θ 2 , I 2 ( a , b ) = c ( b a ) 1 2 exp c ( b a ) 2 0 sinh ( b a ) ( c 2 + 2 θ 2 ) 1 / 2 ( c 2 + 2 θ 2 ) 1 / 2 g ( θ 2 ) 3 / 2 d θ 2 , I 3 ( a , b ) = c 2 exp c ( b a ) 2 0 c 2 v 2 θ 2 sinh ( b a ) ( c 2 + 2 θ 2 ) 1 / 2 ( c 2 + 2 θ 2 ) 3 / 2 g ( θ 2 ) 3 / 2 d θ 2 , I 4 ( a , b ) = c ( b a ) 2 exp c ( b a ) 2 0 c 2 v 2 θ 2 cosh ( b a ) ( c 2 + 2 θ 2 ) 1 / 2 ( c 2 + 2 θ 2 ) g ( θ 2 ) 3 / 2 d θ 2 .
The MGFs for the two functionals in Theorem 3 have potential applications in a wide range of sub-sampling problems with near-unit root processes. A potential application of the joint MGF in Part (a) of Theorem 3 is in the computation of the cumulative and probability density functions of the distributions Z c , j ( η ) when setting a = ( j 1 ) / m and b = j / m . For example, the probability density function of m Z c , j ( 1 ) is given by (with i 2 = 1 ):
p d f ( z ) = 1 2 π i lim ϵ 1 0 , ϵ 2 ϵ 1 < | θ 1 | < ϵ 2 M c ( i θ 1 , i θ 2 ) θ 2 θ 2 = θ 1 z d θ 1 ;
see, for example, Perron (1991, p. 221), who performs this type of calculation for the distribution Z c , while Abadir (1993) derives a representation for the density function of Z c in terms of a parabolic cylinder function.
The result in Part (b) of Theorem 3 is obtained by differentiating the MGF and constructing the appropriate integrals. When c = 0 , the usual (full-sample) result, where a = 0 and b = 1 , can be obtained as a special case. Noting that v 2 = 0 in this case and making the substitution w = ( c 2 + 2 θ 2 ) 1 / 2 results in:
I 1 ( 0 , 1 ) = 1 2 0 w cosh ( w ) 1 / 2 d w , I 2 ( 0 , 1 ) = 1 2 0 sinh ( w ) cosh ( w ) 3 / 2 d w , I 3 ( 0 , 1 ) = 0 , I 4 ( 0 , 1 ) = 0 ;
these expressions can be found; for example, Gonzalo and Pitarakis (1998, Lemma 3.1). Some further special cases of interest that follow from Theorem 3 are presented below.
Corollary to Theorem 3.
(a) Let [ a , b ] = [ 0 , 1 ] so that N c = 0 1 J c ( r ) d W ( r ) and D c = 0 1 J c ( r ) 2 d r . Then:
M c ( θ 1 , θ 2 ) = exp ( θ 1 + c ) 2 cosh ( λ ) 1 λ θ 1 + c sinh ( λ ) 1 / 2 , M N c ( θ 1 ) = exp ( θ 1 + c ) 2 cosh ( λ 1 ) 1 λ 1 θ 1 + c sinh ( λ 1 ) 1 / 2 , M D c ( θ 2 ) = exp c 2 cosh λ 2 c λ 2 sinh λ 2 1 / 2 ,
while taking the limit as c 0 yields:
M 0 ( θ 1 , θ 2 ) = exp θ 1 2 cosh λ 0 θ 1 λ 0 sinh λ 0 1 / 2 , M N 0 ( θ 1 ) = e θ 1 / 2 ( 1 θ 1 ) 1 / 2 , M D 0 ( θ 2 ) = cosh λ 0 1 / 2 ,
where λ 0 = 2 θ 2 .
(b) Let [ a , b ] = [ ( j 1 ) / m , j / m ] so that N c = ( j 1 ) / m j / m J c ( r ) d W ( r ) and D c = ( j 1 ) / m j / m J c ( r ) 2 d r   ( j = 1 , , m ) . Then:
M c ( θ 1 , θ 2 ) = exp ( θ 1 + c ) 2 m cosh λ m 1 λ θ 1 + c + v j 1 2 θ 1 2 + 2 θ 2 sinh λ m 1 / 2 , M N c ( θ 1 ) = exp ( θ 1 + c ) 2 m cosh λ 1 m 1 λ 1 θ 1 + c + v j 1 2 θ 1 2 sinh λ 1 m 1 / 2 ,
M D c ( θ 2 ) = exp c 2 m cosh λ 2 m 1 λ 2 c + 2 v j 1 2 θ 2 sinh λ 2 m 1 / 2 ,
where v j 1 2 = ( exp ( 2 ( j 1 ) c / m ) 1 ) / ( 2 c ) . Taking the limit as c 0 results in:
M 0 ( θ 1 , θ 2 ) = exp θ 1 2 m cosh λ 0 m 1 λ 0 θ 1 + ( j 1 ) m θ 1 2 + 2 θ 2 sinh λ 0 m 1 / 2 , M N 0 ( θ 1 ) = exp θ 1 2 m 1 θ 1 m ( j 1 ) θ 1 2 m 2 1 / 2 , M D 0 ( θ 2 ) = cosh λ 0 m 2 ( j 1 ) θ 2 m λ 0 sinh λ 0 m 1 / 2 .
The results in Part (a) of the corollary are relevant in the full-sample case, and the result for M 0 ( θ 1 , θ 2 ) goes back to White (1958). The results in Part (b) of the corollary are pertinent to the sub-sampling issues being investigated here in the case of a near-unit root, with the unit root ( c = 0 ) result for M 0 ( θ 1 , θ 2 ) having been first derived by Chambers and Kyriacou (2013).
It is also possible to use the above results to explore the relationship between the sub-sample distributions and the full-sample distribution. For example, it is possible to show that M N c / m ( θ 1 / m ) on [ 0 , 1 ] is equal to M N c ( θ 1 ) for j = 1 in the sub-samples, while M D c / m ( θ 2 / m 2 ) on [ 0 , 1 ] is equal to M D c ( θ 2 ) for j = 1 in the sub-samples; an implication of this is that:
0 1 / m J c d W = d 1 m 0 1 J c / m d W , 0 1 / m J c 2 = d 1 m 2 0 1 J c / m 2 .
Furthermore, this implies that the limit distribution of the first sub-sample estimator, ( ρ ^ 1 ρ ) , when ρ = e c / n = e c / m , is the same as that of the full-sample estimator, n ( ρ ^ ρ ) , when ρ = e c / m n .
The sub-sample results with a near-unit root can be related to the full-sample results of Phillips (1987a). For example, the MGF in Theorem 3 has the equivalent representation:
M c ( θ 1 , θ 2 ) = 1 2 exp ( θ 1 + c ) ( b a ) λ 1 ( 2 λ + δ ) ( 1 + δ v 2 ) exp ( z ) 2 λ δ v 2 + δ ( 1 + δ v 2 ) exp ( z )
where λ and v 2 are defined in the theorem, z = ( b a ) λ and δ = θ 1 + c λ . When a = 0 , b = 1 , it follows that v 2 = 0 , and the above expression nests the MGF in Phillips (1987a), i.e.,
M c ( θ 1 , θ 2 ) = 1 2 exp ( θ 1 + c ) ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 × θ 1 + c + ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 exp ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 θ 1 + c ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 exp ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 ;
this follows straightforwardly from (14). It is also of interest to examine what happens when the local-to-unity parameter c , as in Phillips (1987a) and other recent work on autoregression, e.g., Phillips (2012). We present the results in Theorem 4 below.
Theorem 4.
Let J c ( r ) denote an Ornstein–Uhlenbeck process on r [ 0 , b ] with parameter c, and let 0 a < b . Furthermore, define the functional:
K ( c ) = g ( c ) 1 / 2 a b J c 2 1 a b J c d W + 1 2 ( 1 η ) ,
where η = σ u 2 / σ 2 and:
g ( c ) = E a b J c 2 = 1 4 c 2 exp ( 2 b c ) exp ( 2 a c ) + 1 2 c ( b a ) .
Then, as c :
(a) 
( 2 c ) 1 / 2 a b J c d W N ( 0 , ( b a ) ) ;
(b) 
( 2 c ) a b J c 2 p ( b a ) ;
(c) 
K ( c ) N ( 0 , 1 ) if σ u 2 = σ 2 (and hence η = 1 ) and diverges otherwise.
The functional K ( c ) in Theorem 4 represents the limit distribution of the normalised estimator g ( c ) 1 / 2 ( ρ ^ a , b ρ ) , where denotes the number of observations in the sub-sample [ b n + 1 , b n ] (so that a = b ( 1 / m ) in this case) and ρ ^ a , b is the corresponding estimator. However, as pointed out by Phillips (1987a), the sequential limits (large sample for fixed c, followed by c ) are only indicative of the results one might expect in the stationary case and do not constitute a rigorous demonstration. The results in Theorem 4 also encompass the related results in Phillips (1987a) obtained when a = 0 and b = 1 .

4. An Optimal Jackknife Estimator

The discussion following Theorem 2 indicates that the weights defining an optimal jackknife estimator, which removes first-order bias in the local-to-unity setting, depend on the quantities:
μ c ( η ) = E ( Z c ( η ) ) = μ c ( 1 ) + λ 1 ( η ) E 1 D c , μ c , j ( η ) = E ( Z c , j ( η ) ) = μ c , j ( 1 ) + λ 2 ( η , m ) E 1 D c , j , j = 1 , , m ,
where λ 1 ( η ) = ( 1 η ) / 2 , λ 2 ( η , m ) = λ 1 ( η ) / m 2 and:
D c = 0 1 J c 2 , D c , j = ( j 1 ) / m j / m J c 2 , j = 1 , , m .
In particular, Part (c) of Theorem 3 can be used to evaluate the quantities:
μ c ( 1 ) = E N c D c , μ c , j ( 1 ) = E N c , j m D c , j , j = 1 , , m ,
where we have defined:
N c = 0 1 J c d W , N c , j = ( j 1 ) / m j / m J c d W , j = 1 , , m .
The relevant MGFs for evaluating μ c ( 1 ) and μ c , j ( 1 ) are given in the corollary to Theorem 3.
Table 1 contains the values of μ c , j ( 1 ) for values of m and c as follows: m = { 1 , 2 , 3 , 4 , 6 , 8 , 12 } and c = { 50 , 20 , 10 , 5 , 1 , 0 , 1 } .2 The entries for m = 1 correspond to μ c ( 1 ) = E ( Z c ( 1 ) ) in view of the distributional equivalence of Z c ( 1 ) and Z c , 1 ( 1 ) discussed following the corollary. For a given combination of j and m, it can be seen that the expectations increase as c increases, while for given c and j the expectations increase with m. A simple explanation for the different properties of the sub-samples beyond j = 1 is that the initial values are of the same order of magnitude as the partial sums of the innovations. The values of the sub-sample expectations when c = 0 are seen from Table 1 to be independent of m and to increase with j. Note that μ 0 , 1 ( 1 ) = 1.7814 corresponds to the expected value of the limit distribution of the full-sample estimator ρ ^ under a unit root; see, for example, (10) and the associated commentary. The values of μ 0 , j ( η ) can be used to define jackknife weights under a unit root for different values of m; see, for example, Chambers and Kyriacou (2013). More generally, the values of μ c , j ( η ) can be used to define optimal weights for the jackknife estimator that achieve the aim of first-order bias removal in the presence of a near-unit root. The result is presented in Theorem 5.
Theorem 5.
Let μ ¯ c ( η ) = μ c ( η ) j = 1 m μ c , j ( η ) . Then, under Assumption 1, an optimal jackknife estimator is given by:
ρ ^ J = w 1 , c ( η ) ρ ^ + w 2 , c ( η ) 1 m j = 1 m ρ ^ j ,
where w 1 , c ( η ) = j = 1 m μ c , j ( η ) / μ ¯ c ( η ) and w 2 , c ( η ) = 1 w 1 , c ( η ) = μ c ( η ) / μ ¯ c ( η ) .
Theorem 5 shows the form of the optimal weights for the jackknife estimator when the process (1) has a near-unit root. It can be seen that the weights depend not only on the value of c, but also on the value of η , both of which are unknown in practice. The authors in Chambers and Kyriacou (2013) and Chen and Yu (2015) have emphasised the case c = 0 and η = 1 and have reported simulation results highlighting the good bias-reduction properties of appropriate jackknife estimators in that case. When c 0 and η = 1 , the optimal weights in Theorem 5 simplify to:
w 1 , c ( 1 ) = j = 1 m μ c , j ( 1 ) / μ ¯ c ( 1 ) , w 2 , c ( 1 ) = μ c ( 1 ) / μ ¯ c ( 1 ) .
The values of μ c , j ( 1 ) in Table 1 can be utilised to derive these optimal weights for the jackknife estimator in this case; these are reported in Table 2 for the values of m and c used in Table 1, along with the values of the standard weights that are applicable in stationary autoregressions. The entries in Table 1 show that the optimal weights are larger in (absolute) value than the standard weights that would apply if all the sub-sample distributions were the same and that they increase with c for given m. The optimal weights also converge towards the standard weights as c becomes more negative; this could presumably be demonstrated analytically using the properties of the MGF in constructing the μ c , j ( 1 ) by examining the appropriate limits as c , although we do not pursue such an investigation here.
The relationship between the optimal weights when η = 1 and when η 1 is not straightforward. Noting that:
μ ¯ c ( η ) = μ ¯ c ( 1 ) + λ 1 ( η ) E 1 D c λ 2 ( η , m ) j = 1 m E 1 D c , j
and that:
j = 1 m μ c , j ( η ) = j = 1 m μ c , j ( 1 ) + λ 2 ( η , m ) j = 1 m E 1 D c , j
we find that:
w 1 , c ( η ) = j = 1 m μ c , j ( 1 ) + λ 2 ( η , m ) j = 1 m E 1 D c , j μ ¯ c ( 1 ) + λ 1 ( η ) E 1 D c λ 2 ( η , m ) j = 1 m E 1 D c , j .
This expression can be manipulated to write w 1 , c ( η ) explicitly in terms of w 1 , c ( 1 ) as follows:
w 1 , c ( η ) = w 1 , c ( 1 ) λ 2 ( η , m ) μ ¯ c ( 1 ) j = 1 m E 1 D c , j 1 + λ 1 ( η ) μ ¯ c ( 1 ) E 1 D c λ 2 ( η , m ) μ ¯ c ( 1 ) j = 1 m E 1 D c , j .
The second weight is obtained simply as w 2 , c ( η ) = 1 w 1 , c ( η ) . In situations where η 1 , which essentially reflects cases where u t does not have a white noise structure, the optimal weights can be obtained from the entries in Table 2 (at least for the relevant values of c) using (15), but knowledge is still required not only of c and η , but also the expectations of the inverses of D c and D c , j . The latter can be computed numerically; Equation (2.3) of Meng (2005) shows that:
E 1 D c = 0 M D c ( θ 2 ) d θ 2 , E 1 D c , j = 0 M D c , j ( θ 2 ) d θ 2 , j = 1 , , m ,
where M D c ( θ 2 ) and M D c , j ( θ 2 ) are the MGFs of D c and D c , j , respectively, which can be obtained from the corollary to Theorem 3.
In practice, however, the values of c and η are still required in order to construct the optimal estimator. Although the localization parameter c from the model defined under Assumption 1 is identifiable, it is not possible to estimate it consistently, and attempts to do so require a completely different formulation of the model; see, for example, Phillips et al. (2001), who propose a block local-to-unity framework to consistently estimate c, although this approach does not appear to have been pursued subsequently. Furthermore η depends on an estimator of the long-run variance σ 2 , which is a notoriously difficult quantity to estimate in finite samples. In view of these unresolved challenges and following earlier work on jackknife estimation of autoregressive models with a (near-)unit root, we focus on the case η = 1 , but allow c 0 with particular attention paid to unit root and locally-stationary processes, i.e., c 0 .
Our simulations examine the performance of five estimators3 of the parameter ρ = e c / n . The baseline estimator is the OLS estimator in (4), the bias of which the jackknife estimators aim to reduce. Three jackknife estimators with the generic form:
ρ ^ J = w 1 ρ ^ + w 2 1 m j = 1 m ρ ^ j
are also considered, each differing in the choice of weights w 1 ; in all cases, w 2 = 1 w 1 . The standard jackknife sets w 1 = m / ( m 1 ) ; the optimal jackknife sets w 1 = w 1 , c ; and the unit root jackknife sets w 1 = w 1 , 0 . The standard jackknife removes fully the first-order bias in stationary autoregressions, but does not do so in the near-unit root framework, in which the optimal estimator achieves this goal. However, the optimal estimator is infeasible because it relies on the unknown parameter c.4 We therefore also consider the feasible unit root jackknife obtained by setting c = 0 . In addition, we consider the jackknife estimator of Chen and Yu (2015), which is of the form:
ρ ^ J C Y = w 1 ρ ^ + j = 1 m w 2 , j ρ ^ j .
The weights are chosen so as to minimise the variance of the estimator in addition to providing bias reduction in the case c = 0 . Because the choice of weights is a more complex problem for this type of jackknife estimator, Chen and Yu only provide results for the cases m = 2 and m = 3 , in which case the weights are w 1 = 2 . 8390 , w 2 , 1 = 0.6771 , w 2 , 2 = 1.1619 and w 1 = 2.0260 , w 2 , 1 = 0.2087 , w 2 , 2 = 0.3376 , w 2 , 3 = 0.4797 , respectively; see Table 1 of Chen and Yu (2015).
Table 3 reports the bias of the five estimators obtained from 100,000 replications of the model in Assumption 1 with u t i i d N ( 0 , 1 ) and y 0 = 0 using m = 2 for each of the jackknife estimators; this value has been found to provide particularly good bias reduction in a number of studies, including Phillips and Yu (2005), Chambers (2013), Chambers and Kyriacou (2013) and Chen and Yu (2015). The particular values of c are c = { 10 , 5 , 1 , 0 } , which focus on the pure unit root case, as well as locally stationary processes, and four sample sizes are considered, these being n = 24, 48, 96 and 192. The corresponding values of ρ are: { 0.5833 , 0.7917 , 0.8958 , 0.9479 }   ( c = 10 ) ; { 0.7917 , 0.8958 , 0.9479 , 0.9740 }   ( c = 5 ) ; { 0.9583 , 0.9792 , 0.9896 , 0.9948 }   ( c = 1 ) ; and ρ = 1 for all values of n when c = 0 . The values of ρ when c = 10 are some way from unity for the smaller sample sizes, which suggests that the standard jackknife might perform well in these cases.
The value of the bias of the estimator producing the minimum (absolute) bias for each c and n is highlighted in bold in Table 3. The results show the substantial reduction in bias that can be achieved with jackknife estimators, the superiority of the optimal estimator being apparent as c becomes more negative, although the unit root jackknife also performs well in terms of bias reduction.
Table 4 contains the corresponding RMSE values for the jackknife estimators using m = 2 , as well as the RMSE corresponding to the RMSE-minimising values of m, which are typically larger than m = 2 and are also reported in the table. The RMSE value of the estimator producing the minimum RMSE for each c and n is highlighted in bold. In fact, the optimal jackknife estimator, although constructed to eliminate first order bias, manages to reduce the OLS estimator’s RMSE and outperforms the Chen and Yu (2015) jackknife estimator in both bias and RMSE reduction, although the latter occurs at a larger number of sub-samples. The results show that use of larger values of m tends to produce smaller RMSE than when m = 2 , and again, the optimal jackknife performs particularly well when c becomes more negative. The performance of the unit root jackknife is also impressive, suggesting that it is a feasible alternative to the optimal estimator when the value of c is unknown.
Although in itself important, bias is not the only feature of a distribution that is of interest, and hence, the RMSE values in Table 4 should also be taken into account when assessing the performance of the estimators. The substantial bias reductions obtained with the bias-minimising value of m = 2 come at the cost of a larger variance that ultimately feeds through into a larger RMSE compared with the OLS estimator ρ ^ . This can be offset, however, by using the larger RMSE-minimising values of m that, despite having a larger bias than when m = 2 , are nevertheless able to reduce the variance sufficiently to result in a smaller RMSE than ρ ^ .
In order to assess the robustness of the jackknife estimators, some additional bias results are presented in Table 5 that correspond to values of η < 1 , while the estimators are based on the assumption that η = 1 , as in the preceding simulations.5 The results correspond to two different specifications for u t that enable data to be generated that are consistent with different values of η . The first specifies u t to be a first-order moving average (MA(1)) process, so that u t = ϵ t + θ ϵ t 1 where ϵ t i i d N ( 0 , 1 ) ; in this case η = ( 1 + θ 2 ) / ( 1 + θ ) 2 . The second specification is a first-order autoregressive (AR(1)) process of the form u t = ϕ u t 1 + ϵ t , in which case η = ( 1 ϕ ) 2 / ( 1 ϕ 2 ) . In the MA(1) case, we have chosen θ = 0 . 5 in order to give an intermediate value of η = 0.5556 , while in the AR(1) case, we have chosen ϕ = 0.9 to give a small value of η = 0.0526 . As in Table 3, the value of the bias of the estimator producing the minimum (absolute) bias for each c and n is highlighted in bold. Table 5 shows, in the MA case, that the jackknife estimators are able to reduce bias when c = 0 , but none of them is able to do so when c = 1 or c = 5 . In the AR case, with a smaller value of η , the jackknife estimators are still able to deliver bias reduction, albeit to a lesser extent than when η = 1 , and it is the unit root jackknife of Chambers and Kyriacou (2013) that achieves the greatest bias reduction in this case. These results are indicative of the importance of knowing η and suggest that developing methods to allow for η 1 is important from an empirical viewpoint.

5. Conclusions

This paper has analysed the specification and performance of jackknife estimators of the autoregressive coefficient in a model with a near-unit root. The limit distributions of sub-sample estimators that are used in the construction of the jackknife estimator are derived, and the joint MGF of two components of that distribution is obtained and its properties explored. The MGF can then be used to derive the weights for an optimal jackknife estimator that removes fully the first-order finite sample bias from the OLS estimator. The resulting jackknife estimator is shown to perform well at finite sample bias reduction in a simulation study and, with a suitable choice of the number of sub-samples, is shown to be able to reduce the overall finite sample RMSE, as well.
The theoretical findings in Section 3 and Section 4 show how first-order approximations on sub-sample estimators can be used along with the well-known full-sample results of Phillips (1987a) for finite-sample refinements. The jackknife uses analytical (rather than simulation-based) results to achieve bias reduction at minimal computational cost along the same lines as indirect inference methods based on analytical approximations in Phillips (2012) and Kyriacou et al. (2017). Apart from computational simplicity, an evident advantage of analytical-based methods over simulation-based alternatives such as bootstrap or (traditional, simulation-based) indirect inference methods is that they require no distributional assumptions on the error term.
Despite its success in achieving substantial bias reduction in finite samples, as shown in the simulations, a shortcoming of the jackknife estimator, and an impediment to its use in practice, is the dependence of the optimal weights on the unknown near-unit root parameter, as well as on a quantity related to the long-run variance of the disturbances.6 However, our theoretical results in Section 3 and Section 4 reveal precisely how these quantities affect the optimal weights and therefore can, in principle, be used to guide further research into the development of a feasible data-driven version of the jackknife within this framework. Such further work is potentially useful in view of the simulations in Table 3 and Table 4 highlighting that (feasible) jackknife estimators are an effective bias and RMSE reduction tool in a local unit root setting, even if they do not fully remove first order bias. Moreover, the results obtained in Theorems 1–4 can be utilised in a wide range of sub-sampling situations outside that of jackknife estimation itself.
The results in this paper could be utilised and extended in a number of directions. An obvious application would be in the use of jackknife estimators as the basis for developing unit root test statistics, the local-to-unity framework being particularly well suited to the analysis of the power functions of such tests. It would also be possible to develop, fully, a variance-minimising jackknife estimator along the lines of Chen and Yu (2015) who derived analytic results for c = 0 and m = 2 or 3, although extending their approach to arbitrary c and m represents a challenging task. However, considerable progress has been made in this direction by Stoykov (2017), who builds upon our results and also proposes a two-step jackknife estimator that incorporates an estimate of c to determine the jackknife weights. The estimation model could also be extended to include an intercept and/or a time trend. The presence of an intercept will affect the limit distributions by replacing the Ornstein–Uhlenbeck processes by demeaned versions thereof, which will also have an effect on the finite sample biases. Such effects have been investigated by Stoykov (2017), who shows that substantial reductions in bias can still be achieved by jackknife methods. Applications of jackknife methods in multivariate time series settings are also possible, a recent example being Chambers (2015) in the case of a cointegrated system, but other multivariate possibilities could be envisaged.

Acknowledgments

We thank the Editors, two anonymous referees and Marian Stoykov for helpful comments on this piece of work. The initial part of the first author’s research was funded by the Economic and Social Research Council under Grant Number RES-000-22-3082.

Author Contributions

Both authors contributed equally to the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A.

Proof of Theorem 1.
The proofs of Parts (a) and (b) rely on the solution to the stochastic difference equation generating y t , which is given by:
y t = j = 1 t e c ( t j ) / n u j + e c t / n y 0 .
The normalised partial sums of u t , S t = j = 1 t u j , are also important, as is the functional:
X n ( r ) = 1 n S [ n r ] = 1 n S j 1 , j 1 n r < j n .
Under the conditions on u t , it follows that X n ( r ) σ W ( r ) as n . Taking each part in turn:
(a) In view of (A1) and (A2), the object of interest can be written:
1 2 t τ j y t 2 = 1 2 t τ j j = 1 t e c ( t j ) / n u j + e c t / n y 0 2 = 1 2 t τ j j = 1 t e c ( t j ) / n u j 2 + 2 e c t / n y 0 j = 1 t e c ( t j ) / n u j + e 2 c t / n y 0 2 = n 2 2 t τ j ( t 1 ) / n t / n j = 1 t e c ( t j ) / n ( j 1 ) / n j / n d X n ( s ) 2 d r + o p ( 1 ) = m 2 ( j 1 ) / m j / m 0 r e c ( r s ) d X n ( s ) 2 d r + o p ( 1 ) σ 2 m 2 ( j 1 ) / m j / m J c 2
where, in the penultimate line, we note that j / n = j / m and ( j 1 ) / n = ( j 1 ) / m to give the limits of the outer integral.
(b) Squaring the difference equation for y t , summing over t τ j and noting that e 2 c / n = 1 + ( 2 c / n ) + O ( n 2 ) , we obtain:
1 t τ j y t 2 = e 2 c / n t τ j y t 1 2 + 2 e c / n t τ j y t 1 u t + 1 t τ j u t 2 = 1 t τ j y t 1 2 + 2 c n t τ j y t 1 2 + 2 t τ j y t 1 u t + 1 t τ j u t 2 + o p ( 1 ) .
Solving for the quantity of interest yields:
1 t τ j y t 1 u t = 1 2 1 t τ j y t 2 1 t τ j y t 1 2 2 c n t τ j y t 1 2 1 t τ j u t 2 + o p ( 1 ) = 1 2 1 y j 2 1 y ( j 1 ) 2 2 c m 2 t τ j y t 1 2 1 t τ j u t 2 + o p ( 1 ) .
Now, as n ,
1 y j = m n y j σ m J c j m , 1 y ( j 1 ) = m n y ( j 1 ) σ m J c j 1 m ,
noting that j = ( j / m ) n and ( j 1 ) = ( ( j 1 ) / m ) n . It follows that:
1 t τ j y t 1 u t 1 2 σ 2 m J c j m 2 σ 2 m J c j 1 m 2 2 c m σ 2 m 2 ( j 1 ) / m j / m J c 2 σ u 2 2 = σ 2 m 2 J c j m 2 J c j 1 m 2 2 c ( j 1 ) / m j / m J c 2 σ u 2 2 .
Using the Itô calculus (see, for example, Tanaka (1996, p. 58), we obtain the following stochastic differential equation for J c ( t ) 2 :
d [ J c ( t ) 2 ] = 2 J c ( t ) d J c ( t ) + d t ;
substituting d J c ( t ) = c J c ( t ) d t + d W ( t ) then yields:
d [ J c ( t ) 2 ] = 2 J c ( t ) d W ( t ) + 1 + 2 c J c ( t ) 2 d t .
Integrating the above over [ ( j 1 ) / m , j / m ] , we find that:
J c j m 2 J c j 1 m 2 = 2 ( j 1 ) / m j / m J c d W + 2 c ( j 1 ) / m j / m J c 2 + 1 m ,
and hence, we obtain:
1 t τ j y t 1 u t σ 2 m 2 2 ( j 1 ) / m j / m J c d W + 1 m σ u 2 2 = σ 2 m ( j 1 ) / m j / m J c d W + 1 2 ( σ 2 σ u 2 )
as required.
(c) The result follows immediately from Parts (a) and (b) in view of (7). ☐
Proof of Theorem 2.
Proceeding as in the proof of Theorem 1, but retaining higher order terms, we find that:
1 2 t τ j y t 2 = n 2 2 t τ j ( t 1 ) / n t / n j = 1 t e c ( t j ) / n ( j 1 ) / n j / n d X n ( s ) 2 d r + 2 n 3 / 2 y 0 2 t τ j ( t 1 ) / n t / n e c t / n j = 1 t e c ( t j ) / n ( j 1 ) / n j / n d X n ( s ) d r + n 2 t τ j ( t 1 ) / n t / n e 2 c t / n d r y 0 2 = m 2 ( j 1 ) / m j / m 0 r e c ( r s ) d X n ( s ) 2 d r + 2 m 3 / 2 y 0 ( j 1 ) / m j / m e c r 0 r e c ( r s ) d X n ( s ) d r + m 2 ( j 1 ) / m j / m e 2 c r d r y 0 2 = d σ 2 m 2 ( j 1 ) / m j / m J c 2 + 2 σ m 3 / 2 y 0 ( j 1 ) / m j / m e c r J c + O p 1 .
Next, as before, we have:
1 t τ j y t 1 u t = 1 2 1 y j 2 1 y ( j 1 ) 2 2 c m 2 t τ j y t 1 2 1 t τ j u t 2 + O p 1 .
Now:
1 y j 2 = d σ m J c j m + m e c j / m y 0 2 + O p 1 = d σ 2 m J c j m 2 + 2 σ m e c j / m y 0 J c j m + O p 1 ;
a similar result holds for ( y ( j 1 ) / ) 2 . Furthermore,
1 t τ j u t 2 = 1 1 t τ j ( u t 2 σ u 2 ) + σ u 2 = d v f ξ 2 j + σ u 2 + O p 1
where ξ 2 j N ( 0 , 1 ) ( j = 1 , , m ) . Combining with the result for ( 1 / 2 ) t τ j y t 2 , we find that:
1 t τ j y t 1 u t = d 1 2 σ 2 m J c j m 2 J c j 1 m 2 2 c ( j 1 ) / m j / m J c 2 1 m + σ 2 + 2 σ m y 0 e c j / m J c j m e c ( j 1 ) / m J c j 1 m 4 c σ m y 0 ( j 1 ) / m j / m e c r J c 1 ξ 2 j σ u 2 + O p 1 = d σ 2 m ( j 1 ) / m j / m J c d W + 1 2 ( σ 2 σ u 2 ) + 2 σ m y 0 ξ 1 j v f ξ 2 j + O p 1
where ξ 1 j   ( j = 1 , , m ) is defined by:
ξ 1 j = e c j / m J c j m e c ( j 1 ) / m J c j 1 m 2 c m ( j 1 ) / m j / m e c r J c .
The stated distribution of ξ 1 j then follows using the property that:
E J c ( r ) J c ( s ) = e c ( r + s ) e c ( max ( r , s ) min ( r , s ) ) 2 c
to calculate the variances and covariances; see Perron (1991, p. 234). In particular:
E e c j / m J c j m e c ( j 1 ) / m J c j 1 m 2 = 1 2 c ( e 2 c j / m e 2 c ( j 1 ) / m ) 2 + e 2 c j / m e 2 c ( j 1 ) / m ,
E ( j 1 ) / m j / m e c r J c 2 = ( e 2 c j / m e 2 c ( j 1 ) / m ) 2 8 c 3 ( e 2 c j / m e 2 c ( j 1 ) / m ) 4 c 3 + e 2 c j / m 2 m c 2 ,
E e c j / m J c j m e c ( j 1 ) / m J c j 1 m ( j 1 ) / m j / m e c r J c = ( e 2 c j / m e 2 c ( j 1 ) / m ) 2 4 c 2 + ( e 2 c j / m e 2 c ( j 1 ) / m ) 4 c 2 e 2 c j / m 2 m c ,
which combine to determine s 2 . The result for ( ρ ^ j ρ ) follows from the above results. ☐
Proof of Theorem 3.
(a) The aim is to derive the joint MGF:
M c ( θ 1 , θ 2 ) = E   exp θ 1 a b J c d W + θ 2 a b J c 2 .
We begin by noting that:
a b J c d W = 1 2 J c ( b ) 2 J c ( a ) 2 2 c a b J c 2 ( b a )
so that the function of interest becomes:
M c ( θ 1 , θ 2 ) = exp θ 1 ( b a ) 2 E   exp θ 1 2 J c ( b ) 2 J c ( a ) 2 + ( θ 2 c θ 1 ) a b J c 2 .
Evaluation of this expectation is aided by introducing the auxiliary O-U process Y ( t ) on t [ 0 , b ] with parameter λ , defined by:
d Y ( t ) = λ Y ( t ) d t + d W ( t ) , Y ( 0 ) = 0 .
Let μ J c and μ Y denote the probability measures induced by J c and Y, respectively. These measures are equivalent and, by Girsanov’s theorem (see, for example, Theorem 4.1 of Tanaka 1996),
d μ J c d μ Y ( s ) = exp ( c λ ) 0 b s ( t ) d s ( t ) ( c 2 λ 2 ) 2 0 b s ( t ) 2 d t
is the Radon–Nikodym derivative evaluated at s ( t ) , a random process on [ 0 , b ] with s ( 0 ) = 0 . The above change of measure will be used because, for a function f ( J c ) ,
E f ( J c ) = E f ( Y ) d μ J c d μ Y ( Y ) .
Using the change of measure, we obtain:
M c ( θ 1 , θ 2 ) = exp θ 1 ( b a ) 2 E   exp θ 1 2 Y ( b ) 2 Y ( a ) 2 + ( θ 2 c θ 1 ) a b Y 2 + ( c λ ) 0 b Y d Y ( c 2 λ 2 ) 2 0 b Y 2 .
Now, using the Itô calculus, 0 b Y d Y = ( 1 / 2 ) [ Y ( b ) 2 b ] , and so:
θ 1 2 Y ( b ) 2 Y ( a ) 2 + ( c λ ) 0 b Y d Y = ( θ 1 + c λ ) 2 Y ( b ) 2 θ 1 2 Y ( a ) 2 ( c λ ) 2 b ,
while splitting the second integral involving Y 2 yields:
( θ 2 c θ 1 ) a b Y 2 ( c 2 λ 2 ) 2 0 b Y 2 = ( λ 2 c 2 2 c θ 1 + 2 θ 2 ) 2 a b Y 2 ( c 2 λ 2 ) 2 0 a Y 2 .
Hence, defining δ = θ 1 + c λ ,
M c ( θ 1 , θ 2 ) = exp θ 1 a δ b 2 E   exp δ 2 Y ( b ) 2 θ 1 2 Y ( a ) 2 + ( λ 2 c 2 2 c θ 1 + 2 θ 2 ) 2 a b Y 2 ( c 2 λ 2 ) 2 0 a Y 2 .
As the parameter λ is arbitrary, it is convenient to set λ = ( c 2 + 2 c θ 1 2 θ 2 ) 1 / 2 so as to eliminate the term a b Y 2 . We shall then proceed in two steps:
(i)
Take the expectation in M c ( θ 1 , θ 2 ) conditional on F 0 a , the sigma field generated by W on [ 0 , a ] .
(ii)
Introduce another O-U process V and apply Girsanov’s theorem again to take the expectation with respect to F 0 a .
Step (i). Conditional on F 0 a , we obtain:
M c ( θ 1 , θ 2 ; F 0 a ) = exp θ 1 a δ b 2 exp θ 1 2 Y ( a ) 2 ( c 2 λ 2 ) 2 0 a Y 2 E exp δ 2 Y ( b ) 2 F 0 a .
Now, from the representation Y ( b ) = exp ( ( b a ) λ ) Y ( a ) + a b exp ( ( b r ) λ ) d W ( r ) , it follows that Y ( b ) | F 0 a N ( μ , ω 2 ) , where:
μ = E Y ( b ) | F 0 a = exp ( ( b a ) λ ) Y ( a ) ,
ω 2 = E Y ( b ) E Y ( b ) | F 0 a 2 | F 0 a = exp ( 2 ( b a ) λ ) 1 2 λ .
Hence, using Lemma 5 of Magnus (1986), for example,
E exp δ 2 Y ( b ) 2 F 0 a = exp δ 2 k Y ( a ) 2 1 δ ω 2 1 / 2 ,
where k = exp ( 2 ( b a ) λ ) / ( 1 δ ω 2 ) , and so:
M c ( θ 1 , θ 2 ; F 0 a ) = exp θ 1 a δ b 2 1 δ ω 2 1 / 2 exp δ k θ 1 2 Y ( a ) 2 ( c 2 λ 2 ) 2 0 a Y 2 .
Step (ii). We now introduce a new auxiliary process, V ( t ) , on [ 0 , a ] , given by:
d V ( t ) = η V ( t ) d t + d W ( t ) , V ( 0 ) = 0 ,
and will make use of the change of measure:
d μ Y d μ V ( s ) = exp ( λ η ) 0 a s ( t ) d s ( t ) ( λ 2 η 2 ) 2 0 a s ( t ) 2 d t
in order to eliminate 0 a Y 2 . We have M c ( θ 1 , θ 2 ) = E M c ( θ 1 , θ 2 ; F 0 a ) , and so:
M c ( θ 1 , θ 2 ) = exp θ 1 a δ b 2 1 δ ω 2 1 / 2 E   exp δ k θ 1 2 Y ( a ) 2 ( c 2 λ 2 ) 2 0 a Y 2 .
With the change of measure, the expectation of interest becomes:
E   exp δ k θ 1 2 V ( a ) 2 + ( λ η ) 0 a V d V + η 2 c 2 2 0 a V 2 .
However, η is arbitrary, and so, we set η = c in order to eliminate 0 a V 2 . Furthermore, noting that 0 a V d V = ( 1 / 2 ) [ V ( a ) 2 a ] , we obtain:
E   exp δ k θ 1 2 V ( a ) 2 + ( λ c ) 0 a V d V = exp ( λ c ) 2 a E   exp δ ( k 1 ) 2 V ( a ) 2 .
Now, V ( a ) = 0 a e c ( a r ) d W ( r ) , and so, V ( a ) N ( 0 , v 2 ) where v 2 = ( e 2 a c 1 ) / ( 2 c ) , hence:
E   exp δ ( k 1 ) 2 V ( a ) 2 = 1 δ ( k 1 ) v 2 1 / 2 .
It follows that M c ( θ 1 , θ 2 ) = exp ( ( θ 1 + c ) ( b a ) / 2 ) H c ( θ 1 , θ 2 ) 1 / 2 where:
H c ( θ 1 , θ 2 ) = exp ( ( b a ) λ ) ( 1 δ ω 2 ) ( 1 δ ( k 1 ) v 2 ) .
Let z = ( b a ) λ . Then:
e z ( 1 δ ω 2 ) = e z δ e z ( e 2 z 1 ) 2 λ = e z θ 1 + c λ 1 ( e z e z ) 2 = ( e z + e z ) 2 ( θ 1 + c ) λ ( e z e z ) 2 = cosh z ( θ 1 + c ) λ sinh z .
The second term involves the expression ( k 1 ) ( 1 δ ω 2 ) = e 2 z 1 + δ ω 2 , and so, we obtain:
e z ( k 1 ) ( 1 δ ω 2 ) = e z e z + δ e z ( e 2 z 1 ) 2 λ = e z e z + θ 1 + c λ 1 ( e z e z ) 2 = 1 + θ 1 + c λ ( e z e z ) 2 = 1 λ λ + θ 1 + c sinh z .
Noting that δ ( θ 1 + c + λ ) = ( θ 1 + c λ ) ( θ 1 + c + λ ) = ( θ 1 + c ) 2 λ 2 = θ 1 2 + 2 θ 2 and combining these components yields the required expression for H c ( θ 1 , θ 2 ) .
(b) The individual MGFs follow straightforwardly from Part (a), noting that M N c ( θ 1 ) = M c ( θ 1 , 0 ) and M D c ( θ 2 ) = M c ( 0 , θ 2 ) .
(c) From the definition of M c ( θ 1 , θ 2 ) , we obtain:
M c ( θ 1 , θ 2 ) θ 1 = ( b a ) 2 exp ( θ 1 + c ) 2 ( b a ) H c ( θ 1 , θ 2 ) 1 / 2 1 2 exp ( θ 1 + c ) 2 ( b a ) H c ( θ 1 , θ 2 ) 3 / 2 H c ( θ 1 , θ 2 ) θ 1 .
Partial differentiation of H c ( θ 1 , θ 2 ) yields:
H c ( θ 1 , θ 2 ) θ 1 = c ( b a ) sinh ( b a ) λ λ + c θ 1 + c + v 2 ( θ 1 2 + 2 θ 2 ) sinh ( b a ) λ λ 3 1 + 2 θ 1 v 2 sinh ( b a ) λ λ c ( b a ) θ 1 + c + v 2 ( θ 1 2 + 2 θ 2 ) cosh ( b a ) λ λ 2 ,
which makes use of the results:
cosh ( b a ) λ θ 1 = c ( b a ) sinh ( b a ) λ λ , sinh ( b a ) λ θ 1 = c ( b a ) cosh ( b a ) λ λ .
We need to evaluate H c ( θ 1 , θ 2 ) / θ 1 at θ 1 = 0 and at θ 2 , and this is facilitated by defining x = ( c 2 + 2 θ 2 ) 1 / 2 to replace λ ; this results in:
H c ( θ 1 , θ 2 ) θ 1 θ 1 = 0 = c ( b a ) 1 sinh ( b a ) x x + c c 2 v 2 θ 2 sinh ( b a ) x x 3 c ( b a ) c 2 v 2 θ 2 cosh ( b a ) x x 2 .
It is also convenient to define:
g ( x ) = H c ( 0 , θ 2 ) = cosh ( b a ) x c 2 v 2 θ 2 sinh ( b a ) x x .
Combining the results above yields:
M c ( θ 1 , θ 2 ) θ 1 θ 1 = 0 = ( b a ) 2 exp c ( b a ) 2 g ( x ) 1 / 2 1 2 exp c ( b a ) 2 g ( x ) 3 / 2 c ( b a ) 1 sinh ( b a ) x x + c c 2 v 2 θ 2 sinh ( b a ) x x 3 c ( b a ) c 2 v 2 θ 2 cosh ( b a ) x x 2 .
Integrating with respect to θ 2 yields the result in the theorem.
Proof of Corollary to Theorem 3.
The results follow from Theorem 3 noting that:
(a)
b a = 1 and v 2 = 0 ;
(b)
b a = 1 / m and lim c 0 v j 1 2 = ( j 1 ) / m .
Derivation of (14). From (A3), we can write:
M c ( θ 1 , θ 2 ) = exp ( ( θ 1 + c ) ( b a ) ) exp ( z ) ( 1 δ ω 2 ) ( 1 δ ( k 1 ) v 2 ) 1 / 2 ,
where z = ( b a ) λ . It can be shown that:
1 δ ( k 1 ) v 2 = 1 δ ω 2 δ v 2 [ exp ( 2 z ) ( 1 δ ω 2 ) ] 1 δ ω 2
so that:
( 1 δ ω 2 ) ( 1 δ ( k 1 ) v 2 ) = ( 1 + δ v 2 ) ( 1 δ ω 2 ) δ v 2 exp ( 2 z ) .
Multiplying by exp ( z ) and noting that:
exp ( z ) ( 1 δ ω 2 ) = 2 λ + δ 2 λ exp ( z ) δ 2 λ exp ( z )
results in the expression for M c ( θ 1 , θ 2 ) in (14). ☐
Proof of Theorem 4.
We can examine what happens to the quantities in Parts (a) and (b) by considering the joint MGF of $(-2c)^{1/2}\int_a^b J_c\,dW$ and $(-2c)\int_a^b J_c^2$, which is given by:
$$L_c(p, q) = M_c\left((-2c)^{1/2}p,\; -2cq\right).$$
Using (14), we need to examine the asymptotic properties of $\lambda$, $\delta$ and $v^2$ as $c \to -\infty$. The following asymptotic expansions facilitate this:
$$\lambda = \left(c^2 + 2c(-2c)^{1/2}p + 4cq\right)^{1/2} = \left(c^2 - 2^{3/2}(-c)^{3/2}p + 4cq\right)^{1/2} = -c - 2^{1/2}(-c)^{1/2}p - p^2 - 2q + O\left(|c|^{-1/2}\right);$$
$$\delta = (-2c)^{1/2}p + c - \lambda = 2^{1/2}(-c)^{1/2}p + c + c + 2^{1/2}(-c)^{1/2}p + p^2 + 2q + O\left(|c|^{-1/2}\right) = 2^{3/2}(-c)^{1/2}p + 2c + p^2 + 2q + O\left(|c|^{-1/2}\right);$$
$$2\lambda + \delta = (-2c)^{1/2}p + c + \lambda = (-2c)^{1/2}p + c - c - 2^{1/2}(-c)^{1/2}p - p^2 - 2q + O\left(|c|^{-1/2}\right) = -p^2 - 2q + O\left(|c|^{-1/2}\right);$$
$$\delta v^2 = \frac{\left[\exp(2ac) - 1\right]\left[2^{3/2}(-c)^{1/2}p + 2c + p^2 + 2q + O\left(|c|^{-1/2}\right)\right]}{-2(-c)} = \left[\exp(2ac) - 1\right]\left[-2^{1/2}(-c)^{-1/2}p + 1 - 2^{-1}(-c)^{-1}p^2 - (-c)^{-1}q + O\left(|c|^{-3/2}\right)\right] \to -1 \quad \text{as } c \to -\infty.$$
Combining these results, we find that:
$$L_c(p, q) \to \exp\left[\left(\tfrac{1}{2}p^2 + q\right)(b-a)\right] \quad \text{as } c \to -\infty,$$
from which the results in (a) and (b) follow immediately. To establish (c), note that:
$$K(c) = (b-a)^{1/2}\left[(-2c)\int_a^b J_c^2\right]^{-1}\left[(-2c)^{1/2}\int_a^b J_c\,dW + \frac{1}{2}(1 - \eta)\right] + o_p(1).$$
The result then follows using (a) and (b). ☐
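A quick numerical illustration of the leading terms in the expansion of $\lambda$ used above; the substitutions $\theta_1 = (-2c)^{1/2}p$ and $\theta_2 = -2cq$ and the form $\lambda = (c^2 + 2c\theta_1 - 2\theta_2)^{1/2}$ are pieced together from the proof above and are assumptions to that extent:

```python
# Sketch: check that lambda = sqrt(c^2 + 2*c*theta1 - 2*theta2), with theta1 = sqrt(-2c)*p
# and theta2 = -2c*q, behaves like -c - sqrt(2)*sqrt(-c)*p - p^2 - 2q as c -> -infinity.
import numpy as np

p, q = 0.3, 0.2
for c in [-1e2, -1e4, -1e6]:
    theta1 = np.sqrt(-2.0 * c) * p
    theta2 = -2.0 * c * q
    lam = np.sqrt(c**2 + 2.0 * c * theta1 - 2.0 * theta2)
    approx = -c - np.sqrt(2.0) * np.sqrt(-c) * p - p**2 - 2.0 * q
    print(c, lam - approx)   # the error shrinks at the O(|c|^{-1/2}) rate
```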
Proof of Theorem 5.
To determine the weights for $\hat\rho_J$, note that:
$$E(\hat\rho) = \rho + \frac{\mu_c(\eta)}{n} + O\left(\frac{1}{n^2}\right), \qquad E(\hat\rho_j) = \rho + \frac{\mu_{c,j}(\eta)}{\ell} + O\left(\frac{1}{\ell^2}\right), \quad j = 1, \ldots, m,$$
where $\mu_c(\eta)$ is defined in the theorem and $\ell = n/m$ denotes the sub-sample length. From the definition of $\hat\rho_J$, taking expectations yields:
$$E(\hat\rho_J) = \left[w_{1,c}(\eta) + w_{2,c}(\eta)\right]\rho + \frac{1}{n}\left[w_{1,c}(\eta)\,\mu_c(\eta) + w_{2,c}(\eta)\sum_{j=1}^m \mu_{c,j}(\eta)\right] + O\left(\frac{1}{n^2}\right).$$
In order that $E(\hat\rho_J) = \rho + O(1/n^2)$, the requirements are that:
(i)
$w_{1,c}(\eta) + w_{2,c}(\eta) = 1$, and
(ii)
$w_{1,c}(\eta)\,\mu_c(\eta) + w_{2,c}(\eta)\sum_{j=1}^m \mu_{c,j}(\eta) = 0$.
Solving these two conditions simultaneously yields the stated weights. ☐
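Solving (i) and (ii) explicitly gives $w_{1,c}(\eta) = S/\left(S - \mu_c(\eta)\right)$ and $w_{2,c}(\eta) = -\mu_c(\eta)/\left(S - \mu_c(\eta)\right)$, where $S = \sum_{j=1}^m \mu_{c,j}(\eta)$. The short Python sketch below (illustrative only, not the authors' Gauss code) applies this to the $c = 0$, $m = 2$ constants in Table 1 and reproduces the corresponding optimal weights in Table 2:

```python
# Illustrative sketch: solve conditions (i) and (ii) for the optimal jackknife weights,
# using the c = 0, m = 2 bias constants from Table 1 (eta = 1).
mu_c = -1.7814                 # full-sample constant mu_c(1) at c = 0 (Table 1, m = 1)
mu_cj = [-1.7814, -1.1382]     # sub-sample constants for m = 2, c = 0 (Table 1)

S = sum(mu_cj)
w1 = S / (S - mu_c)            # from (i): w1 + w2 = 1, and (ii): w1*mu_c + w2*S = 0
w2 = -mu_c / (S - mu_c)

print(round(w1, 4), round(w2, 4))   # 2.5651 -1.5651, matching Table 2 (c = 0, m = 2)
```

Repeating the calculation with the other columns of Table 1 reproduces the remaining optimal weights in Table 2.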

References

  1. Abadir, Karim M. 1993. The limiting distribution of the autocorrelation coefficient under a unit root. Annals of Statistics 21: 1058–70. [Google Scholar] [CrossRef]
  2. Chambers, Marcus J. 2013. Jackknife estimation and inference in stationary autoregressive models. Journal of Econometrics 172: 142–57. [Google Scholar] [CrossRef]
  3. Chambers, Marcus J. 2015. A jackknife correction to a test for cointegration rank. Econometrics 3: 355–75. [Google Scholar] [CrossRef]
  4. Chambers, Marcus J., and Maria Kyriacou. 2013. Jackknife estimation with a unit root. Statistics and Probability Letters 83: 1677–82. [Google Scholar] [CrossRef]
  5. Chan, Ngai H., and Ching-Zong Wei. 1987. Asymptotic inference for nearly nonstationary AR(1) processes. Annals of Statistics 15: 1050–63. [Google Scholar] [CrossRef]
  6. Chen, Ye, and Jun Yu. 2015. Optimal jackknife for unit root models. Statistics and Probability Letters 99: 135–42. [Google Scholar] [CrossRef]
  7. Gonzalo, Jesus, and Jean-Yves Pitarakis. 1998. On the exact moments of nonstandard asymptotic distributions in an unstable AR(1) with dependent errors. International Economic Review 39: 71–88. [Google Scholar] [CrossRef]
  8. Kruse, Robinson, and Hendrik Kaufmann. 2015. Bias-corrected estimation in mildly explosive autoregressions. Paper presented at Annual Conference 2015: Economic Development—Theory and Policy, Verein für Socialpolitik/German Economic Association, Muenster, Germany, September 6–9. [Google Scholar]
  9. Kyriacou, Maria. 2011. Jackknife Estimation and Inference in Non-stationary Autoregression. Ph.D. thesis, University of Essex, Colchester, UK. [Google Scholar]
  10. Kyriacou, Maria, Peter C. B. Phillips, and Francesca Rossi. 2017. Indirect inference in spatial autoregression. The Econometrics Journal 20: 168–89. [Google Scholar] [CrossRef]
  11. Magnus, Jan R. 1986. The exact moments of a ratio of quadratic forms in normal variables. Annales d’Économie et de Statistique 4: 95–109. [Google Scholar] [CrossRef]
  12. Meng, Xiao-Li. 2005. From unit root to Stein’s estimator to Fisher’s k statistics: If you have a moment, I can tell you more. Statistical Science 20: 141–62. [Google Scholar] [CrossRef]
  13. Park, Joon. 2006. A bootstrap theory for weakly integrated processes. Journal of Econometrics 133: 639–72. [Google Scholar] [CrossRef]
  14. Perron, Pierre. 1989. The calculation of the limiting distribution of the least-squares estimator in a near-integrated model. Econometric Theory 5: 241–55. [Google Scholar] [CrossRef]
  15. Perron, Pierre. 1991. A continuous time approximation to the unstable first-order autoregressive process: The case without an intercept. Econometrica 59: 211–36. [Google Scholar] [CrossRef]
  16. Perron, Pierre. 1996. The adequacy of asymptotic approximations in the near-integrated autoregressive model with dependent errors. Journal of Econometrics 70: 317–50. [Google Scholar] [CrossRef]
  17. Phillips, Peter C. B. 1987a. Towards a unified asymptotic theory for autoregression. Biometrika 74: 535–47. [Google Scholar] [CrossRef]
  18. Phillips, Peter C. B. 1987b. Time series regression with a unit root. Econometrica 55: 277–301. [Google Scholar] [CrossRef]
  19. Phillips, Peter C. B. 2012. Folklore theorems, implicit maps, and indirect inference. Econometrica 80: 425–54. [Google Scholar]
  20. Phillips, Peter C. B. 2014. On confidence intervals for autoregressive roots and predictive regression. Econometrica 82: 1177–95. [Google Scholar] [CrossRef]
  21. Phillips, Peter C. B., Hyungsik Roger Moon, and Zhijie Xiao. 2001. How to estimate autoregressive roots near unity. Econometric Theory 17: 26–69. [Google Scholar] [CrossRef]
  22. Phillips, Peter C. B., and Jun Yu. 2005. Jackknifing bond option prices. Review of Financial Studies 18: 707–42. [Google Scholar] [CrossRef]
  23. Quenouille, Maurice H. 1956. Notes on bias in estimation. Biometrika 43: 353–60. [Google Scholar] [CrossRef]
  24. Stoykov, Marian Z. 2017. Optimal Jackknife Estimation of Local to Unit Root Models. Colchester: Essex Business School, preprint. [Google Scholar]
  25. Tanaka, Katsuto. 1996. Time Series Analysis: Nonstationary and Noninvertible Distribution Theory. New York: Wiley. [Google Scholar]
  26. Tukey, John W. 1958. Bias and confidence in not-quite large samples. Annals of Mathematical Statistics 29: 614. [Google Scholar]
  27. White, John S. 1958. The limiting distribution of the serial correlation coefficient in the explosive case. Annals of Mathematical Statistics 29: 1188–97. [Google Scholar] [CrossRef]
1. If $y_0 \neq 0$, then additional random variables appear in the numerator and denominator of the bias, thereby complicating the derivation of the required expectation. In the case of $c = 0$, the simulation results reported in Table 2.3 of Kyriacou (2011) indicate that the bias of $\hat\rho$ increases with $y_0/\sigma$, but that the jackknife continues to be an effective method of bias reduction.
2. The integrals were computed numerically using an adaptive quadrature method in the integrate1d routine in Gauss 17.
3. The Gauss codes used for the jackknife estimators are available from the authors on request.
4. In the simulations, we are taking $\eta = 1$ as known.
5. We thank a referee for suggesting that we investigate the performance of the estimators when $\eta < 1$.
6. It should also be noted that we have assumed $y_0 = 0$ in deriving the jackknife weights, whereas the expansion in Theorem 2 suggests that the bias functions (and, hence, the jackknife weights) will depend in a non-trivial and more complicated way on $y_0$ when $y_0 \neq 0$. Although we have not investigated the issue further here, the results in Kyriacou (2011) suggest that jackknife methods can still provide bias reduction even when $y_0 \neq 0$.
Table 1. Values of $\mu_{c,j}(1) = E(Z_{c,j}(1))$.

j \ c:    −50        −20        −10        −5         −1         0          1
m = 1
1        −1.9995    −1.9972    −1.9912    −1.9758    −1.8818    −1.7814    −1.5811
m = 2
1        −1.9981    −1.9912    −1.9758    −1.9439    −1.8408    −1.7814    −1.6969
2        −1.9604    −1.9043    −1.8214    −1.6891    −1.3295    −1.1382    −0.8920
m = 3
1        −1.9962    −1.9838    −1.9595    −1.9175    −1.8234    −1.7814    −1.7283
2        −1.9412    −1.8613    −1.7502    −1.5921    −1.2722    −1.1382    −0.9791
3        −1.9412    −1.8613    −1.7500    −1.5845    −1.1515    −0.9319    −0.6759
m = 4
1        −1.9939    −1.9758    −1.9439    −1.8973    −1.8138    −1.7814    −1.7427
2        −1.9225    −1.8214    −1.6891    −1.5210    −1.2411    −1.1382    −1.0210
3        −1.9225    −1.8214    −1.6879    −1.5021    −1.1016    −0.9319    −0.7410
4        −1.9225    −1.8214    −1.6879    −1.5006    −1.0396    −0.8143    −0.5643
m = 6
1        −1.9884    −1.9594    −1.9175    −1.8698    −1.8037    −1.7814    −1.7564
2        −1.8867    −1.7502    −1.5921    −1.4268    −1.2085    −1.1382    −1.0616
3        −1.8867    −1.7500    −1.5845    −1.3812    −1.0482    −0.9319    −0.8059
4        −1.8867    −1.7500    −1.5843    −1.3732    −0.9697    −0.8143    −0.6472
5        −1.8867    −1.7500    −1.5842    −1.3717    −0.9243    −0.7348    −0.5331
6        −1.8867    −1.7500    −1.5842    −1.3715    −0.8958    −0.6761    −0.4450
m = 8
1        −1.9823    −1.9439    −1.8973    −1.8526    −1.7984    −1.7814    −1.7629
2        −1.8530    −1.6891    −1.5210    −1.3686    −1.1915    −1.1382    −1.0813
3        −1.8530    −1.6879    −1.5021    −1.2991    −1.0203    −0.9319    −0.8381
4        −1.8530    −1.6879    −1.5006    −1.2815    −0.9326    −0.8143    −0.6893
5        −1.8530    −1.6879    −1.5005    −1.2766    −0.8795    −0.7348    −0.5829
6        −1.8530    −1.6879    −1.5005    −1.2752    −0.8444    −0.6761    −0.5008
7        −1.8530    −1.6879    −1.5005    −1.2748    −0.8201    −0.6302    −0.4345
8        −1.8530    −1.6879    −1.5005    −1.2747    −0.8027    −0.5931    −0.3793
m = 12
1        −1.9693    −1.9175    −1.8698    −1.8324    −1.7929    −1.7814    −1.7693
2        −1.7916    −1.5921    −1.4268    −1.3016    −1.1742    −1.1382    −1.1007
3        −1.7916    −1.5845    −1.3812    −1.1979    −0.9916    −0.9319    −0.8698
4        −1.7916    −1.5842    −1.3732    −1.1612    −0.8943    −0.8143    −0.7313
5        −1.7916    −1.5842    −1.3717    −1.1464    −0.8328    −0.7348    −0.6335
6        −1.7916    −1.5842    −1.3715    −1.1403    −0.7904    −0.6761    −0.5585
7        −1.7916    −1.5842    −1.3714    −1.1376    −0.7595    −0.6302    −0.4981
8        −1.7916    −1.5842    −1.3714    −1.1365    −0.7362    −0.5931    −0.4477
9        −1.7916    −1.5842    −1.3714    −1.1360    −0.7183    −0.5622    −0.4047
10       −1.7916    −1.5842    −1.3714    −1.1358    −0.7041    −0.5358    −0.3674
11       −1.7916    −1.5842    −1.3714    −1.1357    −0.6928    −0.5131    −0.3346
12       −1.7916    −1.5842    −1.3714    −1.1356    −0.6837    −0.4931    −0.3055
Table 2. Values of standard and optimal jackknife weights ($\eta = 1$).

m:               2          3          4          6          8          12
Standard weights
w_1             2.0000     1.5000     1.3333     1.2000     1.1429     1.0909
w_2            −1.0000    −0.5000    −0.3333    −0.2000    −0.1429    −0.0909
Optimal weights: c = −50
w_{1,c}(1)      2.0206     1.5156     1.3470     1.2122     1.1544     1.1016
w_{2,c}(1)     −1.0206    −0.5156    −0.3470    −0.2122    −0.1544    −0.1016
Optimal weights: c = −20
w_{1,c}(1)      2.0521     1.5385     1.3670     1.2292     1.1698     1.1151
w_{2,c}(1)     −1.0521    −0.5385    −0.3670    −0.2292    −0.1698    −0.1151
Optimal weights: c = −10
w_{1,c}(1)      2.1026     1.5741     1.3969     1.2535     1.1909     1.1325
w_{2,c}(1)     −1.1026    −0.5741    −0.3969    −0.2535    −0.1909    −0.1325
Optimal weights: c = −5
w_{1,c}(1)      2.1923     1.6336     1.4445     1.2898     1.2213     1.1565
w_{2,c}(1)     −1.1923    −0.6336    −0.4445    −0.2898    −0.2213    −0.1565
Optimal weights: c = −1
w_{1,c}(1)      2.4605     1.7956     1.5678     1.3788     1.2937     1.2117
w_{2,c}(1)     −1.4605    −0.7956    −0.5678    −0.3788    −0.2937    −0.2117
Optimal weights: c = 0
w_{1,c}(1)      2.5651     1.8605     1.6176     1.4147     1.3228     1.2337
w_{2,c}(1)     −1.5651    −0.8605    −0.6176    −0.4147    −0.3228    −0.2337
Optimal weights: c = 1
w_{1,c}(1)      2.5689     1.8773     1.6355     1.4311     1.3373     1.2455
w_{2,c}(1)     −1.5689    −0.8773    −0.6355    −0.4311    −0.3373    −0.2455
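The standard weights in Table 2 depend only on the number of sub-samples; they correspond to $w_1 = m/(m-1)$ and $w_2 = -1/(m-1)$, so that, for example, with $m = 8$,
$$w_1 = \frac{8}{7} \approx 1.1429, \qquad w_2 = -\frac{1}{7} \approx -0.1429,$$
in agreement with the corresponding column. The optimal weights additionally vary with $c$ (and, more generally, with $\eta$) through the bias constants tabulated in Table 1.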
Table 3. Bias of OLS and jackknife estimators ($m = 2$) when $\eta = 1$.

Estimator                        n = 24      n = 48      n = 96      n = 192
c = 0
OLS                             −0.0663     −0.0351     −0.0180     −0.0091
Standard jackknife              −0.0343     −0.0155     −0.0072     −0.0035
Optimal/unit root jackknife     −0.0163     −0.0045     −0.0011     −0.0003
Chen-Yu jackknife               −0.0264     −0.0144     −0.0110     −0.0102
c = −1
OLS                             −0.0675     −0.0365     −0.0188     −0.0096
Standard jackknife              −0.0323     −0.0145     −0.0067     −0.0032
Optimal jackknife               −0.0161     −0.0044     −0.0011     −0.0002
Unit root jackknife             −0.0124     −0.0021      0.0002      0.0004
Chen-Yu jackknife               −0.0190     −0.0099     −0.0087     −0.0089
c = −5
OLS                             −0.0589     −0.0350     −0.0188     −0.0098
Standard jackknife              −0.0193     −0.0088     −0.0038     −0.0017
Optimal jackknife               −0.0116     −0.0037     −0.0009     −0.0001
Unit root jackknife              0.0031      0.0061      0.0048      0.0029
Chen-Yu jackknife                0.0037      0.0027     −0.0017     −0.0051
c = −10
OLS                             −0.0437     −0.0310     −0.0178     −0.0095
Standard jackknife              −0.0114     −0.0055     −0.0022     −0.0009
Optimal jackknife               −0.0080     −0.0029     −0.0006     −0.0001
Unit root jackknife              0.0069      0.0089      0.0066      0.0039
Chen-Yu jackknife                0.0090      0.0071      0.0013     −0.0035
Table 4. RMSE of OLS and jackknife estimators when $\eta = 1$.

Estimator                                        n = 24     n = 48     n = 96     n = 192
c = 0
OLS                                              0.1366     0.0719     0.0371     0.0187
Standard jackknife (m = 2)                       0.1482     0.0766     0.0396     0.0199
Standard jackknife (m = 4, 6, 6, 8)              0.1310     0.0659     0.0333     0.0165
Optimal/unit root jackknife (m = 2)              0.1753     0.0915     0.0479     0.0242
Optimal/unit root jackknife (m = 4, 8, 12, 12)   0.1383     0.0642     0.0312     0.0154
Chen-Yu jackknife (m = 2)                        0.1640     0.0864     0.0462     0.0249
Chen-Yu jackknife (m = 3)                        0.1392     0.0719     0.0374     0.0188
c = −1
OLS                                              0.1428     0.0762     0.0396     0.0200
Standard jackknife (m = 2)                       0.1524     0.0797     0.0415     0.0209
Standard jackknife (m = 4, 6, 6, 8)              0.1368     0.0698     0.0355     0.0176
Optimal jackknife (m = 2)                        0.1724     0.0908     0.0477     0.0242
Optimal jackknife (m = 4, 8, 12, 12)             0.1416     0.0679     0.0334     0.0165
Unit root jackknife (m = 2)                      0.1778     0.0939     0.0495     0.0251
Unit root jackknife (m = 4, 8, 12, 12)           0.1435     0.0680     0.0333     0.0165
Chen-Yu jackknife (m = 2)                        0.1710     0.0916     0.0489     0.0262
Chen-Yu jackknife (m = 3)                        0.1477     0.0778     0.0408     0.0207
c = −5
OLS                                              0.1626     0.0901     0.0476     0.0243
Standard jackknife (m = 2)                       0.1745     0.0944     0.0498     0.0253
Standard jackknife (m = 6, 6, 8, 8)              0.1615     0.0855     0.0442     0.0223
Optimal jackknife (m = 2)                        0.1813     0.0982     0.0520     0.0265
Optimal jackknife (m = 6, 8, 12, 12)             0.1641     0.0847     0.0432     0.0216
Unit root jackknife (m = 2)                      0.1975     0.1078     0.0576     0.0295
Unit root jackknife (m = 6, 12, 12, 12)          0.1710     0.0852     0.0433     0.0219
Chen-Yu jackknife (m = 2)                        0.2066     0.1138     0.0610     0.0318
Chen-Yu jackknife (m = 3)                        0.1857     0.1014     0.0540     0.0279
c = −10
OLS                                              0.1809     0.1037     0.0558     0.0288
Standard jackknife (m = 2)                       0.1971     0.1096     0.0584     0.0300
Standard jackknife (m = 8, 8, 12, 12)            0.1853     0.1016     0.0534     0.0272
Optimal jackknife (m = 2)                        0.2003     0.1114     0.0595     0.0306
Optimal jackknife (m = 4, 12, 12, 12)            0.1877     0.1015     0.0530     0.0270
Unit root jackknife (m = 2)                      0.2175     0.1217     0.0656     0.0340
Unit root jackknife (m = 6, 12, 12, 12)          0.1985     0.1032     0.0539     0.0277
Chen-Yu jackknife (m = 2)                        0.2328     0.1315     0.0711     0.0372
Chen-Yu jackknife (m = 3)                        0.2151     0.1202     0.0649     0.0338
Table 5. Bias of OLS and jackknife estimators ($m = 2$) when $\eta \neq 1$.

Estimator                              n = 24     n = 48     n = 96     n = 192
η = 0.5556 (MA case)
c = 0
OLS                                   −0.0191    −0.0105    −0.0054    −0.0028
Standard jackknife                    −0.0059    −0.0018    −0.0006    −0.0002
Optimal/unit root jackknife (η = 1)    0.0016     0.0031     0.0022     0.0012
Chen-Yu jackknife                      0.0062     0.0054     0.0034     0.0018
c = −1
OLS                                   −0.0060    −0.0040    −0.0022    −0.0012
Standard jackknife                     0.0110     0.0068     0.0038     0.0020
Optimal jackknife (η = 1)              0.0188     0.0118     0.0065     0.0034
Unit root jackknife                    0.0206     0.0129     0.0072     0.0037
Chen-Yu jackknife                      0.0275     0.0167     0.0091     0.0048
c = −5
OLS                                    0.0631     0.0304     0.0150     0.0074
Standard jackknife                     0.0885     0.0457     0.0234     0.0118
Optimal jackknife (η = 1)              0.0934     0.0486     0.0250     0.0127
Unit root jackknife                    0.1028     0.0543     0.0281     0.0143
Chen-Yu jackknife                      0.1129     0.0601     0.0312     0.0159
η = 0.0526 (AR case)
c = 0
OLS                                    0.0524     0.0237     0.0107     0.0049
Standard jackknife                     0.0306     0.0137     0.0066     0.0033
Optimal/unit root jackknife (η = 1)    0.0183     0.0081     0.0043     0.0024
Chen-Yu jackknife                      0.0293     0.0143     0.0076     0.0039
c = −1
OLS                                    0.0801     0.0386     0.0184     0.0089
Standard jackknife                     0.0624     0.0305     0.0153     0.0077
Optimal jackknife (η = 1)              0.0542     0.0268     0.0139     0.0071
Unit root jackknife                    0.0524     0.0260     0.0135     0.0070
Chen-Yu jackknife                      0.0645     0.0331     0.0172     0.0088
c = −5
OLS                                    0.2116     0.1086     0.0547     0.0273
Standard jackknife                     0.2096     0.1070     0.0541     0.0271
Optimal jackknife (η = 1)              0.2092     0.1067     0.0539     0.0271
Unit root jackknife                    0.2084     0.1061     0.0537     0.0270
Chen-Yu jackknife                      0.2212     0.1138     0.0575     0.0289
