HAR Testing for Spurious Regression in Trend

Phillips, Peter C. B.; Wang, Xiaohu; Zhang, Yonghui

doi:10.3390/econometrics7040050

by Peter C. B. Phillips ^1,2,3,4,*, Xiaohu Wang ^5 and Yonghui Zhang ^6

^1 Cowles Foundation for Research in Economics, Yale University, Box 208281, Yale Station, New Haven, CT 06520, USA

^2 Department of Economics, University of Auckland, Auckland CBD, Auckland 1010, New Zealand

^3 School of Economics, Singapore Management University, 81 Victoria St, Singapore 188065, Singapore

^4 Department of Economics, University of Southampton, Southampton SO14 0DA, UK

^5 Department of Economics, The Chinese University of Hong Kong, Hong Kong 999077, China

^6 School of Economics, Renmin University of China, Beijing 100872, China

^* Author to whom correspondence should be addressed.

Econometrics 2019, 7(4), 50; https://doi.org/10.3390/econometrics7040050

Submission received: 7 December 2018 / Revised: 6 October 2019 / Accepted: 28 November 2019 / Published: 16 December 2019

(This article belongs to the Special Issue Celebrated Econometricians: David Hendry)

Abstract:

The usual t test, the t test based on heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators, and the heteroskedasticity and autocorrelation robust (HAR) test are three statistics that are widely used in applied econometric work. The use of these significance tests in trend regression is of particular interest given the potential for spurious relationships in trend formulations. Following a longstanding tradition in the spurious regression literature, this paper investigates the asymptotic and finite sample properties of these test statistics in several spurious regression contexts, including regression of stochastic trends on time polynomials and regressions among independent random walks. Concordant with existing theory (Phillips 1986, 1998; Sun 2004, 2014b), the usual t test and the HAC standardized test fail to control size as the sample size

n \to \infty

in these spurious formulations, whereas HAR tests converge to well-defined limit distributions in each case and therefore have the capacity to be consistent and control size. However, it is shown that when the number of trend regressors

K \to \infty,

all three statistics, including the HAR test, diverge and fail to control size as

n \to \infty

. These findings are relevant to high-dimensional nonstationary time series regressions where machine learning methods may be employed.

Keywords:

HAR inference; Karhunen–Loève representation; spurious regression; t-statistics

JEL Classification:

C12; C14; C23

It is meaningless to talk about ‘confirming’ theories when spurious results are so easily obtained.
(Hendry 1980)

1. Introduction

In a well-cited contribution that emphasized the importance of diagnostic testing in econometrics, (Hendry 1980) highlighted how easy it is to mistake spurious relationships for genuine ones when using trending data of the type that are so commonly encountered in econometric work, especially in macroeconomics. Spurious regressions occur when conventional significance tests are so seriously biased towards rejection of the null hypothesis of no relationship that the alternative of a genuine relationship is accepted when the variables have no meaningful relationship and may even be statistically independent. Hendry’s article showcased the potential for nonsense regressions with the illustration of a regression between UK consumer prices and cumulative rainfall that displayed a high level of ‘significance’ and passed many, but not all, diagnostic tests.

Spurious regressions continue to attract considerable attention in econometric work, long after the exploratory study by (Yule 1926), the simulation experiments of (Granger and Newbold 1974), and the cautionary warnings made by David Hendry and many other writers since then. The limit theory of (Durlauf and Phillips 1988; Phillips 1986) provided the first analytic steps forward on this subject by explaining the phenomenon of persistent null hypothesis rejections in spurious regressions. These two studies helped applied researchers understand the failure of conventional significance tests by showing that in regressions with independent or even correlated trending I(1) data the usual regression t- and F-ratio test statistics do not possess limiting distributions but actually diverge as the sample size

n \uparrow \infty,

leading inevitably to rejections of the null of no association. Closely related work by (Phillips and Durlauf 1986; Durlauf and Phillips 1988; Park and Phillips 1988, 1989; Phillips and Hansen 1990; Phillips and Loretan 1991; Phillips 1991) extended the analytics to cover models of cointegrated and reduced rank error correction systems. Much of this work was reviewed in a useful form for practitioners by (Banerjee et al. 1993).

The original study by (Phillips 1986) on spurious regression asymptotics formed the basis of a large subsequent literature that has analyzed spurious regressions among various classes of trend stationary, long memory, nonstationary, and near-nonstationary time series. A recent article by (Ernst et al. 2017) provided further analysis by deriving an expression for the standard deviation of the sample correlation coefficient between two independent standard Brownian motions. While this expression does not explain the phenomenon of spurious regression between two independent random walks, it does reveal that the limiting correlation is not centered on the origin and is highly dispersed. This result complements the original findings in (Phillips 1986) and many subsequent papers that the coefficient of determination in a spurious regression has a well-defined limit distribution and does not converge in probability to zero.

In later work, (Phillips 1998) pointed out that spurious regressions typically reflect the fact that trending data may always be ‘explained’ by a coordinate system of other trending variables—which includes the example of UK price series being well-explained by cumulative rainfall that was used by David Hendry (Hendry 1980). In this broad sense of interpretation, there are no spurious regressions for trending time series, just alternative ‘valid’ representations of the time series trajectories (and those of its limiting stochastic process, given a suitable normalization) in terms of other stochastic processes and deterministic functions of time.

The asymptotic theory in (Phillips 1998) utilized the general representation of a stochastic process in terms of an orthonormal system and provided an extension of the Weierstrass theorem to include the approximation of continuous functions and stochastic processes by Wiener processes. That theory was applied to two classic examples of spurious regressions: regression of stochastic trends on time polynomials, and regressions among independent random walks. Such regressions were shown to reproduce asymptotically in part (and in whole as the regressor space expanded with sample size) the underlying valid representations of one trending process in terms of others, a coordinate system that is entirely analogous to orthonormal or Fourier series representations of a continuous function in terms of polynomials or other simple classes of functions over some interval. An important feature of these ‘valid’ trend relationships is that the coefficients in the representations, like those in the Karhunen–Loève representation of a general stochastic process, are themselves random variables. Randomness in the representation of time-series trajectories is embodied in these coefficients. Much subsequent work has utilized these ideas and analytic methods, either in justifying certain regression representations or in using partial versions of these regression representations to focus on certain features—such as long run features—of the data (notably: Phillips 2005, 2014; Müller 2007; Sun 2004, 2014a, 2014b, 2014c; Hwang and Sun 2018; Müller and Watson 2016, 2018).

An important element in the discussion of econometric practice in (Hendry 1980) was its emphasis on the value of diagnostic testing to ascertain limitations of regressions used in applications. In any empirical regression equation, the properties of the residuals depend inevitably on the properties of the data. To build upon a saying of the famous statistician John Tukey, in the regression equation

y = X β + u

the empirical investigator chooses the variables y and X (possibly with the aid of an autometric regression or a machine learning algorithm) and god gives back

u .

Any misspecification in the relationship between y and X must therefore be manifest in the properties of

u .

This is precisely what occurs in a spurious regression—the residual embodies the consequences of a model’s fundamental error of specification—as is revealed by the fact that tests for residual serial correlation such as the Durbin Watson statistic converge in probability to zero in such regressions (Phillips 1986).

Accommodating departures in fitted relationships from conventional assumptions on the properties of regression errors and thereby some of the effects of misspecification has been a longstanding goal of econometrics. One of the great advances in econometric research over the last half-century in response to this goal has been the development of methods of inference that are robust to some of the properties of the data and, particularly, those of the regression error. Such robustness can offer protection against specification error in validating inference. This research has led to the progressive development of heteroskedasticity and autocorrelation consistent (HAC) procedures ^1 and subsequently to heteroskedasticity and autocorrelation robust (HAR) methods ^2. These methods control for the effects of serial dependence and heterogeneity in regression errors and they play a key role in achieving robustness in inference. One area where achieving valid statistical inference via HAC procedures has proved especially important in practice is regression that involves trending variables and cointegration. This goal motivated the early research on optimal semiparametric approaches to the estimation of cointegrating relationships (Phillips and Hansen 1990) and continues to play a role in subsequent developments in this field (Phillips 2014; Hwang and Sun 2018).

HAC methods generally have good asymptotic properties but they are susceptible to large size distortions in practical work. Several alternative methods have been proposed in the recent literature to improve finite sample performance. Among these, the ‘fixed-b’ lag truncation rule (Kiefer and Vogelsang 2002a, 2002b, 2005) has attracted considerable interest. The method uses a truncation lag M for including sample serial covariances that is proportional to the sample size n (i.e.,

M \sim b n

for some fixed

b \in (0, 1]

) and sacrifices consistent variance matrix (and hence standard error) estimation in the interest of achieving improved performance in statistical testing by mirroring finite sample characteristics of test statistics in the new asymptotic theory of these tests. The formation of t ratio and Wald statistics based on HAC estimators without truncation belongs to the more general class of HAR test statistics. There are known analytic advantages to the fixed-b approach, primarily related to controlling size distortion. In particular, research by (Jansson 2004; Sun et al. 2011; Sun 2014b) has shown evidence from Edgeworth expansions of enhanced higher-order asymptotic size control in the use of these tests. Recently, (Lazarus et al. 2018; Müller 2014; Sun 2018) have surveyed work in this literature and given recommendations for practical implementation.

In studying spurious regression on trend phenomena, (Phillips 1996, 1998) showed that the use of HAC methods attenuated the misleading divergence rate (under the null hypothesis of no association) by the extent to which the truncation lag

M \to \infty .

In particular, the divergence rate of the t statistic in a spurious regression involving independent

I (1)

variables is

O_{p} (\sqrt{n / M})

rather than

O_{p} (\sqrt{n})

. Pursuing this philosophy further, (Sun 2004) offered a new solution to deal with inference in spurious regressions. He argued that the divergence of the usual t-statistic arises from the use of a standard error estimator that underestimates the true variation of the ordinary least squares (OLS) estimator. He proposed the use of a fixed-b HAR standard error estimator with a bandwidth proportional to the sample size (where

M \sim b n \to \infty

at the same rate as n). The resulting t-statistic converges to a non-degenerate limiting distribution which depends on nuisance parameters. These discoveries revealed that prudent use of HAR techniques in regression testing might widen the range of inference to include spurious regression.

In the same spirit as (Sun 2004, 2014b), the present contribution analyzes possible advantages in using HAR test statistics in the context of simple trend regressions such as

x_{t} = a t + u_{t},

(1)

where

u_{t}

is

I (1) .

For trend assessment in models of this type it is of interest to test the null hypothesis

H_{0} : a = 0

of the absence of a deterministic trend in (1). This framework is a prototypical example of much more complex models where deterministic trend, stochastic trends, and trend break components may all be present, hence methods of asymptotically valid estimation and testing are needed. A recent general approach to the consistent estimation of such complex models by machine learning filtering methods is given in (Phillips and Shi 2019).

The present paper considers three types of t test widely used in econometrics: the usual t test, the t test based on HAC covariance matrix estimators, and the fixed-b HAR test. We apply these t-statistics to three classic examples of spurious regressions: regression of stochastic trends on time polynomials, regression of stochastic trends on a deterministic time trend, and regression among independent random walks. The asymptotic behavior of these three different t-statistics is investigated. In the regression of stochastic trends on time polynomials and the regression among independent random walks, it is shown that the usual t test and the HAC based t test are likely to indicate a significant relation with probability that goes to one as the sample size n goes to infinity. However, provided the number of regressors (K) is fixed, the HAR t-statistics converge to well-defined distributions free from nuisance parameters. As a result, when appropriate critical values are drawn from these limiting distributions, the HAR t-statistics do not diverge and valid inference on the regression coefficients is possible, concordant with (Sun 2004).

In contrast to these results and those of (Sun 2004), we find that HAR t-statistics diverge at rate

\sqrt{K}

as

K \to \infty

. Hence, the characteristics of spurious regression return even with the use of HAR test statistics in models with an increasing number of regressors. These findings seem relevant for machine learning and autometric model building methods which accommodate large numbers of regressors, including those of the

p > n

variety where model searching often begins with more regressors than sample observations and penalized methods of estimation are needed to obtain even preliminary results.

Our results also reveal that the other two t-statistics (the usual t and HAC-based t) diverge at greater rates when

K \to \infty

than when K is fixed. In the regression of stochastic trends on deterministic time trends, we derive the limiting distributions of the statistics under both the null and alternative hypotheses. The HAR test turns out to be the only test which is consistent and has controllable size. All the limit theory for these tests receives strong support in simulations. As will become evident, the appealing asymptotic properties of the HAR test in the fixed number of regressors case are manifest even in situations where some commonly-used regularity conditions in the construction of HAR tests are violated.

The rest of the paper is organized as follows. Section 2 examines regressions of stochastic trends on a complete orthonormal basis in

L_{2} [0, 1]

and establishes the limiting distributions of the three different t-statistics with explicit application to the prototypical case of a spurious linear trend regression. Section 3 examines the limit behavior of the t-statistics in regressions among independent random walks. Simulations are reported in Section 4. Section 5 concludes. All proofs are given in Appendix A, Appendix B and Appendix C.

2. Regression of Stochastic Trend on Time Polynomials

2.1. Model Details and Background

The development in this section concentrates on a simple unit root time series

X_{t} = \sum_{s = 1}^{t} μ_{s},

(2)

whose increments

μ_{t}

form a stationary time series with zero mean, finite absolute moments to order

p > 2

, and continuous spectral density function

f_{μ} (λ)

. We assume that

X_{t}

satisfies the functional central limit theorem (FCLT)

\frac{X_{⌊n r⌋}}{\sqrt{n}} \Rightarrow B (r) \equiv B M (ω^{2}), with ω^{2} = 2 π f_{μ} (0),

(3)

for which primitive conditions are well known (e.g., Phillips and Solo 1992). The results that follow are illustrative and apply with suitable modification to more general nonstationary time series, such as near integrated or long memory series, which upon standardization converge to limiting stochastic processes with sample paths that are continuous almost surely.

By the Karhunen–Loève (KL) expansion theorem (e.g., Loève 1963, p. 478) into a countable linear combination of orthogonal functions, the KL representation for the Brownian motion

B (r)

is

B (r) = \sum_{k = 1}^{\infty} \sqrt{λ_{k}} φ_{k} (r) ξ_{k} = ω \sqrt{2} \sum_{k = 1}^{\infty} \frac{sin [(k - 1 / 2) π r]}{(k - 1 / 2) π} ξ_{k},

(4)

where

λ_{k} = \frac{4 ω^{2}}{{(2 k - 1)}^{2} π^{2}}, φ_{k} (r) = \sqrt{2} sin [(k - 1 / 2) π r]

are eigenvalues and corresponding eigenfunctions of the Brownian motion covariance kernel

ω^{2} (r \land s)

, and

ξ_{k} = λ_{k}^{- 1 / 2} \int_{0}^{1} B (s) φ_{k} (s) d s

are independently and identically distributed (iid) as

N (0, 1)

. This series representation of

B (r)

is convergent almost surely and uniformly in

r \in [0, 1]

. Denoting

z_{k} = \sqrt{λ_{k}} ξ_{k}

as the stochastic coefficients, the KL representation (4) could be rewritten as

B (r) = \sum_{k = 1}^{\infty} z_{k} φ_{k} (r) .

(5)
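The eigen-structure in (4) and (5) can be checked numerically. The following is a minimal sketch, assuming ω = 1 by default and a midpoint grid for integration; the function names `kl_eigenvalue` and `kl_eigenfunction` and the grid size are illustrative choices, not part of the paper. It confirms that the functions φ_k are orthonormal in L_2[0, 1] and that the eigenvalues sum to the trace of the covariance kernel, \int_{0}^{1} ω^{2} r d r = ω^{2} / 2.

```python
import numpy as np

def kl_eigenvalue(k, omega=1.0):
    """lambda_k = 4 omega^2 / ((2k - 1)^2 pi^2), as defined below (4)."""
    return 4.0 * omega**2 / ((2 * k - 1) ** 2 * np.pi**2)

def kl_eigenfunction(k, r):
    """phi_k(r) = sqrt(2) sin((k - 1/2) pi r)."""
    return np.sqrt(2.0) * np.sin((k - 0.5) * np.pi * r)

# Midpoint grid on [0, 1] for numerical integration (grid size is an assumption).
m = 20000
r = (np.arange(m) + 0.5) / m

# Gram matrix of the first five eigenfunctions: close to the identity matrix.
phi = np.stack([kl_eigenfunction(k, r) for k in range(1, 6)])
gram = phi @ phi.T / m

# Partial sum of the eigenvalues: approaches omega^2 / 2 = 0.5 when omega = 1.
k = np.arange(1, 200001)
trace_partial = kl_eigenvalue(k).sum()
```

The slow 1/k^2 decay of the eigenvalues is what drives the rates in the expanding-regressor case: the variance left unexplained after K terms is of order 1/K.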

Starting from the KL representation of

B (r)

, (Phillips 1998) studied the asymptotic properties of regressions of

X_{t}

on deterministic regressors of the type

X_{t} = \sum_{k = 1}^{K} {\hat{b}}_{k} φ_{k} (\frac{t}{n}) + {\hat{u}}_{t},

(6)

or, equivalently (with

{\hat{a}}_{k} = {\hat{b}}_{k} / \sqrt{n}

),

\frac{X_{t}}{\sqrt{n}} = \sum_{k = 1}^{K} {\hat{a}}_{k} φ_{k} (\frac{t}{n}) + \frac{{\hat{u}}_{t}}{\sqrt{n}} .

(7)

Least squares estimation gives

{\hat{α}}_{K} = {({\hat{a}}_{1}, \dots, {\hat{a}}_{K})}^{'} = {(Φ_{K}^{'} Φ_{K})}^{- 1} Φ_{K}^{'} X / \sqrt{n},

where

Φ_{K} = {(φ_{K 1}, \dots, φ_{K n})}^{'}

with

φ_{K t} = {(φ_{1} (t / n), \dots, φ_{K} (t / n))}^{'}

, and

X = {(X_{1}, \dots, X_{n})}^{'}

. Let

C_{K} \in R^{K}

be any vector with

C_{K}^{'} C_{K} = 1

. When K is fixed and

n \to \infty

, (Phillips 1998) proved that

C_{K}^{'} {\hat{α}}_{K} \Rightarrow C_{K}^{'} \int_{0}^{1} {\bar{φ}}_{K} (r) B (r) d r \equiv N (0, C_{K}^{'} Λ_{K} C_{K}),

where

Λ_{K} =

diag

(λ_{1}, \dots, λ_{K})

and

{\bar{φ}}_{K} (r) = {(φ_{1} (r), \dots, φ_{K} (r))}^{'}

. In the expanding regressors case where

K = K (n) \to \infty

and

K / n \to 0

, it was also shown in (Phillips 1998) that

C_{K}^{'} {\hat{α}}_{K} \Rightarrow N (0, σ_{c}^{2}) \equiv c^{'} Z = \sum_{k = 1}^{\infty} c_{k} z_{k},

where

c = (c_{k}) \in R^{\infty}

satisfies

c^{'} c = 1

,

Λ =

diag

(λ_{1}, λ_{2}, \dots)

,

σ_{c}^{2} = c^{'} Λ c

, and

Z = {(z_{k})}_{k = 1}^{\infty}

are the random coefficients in the KL representation (5). Therefore, the fitted coefficients in regression (7) tend to random variables in the limit as

n \to \infty

that match those in the KL representation of the limit process

B (\cdot)

. In other words, least squares regressions reproduce in part (when K is finite) and in whole (when

K \to \infty

) the underlying orthonormal representations.
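As an illustration of how least squares reproduces the KL coefficients, the sketch below fits regression (7) to a simulated random walk and compares the fitted coefficients with the sample analogue of z_k = \int_{0}^{1} B (r) φ_{k} (r) d r. The Gaussian iid increments, the seed, the values of n and K, and the helper names `basis_matrix` and `fit_kl_coefficients` are assumptions made for the example only.

```python
import numpy as np

def basis_matrix(n, K):
    """n x K matrix Phi_K whose (t, k) entry is phi_k(t / n)."""
    t = np.arange(1, n + 1) / n
    k = np.arange(1, K + 1)
    return np.sqrt(2.0) * np.sin(np.outer(t, (k - 0.5) * np.pi))

def fit_kl_coefficients(X, K):
    """Least squares coefficients a_hat in regression (7)."""
    n = len(X)
    Phi = basis_matrix(n, K)
    a_hat, *_ = np.linalg.lstsq(Phi, X / np.sqrt(n), rcond=None)
    return a_hat

rng = np.random.default_rng(0)          # seed chosen only for reproducibility
n, K = 2000, 4
X = np.cumsum(rng.standard_normal(n))   # random walk with iid N(0, 1) increments
a_hat = fit_kl_coefficients(X, K)

# Sample analogue of z_k, computed as a Riemann sum over t / n.
Phi = basis_matrix(n, K)
z_sample = Phi.T @ (X / np.sqrt(n)) / n
```

Because Φ_K' Φ_K / n is close to the identity matrix for large n, `a_hat` and `z_sample` nearly coincide. Both remain random in the limit, which is the sense in which the fitted coefficients converge to random variables.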

2.2. Three t-Statistics

Suppose interest centers on testing whether the regression coefficients are significant or more generally whether some linear combination

C_{K}^{'} β_{K}

of the underlying coefficients

β_{K} = {(b_{1}, \dots, b_{K})}^{'}

in the estimated regression (6) is equal to 0, that is

H_{0} : C_{K}^{'} β_{K} = 0 vs . H_{1} : C_{K}^{'} β_{K} \neq 0 .

Three types of t-statistics are considered. The first is the usual t-ratio defined as

t_{_{C_{K}^{'} β_{K}}} = \frac{C_{K}^{'} {\hat{β}}_{K}}{{[s_{b}^{2} C_{K}^{'} {(Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}]}^{1 / 2}}

(8)

with

s_{b}^{2} = n^{- 1} \sum_{t = 1}^{n} {\hat{u}}_{t}^{2} = n^{- 1} \sum_{t = 1}^{n} {(X_{t} - {\hat{β}}_{K}^{'} φ_{K t})}^{2}

the usual error variance estimate. The second t-statistic is constructed by using a HAC variance estimator and has the following representation

t_{_{C_{K}^{'} β_{K}}}^{^{H A C}} = \frac{C_{K}^{'} {\hat{β}}_{K}}{{\hat{ω}}_{C_{K}^{'} β_{K}}},

(9)

where

{\hat{ω}}_{C_{K}^{'} β_{K}}^{2} = C_{K}^{'} {(Φ_{K}^{'} Φ_{K})}^{- 1} [n {\hat{lrvar}}_{H A C} ({\hat{u}}_{t} φ_{K t})] {(Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}

(10)

with

{\hat{lrvar}}_{H A C} (η_{t}) = \sum_{j = - M}^{M} k (\frac{j}{M}) c (j, η) and c (j, η) = \frac{1}{n} \sum_{1 \leq t, t + j \leq n} η_{t} η_{t + j}^{'} .

(11)

Here,

{\hat{lrvar}}_{H A C} (η_{t})

is a kernel estimate of the long run variance of its argument,

k (\cdot)

is a lag kernel, M is a bandwidth parameter satisfying

M / n + 1 / M \to 0

as

n \to \infty,

and the argument

η_{t} = {\hat{u}}_{t} φ_{K t}

in (10).

If we choose a fixed

b \in (0, 1]

and set

M = ⌊b n⌋

, the condition

M / n + 1 / M \to 0

as

n \to \infty

is violated. In that case, the long run variance estimate is a fixed-b estimate and leads to the HAR t-statistic

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} = \frac{C_{K}^{'} {\hat{β}}_{K}}{{\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}},

(12)

where

{\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2} = C_{K}^{'} {(Φ_{K}^{'} Φ_{K})}^{- 1} [n {\hat{lrvar}}_{H A R} ({\hat{u}}_{t} φ_{K t})] {(Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}

(13)

with

{\hat{lrvar}}_{H A R} (η_{t}) = \sum_{j = - (n - 1)}^{(n - 1)} k_{b} (\frac{j}{n}) c (j, η), c (j, η) = \frac{1}{n} \sum_{1 \leq t, t + j \leq n} η_{t} η_{t + j}^{'},

(14)

k_{b} (\frac{j}{n}) = k (\frac{j}{n b})

, and

k (\cdot)

is a lag kernel function as before.
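The estimators in (11) and (14) differ only in the bandwidth passed to the kernel. A minimal sketch for a scalar series follows, assuming the Bartlett kernel (the text leaves k(·) generic) and kernels supported on [-1, 1], so that summing over all lags agrees with the truncated sum in (11). The function names are illustrative.

```python
import numpy as np

def bartlett(x):
    """Bartlett lag kernel k(x) = max(1 - |x|, 0); the kernel choice is an assumption."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def sample_autocov(eta, j):
    """c(j, eta) = n^{-1} sum over 1 <= t, t + j <= n of eta_t eta_{t+j}, for j >= 0."""
    n = len(eta)
    return float(eta[: n - j] @ eta[j:]) / n

def lrvar_kernel(eta, bandwidth, kernel=bartlett):
    """Kernel long run variance estimate for a scalar series.

    bandwidth = M with M / n -> 0 corresponds to the HAC estimate in (11);
    bandwidth = b * n with fixed b corresponds to the fixed-b HAR estimate in (14).
    """
    eta = np.asarray(eta, dtype=float)
    n = len(eta)
    total = sample_autocov(eta, 0)
    for j in range(1, n):
        w = kernel(j / bandwidth)
        if w > 0.0:
            # c(-j, eta) = c(j, eta) for a scalar series, hence the factor 2.
            total += 2.0 * w * sample_autocov(eta, j)
    return total

eta = np.array([1.0, -1.0, 1.0, -1.0])
lrv_short = lrvar_kernel(eta, bandwidth=1)          # only lag 0 receives weight
lrv_fixed_b = lrvar_kernel(eta, bandwidth=len(eta))  # b = 1: all lags receive weight
```

With bandwidth 1 the estimate reduces to the sample variance c(0, η); with bandwidth n every sample autocovariance enters, which is why the fixed-b estimate does not converge to a constant.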

With minor changes of the proof given in (Phillips 1998), it is easy to deduce that for fixed K,

t_{_{C_{K}^{'} β_{K}}} \sim O_{p} (\sqrt{n})

and

t_{_{C_{K}^{'} β_{K}}}^{^{H A C}} \sim O_{p} (\sqrt{n / M})

as

n \to \infty,

as discussed earlier. Therefore, such tests indicate statistically significant regression coefficients with probability that goes to one as

n \to \infty

. These results match what is now standard spurious regression limit theory for inference.

In addition, as we show in Theorem 2 below, the large regressor case where

K \to \infty

leads to different results. In this case, both t-statistics

t_{_{C_{K}^{'} β_{K}}}

and

t_{_{C_{K}^{'} β_{K}}}^{^{H A C}}

have greater rates of divergence that depend on the expansion rate of

K,

given by

t_{_{C_{K}^{'} β_{K}}} = O_{p} (\sqrt{n K})

and

t_{_{C_{K}^{'} β_{K}}}^{^{H A C}} = O_{p} (\sqrt{n K / M})

. Thus, with the addition of more regressors the combined effect of the regression coefficients—as well as that of the individual coefficients—appears more significant and diverges when

K \to \infty

as

n \to \infty .

In consequence, large numbers of regressors effectively worsen the spurious regression problem.

Is there a test which does not always indicate that coefficients

{\hat{β}}_{K}

are significant in the ‘spurious’ regression (6)? As the results of (Sun 2004) show, the answer is positive for the case where K is fixed. In this event, the HAR test is appealing in the sense that

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} \sim O_{p} (1)

when

n \to \infty

and K is fixed, so that test size is controlled in the limit. Therefore, when appropriate critical values obtained from the limit distribution of

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}}

are employed, the coefficients

{\hat{β}}_{K}

do not inevitably signal significance as

n \to \infty

and the usual misleading test implications of spurious regression do not manifest. However, in the important case where the regressor space expands and

K \to \infty

, the test statistic

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}}

diverges to infinity at rate

O_{p} (\sqrt{K})

and the coefficients

{\hat{β}}_{K}

become significant again even under HAR testing.
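The fixed-K behavior of the three statistics can be explored in a small Monte Carlo experiment. The sketch below is illustrative only: it assumes Gaussian iid increments, the Bartlett kernel, C_K equal to the first unit vector, K = 2, a HAC bandwidth M = n^{1/3}, and b = 0.5, none of which are prescribed by the text. It uses the identity that, for the scalar v_t = C_K' (Φ_K' Φ_K)^{-1} φ_{Kt} û_t, the variance estimates in (10) and (13) equal the quadratic form v' W v, where W has entries k((t - s) / bandwidth).

```python
import numpy as np

def basis_matrix(n, K):
    """n x K matrix Phi_K whose (t, k) entry is phi_k(t / n)."""
    t = np.arange(1, n + 1) / n
    k = np.arange(1, K + 1)
    return np.sqrt(2.0) * np.sin(np.outer(t, (k - 0.5) * np.pi))

def bartlett_weights(n, bandwidth):
    """n x n matrix with (t, s) entry k((t - s) / bandwidth), Bartlett kernel."""
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.maximum(1.0 - lags / bandwidth, 0.0)

def three_t_stats(X, Phi, C, W_hac, W_har):
    """Return (usual t, HAC t, HAR t) for H0: C' beta = 0 in regression (6)."""
    n = len(X)
    G_inv = np.linalg.inv(Phi.T @ Phi)
    beta_hat = G_inv @ Phi.T @ X
    u_hat = X - Phi @ beta_hat
    num = C @ beta_hat
    s2 = u_hat @ u_hat / n
    t_usual = num / np.sqrt(s2 * (C @ G_inv @ C))
    v = (Phi @ G_inv @ C) * u_hat      # v_t = C'(Phi'Phi)^{-1} phi_Kt u_hat_t
    t_hac = num / np.sqrt(v @ W_hac @ v)
    t_har = num / np.sqrt(v @ W_har @ v)
    return t_usual, t_hac, t_har

def median_abs_t(n, K=2, b=0.5, reps=200, seed=0):
    """Median absolute value of each statistic over simulated random walks."""
    rng = np.random.default_rng(seed)
    Phi = basis_matrix(n, K)
    C = np.zeros(K)
    C[0] = 1.0
    M = max(1, int(round(n ** (1 / 3))))   # HAC bandwidth rule (assumption)
    W_hac = bartlett_weights(n, M)
    W_har = bartlett_weights(n, b * n)
    stats = np.array([
        three_t_stats(np.cumsum(rng.standard_normal(n)), Phi, C, W_hac, W_har)
        for _ in range(reps)
    ])
    return np.median(np.abs(stats), axis=0)

m_small = median_abs_t(100)
m_large = median_abs_t(1600)
```

Under the rates stated above, one would expect the first entry of `m_large` to be roughly four times that of `m_small`, the second to grow more slowly, and the third to be of similar size at both sample sizes.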

These results are collected in the following two theorems.

Theorem 1.

For fixed K, as

n \to \infty

and

M / n + 1 / M \to 0

, we have

(i)

\frac{t_{_{C_{K}^{'} β_{K}}}}{\sqrt{n}} \Rightarrow \frac{C_{K}^{'} Z_{K}}{{[\int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2}]}^{1 / 2}};

(ii)

\sqrt{\frac{M}{n}} t_{_{C_{K}^{'} β_{K}}}^{^{H A C}} \Rightarrow \frac{C_{K}^{'} Z_{K}}{{\{\int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} {(C_{K}^{'} {\bar{φ}}_{K})}^{2}\}}^{1 / 2}};

(iii)

\begin{matrix} t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} & \Rightarrow \frac{C_{K}^{'} Z_{K}}{{\{\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q\}}^{1 / 2}} \\ & \equiv \frac{C_{K}^{'} Z_{K}^{W}}{{\{\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{W}}_{φ_{K}} (r) {\underset{̲}{W}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q\}}^{1 / 2}}, \end{matrix}

where

Z_{K} = {(z_{k})}_{k = 1}^{K}

are the random coefficients in orthonormal representation (5),

{\underset{̲}{B}}_{φ_{K}} (r) = B (r) - Z_{K}^{'} {\bar{φ}}_{K} (r)

,

Z_{K}^{W} = Z_{K} / ω = \int_{0}^{1} W {\bar{φ}}_{K}

,

W (\cdot) \equiv B M (1),

ω^{2} = 2 π f_{μ} (0)

, and

{\underset{̲}{W}}_{φ_{K}} (r) = {\underset{̲}{B}}_{φ_{K}} (r) / ω = W (r) - {(Z_{K}^{W})}^{'} {\bar{φ}}_{K} (r)

.

Remark 1.

The fixed-b HAR based t-statistic

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}}

asymptotically follows a well-defined limit distribution when the number of regressors K is fixed. The limit distribution is free from nuisance parameters and is easily computable but depends on the lag kernel as well as the form of the trend regressors, which influence the detrended standard Brownian motion process

{\underset{̲}{W}}_{φ_{K}} .

The asymptotic critical values, therefore, differ from those of the usual standard normal limit distribution of a t-statistic. But the specific features of the limit distribution of

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}},

which retain randomness in the denominator of the limiting statistic, help to control size in finite sample testing.

Theorem 2.

As

n, K \to \infty

,

M / n + 1 / M \to 0

and

K^{5 / 2} / n + K^{3 / 2} / n^{\frac{1}{2} - \frac{1}{p}} \to 0

, the following results hold:

(i)

\frac{t_{_{C_{K}^{'} β_{K}}}}{\sqrt{n K}} = \frac{C_{K}^{'} Z_{K}}{{[K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2}]}^{1 / 2}} + o_{p} (1) = O_{p} (1),

where

K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} = ω^{2} / π^{2} + o_{p} (1)

.

(ii)

\sqrt{\frac{M}{n K}} t_{_{C_{K}^{'} β_{K}}}^{^{H A C}} = \frac{C_{K}^{'} Z_{K}}{{[K \int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} {[C_{K}^{'} {\bar{φ}}_{K}]}^{2}]}^{1 / 2}} + o_{p} (1) = O_{p} (1),

(iii)

\frac{t_{_{C_{K}^{'} β_{K}}}^{^{H A R}}}{\sqrt{K}} = \frac{C_{K}^{'} Z_{K}}{{\{K \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) C_{K}^{'} {\bar{φ}}_{K} (r) C_{K}^{'} {\bar{φ}}_{K} (q) d r d q\}}^{1 / 2}} + o_{p} (1) = O_{p} (1) .
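The normalization in part (i) can be motivated by a short calculation. Since the residual process is \sum_{k > K} z_{k} φ_{k} (r), orthonormality gives \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} = \sum_{k > K} λ_{k} ξ_{k}^{2}, whose mean is \sum_{k > K} λ_{k}. The sketch below evaluates K times this tail sum using the standard identity \sum_{k \geq 1} {(2 k - 1)}^{- 2} = π^{2} / 8; it checks only the mean, not the convergence in probability stated in the theorem, and the function name is illustrative.

```python
import numpy as np

def scaled_tail_mean(K, omega=1.0):
    """K * sum_{k > K} lambda_k, the mean of K * int_0^1 B_phiK(r)^2 dr."""
    k = np.arange(1, K + 1)
    # Tail of sum 1 / (2k - 1)^2 beyond K, from the closed form pi^2 / 8.
    tail = np.pi**2 / 8.0 - np.sum(1.0 / (2 * k - 1) ** 2)
    return K * (4.0 * omega**2 / np.pi**2) * tail

value_at_1000 = scaled_tail_mean(1000)
target = 1.0 / np.pi**2   # omega^2 / pi^2 with omega = 1
```

For K = 1000 the two numbers agree to several decimal places, consistent with the limit ω^{2} / π^{2} given in part (i).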

Remark 2.

Theorem 2 shows that all three t-statistics diverge as

n \to \infty

but at different rates, each of which depends on K. The divergence rate of the fixed-b test statistic

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} = O_{p} (\sqrt{K})

is the slowest and depends only on

K .

These results strengthen the findings in (Phillips 1998) that attempts to deal with serial dependence in controlling size in significance testing generally fail when enough effort is put into the regression design to fit the trajectory. This failure now includes HAR testing when

K \to \infty

. All the tests are therefore ultimately confirmatory of the existence of a ‘relationship’—in the present case a coordinate representation relationship among different types of trends, at least when a complete representation is attempted by allowing the number of regressors K to diverge with

n .

The results of the theorem may be interpreted to mean that when a serious attempt is made to model a stochastic trend using deterministic functions (either a large number of such regressors or regressors that are carefully chosen to provide a successful representation and trajectory fit) it will end up being successful even when a spurious regression robust method such as the fixed-b HAR test is used.

An additional matter concerning the form of these tests may usefully be highlighted. To construct the HAC and HAR t-statistics, the following condition

Var (\frac{Φ_{K}^{'} X}{\sqrt{n}}) = Var (\frac{1}{\sqrt{n}} \sum_{t = 1}^{n} φ_{K t} X_{t}) = Γ_{0} + \sum_{j = 1}^{n - 1} (1 - \frac{j}{n}) (Γ_{j} + Γ_{j}^{'})

(15)

with

Γ_{j} = E (φ_{K t} X_{t} X_{t - j}^{'} φ_{K (t - j)}^{'})

is usually imposed (e.g., Kiefer et al. 2000; Kiefer and Vogelsang 2002a, 2002b) as in standard approaches to robust covariance matrix estimation. In other words, the process

\{φ_{K t} X_{t}\}

is typically assumed to be unconditionally stationary or weakly dependent with uniformly bounded second moments so that series such as (15) converge. However, this condition is violated in both regressions (6) and (7) as

E (φ_{K t} X_{t} X_{t - j}^{'} φ_{K (t - j)}^{'}) = φ_{K t} E (\sum_{s = 1}^{t} μ_{s} \sum_{τ = 1}^{t - j} μ_{τ}) φ_{K (t - j)}^{'}

depends on t. For example, when the components

μ_{s}

are

i i d

(0, σ^{2})

with partial sums satisfying (3) then

E (φ_{K t} X_{t} X_{t - j}^{'} φ_{K (t - j)}^{'}) = (t - j) σ^{2} φ_{K t} φ_{K (t - j)}^{'}

depends on t. Regardless of this violation, HAC and HAR t-statistics may still be constructed in the traditional way; and the HAR statistic,

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}}

has nuisance parameter-free asymptotic properties even though the above unconditional stationarity condition is not satisfied.
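The dependence of the second moments on t in the iid case can be seen from the exact covariance matrix of a random walk. The following sketch, with the hypothetical helper `random_walk_cov`, writes X = L μ with L a lower triangular matrix of ones, so that Cov(X) = σ^{2} L L' has (t, s) entry σ^{2} min(t, s).

```python
import numpy as np

def random_walk_cov(n, sigma2=1.0):
    """Exact covariance matrix of (X_1, ..., X_n) when X_t = sum_{s <= t} mu_s
    and the mu_s are iid with mean 0 and variance sigma2."""
    L = np.tril(np.ones((n, n)))   # X = L @ mu
    return sigma2 * (L @ L.T)      # Cov(X) = sigma2 * L L'

cov = random_walk_cov(6, sigma2=2.0)

# E(X_t X_{t-j}) = (t - j) * sigma2. With 1-based t = 5 and j = 2: 3 * 2.0.
val_t5 = cov[5 - 1, 3 - 1]
# Same lag j = 2 but t = 6: 4 * 2.0, so the moment depends on t, not only on j.
val_t6 = cov[6 - 1, 4 - 1]
```

The two entries share the same lag but differ in value, which is the violation of unconditional stationarity described above.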

The above results apply straightforwardly to the simple case of a spurious linear regression on trend where the time series is a unit root process generated by

X_{t} = a t + X_{t}^{0}, t = 1, \dots, n,

(16)

with

a = 0

, where

X_{t}^{0} = \sum_{s = 1}^{t} μ_{s}

is the partial sum of a zero mean stationary process

\{μ_{s}\}

with continuous spectral density

f_{μ} (λ) .

The standardized process

X_{n} (r) = n^{- 1 / 2} X_{⌊n r⌋}^{0}

satisfies the functional law

X_{n} (r) \Rightarrow B (r) \equiv B M (ω^{2}), ω^{2} = 2 π f_{μ} (0) > 0 .

The fitted regression model is

X_{t} = \hat{a} t + {\hat{u}}_{t}, or equivalently, \frac{X_{t}}{\sqrt{n}} = (\sqrt{n} \hat{a}) \frac{t}{n} + \frac{{\hat{u}}_{t}}{\sqrt{n}},

(17)

where

\hat{a} = \sum_{t = 1}^{n} t X_{t} / \sum_{t = 1}^{n} t^{2}

is the LS estimate of

a,

which satisfies (Durlauf and Phillips 1988)

\sqrt{n} (\hat{a} - a) = \frac{n^{- 5 / 2} \sum_{t = 1}^{n} t X_{t}^{0}}{n^{- 3} \sum_{t = 1}^{n} t^{2}} \Rightarrow 3 \int_{0}^{1} r B (r) d r \equiv N (0, \frac{6}{5} ω^{2}),

(18)

so that

\hat{a}

is consistent, including the case where

a = 0 .

However, as is well known, the usual t-statistic has order

O_{p} (\sqrt{n})

and diverges as

n \to \infty,

indicating a significant relationship between

\{X_{t}\}

and t in spite of the fact that

a = 0 .

This outcome follows directly from Theorem 1 and the (alternate) representation for the standard Brownian motion

W (r)

as

W (r) = r ξ_{0} + \sqrt{2} \sum_{k = 1}^{\infty} \frac{sin [k π r]}{k π} ξ_{k} with ξ_{k} \equiv iid N (0, 1),

(19)

which implies that

n^{- 1 / 2} X_{⌊n r⌋}^{0} \Rightarrow B (r) = ω \cdot W (r) = (ω ξ_{0}) r + \sqrt{2} \sum_{k = 1}^{\infty} \frac{sin [k π r]}{k π} (ω ξ_{k}) .

Thus, when

a = 0

, the scaled LS estimator

\sqrt{n} \hat{a}

has a random limit

ξ_{a} \equiv N (0, \frac{6}{5} ω^{2})

from (18) that approximates but does not exactly reproduce the leading random coefficient term

(ω ξ_{0})

in the representation (19). Importantly in this case, the deterministic functions in (19) are not orthonormal and there is dependence in

L_{2} [0, 1]

between the functions r and

{\{(\sqrt{2} sin [k π r]) / (k π)\}}_{k = 1}^{\infty} .

This dependence induces an asymptotic inefficiency in the trend coefficient estimate

\hat{a}

, since

\frac{6}{5} ω^{2} >

Var

(ω ξ_{0}) = ω^{2}

.
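The variance constant 6/5 in (18) can be checked in finite samples. The following is a minimal sketch, assuming iid increments with unit variance (so that ω² = 1) and a = 0; the function name `scaled_variance_of_slope` is hypothetical and introduced only for illustration. It computes the exact variance of \sqrt{n} \hat{a} from the identity \sum_{t} t X_{t} = \sum_{s} c_{s} μ_{s} with c_{s} = \sum_{t \geq s} t, rather than by simulation.

```python
import numpy as np

def scaled_variance_of_slope(n):
    """Exact Var(sqrt(n) * a_hat) when a = 0 and the increments are iid (0, 1)."""
    t = np.arange(1, n + 1, dtype=float)
    # sum_t t * X_t = sum_s c_s * mu_s, where c_s = sum_{t >= s} t (reverse cumulative sum)
    c = np.cumsum(t[::-1])[::-1]
    # Var(sum_s c_s mu_s) = sum_s c_s^2; the LS denominator is sum_t t^2
    return n * np.dot(c, c) / np.dot(t, t) ** 2

for n in (10, 100, 1000):
    print(n, scaled_variance_of_slope(n))
```

Under these assumptions the printed values should approach 6/5 as n grows, staying above the value ω² = 1 associated with the leading coefficient ω ξ_{0}.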

Next, in testing

H_{0} : a = 0

versus

H_{1} : a \neq 0,

the following statistics are considered:

\begin{matrix} t_{a} & = \frac{\hat{a}}{s_{a}} = \frac{\hat{a}}{{\{[n^{- 1} \sum_{t = 1}^{n} {({\hat{u}}_{t})}^{2}] {(\sum_{t = 1}^{n} t^{2})}^{- 1}\}}^{1 / 2}}, \end{matrix}

(20)

\begin{matrix} t_{a}^{^{H A C}} & = \frac{\hat{a}}{{\hat{ω}}_{a}} = \frac{\hat{a}}{{\{{(\sum_{t = 1}^{n} t^{2})}^{- 1} [n {\hat{lrvar}}_{H A C} (t {\hat{u}}_{t})] {(\sum_{t = 1}^{n} t^{2})}^{- 1}\}}^{1 / 2}}, \end{matrix}

(21)

\begin{matrix} t_{a}^{^{H A R}} & = \frac{\hat{a}}{{\overset{ˇ}{ω}}_{a}} = \frac{\hat{a}}{{\{{(\sum_{t = 1}^{n} t^{2})}^{- 1} [n {\hat{lrvar}}_{H A R} (t {\hat{u}}_{t})] {(\sum_{t = 1}^{n} t^{2})}^{- 1}\}}^{1 / 2}}, \end{matrix}

(22)

where

{\hat{lrvar}}_{H A C} (t {\hat{u}}_{t}) = \sum_{j = - M}^{M} k (\frac{j}{M}) [\frac{1}{n} \sum_{1 \leq t, t + j \leq n} {\hat{u}}_{t} {\hat{u}}_{t + j} t (t + j)] with M / n + 1 / M \to 0 as n \to \infty,

{\hat{lrvar}}_{H A R} (t {\hat{u}}_{t}) = \sum_{j = - (n - 1)}^{(n - 1)} k_{b} (\frac{j}{n}) [\frac{1}{n} \sum_{1 \leq t, t + j \leq n} {\hat{u}}_{t} {\hat{u}}_{t + j} t (t + j)] for some fixed b \in (0, 1],

k (\cdot)

is a kernel function,

k_{b} (j / n) = k (j / (n b))

and

{\hat{u}}_{t} = X_{t} - \hat{a} t

for

t = 1, \dots, n

. The asymptotic properties of these test statistics follow in the same way as before when

n \to \infty

with

M / n + 1 / M \to 0,

giving the following results.

(i) Under

H_{0} : a = 0

,

\begin{matrix} \frac{t_{a}}{\sqrt{n}} \Rightarrow \frac{\sqrt{3} \int_{0}^{1} r B}{{\{\int_{0}^{1} {\underset{̲}{B}}^{2}\}}^{1 / 2}}, \end{matrix}

(23)

\begin{matrix} \sqrt{\frac{M}{n}} t_{a}^{^{H A C}} \Rightarrow \frac{\int_{0}^{1} r B}{{\{\int_{- 1}^{1} k (s) d s \int_{0}^{1} r^{2} {\underset{̲}{B}}^{2}\}}^{1 / 2}}, \end{matrix}

(24)

\begin{matrix} t_{a}^{^{H A R}} \Rightarrow \frac{\int_{0}^{1} r B}{{\{\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{B} (r) \underset{̲}{B} (q) r q d r d q\}}^{1 / 2}} \equiv \frac{\int_{0}^{1} r W}{{\{\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{W} (r) \underset{̲}{W} (q) r q d r d q\}}^{1 / 2}}; \end{matrix}

(25)

(ii) Under

H_{1} : a \neq 0

,

\begin{matrix} \frac{t_{a}}{n} \Rightarrow \frac{a}{{(3 \int_{0}^{1} {\underset{̲}{B}}^{2})}^{1 / 2}}, \end{matrix}

(26)

\begin{matrix} \frac{\sqrt{M}}{n} t_{a}^{^{H A C}} \Rightarrow \frac{a}{{[9 \int_{- 1}^{1} k (s) d s \int_{0}^{1} r^{2} {\underset{̲}{B}}^{2}]}^{1 / 2}}, \end{matrix}

(27)

\begin{matrix} \frac{t_{a}^{^{H A R}}}{\sqrt{n}} \Rightarrow \frac{a}{{[9 \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{B} (r) \underset{̲}{B} (q) r q d r d q]}^{1 / 2}}, \end{matrix}

(28)

where

\underset{̲}{B} (r) : = B (r) - 3 (\int_{0}^{1} s B (s) d s) r

and

B (r) \equiv ω W (r)

. Thus, under the null hypothesis both

t_{a} = O_{p} (\sqrt{n})

and

t_{a}^{^{H A C}} = O_{p} (\sqrt{n / M})

diverge but

t_{a}^{^{H A R}} = O_{p} (1)

and has a well-defined, nuisance-parameter-free limit distribution that may be used in statistical testing. Under the alternative hypothesis, all the listed statistics are divergent but at different rates. Only

t_{a}^{^{H A R}}

has effective discriminatory power, being consistent and having controllable size. These results match those in Sun (2004, 2014a), which show that, for simple trend misspecifications such as the use of a finite degree polynomial trend function in place of a stochastic trend, fixed-b HAR testing controls size and leads to a consistent test.
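The three statistics (20)–(22) can be computed as follows. This is a minimal sketch rather than the authors' code: the Bartlett kernel, the helper names `bartlett`, `lrvar` and `trend_tstats`, and the values M = 3 and b = 0.2 in the usage lines are all assumptions made for illustration.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel k(x) = 1 - |x| on [-1, 1], zero elsewhere (an assumed choice)."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def lrvar(v, bandwidth, kernel=bartlett):
    """Kernel estimate sum_j k(j / bandwidth) * (1/n) * sum_t v_t v_{t+j}."""
    n = len(v)
    total = np.dot(v, v) / n                       # lag j = 0
    for j in range(1, n):
        w = kernel(j / bandwidth)
        if w == 0.0:
            continue
        total += 2.0 * w * np.dot(v[:-j], v[j:]) / n   # lags +j and -j coincide
    return total

def trend_tstats(x, M, b, kernel=bartlett):
    """Return (t_a, t_a^HAC, t_a^HAR) for the fitted regression X_t = a_hat * t + u_hat_t."""
    n = len(x)
    t = np.arange(1, n + 1, dtype=float)
    stt = np.dot(t, t)
    a_hat = np.dot(t, x) / stt
    u = x - a_hat * t
    s_a = np.sqrt(np.mean(u ** 2) / stt)           # denominator of (20)
    v = t * u                                      # scores t * u_hat_t
    se_hac = np.sqrt(n * lrvar(v, M, kernel)) / stt        # denominator of (21), bandwidth M
    se_har = np.sqrt(n * lrvar(v, b * n, kernel)) / stt    # denominator of (22), bandwidth n * b
    return a_hat / s_a, a_hat / se_hac, a_hat / se_har

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(200))            # spurious case: a = 0, unit root data
stats = trend_tstats(x, M=3, b=0.2)
stats_scaled = trend_tstats(3.0 * x, M=3, b=0.2)   # rescaled data
print(stats)
```

The only difference between the HAC and HAR denominators in this sketch is the bandwidth, M versus n b, which is what drives the different normalizations in (24) and (25).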

3. Regressions Among Independent Random Walks

This section extends these ideas to regressions among independent random walks. Let

B (\cdot)

be a Brownian motion on the interval

[0, 1]

. Phillips (1998) proved that there exists a sequence of independent standard Brownian motions

{\{W_{i}\}}_{i = 1}^{K}

that are independent of

B (\cdot)

, and a sequence of variables

{\{d_{i}\}}_{i = 1}^{K}

defined on an augmented probability space

(Ω, F, P)

such that, as

K \to \infty

,

B (r) \sim \sum_{i = 1}^{\infty} d_{i} W_{i} (r) in L_{2} [0, 1] a . s . (P) .

(29)

The random coefficients

d_{i}

are statistically dependent on

B (\cdot)

. Replacing the Wiener processes

W_{i}

by orthogonal functions

V_{i} (r)

in

L_{2} [0, 1]

using the Gram-Schmidt process

\begin{matrix} V_{1} & = W_{1}, \\ V_{2} & = W_{2} - (\int_{0}^{1} W_{2} V_{1}) {(\int_{0}^{1} V_{1}^{2})}^{- 1} V_{1}, \\ V_{3} & = W_{3} - (\int_{0}^{1} W_{3} V_{a}^{'}) {(\int_{0}^{1} V_{a} V_{a}^{'})}^{- 1} V_{a}, V_{a}^{'} = [V_{1}, V_{2}], etc, \end{matrix}

gives the representation

B (r) \sim \sum_{i = 1}^{\infty} e_{i} V_{i} (r) with e_{i} = (\int_{0}^{1} B V_{i}) {(\int_{0}^{1} V_{i}^{2})}^{- 1} .

(30)

In the following, we consider the unit root process

y_{t} = \sum_{s = 1}^{t} μ_{s}

with mean zero stationary components

\{μ_{s}\}

with continuous spectral density

f_{μ} (λ)

and satisfying the functional law

n^{- 1 / 2} y_{⌊n r⌋} \Rightarrow B (r) \equiv B M (ω^{2}), ω^{2} = 2 π f_{μ} (0) > 0 .

Let

x_{t} = (x_{k t}) = {(\sum_{j = 1}^{t} μ_{k j})}_{k = 1}^{K}

be a vector of K independent standard Gaussian random walks, all of which are independent of

y_{t}

. Consider the linear regression

y_{t} = {\hat{b}}_{x}^{'} x_{t} + {\hat{u}}_{t}

, based on

n > K

observations of these series. The large n asymptotic behavior of

{\hat{b}}_{x}

is (Phillips 1986)

{\hat{b}}_{x} \Rightarrow {(\int_{0}^{1} W_{x} W_{x}^{'})}^{- 1} (\int_{0}^{1} W_{x} B),

where

W_{x}

is the vector standard Brownian motion weak limit of the standardized partial sum processes

n^{- 1 / 2} x_{⌊n \cdot⌋}

.

Suppose we orthogonalize the regressors

\{x_{k \cdot} = {(x_{k t})}_{t = 1}^{n} : k = 1, \dots, K\}

using the Gram-Schmidt process

\begin{matrix} z_{1 t} & = x_{1 t}, \\ z_{2 t} & = x_{2 t} - (x_{2 \cdot}^{'} x_{1 \cdot}) {(x_{1 \cdot}^{'} x_{1 \cdot})}^{- 1} x_{1 t}, \\ z_{3 t} & = x_{3 t} - (x_{3 \cdot}^{'} X_{a}) {(X_{a}^{'} X_{a})}^{- 1} x_{a t}, X_{a} : = [x_{1 \cdot}, x_{2 \cdot}] : = [x_{a \cdot}^{'}], etc . \end{matrix}

By standard weak convergence arguments we have

n^{- 1 / 2} z_{1 ⌊n \cdot⌋} \Rightarrow V_{1} (\cdot), n^{- 1 / 2} z_{2 ⌊n \cdot⌋} \Rightarrow V_{2} (\cdot), n^{- 1 / 2} z_{3 ⌊n \cdot⌋} \Rightarrow V_{3} (\cdot), etc .

Now let

z_{t} = {(z_{k t})}_{k = 1}^{K}

, and consider the regression

y_{t} = {\hat{b}}_{z K}^{'} z_{t} + {\hat{u}}_{t} .

(31)

The LS estimator

{\hat{b}}_{z K} = {[\sum_{t = 1}^{n} z_{t} z_{t}^{'}]}^{- 1} \sum_{t = 1}^{n} z_{t} y_{t}

has the limit

{\hat{b}}_{z K} \Rightarrow {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} (\int_{0}^{1} {\bar{V}}_{K} B) \equiv E_{K} : = {(e_{k})}_{k = 1}^{K},

where

{\bar{V}}_{K} = {(V_{k})}_{k = 1}^{K}

is a

K \times 1

vector. Thus, the empirical regression of

y_{t}

on

z_{t}

reproduces the first K terms in the representation of the limit Brownian motion B in terms of an orthogonalized coordinate system formed from K independent standard Brownian motions.
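The sample orthogonalization and the regression (31) can be sketched as follows, assuming simulated Gaussian random walks; the function name `gram_schmidt_regressors` and the sizes n = 300, K = 4 are hypothetical choices for illustration. Each column is residualized on the preceding original regressors, as in the Gram-Schmidt display above, so that \sum_{t} z_{t} z_{t}^{'} is diagonal and each coefficient in (31) reduces to a simple ratio.

```python
import numpy as np

def gram_schmidt_regressors(x):
    """Residualize column k of x on columns 1, ..., k-1 (sample Gram-Schmidt)."""
    n, K = x.shape
    z = np.empty((n, K), dtype=float)
    z[:, 0] = x[:, 0]
    for k in range(1, K):
        prev = x[:, :k]
        coef, *_ = np.linalg.lstsq(prev, x[:, k], rcond=None)
        z[:, k] = x[:, k] - prev @ coef
    return z

rng = np.random.default_rng(0)
n, K = 300, 4
x = np.cumsum(rng.standard_normal((n, K)), axis=0)   # K independent random walks
y = np.cumsum(rng.standard_normal(n))                # independent of x

z = gram_schmidt_regressors(x)
G = z.T @ z                                          # diagonal up to rounding error
b_z = (z.T @ y) / np.diag(G)                         # LS coefficients in (31)
fit_z = z @ b_z
b_x, *_ = np.linalg.lstsq(x, y, rcond=None)
fit_x = x @ b_x                                      # same column space, same fitted values
```

Because the columns of z span the same space as those of x, the two regressions give identical fitted values and residuals; only the coordinate system changes.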

Suppose now that we are interested in testing whether a linear combination of

b_{z K}

equals zero, viz.,

H_{0} : C_{K}^{'} b_{z K} = 0 vs . H_{1} : C_{K}^{'} b_{z K} \neq 0

with

C_{K} \in R^{K}

satisfying

C_{K}^{'} C_{K} = 1

. Again, three types of t-statistics are considered:

\begin{matrix} t_{b_{z K}} & = \frac{C_{K}^{'} {\hat{b}}_{z K}}{s_{b_{z K}}} = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{C_{K}^{'} [n^{- 1} \sum_{t = 1}^{n} {({\hat{u}}_{t})}^{2}] {(\sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K}\}}^{1 / 2}}, \end{matrix}

(32)

\begin{matrix} t_{b_{z K}}^{^{H A C}} & = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\hat{ω}}_{b_{z K}}} = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{C_{K}^{'} {(\sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} [n {\hat{lrvar}}_{H A C} (z_{t} {\hat{u}}_{t})] {(\sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K}\}}^{1 / 2}}, \end{matrix}

(33)

\begin{matrix} t_{b_{z K}}^{^{H A R}} & = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\overset{ˇ}{ω}}_{b_{z K}}} = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{C_{K}^{'} {(\sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} [n {\hat{lrvar}}_{H A R} (z_{t} {\hat{u}}_{t})] {(\sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K}\}}^{1 / 2}}, \end{matrix}

(34)

where

{\hat{lrvar}}_{H A C} (z_{t} {\hat{u}}_{t}) = \sum_{j = - M}^{M} k (\frac{j}{M}) [\frac{1}{n} \sum_{1 \leq t, t + j \leq n} z_{t} {\hat{u}}_{t} {\hat{u}}_{t + j} z_{t + j}^{'}] with M / n + 1 / M \to 0 as n \to \infty,

{\hat{lrvar}}_{H A R} (z_{t} {\hat{u}}_{t}) = \sum_{j = - (n - 1)}^{(n - 1)} k_{b} (\frac{j}{n}) [\frac{1}{n} \sum_{1 \leq t, t + j \leq n} z_{t} {\hat{u}}_{t} {\hat{u}}_{t + j} z_{t + j}^{'}] for some fixed b \in (0, 1],

k (\cdot)

is a kernel function,

k_{b} (\frac{j}{n}) = k (\frac{j}{n b}),

and

{\hat{u}}_{t} = y_{t} - {\hat{b}}_{z K}^{'} z_{t}

for

t = 1, \dots, n

.

The following theorem establishes the limiting distributions of these three t-statistics.

Theorem 3.

For fixed K, as

n \to \infty

,

(i)

\frac{1}{\sqrt{n}} t_{b_{z K}} \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K} \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2}\}}^{1 / 2}};

(ii) When

1 / M + M / n \to 0,

\sqrt{\frac{M}{n}} t_{b_{z K}}^{^{H A C}} \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} (\int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} {\bar{V}}_{K} {\bar{V}}_{K}^{'}) {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}\}}^{1 / 2}};

(iii)

t_{b_{z K}}^{^{H A R}} \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} H {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}\}}^{1 / 2}};

where

{\underset{̲}{W}}_{y K} (r) = B (r) - E_{K}^{'} {\bar{V}}_{K} (r)

and

H = \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\bar{V}}_{K} (r) {\underset{̲}{W}}_{y K} (r) {\underset{̲}{W}}_{y K} (q) {\bar{V}}_{K}^{'} (q) d r d q .

Remark 3.

As shown in Theorem 3,

t_{b_{z K}}

and

t_{b_{z K}}^{^{H A C}}

diverge at rate

O_{p} (\sqrt{n})

and

O_{p} (\sqrt{n / M})

, respectively. Hence, such tests indicate inevitable significance of the regressors when

n \to \infty

and

1 / M + M / n \to 0

. However, the HAR based t-statistic

t_{b_{z K}}^{^{H A R}}

is convergent in distribution, which leads to valid statistical testing when appropriate critical values from the limit distribution of

t_{b_{z K}}^{^{H A R}}

are used. Note that

B (r) \equiv B M (ω^{2}) \equiv ω W (r)

,

E_{K} = {(e_{k})}_{k = 1}^{K}

with

e_{k} = (\int_{0}^{1} B V_{k}) {(\int_{0}^{1} V_{k}^{2})}^{- 1} = ω (\int_{0}^{1} W V_{k}) {(\int_{0}^{1} V_{k}^{2})}^{- 1} for k = 1, 2, \dots,

where

W (\cdot)

is a standard Brownian motion. Hence, the nuisance parameter ω appearing in the numerator and denominator of the limiting distribution of

t_{b_{z K}}^{^{H A R}}

cancels. The limit distribution of

t_{b_{z K}}^{^{H A R}}

is therefore free of nuisance parameters.
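The cancellation of ω described in this remark can be written out explicitly. In the following worked sketch the symbols E_{K}^{W}, \underline{W}_{K}^{W} and H^{W} are notation introduced here for illustration only; they denote the quantities of Theorem 3 computed with the standard Brownian motion W in place of B = ω W. The regressor limit \bar{V}_{K} is built from standard Brownian motions and so does not involve ω.

```latex
\begin{aligned}
E_{K} &= \Big(\int_0^1 \bar V_K \bar V_K'\Big)^{-1} \int_0^1 \bar V_K B = \omega\, E_K^{W},
 \qquad E_K^{W} := \Big(\int_0^1 \bar V_K \bar V_K'\Big)^{-1} \int_0^1 \bar V_K W, \\
\underline{W}_{yK}(r) &= B(r) - E_K' \bar V_K(r)
 = \omega \big[ W(r) - E_K^{W\prime} \bar V_K(r) \big] =: \omega\, \underline{W}_{K}^{W}(r), \\
H &= \omega^{2} H^{W}, \qquad
 H^{W} := \int_0^1\!\!\int_0^1 k_b(r-q)\, \bar V_K(r)\, \underline{W}_{K}^{W}(r)\,
 \underline{W}_{K}^{W}(q)\, \bar V_K'(q)\, dr\, dq, \\
\frac{C_K' E_K}{\big\{ C_K' (\int_0^1 \bar V_K \bar V_K')^{-1} H (\int_0^1 \bar V_K \bar V_K')^{-1} C_K \big\}^{1/2}}
 &= \frac{\omega\, C_K' E_K^{W}}{\omega\, \big\{ C_K' (\int_0^1 \bar V_K \bar V_K')^{-1} H^{W} (\int_0^1 \bar V_K \bar V_K')^{-1} C_K \big\}^{1/2}} .
\end{aligned}
```

The factor ω appears once in the numerator and once, through ω² under the square root, in the denominator, so the ratio depends only on W and \bar{V}_{K}.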

Remark 4.

Even when

μ_{s} \sim_{d} i i d (0, 1)

, we have

E (y_{t} y_{t - j}) = E (\sum_{s = 1}^{t} μ_{s} \sum_{s = 1}^{t - j} μ_{s}) = t - j .

Thus

E (z_{t} y_{t} y_{t - j} z_{t - j}^{'}) = E (y_{t} y_{t - j}) E (z_{t} z_{t - j}^{'})

depends on t in a similar way. Therefore, as we discussed earlier, the usual regularity conditions employed in constructing HAC and HAR t-statistics cannot apply here.
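The moment E (y_{t} y_{t - j}) = t - j in this remark can be verified exactly for iid (0, 1) increments. A minimal sketch, in which the function name `random_walk_cov` and the values t = 5, j = 2 are illustrative assumptions:

```python
import numpy as np

def random_walk_cov(n):
    """Exact covariance matrix of y_t = mu_1 + ... + mu_t for iid (0, 1) increments."""
    L = np.tril(np.ones((n, n)))   # partial-sum operator: y = L @ mu
    return L @ L.T                 # Cov(y) = L L', with (s, t) entry min(s, t)

cov = random_walk_cov(6)
t, j = 5, 2
print(cov[t - 1, t - j - 1])       # E(y_t y_{t-j}) = t - j = 3.0
```

Since the entry depends on t and not only on the lag j, the second moments are not those of a stationary sequence, which is the point made in the remark.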

Remark 5.

In view of (30) and Theorem 4.3 in Phillips (1998),

{\underset{̲}{W}}_{y K} (r) \to 0

almost surely and uniformly as

K \to \infty

. We can expect that the rates of divergence of

t_{b_{z K}}

and

t_{b_{z K}}^{^{H A C}}

are greater in the case where

K \to \infty

than they are when K is fixed. Moreover, similar to the earlier finding in Theorem 2, the HAR statistic

t_{b_{z K}}^{^{H A R}}

will diverge at rate

O_{p} (\sqrt{K})

. Details are omitted to save space. Hence, fitted coefficients of the spurious random walk regressors would eventually be deemed significant when fixed critical values are employed in testing under all three t-statistics including

t_{b_{z K}}^{^{H A R}}

when both K,

n \to \infty

.

4. Simulations

This section reports simulations to investigate the performance in finite samples of the different t-statistics in spurious trend regressions, simple time trend regression, and spurious regression among stochastic trends.

We first examined spurious regression of a stochastic trend on time polynomials. Consider the standard Gaussian random walk

X_{t} = \sum_{s = 1}^{t} μ_{s}

, where

μ_{s} \sim_{d} i i d

N (0, 1)

. Orthogonal basis functions

{\{φ_{k} (\cdot)\}}_{k = 1}^{K},

where

φ_{k} (r) = \sqrt{2} sin [(k - 0.5) π r]

, were used as regressors and fitted time trend regressions of the form

X_{t} = φ_{K t}^{'} \hat{β} + {\hat{u}}_{t}

were run with

φ_{K t} = [φ_{1} (\frac{t}{n})

,…,

φ_{K} (\frac{t}{n})]^{'}

. We focus on the prototypical null hypothesis

H_{0} : β_{1} = 0

in what follows. In the construction of the HAC and HAR t-statistics, a uniform kernel function was employed.
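The design just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' simulation code: the function names `sine_basis` and `fit_trend_basis`, the seed, and the choice K = 5 are hypothetical.

```python
import numpy as np

def sine_basis(n, K):
    """n x K regressor matrix with (t, k) entry sqrt(2) * sin((k - 0.5) * pi * t / n)."""
    r = np.arange(1, n + 1)[:, None] / n
    k = np.arange(1, K + 1)[None, :]
    return np.sqrt(2.0) * np.sin((k - 0.5) * np.pi * r)

def fit_trend_basis(x, K):
    """LS fit of x on the first K sine basis functions; returns (beta_hat, residuals)."""
    Phi = sine_basis(len(x), K)
    beta, *_ = np.linalg.lstsq(Phi, x, rcond=None)
    return beta, x - Phi @ beta

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(200))   # standard Gaussian random walk, n = 200
beta, resid = fit_trend_basis(x, K=5)
print(beta[0])                            # fitted coefficient tested under H_0: beta_1 = 0
```

The residuals returned here are the inputs to the usual, HAC and HAR standard errors for the first coefficient.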

Figure 1 reports the kernel estimates of the probability densities for these t-statistics under different model scenarios based on 10,000 simulations. The first row of graphs in Figure 1 gives the results for the different t-statistics as the sample size n increases with fixed

K = 1

. It is evident that both the usual t-statistic and HAC t-statistic (with

M = ⌊n b⌋

and

b = n^{- 3 / 4}

) diverge as n increases and the HAC statistic diverges at a slower rate. In contrast, the HAR t-statistic (

b = 0.2

) is evidently convergent to a well-defined probability distribution as the sample size expands. These results clearly corroborate Theorem 1.

The second row of graphs in Figure 1 presents the estimated densities of the three t-statistics as K increases for a fixed sample size

n = 200

. As K increases, all three t-statistics are clearly divergent but at different rates. For each statistic the increase in dispersion as K increases is evident. The last row reports the results for the HAR t-statistic with

K = 1, 5, 20

and bandwidth coefficient

b = 0, 0.1, 0.4, 0.6, 0.8, 1

. As K increases with the bandwidth setting held fixed, the densities become progressively more dispersed. For fixed K, it is clear that the quantile is not a monotonic function of b. For

K = 1, 5

, when b is close to zero, the limiting distributions become more dispersed. When b is close to one, the limiting distributions also become more dispersed for all three choices of K. As explained in Sun (2004), for small or moderate K, when b is close to zero, the behavior of the t-statistic may be better captured by conventional limit theory without taking into account the persistence of the regression residuals. But when b is close to unity, we cannot expect the standard variance estimate to capture the strong autocorrelation. If we choose the kernel

k (x) = 1

and use the full sample (i.e., setting

b = 1

), the long run variance estimate equals zero by construction. We conjecture that for fixed K it may be possible to find an optimal bandwidth

b_{o p t} (K)

by following an approach similar to the method used in (Sun et al. 2011) that controls for size and power. From the shape of the densities in the last row of graphs in Figure 1, we would expect that any such optimal bandwidth

b_{o p t} (K)

will get closer to zero as K gets larger. Extension of robust testing techniques to machine learning regressions where K may be very large will likely require very careful bandwidth selection in significance testing that takes the magnitude of K into account.
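The statement above that the kernel k (x) = 1 with b = 1 produces a zero long run variance estimate follows from the LS normal equations: the scores φ_{1} (t / n) {\hat{u}}_{t} sum to zero, and the unweighted sum of all sample autocovariances equals the squared sum of the scores divided by n. A minimal numerical sketch, assuming K = 1 and an arbitrary seed and sample size:

```python
import numpy as np

n = 100
rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(n))                               # Gaussian random walk
phi = np.sqrt(2.0) * np.sin(0.5 * np.pi * np.arange(1, n + 1) / n)  # phi_1(t / n)
beta_hat = (phi @ x) / (phi @ phi)
v = phi * (x - beta_hat * phi)            # scores; the LS normal equation forces sum(v) = 0

gamma0 = np.dot(v, v) / n                 # lag-0 autocovariance, used only for scale
# k(x) = 1 with b = 1: every lag j = -(n - 1), ..., n - 1 receives weight one
lrv_full = sum(np.dot(v[: n - abs(j)], v[abs(j):]) for j in range(-(n - 1), n)) / n
print(gamma0, lrv_full)
```

In this sketch `lrv_full` is zero up to floating point error while `gamma0` is not, so a HAR statistic built with these settings would have a degenerate denominator.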

Figure 2 and Figure 3 present rejection frequencies of the three t-statistics in spurious trending regression when conventional critical values from the standard normal distribution at the 5% significance level are used. These frequencies are calculated based on 10,000 simulations with the sample size

n = 200

. A manifest feature in Figure 2 and Figure 3 is that the rejection frequencies increase faster towards one as the number of regressors K increases. This feature corroborates Theorem 2 and the simulation results in Figure 1, which reveal the progressive dispersion of the densities of the three t-statistics as K increases, suggesting that the rejection frequencies are increasing functions of K for each test.
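A rejection frequency of the kind reported in Figure 2 can be approximated as in the following sketch for the usual t-statistic with K = 1. The function name `rejection_frequency`, the seed and the number of replications are assumptions chosen to keep the example short; the output is not a reproduction of the figures.

```python
import numpy as np

def rejection_frequency(n=200, reps=2000, crit=1.96, seed=0):
    """Share of replications in which the usual t-test rejects H_0: beta_1 = 0."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    phi = np.sqrt(2.0) * np.sin(0.5 * np.pi * t / n)        # phi_1(t / n), K = 1
    X = np.cumsum(rng.standard_normal((reps, n)), axis=1)   # one random walk per row
    sxx = phi @ phi
    beta = X @ phi / sxx                                    # LS slope in each replication
    resid = X - np.outer(beta, phi)
    se = np.sqrt(np.mean(resid ** 2, axis=1) / sxx)         # usual standard error
    return np.mean(np.abs(beta / se) > crit)

freq = rejection_frequency()
print(freq)
```

Because the usual t-statistic is of order \sqrt{n} under the null, the computed frequency should lie far above the nominal 5% level.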

The surface displayed in Figure 3 also reveals the effect of the bandwidth choice on the rejection frequencies of the HAR test, in which the bandwidth b varies from 0 to 1 at step length 0.025. It is evident that the rejection frequency of the HAR test is a nonlinear function of b, especially when K is small. Optimal selection of a bandwidth

b_{o p t} (K)

that controls size and power of the HAR test is therefore possible. The value of

b_{o p t} (K)

should moderately deviate from zero when K is small and move towards zero faster as K increases. The findings in Figure 3 support the conjecture about the optimal bandwidth choice, which was discussed above based on the results in Figure 1.

Next, we consider a simple spurious linear trend regression of

X_{t}

on a linear time trend

t,

where

X_{t} = \sum_{s = 1}^{t} μ_{s}

and

μ_{s} \sim_{d} i i d

N (0, 1)

. Figure 4 reports the sampling densities for different t-statistics based on 10,000 simulations. The first row of graphs presents kernel estimates of densities of the t-statistics for sample sizes

n = 50, 100, 400, 800

. Again, the usual t-statistic and HAC statistic are divergent but at different rates. The HAR statistic is evidently convergent. The second row in Figure 4 provides results for the HAR statistic with different bandwidth choices. It is clear that the distributions become more dispersed as b moves close to zero or close to one. In this respect the findings are similar to those of Figure 1 when

K = 1

. In Figure 5, the rejection frequencies of the HAR statistic with various bandwidth values are reported, which are calculated based on 10,000 simulations with sample size n = 200 and critical values from the standard Normal distribution at 5% significance level. Rejection frequency again follows a U-shaped function of b. This curve suggests that, for simple spurious linear trend regression, the value of the optimal bandwidth should be around 0.3.

Last, we consider spurious regressions of a standard Gaussian random walk process

{\{X_{t}\}}_{t = 1}^{n}

on independent Gaussian random walks

{\{Z_{t}\}}_{t = 1}^{n}

, where

X_{t} = \sum_{s = 1}^{t} μ_{s}

with

μ_{s} \sim_{d} i i d

N (0, 1)

and

Z_{t} = \sum_{s = 1}^{t} μ_{s}^{(z)}

with

μ_{s}^{(z)} \sim_{d} i i d

N (0, I_{K}) .

Figure 6 shows kernel estimates of the probability densities for these t-statistics under different scenarios based on 10,000 simulations. Figure 7 and Figure 8 report the simulated rejection frequencies of these t-statistics for various values of K and b. The patterns exhibited are evidently similar to those in Figure 1, Figure 2 and Figure 3. The same qualitative observations made for Figure 1, Figure 2 and Figure 3 therefore apply to these regressions.

5. Conclusions

Robust inference in trend regression poses many challenges. Not least of these is the critical difficulty that a trending time series trajectory can be represented in a coordinate system by many different functions, be they relevant or irrelevant, stochastic or non-stochastic. Valid significance testing in this context needs to allow for the fact that trend regression formulations inevitably fail to capture all the subtleties of reality and to a greater or lesser extent, therefore, involve some spurious components. The practical implications of this message are powerfully stated in the header by David Hendry that opens this article.

The present work has studied the asymptotic and finite sample performance of simple t statistics that seek to achieve some degree of robustness to misspecification in such settings. The analysis is based on three classic examples of spurious regressions, including regression of stochastic trends on time polynomials, regression of stochastic trends on a simple linear trend, and regression among independent random walks. Concordant with existing theory, the usual t-statistic and HAC standardized t-statistic both diverge and imply ‘nonsense relationships’ with probability going to one as the sample size tends to infinity. Also concordant with existing theory, when the number of regressors K is fixed, the HAR standardized t-statistics converge to non-degenerate distributions free from nuisance parameters, thereby controlling size and leading to valid significance tests in these spurious regressions. These findings reinforce the optimism expressed in earlier work that fixed-b methods of correction may fix inference problems in spurious regressions.

But when the number of trend regressors

K \to \infty

, the results are different. First, rates of divergence of the usual t-statistic and HAC t-test are greater by the factor

\sqrt{K}

than when K is fixed. Second, the fixed-b HAR t-statistic is no longer convergent and instead diverges at the rate

\sqrt{K},

leading to spurious inference of significance when

K \to \infty

. Thus, in the case of models with expanding regressor sets, none of these standard statistics produce valid consistent tests with controllable size. The failure of the HAR test in this setting is particularly important, given the growing use of machine learning algorithms in econometric work where large numbers of regressors are a normal feature in initial specifications. Future research might usefully focus on methods of controlling size and achieving consistent significance tests in such settings. A further area of research that is relevant in practical work involves extension of the present results to regressions that involve more general forms of stochastic trend processes, including higher-order integrated and fractionally integrated processes, as well as unbalanced regressions. The methods of the present paper should be useful in developing asymptotic analyses of such potentially spurious regressions.

Author Contributions

Conceptualization: P.C.B.P. 80%, X.W. 10%, and Y.Z. 10%; Methodology: P.C.B.P. 60%, X.W. 20%, and Y.Z. 20%; Programming: P.C.B.P. 0%, X.W. 50%, and Y.Z. 50%; Writing: P.C.B.P. 60%, X.W. 20%, and Y.Z. 20%; Editing: P.C.B.P. 60%, X.W. 20%, and Y.Z. 20%.

Funding

Phillips acknowledges research support from the NSF under Grant No. SES 18-50860 and a Kelly Fellowship at the University of Auckland. Wang acknowledges the support from the Hong Kong Research Grants Council General Research Fund under No. 14503718. Zhang acknowledges financial support from the National Natural Science Foundation of China under Projects No. 71401166, 71973141, and 7187303.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

$L_{2} [0, 1]$	space of square integrable functions on $[0, 1]$ .
⟹	weak convergence.
$⌊\cdot⌋$	integer part of.
$: =$	definitional equality.
$o_{p} (1)$	tends to zero in probability.
$o_{a . s} (1)$	tends to zero almost surely.
$O_{p} (1)$	bounded in probability.
$\overset{p}{⟶}$	converge in probability.
$r \land s$	min $(r, s)$ .
∼	asymptotic equivalence.
≡	distributional equivalence.
$\sim_{d}$	distributed as

Appendix A. Proofs of Theorems in Section 2

Lemma A1.

For any

r \in [0, 1]

, let

{\underset{̲}{B}}_{φ_{K}} (r) = B (r) - Z_{K} {\bar{φ}}_{K} (r) = \sum_{k = K + 1}^{\infty} \sqrt{λ_{k}} φ_{k} (r) ξ_{k}

be the

L_{2}

-projection residual of B on

φ_{K} (r)

, with

φ_{k} (r) = \sqrt{2} sin [(k - 1 / 2) π r]

,

λ_{k} = ω^{2} / [{(k - 1 / 2)}^{2} π^{2}]

and

ξ_{k} \equiv

iid

N (0, 1)

. When

K \to \infty,

(i): ${\underset{̲}{B}}_{φ_{K}} (r) \sim O_{p} (1 / \sqrt{K})$ uniformly in $r \in [0, 1]$ ,
(ii): $K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} = ω^{2} / π^{2} + o_{p} (1)$ with $ω^{2} = 2 π f_{μ} (0)$ ,
(iii): $\int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} {[C_{K}^{'} {\bar{φ}}_{K}]}^{2} \sim O_{p} (1 / K)$ ,
(iv): $\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q \sim O_{p} (1 / K) .$

Lemma A2.

When

n \to \infty

,

K \to \infty

,

1 / M + M / n \to 0

, and

K^{5 / 2} / n + K^{3 / 2} / n^{\frac{1}{2} - \frac{1}{p}} \to 0

,

(i): $C_{K}^{'} {\hat{β}}_{K} / \sqrt{n} = C_{K}^{'} {\hat{α}}_{K} = C_{K}^{'} Z_{K} + o_{p} (1),$
(ii): $K (s_{b}^{2} C_{K}^{'} {(Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}) = K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} + o_{p} (1),$
(iii): $\frac{K}{M} ({\hat{ω}}_{C_{K}^{'} β_{K}}^{2}) = (\int_{- 1}^{1} k (s) d s) K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} {[C_{K}^{'} {\bar{φ}}_{K}]}^{2} + o_{p} (1),$
(iv): $\frac{K}{n} ({\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2}) = K \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q + o_{p} (1),$

where $Z_{K} = {(z_{k})}_{k = 1}^{K}$ are the random coefficients in the orthonormal representation (5), and $s_{b}^{2}$, ${\hat{ω}}_{C_{K}^{'} β_{K}}^{2}$ and ${\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2}$ are defined as in formulae (8), (10) and (13), respectively.

Proof of Lemma A1.

(i) It is easy to see that

E [{\underset{̲}{B}}_{φ_{K}} (r)] = 0

, and

Var [{\underset{̲}{B}}_{φ_{K}} (r)] = \sum_{k = K + 1}^{\infty} λ_{k} φ_{k}^{2} (r) = O (\sum_{k = K + 1}^{\infty} λ_{k}) = O (\sum_{k = K + 1}^{\infty} \frac{1}{k^{2}}) = O (\int_{K}^{\infty} \frac{1}{k^{2}} d k) = O (\frac{1}{K})

uniformly in r. So, by Chebyshev’s inequality,

{\underset{̲}{B}}_{φ_{K}} (r) = O_{p} (1 / \sqrt{K})

uniformly in r.
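The O (1 / K) tail bound used in part (i), and the constant ω² / π² in part (ii), can be illustrated numerically. The sketch below relies on the closed form \sum_{k \geq 1} λ_{k} = ω² / 2, which follows from \sum_{k \geq 1} {(k - 1 / 2)}^{- 2} = π² / 2; the function name `scaled_tail_variance` is hypothetical.

```python
import math

def scaled_tail_variance(K, omega2=1.0):
    """K * sum_{k > K} lambda_k, with lambda_k = omega2 / ((k - 1/2)^2 * pi^2).

    The infinite tail is computed as the closed-form total omega2 / 2
    minus the first K terms of the series.
    """
    head = sum(omega2 / ((k - 0.5) ** 2 * math.pi ** 2) for k in range(1, K + 1))
    return K * (omega2 / 2.0 - head)

for K in (1, 10, 100, 1000):
    print(K, scaled_tail_variance(K), 1.0 / math.pi ** 2)
```

By orthonormality of the basis, the tail sum equals the expectation of \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2}, so the scaled values settling near ω² / π² is consistent with part (ii) of the lemma.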

(ii) See Phillips (2002), Lemma 3.1.

(iii)–(iv) The proofs of (iii) and (iv) are similar. Hence, only the proof of (iv) is given below. By noticing that

ξ_{k} \sim

i i d

N (0, 1)

and for each

k = 1, \dots, K

the functions

φ_{k} (r) = \sqrt{2} sin [(k - 1 / 2) π r]

are bounded uniformly in r, we have

\begin{matrix} E [{\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q)] & = E [\sum_{k, l = K + 1}^{\infty} λ_{k}^{1 / 2} λ_{l}^{1 / 2} φ_{k} (r) φ_{l} (q) ξ_{k} ξ_{l}] = \sum_{k = K + 1}^{\infty} λ_{k} φ_{k} (r) φ_{k} (q) \\ = O (\sum_{k = K + 1}^{\infty} λ_{k}) = O (\frac{1}{K}) uniformly in r \in [0, 1] and q \in [0, 1] . \end{matrix}

Therefore,

\begin{matrix} E \{\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q\} \\ = \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) E \{{\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q)\} [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q \\ = O (\frac{1}{K}) \int_{0}^{1} \int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| |C_{K}^{'} {\bar{φ}}_{K} (q)| d r d q \\ = O (\frac{1}{K}) {(\int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| d r)}^{2} = O (\frac{1}{K}), \end{matrix}

since

k_{b} (r - q)

is uniformly bounded and

{(\int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| d r)}^{2} \leq \int_{0}^{1} {[C_{K}^{'} {\bar{φ}}_{K} (r)]}^{2} d r = C_{K}^{'} (\int_{0}^{1} {\bar{φ}}_{K} (r) {\bar{φ}}_{K}^{'} (r) d r) C_{K} = C_{K}^{'} C_{K} = 1,

\begin{matrix} {(\int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| d r)}^{2} & \geq {(\int_{0}^{1} C_{K}^{'} {\bar{φ}}_{K} (r) d r)}^{2} = {(\int_{0}^{1} \sum_{k = 1}^{K} c_{k} φ_{k} (r) d r)}^{2} \\ = {(\sum_{k = 1}^{K} c_{k} {\frac{- \sqrt{2} cos [(k - 1 / 2) π r]}{(k - 1 / 2) π}|}_{0}^{1})}^{2} = {(\sum_{k = 1}^{K} \frac{\sqrt{2} c_{k}}{(k - 1 / 2) π})}^{2} . \end{matrix}

Further

\begin{matrix} E [{\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) {\underset{̲}{B}}_{φ_{K}} (s) {\underset{̲}{B}}_{φ_{K}} (τ)] \\ = E (\sum_{k = K + 1}^{\infty} λ_{k}^{2} φ_{k} (r) φ_{k} (q) φ_{k} (s) φ_{k} (τ) ξ_{k}^{4} + \sum_{\begin{matrix} h, k = K + 1 \\ h \neq k \end{matrix}}^{\infty} λ_{k} λ_{h} φ_{k} (r) φ_{k} (q) φ_{h} (s) φ_{h} (τ) ξ_{k}^{2} ξ_{h}^{2}) \\ + E (\sum_{\begin{matrix} l, k = K + 1 \\ l \neq k \end{matrix}}^{\infty} λ_{k} λ_{l} φ_{k} (r) φ_{k} (s) φ_{l} (q) φ_{l} (τ) ξ_{k}^{2} ξ_{l}^{2} + \sum_{\begin{matrix} l, k = K + 1 \\ l \neq k \end{matrix}}^{\infty} λ_{k} λ_{l} φ_{k} (r) φ_{k} (τ) φ_{l} (q) φ_{l} (s) ξ_{k}^{2} ξ_{l}^{2}) \\ = 3 \sum_{k = K + 1}^{\infty} λ_{k}^{2} φ_{k} (r) φ_{k} (q) φ_{k} (s) φ_{k} (τ) + \sum_{\begin{matrix} h, k = K + 1 \\ h \neq k \end{matrix}}^{\infty} λ_{k} λ_{h} φ_{k} (r) φ_{k} (q) φ_{h} (s) φ_{h} (τ) \\ + \sum_{\begin{matrix} l, k = K + 1 \\ l \neq k \end{matrix}}^{\infty} λ_{k} λ_{l} φ_{k} (r) φ_{k} (s) φ_{l} (q) φ_{l} (τ) + \sum_{\begin{matrix} l, k = K + 1 \\ l \neq k \end{matrix}}^{\infty} λ_{k} λ_{l} φ_{k} (r) φ_{k} (τ) φ_{l} (q) φ_{l} (s) \\ = \sum_{k = K + 1}^{\infty} λ_{k} φ_{k} (r) φ_{k} (q) \sum_{h = K + 1}^{\infty} λ_{h} φ_{h} (s) φ_{h} (τ) + \sum_{k = K + 1}^{\infty} λ_{k} φ_{k} (r) φ_{k} (s) \sum_{l = K + 1}^{\infty} λ_{l} φ_{l} (q) φ_{l} (τ) \\ + \sum_{k = K + 1}^{\infty} λ_{k} φ_{k} (r) φ_{k} (τ) \sum_{l = K + 1}^{\infty} λ_{l} φ_{l} (q) φ_{l} (s) \\ = 3 \times O (\frac{1}{K}) \times O (\frac{1}{K}) = O (\frac{1}{K^{2}}) uniformly in (r, q, s, τ) \in {[0, 1]}^{4} . \end{matrix}

Therefore,

\begin{matrix} E \{{(\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q)}^{2}\} \\ = O (\frac{1}{K^{2}}) \times \int_{0}^{1} \int_{0}^{1} \int_{0}^{1} \int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| |C_{K}^{'} {\bar{φ}}_{K} (q)| |C_{K}^{'} {\bar{φ}}_{K} (s)| |C_{K}^{'} {\bar{φ}}_{K} (τ)| d r d q d s d τ \\ = O (\frac{1}{K^{2}}) \times {(\int_{0}^{1} |C_{K}^{'} {\bar{φ}}_{K} (r)| d r)}^{4} = O (\frac{1}{K^{2}}) . \end{matrix}

Finally, we get

Var (\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q) = O (\frac{1}{K^{2}}) .

□

Proof of Lemma A2.

(i) See Phillips (2002), Lemma 2.2.

(ii) Using the Hungarian strong approximation (e.g., Csörgö and Horváth 1993), we can construct an expanded probability space with a Brownian motion

B (\cdot)

for which

sup_{1 \leq t \leq n} |X_{t} - B (t)| = o_{a . s} (n^{1 / p}),

or

sup_{1 \leq t \leq n} |\frac{X_{t}}{\sqrt{n}} - B (\frac{t}{n})| = o_{a . s} (\frac{1}{n^{1 / 2 - 1 / p}}) .

Applying the matrix norm

∥A∥ = {max}_{i} \sum_{j = 1}^{K} |a_{i j}|

, Phillips (2002) proved that

\frac{1}{n} \sum_{t = 1}^{n} φ_{K t} φ_{K t}^{'} = I_{K} + O (\frac{K}{n}),

and

{\hat{β}}_{K} / \sqrt{n} = {\hat{α}}_{K} = Λ_{K}^{1 / 2} {\tilde{ξ}}_{K} + O_{a . s} (\frac{K}{n} + \frac{1}{n^{1 / 2 - 1 / p}}) = Z_{K} + O_{a . s} (\frac{K}{n} + \frac{1}{n^{1 / 2 - 1 / p}}),

where

{\tilde{ξ}}_{K} = {(ξ_{k})}_{k = 1}^{K},

and

Z_{K} = {(z_{k})}_{k = 1}^{K}

are the random coefficients in the orthonormal representation (5). Therefore, we have

\begin{matrix} sup_{1 \leq t \leq n} |\frac{{\hat{u}}_{t}}{\sqrt{n}} - {\underset{̲}{B}}_{φ_{K}} (\frac{t}{n})| & = sup_{1 \leq t \leq n} |(\frac{X_{t}}{\sqrt{n}} - {\hat{α}}_{K}^{'} φ_{K t}) - (B (\frac{t}{n}) - Z_{K}^{'} φ_{K t})| \\ \leq sup_{1 \leq t \leq n} |\frac{X_{t}}{\sqrt{n}} - B (\frac{t}{n})| + sup_{1 \leq t \leq n} |φ_{K t}^{'} (Z_{K} - {\hat{α}}_{K})| \\ \leq o_{a . s} (\frac{1}{n^{1 / 2 - 1 / p}}) + sup_{1 \leq t \leq n} ∥φ_{K t}^{'}∥ ∥Z_{K} - {\hat{α}}_{K}∥ \\ = o_{a . s} (\frac{1}{n^{1 / 2 - 1 / p}}) + O_{a . s} (\frac{K}{n} + \frac{1}{n^{1 / 2 - 1 / p}}) sup_{1 \leq t \leq n} ∥φ_{K t}^{'}∥ \\ = O_{a . s} (\frac{K^{2}}{n} + \frac{K}{n^{1 / 2 - 1 / p}}), \end{matrix}

as

{sup}_{1 \leq t \leq n} ∥φ_{K t}^{'}∥ = {sup}_{1 \leq t \leq n} \sqrt{2} \sum_{k = 1}^{K} |sin [(k - 1 / 2) π t / n]| = O (K)

. The second inequality comes from Hölder’s inequality. Hence, when

K^{5 / 2} / n + K^{3 / 2} / n^{1 / 2 - 1 / p} \to 0

, we have

\begin{matrix} \frac{K}{n} s_{b}^{2} & = \frac{K}{n} \sum_{t = 1}^{n} {(\frac{{\hat{u}}_{t}}{\sqrt{n}})}^{2} = \frac{1}{n} \sum_{t = 1}^{n} {(\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{t}{n}) + O_{a . s} (\frac{K^{5 / 2}}{n} + \frac{K^{3 / 2}}{n^{1 / 2 - 1 / p}}))}^{2} \\ = \frac{1}{n} \sum_{t = 1}^{n} {(\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{t}{n}) + o_{a . s} (1))}^{2} = \frac{K}{n} \sum_{t = 1}^{n} {({\underset{̲}{B}}_{φ_{K}} (\frac{t}{n}))}^{2} + o_{a . s} (1) \\ = K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} (r) d r + o_{p} (1) . \end{matrix}

It is straightforward to see that when

K / n \to 0,

C_{K}^{'} {(\frac{1}{n} Φ_{K}^{'} Φ_{K})}^{- 1} C_{K} = C_{K}^{'} (I_{K} + O (\frac{K}{n})) C_{K} = C_{K}^{'} C_{K} + o (1) = 1 + o (1) .

Therefore, when

K^{5 / 2} / n + K^{3 / 2} / (n^{1 / 2 - 1 / p}) \to 0,

K (s_{b}^{2} C_{K}^{'} {(Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}) = \frac{K}{n} s_{b}^{2} (C_{K}^{'} {(\frac{1}{n} Φ_{K}^{'} Φ_{K})}^{- 1} C_{K}) = K \int_{0}^{1} {\underset{̲}{B}}_{φ_{K}}^{2} + o_{p} (1) .
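
The two facts about the sine basis used in part (ii), namely that the sample Gram matrix is within O(K/n) of the identity in the maximum-row-sum norm and that the row sums of |φ_{Kt}| are O(K), can be verified directly. The following is a minimal sketch under the assumption φ_k(r) = √2 sin[(k − 1/2)πr]; the function names are hypothetical.

```python
import numpy as np

def sine_basis(n, K):
    """n x K matrix whose t-th row is (phi_1(t/n), ..., phi_K(t/n))."""
    t = np.arange(1, n + 1)[:, None] / n
    k = np.arange(1, K + 1)[None, :]
    return np.sqrt(2.0) * np.sin((k - 0.5) * np.pi * t)

def gram_deviation(n, K):
    """Max-row-sum norm of (1/n) Phi'Phi - I_K."""
    Phi = sine_basis(n, K)
    G = Phi.T @ Phi / n
    return float(np.abs(G - np.eye(K)).sum(axis=1).max())

if __name__ == "__main__":
    for n, K in [(200, 5), (200, 20), (2000, 20)]:
        Phi = sine_basis(n, K)
        # Compare the deviation with K/n and the largest row sum with sqrt(2)*K.
        print(n, K, gram_deviation(n, K), K / n,
              np.abs(Phi).sum(axis=1).max(), np.sqrt(2) * K)
```

For this particular basis a short trigonometric calculation indicates that every entry of the deviation matrix has magnitude 1/n when K ≤ n, so the norm equals K/n, in line with the O(K/n) order quoted from (Phillips 2002).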

(iii), (iv) The proofs of (iii) and (iv) are similar, so only (iv) is proved here. When

K^{5 / 2} / n + K^{3 / 2} / n^{1 / 2 - 1 / p} \to 0,

\begin{matrix} \frac{K}{n^{2}} C_{K}^{'} {\hat{lrvar}}_{HAR} ({\hat{u}}_{t} φ_{K t}) C_{K} \\ = \frac{1}{n^{2}} \sum_{j = - n + 1}^{n - 1} k_{b} (\frac{j}{n}) \frac{K}{n} \sum_{1 \leq t, t + j \leq n} {\hat{u}}_{t} {\hat{u}}_{t + j} C_{K}^{'} φ_{K t} φ_{K t + j}^{'} C_{K} \\ = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) \frac{\sqrt{K} {\hat{u}}_{t}}{\sqrt{n}} \frac{\sqrt{K} {\hat{u}}_{s}}{\sqrt{n}} C_{K}^{'} φ_{K t} φ_{K s}^{'} C_{K} \\ = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) (\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{t}{n}) + o_{a . s} (1)) (\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{s}{n}) + o_{a . s} (1)) C_{K}^{'} φ_{K t} φ_{K s}^{'} C_{K} \\ = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) (\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{t}{n})) (\sqrt{K} {\underset{̲}{B}}_{φ_{K}} (\frac{s}{n})) C_{K}^{'} φ_{K t} φ_{K s}^{'} C_{K} + o_{a . s} (1) \\ = K \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q + o_{p} (1) . \end{matrix}

Therefore, when

K^{5 / 2} / n + K^{3 / 2} / (n^{1 / 2 - 1 / p}) \to 0,

\begin{matrix} \frac{K}{n} ({\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2}) & = C_{K}^{'} {(\frac{1}{n} Φ_{K}^{'} Φ_{K})}^{- 1} (\frac{K}{n^{2}} {\hat{lrvar}}_{HAR} ({\hat{u}}_{t} φ_{K t})) {(\frac{1}{n} Φ_{K}^{'} Φ_{K})}^{- 1} C_{K} \\ = C_{K}^{'} (\frac{K}{n^{2}} {\hat{lrvar}}_{HAR} ({\hat{u}}_{t} φ_{K t})) C_{K} + o_{a . s .} (1) \\ = K \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q + o_{p} (1) . \end{matrix}

□
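
For intuition about the joint rate condition used throughout this proof, suppose (purely as an illustrative parametrization, not one imposed in the lemma) that K = n^{γ} for some γ > 0, and that p > 2 so that n^{1/2 − 1/p} diverges. Then

\frac{K^{5 / 2}}{n} = n^{5 γ / 2 - 1}, \frac{K^{3 / 2}}{n^{1 / 2 - 1 / p}} = n^{3 γ / 2 - 1 / 2 + 1 / p},

and both terms vanish if and only if

γ < \frac{2}{5} and γ < \frac{1}{3} (1 - \frac{2}{p}) .

Since (1 − 2/p)/3 < 1/3 < 2/5, the second restriction is the binding one. The condition therefore reduces to K = o(n^{(1 − 2/p)/3}), which approaches K = o(n^{1/3}) as p grows.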

Proof of Theorem 1.

(i) and (ii) The proofs are similar to those in (Phillips 1998) and are omitted.

(iii) From (Phillips 1998), when

n \to \infty

and K is fixed,

n^{- 1 / 2} {\hat{β}}_{K} = {\hat{α}}_{K} \Rightarrow Z_{K}

. Letting

φ_{K t} = {(φ_{1} (t / n), \dots, φ_{K} (t / n))}^{'}

, we have

\frac{{\hat{u}}_{⌊n \cdot⌋}}{\sqrt{n}} = \frac{1}{\sqrt{n}} (X_{⌊n \cdot⌋} - {\hat{β}}_{K}^{'} φ_{K ⌊n \cdot⌋}) \Rightarrow B (\cdot) - Z_{K}^{'} {\bar{φ}}_{K} (\cdot) : = {\underset{̲}{B}}_{φ_{K}} (\cdot) .

The scaled long run variance estimator can be written as

\begin{matrix} \frac{1}{n^{2}} {\hat{lrvar}}_{HAR} ({\hat{u}}_{t} φ_{K t}) & = \frac{1}{n^{2}} \sum_{j = - n + 1}^{n - 1} k_{b} (\frac{j}{n}) \frac{1}{n} \sum_{1 \leq t, t + j \leq n} {\hat{u}}_{t} {\hat{u}}_{t + j} φ_{K t} φ_{K t + j}^{'} \\ = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{{\hat{u}}_{s}}{\sqrt{n}} φ_{K t} φ_{K s}^{'} \\ \Rightarrow \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) {\bar{φ}}_{K} (r) {\bar{φ}}_{K}^{'} (q) d r d q . \end{matrix}

Noticing that

\frac{1}{n} Φ_{K}^{'} Φ_{K} = \frac{1}{n} \sum_{t = 1}^{n} φ_{K t} φ_{K t}^{'} \to \int_{0}^{1} {\bar{φ}}_{K} (r) {\bar{φ}}_{K}^{'} (r) d r = I_{K},

it follows that

\begin{matrix} \frac{1}{n} {\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2} & = C_{K}^{'} {\{\frac{1}{n} Φ_{K}^{'} Φ_{K}\}}^{- 1} (\frac{1}{n^{2}} {\hat{lrvar}}_{HAR} ({\hat{u}}_{t} φ_{K t})) {\{\frac{1}{n} Φ_{K}^{'} Φ_{K}\}}^{- 1} C_{K} \\ \Rightarrow C_{K}^{'} \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) {\bar{φ}}_{K} (r) {\bar{φ}}_{K}^{'} (q) d r d q C_{K} \\ = \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q . \end{matrix}

Therefore,

\begin{matrix} t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} & = \frac{C_{K}^{'} {\hat{β}}_{K}}{{\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}} = \frac{n^{- 1 / 2} C_{K}^{'} {\hat{β}}_{K}}{{\{n^{- 1} {\overset{ˇ}{ω}}_{C_{K}^{'} β_{K}}^{2}\}}^{1 / 2}} \\ \Rightarrow \frac{C_{K}^{'} Z_{K}}{{[\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{B}}_{φ_{K}} (r) {\underset{̲}{B}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q]}^{1 / 2}} . \end{matrix}

Let

Z_{K}^{W} = Z_{K} / ω = \int_{0}^{1} W {\bar{φ}}_{K}

,

W (r) = B (r) / ω \equiv B M (1)

,

ω^{2} = 2 π f_{μ} (0)

, and

{\underset{̲}{W}}_{φ_{K}} (r) = {\underset{̲}{B}}_{φ_{K}} (r) / ω = W (r) - {(Z_{K}^{W})}^{'} {\bar{φ}}_{K} (r) .

It then follows immediately that

t_{_{C_{K}^{'} β_{K}}}^{^{H A R}} \Rightarrow \frac{C_{K}^{'} Z_{K}^{W}}{{[\int_{0}^{1} \int_{0}^{1} k_{b} (r - q) {\underset{̲}{W}}_{φ_{K}} (r) {\underset{̲}{W}}_{φ_{K}} (q) [C_{K}^{'} {\bar{φ}}_{K} (r)] [C_{K}^{'} {\bar{φ}}_{K} (q)] d r d q]}^{1 / 2}} .

□

Appendix B. Derivations Leading to (23)–(28)

Lemma A3.

For the regression model (17) let

B (\cdot) \equiv B M (ω^{2})

with

ω^{2} = 2 π f_{μ} (0) > 0

. Irrespective of whether a is zero or not, when

n \to \infty

and

1 / M + M / n \to 0

, the following results hold:

(i): for $r \in [0, 1]$ ,

$\frac{{\hat{u}}_{⌊n r⌋}}{\sqrt{n}} \Rightarrow B (r) - 3 (\int_{0}^{1} s B (s) d s) r : = \underset{̲}{B} (r);$
(ii): $n^{2} {(s_{a})}^{2} \Rightarrow 3 \int_{0}^{1} {\underset{̲}{B}}^{2};$
(iii): $\frac{n^{2}}{M} {({\hat{ω}}_{a})}^{2} \Rightarrow 9 \int_{- 1}^{1} k (s) d s \int_{0}^{1} r^{2} {\underset{̲}{B}}^{2};$
(iv): $n {({\overset{ˇ}{ω}}_{a})}^{2} \Rightarrow 9 \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{B} (r) \underset{̲}{B} (q) r q d r d q;$

where $s_{a}$ , ${\hat{ω}}_{a}$ , ${\overset{ˇ}{ω}}_{a}$ are defined as in (20)–(22), respectively, ${\hat{u}}_{t} = X_{t} - \hat{a} t$ for $t = 1, \dots, n$ , $k (\cdot)$ is a kernel function and $k_{b} (\frac{j}{n}) = k (\frac{j}{n b})$ .

Proof of Lemma A3.

(i) Using the functional law and continuous mapping it is straightforward to obtain

\sqrt{n} (\hat{a} - a) = \frac{n^{- 5 / 2} \sum_{t = 1}^{n} t X_{t}^{0}}{n^{- 3} \sum_{t = 1}^{n} t^{2}} \Rightarrow \frac{\int_{0}^{1} s B (s) d s}{\int_{0}^{1} s^{2} d s} = 3 \int_{0}^{1} s B (s) d s .

Therefore, for any

r \in [0, 1]

,

\begin{matrix} \frac{{\hat{u}}_{⌊n r⌋}}{\sqrt{n}} & = \frac{X_{⌊n r⌋} - \hat{a} \cdot ⌊n r⌋}{\sqrt{n}} = \frac{X_{⌊n r⌋}^{0}}{\sqrt{n}} - \sqrt{n} (\hat{a} - a) \frac{⌊n r⌋}{n} \\ \Rightarrow B (r) - 3 (\int_{0}^{1} s B (s) d s) r : = \underset{̲}{B} (r) . \end{matrix}
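
As a worked check on the scale of the slope limit (added here for illustration, using only the covariance E[B(s)B(q)] = ω² min(s, q)), note that 3∫ sB(s)ds is a linear functional of a Gaussian process and so is mean-zero Gaussian, with variance

Var (3 \int_{0}^{1} s B (s) d s) = 9 ω^{2} \int_{0}^{1} \int_{0}^{1} s q \min (s, q) d s d q = 18 ω^{2} \int_{0}^{1} s (\int_{0}^{s} q^{2} d q) d s = 18 ω^{2} \int_{0}^{1} \frac{s^{4}}{3} d s = \frac{6 ω^{2}}{5} .

Hence the slope limit is N(0, 6ω²/5), which makes explicit that the fitted trend coefficient is of order n^{-1/2} when a = 0.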

(ii) From the expression of

s_{a}

given in (20), the following is immediate

\begin{matrix} n^{2} {(s_{a})}^{2} & = n^{2} [\frac{1}{n} \sum_{t = 1}^{n} {(X_{t} - \hat{a} t)}^{2}] {(\sum_{t = 1}^{n} t^{2})}^{- 1} = [\frac{1}{n} \sum_{t = 1}^{n} {(\frac{{\hat{u}}_{t}}{\sqrt{n}})}^{2}] {(\frac{1}{n^{3}} \sum_{t = 1}^{n} t^{2})}^{- 1} \\ \Rightarrow 3 \int_{0}^{1} {\underset{̲}{B}}^{2} (s) d s . \end{matrix}

(iii) As

1 / M + M / n \to 0

when

n \to \infty

, we have for any

|j| \leq M

and

r \in [0, 1]

\frac{⌊n r⌋ + j}{n} = r + O (\frac{M}{n}) \to r as n \to \infty .

Therefore, from the continuous mapping theorem, it follows that

\begin{matrix} \frac{1}{M} \frac{1}{n^{3}} {\hat{lrvar}}_{H A C} (t {\hat{u}}_{t}) & = \frac{1}{M} \sum_{j = - M}^{M} k (\frac{j}{M}) \frac{1}{n} \sum_{1 \leq t, t + j \leq n} \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t + j}}{\sqrt{n}} \frac{t}{n} \frac{t + j}{n} \\ \Rightarrow \int_{- 1}^{1} k (s) d s \int_{0}^{1} \underset{̲}{B} {(r)}^{2} r^{2} d r . \end{matrix}

Hence,

\begin{matrix} \frac{n^{2}}{M} {({\hat{ω}}_{a})}^{2} & = {(\frac{1}{n^{3}} \sum_{t = 1}^{n} t^{2})}^{- 1} [\frac{1}{M} \frac{1}{n^{3}} {\hat{lrvar}}_{H A C} (t {\hat{u}}_{t})] {(\frac{1}{n^{3}} \sum_{t = 1}^{n} t^{2})}^{- 1} \\ \Rightarrow 9 \int_{- 1}^{1} k (s) d s \int_{0}^{1} \underset{̲}{B} {(r)}^{2} r^{2} d r . \end{matrix}

(iv) For the HAR based test given in (22), we have

\begin{matrix} {\hat{lrvar}}_{H A R} (t {\hat{u}}_{t}) & = \sum_{j = - n + 1}^{n - 1} k_{b} (\frac{j}{n}) (\frac{1}{n} \sum_{1 \leq t, t + j \leq n} t {\hat{u}}_{t} {\hat{u}}_{t + j} (t + j)) \\ = \frac{1}{n} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) {\hat{u}}_{t} {\hat{u}}_{s} t s . \end{matrix}

By continuous mapping

\begin{matrix} \frac{1}{n^{4}} {\hat{lrvar}}_{H A R} (t {\hat{u}}_{t}) & = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) \frac{{\hat{u}}_{s}}{\sqrt{n}} \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{s}{n} \frac{t}{n} \\ \Rightarrow \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{B} (r) \underset{̲}{B} (q) r q d r d q . \end{matrix}

Then,

\begin{matrix} n {({\overset{ˇ}{ω}}_{a})}^{2} & = {(\frac{1}{n^{3}} \sum_{t = 1}^{n} t^{2})}^{- 1} (\frac{1}{n^{4}} {\hat{lrvar}}_{H A R} (t {\hat{u}}_{t})) {(\frac{1}{n^{3}} \sum_{t = 1}^{n} t^{2})}^{- 1} \\ \Rightarrow 9 \int_{0}^{1} \int_{0}^{1} k_{b} (r - q) \underset{̲}{B} (r) \underset{̲}{B} (q) r q d r d q . \end{matrix}

□
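
Part (iv) relies on rewriting a kernel-weighted sum of sample autocovariances as a double sum over time indices. A minimal numerical sketch of that identity follows. The Bartlett kernel is assumed purely for illustration (the lemma only requires a generic kernel k), and the function names are hypothetical.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel k(x) = max(0, 1 - |x|); an illustrative choice."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def lrvar_har_lagsum(v, b):
    """sum_{j=-(n-1)}^{n-1} k_b(j/n) * (1/n) * sum_t v_t v_{t+j}, with k_b(x) = k(x/b)."""
    n = len(v)
    total = 0.0
    for j in range(-(n - 1), n):
        if j >= 0:
            gamma_j = np.dot(v[: n - j], v[j:]) / n
        else:
            gamma_j = np.dot(v[-j:], v[: n + j]) / n
        total += bartlett(j / (n * b)) * gamma_j
    return float(total)

def lrvar_har_doublesum(v, b):
    """(1/n) * sum_{s,t} k_b((t - s)/n) v_t v_s, the form used in the proof."""
    n = len(v)
    idx = np.arange(n)
    W = bartlett((idx[:, None] - idx[None, :]) / (n * b))
    return float(v @ W @ v) / n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100
    t = np.arange(1, n + 1, dtype=float)
    X = np.cumsum(rng.standard_normal(n))   # driftless random walk
    a_hat = (t @ X) / (t @ t)               # OLS slope on t, no intercept
    v = t * (X - a_hat * t)                 # v_t = t * u_hat_t
    print(lrvar_har_lagsum(v, 0.5), lrvar_har_doublesum(v, 0.5))
```

The double-sum form is convenient because, after scaling, it is a Riemann-type approximation to the double integral in the limit.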

Proof of (23)–(28).

The stated results now follow directly from the above and the fact that

\sqrt{n} \hat{a} \Rightarrow 3 \int_{0}^{1} s B (s) d s

under

H_{0} : a = 0,

and

\hat{a} \overset{p}{⟶} a

under

H_{1} : a \neq 0 .

□
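
The practical consequence under H_0 can be illustrated by simulation: because the slope estimate is of order n^{-1/2} while s_a is of order n^{-1}, the usual t-ratio grows like √n and rejects with high probability. The sketch below assumes iid N(0, 1) increments and uses the form of s_a that appears in the proof of Lemma A3(ii); the function names and simulation settings are hypothetical choices, not those of the paper's Section 4.

```python
import numpy as np

def usual_t_trend(X):
    """Usual t-ratio for the slope in X_t = a*t + u_t (no intercept).

    X has shape (reps, n); each row is one sample path.
    """
    n = X.shape[-1]
    t = np.arange(1, n + 1, dtype=float)
    stt = t @ t
    a_hat = X @ t / stt
    resid = X - a_hat[:, None] * t
    # s_a^2 = [n^{-1} sum u_hat_t^2] * (sum t^2)^{-1}
    s2 = (resid**2).mean(axis=1) / stt
    return a_hat / np.sqrt(s2)

def rejection_rate(n, reps=2000, crit=1.96, seed=0):
    """Share of |t| > crit when X_t is a driftless random walk (a = 0)."""
    rng = np.random.default_rng(seed)
    X = np.cumsum(rng.standard_normal((reps, n)), axis=1)
    return float(np.mean(np.abs(usual_t_trend(X)) > crit))

if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, rejection_rate(n))
```

Under these assumptions the printed rejection rates should lie far above the nominal 5% level and rise with n, consistent with the divergence of the usual t-statistic.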

Appendix C. Proof of the Theorem in Section 3

Proof of Theorem 3.

(i): In the regression (31), we already have that

n^{- 1 / 2} y_{⌊n r⌋} \Rightarrow B (r)

,

n^{- 1 / 2} z_{k ⌊n \cdot⌋} \Rightarrow V_{k} (\cdot)

, for

k = 1, \dots, K

, and

{\hat{b}}_{z K} \Rightarrow E_{K} : = {(e_{k})}_{k = 1}^{K}

. Let

{\bar{V}}_{K} = {(V_{k})}_{k = 1}^{K} .

Based on continuous mapping, we have

n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'} = \frac{1}{n} \sum_{t = 1}^{n} \frac{z_{t}}{\sqrt{n}} \frac{z_{t}^{'}}{\sqrt{n}} \Rightarrow \int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'} .

Noticing that

\frac{{\hat{u}}_{⌊n \cdot⌋}}{\sqrt{n}} = \frac{y_{⌊n \cdot⌋}}{\sqrt{n}} - {\hat{b}}_{z K}^{'} \frac{z_{⌊n \cdot⌋}}{\sqrt{n}} \Rightarrow B (\cdot) - E_{K}^{'} {\bar{V}}_{K} (\cdot) : = {\underset{̲}{W}}_{y K} (\cdot),

we obtain

n^{- 2} \sum_{t = 1}^{n} {({\hat{u}}_{t})}^{2} = \frac{1}{n} \sum_{t = 1}^{n} {(\frac{{\hat{u}}_{t}}{\sqrt{n}})}^{2} \Rightarrow \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} .

Therefore

n {(s_{b_{z K}})}^{2} = C_{K}^{'} [n^{- 2} \sum_{t = 1}^{n} {({\hat{u}}_{t})}^{2}] {(n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K} \Rightarrow C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K} \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2},

and

\frac{1}{\sqrt{n}} t_{b_{z K}} = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{n {(s_{b_{z K}})}^{2}\}}^{1 / 2}} \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K} \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2}\}}^{1 / 2}} .

(ii) As

1 / M + M / n \to 0

when

n \to \infty

, for any

|j| \leq M

and

r \in [0, 1]

, we have

\frac{⌊n r⌋ + j}{n} = r + O (\frac{M}{n}) \to r as n \to \infty .

Hence, for any

|j| \leq M

,

\frac{1}{n} \sum_{1 \leq t, t + j \leq n} \frac{z_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t + j}}{\sqrt{n}} \frac{z_{t + j}^{'}}{\sqrt{n}} \Rightarrow \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} {\bar{V}}_{K} {\bar{V}}_{K}^{'},

and

\begin{matrix} \frac{1}{M} \frac{1}{n^{2}} {\hat{lrvar}}_{H A C} (z_{t} {\hat{u}}_{t}) & = \frac{1}{M} \sum_{j = - M}^{M} k (\frac{j}{M}) [\frac{1}{n} \sum_{1 \leq t, t + j \leq n} \frac{z_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t + j}}{\sqrt{n}} \frac{z_{t + j}^{'}}{\sqrt{n}}] \\ \Rightarrow \int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} {\bar{V}}_{K} {\bar{V}}_{K}^{'} . \end{matrix}

Therefore,

\begin{matrix} \frac{n}{M} {({\hat{ω}}_{b_{z K}})}^{2} & = C_{K}^{'} {(n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} [\frac{1}{M} \frac{1}{n^{2}} {\hat{lrvar}}_{H A C} (z_{t} {\hat{u}}_{t})] {(n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K} \\ \Rightarrow C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} [\int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} {\bar{V}}_{K} {\bar{V}}_{K}^{'}] {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}, \end{matrix}

and

\begin{matrix} \sqrt{\frac{M}{n}} t_{b_{z K}}^{^{H A C}} & = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{\frac{n}{M} {({\hat{ω}}_{b_{z K}})}^{2}\}}^{1 / 2}} \\ \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} [\int_{- 1}^{1} k (s) d s \int_{0}^{1} {\underset{̲}{W}}_{y K}^{2} {\bar{V}}_{K} {\bar{V}}_{K}^{'}] {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}\}}^{1 / 2}} . \end{matrix}

(iii) Note that

\begin{matrix} \frac{1}{n^{3}} {\hat{lrvar}}_{H A R} (z_{t} {\hat{u}}_{t}) & = \frac{1}{n^{3}} \sum_{j = - n + 1}^{n - 1} k_{b} (\frac{j}{n}) (\frac{1}{n} \sum_{1 \leq t, t + j \leq n} z_{t} {\hat{u}}_{t} {\hat{u}}_{t + j} z_{t + j}^{'}) \\ = \frac{1}{n^{4}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) z_{t} {\hat{u}}_{t} {\hat{u}}_{s} z_{s}^{'} \\ = \frac{1}{n^{2}} \sum_{s, t = 1}^{n} k_{b} (\frac{t - s}{n}) \frac{z_{t}}{\sqrt{n}} \frac{{\hat{u}}_{t}}{\sqrt{n}} \frac{{\hat{u}}_{s}}{\sqrt{n}} \frac{z_{s}^{'}}{\sqrt{n}} \\ \Rightarrow \int_{0}^{1} \int_{0}^{1} k_{b} (r - p) {\bar{V}}_{K} (r) {\underset{̲}{W}}_{y K} (r) {\underset{̲}{W}}_{y K} (p) {\bar{V}}_{K}^{'} (p) d r d p : = H . \end{matrix}

Therefore,

\begin{matrix} {({\overset{ˇ}{ω}}_{b_{z K}})}^{2} & = C_{K}^{'} {(n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} (\frac{1}{n^{3}} {\hat{lrvar}}_{H A R} (z_{t} {\hat{u}}_{t})) {(n^{- 2} \sum_{t = 1}^{n} z_{t} z_{t}^{'})}^{- 1} C_{K} \\ \Rightarrow C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} H {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}, \end{matrix}

and

t_{b_{z K}}^{^{H A R}} = \frac{C_{K}^{'} {\hat{b}}_{z K}}{{\{{({\overset{ˇ}{ω}}_{b_{z K}})}^{2}\}}^{1 / 2}} \Rightarrow \frac{C_{K}^{'} E_{K}}{{\{C_{K}^{'} {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} H {(\int_{0}^{1} {\bar{V}}_{K} {\bar{V}}_{K}^{'})}^{- 1} C_{K}\}}^{1 / 2}} .

□
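
Part (i) implies that the usual t-ratio in a regression among independent random walks is of order √n. A small Monte Carlo sketch of this behaviour is given below. It assumes iid N(0, 1) increments, a regression without intercept, and the selector C_K = e_1 (the first unit vector); these choices and the function names are illustrative assumptions rather than the paper's simulation design.

```python
import numpy as np

def usual_t_random_walks(y, Z, c):
    """Usual t-ratio for c'b in y_t = b'z_t + u_t (no intercept)."""
    n = len(y)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    b_hat = ZtZ_inv @ (Z.T @ y)
    resid = y - Z @ b_hat
    # s^2 = [n^{-1} sum u_hat_t^2] * c'(Z'Z)^{-1}c
    s2 = (resid @ resid / n) * (c @ ZtZ_inv @ c)
    return float((c @ b_hat) / np.sqrt(s2))

def simulate_t(n, K=2, reps=500, seed=0):
    """t-ratios from regressions of one random walk on K independent random walks."""
    rng = np.random.default_rng(seed)
    c = np.zeros(K)
    c[0] = 1.0  # hypothetical choice C_K = e_1
    out = np.empty(reps)
    for i in range(reps):
        y = np.cumsum(rng.standard_normal(n))
        Z = np.cumsum(rng.standard_normal((n, K)), axis=0)
        out[i] = usual_t_random_walks(y, Z, c)
    return out

if __name__ == "__main__":
    for n in (50, 200, 800):
        t = simulate_t(n)
        print(n, np.mean(np.abs(t) > 1.96), np.median(np.abs(t)) / np.sqrt(n))
```

If the √n rate holds, the last printed column (the median of |t| divided by √n) should be roughly stable across sample sizes while the rejection frequency increases.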

References

Banerjee, Anindya, Juan J. Dolado, John W. Galbraith, and David Hendry. 1993. Co-Integration, Error Correction, and the Econometric Analysis of Non-Stationary Data. Oxford: Oxford University Press.
Csörgö, Miklós, and Lajos Horváth. 1993. Weighted Approximations in Probability and Statistics. New York: Wiley.
Durlauf, Steven N., and Peter C. B. Phillips. 1988. Trends versus random walks in time series analysis. Econometrica 56: 1333–54.
Eicker, Friedhelm. 1967. Limit theorems for regression with unequal and dependent errors. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press, vol. 1, pp. 59–82.
Ernst, Philip A., Larry A. Shepp, and Abraham J. Wyner. 2017. Yule’s “nonsense correlation” solved! Annals of Statistics 45: 1789–809.
Granger, Clive W. J., and Paul Newbold. 1974. Spurious regressions in econometrics. Journal of Econometrics 2: 111–20.
Hendry, David F. 1980. Econometrics—Alchemy or science. Economica 47: 387–406.
Huber, Peter J. 1967. The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press, vol. 1, pp. 221–33.
Hwang, Jungbin, and Yixiao Sun. 2018. Simple, robust, and accurate F and t tests in cointegrated systems. Econometric Theory 34: 949–84.
Jansson, Michael. 2004. The error in rejection probability of simple autocorrelation robust tests. Econometrica 72: 937–46.
Kiefer, Nicholas M., and Timothy J. Vogelsang. 2002a. Heteroskedasticity-autocorrelation robust testing using bandwidth equal to sample size. Econometric Theory 18: 1350–66.
Kiefer, Nicholas M., and Timothy J. Vogelsang. 2002b. Heteroskedasticity-autocorrelation robust standard errors using the bartlett kernel without truncation. Econometrica 70: 2093–95.
Kiefer, Nicholas M., and Timothy J. Vogelsang. 2005. A new asymptotic theory for heteroskedasticity-autocorrelation robust tests. Econometric Theory 21: 1130–64.
Kiefer, Nicholas M., Timothy J. Vogelsang, and Helle Bunzel. 2000. Simple robust testing of regression hypotheses. Econometrica 68: 695–714.
Lazarus, Eben, Daniel J. Lewis, James H. Stock, and Mark W. Watson. 2018. HAR inference: Recommendations for practice. Journal of Business & Economic Statistics 36: 541–59.
Loève, Michel. 1963. Probability Theory, 3rd ed. New York: Van Nostrand.
Müller, Ulrich K. 2007. A theory of robust long-run variance estimation. Journal of Econometrics 141: 1331–52.
Müller, Ulrich K. 2014. HAC corrections for strongly autocorrelated time series. Journal of Business & Economic Statistics 32: 311–22.
Müller, Ulrich K., and Mark W. Watson. 2016. Low-frequency econometrics. In Advances in Economics: Eleventh World Congress of the Econometric Society. Edited by Bo Honoré and Larry Samuelson. Cambridge: Cambridge University Press, vol. 2, pp. 53–94.
Müller, Ulrich K., and Mark W. Watson. 2018. Long-run covariability. Econometrica 86: 775–804.
Park, Joon Y., and Peter C. B. Phillips. 1988. Statistical inference in regressions with integrated processes: Part 1. Econometric Theory 4: 468–97.
Park, Joon Y., and Peter C. B. Phillips. 1989. Statistical inference in regressions with integrated processes: Part 2. Econometric Theory 5: 95–131.
Phillips, Peter C. B. 1986. Understanding spurious regression in econometrics. Journal of Econometrics 33: 311–40.
Phillips, Peter C. B. 1991. Optimal inference in cointegrated systems. Econometrica 59: 283–306.
Phillips, Peter C. B. 1996. Spurious regression unmasked. In Cowles Foundation Discussion Paper. No. 1135. New Haven: Yale University.
Phillips, Peter C. B. 1998. New tools for understanding spurious regressions. Econometrica 66: 1299–325.
Phillips, Peter C. B. 2002. New unit root asymptotics in the presence of deterministic trends. Journal of Econometrics 111: 323–53.
Phillips, Peter C. B. 2005. HAC estimation by automated regression. Econometric Theory 21: 116–42.
Phillips, Peter C. B. 2014. Optimal estimation of cointegrated systems with irrelevant instruments. Journal of Econometrics 178: 210–24.
Phillips, Peter C. B., and Steven N. Durlauf. 1986. Multiple time series regression with integrated processes. Review of Economic Studies 53: 473–95.
Phillips, Peter C. B., and Bruce E. Hansen. 1990. Statistical inference in instrumental variables regression with I(1) processes. Review of Economic Studies 57: 99–125.
Phillips, Peter C. B., and Mico Loretan. 1991. Estimating long-run economic equilibria. Review of Economic Studies 58: 407–36.
Phillips, Peter C. B., and Zhentao Shi. 2019. Boosting the Hodrick-Prescott filter. In Cowles Foundation Discussion Paper. No. 2192. New Haven: Yale University.
Phillips, Peter C. B., and Victor Solo. 1992. Asymptotics for linear processes. The Annals of Statistics 20: 971–1001.
Robinson, Peter M. 1998. Inference-without-smoothing in the presence of nonparametric autocorrelation. Econometrica 66: 1163–82.
Sun, Yixiao. 2004. A convergent t-statistic in spurious regression. Econometric Theory 20: 943–62.
Sun, Yixiao. 2014a. Let’s fix it: Fixed-b asymptotics versus small-b asymptotics in heteroscedasticity and autocorrelation robust inference. Journal of Econometrics 178: 659–77.
Sun, Yixiao. 2014b. Fixed-smoothing asymptotics in a two-step GMM framework. Econometrica 82: 2327–70.
Sun, Yixiao. 2014c. Fixed-smoothing asymptotics and asymptotic F and t tests in the presence of strong autocorrelation. In Essays in Honor of Peter C. B. Phillips (Advances in Econometrics, Volume 33). Bingley: Emerald Group Publishing Limited, pp. 23–63.
Sun, Yixiao. 2018. Comments on “HAR inference: Recommendations for practice”. Journal of Business & Economic Statistics 36: 565–68.
Sun, Yixiao, Peter C. B. Phillips, and Sainan Jin. 2011. Power maximization and size control in heteroskedasticity and autocorrelation robust tests with exponentiated kernels. Econometric Theory 27: 1320–68.
White, Halbert. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–38.
White, Halbert. 1982. Asymptotic Theory for Econometricians, 1st ed. Cambridge: Academic Press.
Yule, G. Udny. 1926. Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series. Journal of the Royal Statistical Society 89: 1–63.

1.	Heteroskedasticity-robust standard errors were introduced by (Eicker 1967; Huber 1967; White 1980). HAC estimators were introduced by (White 1982) and have a long subsequent history of enhancement.
2.	Heteroskedasticity and autocorrelation robust standard errors were introduced in (Kiefer and Vogelsang 2002a, 2002b) and, following this lead, (Phillips 2005) used the HAR terminology to characterize a class of robust inferential procedures in an article concerned with the development of automated mechanisms of valid inference in econometrics. Other important early contributions concerning HAC covariance matrix estimators without truncation were given by (Kiefer and Vogelsang 2005; Kiefer et al. 2000; Robinson 1998).
3.	Weakly dependent innovations in the form of an AR(1) error process, viz., $μ_{s} = ρ μ_{s - 1} + ε_{s}$, with $ε_{s} \sim_{d} \text{iid } N (0, 1)$, were also considered. The results were similar and so only the iid case is reported here.

Figure 1. Densities of different t-statistics in spurious trend regression of a random walk.

Figure 2. Rejection frequencies of the usual t and heteroskedasticity and autocorrelation consistent (HAC) t-statistics in spurious trend regression of a random walk, calculated based on 10,000 simulations with sample size n = 200 and critical values from the standard Normal distribution at the 5% significance level.

Figure 3. Rejection frequencies of the heteroskedasticity and autocorrelation robust (HAR) t-statistic in spurious trending regressions.

Figure 4. Densities of different t-statistics in simple spurious linear trend regression.

Figure 5. Rejection frequencies of the HAR t-statistic in spurious linear trend regressions.

Figure 6. Densities of different t-statistics in spurious regression among random walks.

Figure 7. Rejection frequencies of the usual t and HAC t-statistics in spurious regressions among random walks, calculated based on 10,000 simulations with sample size n = 200 and critical values from the standard Normal distribution at the 5% significance level.

Figure 8. Rejection frequencies of the HAR t-statistic in spurious regressions among random walks.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
