Quantile-Wavelet Nonparametric Estimates for Time-Varying Coefficient Models

Zhou, Xingcai; Yang, Guang; Xiang, Yu

doi:10.3390/math10132321


by Xingcai Zhou *, Guang Yang and Yu Xiang

School of Statistics and Data Science, Nanjing Audit University, Nanjing 211085, China

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(13), 2321; https://doi.org/10.3390/math10132321

Submission received: 22 May 2022 / Revised: 22 June 2022 / Accepted: 29 June 2022 / Published: 2 July 2022

(This article belongs to the Special Issue Nonparametric Statistical Methods and Their Applications)


Abstract

The paper considers quantile-wavelet estimation for time-varying coefficients by embedding a wavelet kernel into quantile regression. Our methodology is quite general in the sense that we do not require the unknown time-varying coefficients to be smooth curves of a common degree or the errors to be independently distributed. Quantile-wavelet estimation is robust to outliers or heavy-tailed data. The model is a dynamic time-varying model of nonlinear time series. A strong Bahadur order

O \{{(\frac{2^{m}}{n})}^{3 / 4} {(log n)}^{1 / 2}\}

for the estimation is obtained under mild conditions. As applications, the rate of uniform strong convergence and the asymptotic normality are derived.

Keywords:

quantile-wavelet; nonparametric estimation; time-varying coefficient; Bahadur representation; strong mixing

MSC:

62G05; 62G08; 62G20

1. Introduction

Time-varying time series models with trends have gained a lot of attention during the last three decades because they capture changes in the dynamic structure of data, and they have been applied in economics and finance [1,2]. When the time span of interest covers different economic periods, the parameters of the corresponding statistical models should be allowed to change with time; see references [3,4,5,6,7], among others. Varying coefficient models are flexible models for describing the dynamic structure of data and have been extensively studied based on mean regression; see, for example, references [8,9,10,11,12] and the references therein. However, important features of the joint distribution of the response and the covariates cannot be well captured by mean regression.

In this paper, we consider the time-varying coefficient quantile regression models: the conditional

τ

th quantile of

Y_{i}

given

X_{i}

and a prespecified quantile

τ

,

Q_{τ} (Y_{i} | X_{i}) = inf \{s : F_{Y_{i} | X_{i}} (s | X_{i}) > τ\} = X_{i}^{T} β_{τ} (t_{i}), i = 1, \dots, n,

(1)

where

t_{i} = i / n

,

Y_{i}

are responses,

β_{τ} (\cdot) = {(β_{1, τ} (\cdot), \dots, β_{p, τ} (\cdot))}^{T}

is a p-dimensional vector of unspecified functions defined on

[0, 1]

,

X_{i} = {(X_{i 1}, \dots, X_{i p})}^{T}

are p-dimensional explanatory variables. We can write Equation (1) in the form

Y_{i} = X_{i}^{T} β_{τ} (t_{i}) + ϵ_{τ, i}, i = 1, \dots, n,

(2)

where the errors

ϵ_{τ, i}

satisfy

Q_{τ} (ϵ_{τ, i} | X_{i}) = 0

almost surely. Quantile regression has also emerged as an essential statistical tool in many scientific fields; see [13]. To simplify notation, we omit the subscript

τ

from the function

β_{τ} (\cdot)

and

ϵ_{τ, i}

subsequently.
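As a small illustration of the quantile restriction on the errors, the sketch below checks numerically that a sample quantile minimizes the total check loss, where the check loss is taken to be u (τ - I{u < 0}). The data vector and the helper names `check_loss` and `best_constant` are hypothetical and serve only this example; the values of τ are chosen so that the minimizer is unique.

```python
def check_loss(u, tau):
    """Check loss u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Return the data point q that minimizes sum_i check_loss(y_i - q).

    The objective is convex and piecewise linear in q, so a minimizer
    can always be found among the observed values.
    """
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))


# Hypothetical data; n * tau is not an integer below, so the minimizer is unique.
y = [2.0, -1.0, 0.5, 7.0, 3.0, 1.5, 0.0, 4.0, -2.5, 10.0]
lower = best_constant(y, 0.25)  # third smallest observation
upper = best_constant(y, 0.75)  # eighth smallest observation
```

With covariates, the constant q is replaced by a linear predictor, which is the form used in the models above.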

To estimate the varying coefficients in the quantile regression models in Equation (2), references [7,14,15] used the kernel method, and references [16,17] considered regression splines. To our knowledge, wavelets have not been considered in quantile regression with varying coefficients. Wavelet techniques can detect and represent localized features and can also create localized features on synthesis while being efficient in terms of computational speed and storage. They have received much attention from mathematicians, engineers and statisticians. Wavelets have been introduced for nonparametric regression; see, for example, references [18,19,20,21,22]. For wavelet smoothing applied to nonparametric models, reference [18] is a key reference that introduces wavelet versions of some classical kernel and orthogonal series estimators and studies their asymptotic properties. Reference [23] provided the asymptotic bias and variance of a wavelet estimator for a regression function under a stochastic mixing process. Reference [24] considered a wavelet estimator for the mean regression function with strong mixing errors and investigated its asymptotic rates of convergence by thresholding the empirical wavelet coefficients. References [25,26] showed Berry–Esseen-type bounds on wavelet estimators for semiparametric models under dependent errors. For varying coefficient models, references [27,28] provided wavelet estimation and studied the convergence rate and asymptotic normality under i.i.d. errors, and reference [29] discussed the asymptotics of the wavelet estimators based on censored dependent data. However, all of the above wavelet estimators are based on mean regression by the least-squares method.

In this paper, we propose a quantile-wavelet method for the time-varying coefficient models in Equation (2) with

α

-mixing errors. The proposed methodology is quite general in the sense that we do not require the coefficient

β (\cdot)

to be smooth curves of a common degree; it does not suffer from the curse of dimensionality, it is robust to outliers and heavy-tailed data, and it is a dynamic model for nonlinear time series. The Bahadur representation, rate of convergence and asymptotic normality of the quantile-wavelet estimators will be established. Bahadur representation theory seeks to approximate a statistical estimate by a sum of random variables with a higher-order remainder. It has been a major issue in statistical theory since Bahadur’s pioneering work on quantiles [30]; see [31], among others. Recently, reference [32] investigated the Bahadur representation for sample quantiles under

φ

-mixing sequence; reference [33] gave an M-estimation for time-varying coefficient models with

α

-mixing errors and established the Bahadur representation in probability. In this paper, we establish the Bahadur representation with probability 1 (almost surely) for quantile-wavelet estimates in models (1) and (2).

Throughout, we assume that

{X_{i}, ϵ_{i}}

is a stationary

α

-mixing sequence. Recall that a sequence

{ζ_{k}, k \geq 1}

is said to be

α

-mixing (or strong mixing) if the mixing coefficients

α (m) = sup {| P (A \cap B) - P (A) P (B) | : A \in F_{- \infty}^{k}, B \in F_{k + m}^{\infty}}

converge to zero as

m \to \infty

, where

F_{l}^{k}

denotes the

σ

-field generated by

{ζ_{i}, l \leq i \leq k}

. We refer to the monographs [34,35] for properties of mixing sequences and further mixing conditions.

The rest of this article is organized as follows. In Section 2, we present the wavelet kernel and the quantile-wavelet estimation for model (2). The Bahadur representation of the quantile-wavelet estimators and its applications are given in Section 3. Technical proofs are provided in Section 4. Some simulation studies are conducted in Section 5.

2. Quantile-Wavelet Estimation

In this paper, the time-varying coefficient function is retrieved by a wavelet-based reproducing kernel Hilbert space (RKHS). An RKHS is a Hilbert space in which all the point evaluations are bounded linear functionals. Let

H

be a Hilbert space of functions on some domain

I

. For each

t \in I

, there exists an element

k_{t} \in H

, such that

f (t) = 〈 k_{t}, f 〉, \forall f \in H,

where

〈 \cdot, \cdot 〉

is the inner product in

H

. Set

〈 k_{t}, k_{s} 〉 = K (t, s)

, which is called the reproducing kernel. Let

k_{t} = K (t, \cdot)

, and then

〈 K (t, \cdot), K (s, \cdot) 〉 = K (t, s)

.
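The reproducing property can be checked numerically in a simple case. The sketch below assumes the Haar scaling function (the indicator of [0, 1)), chosen here only because its kernel has a closed form; it is not claimed to be the wavelet used in the paper. The kernel is taken as the sum over k of ϕ(t - k) ϕ(s - k), the test function f is a hypothetical piecewise-constant element of the space spanned by the integer translates of ϕ, and the inner product is the L_2 inner product approximated by a midpoint rule.

```python
def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def kernel_v0(t, s, k_range=range(-5, 6)):
    """K(t, s) = sum_k phi(t - k) phi(s - k), truncated to a finite range of k."""
    return sum(haar_phi(t - k) * haar_phi(s - k) for k in k_range)


# Hypothetical coefficients c_k of f(x) = sum_k c_k phi(x - k).
COEFFS = {-1: 2.0, 0: -1.5, 1: 0.5, 2: 3.0}


def f(x):
    return sum(c * haar_phi(x - k) for k, c in COEFFS.items())


def inner_product_with_kernel(t, lo=-5.0, hi=6.0, steps=11000):
    """Midpoint-rule approximation of the integral of K(t, s) f(s) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        s = lo + (j + 0.5) * h
        total += kernel_v0(t, s) * f(s) * h
    return total


value = inner_product_with_kernel(0.3)  # should reproduce f(0.3)
```

For the Haar case the kernel is simply the indicator that t and s fall in the same unit cell, so the inner product returns the constant value of f on that cell.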

A multiresolution analysis is a sequence of closed subspaces

{V_{m}, m \in Z}

in

L_{2} (R)

such that they lie in a containment hierarchy,

\dots \subset V_{- 2} \subset V_{- 1} \subset V_{0} \subset V_{1} \subset V_{2} \subset \dots,

(3)

where

L_{2} (R)

is the collection of square-integrable functions over the real line. The hierarchy Equation (3) is constructed such that (i) V-spaces are self-similar,

f (2^{m} x) \in V_{m} iff

f (x) \in V_{0}

, and (ii) there exists a scaling function

ϕ \in V_{0}

whose integer translates span

V_{0} = {f \in L_{2} (R) | f (x) = \sum_{k \in Z} c_{k} ϕ (x - k)}

, and for which the set

{ϕ (\cdot - k), k \in Z}

is an orthonormal basis. Wavelet analysis requires a description of two related and suitably chosen orthonormal basis functions: the scaling function

ϕ

and the wavelet

ψ

. A wavelet system is generated by dilation and translation of

ϕ

and

ψ

through

ϕ_{m, k} (t) = 2^{m / 2} ϕ (2^{m} t - k), ψ_{m, k} (t) = 2^{m / 2} ψ (2^{m} t - k), m, k \in Z .

Therefore,

{ϕ_{0 k}, k \in Z}

and

{ϕ_{m k}, k \in Z}

are the orthonormal bases of

V_{0}

and

V_{m}

, respectively. From Moore-Aronszajn’s theorem [36], it follows that

K (t, s) = \sum_{k} ϕ (t - k) ϕ (s - k)

is a reproducing kernel of

V_{0}

. By self-similarity of multiresolution subspaces,

K_{m} (t, s) = 2^{m} K (2^{m} t, 2^{m} s)

is a reproducing kernel of

V_{m}

, and then the projection of g on the space

V_{m}

is given by

P_{V_{m}} g (t) = \int K_{m} (t, s) g (s) d s .
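To see what the projection does in a concrete case, the following sketch again assumes the Haar scaling function, for which K_m(t, s) equals 2^m when t and s lie in the same dyadic cell of length 2^{-m} and 0 otherwise, so the projection is a local average of g. The function g(s) = s^2, the evaluation point t = 0.3 and the helper names are hypothetical choices for illustration.

```python
import math


def haar_kernel_m(t, s, m):
    """K_m(t, s) = 2^m K(2^m t, 2^m s) for the Haar scaling function."""
    same_cell = math.floor(2**m * t) == math.floor(2**m * s)
    return float(2**m) if same_cell else 0.0


def project(g, t, m, steps=4096):
    """Midpoint-rule approximation of the integral of K_m(t, s) g(s) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        s = (j + 0.5) * h
        total += haar_kernel_m(t, s, m) * g(s) * h
    return total


def g(s):
    return s * s


# Approximation error at t = 0.3 for increasing resolution levels m.
errors = [abs(project(g, 0.3, m) - g(0.3)) for m in (2, 4, 6)]
kernel_mass = project(lambda s: 1.0, 0.3, 4)  # integral of K_m(t, .) over [0, 1]
```

At this particular point the error decreases as m grows, which is the behaviour that motivates using K_m as a smoothing kernel; a smoother scaling function would give a different kernel but the same mechanism.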

It motivates us to define a quantile-wavelet estimator of

β (t)

by

\hat{β} (t) = {argmin}_{b} \sum_{i = 1}^{n} ρ_{τ} \{y_{i} - X_{i}^{T} b\} \int_{A_{i}} K_{m} (t, s) d s,

(4)

where

ρ_{τ} (u) = u (τ - I {u < 0})

with

u \in R

called the loss (“check”) function,

I_{B}

is the indicator function of any set B, and

A_{i}

are intervals that partition [0, 1], so that

t_{i} \in A_{i}

. One way of defining the intervals

A_{i} = [s_{i - 1}, s_{i})

is by taking

s_{0} = 0

,

s_{n} = 1

, and

s_{i} = (t_{i} + t_{i + 1}) / 2

,

i = 1, \dots, n - 1

.
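A minimal sketch of the estimator in Equation (4) is given below for the simplest case p = 1 with X_i = 1, so that the estimate is a weighted sample quantile. It assumes the Haar scaling function, for which the weight of observation i is 2^m times the length of the overlap between A_i and the dyadic cell containing t; this choice, the step-function coefficient, the Cauchy-type noise and all function names are assumptions made for illustration and are not taken from the paper. For general p one would need a weighted quantile regression solver, which is not shown here.

```python
import math
import random


def partition_edges(n):
    """s_0 = 0, s_i = (t_i + t_{i+1}) / 2 with t_i = i / n, and s_n = 1."""
    return [0.0] + [(2 * i + 1) / (2.0 * n) for i in range(1, n)] + [1.0]


def haar_weights(t, m, n):
    """Integrals of the Haar kernel K_m(t, s) over A_i = [s_{i-1}, s_i), for t in [0, 1)."""
    lo = math.floor(2**m * t) / 2**m
    hi = lo + 2.0**-m
    s = partition_edges(n)
    return [2**m * max(0.0, min(s[i], hi) - max(s[i - 1], lo)) for i in range(1, n + 1)]


def weighted_quantile(y, w, tau):
    """Smallest y value whose cumulative weight reaches tau times the total weight.

    For nonnegative weights this minimizes sum_i w_i * rho_tau(y_i - b) over b.
    """
    total = sum(w)
    acc = 0.0
    for value, weight in sorted(zip(y, w)):
        acc += weight
        if weight > 0 and acc >= tau * total:
            return value
    raise ValueError("all weights are zero")


def qw_estimate(y, t, m, tau):
    """Intercept-only quantile-wavelet estimate at time t (Haar kernel)."""
    return weighted_quantile(y, haar_weights(t, m, len(y)), tau)


# Hypothetical example: a step-function coefficient with heavy-tailed noise.
random.seed(0)
n, m, tau = 2000, 3, 0.5
y = [(1.0 if i / n < 0.5 else 3.0)
     + 0.5 * math.tan(math.pi * (random.random() - 0.5))
     for i in range(1, n + 1)]
est_left = qw_estimate(y, 0.25, m, tau)
est_right = qw_estimate(y, 0.75, m, tau)
```

In this toy setting the two estimates should land near the two levels of the step function even though the noise has no finite mean, which is the kind of robustness the check loss is meant to provide.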

Note that many other nonparametric methods can be used here, including spline and kernel approaches. However, they might not be rich enough to characterize the local properties of the time-varying coefficient function. In the following section, we will present the asymptotic properties of the quantile-wavelet estimator Equation (4).

3. Bahadur Representation and Its Applications

Let

H^{ν}

be the collection of all functions on

[0, 1]

with order

ν > 0

in Sobolev space, which is a very general space. The degree of smoothness of the true coefficient functions determines how well the functions can be approximated. Functions belonging to

H^{ν}

for

1 / 2 < ν < 3 / 2

are not continuously differentiable. It is worth stressing that the wavelet approach allows us to obtain rates under much weaker assumptions than second-order differentiability. Denote

X = (X_{1}^{T}, \dots, X_{n}^{T})

;

∥ \cdot ∥

is the

L_{2}

norm, and C is used to denote positive constants whose values are unimportant and may change from line to line in the proof.

Our main results will be established under the following assumptions.

(A1.) (i)

{ϵ_{i}, X_{i}}

is a stationary

α

-mixing sequence with

α (i) = O (i^{- κ_{0}})

with

κ_{0} > \frac{9 (1 + d) δ + 6}{2 (1 - d) δ - 4}

,

δ > \frac{2}{1 - d}

, where d is defined in (A7) (i); (ii) the errors

ϵ_{i}

have

Q_{τ} (ϵ_{i} | X_{i}) = 0

almost surely, and a continuous, positive conditional density

f_{ϵ | X}

in a neighborhood of 0 given the

X_{i}

,

E {∥X_{1}∥}^{2 δ} < \infty

, and

E {|ϵ_{1} ∣ X_{1}|}^{2 δ} < \infty

, a.s.; (iii)

Φ_{x} = E (X_{1} X_{1}^{T})

,

Ω_{x} = E (f_{ϵ | X} (0) X_{1} X_{1}^{T})

are non-singular matrices.

(A2.)

β_{j}

belongs to Sobolev space

H^{ν} ([0, 1])

with order

ν > 1 / 2

.

(A3.)

β_{j}

satisfies the Lipschitz condition of order

γ > 0

.

(A4.)

ϕ

has compact support, is in the Schwartz space with order

l > ν

, and satisfies the Lipschitz condition with order l. Furthermore,

| \hat{ϕ} (ξ) - 1 | = O (ξ)

as

ξ \to 0

, where

\hat{ϕ}

is the Fourier transform of

ϕ

.

(A5.)

{max}_{i} | t_{i} - t_{i - 1} | = O (n^{- 1})

.

(A6.) We also assume that for some Lipschitz function

κ (\cdot)

,

ρ (n) = max_{i} |s_{i} - s_{i - 1} - \frac{κ (s_{i})}{n}| = o (n^{- 1}) .

(A7.) The tuning parameter m satisfies (i)

2^{m} = O (n^{d})

with

0 < d < 1

; (ii) let

v^{*} = min (3 / 2, ν, γ + 1 / 2) - ϵ_{1}

and

ϵ_{1} = 0

for

ν \neq 3 / 2

,

ϵ_{1} > 0

for

ν = 3 / 2

. Assume that

n 2^{- 2 m v^{*}} \to 0

.
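The interplay between (A1) (i) and (A7) can be made concrete with a few lines of arithmetic. The sketch below evaluates the lower bound on κ_0 from (A1) (i) and, under the additional assumption that 2^m is of exact order n^d (the text only states an upper bound), the smallest exponent d compatible with (A7) (ii), since n 2^{-2 m v^*} is then of order n^{1 - 2 d v^*}. The function names and the numerical inputs are hypothetical.

```python
def v_star(nu, gamma, eps1=0.0):
    """v* = min(3/2, nu, gamma + 1/2) - eps1, with eps1 > 0 only when nu = 3/2."""
    return min(1.5, nu, gamma + 0.5) - eps1


def d_lower_bound(nu, gamma, eps1=0.0):
    """If 2^m is of exact order n^d, (A7)(ii) requires d > 1 / (2 v*)."""
    return 1.0 / (2.0 * v_star(nu, gamma, eps1))


def kappa0_lower_bound(d, delta):
    """Right-hand side of the mixing-rate condition in (A1)(i)."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie in (0, 1)")
    if delta <= 2.0 / (1.0 - d):
        raise ValueError("delta must exceed 2 / (1 - d)")
    return (9.0 * (1.0 + d) * delta + 6.0) / (2.0 * (1.0 - d) * delta - 4.0)


# Hypothetical smoothness nu = 1 and Lipschitz order gamma = 1.
d_min = d_lower_bound(nu=1.0, gamma=1.0)
kappa0_min = kappa0_lower_bound(d=0.6, delta=10.0)
```

For these inputs any d in (0.5, 1) is admissible under the stated assumption, and the mixing coefficients must decay faster than the returned polynomial rate.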

Remark 1.

These conditions are mild. Condition (A1) is the standard requirement for moments and the mixing coefficient for an α-mixing time series. Conditions (A2)–(A6) are the mild regularity conditions for wavelet smoothing, which have been adopted by [18]. In condition (A7), m acts as a tuning parameter, as the bandwidth does for standard kernel smoothers; (A7) (i) is for the Bahadur representation and the rate of convergence, and (A7) (ii) combined with (A7) (i) is for the asymptotic normality of the quantile-wavelet estimator.

Our results are as follows.

Theorem 1.

(Bahadur representation) Suppose that (A1)–(A5) and (A7) (i) hold, then

\hat{β} (t) - β (t) = Ω_{x}^{- 1} Z_{n} (t) + R_{n} (m; γ, ν), a.s.

with

Z_{n} (t) = \sum_{i = 1}^{n} φ_{τ} (ϵ_{i} + X_{i}^{T} [β (t_{i}) - β (t)]) X_{i} \int_{A_{i}} K_{m} (t, s) d s

and

R_{n} (m; γ, ν) = O \{{(\frac{2^{m}}{n})}^{3 / 4} {(log n)}^{1 / 2}\} .

Remark 2.

Theorem 1 presents the strong Bahadur representation of a quantile-wavelet estimator for a time-varying coefficient model. Here,

R_{n} (m; γ, ν) = O \{{(\frac{2^{m}}{n})}^{3 / 4} {(log n)}^{1 / 2}\}

, a.s., which is comparable with the Bahadur order

O \{{(\frac{log log n}{n h})}^{3 / 4}\}, a . s .

, of [37], where the bandwidth

h \to 0

. Reference [37] is based on kernel local polynomial M-estimation and requires the function

β

to be twice differentiable. However, we do not need such strong smoothness conditions. The function

β

is not differentiable when

β \in H^{ν}

,

1 / 2 < ν < 3 / 2

.

The Bahadur representation of the quantile-wavelet estimator of Theorem 1 can be applied to obtain the following two results.

Corollary 1.

(Rate of uniform strong convergence) Assume that (A1)–(A5) and (A7) (i) hold, then

sup_{t \in [0, 1]} ∥\hat{β} (t) - β (t)∥ = O \{\sqrt{\frac{2^{m} log n}{n}} + n^{- γ} + η_{m}\}, a.s.

Remark 3.

Corollary 1 provides the rate of uniform strong convergence of quantile-wavelet estimator

\hat{β}

for model Equation (2). We consider the rate in the case of

1 / 2 < ν < 3 / 2

, under which

η_{m}

converges to zero more slowly than in the case

ν \geq 3 / 2

. If we take

2^{m} = O (n^{γ})

with

1 / 3 \leq γ < 1

, then

{sup}_{t \in [0, 1]} ∥\hat{β} (t) - β (t)∥ = O (n^{- (1 - γ) / 2} {(log n)}^{1 / 2}), a . s .

. If we further take

γ = 1 / 3

, then one obtains

sup_{t \in [0, 1]} ∥\hat{β} (t) - β (t)∥ = O (n^{- 1 / 3} {(log n)}^{1 / 2}), a.s.,

which is comparable with the optimal convergence rate of the nonparametric estimation in nonparametric models. The result is better than the ones of [27,28] based on the local linear estimator for the varying-coefficient model. In addition, we also do not require the unknown coefficient

β

to be smooth curves of a common degree.

Corollary 2.

(Asymptotic normality) Suppose that (A1)–(A7) hold, then

\sqrt{n 2^{- m}} ({\hat{β}}^{d} (t) - β (t)) \overset{D}{⟶} N (0, τ (1 - τ) κ (t) ω_{0}^{2} Ω_{x}^{- 1} Ψ_{x} Ω_{x}^{- 1}),

where

{\hat{β}}^{d} (t) = \hat{β} (t^{(m)})

with

t^{(m)} = ⌊ 2^{m} t ⌋ / 2^{m}

and

ω_{0}^{2} = \int_{R} E_{0}^{2} (0, u) d u = \sum_{k \in Z} ϕ^{2} (k)

.
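The dyadic evaluation point in Corollary 2 is easy to compute. The sketch below returns t^{(m)} = ⌊2^m t⌋ / 2^m together with the normalizing factor \sqrt{n 2^{-m}}; the function names and the inputs are hypothetical.

```python
import math


def dyadic_point(t, m):
    """Largest dyadic point k / 2^m that does not exceed t."""
    return math.floor(2**m * t) / 2**m


def normalizing_rate(n, m):
    """The factor sqrt(n * 2^(-m)) appearing in front of the estimation error."""
    return math.sqrt(n * 2.0**-m)


points = [dyadic_point(0.3, m) for m in (3, 6)]
rate = normalizing_rate(n=4096, m=4)
```

As m grows, the dyadic point approaches t from below, so evaluating the estimator there changes the target by at most 2^{-m}.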

Remark 4.

To obtain an asymptotic expansion of the variance and an asymptotic normality, we need to consider an approximation to

\hat{β} (t)

based on its values at dyadic points of order m, as reference [18] has done. The

{\hat{β}}^{d} (t)

is the piecewise-constant approximation of

\hat{β} (t)

at resolution

2^{- m}

, which can avoid the instability of the variance of

\hat{β} (t)

. From the proof of Corollary 2, it can be seen that the main term of the variance of

\hat{β} (t)

is

τ (1 - τ) κ (t) ω^{2} (t_{m}) 2^{m} n^{- 1} Ω_{x}^{- 1} Ψ_{x} Ω_{x}^{- 1}

with

t_{m} = 2^{m} t - [2^{m} t]

and

ω^{2} (t_{m}) = \int_{0}^{1} E_{0}^{2} (t_{m}, s) d s

. When t is dyadic and m is sufficiently large,

t_{m} = 0

, the variance of

\hat{β} (t)

is asymptotically stable. See [18] for the details.

4. Lemmas and Proofs

Lemma 1

([18,38]). Suppose that (A4) holds. We have

(i)

K_{0} (t, s) \leq c_{k} / {(1 + | t - s |)}^{k}

and

K_{k} (t, s) \leq 2^{k} c_{k} / {(1 + 2^{k} | t - s |)}^{k}

, where k is a positive integer and

c_{k}

is a constant depending on k only.

(ii)

{sup}_{0 \leq t, s \leq 1} | K_{m} (t, s) | = O (2^{m})

.

(iii)

{sup}_{0 \leq t \leq 1} \int_{0}^{1} | K_{m} (t, s) | d s \leq c

, where c is a positive constant.

(iv)

\int_{0}^{1} K_{m} (t, s) d s \to 1

uniformly in

t \in [0, 1]

, as

m \to \infty

.

Lemma 2

([18]). Suppose that (A4)–(A5) hold and

h (\cdot)

satisfies (A2)–(A3). Then

sup_{0 \leq t \leq 1} |h (t) - \sum_{i = 1}^{n} h (t_{i}) \int_{A_{i}} K_{m} (t, s) d s| = O (n^{- γ}) + O (η_{m}),

where

η_{m} = \begin{cases} {(1 / 2^{m})}^{ν - 1 / 2} & \text{if } 1 / 2 < ν < 3 / 2, \\ \sqrt{m} / 2^{m} & \text{if } ν = 3 / 2, \\ 1 / 2^{m} & \text{if } ν > 3 / 2 . \end{cases}
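The three regimes of η_m can be written as a small function, which makes it easy to compare the approximation error across smoothness levels. This is a direct transcription of the displayed definition; the function name `eta_m` is a hypothetical choice.

```python
import math


def eta_m(m, nu):
    """Approximation-error rate eta_m as a function of the level m and the order nu."""
    if nu <= 0.5:
        raise ValueError("the definition requires nu > 1/2")
    if nu < 1.5:
        return (2.0**-m) ** (nu - 0.5)
    if nu == 1.5:
        return math.sqrt(m) / 2**m
    return 2.0**-m


# At m = 4: rougher functions (smaller nu) give a larger bound.
rates = [eta_m(4, nu) for nu in (1.0, 1.5, 2.0)]
```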

Lemma 3

([39]). Let

{λ_{n} (θ), θ \in Θ}

be a sequence of random convex functions defined on a convex, open subset Θ of

R^{d}

. Suppose

λ (\cdot)

is a real-valued function on Θ for which

λ_{n} (θ) \to λ (θ)

with probability 1, for each fixed θ in Θ. Then for each compact subset K of Θ, with probability 1,

sup_{θ \in K} | λ_{n} (θ) - λ (θ) | \to 0 .

Lemma 4.

Let

{(X_{i}, e_{i}), 1 \leq i \leq n}

be a stationary sequence satisfying the mixing condition

α (ℓ) = O (ℓ^{- κ_{0}})

for some

κ_{0} > \frac{9 (1 + d) δ + 6}{2 (1 - d) δ - 4}

,

δ > \frac{2}{1 - d}

; and

2^{m} = O (n^{d})

with

0 < d < 1

. Further, assume that

E {| e_{1} X_{1} |^{δ}} < \infty

. If Conditions (A4) and (A5) hold, then

sup_{t \in [0, 1]} |\sum_{i = 1}^{n} \{e_{i} X_{i} - E (e_{i} X_{i})\} \int_{A_{i}} K_{m} (t, s) d s| = O (\sqrt{\frac{2^{m} log n}{n}}), a.s.

Remark 5.

In Lemma 4, we assume that

{X_{i}, 1 \leq i \leq n}

is a sequence of a 1-dimensional random variable. In fact,

X_{i} \in R^{p}

for the fixed p, we also have the same result as Lemma 4.

Proof.

The lemma is similar to Lemma A.4 in [29] but has some differences. We suppose

E (e_{i} X_{i}) = 0

. If

E (e_{i} X_{i}) \neq 0

, the method of the proof is the same. Let

Q_{m} (t) = \sum_{i = 1}^{n} e_{i} X_{i} \int_{A_{i}} K_{m} (t, s) d s

. Partition the interval [0, 1] into

N = ⌊ {(n 2^{3 m})}^{1 / 2} ⌋

subintervals

I_{j}

of equal length. Let

t_{j}

be the centers of

I_{j}

. Notice that

| Q_{m} (t) - Q_{m} (t^{'}) | \leq C 2^{2 m} | t - t^{'} | \frac{1}{n} \sum_{i = 1}^{n} | e_{i} X_{i} | \leq C 2^{2 m} | t - t^{'} | E | e X |, a.s.

(5)

One obtains

\begin{aligned} sup_{t \in [0, 1]} | Q_{m} (t) | & \leq max_{1 \leq j \leq N} sup_{t \in I_{j}} | Q_{m} (t) - Q_{m} (t_{j}) | + max_{1 \leq j \leq N} | Q_{m} (t_{j}) | \\ & \leq max_{1 \leq j \leq N} | Q_{m} (t_{j}) | + C \sqrt{2^{m} / n} . \end{aligned}

(6)

Let

Q_{m}^{B} (t) = \sum_{i = 1}^{n} e_{i} X_{i} I (| e_{i} X_{i} | \leq B_{n}) \int_{A_{i}} K_{m} (t, s) d s

, and take

B_{n} = n^{δ^{- 1} + ϵ}

for some

ϵ > 0

. Note that

\sum_{i} P (| e_{i} X_{i} | > B_{i}) < \sum_{i} B_{i}^{- δ} E {| e_{1} X_{1} |}^{δ} < \infty .

By the Borel–Cantelli lemma,

| e_{i} X_{i} | \leq B_{i}

, a.s., for sufficiently large i. Hence,

| e_{i} X_{i} | \leq B_{n}, a.s., for all i \leq n,

(7)

for all sufficiently large n. In addition,

\begin{aligned} sup_{t \in [0, 1]} | E [Q_{m} (t) - Q_{m}^{B} (t)] | & = sup_{t \in [0, 1]} |\sum_{i = 1}^{n} E \{e_{i} X_{i} I [| e_{i} X_{i} | > B_{n}]\} \int_{A_{i}} K_{m} (t, s) d s| \\ & \leq C B_{n}^{1 - δ} E {| e_{1} X_{1} |}^{δ} \leq C B_{n}^{1 - δ} . \end{aligned}

(8)

From Equations (7) and (8), we respectively have

sup_{t \in [0, 1]} | Q_{m} (t) - Q_{m}^{B} (t) - E [Q_{m} (t) - Q_{m}^{B} (t)] | = O (B_{n}^{1 - δ}) = o (n^{- 1 / 2}), a . s .

(9)

Further, we have

sup_{t \in [0, 1]} | Q_{m} (t) | \leq max_{1 \leq j \leq N} | Q_{m}^{B} (t_{j}) - E (Q_{m}^{B} (t_{j})) | + C \sqrt{2^{m} / n} .

(10)

Let

{\tilde{X}}_{i} = n (e_{i} X_{i} I (| e_{i} X_{i} | \leq B_{n}) \int_{A_{i}} K_{m} (t, s) d s - E [e_{i} X_{i} I (| e_{i} X_{i} | \leq B_{n}) \int_{A_{i}} K_{m} (t, s) d s])

. Note that

| {\tilde{X}}_{i} | \leq C 2^{m} B_{n}

. By Theorem 2.18 (ii) in Fan and Yao (2003) [35], take

h_{n} = {(M 2^{m} log n / n)}^{1 / 2}

, for any

η > 0

and sufficiently large M, for each

q \in [1, n / 2]

, we have

\begin{aligned} & \sum_{n = 1}^{\infty} P (max_{1 \leq j \leq N} |Q_{m}^{B} (t_{j}) - E (Q_{m}^{B} (t_{j}))| > η h_{n}) \\ & \quad \leq C \sum_{n = 1}^{\infty} N \{exp (- \frac{h_{n}^{2} q}{v^{2} (q)}) + {(1 + \frac{2^{m} B_{n}}{h_{n}})}^{1 / 2} q α (k)\}, \end{aligned}

where

k = [n / (2 q)]

,

v^{2} (q) = 2 σ^{2} (q) / k^{2} + C 2^{m} B_{n} h_{n}

,

σ^{2} (q) = max_{0 \leq j \leq 2 q - 1} Var \{{\tilde{X}}_{j k + 1} + \dots + {\tilde{X}}_{(j + 1) k + 1}\} .

Taking

k = {(B_{n} h_{n})}^{- 1}

, we obtain

σ^{2} (q) = O (2^{m} k)

. Assume

2^{m} = O (n^{d})

,

0 < d < 1

. We have

\begin{aligned} & \sum_{n = 1}^{\infty} P (max_{1 \leq j \leq N} |Q_{m}^{B} (t_{j}) - E (Q_{m}^{B} (t_{j}))| > η h_{n}) \\ & \quad \leq C \sum_{n = 1}^{\infty} N \{exp (- \frac{C n h_{n}^{2}}{2^{m}}) + C \sqrt{2^{m}} n B_{n}^{κ_{0} + 1.5} h_{n}^{κ_{0} + 0.5}\} \\ & \quad = C \sum_{n = 1}^{\infty} n^{- C M + 2} + C \sum_{n = 1}^{\infty} n^{\frac{5}{4} + (δ^{- 1} + ϵ) (κ_{0} + 1.5) - \frac{κ_{0}}{2}} {(2^{m})}^{\frac{9 + 2 κ_{0}}{4}} {(log n)}^{\frac{κ_{0} + 0.5}{2}} \\ & \quad \leq C \sum_{n = 1}^{\infty} n^{- C M + 2} + C \sum_{n = 1}^{\infty} n^{\frac{5}{4} + δ^{- 1} (κ_{0} + 1.5) - \frac{κ_{0}}{2}} {(2^{m})}^{\frac{9 + 2 κ_{0}}{4}} n^{2 ϵ (κ_{0} + 1.5)} \\ & \quad \leq C \sum_{n = 1}^{\infty} n^{- C M + 2} + C \sum_{n = 1}^{\infty} n^{(\frac{1}{δ} - \frac{1}{2} + \frac{d}{2}) κ_{0} + \frac{5}{4} + \frac{9 d}{4} + \frac{3}{2 δ}} n^{2 ϵ (κ_{0} + 1.5)} < \infty, \end{aligned}

since we have

(\frac{1}{δ} - \frac{1}{2} + \frac{d}{2}) κ_{0} + \frac{5}{4} + \frac{9 d}{4} + \frac{3}{2 δ} + 2 ϵ (κ_{0} + 1.5) < - 1

when

κ_{0} > \frac{9 (1 + d) δ + 6}{2 (1 - d) δ - 4}

and

ϵ

is small enough, with

δ > 2 / (1 - d)

. By the Borel–Cantelli Lemma and Equation (10), we prove Lemma 4. □
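The summability step can be checked with exact rational arithmetic. The sketch below takes the exponent of n in the second series, with ϵ set to zero and with constant term 3/(2δ), which is what the factor δ^{-1} (κ_0 + 1.5) contributes, and verifies that it equals -1 exactly at the threshold value of κ_0 and drops below -1 beyond it. The function names and the inputs d = 1/2, δ = 10 are hypothetical.

```python
from fractions import Fraction


def kappa0_threshold(d, delta):
    """Lower bound on kappa_0 required in Lemma 4."""
    return (9 * (1 + d) * delta + 6) / (2 * (1 - d) * delta - 4)


def summand_exponent(kappa0, d, delta):
    """Exponent of n in the second series when epsilon = 0."""
    return ((1 / delta - Fraction(1, 2) + d / 2) * kappa0
            + Fraction(5, 4) + 9 * d / 4 + 3 / (2 * delta))


d, delta = Fraction(1, 2), Fraction(10)  # delta > 2 / (1 - d) = 4
threshold = kappa0_threshold(d, delta)
at_threshold = summand_exponent(threshold, d, delta)
beyond = summand_exponent(threshold + 1, d, delta)
```

Because the coefficient of κ_0 is negative whenever δ > 2 / (1 - d), any κ_0 above the threshold leaves room for a small positive ϵ while keeping the series convergent.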

Lemma 5.

Under conditions in Theorem 1, we have

sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} \{φ_{τ} (ϵ_{i}^{*}) - φ_{τ} (ϵ_{i})\} X_{i} \int_{A_{i}} K_{m} (t, s) d s∥ = O (n^{- γ} + η_{m} + \sqrt{\frac{2^{m} log n}{n}}), a . s .,

where

ϵ_{i}^{*} = ϵ_{i} + X_{i}^{T} [β (t_{i}) - β (t)]

.

Proof.

Let

e_{i} = I [ϵ_{i} \leq - X_{i}^{T} (β (t_{i}) - β (t))] - I [ϵ_{i} \leq 0]

. By Lemmas 2 and 4, we have

\begin{aligned} & sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} \{φ_{τ} (ϵ_{i}^{*}) - φ_{τ} (ϵ_{i})\} X_{i} \int_{A_{i}} K_{m} (t, s) d s∥ = sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} e_{i} X_{i} \int_{A_{i}} K_{m} (t, s) d s∥ \\ & \quad \leq sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} E (e_{i} X_{i}) \int_{A_{i}} K_{m} (t, s) d s∥ + sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} [e_{i} X_{i} - E (e_{i} X_{i})] \int_{A_{i}} K_{m} (t, s) d s∥ \\ & \quad \leq sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} E [\{F_{ϵ | X} (- X_{i}^{T} [β (t_{i}) - β (t)]) - F_{ϵ | X} (0)\} X_{i}] \int_{A_{i}} K_{m} (t, s) d s∥ + O (\sqrt{\frac{2^{m} log n}{n}}) \\ & \quad = sup_{t \in [0, 1]} ∥\sum_{i = 1}^{n} E \{f_{ϵ | X} (0) X_{i} X_{i}^{T} [β (t_{i}) - β (t)]\} \int_{A_{i}} K_{m} (t, s) d s∥ + O (\sqrt{\frac{2^{m} log n}{n}}) \\ & \quad = sup_{t \in [0, 1]} ∥Ω_{x} \{β (t) - \sum_{i = 1}^{n} β (t_{i}) \int_{A_{i}} K_{m} (t, s) d s\}∥ + O (\sqrt{\frac{2^{m} log n}{n}}) \\ & \quad = O (n^{- γ} + η_{m} + \sqrt{\frac{2^{m} log n}{n}}), a.s. \end{aligned}

(11)

This completes the proof of Lemma 5. □

Lemma 6.

Under conditions in Theorem 1, for fixed

θ

, we have

sup_{t \in [0, 1]} ∥E (R_{n} (θ, t)) - \frac{1}{2} θ^{T} Ω_{x} θ∥ = O (\sqrt{\frac{2^{m} log n}{n}})

(12)

and

sup_{t \in [0, 1]} ∥R_{n} (θ, t) - E (R_{n} (θ, t))∥ = O (\sqrt{\frac{2^{m} log n}{n}}), a . s .,

(13)

where

R_{n} (θ, t)

is defined by (18) in the proof of Theorem 1.

Proof.

For Equation (12), by Condition (A1) (iii) and Lemma 4, one obtains

\begin{aligned} E (R_{n} (θ, t) | X) & = n 2^{- m} \sum_{i = 1}^{n} \int_{0}^{v_{i}} \{F_{ϵ | X} (s - X_{i}^{T} [β (t_{i}) - β (t)]) - F_{ϵ | X} (- X_{i}^{T} [β (t_{i}) - β (t)])\} d s \int_{A_{i}} K_{m} (t, s) d s \\ & = n 2^{- m} \sum_{i = 1}^{n} \int_{0}^{v_{i}} f_{ϵ | X} (0) s d s \int_{A_{i}} K_{m} (t, s) d s (1 + o (1)) \\ & = n 2^{- m} \frac{1}{2} θ^{T} \{\frac{2^{m}}{n} \sum_{i = 1}^{n} f_{ϵ | X} (0) X_{i} X_{i}^{T} \int_{A_{i}} K_{m} (t, s) d s\} θ (1 + o (1)) \\ & = \frac{1}{2} θ^{T} Ω_{x} θ + o ({∥ θ ∥}^{2}) + O (\sqrt{\frac{2^{m} log n}{n}}), a.s. \end{aligned}

(14)

Note that

E (R_{n} (θ, t)) = E [E (R_{n} (θ, t) | X)]

. Thus, Equation (12) holds.

Now, let us prove Equation (13). Notice that, for any

k > 0

,

| I (ϵ_{i}^{*} \leq s) - I (ϵ_{i}^{*} \leq 0) |^{k} = I (d_{1} \leq ϵ_{i} \leq d_{2}),

(15)

where

d_{1} = min (c_{1}, s + c_{1})

, and

d_{2} = max (c_{1}, s + c_{1})

with

c_{1} = - X_{i}^{T} [β (t_{i}) - β (t)]

. Let

{\tilde{v}}_{i} = n 2^{- m} \int_{0}^{v_{i}} {I (ϵ_{i}^{*} \leq s) - I (ϵ_{i}^{*} \leq 0)} d s

. To prove Equation (13), we only need to show that

E (| {\tilde{v}}_{i} |^{δ}) < \infty

by the proof of Lemma 4. By Equation (15) and Jensen’s inequality, one obtains

\begin{aligned} E (| {\tilde{v}}_{i} |^{δ}) & = E \{E (| {\tilde{v}}_{i} |^{δ} | X)\} \\ & \leq E \{({[n 2^{- m} \int_{0}^{v_{i}} E | I (ϵ_{i}^{*} \leq s) - I (ϵ_{i}^{*} \leq 0) | d s]}^{δ} | X)\} \\ & = E \{({[n 2^{- m} \int_{0}^{v_{i}} | F_{ϵ | X} (d_{2}) - F_{ϵ | X} (d_{1}) | d s]}^{δ} | X)\} \\ & = E \{({[n 2^{- m} \int_{0}^{v_{i}} f_{ϵ | X} (0) s d s]}^{δ} | X)\} (1 + o (1)) \\ & = E {(X_{1}^{T} θ)}^{2 δ} (1 + o (1)) < \infty . \end{aligned}

By Lemma 4, we obtain Equation (13). □

In what follows, we give the proofs of the main results.

Proof of Theorem 1.

Recall that

{\hat{β}}_{n} (t) = \hat{b}

and

\hat{b}

minimizes

\sum_{i = 1}^{n} ρ_{τ} \{y_{i} - X_{i}^{T} b\} \int_{A_{i}} K_{m} (t, s) d s .

Let

\hat{θ} = \sqrt{n 2^{- m}} (\hat{b} - β (t))

and

ϵ_{i}^{*} = ϵ_{i} + X_{i}^{T} [β (t_{i}) - β (t)]

. The behavior of

\hat{θ}

follows from consideration of the objective function

G_{n} (θ; t) = n 2^{- m} \sum_{i = 1}^{n} \{ρ_{τ} (ϵ_{i}^{*} - \sqrt{2^{m} / n} X_{i}^{T} θ) - ρ_{τ} (ϵ_{i}^{*})\} \int_{A_{i}} K_{m} (t, s) d s .

The function

G_{n} (θ; t)

is obviously convex and is minimized at

\hat{θ}

. It is sufficient to show that

G_{n} (θ; t)

converges to its expectation since it follows from Lemma 3 that the convergence is also uniform on any compact set K of

θ

. Using the identity of [40],

ρ_{τ} (u - v) - ρ_{τ} (u) = - v φ_{τ} (u) + \int_{0}^{v} {I (u \leq s) - I (u \leq 0) d s},

where

φ_{τ} (u) = τ - I {u < 0}

; we may write

G_{n} (θ; t) = θ^{T} W_{n} (t) + E (R_{n} (θ, t)) + (R_{n} (θ, t) - E [R_{n} (θ, t)]),

(16)

where

W_{n} (t) = - \sqrt{n / 2^{m}} \sum_{i = 1}^{n} φ_{τ} (ϵ_{i}^{*}) X_{i} \int_{A_{i}} K_{m} (t, s) d s,

(17)

R_{n} (θ, t) = n 2^{- m} \sum_{i = 1}^{n} \int_{0}^{v_{i}} {I (ϵ_{i}^{*} \leq s) - I (ϵ_{i}^{*} \leq 0)} d s \int_{A_{i}} K_{m} (t, s) d s,

(18)

with

v_{i} = \sqrt{2^{m} / n} X_{i}^{T} θ

.
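The identity used in this decomposition can be verified numerically. The sketch below evaluates both sides of ρ_τ (u - v) - ρ_τ (u) = - v φ_τ (u) + \int_{0}^{v} {I (u \leq s) - I (u \leq 0)} d s on a grid, using a closed form for the integral. The point u = 0 is excluded from the grid, because with the convention φ_τ (u) = τ - I {u < 0} the two sides differ there; this does not affect the argument when the errors have a continuous distribution. The function names are hypothetical.

```python
def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def score(u, tau):
    """phi_tau(u) = tau - 1{u < 0}."""
    return tau - (1.0 if u < 0 else 0.0)


def indicator_integral(u, v):
    """Closed form of the integral from 0 to v of 1{u <= s} - 1{u <= 0} ds."""
    if v >= 0:
        return max(v - u, 0.0) if u > 0 else 0.0
    return max(u - v, 0.0) if u <= 0 else 0.0


def identity_gap(u, v, tau):
    """Left-hand side minus right-hand side of the identity."""
    lhs = rho(u - v, tau) - rho(u, tau)
    rhs = -v * score(u, tau) + indicator_integral(u, v)
    return lhs - rhs


grid = [k / 4.0 for k in range(-8, 9)]
max_gap = max(abs(identity_gap(u, v, tau))
              for u in grid if u != 0.0
              for v in grid
              for tau in (0.25, 0.5, 0.75))
```

The integral term is nonnegative for every u and v, which is what makes the remainder R_n (θ, t) behave like a quadratic form in θ after taking expectations.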

To obtain strong Bahadur representation of the quantile-wavelet estimator, we first need to show the uniform approximation of

G_{n} (θ; t)

on

t \in [0, 1]

with probability 1 for the fixed

θ

by the three terms in Equation (16). By Lemma 6, we have

\begin{matrix} G_{n} (θ; t) & = & θ^{T} W_{n} (t) + \frac{1}{2} θ^{T} Ω_{x} θ + O (\sqrt{\frac{2^{m} log n}{n}}) a . s . \end{matrix}

(19)

uniformly on

t \in [0, 1]

. Let

a_{n}^{2} = \sqrt{\frac{2^{m}}{n}} log n

. We have

sup_{t \in [0, 1]} a_{n}^{- 2} |G_{n} (θ; t) - θ^{T} W_{n} (t) - \frac{1}{2} θ^{T} Ω_{x} θ| = O (\frac{1}{\sqrt{log n}}) = o (1), a . s . .

(20)

Second, the strong Bahadur representation requires the existence of one compact subset K with probability measure 1 that will suffice for all

θ

. We prove it by applying a stronger convexity lemma (See Lemma 3) than one of [41]. However, the arguments to prove it are essentially the same as in [41].

Let

\bar{θ} = - Ω_{x}^{- 1} W_{n} (t)

. It is easy to see that

W_{n} (t)

has a bounded second moment and hence is stochastically bounded. Since the convex function

λ_{n} (θ) = G_{n} (θ; t) - θ^{T} W_{n} (t)

converges with probability 1 to the convex function

λ (θ) = \frac{1}{2} θ^{T} Ω_{x} θ

, it follows from the convexity Lemma 3 that for any compact subset

K \subset R^{p}

,

sup_{θ \in K} a_{n}^{- 2} |G_{n} (θ; t) - θ^{T} W_{n} (t) - \frac{1}{2} θ^{T} Ω_{x} θ| = o (1), a . s .

The argument will be complete if we can show for each

ε > 0

that, with probability 1,

∥ \hat{θ} - \bar{θ} ∥ < a_{n} ε .

Because

\bar{θ}

converges almost surely by Lemmas 4 and 5, it is bounded with probability 1. The compact subset K can be chosen to contain

B (n)

, a closed ball with center

\bar{θ}

and radius

a_{n} ε

, thereby implying that

Δ_{n} \equiv sup_{θ \in B (n)} a_{n}^{- 2} |G_{n} (θ; t) - θ^{T} W_{n} (t) - \frac{1}{2} θ^{T} Ω_{x} θ| = o (1), a.s.

(21)

Now, we consider the behavior of

G_{n} (θ; t)

outside K. For any

θ = \bar{θ} + a_{n} ϱ υ

, with

ϱ > ε

and

υ

a unit vector. Define

θ^{*}

as the boundary point of

B (n)

that lies on the line segment from

\bar{θ}

to

θ

, that is,

θ^{*} = \bar{θ} + a_{n} ε υ

. Convexity of

G_{n} (θ; t)

and the definition of

Δ_{n}

imply

\begin{aligned} \frac{ε}{ϱ} G_{n} (θ; t) + (1 - \frac{ε}{ϱ}) G_{n} (\bar{θ}; t) & \geq G_{n} (θ^{*}; t) \\ & \geq - {(θ^{*})}^{T} Ω_{x} \bar{θ} + \frac{1}{2} {(θ^{*})}^{T} Ω_{x} θ^{*} - a_{n}^{2} Δ_{n} \\ & = \frac{1}{2} a_{n}^{2} υ^{T} Ω_{x} υ ε^{2} - \frac{1}{2} {\bar{θ}}^{T} Ω_{x} \bar{θ} - a_{n}^{2} Δ_{n} \\ & \geq G_{n} (\bar{θ}; t) + a_{n}^{2} \{\frac{1}{2} υ^{T} Ω_{x} υ ε^{2} - 2 Δ_{n}\} . \end{aligned}

The last expression does not depend on

θ

. It follows that

inf_{∥ θ - \bar{θ} ∥ > a_{n} ε} G_{n} (θ; t) \geq G_{n} (\bar{θ}, t) + a_{n}^{2} \frac{ϱ}{ε} [\frac{1}{2} υ^{T} Ω_{x} υ ε^{2} - 2 Δ_{n}] .

As

Ω_{x}

is positive definite, it follows from (21) that, with probability 1,

2 Δ_{n} < \frac{1}{2} υ^{T} Ω_{x} υ ε^{2}

for large enough n. This implies that for any

ε > 0

and for large enough n, the minimum of

G_{n} (θ; t)

must be achieved within

B (n)

, i.e.,

∥ \hat{θ} - \bar{θ} ∥ < a_{n} ε

, that is,

\sqrt{n 2^{- m}} (\hat{β} (t) - β (t)) = - Ω_{x}^{- 1} W_{n} (t) + O (a_{n}), a . s . .

One obtains

R_{n} (m; γ, ν) = O \{{(\frac{2^{m}}{n})}^{3 / 4} {(log n)}^{1 / 2}\} .

We complete the proof of Theorem 1. □

Proof of Corollary 1.

From Theorem 1, and Lemmas 4 and 5, it is easy to obtain the result of Corollary 1. □

Proof of Corollary 2.

By Theorem 1 and Lemma 5, we only need to verify the asymptotic normality of

U_{n} (t) = \sqrt{n / 2^{m}} \sum_{i = 1}^{n} φ_{τ} (ϵ_{i}) X_{i} \int_{A_{i}} K_{m} (t, s) d s .


First, we compute its variance-covariance matrix. Let

V_{i} = \sqrt{n / 2^{m}} φ_{τ} (ϵ_{i}) X_{i} \int_{A_{i}} K_{m} (t, s) d s

. We know

E (V_{i}) = 0

. Let

S_{1} = {(j, i) : 1 \leq j - i \leq d_{n}; 1 \leq i < j \leq n}

and

S_{2} = {(j, i) : 1 \leq i < j \leq n} - S_{1}

with

d_{n} \to \infty

specified later. We have

\begin{aligned} Var (U_{n} (t)) & = Var \{\sum_{i = 1}^{n} V_{i}\} \leq \sum_{i = 1}^{n} E (V_{i} V_{i}^{T}) + 2 \sum_{1 \leq i < j \leq n} Cov (V_{i}, V_{j}) \\ & = \sum_{i = 1}^{n} E (V_{i} V_{i}^{T}) + 2 \sum_{S_{1}} Cov (V_{i}, V_{j}) + 2 \sum_{S_{2}} Cov (V_{i}, V_{j}) \\ & = I_{1} + I_{2} + I_{3} . \end{aligned}

(22)

For

I_{1}

, by Theorem 3.3 and Lemma 6.1 in [18], one obtains

\begin{aligned} I_{1} & = \sum_{i = 1}^{n} E (E (V_{i} V_{i}^{T} | X_{i})) = n 2^{- m} τ (1 - τ) E (X_{1} X_{1}^{T}) \sum_{i = 1}^{n} {(\int_{A_{i}} K_{m} (t, s) d s)}^{2} \\ & = τ (1 - τ) E (X_{1} X_{1}^{T}) \{n 2^{- m} \sum_{i = 1}^{n} {(\int_{A_{i}} K_{m} (t, s) d s)}^{2} - 2^{- m} \int_{0}^{1} E_{m}^{2} (t, s) κ (s) d s\} \\ & \quad + τ (1 - τ) E (X_{1} X_{1}^{T}) 2^{- m} \int_{0}^{1} E_{m}^{2} (t, s) κ (s) d s \\ & = τ (1 - τ) κ (t) E (X_{1} X_{1}^{T}) \int_{0}^{1} E_{0}^{2} (0, s) d s + O (n ρ (n) + 2^{m} / n) . \end{aligned}

(23)

For

I_{2}

, since

n 2^{- m} \to \infty

, take

d_{n} \to \infty

such that

d_{n} / (n 2^{- m}) \to 0

. Let

{(I_{2})}_{k, l}

be the

(k, l)

th component of

I_{2}

. By Lemma 1, we have

\begin{matrix} | {(I_{2})}_{k, l} | & \leq & 2 \sum_{S_{1}} \{E | V_{i k} V_{j l} | + E | V_{i k} | E | V_{j l} |\} \\ & \leq & C n 2^{- m} \sum_{S_{1}} \{E | X_{i k} X_{j l} | \left|\int_{A_{i}} K_{m} (t, s) d s \int_{A_{j}} K_{m} (t, s) d s\right| + E^{2} ∥ X_{1} ∥ {\left|\int_{A_{i}} K_{m} (t, s) d s\right|}^{2}\} \\ & = & n 2^{- m} O ({(2^{m} / n)}^{2} d_{n}) = O (\frac{2^{m}}{n} d_{n}) = o (1) . \end{matrix}

(24)

For

I_{3}

, let

{(I_{3})}_{k, l}

be the

(k, l)

th component of

I_{3}

. Noting that

κ_{0} > (2 + δ) / δ

, by Proposition 2.5 (i) in [35], we have

\begin{matrix} | {(I_{3})}_{k, l} | & \leq & C \left|\sum_{S_{2}} {[E | V_{i k} |^{2 + δ}]}^{2 / (2 + δ)} α^{δ / (2 + δ)} (j - i)\right| \\ & \leq & C n 2^{- m} {(\frac{2^{m}}{n})}^{2} {\{E | X_{i k} |^{2 + δ}\}}^{2 / (2 + δ)} \sum_{j = d_{n}}^{\infty} α^{δ / (2 + δ)} (j) \\ & \leq & O (\frac{2^{m}}{n} d_{n}^{- κ_{0} δ / (2 + δ) + 1}) = O \{(\frac{2^{m}}{n} d_{n}) d_{n}^{- κ_{0} δ / (2 + δ)}\} = o (1) . \end{matrix}

(25)

From Equations (22)–(25), we obtain

\operatorname{Var} (U_{n} (t)) = τ (1 - τ) κ (t) E (X_{1} X_{1}^{T}) \int_{0}^{1} E_{0}^{2} (0, s) d s + o (1) .

(26)

Similar to the proof of Theorem 2 in [42], by using the small-block and large-block technique and the Cramér–Wold device, one can show that

Z_{n} (t) \to N (0, τ (1 - τ) κ (t) ω_{0}^{2} Ψ_{x}) .

(27)

This, in conjunction with Theorem 1 and the Slutsky Theorem, proves Corollary 2.

5. Simulation Study

To explore the numerical performance of quantile-wavelet estimation, we compare our estimator with quantile regression estimators based on the local linear kernel [43] and splines [44]. We call the three methods QR-wavelet, QR-local-linear and QR-splines, respectively. In the simulation study, our goal is to show that QR-wavelet is robust to heavy-tailed data and more adaptive to nonsmooth nonparametric functions than the local linear and spline methodologies. The data are of the form:

Y_{i} = X_{i} β (i / n) + ϵ_{τ, i}, i = 1, \dots, n,

where

X_{i}

are random design points generated from the normal distribution

N (2, 0 . 1^{2})

independently, and

ϵ_{τ, i} = ε_{i} - F^{- 1} (τ)

with F being the distribution function of

ε_{i}

. Here,

F^{- 1} (τ)

is subtracted from

ε_{i}

to make the

τ

th quantile of

ϵ_{τ, i}

zero for identifiability.
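As a brief check of the centring step: for a continuous F, P(ε_i ≤ F^{-1}(τ)) = τ, so the shifted error satisfies P(ϵ_{τ,i} ≤ 0) = τ and its τth quantile is zero. The following Python sketch draws one sample from the model above for the normal-error case only. The function name `simulate`, its arguments and the seed are illustrative assumptions, not part of the paper; the t(5) case would additionally need the t quantile function, which is omitted here.

```python
import random
from statistics import NormalDist


def simulate(n, tau, beta, sigma=0.1, seed=0):
    """Draw (X_i, Y_i), i = 1..n, from Y_i = X_i * beta(i/n) + eps_i - F^{-1}(tau).

    Assumes N(0, sigma^2) errors; `beta` is any function on [0, 1].
    """
    rng = random.Random(seed)
    q_tau = NormalDist(0.0, sigma).inv_cdf(tau)  # F^{-1}(tau) for N(0, sigma^2)
    xs, ys = [], []
    for i in range(1, n + 1):
        x = rng.gauss(2.0, 0.1)                  # design point X_i ~ N(2, 0.1^2)
        eps = rng.gauss(0.0, sigma) - q_tau      # shifted error, tau-th quantile zero
        xs.append(x)
        ys.append(x * beta(i / n) + eps)
    return xs, ys
```

With `beta` set to the zero function, the responses reduce to the shifted errors, so the share of non-positive responses should be close to τ.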

We set

n = 200

, 300 and 500;

τ = 0.10

, 0.25, 0.50, 0.75 and 0.90;

ε_{i}

is drawn either from the normal distribution

N (0, σ^{2})

with

σ = 0.1

or from the t distribution with d degrees of freedom (denoted as

t (d)

) with

d = 5

. We consider two special curves of

β (t)

for

t \in [0, 1]

as follows:

Case I. Pwpn (piecewise polynomial function):

β (t) = [4 t^{2} (3 - 4 t)] 1_{[0, 0.5]} (t) + [\frac{4}{3} t (4 t^{2} - 10 t + 7) - 1.5] 1_{(0.5, 0.75]} (t) + [\frac{16}{3} t {(t - 1)}^{2}] 1_{(0.75, 1]} (t)

. The function is generally smooth except for a jump at

t = 0.5

.
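To make the jump concrete, the Pwpn formula above can be evaluated directly: the first piece gives β(0.5) = 1, while the second piece tends to 0.5 as t decreases to 0.5, and the second and third pieces agree at t = 0.75. A minimal Python transcription follows (the function name `pwpn` is an illustrative choice).

```python
def pwpn(t):
    """Piecewise polynomial test function on [0, 1], transcribed from the formula above."""
    if 0.0 <= t <= 0.5:
        return 4 * t**2 * (3 - 4 * t)
    if 0.5 < t <= 0.75:
        return 4 / 3 * t * (4 * t**2 - 10 * t + 7) - 1.5
    if 0.75 < t <= 1.0:
        return 16 / 3 * t * (t - 1) ** 2
    raise ValueError("t must lie in [0, 1]")
```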

Case II. Blocks:

β (t) = \frac{0.6}{9.2} \{4 ssgn (t - 0.1) - 5 ssgn (t - 0.13) + 3 ssgn (t - 0.15) - 4 ssgn (t - 0.23) + 5 ssgn (t - 0.25) - 4.2 ssgn (t - 0.4) + 2.1 ssgn (t - 0.44) + 4.3 ssgn (t - 0.65) - 3.1 ssgn (t - 0.76) + 2.1 ssgn (t - 0.78) - 4.2 ssgn (t - 0.81)\} + 0.2

, where

ssgn (t) = [1 + sgn (t)] / 2

with

sgn (t) = 1_{(0, \infty)} (t) - 1_{(- \infty, 0)} (t)

. It is a step function with many jumps, which are difficult for local linear and spline smoothing methods to capture.
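The Blocks function can be transcribed in the same way. In the sketch below, the jump locations and heights are copied from the formula above, while the names `JUMPS`, `ssgn` and `blocks` are illustrative. Since the eleven heights sum to zero, the function returns to its baseline value 0.2 at t = 1.

```python
# (location, height) pairs copied from the Blocks formula
JUMPS = [
    (0.10, 4.0), (0.13, -5.0), (0.15, 3.0), (0.23, -4.0), (0.25, 5.0),
    (0.40, -4.2), (0.44, 2.1), (0.65, 4.3), (0.76, -3.1), (0.78, 2.1),
    (0.81, -4.2),
]


def ssgn(t):
    """[1 + sgn(t)] / 2, with sgn(0) = 0 as in the definition above."""
    if t > 0:
        return 1.0
    if t < 0:
        return 0.0
    return 0.5


def blocks(t):
    """Step function with eleven jumps, rescaled by 0.6 / 9.2 and shifted by 0.2."""
    return 0.6 / 9.2 * sum(h * ssgn(t - pos) for pos, h in JUMPS) + 0.2
```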

Pwpn and Blocks are shown in Figure 1 and Figure 2, respectively. In the study, we use the Haar wavelet, which is the simplest wavelet. For a given sample size n, we take

2^{m} = 0.6 n^{2 / 3}

in QR-wavelet. In QR-local-linear, the bandwidth h is chosen by leave-one-out cross-validation and we use the Gaussian kernel

K (t) = exp (- t^{2} / 2) / \sqrt{2 π}

. In QR-splines, we use piecewise polynomials of degree three (cubic splines) with

⌊ 0.5 n^{1 / 2} log n ⌋

knots. The performance of the estimators is evaluated via the mean square error (MSE) based on 200 repetitions, which is defined by

MSE = \frac{1}{M} \sum_{k = 1}^{M} {({\hat{β}}_{τ} (t_{k}) - β (t_{k}))}^{2},

where

\{t_{k}, k = 1, \dots, M\}

is a sequence of regular grid points. The MSE results for

{\hat{β}}_{τ} (\cdot)

are listed in Table 1 for Case I (Pwpn) and Table 2 for Case II (Blocks). The actual functions and their estimated curves for the two cases are depicted in Figure 1 and Figure 2 when

n = 300

, and

τ = 0.25

, 0.5 and 0.75, respectively.
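For orientation, the tuning rules and the MSE criterion above can be evaluated numerically. The sketch below assumes that "log" denotes the natural logarithm; under that assumption the rule 2^{m} = 0.6 n^{2/3} gives roughly 20.5, 26.9 and 37.8 for n = 200, 300 and 500, and the spline rule gives 37, 49 and 69 knots. How the wavelet scale is rounded to an integer resolution level m is not stated in the text, so no rounding is applied here. All function names are illustrative.

```python
import math


def wavelet_scale(n):
    """Value of 2^m prescribed by the rule 2^m = 0.6 * n^(2/3)."""
    return 0.6 * n ** (2 / 3)


def n_knots(n):
    """Number of spline knots, floor(0.5 * sqrt(n) * log n), natural log assumed."""
    return math.floor(0.5 * math.sqrt(n) * math.log(n))


def mse(beta_hat, beta, grid):
    """Average squared error of beta_hat against beta over the grid points t_k."""
    return sum((beta_hat(t) - beta(t)) ** 2 for t in grid) / len(grid)


def regular_grid(M):
    """M equally spaced points k / M, k = 1..M (the grid size M is a free choice)."""
    return [k / M for k in range(1, M + 1)]
```

In a replication loop, `mse` would be computed once per simulated data set and then averaged over the 200 repetitions.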

From Table 1 and Table 2, we can make the following observations: (i) The MSEs of each time-varying coefficient

β (\cdot)

obtained by the wavelet, local linear and spline techniques generally decrease as the sample size n increases. The accuracy of QR-wavelet is clearly higher than that of QR-local-linear and QR-splines. (ii) All QR methods work well when the random error comes from the t distribution with five degrees of freedom; that is, QR is robust to heavy-tailed data. Based on MSE only, for Pwpn, which is generally smooth except for a jump, all three methods perform almost equally well, but for Blocks, which has many jumps, QR-wavelet performs better than QR-local-linear and QR-splines. (iii) All three methods perform well at different quantile levels, especially at high and low quantile levels (

τ = 0.9

(high),

τ = 0.1

(low)). We also conducted some experiments based on the above setting using least squares estimation and found that the resulting estimators exhibit systematic bias, especially at high and low quantile levels. However, QR-wavelet generally performs better than QR-local-linear and QR-splines at these extreme quantile levels when there is a large sample size (e.g.,

n = 500

) and multiple jump points (e.g., Blocks). From Figure 1 and Figure 2, the estimates based on QR-local-linear and QR-splines are no better than those of QR-wavelet. For example, in Pwpn, QR-local-linear and QR-splines cannot characterize the shape of the function in the interval

(0.4, 0.6)

; in Blocks, they cannot locate the jump points, while QR-wavelet detects these jump points and represents the localized features of Pwpn and Blocks as a whole. Compared with the local linear and spline methods, the wavelet technique has great advantages in characterizing the local features of the underlying curves. Therefore, QR-wavelet substantially outperforms QR-local-linear and QR-splines for the discontinuous/irregular functional coefficients in our time-varying coefficient models.
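The localization argument can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from this section: the estimator is assumed to minimize the check loss weighted by \int_{A_{i}} E_{m} (t, s) d s (the weights that appear in the proofs), the Haar kernel is taken to be proportional to the indicator that t and s fall in the same dyadic interval of length 2^{-m} (the constant factor does not affect the minimizer), A_{i} is taken to be [(i - 1) / n, i / n), m is an integer, and the design is scalar and positive. Under these assumptions the minimizer reduces to a weighted quantile of the ratios Y_{i} / X_{i} within one dyadic bin, so a jump of β at a dyadic point never mixes observations from both sides. All names are hypothetical.

```python
import random


def haar_weights(t, n, m):
    """Length of the overlap of each A_i with the dyadic bin (length 2**-m) containing t."""
    k = min(int(t * 2**m), 2**m - 1)  # bin index; t = 1 is placed in the last bin
    lo, hi = k / 2**m, (k + 1) / 2**m
    return [max(0.0, min(hi, (i + 1) / n) - max(lo, i / n)) for i in range(n)]


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight reaches tau times the total weight."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= tau * total:
            return v
    return pairs[-1][0]


def qr_haar(t, x, y, tau, m):
    """Minimize sum_i w_i * rho_tau(y_i - x_i * b) over b, assuming every x_i > 0.

    For x_i > 0, rho_tau(y_i - x_i * b) = x_i * rho_tau(y_i / x_i - b), so the
    minimizer is a weighted tau-quantile of y_i / x_i with weights w_i * x_i.
    """
    w = haar_weights(t, len(x), m)
    ratios = [yi / xi for xi, yi in zip(x, y)]
    return weighted_quantile(ratios, [wi * xi for wi, xi in zip(w, x)], tau)


# Toy check on a single-jump coefficient (hypothetical example, not from the paper).
rng = random.Random(1)
n, m, tau = 256, 3, 0.5
step = lambda t: 1.0 if t >= 0.5 else 0.0
x = [rng.gauss(2.0, 0.1) for _ in range(n)]
y = [x[i] * step((i + 1) / n) + rng.gauss(0.0, 0.1) for i in range(n)]
left = qr_haar(0.25, x, y, tau, m)   # uses only observations with i/n in one bin left of the jump
right = qr_haar(0.75, x, y, tau, m)  # uses only observations in one bin right of the jump
```

The fit is piecewise constant on dyadic bins, which is why a kernel or spline smoother that averages across the jump behaves differently near discontinuities.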

Author Contributions

Conceptualization, X.Z.; methodology, X.Z.; validation, G.Y. and Y.X.; investigation, X.Z.; writing—original draft preparation, X.Z.; writing—review and editing, X.Z.; supervision, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Chinese National Social Science Fund (Grant No. 19BTJ034), the National Natural Science Foundation of China (Grant No. 12171242, 11971235) and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (KYCX22_2210, KYCX21_1940).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cochrane, J.H. Asset Pricing; Princeton University Press: Princeton, NJ, USA, 2001.
2. Tsay, R. Analysis of Financial Time Series; Wiley: New York, NY, USA, 2002.
3. Robinson, P.M. Nonparametric Estimation of Time-Varying Parameters. In Statistical Analysis and Forecasting of Economic Structural Change; Hackl, P., Ed.; Springer: Berlin, Germany, 1989; pp. 164–253.
4. Orbe, S.; Ferreira, E.; Rodríguez-Póo, J. Nonparametric estimation of time varying parameters under shape restrictions. J. Econom. 2005, 126, 53–77.
5. Cai, Z. Trending time-varying coefficient time series models with serially correlated errors. J. Econom. 2007, 136, 163–188.
6. Li, D.; Chen, J.; Lin, Z. Statistical inference in partially time-varying coefficient models. J. Stat. Plan. Inference 2011, 141, 995–1013.
7. Wu, W.; Zhou, Z. Nonparametric inference for time-varying coefficient quantile regression. J. Bus. Econ. Stat. 2017, 35, 98–109.
8. Hastie, T.; Tibshirani, R. Varying-coefficient models. J. R. Stat. Soc. Ser. B 1993, 55, 757–796.
9. Fan, J.; Zhang, W. Statistical estimation in varying coefficient models. Ann. Stat. 1999, 27, 1491–1518.
10. Hoover, D.R.; Rice, J.A.; Wu, C.O.; Yang, L.P. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika 1998, 85, 809–822.
11. Huang, J.Z.; Wu, C.O.; Zhou, L. Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Stat. Sin. 2004, 14, 763–788.
12. Zhou, Z.; Wu, W.B. Simultaneous inference of linear models with time varying coefficients. J. R. Stat. Soc. Ser. B 2010, 72, 513–531.
13. Koenker, R. Quantile Regression; Cambridge University Press: Cambridge, UK, 2005.
14. Honda, T. Quantile regression in varying coefficient models. J. Stat. Plan. Inference 2004, 121, 113–125.
15. Cai, Z.; Xu, X. Nonparametric quantile estimations for dynamic smooth coefficient models. J. Am. Stat. Assoc. 2009, 104, 371–383.
16. Kim, M.O. Quantile regression with varying coefficients. Ann. Stat. 2007, 35, 92–108.
17. Andriyana, Y.; Gijbels, I. Quantile regression in heteroscedastic varying coefficient models. AStA Adv. Stat. Anal. 2017, 101, 151–176.
18. Antoniadis, A.; Gregoire, G.; McKeague, I.W. Wavelet methods for curve estimation. J. Am. Stat. Assoc. 1994, 89, 1340–1353.
19. Donoho, D.L.; Johnstone, I.M. Adapting to unknown smoothness via wavelet shrinkage. J. Am. Stat. Assoc. 1995, 90, 1200–1224.
20. Hall, P.; Patil, P. Formulae for mean integrated squared error of non-linear wavelet-based density estimators. Ann. Stat. 1995, 23, 905–928.
21. Härdle, W.; Kerkyacharian, G.; Picard, D.; Tsybakov, A. Wavelets, Approximation, and Statistical Applications; Springer: New York, NY, USA, 1998.
22. Vidakovic, B. Statistical Modeling by Wavelets; Wiley: New York, NY, USA, 1999.
23. Doosti, H.; Afshari, M.; Niroumand, H.A. Wavelets for nonparametric stochastic regression with mixing stochastic process. Commun. Stat. Theory Methods 2008, 37, 373–385.
24. Li, L.; Xiao, Y. Wavelet-based estimation of regression function with strong mixing errors under fixed design. Commun. Stat. Theory Methods 2017, 46, 4824–4842.
25. Zhou, X.C.; Lin, J.G. Asymptotic properties of wavelet estimators in semiparametric regression models under dependent errors. J. Multivar. Anal. 2013, 122, 251–270.
26. Ding, L.; Chen, P.; Li, Y. Berry-Esseen bound of wavelet estimator in heteroscedastic regression model with random errors. Int. J. Comput. Math. 2019, 96, 821–852.
27. Zhou, X.; You, J. Wavelet estimation in varying-coefficient partially linear regression models. Stat. Probab. Lett. 2004, 68, 91–104.
28. Lu, Y.; Li, Z. Wavelet estimation in varying-coefficient models. Chin. J. Appl. Probab. 2009, 25, 409–420.
29. Zhou, X.C.; Xu, Y.Z.; Lin, J.G. Wavelet estimation in varying coefficient models for censored dependent data. Stat. Probab. Lett. 2017, 122, 179–189.
30. Bahadur, R.R. A note on quantiles in large samples. Ann. Math. Statist. 1966, 37, 577–581.
31. He, X.M.; Shao, Q.M. A general Bahadur representation of M-estimators and its application to linear regression with nonstochastic designs. Ann. Stat. 1996, 24, 2608–2630.
32. Yang, W.; Hu, S.; Wang, X. The Bahadur representation for sample quantiles under dependent sequence. Acta Math. Appl. Sin. Engl. Ser. 2019, 35, 521–531.
33. Zhou, X.; Zhu, F. Wavelet-M-estimation for time-varying coefficient time series models. Discret. Dyn. Nat. Soc. 2020, 2020, 1025452.
34. Doukhan, P. Mixing: Properties and Examples. In Lecture Notes in Statistics; Springer: Berlin, Germany, 1994; Volume 85.
35. Fan, J.; Yao, Q. Nonlinear Time Series: Nonparametric and Parametric Methods; Springer: New York, NY, USA, 2003.
36. Aronszajn, N. Theory of reproducing kernels. Trans. Am. Math. Soc. 1950, 68, 337–404.
37. Hong, S.Y. Bahadur representation and its applications for local polynomial estimates in nonparametric M regression. J. Nonparametr. Stat. 2003, 15, 237–251.
38. Walter, G.G. Wavelets and Orthogonal Systems with Applications; CRC Press Inc.: Boca Raton, FL, USA, 1994.
39. Kong, E.; Xia, Y. A single-index quantile regression model and its estimation. Econom. Theory 2012, 28, 730–768.
40. Knight, K. Limiting distributions of L₁ regression estimators under general conditions. Ann. Stat. 1998, 26, 755–770.
41. Pollard, D. Asymptotics for least absolute deviation regression estimators. Econom. Theory 1991, 7, 186–199.
42. Cai, Z.; Fan, J.; Yao, Q. Functional-coefficient regression models for nonlinear time series. J. Am. Stat. Assoc. 2000, 95, 941–956.
43. Wand, M.P.; Jones, M.C. Kernel Smoothing; Chapman and Hall: London, UK, 1995.
44. Chambers, J.M.; Hastie, T.J. Statistical Models in S; Wadsworth & Brooks/Cole: Salt Lake City, UT, USA, 1992.

Figure 1. The true time-varying coefficient function Pwpn and its QR-wavelet, QR-local-linear and QR-splines estimates when

n = 300

, and

τ = 0.25

, 0.50 and 0.75 for the errors

N (0, 1)

and

t (5)

.

Figure 2. The true time-varying coefficient function Blocks and its QR-wavelet, QR-local-linear and QR-splines estimates when

n = 300

, and

τ = 0.25

, 0.50 and 0.75 for the errors

N (0, 1)

and

t (5)

.

Table 1. MSEs of wavelet, local linear and spline smoothing in Case I: Pwpn.

		QR-Wavelet		QR-Local-Linear		QR-Splines
$n$	$τ$	$N (0, {0.1}^{2})$	$t (5)$	$N (0, {0.1}^{2})$	$t (5)$	$N (0, {0.1}^{2})$	$t (5)$
200	0.10	0.00073	0.02273	0.00644	0.00479	0.01032	0.00824
	0.25	0.00065	0.00125	0.00383	0.00299	0.00859	0.00584
	0.50	0.00062	0.00085	0.00213	0.00293	0.00530	0.00529
	0.75	0.00065	0.00104	0.00293	0.00262	0.00860	0.00532
	0.90	0.00074	0.02208	0.00474	0.00345	0.01011	0.00660
300	0.10	0.00042	0.00966	0.00605	0.00287	0.01049	0.00557
	0.25	0.00040	0.00157	0.00355	0.00271	0.00868	0.00539
	0.50	0.00037	0.00055	0.00197	0.00250	0.00529	0.00524
	0.75	0.00039	0.00141	0.00279	0.00244	0.00867	0.00534
	0.90	0.00041	0.01121	0.00455	0.00254	0.01017	0.00561
500	0.10	0.00011	0.00319	0.00516	0.00262	0.00514	0.00319
	0.25	0.00010	0.00041	0.00311	0.00227	0.00358	0.00309
	0.50	0.00010	0.00026	0.00176	0.00213	0.00303	0.00300
	0.75	0.00010	0.00045	0.00255	0.00221	0.00354	0.00306
	0.90	0.00011	0.00354	0.00411	0.00252	0.00532	0.00339

Table 2. MSEs of wavelet, local linear and spline smoothing in Case II: Blocks.

		QR-Wavelet		QR-Local-Linear		QR-Splines
$n$	$τ$	$N (0, {0.1}^{2})$	$t (5)$	$N (0, {0.1}^{2})$	$t (5)$	$N (0, {0.1}^{2})$	$t (5)$
200	0.10	0.00573	0.01904	0.01114	0.00714	0.01730	0.00897
	0.25	0.00503	0.00368	0.00661	0.00593	0.00921	0.00778
	0.50	0.00378	0.00347	0.00599	0.00587	0.00803	0.00765
	0.75	0.00418	0.00384	0.00721	0.00613	0.00894	0.00773
	0.90	0.00574	0.02662	0.00865	0.00822	0.01150	0.01086
300	0.10	0.00624	0.01358	0.01059	0.00622	0.01751	0.00777
	0.25	0.00611	0.00553	0.00647	0.00556	0.00925	0.00764
	0.50	0.00484	0.00448	0.00584	0.00567	0.00800	0.00761
	0.75	0.00674	0.00557	0.00704	0.00570	0.00897	0.00775
	0.90	0.00682	0.01421	0.00852	0.00592	0.01139	0.00794
500	0.10	0.00562	0.00426	0.00958	0.00562	0.01541	0.00694
	0.25	0.00385	0.00368	0.00634	0.00528	0.00751	0.00671
	0.50	0.00310	0.00285	0.00557	0.00528	0.00706	0.00669
	0.75	0.00357	0.00325	0.00679	0.00550	0.00757	0.00681
	0.90	0.00408	0.00481	0.00813	0.00545	0.01000	0.00684

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
