Article

Non-Parametric Conditional U-Processes for Locally Stationary Functional Random Fields under Stochastic Sampling Design

by
Salim Bouzebda
*,† and
Inass Soukarieh
Laboratory of Applied Mathematics of Compiègne (LMAC), Université de Technologie de Compiègne, 60203 Compiègne CEDEX, France
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2023, 11(1), 16; https://doi.org/10.3390/math11010016
Submission received: 21 October 2022 / Revised: 14 December 2022 / Accepted: 16 December 2022 / Published: 20 December 2022
(This article belongs to the Special Issue Current Developments in Theoretical and Applied Statistics)

Abstract:
Stute presented the so-called conditional U-statistics, generalizing the Nadaraya–Watson estimates of the regression function, and demonstrated their pointwise consistency and asymptotic normality. In this paper, we extend these results to a more abstract setting. We develop an asymptotic theory of conditional U-statistics for locally stationary random fields $\{X_{\mathbf{s}, A_n} : \mathbf{s} \in R_n\}$ observed at irregularly spaced locations in $R_n = [0, A_n]^d$, a subset of $\mathbb{R}^d$. We employ a stochastic sampling scheme that may generate irregularly spaced sampling sites in a flexible manner and that includes both the pure and the mixed increasing domain frameworks. We specifically examine the rate of strong uniform convergence and the weak convergence of conditional U-processes when the explanatory variable is functional. We examine the weak convergence when the class of functions is either bounded or unbounded and satisfies specific moment conditions. These results are established under fairly general structural conditions on the classes of functions and the underlying models. The theoretical results developed in this paper are (or will be) essential building blocks for several future breakthroughs in functional data analysis.

1. Introduction

The regression problem has been studied by statisticians and probability theorists for many years, resulting in a vast array of approaches. Various themes have been covered, such as modeling, estimation methods, applications, tests, and other related topics. In addition to the parametric framework, in which one must estimate a finite number of parameters based on an a priori specified model structure, the non-parametric framework is devoted to data that lack a priori structural information. As inherent disadvantages, non-parametric procedures are susceptible to estimation biases and to reduced convergence rates compared to parametric methods. Kernel non-parametric function estimation techniques have long been of great interest; for good references to the research literature and statistical applications in this area, see [1,2,3,4,5,6] and the references therein. Despite their popularity, kernel methods are just one of several possible approaches to building reliable function estimators; nearest neighbor, spline, neural network, and wavelet methods are further examples, and these techniques have been applied to a vast range of data types. In this article, our focus is narrowed to the development of consistent kernel-type estimators for conditional U-statistics in the context of spatial data. Spatial data are generated in numerous research fields, such as econometrics, epidemiology, environmental science, image analysis, oceanography, meteorology, geostatistics, etc. These data are typically collected at measurement sites and treated statistically accordingly. Consult [7,8,9,10], as well as the references contained in these works, for reliable sources on the research literature in this area and for some statistical applications. In the context of non-parametric estimation for spatial data, the existing papers are mostly concerned with estimating probability density and regression functions; we cite the important references [11,12,13,14,15] and the references therein. By considering conditional U-processes, we place this line of research in a more general and abstract context. With many possible applications, U-statistics (introduced in the landmark work [16]) and U-processes have attracted a great deal of interest over the past few decades. U-processes are effective for resolving intricate statistical problems: density estimation, non-parametric regression tests, and goodness-of-fit tests are among the examples. Specifically, U-processes emerge in statistics in a variety of contexts, such as the higher-order terms of von Mises expansions. In particular, U-statistics assist in the analysis of estimators and function estimators with varying degrees of smoothness. For example, Ref. [17] analyzed the product limit estimator for truncated data by employing almost sure uniform bounds for $P$-canonical U-processes. In addition, Ref. [18] introduced two novel normality tests based on U-processes. Likewise, new tests for normality that use as test statistics weighted $L_1$-distances between the standard normal density and local U-statistics based on standardized data were introduced by [19,20,21]. In addition, Ref. [22] addressed the estimation of the mean of multivariate functions under possibly heavy-tailed distributions and presented the median-of-means based on U-statistics.
The applications of U-processes in statistics also include tests for qualitative features of functions in non-parametric statistics (cf. [23,24,25]), cross-validation for density estimation [26], and establishing the limiting distributions of M-estimators (see, e.g., Refs. [27,28,29]). Historically, Ref. [27] furnished the necessary and sufficient criteria for the law of large numbers and sufficient conditions for the central limit theorem for U-processes, complementing [16,30,31], who provided (among others) the first asymptotic results for the case where the underlying random variables are independent and identically distributed. Under weak dependence assumptions, asymptotic results are given in [32,33,34], more recently in [35], and in a more general setting in [36,37,38,39,40,41]. The applicability of U-statistics in estimation and machine learning is comprehensive. We refer to [40,42,43,44,45] for U-statistics with random kernels of diverging order. Infinite-order U-statistics are helpful tools for constructing simultaneous prediction intervals that quantify the uncertainty of ensemble methods such as subbagging and random forests; for additional information on the topic, cf. [46]. The MeanNN estimator of differential entropy, first described by [47], is a remarkable instance of a U-statistic. A novel test statistic for goodness-of-fit tests was proposed by [48] using U-statistics. Using U-statistics, Ref. [49] proposed a measure to quantify the clustering quality of a partition. The interested reader may refer to [50,51,52] for outstanding sources of references on U-statistics. The book [29] provides a profound and in-depth view of the notion of U-processes.
In this work, our primary focus is on the scenario involving spatial–functional data. We quote from [53]: “Functional data analysis (FDA) is a branch of statistics concerned with the analysis of infinite-dimensional variables such as curves, sets, and images. It has undergone phenomenal growth over the past 20 years, stimulated in part by major advances in data collection technology that have brought about the ‘Big Data’ revolution. Often perceived as a somewhat arcane area of research at the turn of the century, FDA is now one of the most active and relevant fields of investigation in data science.” The reader is directed to the reference works [54,55] for an overview of this subject area. These references include fundamental approaches to functional data analysis and a wide range of case studies from diverse disciplines, such as criminology, economics, archaeology, and neurophysiology. It is important to note that the extension of probability theory to random variables taking values in normed vector spaces (for example, Banach and Hilbert spaces), including extensions of certain classical asymptotic limit theorems, predates the recent literature on functional data; the reader is referred to the book [56] for more information on this topic. Density and mode estimation for data with values in a normed vector space is the focus of the work of [57], which also discusses the curse of dimensionality affecting functional data and potential remedies. Non-parametric models were shown to be useful for regression estimation in the functional setting by [55]; we also refer to [58,59,60].
Modern empirical process theory has recently been applied to functional data. Ref. [61] provided consistency rates for several conditional models, such as the regression function, the conditional cumulative distribution, and the conditional density, uniformly over a subset of the explanatory variable. Ref. [62] extended the uniform-in-bandwidth consistency of [63] to the ergodic setting. Ref. [64] considered local linear estimation of the regression function when the regressor is functional and showed strong convergence, with specified rates, uniformly in the bandwidth parameters. Ref. [65] examined the k-nearest neighbors (kNN) estimate of the non-parametric regression model for strongly mixing functional time series data and determined the uniform almost complete convergence rate of the kNN estimator under mild conditions. Ref. [66] treated ergodic data and offered a variety of results on the limiting distribution of the conditional mode in the functional setting; for recent references, cf. [38,67,68,69,70,71,72].
Ref. [73] introduced a class of estimators for $r^{(m)}(\varphi, \mathbf{t})$, known as conditional U-statistics, generalizing the Nadaraya–Watson estimates of the regression function. We first present Stute's estimators. Consider a sequence of random elements $\{(X_i, Y_i), i \in \mathbb{N}^*\}$ with $X_i \in \mathbb{R}^d$ and $Y_i \in \mathcal{Y}$, some Polish space, where $\mathbb{N}^* = \mathbb{N} \setminus \{0\}$. Let $\varphi : \mathcal{Y}^m \to \mathbb{R}$ be a measurable function. In this study, the estimation of the conditional expectation, or regression function, is our primary concern:
$$r^{(m)}(\varphi, \mathbf{t}) = \mathbb{E}\left[\varphi(Y_1, \ldots, Y_m) \mid (X_1, \ldots, X_m) = \mathbf{t}\right], \quad \text{for } \mathbf{t} \in \mathbb{R}^{dm},$$
whenever it exists, i.e., whenever
$$\mathbb{E}\left|\varphi(Y_1, \ldots, Y_m)\right| < \infty.$$
We now introduce a kernel function $K : \mathbb{R}^d \to \mathbb{R}$ with support contained in $[-B, B]^d$, $B > 0$, satisfying:
$$\sup_{\mathbf{x} \in \mathbb{R}^d} |K(\mathbf{x})| =: \kappa < \infty \quad \text{and} \quad \int K(\mathbf{x})\, d\mathbf{x} = 1.$$
Hence, the class of estimators for $r^{(m)}(\varphi, \mathbf{t})$ given by [73] is defined, for each $\mathbf{t} \in \mathbb{R}^{dm}$, as follows:
$$\hat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n) = \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \varphi(Y_{i_1}, \ldots, Y_{i_m})\, K\!\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\!\left(\frac{t_m - X_{i_m}}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} K\!\left(\frac{t_1 - X_{i_1}}{h_n}\right) \cdots K\!\left(\frac{t_m - X_{i_m}}{h_n}\right)},$$
where
$$I_n^m = \left\{\mathbf{i} = (i_1, \ldots, i_m) : 1 \le i_j \le n \text{ and } i_j \ne i_r \text{ if } j \ne r\right\}$$
denotes the set of all $m$-tuples of distinct integers $i_j$ between 1 and $n$, and $\{h_n\}_{n \ge 1}$ is a sequence of positive constants converging to zero at a rate such that $n h_n^m \to \infty$.
For $m = 1$, $r^{(m)}(\varphi, \mathbf{t})$ becomes
$$r^{(1)}(\varphi, t) = \mathbb{E}(\varphi(Y) \mid X = t),$$
and Stute's estimate reduces to the Nadaraya–Watson estimator of $r^{(1)}(\varphi, t)$.
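To make the definition concrete, here is a minimal numerical sketch of the estimator (3) for $m = 2$ and $d = 1$; the simulated model, the Epanechnikov kernel, and the choice $\varphi(y_1, y_2) = y_1 y_2$ are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of Stute's conditional U-statistic (3), m = 2, d = 1.
# Model, kernel, and phi are illustrative assumptions.
import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
n, h = 200, 0.4
X = rng.uniform(-2, 2, size=n)
Y = np.sin(X) + 0.1 * rng.standard_normal(n)

def K(x):
    # Epanechnikov kernel: compactly supported, integrates to one
    return 0.75 * (1 - x**2) * (np.abs(x) <= 1)

def r_hat(t1, t2, phi=lambda y1, y2: y1 * y2):
    # sum over all pairs of distinct indices, as in I_n^m
    num = den = 0.0
    for i, j in permutations(range(n), 2):
        w = K((t1 - X[i]) / h) * K((t2 - X[j]) / h)
        num += phi(Y[i], Y[j]) * w
        den += w
    return num / den if den > 0 else np.nan

# For this model, r^(2)(phi, (t1, t2)) = sin(t1) * sin(t2).
print(r_hat(0.8, -0.5), np.sin(0.8) * np.sin(-0.5))
```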
Subsequently, Ref. [74] estimated the rate of uniform convergence in $\mathbf{t}$ of $\hat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n)$ to $r^{(m)}(\varphi, \mathbf{t})$. Meanwhile, Ref. [75] developed the limit distributions of $\hat{r}_n^{(m)}(\varphi, \mathbf{t}; h_n)$, discussing and contrasting the findings of Stute. Correspondingly, under appropriate mixing conditions, Ref. [76] extended the results of [73] to weakly dependent data and employed these findings to validate the Bayes risk consistency of the corresponding discrimination rules. Ref. [77] suggested symmetrized nearest neighbor conditional U-statistics as alternatives to conventional kernel estimators. Ref. [78] considered the functional conditional U-statistic and established its finite-dimensional asymptotic normality. Nevertheless, the non-parametric estimation of conditional U-statistics in the functional data framework has not received significant attention, despite the subject's relevance. Some recent developments are discussed in [79,80], in which the authors examine the challenges associated with uniform-in-bandwidth consistency in a general framework. A test of independence in the functional framework was based on Kendall statistics, which may be thought of as instances of U-statistics; see, for instance, [81]. The extension of the investigations described above to conditional empirical U-processes is theoretically attractive, practically helpful, and technically challenging.
The primary objective of this study is to examine, in a general framework, the characterization of the weak convergence of conditional U-processes based on sequences of random spatial functions. This inquiry is far from simple, as it is difficult to establish asymptotic equicontinuity under minimal conditions in this general setting, which constitutes a fundamentally unresolved open problem in the literature. We intend to fill this gap by merging the findings of [37,82,83] with techniques for handling functional data given in [84,85,86,87]. However, as demonstrated in the following sections, the challenge requires much more than “just” merging concepts from existing results. In fact, complex mathematical derivations are necessary to deal with functional data in our context. This requires the successful application of large-sample theoretical tools established for empirical processes, for which we use the results of [37,82,83].
The structure of the present article is as follows. Section 2 introduces the functional framework and the definitions required in our work, together with the assumptions used in our asymptotic analysis and a brief discussion of them. Section 3 gives the uniform rates of strong convergence. Section 4 contains the paper's main results concerning the uniform weak convergence of the conditional U-processes. In Section 5, we provide some potential applications. In Section 6, we consider conditional U-statistics in the right-censored data framework. In Section 7, we describe how to select the bandwidth through cross-validation procedures. Some concluding remarks and possible future developments are relegated to Section 8. All proofs are gathered in Section 9 to avoid interrupting the flow of the presentation. Finally, some relevant technical results are given in Appendix A.

2. The Functional Framework

2.1. Notation

For any set $A \subseteq \mathbb{R}^d$, $|A|$ represents the Lebesgue measure of $A$ and $[[A]]$ denotes the number of elements in $A$. For any positive sequences $a_n, b_n$, we write $a_n \lesssim b_n$ if a constant $C > 0$ independent of $n$ exists such that $a_n \le C b_n$ for all $n$; $a_n \asymp b_n$ if $a_n \lesssim b_n$ and $b_n \lesssim a_n$; and $a_n \ll b_n$ if $a_n / b_n \to 0$ as $n \to \infty$. We use the notation $\stackrel{d}{\to}$ to indicate convergence in distribution, and we write $X \stackrel{d}{=} Y$ if the random variables $X$ and $Y$ have the same distribution. $\mathbb{P}_S$ will denote the joint probability distribution of the sequence of independent and identically distributed (i.i.d.) random vectors $\{S_{0,j}\}_{j \ge 1}$, and $\mathbb{P}_{\cdot \mid S}$ is the conditional probability distribution given $\{S_{0,j}\}_{j \ge 1}$. Let $\mathbb{E}_{\cdot \mid S}$ represent the conditional expectation and $\mathrm{Var}_{\cdot \mid S}$ the conditional variance given $\{S_{0,j}\}_{j \ge 1}$.

2.2. Generality on the Model

In this investigation, we examine the following model:
$$\varphi\left(Y_{s_{i_1}, A_n}, \ldots, Y_{s_{i_m}, A_n}\right) = r^{(m)}\!\left(\varphi, X_{s_{i_1}, A_n}, \ldots, X_{s_{i_m}, A_n}, \frac{s_{i_1}}{A_n}, \ldots, \frac{s_{i_m}}{A_n}\right) + \sum_{j=1}^m \sigma\!\left(\frac{s_{i_j}}{A_n}, \mathbf{X}\right) \epsilon_{i_j} = r^{(m)}\!\left(\varphi, X_{s_{i_1}, A_n}, \ldots, X_{s_{i_m}, A_n}, \frac{s_{i_1}}{A_n}, \ldots, \frac{s_{i_m}}{A_n}\right) + \sum_{j=1}^m \epsilon_{s_{i_j}, A_n}, \quad s_{i_j} \in R_n, \; j = 1, \ldots, m,$$
where $\mathbb{E}[\epsilon_{s, A_n} \mid X_{s, A_n}] = 0$ and $R_n = [0, A_n]^d \subset \mathbb{R}^d$ denotes a sampling region with $A_n \to \infty$ as $n \to \infty$. Here, $Y_{s, A_n}$ and $X_{s, A_n}$ denote random elements in $\mathcal{Y}$ and $\mathcal{H}$, respectively. We consider $\{X_{s, A_n} : s \in R_n\}$ as a locally stationary random function field on $R_n \subset \mathbb{R}^d$ ($d \ge 2$). As suggested by [88], locally stationary processes are nonstationary time series whose parameters can change over time; locally in time, they can be approximated by a stationary time series, which makes it possible to use asymptotic theory to estimate the parameters of time-dependent models. Time series analyses have mostly studied locally stationary models in a parametric framework with time-varying coefficients.

2.3. Local Stationarity

A random function field $\{X_{s, A_n} : s \in R_n\}$ ($A_n \to \infty$ as $n \to \infty$) is considered locally stationary if it exhibits approximately stationary behavior locally in space. To guarantee local stationarity around each rescaled space point $\mathbf{u}$, the process $\{X_{s, A_n}\}$ is required to admit a stochastic approximation by a stationary random function field $\{X_{\mathbf{u}}(s) : s \in \mathbb{R}^d\}$; for instance, see [89]. The following is one possible way to define this idea.
Definition 1.
The $\mathcal{H}$-valued stochastic process $\{X_{s, A_n} : s \in R_n\}$ is said to be locally stationary if, for each rescaled space point $\mathbf{u} \in [0, 1]^d$, there exists an associated $\mathcal{H}$-valued process $\{X_{\mathbf{u}}(s) : s \in \mathbb{R}^d\}$ with the following properties:
(i) 
$\{X_{\mathbf{u}}(s) : s \in \mathbb{R}^d\}$ is strictly stationary.
(ii) 
It holds that
$$d\left(X_{s, A_n}, X_{\mathbf{u}}(s)\right) \le \left(\left\|\frac{s}{A_n} - \mathbf{u}\right\|_2 + \frac{1}{A_n}\right) U_{s, A_n}(\mathbf{u}) \quad a.s.,$$
where $\{U_{s, A_n}(\mathbf{u})\}$ denotes a process of positive variables satisfying $\mathbb{E}[(U_{s, A_n}(\mathbf{u}))^{\rho}] \le C$ for some $\rho > 0$ and $C < \infty$; $C$ is independent of $\mathbf{u}$, $s$, and $A_n$, and $\|\cdot\|_2$ is an arbitrary norm on $\mathbb{R}^d$.
The concept of local stationarity for real-valued time series was first presented by [88], and Definition 1 is a natural extension of that idea.
In addition, the definition we offer coincides with Definition 2.1 of [90] when $\mathcal{H}$ is the Hilbert space $L^2_{\mathbb{R}}([0, 1])$ of all real-valued functions that are square integrable with respect to the Lebesgue measure on the unit interval $[0, 1]$, with the $L^2$-norm given by
$$\|f\|_2 = \sqrt{\langle f, f \rangle}, \qquad \langle f, g \rangle = \int_0^1 f(t) g(t)\, dt,$$
where $f, g \in L^2_{\mathbb{R}}([0, 1])$. In addition, the authors of [90] provide sufficient conditions under which an $L^2_{\mathbb{R}}([0, 1])$-valued stochastic process $\{X_{t, T}\}$ satisfies (5) with $d(f, g) = \|f - g\|_2$ and $\rho = 2$.

2.4. Sampling Design

We now describe the stochastic sampling scheme used to accommodate irregularly spaced data. First, define $R_n$ as the sampling region. Let $\{A_n\}_{n \ge 1}$ be a sequence of positive numbers such that $A_n \to \infty$ as $n \to \infty$. We consider the sampling region
$$R_n = [0, A_n]^d.$$
We will now discuss the (random) sampling designs we use. Let $f_S(s_0)$ be a continuous, everywhere positive probability density function on $R_0 = [0, 1]^d$, and let $\{S_{0,j}\}_{j \ge 1}$ be a sequence of i.i.d. random vectors with density $f_S$ such that $\{S_{0,j}\}_{j \ge 1}$ and $\{X_{s, A_n} : s \in R_n\}$ are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and are independent. The sampling sites $s_1, \ldots, s_n$ are obtained from the realizations $s_{0,1}, \ldots, s_{0,n}$ of the random vectors $S_{0,1}, \ldots, S_{0,n}$ via the relation
$$s_j = A_n s_{0,j}, \quad j = 1, \ldots, n.$$
Herein, we assume that $n / A_n^d \to \infty$ as $n \to \infty$.
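For illustration, the following sketch generates sampling sites according to the design (7); the choice of $f_S$ (independent Beta(2,2) marginals) and all numerical values are assumptions made for the example only.

```python
# Sketch of the stochastic sampling design (7): s_j = A_n * S_{0,j}
# with S_{0,j} i.i.d. from a density f_S on [0,1]^d. The Beta(2,2)
# marginals are an illustrative choice of a continuous, everywhere
# positive f_S; they are not prescribed by the paper.
import numpy as np

rng = np.random.default_rng(1)
d, A_n, n = 2, 25.0, 500              # here n / A_n^d = 0.8 (increasing domain regime)

S0 = rng.beta(2.0, 2.0, size=(n, d))  # i.i.d. draws on R_0 = [0,1]^d
sites = A_n * S0                      # sampling sites s_1, ..., s_n in R_n = [0, A_n]^d
print(sites.min(axis=0), sites.max(axis=0))
```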
Remark 1.
In practice, $A_n$ can be derived by taking the diameter of the sampling region. We can extend the assumption (6) on $R_n$ to a broader range of situations, i.e.,
$$R_n = \prod_{j=1}^d [0, A_{j,n}],$$
where the $A_{j,n}$ are sequences of positive constants with $A_{j,n} \to \infty$ as $n \to \infty$. To avoid more cumbersome expressions, we assume (6). For additional discussion, please refer to [85,87,91,92,93] and ([94], Chapter 12).

2.5. Mixing Condition

The sequence $Z_1, Z_2, \ldots$ is said to be $\beta$-mixing, or absolutely regular (refer to [95,96]), if
$$\beta(k) := \sup_{l \ge 1} \mathbb{E}\left[\sup\left\{\left|\mathbb{P}(A \mid \sigma_1^l) - \mathbb{P}(A)\right| : A \in \sigma_{l+k}^{\infty}\right\}\right] \to 0 \quad \text{as } k \to \infty,$$
where $\sigma_a^b$ denotes the $\sigma$-field generated by $Z_a, \ldots, Z_b$.
Notably, Ref. [97] produced a comprehensive description of stationary Gaussian processes satisfying the last condition. We now define $\beta$-mixing coefficients for a random function field $\tilde{X}$. Let $\sigma_{\tilde{X}}(T) = \sigma(\{\tilde{X}(s) : s \in T\})$ be the $\sigma$-field generated by the variables $\{\tilde{X}(s) : s \in T\}$, $T \subset \mathbb{R}^d$. For subsets $T_1$ and $T_2$ of $\mathbb{R}^d$, let
$$\bar{\beta}(T_1, T_2) = \sup \frac{1}{2} \sum_{j=1}^{J} \sum_{k=1}^{K} \left|\mathbb{P}(A_j \cap B_k) - \mathbb{P}(A_j)\,\mathbb{P}(B_k)\right|,$$
where the supremum is taken over all pairs of (finite) partitions $\{A_1, \ldots, A_J\}$ and $\{B_1, \ldots, B_K\}$ of the sample space such that $A_j \in \sigma_{\tilde{X}}(T_1)$ and $B_k \in \sigma_{\tilde{X}}(T_2)$. Furthermore, let
$$d(T_1, T_2) = \inf\{|x - y| : x \in T_1, \; y \in T_2\},$$
where $|x| = \sum_{j=1}^d |x_j|$ for $x \in \mathbb{R}^d$, and let $\mathcal{R}(b)$ be the collection of all finite disjoint unions of cubes in $\mathbb{R}^d$ with total volume not exceeding $b$. The $\beta$-mixing coefficients for the random field $\tilde{X}$ can then be defined as
$$\beta(a; b) = \sup\{\bar{\beta}(T_1, T_2) : d(T_1, T_2) \ge a, \; T_1, T_2 \in \mathcal{R}(b)\}.$$
We assume that there exist a non-increasing function $\beta_1$ with $\lim_{a \to \infty} \beta_1(a) = 0$ and a non-decreasing function $g_1$ such that the $\beta$-mixing coefficient $\beta(a; b)$ satisfies the following inequality:
$$\beta(a; b) \le \beta_1(a)\, g_1(b), \quad a > 0, \; b > 0,$$
where $g_1$ may be unbounded for $d \ge 2$.
Remark 2
(Some remarks about mixing conditions). The sizes of the index sets $T_1$ and $T_2$ in the definition of $\beta(a; b)$ must be restricted. Let us explain this point. Suppose the $\beta$-mixing coefficients of a random field $X$ were defined in analogy with the $\beta$-mixing coefficients for time series, as follows. Let $O_1$ and $O_2$ be half-planes with boundaries $L_1$ and $L_2$, respectively. For each real number $a > 0$, define
$$\beta(a) = \sup \bar{\beta}(O_1, O_2),$$
where the supremum is taken over all pairs of parallel lines $L_1$ and $L_2$ such that $d(L_1, L_2) \ge a$. Then, ([98], Theorem 1) shows that if $\{X(s) : s \in \mathbb{R}^2\}$ is a strictly stationary mixing random field and $a > 0$ is a real number, then $\beta(a) = 1$ or $0$. This means that if a random field $X$ is $\beta$-mixing ($\lim_{a \to \infty} \beta(a) = 0$), then for some $a > \eta$, with $\eta$ a positive constant, the random field $X$ is “$m$-dependent”, i.e., $\beta(a) = 0$. However, this is highly restrictive in practice. In order to loosen these requirements and make them more flexible for practical purposes, it is necessary to restrict the sizes of $T_1$ and $T_2$ and to adopt the definition of $\beta(a; b)$ given above. We refer to [85,87,99,100,101] for additional information on mixing coefficients for random fields.
Ref. [93] imposes a mixing condition of the form (8) on the $\alpha$-mixing coefficients, and such a condition was also considered in the works of [102,103]. We have considered the $\beta$-mixing case, and it is well known that $\beta$-mixing implies $\alpha$-mixing. In the expression (8), $\beta_1$ is a function that may depend on $n$, since the random field $X_{s, A_n}$ depends on $n$, whereas $g_1$ does not; this is merely for the sake of simplicity, and the general case where $g_1$ changes with $n$ poses no additional difficulty. We note that the random field $Y_{s, A_n}$ (or $\varphi(Y_{s, A_n})$) does not necessarily satisfy the mixing condition (8), since the mixing condition is assumed for $X_{s, A_n}$; with the regression form represented by the model (4), $Y_{s, A_n}$ (or $\varphi(Y_{s, A_n})$) may have a flexible dependence structure.

2.6. Generality on the Model

Let $\{X_{s, A_n}, Y_{s, A_n} : s \in R_n\}$ be random variables where $Y_{s, A_n}$ takes values in $\mathcal{Y}$ and $X_{s, A_n}$ takes values in some semi-metric space $\mathcal{H}$ with a semi-metric $d(\cdot, \cdot)$ (a semi-metric, sometimes called a pseudo-metric, is a metric that allows $d(x_1, x_2) = 0$ for some $x_1 \ne x_2$) defining a topology to measure the proximity between two elements of $\mathcal{H}$; the semi-metric is dissociated from the definition of $X$ in order to prevent measurability concerns. This study aims to establish the weak convergence of the conditional U-process based on the following U-statistic:
$$\hat{r}_n^{(m)}(\mathbf{x}, \mathbf{u}; h_n) := \hat{r}_n^{(m)}(\varphi, \mathbf{x}, \mathbf{u}; h_n) = \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \varphi\left(Y_{s_{i_1}, A_n}, \ldots, Y_{s_{i_m}, A_n}\right) \prod_{j=1}^m \bar{K}\!\left(\frac{\mathbf{u}_j - s_{i_j}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \bar{K}\!\left(\frac{\mathbf{u}_j - s_{i_j}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)} = \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \varphi\left(Y_{s_{i_1}, A_n}, \ldots, Y_{s_{i_m}, A_n}\right) \prod_{j=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)},$$
where
$$I_n^m := \left\{\mathbf{i} = (i_1, \ldots, i_m) : 1 \le i_j \le n \text{ and } i_r \ne i_j \text{ if } r \ne j\right\}, \qquad \bar{K}(\mathbf{u}) = \prod_{\ell=1}^d K_1(u_{\ell}),$$
and $\varphi : \mathcal{Y}^m \to \mathbb{R}$ is a symmetric, measurable function belonging to some class of functions $\mathcal{F}_m$, and $\{h_n\}_{n \in \mathbb{N}^*}$ is a sequence of positive real numbers satisfying $h_n \to 0$ as $n \to \infty$. In order to examine the weak convergence of the conditional empirical process and the conditional U-process under functional data, we must introduce new notation. Let
$$\mathcal{F}_m = \{\varphi : \mathcal{Y}^m \to \mathbb{R}\}$$
be a point-wise measurable class of real-valued symmetric measurable functions on $\mathcal{Y}^m$ with a measurable envelope function:
$$F(\mathbf{y}) \ge \sup_{\varphi \in \mathcal{F}_m} |\varphi(\mathbf{y})|, \quad \text{for } \mathbf{y} \in \mathcal{Y}^m.$$
For a kernel function $K(\cdot)$, we define the point-wise measurable class of functions, for $1 \le m \le n$,
$$\mathcal{K}_m := \left\{(x_1, \ldots, x_m) \mapsto \prod_{i=1}^m K\!\left(\frac{d(x_i, \cdot)}{h_n}\right), \; 0 < h_n < 1 \text{ and } (x_1, \ldots, x_m) \in \mathcal{H}^m\right\}.$$
We use the notation
$$\psi(\cdot, \cdot) \in \mathcal{F}_m \mathcal{K}_m := \left\{\varphi_1(\cdot)\varphi_2(\cdot) : \varphi_1 \in \mathcal{F}_m, \; \varphi_2 \in \mathcal{K}_m\right\},$$
and
$$\psi(\cdot, \cdot) \in \mathcal{F}_1 \mathcal{K}_1 := \mathcal{F}\mathcal{K} = \left\{\varphi_1(\cdot)\varphi_2(\cdot) : \varphi_1 \in \mathcal{F}_1, \; \varphi_2 \in \mathcal{K}_1\right\}.$$

2.6.1. Small Ball Probability

In the absence of a universal reference measure, such as the Lebesgue measure, the density function of a functional variable does not exist, which is one of the technical challenges in infinite-dimensional spaces. To circumvent this obstacle, we employ the concept of “small-ball probability”. The function $\phi_x(\cdot)$ controls the concentration of the probability measure of the functional variable on a small ball and is defined, for a fixed $x \in \mathcal{H}$ and all $r > 0$, by:
$$\mathbb{P}\left(X \in B(x, r)\right) =: \phi_x(r) > 0,$$
where the space $\mathcal{H}$ is equipped with the semi-metric $d(\cdot, \cdot)$, and
$$B(x, r) = \{z \in \mathcal{H} : d(z, x) \le r\}$$
is a ball in $\mathcal{H}$ with center $x$ and radius $r$.
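As an illustration of how quickly $\phi_x(r)$ can decay in an infinite-dimensional setting, the following Monte Carlo sketch estimates the small-ball probability of a standard Brownian motion on $[0, 1]$ around $x = 0$, with the sup-norm playing the role of the semi-metric; all modelling choices are illustrative.

```python
# Monte Carlo estimate of the small-ball probability phi_x(r) for a
# standard Brownian motion on [0,1], x = 0, sup-norm semi-metric.
# All modelling choices are illustrative.
import numpy as np

rng = np.random.default_rng(2)
m, reps = 200, 20000                     # grid size and replications
dW = rng.standard_normal((reps, m)) * np.sqrt(1.0 / m)
W = np.cumsum(dW, axis=1)                # Brownian paths on a grid

def phi_hat(r):
    # empirical P(d(X, 0) <= r) with d the sup-norm
    return np.mean(np.max(np.abs(W), axis=1) <= r)

for r in (0.2, 0.4, 0.8):
    print(r, phi_hat(r))                 # decays rapidly as r -> 0
```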

2.6.2. VC-Type Classes of Functions

The asymptotic analysis of functional data is related to concentration properties expressed in terms of the small-ball probability concept. When considering a process indexed by a class of functions, one must also account for other topological concepts, including metric entropy and VC-subgraph classes (“VC” for Vapnik and Červonenkis).
Definition 2.
Let $S_{\mathcal{E}}$ denote a subset of a semi-metric space $\mathcal{E}$, and let $N_{\varepsilon}$ be a positive integer. A finite set of points $\{e_1, \ldots, e_{N_{\varepsilon}}\} \subset \mathcal{E}$ is called, for a given $\varepsilon > 0$, an $\varepsilon$-net of $S_{\mathcal{E}}$ if:
$$S_{\mathcal{E}} \subset \bigcup_{j=1}^{N_{\varepsilon}} B(e_j, \varepsilon).$$
If $N_{\varepsilon}(S_{\mathcal{E}})$ denotes the cardinality of the smallest $\varepsilon$-net (the minimal number of open balls of radius $\varepsilon$) needed to cover $S_{\mathcal{E}}$, then Kolmogorov's entropy (metric entropy) of the set $S_{\mathcal{E}}$ is the quantity
$$\psi_{S_{\mathcal{E}}}(\varepsilon) := \log N_{\varepsilon}(S_{\mathcal{E}}).$$
As its name suggests, Kolmogorov invented this idea of metric entropy (cf. Ref. [104]), which was then explored for different metric spaces. This concept was utilized by [105] to provide sufficient conditions for the continuity of Gaussian processes, and it served as the foundation for remarkable extensions of Donsker's theorem on the weak convergence of empirical processes. If $B_{\mathcal{H}}$ and $S_{\mathcal{H}}$ represent two subsets of the space $\mathcal{H}$ with Kolmogorov entropies (for the radius $\varepsilon$) $\psi_{B_{\mathcal{H}}}(\varepsilon)$ and $\psi_{S_{\mathcal{H}}}(\varepsilon)$, respectively, then the Kolmogorov entropy of the subset $B_{\mathcal{H}} \times S_{\mathcal{H}}$ of the semi-metric space $\mathcal{H}^2$ is given by:
$$\psi_{B_{\mathcal{H}} \times S_{\mathcal{H}}}(\varepsilon) = \psi_{B_{\mathcal{H}}}(\varepsilon) + \psi_{S_{\mathcal{H}}}(\varepsilon).$$
Hence, $m\,\psi_{S_{\mathcal{H}}}(\varepsilon)$ is the Kolmogorov entropy of the subset $S_{\mathcal{H}}^m$ of the semi-metric space $\mathcal{H}^m$. We denote by $d$ the semi-metric on $\mathcal{H}$; the semi-metric $d_{\mathcal{H}^m}$ is then defined on $\mathcal{H}^m$ by:
$$d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{y}) := \frac{1}{m} d(x_1, y_1) + \cdots + \frac{1}{m} d(x_m, y_m)$$
for
$$\mathbf{x} = (x_1, \ldots, x_m), \; \mathbf{y} = (y_1, \ldots, y_m) \in \mathcal{H}^m.$$
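A simple way to get a numerical handle on Kolmogorov's entropy is to build an $\varepsilon$-net greedily; the following sketch does this for a finite point cloud in the plane, an illustrative stand-in for $S_{\mathcal{E}}$ (the greedy construction only upper bounds the minimal $N_{\varepsilon}$).

```python
# Greedy construction of an eps-net for a finite point cloud; its size
# upper bounds N_eps, hence psi(eps) = log N_eps of Definition 2.
# The point cloud and the Euclidean semi-metric are illustrative.
import numpy as np

rng = np.random.default_rng(3)
S = rng.uniform(0, 1, size=(1000, 2))   # stand-in for the set S_E

def greedy_net(points, eps):
    centers = []
    uncovered = np.ones(len(points), dtype=bool)
    while uncovered.any():
        c = points[np.argmax(uncovered)]           # first uncovered point
        centers.append(c)
        dist = np.linalg.norm(points - c, axis=1)
        uncovered &= dist > eps                    # drop newly covered points
    return np.array(centers)

for eps in (0.2, 0.1, 0.05):
    N = len(greedy_net(S, eps))
    print(eps, N, np.log(N))    # entropy of the unit square grows like 2 log(1/eps)
```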
In this type of study, the semi-metric plays a crucial role. The reader can discover helpful discussions on how to select the semi-metric in [55] (see Chapter 3 and Chapter 13). We must additionally consider another topological term: namely, VC-subgraph classes (“VC” for Vapnik and Červonenkis).
Definition 3.
We call a class $\mathscr{C}$ of subsets of a set $C$ a VC-class if there exists a polynomial $P(\cdot)$ such that, for every set of $N$ points in $C$, the class $\mathscr{C}$ picks out at most $P(N)$ distinct subsets.
Definition 4.
A class of functions $\mathcal{F}$ is called a VC-subgraph class if the graphs of the functions in $\mathcal{F}$ form a VC-class of sets; that is, if we define the subgraph of a real-valued function $f$ on $S$ as the subset $G_f$ of $S \times \mathbb{R}$ given by
$$G_f = \{(s, t) : 0 \le t \le f(s) \text{ or } f(s) \le t \le 0\},$$
then the class $\{G_f : f \in \mathcal{F}\}$ is a VC-class of sets on $S \times \mathbb{R}$. Informally, VC-classes of functions are identified by their polynomial covering number (the minimal number of functions required to cover the entire class).
A VC-class of functions $\mathcal{F}$ with envelope function $F$ has the following entropy property: for a given $1 \le q < \infty$, there are constants $a$ and $b$ such that
$$N(\epsilon, \mathcal{F}, \|\cdot\|_{L_q(Q)}) \le a \left(\frac{(Q F^q)^{1/q}}{\epsilon}\right)^b$$
for any $\epsilon > 0$ and each probability measure $Q$ such that $Q F^q < \infty$. For instance, the references ([26], Lemma 22), ([106], §4.7), ([107], Theorem 2.6.7), and ([108], §9.1) provide a number of sufficient conditions under which (13) holds; refer to ([109], §3.2) for further discussion.

2.7. Conditions and Comments

Assumption 1.
(Model and distribution assumptions)
(M1) 
The $\mathcal{H}$-valued stochastic process $\{X_{s, A_n} : s \in R_n\}$ is locally stationary. Hence, for each space point $\mathbf{u} \in [0, 1]^d$, a strictly stationary process $\{X_{\mathbf{u}}(s) : s \in \mathbb{R}^d\}$ exists such that, for $\|\cdot\|$ an arbitrary norm on $\mathbb{R}^d$,
$$d\left(X_{s, A_n}, X_{\mathbf{u}}(s)\right) \le \left(\left\|\frac{s}{A_n} - \mathbf{u}\right\|_2 + \frac{1}{A_n}\right) U_{s, A_n}(\mathbf{u}) \quad a.s.,$$
with $\mathbb{E}[(U_{s, A_n}(\mathbf{u}))^{\rho}] \le C$ for some $\rho \ge 1$ and $C < \infty$ that is independent of $\mathbf{u}$, $s$, and $A_n$.
(M2) 
For $i = 1, \ldots, m$, let $B(x_i, h) = \{y \in \mathcal{H} : d(x_i, y) \le h\}$ be a ball centered at $x_i \in \mathcal{H}$ with radius $h$, and let $c_d < C_d$ be positive constants. For all $\mathbf{u} \in [0, 1]^d$,
$$\phi_{\mathbf{x}}(h_n) := \mathbb{P}\left(X_{\mathbf{u}}(s_1) \in B(x_1, h_n), \ldots, X_{\mathbf{u}}(s_m) \in B(x_m, h_n)\right) = F_{\mathbf{u}}(h_n, x_1, \ldots, x_m)$$
satisfies:
$$0 < c_d\, \phi(h) f_1(\mathbf{x}) \le \phi_{\mathbf{x}}(h) \le C_d\, \phi(h) f_1(\mathbf{x}),$$
where $\phi(h) \to 0$ as $h \to 0$, and $f_1(\mathbf{x})$ is a non-negative functional in $\mathbf{x} \in \mathcal{H}^m$. Moreover, there exist constants $C_{\phi} > 0$ and $\varepsilon_0 > 0$ such that for any $0 < \varepsilon < \varepsilon_0$,
$$\int_0^{\varepsilon} \phi(u)\, du > C_{\phi}\, \varepsilon\, \phi(\varepsilon).$$
(M3) 
Let $\mathbf{X}_{\mathbf{s}, A_n} = (X_{s_1, A_n}, \ldots, X_{s_m, A_n})$ and $\mathbf{X}_{\mathbf{v}, A_n} = (X_{v_1, A_n}, \ldots, X_{v_m, A_n})$, and let $B(\mathbf{x}, h) = \prod_{i=1}^m B(x_i, h)$. Assume
$$\sup_{\mathbf{s}, \mathbf{x}, A_n} \sup_{\mathbf{s} \ne \mathbf{v}} \mathbb{P}\left((\mathbf{X}_{\mathbf{s}, A_n}, \mathbf{X}_{\mathbf{v}, A_n}) \in B(\mathbf{x}, h) \times B(\mathbf{x}, h)\right) \le \psi(h) f_2(\mathbf{x}),$$
where $\psi(h) \to 0$ as $h \to 0$, and $f_2(\mathbf{x})$ is a non-negative functional in $\mathbf{x} \in \mathcal{H}^m$. We assume that the ratio $\psi(h)/\phi^2(h)$ is bounded.
(M4) 
$\sigma : [0, 1]^d \times \mathcal{H}^m \to \mathbb{R}$ is bounded from above by some constant $C_{\sigma} < \infty$ and from below by some constant $c_{\sigma} > 0$; that is, $0 < c_{\sigma} \le \sigma(\mathbf{u}, \mathbf{x}) \le C_{\sigma} < \infty$ for all $\mathbf{u}$ and $\mathbf{x}$.
(M5) 
$\sigma(\cdot, \cdot)$ is Lipschitz continuous with respect to $\mathbf{u}$.
(M6) 
$\sup_{\mathbf{u} \in [0, 1]^m} \sup_{z : d(x, z) \le h} |\sigma(\mathbf{u}, x) - \sigma(\mathbf{u}, z)| = o(1)$ as $h \to 0$.
(M7) 
$r^{(m)}(\mathbf{u}, \mathbf{x})$ is Lipschitz; that is, for all $\mathbf{u}_1, \mathbf{u}_2 \in [0, 1]^{dm}$ it satisfies
$$\left|r^{(m)}(\mathbf{u}_1, \mathbf{x}) - r^{(m)}(\mathbf{u}_2, \mathbf{z})\right| \le c_m \left(d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z})^{\alpha} + \|\mathbf{u}_1 - \mathbf{u}_2\|^{\alpha}\right)$$
for some $c_m > 0$ and $\alpha > 0$, where the semi-metric $d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z})$ is defined on $\mathcal{H}^m$ by:
$$d_{\mathcal{H}^m}(\mathbf{x}, \mathbf{z}) := \frac{1}{m} d(x_1, z_1) + \cdots + \frac{1}{m} d(x_m, z_m)$$
for $\mathbf{x} = (x_1, \ldots, x_m), \; \mathbf{z} = (z_1, \ldots, z_m) \in \mathcal{H}^m$; moreover, $r^{(m)}$ is twice continuously partially differentiable with first derivatives
$$\partial_{u_i} r^{(m)}(\mathbf{u}, \mathbf{x}) = \frac{\partial}{\partial u_i} r^{(m)}(\mathbf{u}, \mathbf{x})$$
and second derivatives
$$\partial^2_{u_i u_j} r^{(m)}(\mathbf{u}, \mathbf{x}) = \frac{\partial^2}{\partial u_i \partial u_j} r^{(m)}(\mathbf{u}, \mathbf{x}).$$
Assumption 2.
(Kernel assumptions)
(KB1) 
The kernel $K_2(\cdot)$ is non-negative, bounded by $\tilde{\kappa}$, and has support in $[0, 1]$, with $K_2(0) > 0$ and $K_2(1) = 0$. Moreover, $K_2'(v) = dK_2(v)/dv$ exists on $[0, 1]$ and satisfies $C_1 \le K_2'(v) \le C_2$ for two real constants $-\infty < C_1 < C_2 < 0$.
(KB2) 
The kernel $\bar{K} : \mathbb{R}^d \to [0, \infty)$ is bounded and has compact support $[-C, C]^d$. Moreover,
$$\int_{[-C, C]^d} \bar{K}(\mathbf{x})\, d\mathbf{x} = 1, \qquad \int_{[-C, C]^d} x^{\alpha} \bar{K}(\mathbf{x})\, d\mathbf{x} = 0 \quad \text{for any } \alpha \in \mathbb{Z}_+^d \text{ with } |\alpha| = 1,$$
and $|\bar{K}(\mathbf{u}) - \bar{K}(\mathbf{v})| \le C \|\mathbf{u} - \mathbf{v}\|$.
(KB3) 
The bandwidth $h$ converges to zero at least at a polynomial rate; that is, there exists a small $\xi_1 > 0$ such that $h \le C n^{-\xi_1}$ for some constant $0 < C < \infty$.
Assumption 3.
(Sampling design assumptions)
(S1) 
For any $\alpha \in \mathbb{N}^d$ with $|\alpha| = 1, 2$, $\partial^{\alpha} f_S(s)$ exists and is continuous on $(0, 1)^d$.
(S2) 
$C_0 \le n/A_n^d \le C_1 n^{\eta_1}$ for some $C_0, C_1 > 0$ and small $\eta_1 \in (0, 1)$.
Assumption 4.
(Block decomposition assumptions)
(B1) 
Let $\{A_{1,n}\}_{n \ge 1}$ and $\{A_{2,n}\}_{n \ge 1}$ be two sequences of positive numbers such that $A_{1,n} \to \infty$, $A_{2,n} \to \infty$, $A_{2,n} = o(A_{1,n})$, and $A_{1,n} = o(A_n)$, with $\frac{A_{1,n}}{A_n} + \frac{A_{2,n}}{A_{1,n}} \le C_0^{-1} n^{-\eta_0}$ for some $C_0 > 0$ and $\eta_0 > 0$.
(B2) 
We have $\lim_{n \to \infty} n/A_n^d = \kappa \in (0, \infty]$ with $A_n \lesssim n^{\bar{\kappa}}$ for some $\bar{\kappa} > 0$.
(B3) 
We have
$$\left(\frac{1}{n h^{md} \phi(h)}\right)^{1/3} \left(\frac{A_{1,n}}{A_n}\right)^{2d/3} \left(\frac{A_{2,n}}{A_{1,n}}\right)^{2/3} g_1^{1/3}\left(A_{1,n}^d\right) \sum_{k=1}^{\lceil A_n / A_{1,n} \rceil} k^{d-1} \beta_1^{1/3}\left(k A_{1,n} + A_{2,n}\right) \to 0.$$
(B4) 
We have $\lim_{n \to \infty} \frac{A_n^d}{A_{1,n}^d}\, \beta\!\left(A_{2,n}; A_n^d\right) = 0$.
Assumption 5.
(Regularity conditions) Let $\alpha_n = \sqrt{\log n / (n h^{md} \phi(h))}$. As $n \to \infty$,
(R1) 
$\left(h^{md} \phi(h)\right)^{-1} \alpha_n^{md}\, \frac{A_n^d}{A_{1,n}^d}\, \beta(A_{2,n}; A_n^d) \to 0$ and $\frac{A_{1,n}^d}{A_n^d} \sqrt{\frac{n h^{md} \phi(h)}{\log n}} \to 0$,
(R2) 
$n^{1/2} h^{md/2} \phi(h)^{1/2} / (A_{1,n}^d\, n^{1/\zeta}) \ge C_0\, n^{\eta}$ for some $0 < C_0 < \infty$, $\eta > 0$, and $\zeta > 2$,
(R3) 
$A_n^{-dp} \ll \phi(h)$, where $p$ is defined in the sequel.
Assumption 6.
(E1) 
For $W_{\mathbf{s_i}, A_n} = \sum_{j=1}^m \epsilon_{s_{i_j}, A_n}$, it holds that $\sup_{\mathbf{x} \in \mathcal{H}^m} \mathbb{E}|W_{\mathbf{s}, A_n}|^{\zeta} \le C$ and
$$\sup_{\mathbf{x} \in \mathcal{H}^m} \mathbb{E}\left[|W_{\mathbf{s}, A_n}|^{\zeta} \mid \mathbf{X}_{\mathbf{i}, n} = \mathbf{x}\right] \le C$$
for $\zeta > 2$ and $C < \infty$.
(E2) 
The $\beta$-mixing coefficients of the array $\{X_{s, A_n}, W_{s, A_n}\}$ satisfy $\beta(a; b) \le \beta_1(a)\, g_1(b)$ with $\beta_1(a) \to 0$ as $a \to \infty$.
Assumption 7.
(Class of functions assumptions)
The classes of functions $\mathcal{K}_m$ and $\mathcal{F}_m$ are such that:
(C1) 
The class of functions $\mathcal{F}_m$ is bounded, and its envelope function satisfies, for some $0 < M < \infty$:
$$F(\mathbf{y}) \le M, \quad \mathbf{y} \in \mathcal{Y}^m.$$
(C2) 
The class of functions $\mathcal{F}_m$ is unbounded, and its envelope function satisfies, for some $p > 2$:
$$\theta_p := \sup_{\mathbf{t} \in S_{\mathcal{H}^m}} \mathbb{E}\left[F^p(\mathbf{Y}) \mid \mathbf{X} = \mathbf{t}\right] < \infty.$$
(C3) 
The metric entropy of the class $\mathcal{F}_m \mathcal{K}_m$ satisfies, for some $2 < p < \infty$:
$$\int_0^{\infty} \left(\log N(u, \mathcal{F}_m \mathcal{K}_m, L_1(\mathbb{P}^m))\right)^{\frac{1}{2}} du < \infty, \quad \int_0^{\infty} \left(\log N(u, \mathcal{F}_m \mathcal{K}_m, L_2(\mathbb{P}^m))\right)^{\frac{1}{2}} du < \infty, \quad \int_0^{\infty} \left(\log N(u, \mathcal{F}_m \mathcal{K}_m, L_p(\mathbb{P}^m))\right)^{\frac{1}{2}} du < \infty.$$

Comments

When it comes to functional data, traditional statistical methods are entirely ineffective. In our non-parametric functional regression model, we took on the complex theoretical challenge of establishing functional central limit theorems for the conditional U-process under functional absolutely regular data in a two-fold framework. The imposed assumptions reflect properties of the infinite-dimensional setting: the topological structure on $\mathcal{H}^m$, the probability distribution of $\mathbf{X}$, and the measurability concept for the classes $\mathcal{F}_m$ and $\mathcal{K}_m$; consequently, a discussion of the aforementioned assumptions is in order. The majority of these assumptions were motivated by [37,55,57,84,85,86,110]. Assumption 1 begins the formalization of the local stationarity of $X_{s, A_n}$ and continues by placing certain restrictions on the distributional behavior of the variables, which allows us to formalize the property more precisely. Condition (M1) refers to the idea of a locally stationary time series, and various random fields fulfill this requirement; Ref. [85] gave some examples and, in particular, proved that this condition is satisfied for locally stationary versions of Lévy-driven moving average random fields. Condition (M2) was adopted by [84], who in turn was inspired by [57] on non-parametric density estimation for functional observations. Ref. [84] clarifies that if $\mathcal{H}^m = \mathbb{R}^m$, then the condition overlaps with the fundamental axioms of probability calculus; furthermore, if $\mathcal{H}^m$ is an infinite-dimensional Hilbert space, then $\phi(h_n)$ can decay toward 0 at an exponential speed as $n \to \infty$. Equation (15) controls the behavior of the small ball probability around zero and is a quite usual condition on the small ball probability. It shows, roughly, that the small ball probability can be written approximately as the product of two independent functions $\phi^m(\cdot)$ and $f_1(\cdot)$; for instance, for $m = 1$, refer to [111] for diffusion processes, Ref. [112] for a Gaussian measure, and Ref. [113] for a general Gaussian process; Ref. [84] employed these assumptions for strongly mixing processes. For example, the function $\phi(\cdot)$ can be expressed as $\phi(\epsilon) = \epsilon^{\delta} \exp(-C/\epsilon^a)$ with $\delta \ge 0$ and $a \ge 0$; this corresponds to the Ornstein–Uhlenbeck and general diffusion processes (for such processes, $a = 2$ and $\delta = 0$) and to fractal processes (for such processes, $\delta > 0$ and $a = 0$). We refer to the paper [114] for other examples. Conditions (M4), (M5), (M6) and Assumption 2 represent the regularity conditions, and they are the umbrella that covers the limiting theorems of such a process. Due to the sampling design strategy employed in Section 2.4, a non-uniform density is possible across the sampling region, whereby the number of sampling sites can grow at a different rate than the region's volume $O(A_n^d)$. This sampling design allows the pure increasing domain case $\lim_{n \to \infty} n/A_n^d = \kappa \in (0, \infty)$ and the mixed increasing domain case $\lim_{n \to \infty} n/A_n^d = \infty$. Assumption 3 is imposed to address this sampling design and the infill sampling criteria in the stochastic design case, which can also be seen in [94,115].
In addition to allowing a non-uniform sampling density, an approach for irregularly spaced sampling sites based on a homogeneous Poisson point process was discussed in ([10], Chapter 8), where the sampling sites must be uniformly distributed over the sampling region. This makes the sampling design used in this work more flexible than the homogeneous Poisson point process and more useful for practical applications. Condition (B1) in Assumption 4 is related to the blocking technique used to decompose the sampling region $R_n$ into big and small blocks. The sequences $A_{1,n}$ and $A_{2,n}$ correspond to the large-block–small-block argument commonly used in proving CLTs for sums of mixing random variables; see [94]. Precisely, $A_{1,n}$ corresponds to the side length of the large blocks, while $A_{2,n}$ corresponds to the side length of the small blocks. Furthermore, Assumption 6 helps in deriving the weak convergence of the conditional U-statistic $\hat{\psi}$ defined in Section 3. Condition (C1) says that we are dealing with bounded functions; since we are also interested in establishing the functional central limit theorem for conditional U-processes indexed by an unbounded class of functions, this condition will then be replaced by (C2). Each of these generic assumptions is sufficiently weak in connection with the many objects described in our preliminary results. They reflect the four key axes of this work: the topological structure of the functional variables, the probability measure on the functional space, the notion of measurability for the class of functions, and the uniformity governed by the entropy characteristics.
Remark 3.
Note that Assumption (C2) in Assumption 7 might be substituted by more general hypotheses on the moments of $\mathbf{Y}$, as in [109]. That is:
(C4)
We denote by $\{M(x) : x \ge 0\}$ a non-negative continuous function, increasing on $[0, \infty)$, and such that, for some $s > 2$, ultimately as $x \to \infty$,
$$x^{-s} M(x) \searrow; \qquad x^{-1} M(x) \nearrow.$$
For each $t \ge M(0)$, we define $M^{\mathrm{inv}}(t) \ge 0$ by $M(M^{\mathrm{inv}}(t)) = t$. We assume further that:
$$\mathbb{E}\left[M(F(\mathbf{Y}))\right] < \infty.$$
The following choices of $M(\cdot)$ are of particular interest:
(i) 
$M(x) = x^p$ for some $p > 2$;
(ii) 
$M(x) = \exp(sx)$ for some $s > 0$.

3. Uniform Convergence Rates for Kernel Estimators

Before describing the asymptotic behavior of our estimator (9), we generalize the study to a U-statistic estimator defined by:
$$\hat{\psi}(\mathbf{u}, \mathbf{x}) = \frac{(n - m)!}{n!\, h^{md} \phi^m(h)} \sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \bar{K}\!\left(\frac{\mathbf{u}_j - s_{i_j}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right) W_{\mathbf{s_i}, A_n},$$
where $W_{\mathbf{s_i}, A_n}$ is an array of one-dimensional random variables. In this study, we use the results with $W_{\mathbf{s_i}, A_n} = 1$ and $W_{\mathbf{s_i}, A_n} = \sum_{j=1}^m \epsilon_{s_{i_j}, A_n}$.

3.1. Hoeffding’s Decomposition

Note that $\hat{\psi}(\mathbf{u}, \mathbf{x})$ is a standard U-statistic with a kernel depending on $n$. We define
$$\xi_j := \frac{1}{h^d} \bar{K}\!\left(\frac{\mathbf{u}_j - s_{i_j}/A_n}{h_n}\right), \qquad H(Z_1, \ldots, Z_m) := \prod_{j=1}^m \frac{1}{\phi(h)} K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right) W_{\mathbf{s_i}, A_n};$$
thus, the U-statistic in (19) can be viewed as a weighted U-statistic of degree $m$:
$$\hat{\psi}(\mathbf{u}, \mathbf{x}) = \frac{(n - m)!}{n!} \sum_{\mathbf{i} \in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H(Z_{i_1}, \ldots, Z_{i_m}).$$
We can write Hoeffding's decomposition in this case. Since we do not assume symmetry for $W_{\mathbf{s_i}, A_n}$ or $H$, we must define:
• The expectation of $H(Z_{i_1}, \ldots, Z_{i_m})$:
$$\theta(\mathbf{i}) := \mathbb{E}\left[H(Z_{i_1}, \ldots, Z_{i_m})\right] = \int W_{\mathbf{s_i}, A_n} \prod_{j=1}^m \frac{1}{\phi(h)} K_2\!\left(\frac{d(x_j, \nu_{s_{i_j}, A_n})}{h_n}\right) d\mathbb{P}_{\mathbf{i}}(\mathbf{z_i}).$$
• For each $\ell \in \{1, \ldots, m\}$, the position of the argument, construct the function $\pi_{\ell}$ such that:
$$\pi_{\ell}(z; z_1, \ldots, z_{m-1}) := (z_1, \ldots, z_{\ell - 1}, z, z_{\ell}, \ldots, z_{m-1}).$$
• Define:
$$H^{(\ell)}\left(z; z_1, \ldots, z_{m-1}\right) := H\left(\pi_{\ell}\left(z; z_1, \ldots, z_{m-1}\right)\right), \qquad \theta^{(\ell)}\left(i; i_1, i_2, \ldots, i_{m-1}\right) := \theta\left(\pi_{\ell}\left(i; i_1, i_2, \ldots, i_{m-1}\right)\right).$$
Hence, the first-order expansion of $H(\cdot)$ is given by:
$$\tilde{H}^{(\ell)}(z) := \mathbb{E}\left[H^{(\ell)}\left(z, Z_1, \ldots, Z_{m-1}\right)\right] = \frac{1}{\phi(h)} K_2\!\left(\frac{d(x_{\ell}, X_{s_{\ell}, A_n})}{h}\right) W_{s_{\ell}, A_n} \int W_{\mathbf{s}_{(1, \ldots, \ell-1, \ell, \ldots, m-1)}, A_n} \prod_{\substack{j=1 \\ j \ne \ell}}^{m-1} \frac{1}{\phi(h)} K_2\!\left(\frac{d(x_j, \nu_{s_j, A_n})}{h}\right) \mathbb{P}\left(d\nu_1, \ldots, d\nu_{\ell - 1}, d\nu_{\ell}, \ldots, d\nu_{m-1}\right),$$
with $\mathbb{P}$ the underlying probability measure, and define
$$f_{i, i_1, \ldots, i_{m-1}} := \sum_{\ell = 1}^m \xi_{i_1} \cdots \xi_{i_{\ell - 1}}\, \xi_i\, \xi_{i_{\ell}} \cdots \xi_{i_{m-1}} \left(\tilde{H}^{(\ell)}(z) - \theta^{(\ell)}\left(i; i_1, \ldots, i_{m-1}\right)\right).$$
Then, the first-order projection can be defined as:
$$\hat{H}_{1,i}(\mathbf{u}, \mathbf{x}) := \frac{(n - m)!}{(n - 1)!} \sum_{I_{n-1}^{m-1}(i)} f_{i, i_1, \ldots, i_{m-1}},$$
where
$$I_{n-1}^{m-1}(i) := \left\{1 \le i_1 < \cdots < i_{m-1} \le n \text{ and } i_j \ne i \text{ for all } j \in \{1, \ldots, m-1\}\right\}.$$
For the remainder terms, we write $\mathbf{i}_{-\ell} := (i_1, \ldots, i_{\ell - 1}, i_{\ell + 1}, \ldots, i_m)$ and, for $\ell \in \{1, \ldots, m\}$, let
$$H_{2,\mathbf{i}}(\mathbf{z}) := H(\mathbf{z}) - \sum_{\ell = 1}^m \tilde{H}^{(\ell)}_{\mathbf{i}_{-\ell}}(z_{\ell}) + (m - 1)\, \theta(\mathbf{i}),$$
where
$$\tilde{H}^{(\ell)}_{\mathbf{i}_{-\ell}}(z) = \mathbb{E}\left[H\left(Z_1, \ldots, Z_{\ell - 1}, z, Z_{\ell + 1}, \ldots, Z_{m}\right)\right]$$
as defined in (22); this projection leads us to the following remainder term:
$$\hat{\psi}_2(\mathbf{u}, \mathbf{x}) := \frac{(n - m)!}{n!} \sum_{\mathbf{i} \in I_n^m} \xi_{i_1} \cdots \xi_{i_m} H_{2, \mathbf{i}}(\mathbf{z}).$$
Finally, using Equations (24) and (26), and under the conditions
$$\mathbb{E}\left[\hat{H}_{1,i}(\mathbf{u}, \mathbf{X})\right] = 0, \qquad \mathbb{E}\left[H_{2,\mathbf{i}}(\mathbf{Z}) \mid Z_k\right] = 0 \quad \text{a.s.},$$
we obtain the Hoeffding [16] decomposition:
$$\hat{\psi}(\mathbf{u}, \mathbf{x}) - \mathbb{E}\left[\hat{\psi}(\mathbf{u}, \mathbf{x})\right] = \frac{1}{n} \sum_{i=1}^n \hat{H}_{1,i}(\mathbf{u}, \mathbf{x}) + \hat{\psi}_2(\mathbf{u}, \mathbf{x}) =: \hat{\psi}_1(\mathbf{u}, \mathbf{x}) + \hat{\psi}_2(\mathbf{u}, \mathbf{x}).$$
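The decomposition can be checked numerically in the simplest setting. The sketch below verifies, for a plain (unweighted) i.i.d. degree-2 U-statistic with kernel $H(z_1, z_2) = z_1 z_2$, that $\hat{\psi} - \mathbb{E}\hat{\psi}$ splits into the first-order (Hájek) projection plus a degenerate remainder of smaller order; the spatial weights $\xi_i$ of our setting are deliberately dropped, and all choices are illustrative.

```python
# Numerical check of the Hoeffding decomposition for an unweighted
# i.i.d. degree-2 U-statistic with kernel H(z1, z2) = z1 * z2; the
# weights xi_i and the spatial structure above are dropped on purpose.
import numpy as np
from itertools import permutations

rng = np.random.default_rng(4)
n = 300
Z = rng.standard_normal(n) + 1.0        # i.i.d. sample with mean mu = 1

H = lambda z1, z2: z1 * z2
theta = 1.0                              # E H(Z1, Z2) = mu^2

U = np.mean([H(Z[i], Z[j]) for i, j in permutations(range(n), 2)])

# first-order projection: H_tilde(z) = E H(z, Z') - theta = z * mu - theta
lin = 2.0 * np.mean(Z * 1.0 - theta)     # Hajek (linear) term, O_P(n^{-1/2})
rem = U - theta - lin                    # degenerate remainder, O_P(n^{-1})
print(U - theta, lin, rem)
```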

3.2. Strong Uniform Convergence Rate

We start by giving the following general result concerning the rate of convergence of the U-process presented in (19).
Proposition 1.
Let $\mathcal{F}_m \mathcal{K}_m$ be a measurable VC-subgraph class of functions satisfying Assumption 7, and assume that Assumptions 2 and 3, Condition (B1) in Assumption 4, and Assumptions 5 and 6 are also satisfied. Then, the following result holds:
$$\sup_{\mathcal{F}_m \mathcal{K}_m} \sup_{\mathbf{x} \in \mathcal{H}^m} \sup_{\mathbf{u} \in [0, 1]^{dm}} \left|\hat{\psi}(\mathbf{u}, \mathbf{x}) - \mathbb{E}[\hat{\psi}(\mathbf{u}, \mathbf{x})]\right| = O_{\mathbb{P}_{\cdot \mid S}}\left(\sqrt{\frac{\log n}{n h^{md} \phi^m(h)}}\right), \quad \mathbb{P}_S\text{-a.s.}$$
Next, the uniform rate of convergence of the estimator (9) of the mean function r ( m ) in the model (4) will be given, using the results of the last proposition.
Theorem 1.
Let $\mathcal{F}_m \mathcal{K}_m$ be a measurable VC-subgraph class of functions satisfying Assumption 7. Let $I_h = [C_1 h, 1 - C_1 h]^{dm}$ and let $S_c$ be a compact subset of $\mathcal{H}^m$. Suppose that
$$\inf_{\mathbf{u} \in [0, 1]^d} f_S(\mathbf{u}) > 0.$$
Then, under Assumptions 1–3, Condition (B1) in Assumption 4, and Assumptions 5 and 6 (with $W_{\mathbf{s_i}, A_n} = 1$ and $W_{\mathbf{s_i}, A_n} = \sum_{j=1}^m \epsilon_{s_{i_j}, A_n}$), the following result holds $\mathbb{P}_S$-almost surely:
$$\sup_{\mathcal{F}_m \mathcal{K}_m} \sup_{\mathbf{x} \in S_c} \sup_{\mathbf{u} \in I_h} \left|\hat{r}_n^{(m)}(\mathbf{x}, \mathbf{u}; h_n) - r^{(m)}(\mathbf{x}, \mathbf{u})\right| = O_{\mathbb{P}_{\cdot \mid S}}\left(\sqrt{\frac{\log n}{n h^{md} \phi^m(h)}} + h^{2\alpha} + \frac{1}{A_n^{dp}\, \phi(h)}\right),$$
where $p = \min\{1, \rho\}$ and $\rho > 0$ is given in Definition 1.
It is worth noting here that the approximation of the functional random field $X_{s, A_n}$ by the stationary functional random field $X_{\mathbf{u}}(s)$ produces the error term $A_n^{-dp}\, \phi^{-1}(h)$.

4. Weak Convergence for Kernel Estimators

In this section, we are interested in studying the weak convergence of the conditional U-process defined by Equation (9) under absolutely regular observations. The following theorem is the main result of this work concerning the weak convergence of the functional locally stationary random field estimator. Let us define, for $\varphi_1, \varphi_2 \in \mathcal{F}_m$,
$$\sigma(\varphi_1, \varphi_2) = \mathbb{E}_{\cdot \mid S}\left[\sqrt{n h^{md} \phi^m(h)}\left(\hat{r}_n^{(m)}(\varphi_1, \mathbf{x}, \mathbf{u}; h_n) - r^{(m)}(\varphi_1, \mathbf{x}, \mathbf{u})\right) \times \sqrt{n h^{md} \phi^m(h)}\left(\hat{r}_n^{(m)}(\varphi_2, \mathbf{x}, \mathbf{u}; h_n) - r^{(m)}(\varphi_2, \mathbf{x}, \mathbf{u})\right)\right].$$
Theorem 2.
Let $\mathcal{F}_m \mathcal{K}_m$ be a measurable VC-subgraph class of functions satisfying Assumption 7. Suppose that $f_S(\mathbf{u}) > 0$ and $\epsilon_{s_{i_j}, A_n} = \sigma(s_{i_j}/A_n, \mathbf{x})\, \epsilon_{i_j}$, where $\sigma(\cdot, \cdot)$ is continuous and $\{\epsilon_i\}_{i=1}^n$ is a sequence of i.i.d. random variables with mean zero and variance 1. Moreover, suppose $n h^{m(d+1)+4} \to c_0$ for a constant $c_0$. If all the assumptions of Theorem 1 hold, in addition to Conditions (B2), (B3), and (B4), then, $\mathbb{P}_S$-almost surely,
$$\sqrt{n h^{md} \phi^m(h)}\left(\hat{r}_n^{(m)}(\varphi, \mathbf{x}, \mathbf{u}; h) - r^{(m)}(\varphi, \mathbf{x}, \mathbf{u}) - B_{\mathbf{u}, \mathbf{x}}\right)$$
converges to a Gaussian process over $\mathcal{F}_m \mathcal{K}_m$, whose sample paths are bounded and uniformly continuous with respect to the $\|\cdot\|_2$-norm, with covariance function given by (29), and where the bias term satisfies $B_{\mathbf{u}, \mathbf{x}} = O_{\mathbb{P}_{\cdot \mid S}}(h^{2\alpha})$.
Remark 4.
Set $A_n^d = O(n^{1 - \bar{\eta}_1})$ for some $\bar{\eta}_1 \in [0, 1)$, $A_{1,n} = O(A_n^{\gamma_{A_1}})$, and $A_{2,n} = O(A_n^{\gamma_{A_2}})$ with $0 < \gamma_{A_2} < \gamma_{A_1} < 1/3$ and $p = \min\{1, \rho\} = 1$. Assume that we can take a sufficiently large $\zeta > 2$ such that $\frac{2}{\zeta} < (1 - \bar{\eta}_1)(1 - 3\gamma_{A_1})$. Then, Assumption 4 is satisfied for $d \ge 1$.
Remark 5.
It is simple to modify the proofs of our results to demonstrate that they still hold when the entropy condition is replaced by the bracketing condition:
$$\int_0^{\infty} \left(\log N_{[\,]}(u, \mathcal{F}\mathcal{K}, L_p(\mathbb{P}^m))\right)^{\frac{1}{2}} du < \infty.$$
Refer to p. 270 of [116] for the definition of $N_{[\,]}(u, \mathcal{F}\mathcal{K}, L_p(\mathbb{P}^m))$.
Remark 6.
There are basically no restrictions on the choice of the kernel function in our setup, apart from some mild conditions. The selection of the bandwidth, however, is more problematic. It is worth noticing that the choice of the bandwidth is crucial to obtaining a good rate of consistency; for example, it has a big influence on the size of the estimator's bias. In general, we are interested in a bandwidth selection that achieves a good balance between the bias and the variance of the considered estimators. It is then more appropriate to let the bandwidth vary according to the criterion applied and to the available data and location, which cannot be achieved using classical methods. The interested reader may refer to [117] for more details and discussion on the subject. It would be of interest to establish uniform-in-bandwidth central limit theorems in our setting; i.e., we would let $h > 0$ vary in such a way that $h'_n \le h \le h''_n$, where $\{h'_n\}_{n \ge 1}$ and $\{h''_n\}_{n \ge 1}$ are two sequences of positive constants such that $0 < h'_n \le h''_n < \infty$ and, for either choice $h = h'_n$ or $h = h''_n$, our conditions are fulfilled. It would be of interest to show that
$$\sup_{h'_n \le h \le h''_n} \sqrt{n h^{md} \phi^m(h)}\left(\hat{r}_n^{(m)}(\varphi, \mathbf{x}, \mathbf{u}; h) - r^{(m)}(\varphi, \mathbf{x}, \mathbf{u}) - B_{\mathbf{u}, \mathbf{x}}\right)$$
converges to a Gaussian process over $\mathcal{F}_m \mathcal{K}_m$.
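As a pointer to the data-driven selection discussed here (and developed in Section 7), the following sketch illustrates leave-one-out cross-validation for the bandwidth of a basic kernel regression estimator; the model and the bandwidth grid are illustrative assumptions, and the full spatial-functional procedure is more involved.

```python
# Leave-one-out cross-validation for the bandwidth of a basic kernel
# regression estimator; model and grid are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(9)
n = 300
X = rng.uniform(-2, 2, n)
Y = np.cos(X) + 0.2 * rng.standard_normal(n)

def K(x):
    return 0.75 * (1 - x**2) * (np.abs(x) <= 1)   # Epanechnikov

def cv_score(h):
    err = 0.0
    for i in range(n):
        w = K((X[i] - X) / h)
        w[i] = 0.0                                 # leave observation i out
        if w.sum() == 0.0:
            return np.inf
        err += (Y[i] - np.dot(w, Y) / w.sum()) ** 2
    return err / n

grid = np.linspace(0.1, 1.0, 10)
print(grid[int(np.argmin([cv_score(h) for h in grid]))])
```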

5. Applications

5.1. Metric Learning

Metric learning aims to adapt the metric to the data and has garnered significant interest in recent years; for an overview of metric learning and its applications, see [118,119]. It is prompted by applications ranging from computer vision to information retrieval and bioinformatics. As an example of the utility of this notion, we describe the metric learning problem for supervised classification as in [119]. Consider dependent copies $(X_{s_1, A_n}, Y_1), \ldots, (X_{s_n, A_n}, Y_n)$ of an $\mathcal{H} \times \mathcal{Y}$-valued random couple $(X, Y)$, where $\mathcal{H}$ is some feature space and $\mathcal{Y} = \{1, \ldots, C\}$, with $C \ge 2$, a finite set of labels. Let $\mathcal{D}$ be a set of distance measures $D : \mathcal{H} \times \mathcal{H} \to \mathbb{R}_+$. The intuitive objective of metric learning in this context is to identify a measure under which points with the same label are close together and those with different labels are far apart. The standard way to define the risk of a metric $D$ is as follows:
$$R(D) = \mathbb{E}\left[\phi\left(\left(1 - D(X, X')\right)\cdot\left(2\, \mathbb{1}\{Y = Y'\} - 1\right)\right)\right],$$
where $(X', Y')$ denotes an independent copy of $(X, Y)$ and $\phi(u)$ is a convex loss function upper bounding the indicator function $\mathbb{1}\{u \le 0\}$: for instance, the hinge loss $\phi(u) = \max(0, 1 - u)$. To estimate $R(D)$, we consider the natural empirical estimator
$$R_n(D) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \bar{K}\!\left(\frac{\mathbf{u}_i - s_i/A_n}{h_n}\right) \bar{K}\!\left(\frac{\mathbf{u}_j - s_j/A_n}{h_n}\right) \phi\left(\left(1 - D(X_{s_i, A_n}, X_{s_j, A_n})\right)\cdot\left(2\, \mathbb{1}\{Y_i = Y_j\} - 1\right)\right),$$
which is a one-sample U-statistic of degree two with kernel given by:
$$\varphi_D\left((x, y), (x', y')\right) = \phi\left(\left(1 - D(x, x')\right)\cdot\left(2\, \mathbb{1}\{y = y'\} - 1\right)\right).$$
The convergence to (30) of a minimizer of (31) has been studied, in the non-spatial setting, within the frameworks of algorithmic stability [120] and algorithmic robustness [121], and based on the theory of U-processes under appropriate regularization [122].
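As a concrete illustration of (31), the following sketch evaluates the empirical risk for a Mahalanobis-type distance with the hinge loss; the spatial kernel weights are set to one, and the data, labels, and matrices are assumptions made for the example.

```python
# Empirical metric-learning risk (31) with hinge loss and a
# Mahalanobis-type distance; spatial kernel weights are set to one and
# data, labels, and matrices are illustrative assumptions.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, p = 200, 3
X = rng.standard_normal((n, p))
Y = (X[:, 0] > 0).astype(int)            # labels tied to the first feature

def emp_risk(M):
    # one-sample U-statistic of degree two with kernel phi_D
    total = 0.0
    for i, j in combinations(range(n), 2):
        diff = X[i] - X[j]
        u = (1.0 - diff @ M @ diff) * (2.0 * (Y[i] == Y[j]) - 1.0)
        total += max(0.0, 1.0 - u)       # hinge loss
    return 2.0 * total / (n * (n - 1))

print(emp_risk(np.eye(p)), emp_risk(np.diag([4.0, 0.1, 0.1])))
```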

5.2. Multipartite Ranking

Let us recall the problem from [119]. Let $X \in \mathcal{H}$ be a random vector of attributes/features and $Y \in \{1, \ldots, K\}$ the (temporarily hidden) ordinal label assigned to it. The goal of multipartite ranking is to use a training set of labeled examples to rank the features in the same order as the labels. This statistical learning problem arises in many fields (e.g., medicine, finance, search engines, e-commerce). Rankings are usually defined by a scoring function $s : \mathcal{H} \to \mathbb{R}$, which transports the natural order on the real line to the feature space. The ROC manifold, or its usual summary, the VUS criterion (VUS stands for Volume Under the ROC Surface), is the gold standard for evaluating the ranking performance of $s(x)$; see [123] and the references therein. The optimal scoring functions, according to [124], are those that are optimal for all bipartite subproblems; more specifically, they are increasing transformations of the likelihood ratio $dF_{k+1}/dF_k$, where $F_k$ is the class-conditional distribution for the $k$th class. When the set of optimal scoring functions is not empty, the authors showed that it coincides with the set of functions that maximize the volume under the ROC surface
$$\mathrm{VUS}(s) = \mathbb{P}\left(s(X_1) < \cdots < s(X_K) \mid Y_1 = 1, \ldots, Y_K = K\right).$$
Given $K$ independent samples $X^{(k)}_{s_1, A_{n_k}}, \ldots, X^{(k)}_{s_{n_k}, A_{n_k}}$ with distribution $F_k(dx)$ for $k = 1, \ldots, K$, the empirical counterpart of the VUS can be written in the following way:
$$\widehat{\mathrm{VUS}}(s) = \frac{1}{\prod_{k=1}^K n_k} \sum_{i_1 = 1}^{n_1} \cdots \sum_{i_K = 1}^{n_K} \prod_{j=1}^K \bar{K}\!\left(\frac{\mathbf{u}_j - s_{i_j}/A_n}{h_n}\right) \mathbb{1}\left\{s\!\left(X^{(1)}_{s_{i_1}, A_{n_1}}\right) < \cdots < s\!\left(X^{(K)}_{s_{i_K}, A_{n_K}}\right)\right\}.$$
The empirical VUS (32) is a $K$-sample U-statistic of degree $(1, \ldots, 1)$ with kernel given by:
$$\varphi_s\left(x_1, \ldots, x_K\right) = \mathbb{1}\left\{s(x_1) < \cdots < s(x_K)\right\}.$$
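The following sketch computes the empirical VUS for $K = 3$ ordered classes, with the spatial kernel weights set to one; the Gaussian class-conditional samples and the identity scoring function are illustrative assumptions.

```python
# Empirical VUS (32) for K = 3 ordered classes, spatial weights set to
# one; samples and the identity scoring function are illustrative.
import numpy as np
from itertools import product

rng = np.random.default_rng(6)
X1 = rng.normal(0.0, 1.0, 40)            # class 1
X2 = rng.normal(1.0, 1.0, 40)            # class 2
X3 = rng.normal(2.0, 1.0, 40)            # class 3

def vus(s, samples):
    # K-sample U-statistic: fraction of K-tuples ranked in label order
    count, total = 0, 0
    for xs in product(*samples):
        scores = [s(x) for x in xs]
        count += all(scores[k] < scores[k + 1] for k in range(len(xs) - 1))
        total += 1
    return count / total

print(vus(lambda x: x, (X1, X2, X3)))    # chance level would be 1/3! = 1/6
```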

5.3. Set Indexed Conditional U-Statistics

We aim to study the links between $X$ and $Y$ by estimating functional operators associated with the conditional distribution of $Y$ given $X$, such as the regression operator, for $C_1 \times \cdots \times C_m := \tilde{C}$ in a class of sets $\mathscr{C}^m$,
$$G^{(m)}\left(C_1 \times \cdots \times C_m \mid \mathbf{t}, \mathbf{u}\right) = \mathbb{E}\left[\prod_{i=1}^m \mathbb{1}\{Y_i \in C_i\} \,\Big|\, (X_1, \ldots, X_m) = (t_1, \ldots, t_m) = \mathbf{t}\right] \quad \text{for } \mathbf{t} \in S_c,$$
where $\mathbf{u} = (\mathbf{u}_1, \ldots, \mathbf{u}_d)$. We define metric entropy with inclusion for the class of sets $\mathscr{C}$. For each $\varepsilon > 0$, the covering number is defined as:
$$N\left(\varepsilon, \mathscr{C}, G^{(1)}(\cdot \mid x)\right) = \inf\left\{n \in \mathbb{N} : \exists\, C_1, \ldots, C_n \in \mathscr{C} \text{ such that } \forall\, C \in \mathscr{C}, \; \exists\, 1 \le i, j \le n \text{ with } C_i \subset C \subset C_j \text{ and } G^{(1)}(C_j \setminus C_i \mid x) < \varepsilon\right\};$$
the quantity $\log N(\varepsilon, \mathscr{C}, G^{(1)}(\cdot \mid x))$ is called the metric entropy with inclusion of $\mathscr{C}$ with respect to the conditional distribution $G^{(1)}(\cdot \mid x)$. Estimates of such covering numbers are known for many classes (see, e.g., [125]). We will often assume below that either $\log N(\varepsilon, \mathscr{C}, G^{(1)}(\cdot \mid x))$ or $N(\varepsilon, \mathscr{C}, G^{(1)}(\cdot \mid x))$ behaves like a power of $\varepsilon^{-1}$: we say that condition $(R_{\gamma})$ holds if
$$\log N\left(\varepsilon, \mathscr{C}, G^{(1)}(\cdot \mid x)\right) \le H_{\gamma}(\varepsilon), \quad \text{for all } \varepsilon > 0,$$
where
$$H_{\gamma}(\varepsilon) = \begin{cases} \log(A/\varepsilon) & \text{if } \gamma = 0, \\ A \varepsilon^{-\gamma} & \text{if } \gamma > 0, \end{cases}$$
for some constants $A, r > 0$. As in [126], it is worth noticing that condition (33) with $\gamma = 0$ holds for intervals, rectangles, balls, ellipsoids, and for classes constructed from these by performing the set operations union, intersection, and complement finitely many times. The class of convex sets in $\mathbb{R}^d$ ($d \ge 2$) fulfills condition (33) with $\gamma = (d - 1)/2$. This and other classes of sets satisfying (33) with $\gamma > 0$ can be found in [125]. As a particular case of (9), we estimate $G^{(m)}(C_1 \times \cdots \times C_m \mid \mathbf{t}, \mathbf{u})$ by
$$\hat{G}_n^{(m)}(\tilde{C}, \mathbf{t}, \mathbf{u}) = \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \mathbb{1}\{Y_{s_{i_j}, A_n} \in C_j\} \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}.$$
One can apply Theorem 1 to infer that, in probability,
$$\sup_{\tilde{C} \in \mathscr{C}^m} \sup_{\mathbf{t} \in S_c, \, \mathbf{u} \in I_h} \left|\hat{G}_n^{(m)}(\tilde{C}, \mathbf{t}, \mathbf{u}) - G^{(m)}(\tilde{C} \mid \mathbf{t}, \mathbf{u})\right| \to 0.$$
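For a concrete special case, the sketch below estimates a set-indexed conditional probability $G(C \mid t) = \mathbb{P}(Y \in C \mid X = t)$ over intervals $C = [a, b]$ for $m = 1$ and scalar data, as an elementary analogue of (34); all modelling choices are illustrative.

```python
# Set-indexed conditional estimate for m = 1 and scalar data:
# G_hat(C | t) for intervals C = [a, b]; all choices are illustrative.
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(10)
n, h = 400, 0.3
X = rng.uniform(-1, 1, n)
Y = X + 0.5 * rng.standard_normal(n)     # Y | X = t  ~  N(t, 0.25)

def K(x):
    return 0.75 * (1 - x**2) * (np.abs(x) <= 1)   # Epanechnikov

def G_hat(a, b, t):
    w = K((t - X) / h)
    return np.sum(w * ((Y >= a) & (Y <= b))) / np.sum(w)

Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))      # standard normal cdf
t, a, b = 0.2, 0.0, 1.0
true = Phi((b - t) / 0.5) - Phi((a - t) / 0.5)    # exact conditional probability
print(G_hat(a, b, t), true)
```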
Remark 7.
Another point of view is to consider the following situation: for a compact $J \subset \mathbb{R}^{dm}$,
$$G^{(m)}\left(y_1, \ldots, y_m \mid \mathbf{t}, \mathbf{u}\right) = \mathbb{E}\left[\prod_{i=1}^m \mathbb{1}\{Y_i \le y_i\} \,\Big|\, (X_1, \ldots, X_m) = \mathbf{t}\right] \quad for \ \mathbf{t} \in S_c, \; (y_1, \ldots, y_m) \in J.$$
Let $L(\cdot)$ be a distribution function on $\mathbb{R}^d$ and $h_n$ a sequence of positive real numbers. One can estimate $G^{(m)}(y_1, \ldots, y_m \mid \mathbf{t}, \mathbf{u}) = G^{(m)}(\mathbf{y} \mid \mathbf{t}, \mathbf{u})$ by
$$\hat{G}_n^{(m)}(\mathbf{y}, \mathbf{t}, \mathbf{u}) := \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m L\!\left(\frac{y_j - Y_{s_{i_j}, A_n}}{h_n}\right) \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j,\ell} - s_{i_j,\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_j, X_{s_{i_j}, A_n})}{h_n}\right)}.$$
One can use Theorem 1 to infer that, in probability,
$$\sup_{\mathbf{t} \in S_c, \, \mathbf{u} \in I_h} \sup_{\mathbf{y} \in J} \left|\hat{G}_n^{(m)}(\mathbf{y}, \mathbf{t}, \mathbf{u}) - G^{(m)}(\mathbf{y} \mid \mathbf{t}, \mathbf{u})\right| \to 0.$$

5.4. Discrimination

Now, we apply our results to the problem of discrimination described in Section 3 of [127]; refer also to [128]. We will use similar notation and settings. Let $\varphi(\cdot)$ be any function taking at most finitely many values, say $1, \ldots, M$. The sets
$$A_j = \left\{(y_1, \ldots, y_m) : \varphi(y_1, \ldots, y_m) = j\right\}, \quad 1 \le j \le M,$$
then yield a partition of the feature space. Predicting the value of $\varphi(y_1, \ldots, y_m)$ is tantamount to predicting the set in the partition to which $(Y_1, \ldots, Y_m)$ belongs. For any discrimination rule $g(\cdot)$, we have
$$\mathbb{P}\left(g(X_1, \ldots, X_m) = \varphi(Y_1, \ldots, Y_m)\right) \le \sum_{j=1}^M \int_{\{\mathbf{t} : g(\mathbf{t}) = j\}} \max_{1 \le j' \le M} m_{j'}(\mathbf{t})\, d\mathbb{P}(\mathbf{t}),$$
where
$$m_j(\mathbf{t}) = \mathbb{P}\left(\varphi(Y_1, \ldots, Y_m) = j \mid (X_1, \ldots, X_m) = \mathbf{t}\right), \quad \mathbf{t} \in S_c.$$
The above inequality becomes an equality if
$$g_0(\mathbf{t}) = \arg\max_{1 \le j \le M} m_j(\mathbf{t}).$$
The function $g_0(\cdot)$ is called the Bayes rule, and the corresponding probability of error
$$L^* = 1 - \mathbb{P}\left(g_0(X_1, \ldots, X_m) = \varphi(Y_1, \ldots, Y_m)\right) = 1 - \mathbb{E}\left[\max_{1 \le j \le M} m_j(\mathbf{X})\right]$$
is called the Bayes risk. Each of the unknown functions $m_j(\cdot)$ can be consistently estimated by one of the methods discussed in the preceding sections. Let, for $1 \le j \le M$,
$$m_{nj}(\mathbf{x}, \mathbf{u}) = \frac{\displaystyle\sum_{\mathbf{i} \in I_n^m} \mathbb{1}\left\{\varphi(Y_{s_{i_1}, A_n}, \ldots, Y_{s_{i_m}, A_n}) = j\right\} \prod_{j'=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j',\ell} - s_{i_{j'},\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_{j'}, X_{s_{i_{j'}}, A_n})}{h_n}\right)}{\displaystyle\sum_{\mathbf{i} \in I_n^m} \prod_{j'=1}^m \prod_{\ell=1}^d K_1\!\left(\frac{u_{j',\ell} - s_{i_{j'},\ell}/A_n}{h_n}\right) K_2\!\left(\frac{d(x_{j'}, X_{s_{i_{j'}}, A_n})}{h_n}\right)}.$$
Set
$$g_{0,n}(\mathbf{t}) = \arg\max_{1 \le j \le M} m_{nj}(\mathbf{t})$$
and introduce
$$L_n = \mathbb{P}\left(g_{0,n}(X_1, \ldots, X_m) \ne \varphi(Y_1, \ldots, Y_m)\right).$$
Then, one can show that the discrimination rule $g_{0,n}(\cdot)$ is asymptotically Bayes-risk consistent:
$$L_n \to L^*.$$
This follows from the obvious relation
$$0 \le L_n - L^* \le 2\, \mathbb{E}\left[\max_{1 \le j \le M} \left|m_{nj}(\mathbf{X}, \mathbf{u}) - m_j(\mathbf{X})\right|\right].$$
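A minimal sketch of the plug-in rule $g_{0,n}$ for $m = 1$ and scalar covariates is given below: the posteriors $m_j$ are estimated by kernel smoothing and the label with the largest estimate is returned; kernel, data, and the label function $\varphi$ are illustrative choices.

```python
# Plug-in discrimination rule g_{0,n} for m = 1 and scalar X: kernel
# estimates of the posteriors m_j, classified by argmax. Kernel, data,
# and the label function phi are illustrative choices.
import numpy as np

rng = np.random.default_rng(7)
n, h = 400, 0.3
X = rng.uniform(-2, 2, n)
labels = (X + 0.3 * rng.standard_normal(n) > 0).astype(int) + 1   # phi in {1, 2}

def K(x):
    return 0.75 * (1 - x**2) * (np.abs(x) <= 1)   # Epanechnikov

def m_hat(j, t):
    w = K((t - X) / h)
    return np.sum(w * (labels == j)) / np.sum(w)

def g0n(t):
    # mimics g_{0,n}(t) = argmax_j m_nj(t)
    return max((1, 2), key=lambda j: m_hat(j, t))

print([g0n(t) for t in (-1.0, -0.1, 0.1, 1.0)])   # roughly [1, 1, 2, 2]
```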

6. Extension to the Censored Case

Consider a triple $(Y, C, X)$ of random variables defined in $\mathbb{R} \times \mathbb{R} \times \mathcal{H}$. Here, $Y$ is the variable of interest, $C$ is a censoring variable, and $X$ is a concomitant variable. Throughout, we use the notation of [129] and work with a sample $\{(Y_i, C_i, X_{s_i, A_n})\}$ of identically distributed replications of $(Y, C, X)$, $n \ge 1$. Actually, in the right censorship model, the pairs $(Y_i, C_i)$, $1 \le i \le n$, are not directly observed, and the corresponding information is given by $Z_i := \min\{Y_i, C_i\}$ and $\delta_i := \mathbb{1}\{Y_i \le C_i\}$, $1 \le i \le n$. Accordingly, the observed sample is
$$\mathcal{D}_n = \{(Z_i, \delta_i, X_{s_i, A_n}), \; i = 1, \ldots, n\}.$$
Survival data in clinical trials or failure time data in reliability studies, for example, are often subject to such censoring. More specifically, many statistical experiments result in incomplete samples, even under well-controlled conditions. For example, clinical data for surviving most types of disease are usually censored by other competing risks to life which result in death. In the sequel, we impose the following assumptions upon the distribution of ( X , Y ) . For < t < , set
F Y ( t ) = P ( Y t ) , G ( t ) = P ( C t ) , a n d H ( t ) = P ( Z t ) ,
the right-continuous distribution functions of Y, C and Z respectively. For any right-continuous distribution function L defined on R , denote by
T L = sup { t R : L ( t ) < 1 }
the upper point of the corresponding distribution. Now, consider a point-wise measurable class F of real measurable functions defined on R , and assume that F is of VC-type. We recall the regression function of ψ ( Y ) evaluated at X = x , for ψ F and x H , given by
r ( 1 ) ( ψ , x ) = E ( ψ ( Y ) X = x ) ,
when Y is right-censored. To estimate r ( 1 ) ( ψ , · ) , we make use of the Inverse Probability of Censoring Weighted (I.P.C.W.) estimators have recently gained popularity in the censored data literature (see [130,131,132]). The key idea of I.P.C.W. estimators is as follows. Introduce the real-valued function Φ ψ ( · , · ) defined on R 2 by
Φ ψ ( y , c ) = 1 { y c } ψ ( y c ) 1 G ( y c ) .
Assuming the function G ( · ) to be known, first note that Φ ψ ( Y i , C i ) = δ i ψ ( Z i ) / ( 1 G ( Z i ) ) is observed for every 1 i n . Moreover, under the Assumption ( I ) below,
( I )
C and ( Y , X ) are independent.
We have
r ( 1 ) ( Φ ψ , x ) : = E ( Φ ψ ( Y , C ) X = x ) = E 1 { Y C } ψ ( Z ) 1 G ( Z ) X = x = E ψ ( Y ) 1 G ( Y ) E ( 1 { Y C } X , Y ) X = x = r ( 1 ) ( ψ , x ) .
Therefore, any estimate of r ( 1 ) ( Φ ψ , · ) , which can be built on fully observed data, turns out to be an estimate for r ( 1 ) ( ψ , · ) too. Thanks to this property, most statistical procedures known to provide estimates of the regression function in the uncensored case can be naturally extended to the censored case. For instance, kernel-type estimates are particularly easy to construct. Set, for x H , h l n , 1 i n ,
ω ¯ n , K 1 , 2 , h n , i ( 1 ) ( x , u ) : = = 1 d K 1 u s j , A n h n K 2 d ( x , X s j , A n ) h n j = 1 n = 1 d K 1 u s j , A n h n K 2 d ( x , X s j , A n ) h n .
In view of (37)–(39), whenever G ( · ) is known, a kernel estimator of r ( 1 ) ( ψ , · ) is given by
r ˘ n ( 1 ) ( ψ , x , u ; h n ) = i = 1 n ω ¯ n , K 1 , 2 , h n , i ( 1 ) ( x , u ) δ i ψ ( Z i ) 1 G ( Z i ) .
The distribution function G ( · ) is generally unknown and has to be estimated. We will denote by G n ( · ) the Kaplan–Meier estimator of the function G ( · ) [133]. Namely, adopting the conventions
= 1
and 0 0 = 1 and setting
N n ( u ) = i = 1 n 1 { Z i u } ,
we have
G n ( u ) = 1 i : Z i u N n ( Z i ) 1 N n ( Z i ) ( 1 δ i ) , for u R .
Given this notation, we will investigate the following estimator of r ( 1 ) ( ψ , · )
r ˘ n ( 1 ) ( ψ , x , u ; h n ) = i = 1 n ω ¯ n , K 1 , 2 , h n , i ( 1 ) ( x , u ) δ i ψ ( Z i ) 1 G n ( Z i ) ,
refer to [129,130]. Adopting the convention 0 / 0 = 0 , this quantity is well defined, since G n ( Z i ) = 1 if and only if Z i = Z ( n ) and δ ( n ) = 0 , where Z ( k ) is the kth-ordered statistic associated with the sample ( Z 1 , , Z n ) for k = 1 , , n and δ ( k ) is the δ j corresponding to Z k = Z j . When the variable of interest is right-censored, the functional of the (conditional) law can generally not be estimated on the complete support (see [132]). To obtain our results, we will work under the following assumptions.
(A.1) 
F = { ψ : = ψ 1 1 { ( , τ ) m } , ψ 1 F 1 } , where τ < T H and F 1 is a point-wise measurable class of real measurable functions defined on R and of type VC.
(A.2) 
The class of functions F has a measurable and uniformly bounded envelope function Υ with,
Υ ( y 1 , , y k ) sup ψ F ψ ( y 1 , , y k ) , y i T H .
We now have all the ingredients to state the result corresponding to the censored case. By combining the results of Proposition 9.6 and Lemma 9.7 of [134], Theorem 1, we have, in probability,
sup x , u r ˘ n ( 1 ) ( ψ , x , u ; h n ) E ^ ( r ˘ n ( 1 ) ( ψ , x , u ; h n ) ) 0 .
A right-censored version of an unconditional U-statistic with a kernel of degree m 1 is introduced by the principle of a mean preserving reweighting scheme in [135]. Ref. [136] has proved almost sure convergence of multi-sample U-statistics under random censorship and provided application by considering the consistency of a new class of tests designed for testing equality in distribution. To overcome potential biases arising from right-censoring of the outcomes and the presence of confounding covariates, Ref. [137] proposed adjustments to the classical U-statistics. Ref. [138] proposed a different way in the estimation procedure of the U-statistic by using a substitution estimator of the conditional kernel given the observed data. To our best knowledge, the problem of the estimation of the conditional U-statistics was opened up to the present, and it gives our main motivation to the study of this section. A natural extension of the function defined in (37) is given by
Φ ψ ( y 1 , , y m , c 1 , , c m ) = i = 1 m { 1 { y i c i } ψ ( y 1 c 1 , , y m c m ) i = 1 m { 1 G ( y i c i ) } .
From this, we have an analogous relation to (38) given by
E ( Φ ψ ( Y 1 , , Y m , C 1 , , C m ) ( X 1 , , X m ) = t ) = E i = 1 m { 1 { Y i C i } ψ ( Y 1 C 1 , , Y k C m ) i = 1 m { 1 G ( Y i C i ) } ( X 1 , , X m ) = t = E ψ ( Y 1 , , Y m ) i = 1 m { 1 G ( Y i ) } E i = 1 m { 1 { Y i C i } ( Y 1 , X 1 ) , ( Y m , X m ) ( X 1 , , X m ) = t = E ψ ( Y 1 , , Y m ) ( X 1 , , X m ) = t = m ψ ( t ) .
An analogue estimator to (9) in the censored case is given by
r ˘ n ( m ) ( ψ , t , u ; h n ) = ( i 1 , , i m ) I ( m , n ) δ i 1 δ i m ψ ( Z i 1 , , Z i m ) ( 1 G ( Z i 1 ) ( 1 G ( Z i k ) ) ω ¯ n , K 1 , 2 , h n , i ( m ) ( t , u ) ,
where, for i = ( i 1 , , i k ) I ( k , n ) ,
ω ¯ n , K 1 , 2 , h n , i ( k ) ( x , u ) j = 1 m = 1 d K 1 u j , s i j , A n h n K 2 d ( x j , X s i j , A n ) h n i I n m j = 1 m = 1 d K 1 u j , s i j , A n h n K 2 d ( x j , X s i j , A n ) h n .
The estimator that we will investigate is given by
r ˘ n ( m ) ( ψ , t , u ; h n ) = ( i 1 , , i k ) I ( m , n ) δ i 1 δ i k ψ ( Z i 1 , , Z i m ) ( 1 G n ( Z i 1 ) ( 1 G n ( Z i m ) ) ω ¯ n , K 1 , 2 , h n , i ( k ) ( t , u ) .
Corollary 1.
Under the assumptions (A.1)(A.2) and the conditions of Theorem 1, we have
sup x H m sup u I h , x S c r ˘ n ( m ) ( ψ , t , u ; h n ) E r ˘ n ( m ) ( ψ , t , u ; h n ) = O P . | S log n / n h m d ϕ m ( h ) + 1 A n d p ϕ ( h ) ,
In the last corollary, we use the law of iterated logarithm for G n ( · ) established in [139] ensuring that
sup t τ | G n G ( t ) | = O log log n n almost surely as n .
At this point, we may refer to [69,134,140].

7. The Bandwidth Selection Criterion

Many methods have been established and developed to construct, in asymptotically optimal ways, bandwidth selection rules for non-parametric kernel estimators especially for the Nadaraya–Watson regression estimator we quote among them [141,142,143]. This parameter has to be selected suitably, either in the standard finite dimensional case, or in the infinite dimensional framework for ensuring good practical performances. However, according to our knowledge, such studies do not presently exist for treating a such general functional conditional U-statistic. Nevertheless, an extension of the leave-one-out cross-validation procedure allows us to define, for any fixed j = ( j 1 , , j m ) I n m :
r ^ n , j ( m ) ( x , u ; h n ) = i I n m ( j ) φ ( Y s i 1 , A n , , Y s i m , A n ) j = 1 m = 1 d K 1 u j , s i j , A n h n K 2 d ( x j , X s i j , A n ) h n i I n m j = 1 m = 1 d K 1 u j , s i j , A n h n K 2 d ( x j , X s i j , A n ) h n ,
where
I n m ( j ) : = i I n m and i j = I n m { j } .
The Equation (47) represents the leave-one-out- X j , Y j estimator of the functional regression and also could be considered as a predictor of φ ( Y s j 1 , A n , , Y s j m , A n ) : = φ ( Y j ) . In order to minimize the quadratic loss function, we introduce the following criterion: we have for some (known) non-negative weight function W ( · ) :
C V φ , h n : = ( n m ) ! n ! j I n m φ Y j r ^ n , j ( m ) ( X j , u ; h n ) 2 W ˜ X j ,
where X j = ( X s j 1 , A n , , X s j m , A n ) . Following the ideas developed by [143], a natural way for choosing the bandwidth is to minimize the precedent criterion, so let us choose h ^ n [ a n , b n ] minimizing among h [ a n , b n ] :
C V φ , h n .
The main interest of our results is the possibility to derive the asymptotic properties of our estimate even if the bandwidth parameter is a random variable, as in the last equation. Following [144] where the bandwidths are locally chosen by a data-driven method based on the minimization of a functional version of a cross-validated criterion, one can replace (48) by
C V φ , h n : = ( n m ) ! n ! j I n m φ Y j r ^ n , j ( m ) ( X j , u ; h n ) 2 W ^ X j , x ,
where
W ^ s , t : = i = 1 m W ^ ( s i , t i ) .
In practice, one takes for i I n m the uniform global weights W ˜ X i = 1 , and the local weights
W ^ ( X i , t ) = 1 if d ( X i , t ) h , 0 otherwise .
For the sake of brevity, we have just considered the most popular method: that is, the cross-validated selected bandwidth. This may be extended to any other bandwidth selector such as the bandwidth based on Bayesian ideas [145].
Remark 8.
For notational convenience, we have chosen the same bandwidth sequence for each margin. This assumption can be dropped easily if one wants to use the vector bandwidths (see, in particular, Chapter 12 of [6]). With obvious changes of notation, our results and their proofs remain true when h n is replaced by a vector bandwidth h n = ( h n ( 1 ) , , h n ( d ) ) , where min h n ( i ) > 0 . In this situation, we set h n = i = 1 d h n ( i ) , and for any vector v = ( v 1 , , v d ) , we replace v / h n by ( v 1 / h n ( 1 ) , , v 1 / h n ( d ) ) . For ease of presentation, we chose to use real-valued bandwidths throughout.
Remark 9.
We mention that a different bandwidth criterion suggested by [1] is the rule of thumb. Strictly speaking, since the cross-validated bandwidth is random, the asymptotic theory can only be justified with this random bandwidth via a specific stochastic equicontinuity argument. Cross-validation is employed by [146] to examine the equality of two unconditional and conditional functions in the context of mixed categorical and continuous data. However, this approach, which is optimal for estimation, loses its optimality when applied to non-parametric kernel testing. For testing a parametric model for conditional mean function against a non-parametric alternative, Ref. [147] proposed an adaptive-rate-optimal rule. Ref. [148] present the other method for selecting a proper bandwidth. Ref. [148] propose, utilizing the Edgeworth expansion of the asymptotic distribution of the test, to select the bandwidth such that the power function of the test is maximized while the size function is controlled. Future investigation will focus on the aforementioned three approaches.

8. Concluding Remarks

In this paper, we considered the kernel type estimator for conditional U-statistics, including a particular case, the Nadaraya–Watson estimator, in a functional setting with random fields. To obtain our results, we ought to make assumptions requiring some regularity on the conditional U-statistics and conditional moments, some decay rates on the probability of the variables belonging to shrinking open balls, and suitable decreasing rates on the mixing coefficients. Mainly, the conditional moment assumption enables the consideration of unbounded classes of functions. The proof of the weak convergence respects a typical technique: finite dimensional convergence and equicontinuity of the conditional U-processes.
Both results, the uniform rate of convergence and the weak convergence, are grounded on a general blocking technique adjusted for irregularly spaced sampling sites, where we need to pay attention to the effect of the non-equidistant sampling sites. We intricately reduce the work to the independent setting to address this issue. Indeed, as there is no practical guidance for introducing order to spatial points as opposed to time series, not asymptotically but exactly independent blocks of observations have been constructed by ([149], Corollary 2.7) (Lemma A4) and then results of independent data could be applied directly to the independent blocks. Here, Ref. [149] declares that the uniform convergence result requires the β -mixing condition to connect the original sequence with the sequence of the independent blocks, and this connection still holds under the ϕ -mixing condition but is not necessary under the α -mixing conditions. Therefore, we use the β -mixing sequence as we aim to derive the weak convergence for processes indexed by classes of functions.
Ref. [85] in his work gives us a possible extension of the sampling region inspired by [93]. This extension can be explained as follows. It is feasible to generalize the definition of the sample region R n to include non-standard forms. For instance, we may use the sample region concept [93] as follows: First, let R n be the sampling region. Define R 0 as an open connected subset of ( 2 , 2 ] d containing [ 1 , 1 ] d and R 0 as a Borel set such that R 0 R 0 R ¯ 0 , and where for any set S R d , S ¯ signifies its closure. Let A n n 1 be a sequence of positive numbers such that A n as n and define R n = A n R 0 as a sampling region. In addition, for any sequence of positive numbers a n n 1 with a n 0 as n , let O a n d + 1 , as n , be the number of cubes of the form a n i + [ 0 , 1 ) d , i Z d with their lower left corner a n i on the lattice a n Z d that intersects both R 0 and R 0 c (see Condition B in [93], Chapter 12, Section 12.2) (This condition is the prototype R 0 boundary’s condition; it must always be assumed on the region R n to prevent pathological situations, and it is satisfied by the majority of areas of practical significance. This condition is satisfied in the plane (d = 2), for instance, if the boundary R 0 of R 0 is defined by a simple rectifiable curve of limited length. When sample sites are defined on the integer grid Z d , this condition means that the effect of data points toward the boundary of R n is small compared to the overall number of data points). In addition, define f as a continuous, everywhere positive probability density function on R 0 , and let S 0 , i i 1 be a sequence of i.i.d. random vectors with density f. Assume that S 0 , i i 1 and X s , A n are independent. Replacing our setting in Section 2.4 with this new one, our results still hold, and it will be possible to show uniform convergence and weak convergence under the same assumptions and identical proofs. For future investigation, it will be interesting to relax the mixing conditions to the weak dependence (or the ergodicity framework). This generalization is nontrivial, since we need some maximal moment inequalities in our asymptotic results that are not available in this setting. Another interesting direction is to consider the incomplete data setting (missing at random, censored in different schemes) for locally spatial–functional data. A natural question is how to adapt our results to the wavelet-based estimators, the delta sequence estimators, the kNN estimators, and the local linear estimators.

9. Mathematical Developments

The proofs for our results are covered in this section. The following continues to use the notations that were previously presented.
To avoid the repetition of the Blocking technique and the notation used, we will devote the following subsection to introducing all notations needed for this decomposition.

9.1. Preliminaries

This treatment requires an extension of the Blocking techniques of Bernstein to the spacial process, refer to [85]. Let us introduce some notations related to this technique. Recall that A 1 , n and A 2 . n are sequences of positive numbers such that
A 1 , n / A n + A 2 , n / A 1 , n 0 as n .
Let
A 3 , n = A 1 , n + A 2 , n .
We consider a partition of R d by hypercubes of the form Γ n ( ; 0 ) = + ( 0 , 1 ] d A 3 , n , = 1 , , d Z d and divide Γ n ( ; 0 ) into 2 d hypercubes as follows:
Γ n ( ; ϵ ) = j = 1 d I j ϵ j , ϵ = ϵ 1 , , ϵ d { 1 , 2 } d ,
where for j = 1 , , d ,
I j ϵ j = j A 3 , n , j A 3 , n + A 1 , n if ϵ j = 1 , j A 3 , n + A 1 , n , j + 1 A 3 , n if ϵ j = 2 .
We note that
Γ n ( ; ϵ ) = A 1 , n q ( ϵ ) A 2 , n d q ( ϵ )
for any Z d and ϵ { 1 , 2 } d , where
q ( ϵ ) = 1 j d : ϵ j = 1 .
Let ϵ 0 = ( 1 , , 1 ) . The partitions Γ n ; ϵ 0 correspond to “large blocks” and the partitions Γ ( ; ϵ ) for ϵ ϵ 0 correspond to “small blocks”.
Let
L 1 , n = Z d : Γ n ( , 0 ) R n
be the index set of all hypercubes Γ n ( , 0 ) that are contained in R n , and let
L 2 , n = Z d : Γ n ( , 0 ) R n 0 , Γ n ( , 0 ) R n c
denote the boundary hypercubes index set. Define L n = L 1 , n L 2 , n .

9.2. Proof of Proposition 1

As we mentioned, our statistic is a weighted U-statistic that can be decomposed into a sum of U-statistics using the Hoeffding decomposition. We will treat this decomposition detailed in Section 3.1 to achieve the desired results. In the mentioned section, we have seen that
ψ ^ ( u , x ) E ψ ^ ( u , x ) = ψ ^ 1 ( u , x ) + ψ ^ 2 ( u , x ) ,
where the linear term ψ ^ 1 ( u , x ) and the remainder term ψ ^ 2 ( u , x ) are well defined in (24) and (26), respectively. We aim to prove that the linear term leads the rate of convergence of this statistic while the remaining one converges to zero almost surely as n . We will deal with the first term in the decomposition. For B = [ 0 , 1 ] , α n = log n / n h m d ϕ m ( h ) and τ n = ρ n n 1 / ζ , where ζ is a positive constant given in Assumption 6 part (i), with ρ n = ( log n ) ζ 0 for some ζ 0 > 0 . Define
H ˜ 1 ( ) ( z ) : = H ˜ ( ) ( z ) 1 W s i , A n τ n ,
H ˜ 2 ( z ) : = H ˜ ( ) ( z ) 1 W s i , A n > τ n ,
and
ψ ^ 1 ( 1 ) ( u , x ) θ ( i ) = 1 n i = 1 n ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i ξ i m 1 H ˜ 1 ( ) ( z ) , ψ ^ 1 ( 2 ) ( u , x ) θ ( i ) = 1 n i = 1 n ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i ξ i m 1 H ˜ 2 ( ) ( z ) .
Clearly, we have
ψ ^ 1 ( u , x ) E ψ ^ 1 ( u , x ) = ψ ^ 1 ( 1 ) ( u , x ) E ψ ^ 1 ( 1 ) ( u , x ) + ψ ^ 1 ( 2 ) ( u , x ) E ψ ^ 1 ( 2 ) ( u , x ) .
To begin, it is plain to see that
P · S sup F m K m sup x H m sup u B m ψ ^ 1 ( 2 ) ( u , x ) θ ( i ) > α n = P · S sup F m K m sup x H m sup u B m ψ ^ 1 ( 2 ) ( u , x ) θ ( i ) > α n sup F m K m sup x H m i = 1 n W s i , A n > τ n sup F m K m sup x H m i = 1 n W s i , A n > τ n c P · S sup F m K m sup x H m sup u B m ψ ^ 1 ( 2 ) ( u , x , φ ) θ ( i ) > α n sup F m K m sup x H m sup u B m i = 1 n W s i , A n > τ n + P · S sup F m K m sup x H m sup u B m ψ ^ 2 ( 1 ) ( u , x , φ ) θ ( i ) > α n sup F m K m sup x H m sup u B m i = 1 n W s i , A n > τ n c P · S sup F m K m sup x H m sup u B m W s i , A n > τ n for some i = 1 , , n + P · S ( ) τ n ζ i = 1 n E · S sup F m K m sup x H m sup u B m W s i , A n ζ n τ n ζ = ρ n ζ 0 .
We infer that
E · S ψ ^ 1 ( 2 ) ( u , x ) 1 n i = 1 n ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i ξ i m 1 E · S H ˜ 2 ( ) ( z ) ,
where
E · S H ˜ 2 ( ) ( z ) = E · S 1 ϕ ( h ) K 2 d ( x i , X s i , A n ) h W s i , A n × W s ( 1 , , 1 , , , m 1 ) , A n j = 1 j i m 1 1 ϕ ( h ) K 2 d ( x j , ν s j , A n ) h P ( d ν 1 , , d ν 1 , d ν , , d ν m 1 ) E · S 1 ϕ ( h ) K 2 d x i , X s i , A n h + K 2 d x i , X u i ( s i ) h K 2 d x i , X u i ( s i ) h W s i , A n 1 W s i , A n > τ n τ n ( ζ 1 ) ϕ ( h ) E · S K 2 d x i , X s i , A n h K 2 d x i , X u i ( s i ) h W s i , A n ζ + K 2 d x i , X u i ( s i ) h W s i , A n ζ τ n ( ζ 1 ) ϕ ( h ) E · S h 1 d x i , X s i , A n d x i , X u i ( s i ) W s i , A n ζ + E · S K 2 d x i , X u i ( s i ) h W s i , A n ζ τ n ( ζ 1 ) ϕ ( h ) × 1 n h + ϕ ( h ) τ n ( ζ 1 ) n h ϕ ( h ) + τ n ( ζ 1 ) .
Hence, we have
E · S ψ ^ 1 ( 2 ) ( u , x ) 1 n i = 1 n ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i ξ i m 1 τ n ( ζ 1 ) τ n ( ζ 1 ) 1 n m i I n m j = 1 m 1 h d K ¯ u j s i j / A n h n = C τ n ( ζ 1 ) f S ( u ) + O log n n h m d + h 2 ( Using   Lemma   A 1 ) C τ n ( ζ 1 ) = C ρ n ( ζ 1 ) n ( ζ 1 ) / ζ C α n P S a . s .
As a result, we obtain that
sup F m K m sup x H m sup u B m ψ ^ 1 ( 2 ) ( u , x ) E · S ψ ^ 1 ( 2 ) ( u , x ) = O P · S ( α n ) .
Second, let us treat
sup F m K m sup x H m sup u B m ψ ^ 1 ( 1 ) ( u , x , φ ) E ψ ^ 1 ( 1 ) ( u , x , φ ) .
Recall the large blocks and small blocks and the notation given in Section 9.1, and define
S s , A n ( u , x ) : = ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i ξ i m 1 H ˜ 1 ( ) ( z ) , S n ( ; ϵ ) = i : s i Γ n ( ; ϵ ) R n S s , A n ( u , x ) = S n ( 1 ) ( ; ϵ ) , , S n ( p ) ( ; ϵ ) .
Then, we have
S n = S n ( 1 ) , , S n ( m ) = i = 1 n S s , A n ( u , x ) = L n S n ; ϵ 0 + ϵ ϵ 0 L 1 , n S n ( ; ϵ ) = : S 2 , n ( ϵ ) + ϵ ϵ 0 L 2 , n S n ( ; ϵ ) = : S 3 , n ( ϵ ) = : S 1 , n + ϵ ϵ 0 S 2 , n ( ϵ ) + ϵ ϵ 0 S 3 , n ( ϵ ) .
In order to achieve our result, we will pass by the following two steps.
Step 1 (Reduction to independence). Recall
S n ( ; ϵ ) = i : s i Γ n ( ; ϵ ) R n S s , A n ( u , x ) .
For each ϵ { 1 , 2 } d , let S ˘ n ( ; ϵ ) : L n be a sequence of independent random variables in R under P · S such that
S ˘ n ( ; ϵ ) = d S n ( ; ϵ ) , under P . S , L n .
Define
S ˘ 1 , n = L n S ˘ n ; ϵ 0 = S ˘ 1 , n ( 1 ) , , S ˘ 1 , n ( m )
and for ϵ ϵ 0 , define
S ˘ 2 , n ( ϵ ) = L 1 , n S ˘ n ( ; ϵ )
and
S ˘ 3 , n ( ϵ ) = L 2 , n S ˘ n ( ; ϵ ) .
We start by confirming the following results:
sup t > 0 P · S S 1 , n > t P · S S ˘ 1 , n > t C A n A 1 , n d β A 2 , n ; A n d ,
sup t > 0 P · S S 2 , n ( ϵ ) > t P · S S ˘ 2 , n ( ϵ ) > t C A n A 1 , n d β A 2 , n ; A n d ,
sup t > 0 P · S S 3 , n ( ϵ ) > t P · S S ˘ 3 , n ( ϵ ) > t C A n A 1 , n d β A 2 , n ; A n d .
Keep in mind that
L n = O A n / A 3 , n d A n / A 1 , n d .
For ϵ { 1 , 2 } d and 1 , 2 L n with 1 2 , let
J 1 ( ϵ ) = 1 i 1 n : s i 1 Γ n 1 ; ϵ , J 2 ( ϵ ) = 1 i 2 n : s i 2 Γ n 2 ; ϵ .
For any s i k = s 1 , i k , , s d , i k , k = 1 , 2 in such a way that i 1 J 1 ( ϵ ) and i 2 J 2 ( ϵ ) , we obtain max 1 u d s u , i 1 s u , i 2 A 2 , n using the definition of Γ ( ; ϵ ) . This gives
s i 1 s i 2 A 2 , n .
For any ϵ { 1 , 2 } d , let S n 1 ; ϵ , , S n L n ; ϵ be an arrangement of S n ( ; ϵ ) : L n . Let P . S ( a ) be the marginal distribution of S n a ; ϵ and let P · S ( a : b ) be the joint distribution of S n k ; ϵ : a k b } . The β -mixing property of X gives that for 1 k L n 1 ,
P · S P · S ( 1 : k ) × P · S k + 1 : L n TV β A 2 , n ; A n d .
The inequality is independent of the arrangement of S n ( ; ϵ ) : L n . Therefore, the Assumption A2 in Lemma A4 is fulfilled for S n ( ; ϵ ) : L n with τ β A 2 , n ; A n d and m A n / A 1 , n d . Combining the boundary condition on R n and Lemma A4, we get (59)–(61).
Remark 10.
Since
ϵ { 1 , 2 } d : ϵ ϵ 0 = 2 d 1 , L 1 , n A n / A 3 , n d A n / A 1 , n d
and
L 2 , n A n / A 3 , n d 1 A n / A 1 , n d 1 L 1 , n ,
Lemma A5 and Equation (52) give for sufficiently large n the summands numbers of S 2 , n and S 3 , n are at most
O A 1 , n d 1 A 2 , n n A n d A n / A 1 , n d = O A 2 , n A 1 , n n
and
O A 1 , n d 1 A 2 , n n A n d A n / A 1 , n d 1 = O A 2 , n A n n ,
respectively.
Step 2. Recall that we aim to treat
sup F m K m sup x H m sup u B m ψ ^ 1 ( 1 ) ( u , x , φ ) E · S ψ ^ 1 ( 1 ) ( u , x , φ ) .
To achieve the intended result, we will cover the region B m = [ 0 , 1 ] d m by
k 1 , , k m = 1 N ( u ) j = 1 m B ( u k j , r ) ,
for some radius r. Hence, for each u = ( u 1 , , u m ) [ 0 , 1 ] d m , there exists l ( u ) = ( l ( u 1 ) , , l ( u m ) ) , where 1 i m , 1 l ( u i ) N ( u ) in such a way that
u i = 1 m B ( u l ( u i ) , r ) and | u i u l ( u i ) | r , for 1 i m ,
then for each u [ 0 , 1 ] d m , the closest center will be u l ( u ) , and the ball with the closest center will be defined by
B ( u , l ( u ) , r ) : = j = 1 m B ( u k j , r ) .
In the same way, H m should be covered by
k 1 , , k m = 1 N ( x ) j = 1 m B ( x k j , r ) ,
for some radius r. Hence, for each x = ( x 1 , , x m ) H m , there exists l ( x ) = ( l ( x 1 ) , , l ( x m ) ) , where 1 i m , 1 l ( x i ) N ( x ) in such a way that
x i = 1 m B ( u l ( x i ) , r ) and d ( x i , x l ( u i ) ) r , for 1 i m ,
then for each x H m , the closest center will be x l ( x ) , and the ball with the closest center will be defined by
B ( x , l ( x ) , r ) : = i = 1 m B ( x l ( x i ) , r ) .
We define:
K ( ω , v ) C 0 j = 1 m = 1 d 1 ( | ω j , | 2 C 1 ) j = 1 m K 2 ( v k ) for ( ω , v ) R 2 .
We can show that for ( u , x ) B j , n and n large enough,
K ¯ u s / A n h n K 2 d ( x i , X s i , A n ) h K ¯ u n s / A n h n K 2 d ( x n , i , X s i , A n ) h α n K u n s / A n , d ( x n , i , X s i , A n ) h n .
Then, for
ψ ^ 1 ( 1 ) ( u , x ) = 1 n i = 1 n ξ i 1 ϕ ( h ) K 2 d ( x i , X s i , A n ) h W s i , A n 1 W s i , A n τ n × ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i m 1 × W s ( 1 , , 1 , , , m 1 ) , A n j = 1 j i m 1 1 ϕ ( h ) K 2 d ( x j , ν s j , A n ) h P ( d ν 1 , , d ν 1 , d ν , , d ν m 1 ) .
Let us define
ψ ¯ 1 ( 1 ) ( u , x ) = 1 n h d ϕ ( h ) i = 1 n K u n s i / A n , d ( x n , i , X s i , A n ) h n W s i , A n 1 W s i , A n τ n × ( n m ) ! ( n 1 ) ! I n 1 m 1 ( i ) = 1 m ξ i 1 ξ i 1 ξ i ξ i m 1 × W s ( 1 , , 1 , , , m 1 ) , A n j = 1 j i m 1 1 ϕ ( h ) K 2 d ( x j , ν s j , A n ) h P ( d ν 1 , , d ν 1 , d ν , , d ν m 1 ) : = 1 n h d ϕ ( h ) i = 1 n S s , A n ( u , x ) .
We mention that
E · S ψ ¯ 1 ( 1 ) ( u , x , φ ) M < ,
for some M large enough. Let N F m K m N ( x ) m N ( u ) denote the covering number related, respectively, to the class of functions F m K m , the balls that cover [ 0 , 1 ] m and the balls that cover H m . Then, we obtain
sup F m K m sup x H m sup u B ψ ^ 1 ( 1 ) ( u , x , φ ) E · S ψ ^ 1 ( 1 ) ( u , x , φ ) N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ψ ^ 1 ( 1 ) ( u , x , φ ) E · S ψ ^ 1 ( 1 ) ( u , x , φ ) N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ψ ^ 1 ( 1 ) u n , x E · S ψ ^ 1 ( 1 ) u n , x + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) α n ψ ¯ 1 ( 1 ) u n , x + E · S ψ ¯ 1 ( 1 ) u n , x N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ψ ^ 1 ( 1 ) u n , x E · S ψ ^ 1 ( 1 ) u n , x + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ψ ¯ 1 ( 1 ) u n , x E · S ψ ¯ 1 ( 1 ) u n , x + 2 M F ( y ) α n N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) L n S n ; ϵ 0 + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ϵ ϵ 0 L 1 , n S n ( ; ϵ ) + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ϵ ϵ 0 L 2 , n S n ( ; ϵ ) + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) L n S n ; ϵ 0 + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ϵ ϵ 0 L 1 , n S n ( ; ϵ ) + N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ϵ ϵ 0 L 2 , n S n ( ; ϵ ) + 2 M F ( y ) α n .
Even more, for each ϵ { 1 , 2 } d , let S ˘ n ( ; ϵ ) : L n denote a sequence of independent random vectors in R m under P · S such that
S ˘ n ( ; ϵ ) = d S n ( ; ϵ ) , under P . S , L n .
Show that
P · S N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r )
max 1 i 1 < < i m m sup B ( u i ( u ) , r ) ψ ^ 1 ( 1 ) ( u , x ) E · S ψ ^ 1 ( 1 ) ( u , x ) > 2 m d + 1 M a n N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) P · S sup ( u , x ) B k ψ ^ 1 ( u , x ) E · S ψ ^ 1 ( u , x ) > 2 m d + 1 M a n
ϵ { 1 , 2 } d Q ^ n ( ϵ ) + ϵ { 1 , 2 } d Q ¯ n ( ϵ ) + 2 m d + 1 N F m K m N ( x ) m N ( u ) m A n A 1 , n d β A 2 , n ; A n d ,
where
Q ^ n ϵ 0 = N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) P · S L n S ˘ n ( ; ϵ 0 ) > M a n n m h m d ϕ ( h ) , Q ¯ n ϵ 0 = N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) P · S L n S ˘ n ( ; ϵ 0 ) > M a n n m h m d ϕ ( h ) ,
and for ϵ ϵ 0
Q ^ n ϵ = N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) P · S L n S ˘ n ( ; ϵ ) > M a n n m h m d ϕ ( h ) , Q ¯ n ϵ = N F m K m N ( x ) m N ( u ) m max 1 i 1 < < i m m sup B ( x i ( x ) , r ) max 1 i 1 < < i m m sup B ( u i ( u ) , r ) P · S L n S ˘ n ( ; ϵ ) > M a n n m h m d ϕ ( h ) .
Due to the similarity between the two cases, ϵ ϵ 0 and ϵ = ϵ 0 , we are going to treat Q ^ n only for ϵ ϵ 0 . An application of Lemma A5, with the fact that S ˘ n ( ; ϵ ) are zero-mean random variables, shows us that:
P · S L n S ˘ n ( ; ϵ ) > M a n n h m d ϕ ( h ) 2 P · S L n S ˘ n ( ; ϵ ) > M a n n h m d ϕ ( h )
and
S ˘ n ( ; ϵ ) C A 1 , n d 1 A 2 , n ( log n ) τ n , P S a . s . ( from Lemma A 5 ) E · S S ˘ n ( ; ϵ ) 2 C h m d ϕ ( h ) A 1 , n d 1 A 2 , n ( log n ) , P S a . s . ( By Lemma A 6 )
Using Bernstein’s inequality represented in Lemma A7, we have
P · S L n S ˘ n ( ; ϵ ) > M a n n h m d ϕ ( h ) exp 1 2 × M n h m d ϕ ( h ) log n A n A 1 , n d A 1 , n d 1 A 2 , n h m d ϕ ( h ) ( log n ) + 1 3 × M 1 / 2 n 1 / 2 h m d / 2 ϕ ( h ) 1 / 2 ( log n ) 1 / 2 A 1 , n d 1 A 2 , n τ n .
Observe that
n h m d log n A n A 1 , n d A 1 , n d 1 A 2 , n h m d ( log n ) = n A n d A 1 , n A 2 , n A 1 , n A 2 , n n η ,
n h m d ϕ ( h ) log n n 1 / 2 h m d / 2 ϕ ( h ) 1 / 2 ( log n ) 1 / 2 A 1 , n d 1 A 2 , n τ n = n 1 / 2 h m d / 2 ϕ ( h ) 1 / 2 ( log n ) 1 / 2 A 1 , n m d A 2 , n A 1 , n ρ n n 1 / ζ C 0 n η / 2 .
Taking M > 0 to be sufficiently large, and for N C h m d ϕ ( h ) α n m , this shows the desired result.
We must move on to the nonlinear part of the Hoeffding decomposition. Accordingly, the goal is to prove that:
P · S sup F m K m sup x H m sup u B m ψ ^ 2 ( u , x ) > λ 0 as n .
In the following, we will give a lemma that can be viewed as a technical result in the proof of our proposition, and it helps us to achieve our goal in Expression (69). The proof of this lemma used the Blocking technique defined before but for the U-statistic, making the block treatment more complicated.
Lemma 1.
Let F m K m be a uniformly bounded class of measurable canonical functions, m 2 . Assume that there are finite constants a and b in such a way that the F m K m covering number fulfills:
N ( ϵ , F m K m , · L 2 ( Q ) ) a ϵ b ,
for all ϵ > 0 and all probability measures Q. If the mixing coefficients β of the local stationary sequence { Z i = ( X s i , A n , W s i , A n ) i N satisfy Condition (E2) in Assumption 6, then, for some r > 1 , we have:
sup F m K m sup x H m sup u B m P h m d / 2 ϕ m / 2 ( h ) n m + 1 / 2 i I n m ξ i 1 ξ i m H ( Z i 1 , , Z i m ) 0 .
Remark 11.
As mentioned before, W s i , A n will be equal to 1 or ϵ s i , A n = σ s i A n , X s i , A n ϵ s i . In the proof of the previous lemma, W s i , A n will be equal ε i , n = σ i n , X i , n ϵ i , and we will use the notation W s i , A n ( u ) to indicate σ u , x ϵ i .

9.2.1. Proof of Lemma 1

This lemma’s proof is based on the blocking technique employed by [82], and it is called Bernstein’s method, referred to [150], in which we are enabled to apply the symmetrization and the many other techniques available for the i.i.d random variables. We will extend this technique to the spacial processes in the U-statistics setting, in the same line as in [93]. In addition to the notation in Section 9.1, define
L n : = L 1 , n L 2 , n ,
Δ 1 = { 2 : min 1 i d | 1 i 2 i | 1 }
Δ 2 = { 2 : min 1 i d | 1 i 2 i | 2 }
With the notation introduced above, it is easy to show that, for m = 2 ,
1 h 2 d ϕ 2 ( h ) i I n 2 j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n = 1 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n + 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n + 2 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 2 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n + 2 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 1 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n + 1 h 2 d ϕ 2 ( h ) 1 2 L 1 , n L 2 , n ϵ ϵ 0 i 1 : s i 1 Γ n ( 1 ; ϵ ) R n i 2 : s i 2 Γ n ( 2 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n + 1 h 2 d ϕ 2 ( h ) 1 L 1 , n L 2 , n ϵ ϵ 0 i 1 < i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n : = I + II + III + IV + V + VI .
(I):
The Same Type of Blocks but Not the Same Block
Let { η i } i N be a sequence of independent blocks. An application of Lemma A4 shows that:
P sup F m K m sup x H m sup u B m n 3 / 2 1 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n > δ P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n > δ ) + P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n W s i , A n ( u ) > δ ) + P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n ( u ) > δ ) P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) > δ + C A n A 1 , n d β A 2 , n ; A n d + o P ( 1 ) + o P ( 1 ) ,
Because:
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x j , X u j ( s i j ) h W s i , A n = 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n E . S j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x j , X u j ( s i j ) h W s i , A n = 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n E . S j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x j , X u j ( s i j ) h j = 1 m ϵ s i j , A n = 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n E . S j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x j , X u j ( s i j ) h j = 1 m σ s i j A n , X s i j , A n ϵ s i j = 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j E . S j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x j , X u j ( s i j ) h j = 1 m σ s i j A n , X s i j , A n j = 1 m σ u j , x j + j = 1 m σ u j , x j 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j E . S C j = 1 m K 2 d ( x j , X s i j , A n ) h n K 2 d x j , X s i j A n ( s i j ) h p j = 1 m σ u j , x j + o P ( 1 ) ( Using a telescoping argument , and the boundedness of K 2 for p = min ( ρ , 1 ) and C < ) 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j E . S ϕ m 1 ( h ) C A n d U s i j , A n s i j A n p j = 1 m σ u j , x j + o P ( 1 ) o P ( 1 ) ,
and
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n W s i , A n ( u ) = 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j E . S j = 1 2 K 2 d x i , X u j ( s i j ) h j = 1 m σ s i j A n , X s i j , A n j = 1 m σ u j , x j 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j × ( o P ( 1 ) ) 0 h k = 1 m K 2 y k h d F i k / n ( y k , x k ) 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 m E . S ϵ s i j × ( o P ( 1 ) ) ( ϕ 2 ( h ) ) o P ( 1 ) .
Under the assumptions of the lemma, we have β ( a ; b ) β 1 ( a ) g 1 ( b ) with β 1 ( a ) 0 as a and n , so the term to consider is the first summand. For the second part of the inequality, we will use the work of [27] in the non-fixed kernels settings, precisely, we will define f i 1 , , i m = k = 1 m ξ i k × H and F i 1 , , i m , respectively, as a collection of kernels and the class of functions related to this kernel; then, we will use ([29], Theorem 3.1.1 and Remarks 3.5.4 part 2) for decoupling and randomization. As we mentioned above, we will suppose that m = 2 . Then, we can see that
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W i , φ , n ( u ) F 2 K 2 = E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n f i 1 , i 2 ( u , η ) F i 1 , i 2 c 2 E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n ϵ p ϵ q i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n f i 1 , i 2 ( u , η ) F i 1 , i 2 c 2 E . S 0 D n h ( U 1 ) N t , F i 1 , i 2 , d ˜ n h , 2 ( 1 ) d t , ( By Lemma A 9 and Proposition A 1 . )
where D n h ( U 1 ) is the diameter of F i 1 , i 2 according to the distance d ˜ n h , 2 ( 1 ) , respectively, which is defined as
D n h ( U 1 ) : = E ϵ 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n ϵ p ϵ q i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n f i 1 , i 2 ( u , η ) F i 1 , i 2 = E ϵ 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n ϵ p ϵ q i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 ,
and:
d ˜ n h , 2 ( 1 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) : = E ϵ 1 n 3 / 2 h d ϕ 2 ( h ) 1 2 L n ϵ p ϵ q i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n ξ 1 i 1 ξ 1 i 2 k = 1 2 K 1 , 2 d ( x k , η i k ) h W s i , A n ( u ) p q L n ϵ p ϵ q i H p ( U ) j H q ( U ) ξ 2 i 1 ξ 2 i 2 k = 1 2 K 2 , 2 d ( x k , η i k ) h W s i , A n ( u ) .
Let consider another semi-norm d ˜ n h , 2 ( 2 ) :
d ˜ n h , 2 ( 2 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) = 1 n h d ϕ 2 ( h ) 1 2 L n ξ 1 i 1 ξ 1 i 2 k = 1 2 K 1 , 2 d ( x k , η i k ) h W s i , A n ( u ) p q υ n ϵ p ϵ q i H p ( U ) j H q ( U ) ξ 2 i 1 ξ 2 i 2 k = 1 2 K 2 , 2 d ( x k , η i k ) h W s i , A n ( u ) 2 1 / 2 .
One can see that
d ˜ n h , 2 ( 1 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) A 1 , n n 1 / 2 h d ϕ ( h ) d ˜ n h , 2 ( 2 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) .
We readily infer that
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W i , φ , n ( u ) F 2 K 2 c 2 E . S 0 D n h ( U 1 ) N t A 1 , n d n 1 / 2 , F i , j , d ˜ n h , 2 ( 2 ) d t c 2 A 1 , n d n 1 / 2 P D n h ( U 1 ) A 1 , n d n 1 / 2 λ n + c m A 1 , n d n 1 / 2 0 λ n log t 1 d t ,
where λ n 0 . We have
0 λ n log t 1 d t λ n log λ n 1 0 ,
where λ n must be chosen in such a way that the following relation will be achieved
A 1 , n d λ n n 1 / 2 log λ n 1 0 .
By utilizing the triangle inequality in conjunction with Hoeffding’s trick, we are easily able to derive that
A 1 , n d n 1 / 2 P D n h ( U 1 ) λ n A 1 , n d n 1 / 2 λ n 2 A 1 , n d n 5 / 2 h ϕ 1 ( h ) E . S 1 2 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 2 ; ϵ 0 ) R n ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 F 2 K 2 c 2 [ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 h ϕ 1 ( h ) E . S 1 L n i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 F 2 K 2 ,
where η i i N are independent copies of ( η i ) i N . By imposing
λ n 2 A 1 , n d r n 1 / 2 0 ,
we readily infer that
[ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 h ϕ 1 ( h ) E . S 1 L n i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ i 1 ξ i 2 k = 1 2 K 2 d ( x k , η i k ) h W s i , A n ( u ) 2 F 2 K 2 O λ n 2 A 1 , n d r n 1 / 2 .
A symmetrization of the last inequality in (78) succeeded by an application of the Proposition A1 in the Appendix A gives
[ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 h ϕ 1 ( h ) E . S 1 L n i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ϵ p ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 F 2 K 2 c 2 E . S 0 D n h ( U 2 ) log N ( u , F i , j , d ˜ n h , 2 ) 1 / 2 ,
where
D n h ( U 2 ) = E ϵ | [ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 ϕ 1 ( h ) 1 L n i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 F 2 K 2 .
and for ξ 1 . K 2 , 1 W , ξ 2 . K 2 , 2 W F i j :
d ˜ n h , 2 ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) : = E ϵ [ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 ϕ 1 ( h ) 1 L n ϵ p i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ 1 i 1 ξ 1 i 2 K 2 , 1 d ( x 1 , η i 1 ) h K 2 , 1 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 i 1 , i 2 H p ( U ) ξ 2 i ξ 2 j K 2 , 2 d ( x 1 , η i 1 ) h K 2 , 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 .
By the fact that:
E ϵ [ [ L n ] ] λ n 2 A 1 , n d n 5 / 2 ϕ 1 ( h ) 1 L n ϵ p i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 A 1 , n 3 d / 2 λ n 2 n 1 [ [ L n ] ] 1 A 1 , n 2 d ϕ 2 ( h n ) 1 L n i 1 , i 2 : s i 1 , s i 2 Γ n ( 1 ; ϵ 0 ) R n ξ i 1 ξ i 2 K 2 d ( x i , η i 1 ) h K 2 d ( x 2 , η j ) h W s i , A n ( u ) 4 1 / 2 ,
so:
A 1 , n d 3 / 2 λ n 2 n 1 0 ,
we have the convergence of (80) to zero. Recall that
L n = O A n / A 3 , n d A n / A 1 , n d .
(II):
The Same Block
P sup F m K m sup x H m sup u B m 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n > δ P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , X s i j , A n ) h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n > δ ) + P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n W s i , A n ( u ) > δ ) + P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d x i , X u j ( s i j ) h W s i , A n ( u ) > δ ) P ( sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) > δ + C A n A 1 , n d β A 2 , n ; A n d + o P ( 1 ) + o P ( 1 ) ,
In the same manner as I , we can show that the first and the second term in the previous inequality is of order o P ( 1 ) . So, as the preceding proof, it suffices to prove that
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 0 .
Notice that we treat uniformly bounded classes functions in which we obtain uniformly in B m × F 2 K 2
E . S i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) = O ( a n ) .
This implies that we have to prove that, for u B m
E E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) E . S j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 0 .
As for empirical processes, to prove (82), it is enough to symmetrize and show that
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 ϵ p j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 0 .
Similarly to how in (75), we have
E . S 1 n 3 / 2 h d + 1 ϕ ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 ϵ p j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 E 0 D n h ( U 3 ) log N u , F i 1 , i 2 , d ˜ n h , 2 ( 3 ) 1 / 2 d u ,
where
D n h ( U 3 ) = E ϵ 1 n 3 / 2 h d ϕ ( h ) 1 L n ϵ p i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 ,
and the semi-metric d ˜ n h , 2 ( 3 ) is defined by
d ˜ n h , 2 ( 3 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) = E ϵ 1 n 3 / 2 h d ϕ ( h ) 1 L n ϵ p i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 ξ 1 i ξ 1 j K 2 , 1 d ( x 1 , η i 1 ) h K 2 , 1 d ( x 2 , η i 2 ) h W s i , A n ( u ) ξ 2 i ξ 2 j K 2 , 2 d ( x 1 , η i 1 ) h K 2 , 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) .
Since we are considering uniformly bounded classes of functions, we obtain
E ϵ n 3 / 2 h ϕ 1 ( h n ) 1 L n ϵ p i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) A 1 , n 3 d / 2 n 1 h ϕ 1 ( h n ) 1 [ [ L n ] ] A 1 , n 2 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n i 2 : s i 2 Γ n ( 1 ; ϵ 0 ) R n i 1 i 2 ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 1 / 2 O A 1 , n 3 d / 2 n 1 ϕ 1 ( h n ) .
Since A 1 , n 3 d / 2 n 1 ϕ 1 ( h ) 0 , D n h ( U 3 ) 0 , we obtain II 0 as n .
(III):
Different Types of Blocks
Avoiding the repetition, we can directly see that:
P sup F m K m sup x H m sup u B m 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 2 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n > δ P sup F m K m sup x H m sup u B m 1 n 3 / 2 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 2 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) > δ + C A n A 1 , n d β A 2 , n ; A n d + o P ( 1 ) + o P ( 1 ) .
For p = 1 and p = ν n :
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 2 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 = E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 .
For 2 p υ n 1 , we obtain:
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 2 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 = E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 4 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 ,
therefore, it suffices to show that:
E . S 1 n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 .
By similar arguments as in [82], the usual symmetrization is applied and:
E . S L n n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 2 E . S L n n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 = 2 E . S L n n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 1 D n h ( U 4 ) γ n + 2 E . S L n n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) F 2 K 2 1 D n h ( U 4 ) > γ n = 2 III 1 + 2 III 2 ,
where
D n h ( U 4 ) = L n n 3 / 2 h 2 d ϕ 2 ( h ) 2 : min 1 i d 2 i = 3 ϵ ϵ 0 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j h W s i , A n ( u ) 2 1 / 2 F 2 K 2 .
In a similar way as in (75), we infer that
III 1 c 2 0 γ n log N t , F i 1 , i 2 , d ˜ n h , 2 ( 4 ) 1 / 2 d t ,
where
d ˜ n h , 2 ( 4 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) : = E ϵ L n n 3 / 2 h ϕ 1 ( h n ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q ξ 1 i 1 ξ 1 i 2 K 2 , 1 d ( x 1 , η i 1 ) h K 2 , 1 d ( x 2 , η i 2 ) h W s i , A n ( u ) ξ 2 i 1 ξ 2 i 2 K 2 , 2 d ( x 1 , η i 1 ) h K 2 , 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) .
Since we have
E ϵ L n n 3 / 2 h 2 d ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) A 1 , n d / 2 A 2 , n d h d + 1 ϕ ( h ) 1 A 1 , n d A 2 , n d L n h d 1 ϕ 4 ( h n ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ξ i 1 ξ i 2 K 2 d ( x 1 , η i 1 ) h K 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 1 / 2 ,
and considering the semi-metric
d ˜ n h , 2 ( 5 ) ξ 1 . K 2 , 1 W ( u ) , ξ 2 . K 2 , 2 W ( u ) : = 1 A 1 , n d A 2 , n d L n h d 1 ϕ 4 ( h n ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ξ 1 i 1 ξ 1 i 2 K 2 , 1 d ( x 1 , η i 1 ) h K 2 , 1 d ( x 2 , η i 2 ) h W s i , A n ( u ) ξ 2 i 1 ξ 2 i 2 K 2 , 2 d ( x 1 , η i 1 ) h K 2 , 2 d ( x 2 , η i 2 ) h W s i , A n ( u ) 2 1 / 2 .
We demonstrate that the statement in (88) is bounded as follows
L n 1 / 2 A 2 , n d n 1 / 2 h 2 ϕ ( h ) 0 L n 1 / 2 A 2 , n d n 1 / 2 h 2 d γ n log N t , F i 1 , i 2 , d ˜ n h , 2 ( 5 ) 1 / 2 d t ,
by choosing γ n = n α for some α > ( 17 r 26 ) / 60 r , we obtain the convergence of the preceding quantity to zero. In order to bound the second term on the right-hand side of (86), we can mention that
III 2 = E L n n 3 / 2 h ϕ 2 ( h ) i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n ϵ q j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j ) h W s i , A n ( u ) F 2 K 2 1 D n h ( U 4 ) > γ n A 1 , n 1 A 2 , n n 1 / 2 h d ϕ 1 ( h ) P L n 2 n 3 h 2 ϕ 2 ( h n ) 2 : min 1 i d 2 i = 3 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j ) h W s i , A n ( u ) 2 F 2 K 2 γ n 2 .
We are going to use the square root method on the last expression conditionally on Γ n ( 1 ; ϵ 0 ) R n . We denote by E ϵ ϵ 0 the expectation with respect to σ η i 2 , ϵ ϵ 0 and we will suppose that any class of functions F m is unbounded and its envelope function satisfies for some p > 2 :
θ p : = sup x S H m E F p ( Y ) | X = x < ,
for 2 r / ( r 1 ) < s < (in the notation in of [151] [Lemma 5.2]).
M n = L n 1 / 2 E ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n j = 1 2 K ¯ u j s i j / A n h n j = 1 2 K 2 d ( x j , η i j ) h W s i , A n ( u ) 2 ,
where
x = γ n 2 A 1 , n 5 d / 2 n 1 / 2 h m d / 2 ϕ m / 2 ( h ) , ρ = λ = 2 4 γ n A 1 , n 5 d / 4 n 1 / 4 h m d / 4 ϕ m / 4 ( h ) ,
and
m = exp γ n 2 n h 2 d ϕ 2 ( h n ) A 2 , n 2 d .
However, since we need t > 8 M n , and m , by similar arguments as in ([82], p. 69), we reach the convergence of (88) and (89) to zero.
(IV):
Blocks of Different Types
The target here is to prove that:
P sup F m K m sup x H m sup u B m 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 1 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n > δ 0 .
We have
n 3 / 2 1 h 2 d ϕ 2 ( h ) 1 L n i 1 : s i 1 Γ n ( 1 ; ϵ 0 ) R n Δ 1 2 L 1 , n L 2 , n ϵ ϵ 0 i 2 : s i 2 Γ n ( 2 ; ϵ ) R n × j = 1 2 K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n W s i , A n F 2 K 2 c 2 L n A 1 , n d A 2 , n d n 3 / 2 h d ϕ 1 ( h ) 0 .
Hence, the proof of the lemma is complete.
The final step in the proof of Proposition 1 lies in the use of Lemma 1 to prove that the nonlinear term converges to zero. □

9.2.2. Proof of Theorem 1

We have
r ^ n ( m ) ( φ , x , u ; h n ) r ( m ) ( φ , x , u ) = 1 r ˜ 1 ( φ , x , u ) g ^ 1 ( u , x ) + g ^ 2 ( u , x ) r ( m ) ( φ , x , u ) r ˜ 1 ( φ , x , u ) ,
where
r ˜ 1 ( φ , u , x ) = ( n m ) ! n ! h m d ϕ m ( h ) i I n m j = 1 m K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n , g ^ 1 ( u , x ) = ( n m ) ! n ! h m d ϕ m ( h ) i I n m j = 1 m K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n j = 1 m ϵ s i j , A n , g ^ 2 ( u , x ) = ( n m ) ! n ! h m d ϕ m ( h ) i I n m j = 1 m K ¯ u j s i j / A n h n K 2 d ( x j , X s i j , A n ) h n × r ( m ) φ , X s i 1 , A n , , X s i m , A n , s i 1 A n , , s i m A n .
The proof of this theorem is involved and divided into the following four steps, where in each one, we aim to show that
Step 1.
sup F m K m sup x H m sup u B m | g ^ 1 ( u , x ) | = O P log n / n h m d ϕ m ( h ) .
Step 2.
sup F m K m sup x H m sup u B m | g ^ 2 ( u , x ) r ( m ) ( φ , u , x ) r ˜ 1 ( φ , u , x ; h n ) E . S ( g ^ 2 ( u , x ) r ( m ) ( φ , u , x ) r ˜ 1 ( φ , u , x ; h n ) ) | = O P log n / n h m d ϕ m ( h ) .
Step 3.
Let κ 2 = R x 2 K ( x ) d x .
sup F m K m sup x H m sup u B m E . S g ^ 2 ( u , x ) r ( m ) ( φ , u , x ) r ˜ 1 ( φ , u , x ; h n ) = O 1 A n d p ϕ ( h ) + o h 2 , P S a . s .
Step 4.
sup F m K m sup x H m sup u B m r ˜ 1 ( φ , u , x ) E . S r ˜ 1 ( φ , u , x ) = o P . S ( 1 ) .
It is clear that Step 1 follows directly from Proposition 1 for W s i , A n = j = 1 m ϵ s i j , A n . The second one (Step 2) holds also if we replace W s i , A n with g ^ 2 ( u , x ) r ( m ) ( φ , u , x ) r ˜ 1 ( φ , u , x ; h n ) then applying Proposition 1.
We now pass to Step 4. Observe that, for $W_{s_i,A_n}\equiv1$, the aforementioned proposition shows that
\[
\sup_{\mathcal F^m\mathcal K^m}\sup_{\mathbf x\in\mathcal H^m}\sup_{\mathbf u\in B^m}\big|\widetilde r_1(\varphi,\mathbf u,\mathbf x)-\mathbb E_{\cdot S}\,\widetilde r_1(\varphi,\mathbf u,\mathbf x)\big|=o_{\mathbb P_{\cdot S}}(1).
\]
Step 3 is treated in what follows.
Let $K_0:\mathbb R\to\mathbb R$ be a Lipschitz continuous function, compactly supported on $[-qC_1,qC_1]$ for some $q>1$, and such that $K_0(x)=1$ for $x\in[-C_1,C_1]$. We show that
\[
\mathbb E_{\cdot S}\big(\widehat g_2(\mathbf u,\mathbf x)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\widetilde r_1(\varphi,\mathbf u,\mathbf x;h_n)\big)=\sum_{i=1}^{4}Q_i(\mathbf u,\mathbf x),
\]
where the $Q_i$ are defined as follows:
\[
Q_i(\mathbf u,\mathbf x)=\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\,q_i(\mathbf u,\mathbf x),
\]
such that
\[
\begin{aligned}
q_1(\mathbf u,\mathbf x)&=\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)\Bigg\{\prod_{j=1}^{m}K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)-\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg\}\\
&\hspace{4.5cm}\times\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i,A_n}\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big\}\Bigg],\\
q_2(\mathbf u,\mathbf x)&=\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i,A_n}\Big)-r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)\Big\}\Bigg],\\
q_3(\mathbf u,\mathbf x)&=\mathbb E_{\cdot S}\Bigg[\Bigg\{\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)-\prod_{j=1}^{m}K_0\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg\}\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\\
&\hspace{4.5cm}\times\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big\}\Bigg],\\
q_4(\mathbf u,\mathbf x)&=\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big\}\Bigg].
\end{aligned}
\]
Observe that
\[
|Q_1(\mathbf u,\mathbf x)|\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\,\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)\Bigg|\prod_{j=1}^{m}K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)-\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg|\,\Big|r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i,A_n}\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big|\Bigg];
\]
using the properties of $r^{(m)}(\mathbf u,\mathbf x)$, we can show that
\[
\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)\Big|r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i,A_n}\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big|\le Ch^{m},
\]
so that
\[
\begin{aligned}
|Q_1(\mathbf u,\mathbf x)|&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\,\mathbb E_{\cdot S}\Bigg[Ch^{m}\,C\Bigg|\prod_{j=1}^{m}K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)-K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg|^{p}\Bigg]\\
&\qquad\text{(using the telescoping argument and the boundedness of }K_2\text{, for }p=\min(\rho,1)\text{ and }C<\infty)\\
&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\,\mathbb E_{\cdot S}\Bigg[Ch^{m}\prod_{j=1}^{m}C\Big(\frac{A_n^{-d}}{h}U_{s_{i_j},A_n}\Big(\frac{s_{i_j}}{A_n}\Big)\Big)^{p}\Bigg]\\
&\le\frac{C}{A_n^{pd}\,\phi^{m}(h)\,h^{pm}}\quad\text{uniformly in }\mathbf u.
\end{aligned}
\]
In a similar way, using
\[
\mathbb E\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg]\le C\phi^{m-1}(h),
\]
and since $r^{(m)}(\cdot)$ is Lipschitz, $d\big(X_{s_{i_j},A_n},X_{s_{i_j}/A_n}(s_{i_j})\big)\le CA_n^{-d}U_{s_{i_j},A_n}\big(s_{i_j}/A_n\big)$, and the variable $U_{s_{i_j},A_n}\big(s_{i_j}/A_n\big)$ has a finite $p$-th moment, we see that
\[
\begin{aligned}
|Q_2(\mathbf u,\mathbf x)|&=\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\Bigg|\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_0\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\\
&\hspace{4.5cm}\times\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i,A_n}\Big)-r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)\Big\}\Bigg]\Bigg|\\
&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\,\mathbb E_{\cdot S}\Bigg[\phi^{m-1}(h)\,\Big(CA_n^{-d}U_{s_{i_j},A_n}\Big(\frac{s_{i_j}}{A_n}\Big)\Big)^{p}\Bigg]\le\frac{C}{A_n^{pd}\,\phi(h)},
\end{aligned}
\]
and
\[
\sup_{\mathcal F^m\mathcal K^m}\sup_{\mathbf x\in\mathcal H^m}\sup_{\mathbf u\in I_h^m}|Q_3(\mathbf u,\mathbf x)|\lesssim\frac{1}{A_n^{pd}\,\phi^{m}(h)\,h^{pm}}.
\]
For the last term, we have
\[
Q_4(\mathbf u,\mathbf x)=\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Big\{r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big\}\Bigg].
\]
Using Lemma A1 and inequality (17) and under Assumption 1, it follows that
\[
\begin{aligned}
|Q_4(\mathbf u,\mathbf x)|&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Big|r^{(m)}\Big(\varphi,\frac{s_i}{A_n},X_{s_i/A_n}(s_i)\Big)-r^{(m)}(\varphi,\mathbf u,\mathbf x)\Big|\Bigg]\\
&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Big\{d_{\mathcal H^m}\big(X_{s_i/A_n}(s_i),\mathbf x\big)+\Big|\mathbf u-\frac{s_i}{A_n}\Big|\Big\}^{\alpha}\Bigg]\\
&\le\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)\int_0^1\!\!\cdots\!\!\int_0^1\frac{1}{h^{m}}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-v_j}{h}\Big)dv_j\,\mathbb E_{\cdot S}\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d\big(x_j,X_{s_{i_j}/A_n}(s_{i_j})\big)}{h}\Big)\Bigg]\,h^{\alpha}\\
&\qquad+\frac{(n-m)!}{n!\,h^{md}\phi^{m}(h)}\sum_{\mathbf i\in I_n^m}\int_0^1\!\!\cdots\!\!\int_0^1\frac{1}{h^{md}}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-v_j}{h}\Big)dv_j\,\mathbb E_{\cdot S}\big[\phi^{m-1}(h)\big]\,h^{\alpha}\\
&=O_{\mathbb P_S}\big(h^{2\alpha}\big).
\end{aligned}
\]
Combining the results obtained for the $Q_i$, $1\le i\le4$, Step 3 yields the rate of convergence of the estimator. □

9.2.3. Proof of Theorem 2

Recall that
\[
\widehat r_n^{(m)}(\varphi,\mathbf x,\mathbf u;h_n)=\frac{\displaystyle\sum_{\mathbf i\in I_n^m}\varphi\big(Y_{s_{i_1},A_n},\dots,Y_{s_{i_m},A_n}\big)\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)}{\displaystyle\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-s_{i_j}/A_n}{h_n}\Big)K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)}.
\]
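To fix ideas, the display above can be computed directly. The following is a minimal numerical sketch, assuming $m=2$, $d=1$, sites on $[0,A_n]$, and a generic semi-metric `dist` on the functional space; the kernel choices and all helper names (`K_bar`, `K2`, `r_hat_m2`) are illustrative and not taken from the paper or from any package.

```python
import numpy as np

def K_bar(t):
    # spatial kernel (Epanechnikov, an admissible choice)
    return 0.75 * np.maximum(1.0 - t ** 2, 0.0)

def K2(t):
    # functional kernel supported on [0, 1]
    return np.maximum(1.0 - t, 0.0) * (t >= 0.0)

def r_hat_m2(phi, x, u, s, X, Y, A_n, h, dist):
    """Conditional U-statistic estimator for m = 2.
    phi  : symmetric function of two responses
    x, u : pairs (x1, x2) and (u1, u2) of evaluation points
    s, X, Y : sampling sites, functional covariates, responses."""
    n = len(s)
    num = den = 0.0
    for i1 in range(n):
        for i2 in range(n):
            if i1 == i2:
                continue
            w = (K_bar((u[0] - s[i1] / A_n) / h) * K2(dist(x[0], X[i1]) / h)
                 * K_bar((u[1] - s[i2] / A_n) / h) * K2(dist(x[1], X[i2]) / h))
            num += w * phi(Y[i1], Y[i2])
            den += w
    return num / den if den > 0 else np.nan
```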
For $\mathbf x\in\mathcal H^m$, $\mathbf y\in\mathcal Y^m$, define
\[
\begin{aligned}
G_{\varphi,\mathbf i}(\mathbf x,\mathbf y)&:=\frac{\prod_{j=1}^{m}K_2\big(d(x_j,X_{s_{i_j},A_n})/h_n\big)\,\varphi\big(Y_{s_{i_1},A_n},\dots,Y_{s_{i_m},A_n}\big)}{\mathbb E\Big[\prod_{j=1}^{m}K_2\big(d(x_j,X_{s_{i_j},A_n})/h_n\big)\Big]};\\
\mathcal G&:=\big\{G_{\varphi,\mathbf i}(\cdot,\cdot)\ :\ \varphi\in\mathcal F_m,\ \mathbf i=(i_1,\dots,i_m)\big\};\\
\mathcal G^{(k)}&:=\big\{\pi_{k,m}G_{\varphi,\mathbf i}(\cdot,\cdot)\ :\ \varphi\in\mathcal F_m\big\};\\
U_n(\varphi)=U_n^{(m)}(G_{\varphi,\mathbf i})&:=\frac{(n-m)!}{n!}\sum_{\mathbf i\in I_n^m}\prod_{j=1}^{m}\xi_{i_j}\,G_{\varphi,\mathbf i}(\mathbf X_{\mathbf i},\mathbf Y_{\mathbf i});
\end{aligned}
\]
and the U-empirical process is defined to be
\[
\mu_n(\varphi):=\sqrt{nh^{m}\phi(h)}\,\big(U_n(\varphi)-\mathbb E(U_n(\varphi))\big).
\]
Then
\[
\widetilde r_n^{(m)}(\varphi,\mathbf x,\mathbf u;h_n)=\frac{U_n(\varphi)}{U_n(1)}.
\]
In order to establish the weak convergence of our estimator, it must first be established for $\mu_n(\varphi)$. As mentioned before, we deal with unbounded classes of functions; this is why we truncate the function $G_{\varphi,\mathbf i}(\mathbf x,\mathbf y)$. Indeed, for $\lambda_n=n^{1/p}$ with $p>0$, we write:
\[
G_{\varphi,\mathbf i}(\mathbf x,\mathbf y)=G_{\varphi,\mathbf i}(\mathbf x,\mathbf y)\mathbf 1\{F(\mathbf y)\le\lambda_n\}+G_{\varphi,\mathbf i}(\mathbf x,\mathbf y)\mathbf 1\{F(\mathbf y)>\lambda_n\}:=G^{(T)}_{\varphi,\mathbf i}(\mathbf x,\mathbf y)+G^{(R)}_{\varphi,\mathbf i}(\mathbf x,\mathbf y).
\]
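The following small sketch shows this truncation step, under the assumption that the values of $G_{\varphi,\mathbf i}$ and of the envelope $F$ are available as arrays (the names `G_values` and `F_values` are hypothetical):

```python
import numpy as np

def split_truncated(G_values, F_values, n, p):
    # lambda_n = n^{1/p}; the split reproduces G = G^{(T)} + G^{(R)}
    lam = n ** (1.0 / p)
    mask = F_values <= lam
    G_T = np.where(mask, G_values, 0.0)    # truncated part G^{(T)}
    G_R = np.where(~mask, G_values, 0.0)   # remainder part G^{(R)}
    return G_T, G_R                        # G_T + G_R == G_values
```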
We can write the U-statistic as follows:
\[
\begin{aligned}
\mu_n(\varphi)&=\sqrt{nh^{m}\phi(h)}\Big(U_n^{(m)}\big(G^{(T)}_{\varphi,\mathbf i}\big)-\mathbb E\,U_n^{(m)}\big(G^{(T)}_{\varphi,\mathbf i}\big)\Big)+\sqrt{nh^{m}\phi(h)}\Big(U_n^{(m)}\big(G^{(R)}_{\varphi,\mathbf i}\big)-\mathbb E\,U_n^{(m)}\big(G^{(R)}_{\varphi,\mathbf i}\big)\Big)\\
&:=\sqrt{nh^{m}\phi(h)}\Big(U_n^{(T)}(\varphi,\mathbf i)-\mathbb E\,U_n^{(T)}(\varphi)\Big)+\sqrt{nh^{m}\phi(h)}\Big(U_n^{(R)}(\varphi)-\mathbb E\,U_n^{(R)}(\varphi)\Big)\\
&:=\mu_n^{(T)}(\varphi)+\mu_n^{(R)}(\varphi).
\end{aligned}
\]
The first term is the truncated part and the second is the remaining one. We have to prove that:
  • μ n ( T ) ( φ ) converges to a Gaussian process.
  • The remainder part does not matter much, in the sense that
    \[
    \sqrt{nh^{m}\phi(h)}\,\Big\|U_n^{(R)}(\varphi)-\mathbb E\,U_n^{(R)}(\varphi)\Big\|_{\mathcal F_m\mathcal K_m}\xrightarrow{\ \mathbb P\ }0.
    \]
For the first point, we use the Hoeffding decomposition, which is the same as the decomposition in Section 3.1, except that we replace $W_{i,n}$ by $\varphi(Y_{i,n})$:
\[
U_n^{(T)}(\varphi)-\mathbb E\,U_n^{(T)}(\varphi):=U_{1,n}(\varphi)+U_{2,n}(\varphi),
\]
where
\[
U_{1,n}(\varphi):=\frac1n\sum_{i=1}^{n}\widehat H_{1,i}(\mathbf u,\mathbf x,\varphi),\qquad
U_{2,n}(\varphi):=\frac{(n-m)!}{n!}\sum_{\mathbf i\in I_n^m}\xi_{i_1}\cdots\xi_{i_m}H_{2,\mathbf i}(z).
\]
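To illustrate how this decomposition isolates the dominant linear part, the following toy computation exhibits the Hájek term and the degenerate remainder for a plain order-2 U-statistic; the symmetric kernel g and its closed-form projection are specific to this toy example and only stand in for the weighted kernel of $U_n^{(T)}$.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)
n = 200
Z = rng.standard_normal(n)
g = lambda a, b: a * b + np.abs(a - b)          # toy symmetric kernel

# population projection g1(z) = E[g(z, Z')], Z' ~ N(0,1), in closed form:
# E[z Z'] = 0 and E|z - Z'| = z(2 Phi(z) - 1) + 2 phi(z)
Phi = lambda z: 0.5 * (1.0 + erf(z / np.sqrt(2.0)))
phi = lambda z: np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
g1 = Z * (2.0 * Phi(Z) - 1.0) + 2.0 * phi(Z)
theta = 2.0 / np.sqrt(np.pi)                    # = E g(Z, Z')

U_n = np.mean([g(Z[i], Z[j]) for i in range(n) for j in range(n) if i != j])
U_1n = 2.0 * np.mean(g1 - theta)                # linear (Hajek) term
U_2n = U_n - theta - U_1n                       # degenerate remainder
print(U_n - theta, U_1n, U_2n)                  # U_2n is of smaller order
```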
The convergence of $U_{2,n}(\varphi)$ to zero in probability follows from Lemma 1. Hence, it is enough to show that $U_{1,n}(\varphi)$ converges weakly to a Gaussian process, denoted $\mathbb G(\varphi)$. To achieve this, we go through finite-dimensional convergence and equicontinuity.
The finite-dimensional convergence asserts that, for every finite set of functions $f_1,\dots,f_q$ in $L_2$ and with $\widetilde U$ denoting the centered form of $U$, the vector
\[
\Big(\sqrt{nh^{m}\phi(h)}\,\widetilde U_{1,n}(f_1),\dots,\sqrt{nh^{m}\phi(h)}\,\widetilde U_{1,n}(f_q)\Big)
\]
converges to the corresponding finite-dimensional distributions of the process $\mathbb G(\varphi)$. It is sufficient to show that, for every fixed collection $(a_1,\dots,a_q)\in\mathbb R^{q}$, we have
\[
\sum_{j=1}^{q}a_j\widetilde U_{1,n}(f_j)\rightsquigarrow\mathcal N\big(0,v^{2}\big),
\]
where
\[
v^{2}=\sum_{j=1}^{q}a_j^{2}\,\mathrm{Var}\big(\widetilde U_{1,n}(f_j)\big)+\sum_{s\neq r}a_sa_r\,\mathrm{Cov}\big(\widetilde U_{1,n}(f_s),\widetilde U_{1,n}(f_r)\big).
\]
Take
\[
h(\cdot)=\sum_{j=1}^{q}a_jf_j(\cdot).
\]
By linearity of $h(\cdot)$, it suffices to show that
\[
\widetilde U_{1,n}(h,\mathbf i)\rightsquigarrow\mathbb G(h).
\]
Let
\[
N=\mathbb E\Bigg[\prod_{j=1}^{m}K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)\Bigg].
\]
We have:
\[
\begin{aligned}
\widetilde U_{1,n}(h_n)&=N^{-1}\times\frac1n\sum_{i=1}^{n}\frac{(n-m)!}{(n-1)!}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi_{i_1}\cdots\xi_{i_{\ell-1}}\xi_i\,\xi_{i_\ell}\cdots\xi_{i_{m-1}}\int\frac{1}{\phi(h)}K_2\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big)\\
&\qquad\times h\big(y_1,\dots,y_{\ell-1},Y_i,y_\ell,\dots,y_{m-1}\big)\prod_{\substack{j=1\\ j\neq i}}^{m-1}\frac{1}{\phi(h)}K_2\Big(\frac{d(x_j,X_{s_{i_j},A_n})}{h_n}\Big)\,\mathbb P\big(d(\nu_1,y_1),\dots,d(\nu_{\ell-1},y_{\ell-1}),d(\nu_\ell,y_\ell),\dots,d(\nu_{m-1},y_{m-1})\big)\\
&:=N^{-1}\,\frac1n\sum_{i=1}^{n}\xi_i\,\frac{1}{\phi(h)}K_2\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big)\widetilde h(Y_i).
\end{aligned}
\]
The next step requires an extension of Bernstein's blocking technique to the spatial process; all the relevant notions are defined in Section 9.1.
Recall that $L_n=L_{1,n}\cup L_{2,n}$ and define:
\[
Z_{s,A_n}(\mathbf u,\mathbf x):=\xi_i\,\frac{1}{\phi(h)}K_2\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big)\widetilde h(Y_i),
\]
and
\[
Z_n(\ell;\epsilon)=\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}Z_{s,A_n}(\mathbf u,\mathbf x)=\Big(Z_n^{(1)}(\ell;\epsilon),\dots,Z_n^{(p)}(\ell;\epsilon)\Big).
\]
Then, we have
\[
\widetilde U_{1,n}(h_n)=\sum_{i=1}^{n}Z_{s,A_n}(\mathbf u,\mathbf x)=\sum_{\ell\in L_n}Z_n(\ell;\epsilon_0)+\sum_{\epsilon\neq\epsilon_0}\underbrace{\sum_{\ell\in L_{1,n}}Z_n(\ell;\epsilon)}_{=:Z_{2,n}(\epsilon)}+\sum_{\epsilon\neq\epsilon_0}\underbrace{\sum_{\ell\in L_{2,n}}Z_n(\ell;\epsilon)}_{=:Z_{3,n}(\epsilon)}=:Z_{1,n}+\sum_{\epsilon\neq\epsilon_0}Z_{2,n}(\epsilon)+\sum_{\epsilon\neq\epsilon_0}Z_{3,n}(\epsilon).
\]
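The construction behind this decomposition can be visualized schematically. Below is a one-dimensional caricature of the blocking (the construction of Section 9.1 is $d$-dimensional and uses the lengths $A_{1,n}$, $A_{2,n}$): each site falls either in a "large" block, whose sums form $Z_{1,n}$, or in a "small" block, whose sums form the negligible pieces $Z_{2,n}$ and $Z_{3,n}$. All numerical values are arbitrary.

```python
import numpy as np

def assign_blocks(sites, A1, A2):
    """Classify sites on [0, A_n] into alternating large/small blocks."""
    period = A1 + A2
    ell = (sites // period).astype(int)   # block index (the role of l)
    large = (sites % period) < A1         # True -> large block, else small
    return ell, large

sites = np.sort(np.random.default_rng(1).uniform(0.0, 100.0, size=200))
ell, large = assign_blocks(sites, A1=8.0, A2=2.0)
# sums over {large} sites give Z_{1,n}; sums over {~large} give Z_{2,n}, Z_{3,n}
```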
Lemma A8 shows that $Z_{2,n}$ and $Z_{3,n}$, for $\epsilon\neq\epsilon_0$, are asymptotically negligible. The treatment of the variance of $Z_{1,n}$ is then clear: first, mixing conditions are used to replace the large blocks with independent random variables, and then Lyapunov's condition for the central limit theorem is applied to the sum of independent random variables. Similarly to the proof of Proposition 1, using Lemma A4 as in Equation (59), observe that
\[
\sup_{t>0}\Big|\mathbb P_{\cdot S}\big(\|Z_{1,n}\|>t\big)-\mathbb P_{\cdot S}\big(\|\breve Z_{1,n}\|>t\big)\Big|\le C\Big(\frac{A_n}{A_{1,n}}\Big)^{d}\beta\big(A_{2,n};A_n^{d}\big),
\]
where $\{\breve Z_n(\ell;\epsilon):\ \ell\in L_n\}$ denotes a sequence of independent random vectors in $\mathbb R^{p}$ under $\mathbb P_{\cdot S}$ such that
\[
\breve Z_n(\ell;\epsilon)\overset{d}{=}Z_n(\ell;\epsilon)\quad\text{under }\mathbb P_{S},\ \ell\in L_n.
\]
Applying Lyapunov's condition for the central limit theorem for sums of independent random variables, the remaining part of the finite-dimensional convergence is established.
We conclude with the asymptotic equicontinuity. We have to prove that:
\[
\lim_{\delta\to0}\limsup_{n\to\infty}\mathbb P\Big(\sqrt{nh^{m}\phi(h)}\,\big\|\widetilde U_{1,n}(h_n,\mathbf i)\big\|_{\mathcal F\mathcal K(\delta,\|\cdot\|_p)}>\epsilon\Big)=0,
\]
where
\[
\mathcal F\mathcal K(\delta,\|\cdot\|_p):=\Big\{\widetilde U_{1,n}(h_n)-\widetilde U'_{1,n}(h_n)\ :\ \big\|\widetilde U_{1,n}(h_n)-\widetilde U'_{1,n}(h_n)\big\|<\delta,\ \widetilde U_{1,n}(h_n),\widetilde U'_{1,n}(h_n)\in\mathcal F\mathcal K\Big\},
\]
for
\[
\begin{aligned}
\widetilde U_{1,n}(h_n)&=N^{-1}\frac1n\sum_{i=1}^{n}\xi_i\frac{1}{\phi(h)}K_{2,1}\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big)\widetilde h_1(Y_i)-\mathbb E\,U_{1,n}(h_n),\\
\widetilde U'_{1,n}(h_n)&=N^{-1}\frac1n\sum_{i=1}^{n}\xi_i\frac{1}{\phi(h)}K_{2,2}\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big)\widetilde h_2(Y_i)-\mathbb E\,U_{1,n}(h_n).
\end{aligned}
\]
At this point, we adapt the chaining technique of [82], used for the conditional setting with locally stationary processes in [152], to random fields, as in Lemma 1.
Using the same strategy as in Lemma 1 to pass from the sequence of locally stationary random variables to the stationary one, we find, for $\zeta_i=(\eta_i,\varsigma_i)$ the independent block sequences:
\[
\begin{aligned}
&\mathbb P\Bigg(\Big\|(n\phi(h))^{-1/2}h^{m/2}N^{-1}\sum_{i=1}^{n}\xi_iK_2\Big(\frac{d(x_i,X_i)}{h}\Big)\widetilde h(Y_i)-\mathbb E\,U_{1,n}(h_n)\Big\|_{\mathcal F\mathcal K(b,\|\cdot\|_p)}>\epsilon\Bigg)\\
&\quad\le2\,\mathbb P\Bigg(\Bigg\|(n\phi(h))^{-1/2}h^{m/2}N^{-1}\sum_{\ell\in L_n}\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon_0)\cap R_n}\xi_iK_2\Big(\frac{d(x_i,\eta_i)}{h}\Big)\widetilde h(\varsigma_i)-\frac{1}{n\phi(h_n)}\sum_{p=1}^{\upsilon_n}\sum_{i\in H_p(U)}\sum_{q:\,|q-p|\ge2}\mathbb E\,U_{1,n}(h_n)\Bigg\|_{\mathcal F\mathcal K(b,\|\cdot\|_p)}>\epsilon\Bigg)\\
&\qquad+C\Big(\frac{A_n}{A_{1,n}}\Big)^{d}\beta\big(A_{2,n};A_n^{d}\big)+o_{\mathbb P}(1).
\end{aligned}
\]
Taking advantage of condition (E2) in Assumption 6, we obtain $\beta\big(A_{2,n};A_n^{d}\big)\to0$ as $n\to\infty$; it then suffices to handle the first term on the right-hand side of (109). Since the blocks are independent, we symmetrize using a sequence $\{\epsilon_j\}_{j\in\mathbb N}$ of i.i.d. Rademacher variables, i.e., random variables with
\[
\mathbb P(\epsilon_j=1)=\mathbb P(\epsilon_j=-1)=1/2.
\]
It is important to notice that the sequence $\{\epsilon_j\}_{j\in\mathbb N}$ is independent of the sequence $\big(\xi_i=(\varsigma_i,\zeta_i)\big)_{i\in\mathbb N}$; therefore, it remains to establish, for all $\epsilon>0$ and $\delta>0$,
\[
\lim_{\delta\to0}\limsup_{n\to\infty}\mathbb P\Bigg(\Bigg\|(n\phi(h))^{-1/2}h^{m/2}N^{-1}\sum_{\ell\in L_n}\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon_0)\cap R_n}\xi_iK_2\Big(\frac{d(x_i,\eta_i)}{h}\Big)\widetilde h(\varsigma_i)-\frac{1}{n\phi(h_n)}\sum_{p=1}^{\upsilon_n}\sum_{i\in H_p(U)}\sum_{q:\,|q-p|\ge2}\mathbb E\,U_{1,n}(h_n,\mathbf i)\Bigg\|_{\mathcal F\mathcal K(b,\|\cdot\|_p)}>\epsilon\Bigg)<\delta.
\]
Define the semi-norm:
\[
\widetilde d_{n\phi,2}\big(\widetilde U_{1,n},\widetilde U'_{1,n}\big):=\Bigg((n\phi(h))^{-1}h^{m}N^{-2}\sum_{\ell\in L_n}\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon_0)\cap R_n}\Big[\Big(\xi_iK_{2,1}\Big(\frac{d(x_i,\eta_i)}{h}\Big)\widetilde h_1(\varsigma_i)-\mathbb E\,U_{1,n}(h_n,\mathbf i)\Big)-\Big(\xi_iK_{2,2}\Big(\frac{d(x_i,\eta_i)}{h}\Big)\widetilde h_2(\varsigma_i)-\mathbb E\,U_{1,n}(h_n,\mathbf i)\Big)\Big]^{2}\Bigg)^{1/2}
\]
and the covering number, defined for any class of functions $\mathcal E$ by:
\[
\widetilde N_{n\phi,2}(u,\mathcal E):=N_{n\phi,2}\big(u,\mathcal E,\widetilde d_{n\phi,2}\big).
\]
Thanks to the latter, we are able to bound (109) (more details are given in [83]). In the same way as in [83], and earlier in [82], owing to the independence between the blocks and Assumption 7 (C3), and by applying ([151], Lemma 5.2), the equicontinuity is achieved, and then the weak convergence follows.
Now, we need to show that:
\[
\big\|\mu_n^{(R)}(\varphi,t)\big\|_{\mathcal F_m\mathcal K_m}\xrightarrow{\ \mathbb P\ }0\quad\text{as }n\to\infty.
\]
For clarity, we restrict ourselves to $m=2$. Using the same notation as in Lemma 1, we have the following decomposition:
\[
\begin{aligned}
\mu_n^{(R)}(\varphi,\mathbf i)&=\sqrt{nh^{m+d}\phi(h_n)}\Big(U_n^{(R)}(\varphi,\mathbf i)-\mathbb E\,U_n^{(R)}(\varphi,\mathbf i)\Big)\\
&=\frac{\sqrt{nh^{m+d}\phi(h_n)}}{n(n-1)}\sum_{i_1\neq i_2}^{n}\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf t}\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)\Big]\\
&\le\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((X_i,X_j),(Y_i,Y_j)\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}(\cdots)\Big]\\
&\quad+\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{L_n}\sum_{\substack{i_1\neq i_2:\\ s_{i_1},s_{i_2}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((X_i,X_j),(Y_i,Y_j)\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}(\cdots)\Big]\\
&\quad+2\,\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}(\cdots)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}(\cdots)\Big]\\
&\quad+2\,\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_1}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf t}(\cdots)-\mathbb E\,G^{(R)}_{\varphi,\mathbf t}(\cdots)\Big]\\
&\quad+\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{2L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf t}(\cdots)-\mathbb E\,G^{(R)}_{\varphi,\mathbf t}(\cdots)\Big]\\
&\quad+\frac{1}{\sqrt{nh^{m+d}\phi(h_n)}}\frac{1}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{\substack{i_1<i_2:\\ s_{i_1},s_{i_2}\in\Gamma_n(\ell_1;\epsilon)\cap R_n}}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf t}(\cdots)-\mathbb E\,G^{(R)}_{\varphi,\mathbf t}(\cdots)\Big]\\
&=:\ \mathrm I+\mathrm{II}+\mathrm{III}+\mathrm{IV}+\mathrm V+\mathrm{VI},
\end{aligned}
\]
where $(\cdots)$ abbreviates the arguments $\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)$ of the corresponding term.
We shall employ blocking arguments and evaluate the resulting terms. We begin by examining the first term $\mathrm I$. We obtain
\[
\begin{aligned}
&\mathbb P\Bigg(\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf t}\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg)\\
&\quad\le\mathbb P\Bigg(\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg)\\
&\qquad+C\Big(\frac{A_n}{A_{1,n}}\Big)^{d}\beta\big(A_{2,n};A_n^{d}\big).
\end{aligned}
\]
Recall that, for all $\varphi\in\mathcal F_m$ and all $\mathbf x\in\mathcal H^{2}$, $\mathbf y\in\mathcal Y^{2}$:
\[
\mathbf 1\big\{d(\mathbf x,\mathbf X_{i,n})\le h\big\}\,F(\mathbf y)\ \ge\ \varphi(\mathbf y)\,K_2\Big(\frac{d(x_i,X_{s_i,A_n})}{h_n}\Big).
\]
Hence, by the symmetry of F ( · ) :
\[
\begin{aligned}
&\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf t}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}\\
&\quad\le\frac{1}{n\phi(h_n)}\sum_{p\neq q}^{\upsilon_n}\sum_{i\in H_p(U)}\sum_{j\in H_q(U)}\phi(h_n)\,\xi_i\xi_j\Big|F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}-\mathbb E\big(F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\big)\Big|.
\end{aligned}
\]
We apply Chebyshev's inequality, Hoeffding's trick, and Hoeffding's inequality, respectively, to obtain:
\[
\begin{aligned}
&\mathbb P\Bigg(\frac{1}{n\phi(h)}\Bigg|\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}-\mathbb E\big(F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\big)\Big]\Bigg|>\delta\Bigg)\\
&\quad\le\delta^{-2}n^{-1}\phi^{-1}(h)\,\mathrm{Var}\Bigg(\frac{1}{2L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\Bigg)\\
&\quad\le c\,2L_n\,\delta^{-2}n^{-1}\phi^{-1}(h)\,\mathrm{Var}\Bigg(\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\Bigg)\\
&\quad\le2c^{2}L_n\,\delta^{-2}n^{-2}\phi^{-1}(h)\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\,\mathbb E\Big[F(\zeta_1,\zeta_2)^{2}\mathbf 1\{F>\lambda_n\}\Big].
\end{aligned}
\]
Under Assumption 7 (iii), we have for each λ > 0 :
\[
\begin{aligned}
&c\,2L_n\,\delta^{-2}n^{-2}\phi^{-1}(h_n)\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\,\mathbb E\Big[F(\zeta_1,\zeta_2)^{2}\mathbf 1\{F>\lambda_n\}\Big]\\
&\quad=c\,2L_n\,\delta^{-2}n^{-2}\phi^{-1}(h_n)\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\int_0^{\infty}\mathbb P\Big(F(\zeta_1,\zeta_2)^{2}\mathbf 1\{F>\lambda_n\}\ge t\Big)\,dt\\
&\quad=c\,2L_n\,\delta^{-2}n^{-2}\phi^{-1}(h_n)\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}\ \sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon_0)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Bigg[\int_0^{\lambda_n}\mathbb P\big(F>\lambda_n\big)\,dt+\int_{\lambda_n}^{\infty}\mathbb P\big(F^{2}>t\big)\,dt\Bigg],
\end{aligned}
\]
which converges to 0 as $n\to\infty$. Terms $\mathrm{II}$, $\mathrm V$ and $\mathrm{VI}$ are handled in the same way as the last term, although $\mathrm{II}$ and $\mathrm{VI}$ do not follow exactly the same lines, because the variables $\{\zeta_i,\zeta_j\}_{\epsilon=\epsilon_0}$ (or $\{\zeta_i,\zeta_j\}_{\epsilon\neq\epsilon_0}$ for $\mathrm{VI}$) belong to the same blocks. Term $\mathrm{IV}$ can be deduced from the study of Terms $\mathrm I$ and $\mathrm{III}$. Considering the term $\mathrm{III}$, we have
\[
\begin{aligned}
&\mathbb P\Bigg(\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((X_i,X_j),(Y_i,Y_j)\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((X_{i_1},X_{i_2}),(Y_{i_1},Y_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg)\\
&\quad\le\mathbb P\Bigg(\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg)\\
&\qquad+\frac{L_nA_{1,n}^{d}A_{2,n}^{d}\,\beta\big(A_{2,n};A_n^{d}\big)}{n\phi(h_n)}.
\end{aligned}
\]
We also have
\[
\begin{aligned}
&\mathbb P\Bigg(\Bigg\|\frac{1}{\sqrt{n\phi(h_n)}}\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)\Big]\Bigg\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg)\\
&\quad\le\mathbb P\Bigg(\frac{1}{n\phi(h_n)}\sum_{p=1}^{\upsilon_n}\sum_{i\in H_p(U)}\sum_{q:\,|q-p|\ge2}\Big\|\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)-\mathbb E\,G^{(R)}_{\varphi,\mathbf i}\big((\varsigma_{i_1},\varsigma_{i_2}),(\zeta_{i_1},\zeta_{i_2})\big)\Big]\Big\|_{\mathcal F^2\mathcal K^2}>\delta\Bigg).
\end{aligned}
\]
Since (111) is still true, the problem can be reduced to
\[
\begin{aligned}
&\mathbb P\Bigg(\frac{1}{n\phi(h_n)}\Bigg|\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\Big[F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}-\mathbb E\big(F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\big)\Big]\Bigg|>\delta\Bigg)\\
&\quad\le\delta^{-2}n^{-1}\phi^{-1}(h_n)\,\mathrm{Var}\Bigg(\frac{1}{L_n}\sum_{i_1:\,s_{i_1}\in\Gamma_n(\ell_1;\epsilon_0)\cap R_n}^{\Delta_2}\frac{2}{L_{1,n}L_{2,n}}\sum_{\epsilon\neq\epsilon_0}\sum_{i_2:\,s_{i_2}\in\Gamma_n(\ell_2;\epsilon)\cap R_n}\phi(h_n)\,\xi_{i_1}\xi_{i_2}\,F(\zeta_i,\zeta_j)\mathbf 1\{F>\lambda_n\}\Bigg),
\end{aligned}
\]
and the identical technique as in (112) applies. The remainder term has thus been shown to be asymptotically negligible. Finally, combining the weak convergence of the numerator of $\widehat r^{(m)}(\varphi,\mathbf x,\mathbf u)$ around $\mathbb E\,U_n(\varphi,\mathbf i)$ with $U_n(1,\mathbf i)\xrightarrow{\ \mathbb P\ }1$, the weak convergence of our estimator is accomplished. □

Author Contributions

I.S. and S.B.: conceptualization, methodology, investigation, writing—original draft, writing—review and editing. All authors contributed equally to the writing of this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Special Issue Editor of the Special Issue on “Current Developments in Theoretical and Applied Statistics”, Christophe Chesneau, for the invitation. We extend our sincere thanks to the Editor-in-Chief, the Associate Editor, and the referees for their constructive comments, which led to numerous improvements over a previous version.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix contains supplementary material that is essential for a more comprehensive understanding of the paper.
Assumption A1.
(KD1) 
(KB2) in Assumption 2 holds.
(KD2) 
For any $\alpha\in\mathbb Z^{d}$ with $|\alpha|=1,2$, $\partial^{\alpha}f_S(s)$ exists and is continuous on $(0,1)^{d}$.
Define
\[
\widehat f_S(u)=\frac{1}{nh^{d}}\sum_{j=1}^{n}\bar K\Big(\frac{u-S_{0,j}}{h}\Big).
\]
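As an illustration only, a direct implementation of $\widehat f_S$ for $d=1$ with a Gaussian kernel (any kernel satisfying Assumption 8 would serve; all names are ours) reads:

```python
import numpy as np

def f_S_hat(u, S0, h):
    # S0: rescaled sites S_{0,j} in [0, 1]; Gaussian kernel, d = 1
    return np.mean(np.exp(-0.5 * ((u - S0) / h) ** 2)
                   / np.sqrt(2.0 * np.pi)) / h

S0 = np.random.default_rng(2).uniform(size=1000)
print(f_S_hat(0.5, S0, h=0.1))   # approximately 1 for uniform sites
```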
Lemma A1
([153], Theorem 2). Under Assumption 8 and $h\to0$ such that $nh^{d}/\log n\to\infty$ as $n\to\infty$, we have that
\[
\sup_{u\in[0,1]^{d}}\big|\widehat f_S(u)-f_S(u)\big|=O\Bigg(\sqrt{\frac{\log n}{nh^{d}}}+h^{2}\Bigg),\quad\mathbb P_S\text{-a.s.}
\]
Lemma A2.
Let $I_h=[C_1h,\,1-C_1h]$. Suppose that the kernel $K_1$ satisfies Assumption 8 part (i). Then, for $q=0,1,2$ and $m>1$:
\[
\sup_{u\in I_h^{m}}\Bigg|\frac{1}{n^{m}h^{md}}\sum_{\mathbf i\in I_n^{m}}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-S_{0,i_j}}{h_n}\Big)\Big(\frac{u_j-S_{0,i_j}}{h}\Big)^{q}-\int_{\mathbb R^{md}}\frac{1}{h^{md}}\prod_{j=1}^{m}\bar K\Big(\frac{u_j-\omega_j}{h_n}\Big)\Big(\frac{u_j-\omega_j}{h}\Big)^{q}f_S(\omega_j)\prod_{j=1}^{m}d\omega_j\Bigg|=O\Bigg(\sqrt{\frac{\log n}{nh^{dm}}}\Bigg),\quad\mathbb P_S\text{-a.s.}
\]
Lemma A3.
Suppose that the kernel $\bar K$ satisfies Assumption 8. Let $g:[0,1]^{md}\times\mathcal H^{m}\to\mathbb R$, $(\mathbf u,\mathbf x)\mapsto g(\mathbf u,\mathbf x)$, be continuously partially differentiable with respect to $u_j$. For $k=1,2$, we have
\[
\sup_{u\in I_h^{m}}\Bigg|\frac{1}{n^{m}h^{md}}\sum_{\mathbf i\in I_n^{m}}\prod_{j=1}^{m}\bar K^{k}\Big(\frac{u_j-S_{0,j}}{h_n}\Big)g\big(S_{0,j},x_j\big)-\prod_{j=1}^{m}\kappa_kf_S(u_j)g(u_j,x_j)\Bigg|=O\Bigg(\sqrt{\frac{\log n}{nh^{md}}}\Bigg)+o(h),\quad\mathbb P_S\text{-a.s.},
\]
where
\[
\kappa_k=\int_{\mathbb R^{d}}\bar K^{k}(x)\,dx.
\]
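For instance, for $d=1$ and the Epanechnikov kernel (an illustrative choice; SciPy is assumed available for the quadrature), the constants evaluate to $\kappa_1=1$ and $\kappa_2=3/5$:

```python
from scipy.integrate import quad

K = lambda x: 0.75 * max(1.0 - x ** 2, 0.0)        # Epanechnikov kernel
kappa_1 = quad(lambda x: K(x), -1.0, 1.0)[0]       # = 1
kappa_2 = quad(lambda x: K(x) ** 2, -1.0, 1.0)[0]  # = 3/5
print(kappa_1, kappa_2)
```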
For any probability measure $Q$ on a product measurable space $(\Omega_1\times\Omega_2,\ \Sigma_1\times\Sigma_2)$, we may define the $\beta$-mixing coefficient as follows:
Definition A1
([149], Definition 2.5). Let $Q_1$ and $Q_2$ be the marginal probability measures of $Q$ on $(\Omega_1,\Sigma_1)$ and $(\Omega_2,\Sigma_2)$, respectively. We set
\[
\beta\big(\Sigma_1,\Sigma_2,Q\big)=\mathbb E\,\sup\Big\{\big|Q(B\mid\Sigma_1)-Q_2(B)\big|\ :\ B\in\Sigma_2\Big\}.
\]
The following lemma holds for every finite $n$ and is essential for the construction of independent blocks for $\beta$-mixing sequences.
Lemma A4
([149], Corollary 2.7). Let $m\in\mathbb N$ and let $Q$ denote a probability measure on a product space $\big(\prod_{i=1}^{m}\Omega_i,\ \prod_{i=1}^{m}\Sigma_i\big)$ with the associated marginal measures $Q_i$ on $(\Omega_i,\Sigma_i)$. Assume that $h$ is a bounded measurable function on the product probability space such that $|h|\le M_h<\infty$. For $1\le a\le b\le m$, let $Q_a^{b}$ be the marginal measure on $\big(\prod_{i=a}^{b}\Omega_i,\ \prod_{i=a}^{b}\Sigma_i\big)$. For a given $\tau>0$, suppose that, for all $1\le k\le m-1$,
\[
\big\|Q-Q_1^{k}\times Q_{k+1}^{m}\big\|_{TV}\le2\tau,
\]
where $Q_1^{k}\times Q_{k+1}^{m}$ is the product measure and $\|\cdot\|_{TV}$ denotes the total variation norm. Then
\[
|Qh-Ph|\le2M_h(m-1)\tau,
\]
where $P=\prod_{i=1}^{m}Q_i$, $Qh=\int h\,dQ$ and $Ph=\int h\,dP$.
Lemma A5.
Let
\[
\mathcal I_n=\big\{\mathbf i\in\mathbb Z^{d}\ :\ \mathbf i+(0,1]^{d}\subset R_n\big\}.
\]
Then, we have
\[
\mathbb P_S\Bigg(\sum_{j=1}^{n}\mathbf 1\Big\{A_nS_{0,j}\in\big(\mathbf i+(0,1]^{d}\big)\cap R_n\Big\}>2\big(\log n+nA_n^{-d}\big)\ \text{for some }\mathbf i\in\mathcal I_n,\ \text{i.o.}\Bigg)=0
\]
and
\[
\mathbb P_S\Bigg(\sum_{j=1}^{n}\mathbf 1\big\{A_nS_{0,j}\in\Gamma_n(\ell;\epsilon)\big\}>CA_{1,n}^{q(\epsilon)}A_{2,n}^{d-q(\epsilon)}\,nA_n^{-d}\ \text{for some }\ell\in L_{1,n},\ \text{i.o.}\Bigg)=0
\]
for any $\epsilon\in\{1,2\}^{d}$, where “i.o.” stands for infinitely often.
Proof. 
See the proof in ([93], Lemma A.1) for each statement. □
Remark A1.
Lemma A5 implies that each $\Gamma_n(\ell;\epsilon)$ contains at most $CA_{1,n}^{q(\epsilon)}A_{2,n}^{d-q(\epsilon)}\,nA_n^{-d}$ samples, $\mathbb P_S$-almost surely.
Lemma A6.
Under Assumptions 2 and 3, Condition (B1) in Assumptions 4–6, and Assumption 8, we have:
\[
\mathbb E_{\cdot S}\big\|\bar S_n(\ell;\epsilon)\big\|^{2}\le CA_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h).
\]

Appendix A.1. Proof of Lemma A6

We have
\[
\mathbb E_{\cdot S}\big\|\bar S_n(\ell;\epsilon)\big\|^{2}=\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}\mathbb E_{\cdot S}\big[\bar S^{2}_{s,A_n}(\mathbf u,\mathbf x)\big]+\sum_{i\neq j:\,s_i,s_j\in\Gamma_n(\ell;\epsilon)\cap R_n}\mathbb E_{\cdot S}\big[\bar S_{s_i,A_n}(\mathbf u,\mathbf x)\bar S_{s_j,A_n}(\mathbf u,\mathbf x)\big],
\]
where
\[
\begin{aligned}
\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}\mathbb E_{\cdot S}\big[\bar S^{2}_{s,A_n}(\mathbf u,\mathbf x)\big]
&\le\Big(\frac{(n-m)!}{(n-1)!}\Big)^{2}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi^{2}_{i_1}\cdots\xi^{2}_{i_{\ell-1}}\xi^{2}_{i_\ell}\cdots\xi^{2}_{i_{m-1}}\\
&\qquad\times\int W_{s(\ell_1,\dots,\ell_{\ell-1},\ell,\dots,\ell_{m-1}),A_n}\prod_{\substack{j=1\\ j\neq i}}^{m-1}\frac{1}{\phi(h)}K_2\Big(\frac{d(x_j,\nu_{s_j,A_n})}{h}\Big)\mathbb P\big(d\nu_1,\dots,d\nu_{\ell-1},d\nu_\ell,\dots,d\nu_{m-1}\big)^{2}\\
&\qquad\times\Bigg\{\mathbb E_{\cdot S}\Bigg[\frac{1}{\phi^{2}(h)}K_2^{2}\Big(\frac{d(x_i,X_{s_i,A_n})}{h}\Big)W^{2}_{s_i,A_n}\Bigg]+\mathbb E_{\cdot S}\Bigg[\frac{1}{\phi^{2}(h)}K_2\Big(\frac{d(x_i,X_{s_i,A_n})}{h}\Big)W_{s_i,A_n}\Bigg]^{2}\Bigg\}\\
&\le\frac{C}{\phi^{2}(h)}\Big(\frac{(n-m)!}{(n-1)!}\Big)^{2}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi^{2}_{i_1}\cdots\xi^{2}_{i_{\ell-1}}\xi^{2}_{i_\ell}\cdots\xi^{2}_{i_{m-1}},\quad\mathbb P_S\text{-a.s.}
\end{aligned}
\]
Likewise, we can see that
\[
\mathbb E_{\cdot S}\big[\bar S_{s_i,A_n}(\mathbf u,\mathbf x)\bar S_{s_j,A_n}(\mathbf u,\mathbf x)\big]\le\frac{C}{\phi^{2}(h)}\Big(\frac{(n-m)!}{(n-1)!}\Big)^{2}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi^{2}_{i_1}\cdots\xi^{2}_{i_{\ell-1}}\xi^{2}_{i_\ell}\cdots\xi^{2}_{i_{m-1}},\quad\mathbb P_S\text{-a.s.}
\]
Applying Lemmas A5 and A1, we find that
\[
\begin{aligned}
\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}\bar K^{2}_h\Big(u-\frac{s_j}{A_n}\Big)\times\Big(\frac{(n-m)!}{(n-1)!}\Big)^{2}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi^{2}_{i_1}\cdots\xi^{2}_{i_{m-1}}
&\le C\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}\bar K_h\Big(u-\frac{s_i}{A_n}\Big)\times\frac{(n-m)!}{(n-1)!}\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi_{i_1}\cdots\xi_{i_{m-1}}\\
&\le Ch^{md}\sum_{i:\,s_i\in\Gamma_n(\ell;\epsilon)\cap R_n}1\\
&\le Ch^{md}A_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big),\quad\mathbb P_S\text{-a.s.},
\end{aligned}
\]
and
\[
\begin{aligned}
&\sum_{i\neq j:\,s_i,s_j\in\Gamma_n(\ell,\epsilon)\cap R_n}\bar K_h\Big(u-\frac{s_i}{A_n}\Big)\bar K_h\Big(u-\frac{s_j}{A_n}\Big)\times\frac{(n-m)!}{(n-1)!}\Bigg(\sum_{I_{n-1}^{m-1}(i)}\sum_{\ell=1}^{m}\xi_{i_1}\cdots\xi_{i_{m-1}}\Bigg)\Bigg(\sum_{I_{n-1}^{m-1}(j)}\sum_{\ell=1}^{m}\xi_{j_1}\cdots\xi_{j_{m-1}}\Bigg)\\
&\quad\le\Bigg(\sum_{j:\,s_j\in\Gamma_n(\ell;\epsilon)\cap R_n}\bar K_h\Big(u-\frac{s_j}{A_n}\Big)\frac{(n-m)!}{(n-1)!}\sum_{I_{n-1}^{m-1}(j)}\sum_{\ell=1}^{m}\xi_{j_1}\cdots\xi_{j_{m-1}}\Bigg)^{2}\\
&\quad\le Ch^{2md}\Bigg(\sum_{j:\,s_j\in\Gamma_n(\ell;\epsilon)\cap R_n}1\Bigg)^{2}\le Ch^{2md}A_{1,n}^{2(d-1)}A_{2,n}^{2}\big(nA_n^{-d}+\log n\big)^{2},\quad\mathbb P_S\text{-a.s.}
\end{aligned}
\]
Since
\[
A_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h)\le A_{1,n}^{d}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h)=o(1),
\]
we have
\[
\mathbb E_{\cdot S}\big\|\bar S_n(\ell;\epsilon)\big\|^{2}\le C\Big[A_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h)+A_{1,n}^{2(d-1)}A_{2,n}^{2}\big(n^{2}A_n^{-2d}+\log^{2}n\big)h^{2md}\phi^{2}(h)\Big]\le CA_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h),\quad\mathbb P_S\text{-a.s.}
\]
Lemma A7 (Bernstein’s inequality).
Let $X_1,\dots,X_n$ be zero-mean, independent random variables. Assume that
\[
\max_{1\le i\le n}|X_i|\le M<\infty,\quad\text{a.s.}
\]
Then, for all $t>0$, we have
\[
\mathbb P\Bigg(\Big|\sum_{i=1}^{n}X_i\Big|\ge t\Bigg)\le\exp\Bigg(-\frac{t^{2}/2}{\sum_{i=1}^{n}\mathbb E X_i^{2}+Mt/3}\Bigg).
\]
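The inequality is easy to check by simulation. The sketch below compares the empirical tail of a sum of bounded, centered uniform variables with the Bernstein bound (all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, M, reps = 200, 1.0, 20000
X = rng.uniform(-M, M, size=(reps, n))     # zero mean, |X_i| <= M
S = X.sum(axis=1)
var_sum = n * M ** 2 / 3.0                 # sum of E X_i^2
for t in (10.0, 20.0, 30.0):
    emp = np.mean(S >= t)
    bound = np.exp(-t ** 2 / (2.0 * (var_sum + M * t / 3.0)))
    print(t, emp, bound)                   # empirically, emp <= bound
```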
Lemma A8.
Under Assumptions 2–4 and 6, we have
\[
\frac{1}{nh^{md}\phi^{m}(h)}\mathrm{Var}_{\cdot S}\Bigg(\sum_{\ell\in L_{1,n}}Z_n(\ell;\epsilon)\Bigg)=o(1),\quad\mathbb P_S\text{-a.s.},\tag{A5}
\]
\[
\frac{1}{nh^{md}\phi^{m}(h)}\mathrm{Var}_{\cdot S}\Bigg(\sum_{\ell\in L_{2,n}}Z_n(\ell;\epsilon)\Bigg)=o(1),\quad\mathbb P_S\text{-a.s.}\tag{A6}
\]

Appendix A.2. Proof of Lemma A8

We have
\[
\frac{1}{nh^{md}\phi(h)}\mathrm{Var}_{\cdot S}\Bigg(\sum_{\ell\in L_{1,n}}Z_n(\ell;\epsilon)\Bigg)=\frac{1}{nh^{md}\phi(h)}\sum_{\ell\in L_{1,n}}\mathbb E_{\cdot S}\big\|Z_n(\ell;\epsilon)\big\|^{2}+\frac{1}{nh^{md}\phi(h)}\sum_{\ell_1\neq\ell_2\in L_{1,n}}\mathbb E_{\cdot S}\big[Z_n(\ell_1;\epsilon)Z_n(\ell_2;\epsilon)\big]:=I_1+I_2.
\]
Using Lemma A6 and Assumption 4, it is easy to see that
\[
I_1\le C\,\frac{1}{nh^{md}\phi(h)}\Big(\frac{A_n}{A_{1,n}}\Big)^{d}A_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}\phi(h)\le C\,\frac{A_{2,n}}{A_{1,n}}\,\log n=o(1).
\]
For $I_2$, using ([154], Theorem 1.1), we have:
\[
\begin{aligned}
\mathbb E_{\cdot S}\big[Z_n(\ell_1;\epsilon)Z_n(\ell_2;\epsilon)\big]&\le\Big(\mathbb E_{\cdot S}\big|Z_n(\ell_1;\epsilon)\big|^{3}\Big)^{1/3}\Big(\mathbb E_{\cdot S}\big|Z_n(\ell_2;\epsilon)\big|^{3}\Big)^{1/3}\beta^{1/3}\big(d(\ell_1,\ell_2)A_{2,n},A_{1,n}^{md}\big)\\
&\le\Big(\mathbb E_{\cdot S}\big|Z_n(\ell_1;\epsilon)\big|^{3}\Big)^{1/3}\Big(\mathbb E_{\cdot S}\big|Z_n(\ell_2;\epsilon)\big|^{3}\Big)^{1/3}\beta_1^{1/3}\big(d(\ell_1,\ell_2)A_{2,n}\big)g_1^{1/3}\big(A_{1,n}^{md}\big).
\end{aligned}
\]
The first inequality holds by Equation (8), with $d(\ell_1,\ell_2)=\max_{1\le j\le d}|\ell_{1j}-\ell_{2j}|$. Using the same strategy as in Lemma A6, we have
\[
\mathbb E_{\cdot S}\big|Z_n(\ell_1;\epsilon)\big|^{3}\le CA_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md},
\]
and
\[
\mathbb E_{\cdot S}\big|Z_n(\ell_2;\epsilon)\big|^{3}\le CA_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{md}.
\]
Note that, for $\ell_1,\ell_2\in L_{1,n}$, $\Gamma(\ell_1;\epsilon_0)$ and $\Gamma(\ell_2;\epsilon_0)$ in $R_n$ are separated by the distance
\[
d\big(\Gamma(\ell_1;\epsilon_0),\Gamma(\ell_2;\epsilon_0)\big)\ge|\ell_1-\ell_2|_{d}+A_{3,n}+A_{2,n}.
\]
Consequently,
\[
\begin{aligned}
I_2&\le\frac{C\Big[A_{1,n}^{d-1}A_{2,n}\big(nA_n^{-d}+\log n\big)h^{p+d}\Big]^{2/3}}{nh^{d+p}}\times\sum_{\substack{\ell_1,\ell_2\in L_{1,n}\\ \ell_1\neq\ell_2}}\beta_1^{1/3}\big(|\ell_1-\ell_2|_{d}+A_{3,n}+A_{2,n}\big)g_1^{1/3}\big(A_{1,n}^{d}\big)\\
&\le C\Big(\frac{1}{nh^{d+p}}\Big)^{1/3}\Bigg[\Big(\frac{A_{1,n}}{A_n}\Big)^{2d/3}\Big(\frac{A_{2,n}}{A_{1,n}}\Big)^{2/3}+\frac{A_{1,n}^{(d-1)/3}A_{2,n}^{1/3}(\log n)^{1/3}}{\big(nh^{d+p}\big)^{1/3}}\Bigg]\times g_1^{1/3}\big(A_{1,n}^{d}\big)\Bigg[\beta_1^{1/3}\big(A_{2,n}\big)+\sum_{k=1}^{A_n/A_{1,n}}k^{d-1}\beta_1^{1/3}\big(kA_{3,n}+A_{2,n}\big)\Bigg]=o(1).
\end{aligned}
\]
The last inequality follows from Assumption 4 and
\[
|\ell_1-\ell_2|=\sum_{j=1}^{d}|\ell_{1,j}-\ell_{2,j}|.
\]
Equation (A6) can be treated similarly to (A5). □
Remark A2.
In order to prove that the summation over the small blocks is asymptotically negligible, we can use the method of [87], where one first passes from the dependence structure of the variables to independence, and then proves the convergence of the second-order expectation to zero by means of a maximal inequality. This method avoids the treatment of covariances altogether.
Proposition A1
([27], Proposition 3.6). Let $\{X_i:\ i\in T\}$ be a process satisfying, for $m\ge1$:
\[
\big(\mathbb E|X_i-X_j|^{p}\big)^{1/p}\le\Big(\frac{p-1}{q-1}\Big)^{m/2}\big(\mathbb E|X_i-X_j|^{q}\big)^{1/q},\qquad1<q<p<\infty,
\]
and the semi-metric:
\[
\rho(j,i)=\big(\mathbb E|X_i-X_j|^{2}\big)^{1/2}.
\]
There exists a constant K = K ( m ) such that:
\[
\mathbb E\sup_{i,j\in T}|X_i-X_j|\le K\int_{0}^{D}\big[\log N(\epsilon,T,\rho)\big]^{m/2}\,d\epsilon,
\]
where $D$ is the $\rho$-diameter of $T$.
Lemma A9
([155]). Let $X_1,\dots,X_n$ be a sequence of independent random elements taking values in a Banach space $(B,\|\cdot\|)$ with $\mathbb EX_i=0$ for all $i$. Let $(\varepsilon_i)$ be a sequence of independent Bernoulli random variables, independent of $(X_i)$. Then, for any convex increasing function $\Phi$,
\[
\mathbb E\,\Phi\Bigg(\frac12\Big\|\sum_{i=1}^{n}X_i\varepsilon_i\Big\|\Bigg)\le\mathbb E\,\Phi\Bigg(\Big\|\sum_{i=1}^{n}X_i\Big\|\Bigg)\le\mathbb E\,\Phi\Bigg(2\Big\|\sum_{i=1}^{n}X_i\varepsilon_i\Big\|\Bigg).
\]
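A quick Monte Carlo illustration of this symmetrization inequality, taking $B=\mathbb R$ with the absolute value and $\Phi(x)=x$ (a convex increasing choice on the relevant range), shows the stated ordering of the three expectations:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 50, 20000
X = rng.standard_normal((reps, n))              # centered, independent
eps = rng.choice([-1.0, 1.0], size=(reps, n))   # Rademacher, independent of X
sym = np.abs((X * eps).sum(axis=1))
raw = np.abs(X.sum(axis=1))
print(0.5 * sym.mean(), raw.mean(), 2.0 * sym.mean())
# the middle value lies between the outer two, as Lemma A9 asserts
```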

References

  1. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Monographs on Statistics and Applied Probability; Chapman & Hall: London, UK, 1986; pp. x+175.
  2. Nadaraya, E.A. Nonparametric Estimation of Probability Densities and Regression Curves; Mathematics and Its Applications (Soviet Series); Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 1989; Volume 20, pp. x+213.
  3. Härdle, W. Applied Nonparametric Regression; Econometric Society Monographs; Cambridge University Press: Cambridge, UK, 1990; Volume 19, pp. xvi+333.
  4. Wand, M.P.; Jones, M.C. Kernel Smoothing; Monographs on Statistics and Applied Probability; Chapman and Hall, Ltd.: London, UK, 1995; Volume 60, pp. xii+212.
  5. Eggermont, P.P.B.; LaRiccia, V.N. Maximum Penalized Likelihood Estimation. Density Estimation; Springer Series in Statistics; Springer: New York, NY, USA, 2001; Volume I, pp. xviii+510.
  6. Devroye, L.; Lugosi, G. Combinatorial Methods in Density Estimation; Springer Series in Statistics; Springer: New York, NY, USA, 2001; pp. xii+208.
  7. Ripley, B.D. Spatial statistics: Developments 1980–1983. Internat. Statist. Rev. 1984, 52, 141–150.
  8. Rosenblatt, M. Stationary Sequences and Random Fields; Birkhäuser Boston, Inc.: Boston, MA, USA, 1985; p. 258.
  9. Guyon, X. Random Fields on a Network; Probability and Its Applications (New York); Springer: New York, NY, USA, 1995; pp. xii+255.
  10. Cressie, N.A.C. Statistics for Spatial Data, revised ed.; Wiley Classics Library, John Wiley & Sons, Inc.: New York, NY, USA, 2015; pp. xx+900.
  11. Tran, L.T. Kernel density estimation on random fields. J. Multivar. Anal. 1990, 34, 37–53.
  12. Tran, L.T.; Yakowitz, S. Nearest neighbor estimators for random fields. J. Multivar. Anal. 1993, 44, 23–46.
  13. Biau, G.; Cadre, B. Nonparametric spatial prediction. Stat. Inference Stoch. Process. 2004, 7, 327–349.
  14. Dabo-Niang, S.; Yao, A.F. Kernel spatial density estimation in infinite dimension space. Metrika 2013, 76, 19–52.
  15. Ndiaye, M.; Dabo-Niang, S.; Ngom, P. Nonparametric prediction for spatial dependent functional data under fixed sampling design. Rev. Colomb. Estadíst. 2022, 45, 391–428.
  16. Hoeffding, W. A class of statistics with asymptotically normal distribution. Ann. Math. Statistics 1948, 19, 293–325.
  17. Stute, W. Almost sure representations of the product-limit estimator for truncated data. Ann. Statist. 1993, 21, 146–156.
  18. Arcones, M.A.; Wang, Y. Some new tests for normality based on U-processes. Statist. Probab. Lett. 2006, 76, 69–82.
  19. Giné, E.; Mason, D.M. Laws of the iterated logarithm for the local U-statistic process. J. Theoret. Probab. 2007, 20, 457–485.
  20. Giné, E.; Mason, D.M. On local U-statistic processes and the estimation of densities of functions of several sample variables. Ann. Statist. 2007, 35, 1105–1145.
  21. Schick, A.; Wang, Y.; Wefelmeyer, W. Tests for normality based on density estimators of convolutions. Statist. Probab. Lett. 2011, 81, 337–343.
  22. Joly, E.; Lugosi, G. Robust estimation of U-statistics. Stoch. Process. Appl. 2016, 126, 3760–3773.
  23. Lee, S.; Linton, O.; Whang, Y.J. Testing for stochastic monotonicity. Econometrica 2009, 77, 585–602.
  24. Ghosal, S.; Sen, A.; van der Vaart, A.W. Testing monotonicity of regression. Ann. Statist. 2000, 28, 1054–1082.
  25. Abrevaya, J.; Jiang, W. A nonparametric approach to measuring and testing curvature. J. Bus. Econom. Statist. 2005, 23, 1–19.
  26. Nolan, D.; Pollard, D. U-processes: Rates of convergence. Ann. Statist. 1987, 15, 780–799.
  27. Arcones, M.A.; Giné, E. Limit theorems for U-processes. Ann. Probab. 1993, 21, 1494–1542.
  28. Sherman, R.P. Maximal inequalities for degenerate U-processes with applications to optimization estimators. Ann. Statist. 1994, 22, 439–459.
  29. de la Peña, V.H.; Giné, E. Decoupling. From Dependence to Independence, Randomly Stopped Processes. U-Statistics and Processes. Martingales and Beyond; Probability and Its Applications (New York); Springer: New York, NY, USA, 1999; pp. xvi+392.
  30. Halmos, P.R. The theory of unbiased estimation. Ann. Math. Stat. 1946, 17, 34–43.
  31. von Mises, R. On the asymptotic distribution of differentiable statistical functions. Ann. Math. Stat. 1947, 18, 309–348.
  32. Yoshihara, K.i. Limiting behavior of U-statistics for stationary, absolutely regular processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 1976, 35, 237–252.
  33. Borovkova, S.; Burton, R.; Dehling, H. Limit theorems for functionals of mixing processes with applications to U-statistics and dimension estimation. Trans. Amer. Math. Soc. 2001, 353, 4261–4318.
  34. Denker, M.; Keller, G. On U-statistics and v. Mises’ statistics for weakly dependent processes. Z. Wahrsch. Verw. Gebiete 1983, 64, 505–522.
  35. Leucht, A. Degenerate U- and V-statistics under weak dependence: Asymptotic theory and bootstrap consistency. Bernoulli 2012, 18, 552–585.
  36. Leucht, A.; Neumann, M.H. Degenerate U- and V-statistics under ergodicity: Asymptotics, bootstrap and applications in statistics. Ann. Inst. Statist. Math. 2013, 65, 349–386.
  37. Bouzebda, S.; Nemouchi, B. Weak-convergence of empirical conditional processes and conditional U-processes involving functional mixing data. Stat. Inference Stoch. Process. 2022, 1–56.
  38. Bouzebda, S.; Nezzal, A.; Zari, T. Uniform consistency for functional conditional U-statistics using delta-sequences. Mathematics 2022, 24, 3745.
  39. Soukarieh, I.; Bouzebda, S. Exchangeably Weighted Bootstraps of General Markov U-Process. Mathematics 2022, 10, 3745.
  40. Bouzebda, S.; Soukarieh, I. Renewal type bootstrap for increasing degree U-process of a Markov chain. J. Multivar. Anal. 2022, 195, 105143.
  41. Bouzebda, S.; Soukarieh, I. Renewal type bootstrap for U-process Markov chains. Markov Process. Related Fields 2022, 13, 1–50.
  42. Frees, E.W. Infinite order U-statistics. Scand. J. Statist. 1989, 16, 29–45.
  43. Rempala, G.; Gupta, A. Weak limits of U-statistics of infinite order. Random Oper. Stochastic Equ. 1999, 7, 39–52.
  44. Heilig, C.; Nolan, D. Limit theorems for the infinite-degree U-process. Statist. Sinica 2001, 11, 289–302.
  45. Song, Y.; Chen, X.; Kato, K. Approximating high-dimensional infinite-order U-statistics: Statistical and computational guarantees. Electron. J. Stat. 2019, 13, 4794–4848.
  46. Peng, W.; Coleman, T.; Mentch, L. Rates of convergence for random forests via generalized U-statistics. Electron. J. Stat. 2022, 16, 232–292.
  47. Faivishevsky, L.; Goldberger, J. ICA based on a Smooth Estimation of the Differential Entropy. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–10 December 2008; Koller, D., Schuurmans, D., Bengio, Y., Bottou, L., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2008; Volume 21.
  48. Liu, Q.; Lee, J.; Jordan, M. A Kernelized Stein Discrepancy for Goodness-of-fit Tests. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; Balcan, M.F., Weinberger, K.Q., Eds.; PMLR: New York, NY, USA, 2016; Volume 48, pp. 276–284.
  49. Clémençon, S. On U-processes and clustering performance. In Proceedings of the Advances in Neural Information Processing Systems, Granada, Spain, 12–15 December 2011; Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2011; Volume 24.
  50. Borovskikh, Y.V. U-Statistics in Banach Spaces; VSP: Utrecht, The Netherlands, 1996; pp. xii+420.
  51. Koroljuk, V.S.; Borovskich, Y.V. Theory of U-Statistics; Mathematics and Its Applications; Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 1994; Volume 273, pp. x+552.
  52. Lee, A.J. U-Statistics. Theory and Practice; Statistics: Textbooks and Monographs; Marcel Dekker Inc.: New York, NY, USA, 1990; Volume 110, pp. xii+302.
  53. Aneiros, G.; Cao, R.; Fraiman, R.; Genest, C.; Vieu, P. Recent advances in functional data analysis and high-dimensional statistics. J. Multivar. Anal. 2019, 170, 3–9.
  54. Ramsay, J.O.; Silverman, B.W. Applied Functional Data Analysis. Methods and Case Studies; Springer Series in Statistics; Springer: New York, NY, USA, 2002; pp. x+190.
  55. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis. Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006; pp. xx+258.
  56. Araujo, A.; Giné, E. The Central Limit Theorem for Real and Banach Valued Random Variables; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons: New York, NY, USA, 1980; pp. xiv+233.
  57. Gasser, T.; Hall, P.; Presnell, B. Nonparametric estimation of the mode of a distribution of random curves. J. R. Stat. Soc. Ser. B Stat. Methodol. 1998, 60, 681–691.
  58. Bosq, D. Linear Processes in Function Spaces. Theory and Applications; Lecture Notes in Statistics; Springer: New York, NY, USA, 2000; Volume 149, pp. xiv+283.
  59. Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer Series in Statistics; Springer: New York, NY, USA, 2012; pp. xiv+422.
  60. Ling, N.; Vieu, P. Nonparametric modelling for functional data: Selected survey and tracks for future. Statistics 2018, 52, 934–949.
  61. Ferraty, F.; Laksaci, A.; Tadj, A.; Vieu, P. Rate of uniform consistency for nonparametric estimates with functional variables. J. Statist. Plann. Inference 2010, 140, 335–352.
  62. Bouzebda, S.; Chaouch, M. Uniform limit theorems for a class of conditional Z-estimators when covariates are functions. J. Multivar. Anal. 2022, 189, 104872.
  63. Kara-Zaitri, L.; Laksaci, A.; Rachdi, M.; Vieu, P. Uniform in bandwidth consistency for various kernel estimators involving functional data. J. Nonparametr. Stat. 2017, 29, 85–107.
  64. Attouch, M.; Laksaci, A.; Rafaa, F. On the local linear estimate for functional regression: Uniform in bandwidth consistency. Comm. Statist. Theory Methods 2019, 48, 1836–1853.
  65. Ling, N.; Meng, S.; Vieu, P. Uniform consistency rate of kNN regression estimation for functional time series data. J. Nonparametr. Stat. 2019, 31, 451–468.
  66. Bouzebda, S.; Chaouch, M.; Laïb, N. Limiting law results for a class of conditional mode estimates for functional stationary ergodic data. Math. Methods Statist. 2016, 25, 168–195.
  67. Mohammedi, M.; Bouzebda, S.; Laksaci, A. The consistency and asymptotic normality of the kernel type expectile regression estimator for functional data. J. Multivar. Anal. 2021, 181, 104673.
  68. Bouzebda, S.; Mohammedi, M.; Laksaci, A. The k-Nearest Neighbors method in single index regression model for functional quasi-associated time series data. Rev. Mat. Complut. 2022, 1–30.
  69. Bouzebda, S.; Nezzal, A. Uniform consistency and uniform in number of neighbors consistency for nonparametric regression estimates and conditional U-statistics involving functional data. Jpn. J. Stat. Data Sci. 2022, 5, 431–533.
  70. Didi, S.; Al Harby, A.; Bouzebda, S. Wavelet Density and Regression Estimators for Functional Stationary and Ergodic Data: Discrete Time. Mathematics 2022, 10, 3433.
  71. Almanjahie, I.M.; Bouzebda, S.; Kaid, Z.; Laksaci, A. Nonparametric estimation of expectile regression in functional dependent data. J. Nonparametr. Stat. 2022, 34, 250–281.
  72. Almanjahie, I.M.; Bouzebda, S.; Chikr Elmezouar, Z.; Laksaci, A. The functional kNN estimator of the conditional expectile: Uniform consistency in number of neighbors. Stat. Risk Model. 2022, 38, 47–63.
  73. Stute, W. Conditional U-statistics. Ann. Probab. 1991, 19, 812–825.
  74. Sen, A. Uniform strong consistency rates for conditional U-statistics. Sankhyā Ser. A 1994, 56, 179–194.
  75. Prakasa Rao, B.L.S.; Sen, A. Limit distributions of conditional U-statistics. J. Theoret. Probab. 1995, 8, 261–301.
  76. Harel, M.; Puri, M.L. Conditional U-statistics for dependent random variables. J. Multivar. Anal. 1996, 57, 84–100.
  77. Stute, W. Symmetrized NN-conditional U-statistics. In Research Developments in Probability and Statistics; VSP: Utrecht, The Netherlands, 1996; pp. 231–237.
  78. Fu, K.A. An application of U-statistics to nonparametric functional data analysis. Comm. Statist. Theory Methods 2012, 41, 1532–1542.
  79. Bouzebda, S.; Nemouchi, B. Uniform consistency and uniform in bandwidth consistency for nonparametric regression estimates and conditional U-statistics involving functional data. J. Nonparametr. Stat. 2020, 32, 452–509.
  80. Bouzebda, S.; Elhattab, I.; Nemouchi, B. On the uniform-in-bandwidth consistency of the general conditional U-statistics based on the copula representation. J. Nonparametr. Stat. 2021, 33, 321–358.
  81. Jadhav, S.; Ma, S. Kendall’s Tau for Functional Data Analysis. arXiv 2019, arXiv:1912.03725.
  82. Arcones, M.A.; Yu, B. Central limit theorems for empirical and U-processes of stationary mixing sequences. J. Theoret. Probab. 1994, 7, 47–71.
  83. Bouzebda, S.; Nemouchi, B. Central limit theorems for conditional empirical and conditional U-processes of stationary mixing sequences. Math. Methods Statist. 2019, 28, 169–207.
  84. Masry, E. Nonparametric regression estimation for dependent functional data: Asymptotic normality. Stoch. Process. Appl. 2005, 115, 155–177.
  85. Kurisu, D. Nonparametric regression for locally stationary random fields under stochastic sampling design. Bernoulli 2022, 28, 1250–1275.
  86. Kurisu, D. Nonparametric regression for locally stationary functional time series. Electron. J. Stat. 2022, 16, 3973–3995.
  87. Kurisu, D.; Kato, K.; Shao, X. Gaussian approximation and spatially dependent wild bootstrap for high-dimensional spatial data. arXiv 2021, arXiv:2103.10720.
  88. Dahlhaus, R. Fitting time series models to nonstationary processes. Ann. Statist. 1997, 25, 1–37.
  89. Dahlhaus, R.; Subba Rao, S. Statistical inference for time-varying ARCH processes. Ann. Statist. 2006, 34, 1075–1114.
  90. van Delft, A.; Eichler, M. Locally stationary functional time series. Electron. J. Stat. 2018, 12, 107–170.
  91. Hall, P.; Patil, P. Properties of nonparametric estimators of autocovariance for stationary random fields. Probab. Theory Related Fields 1994, 99, 399–424.
  92. Matsuda, Y.; Yajima, Y. Fourier analysis of irregularly spaced data on ℝd. J. R. Stat. Soc. Ser. B Stat. Methodol. 2009, 71, 191–217.
  93. Lahiri, S.N. Central limit theorems for weighted sums of a spatial process under a class of stochastic and fixed designs. Sankhyā 2003, 65, 356–388.
  94. Lahiri, S.N. Resampling Methods for Dependent Data; Springer Series in Statistics; Springer: New York, NY, USA, 2003; pp. xiv+374.
  95. Volkonskiĭ, V.A.; Rozanov, Y.A. Some limit theorems for random functions. I. Theor. Probability Appl. 1959, 4, 178–197.
  96. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 1956, 27, 832–837.
  97. Ibragimov, I.A.; Solev, V.N. A condition for the regularity of a Gaussian stationary process. Dokl. Akad. Nauk SSSR 1969, 185, 509–512.
  98. Bradley, R.C. A caution on mixing conditions for random fields. Statist. Probab. Lett. 1989, 8, 489–491.
  99. Bradley, R.C. Some examples of mixing random fields. Rocky Mountain J. Math. 1993, 23, 495–519.
  100. Doukhan, P. Mixing. Properties and Examples; Lecture Notes in Statistics; Springer: New York, NY, USA, 1994; Volume 85, pp. xii+142.
  101. Dedecker, J.; Doukhan, P.; Lang, G.; León, R.; Louhichi, S.; Prieur, C. Weak Dependence: With Examples and Applications; Lecture Notes in Statistics; Springer: New York, NY, USA, 2007; Volume 190, pp. xiv+318.
  102. Lahiri, S.N.; Zhu, J. Resampling methods for spatial regression models under a class of stochastic designs. Ann. Statist. 2006, 34, 1774–1813.
  103. Bandyopadhyay, S.; Lahiri, S.N.; Nordman, D.J. A frequency domain empirical likelihood method for irregularly spaced spatial data. Ann. Statist. 2015, 43, 519–545.
  104. Kolmogorov, A.N.; Tihomirov, V.M. ε-entropy and ε-capacity of sets in functional space. Amer. Math. Soc. Transl. 1961, 17, 277–364.
  105. Dudley, R.M. The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. J. Funct. Anal. 1967, 1, 290–330.
  106. Dudley, R.M. Uniform Central Limit Theorems; Cambridge Studies in Advanced Mathematics; Cambridge University Press: Cambridge, UK, 1999; Volume 63, pp. xiv+436.
  107. van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes. With Applications to Statistics; Springer Series in Statistics; Springer: New York, NY, USA, 1996; pp. xvi+508.
  108. Kosorok, M.R. Introduction to Empirical Processes and Semiparametric Inference; Springer Series in Statistics; Springer: New York, NY, USA, 2008; pp. xiv+483.
  109. Deheuvels, P. One bootstrap suffices to generate sharp uniform bounds in functional estimation. Kybernetika 2011, 47, 855–865.
  110. Vogt, M. Nonparametric regression for locally stationary time series. Ann. Statist. 2012, 40, 2601–2633.
  111. Mayer-Wolf, E.; Zeitouni, O. The probability of small Gaussian ellipsoids and associated conditional moments. Ann. Probab. 1993, 21, 14–24.
  112. Bogachev, V.I. Gaussian Measures; Mathematical Surveys and Monographs; American Mathematical Society: Providence, RI, USA, 1998; Volume 62, pp. xii+433.
  113. Li, W.V.; Shao, Q.M. Gaussian processes: Inequalities, small ball probabilities and applications. In Stochastic Processes: Theory and Methods; Handbook of Statist: Amsterdam, The Netherlands, 2001; Volume 19, pp. 533–597.
  114. Ferraty, F.; Mas, A.; Vieu, P. Nonparametric regression on functional data: Inference and practical aspects. Aust. N. Z. J. Stat. 2007, 49, 267–286.
  115. Lahiri, S.N.; Kaiser, M.S.; Cressie, N.; Hsu, N.J. Prediction of spatial cumulative distribution functions using subsampling. J. Amer. Statist. Assoc. 1999, 94, 86–110.
  116. van der Vaart, A.W. Asymptotic Statistics; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 1998; Volume 3, pp. xvi+443.
  117. Mason, D.M. Proving consistency of non-standard kernel estimators. Stat. Inference Stoch. Process. 2012, 15, 151–176.
  118. Bellet, A.; Habrard, A.; Sebban, M. A Survey on Metric Learning for Feature Vectors and Structured Data. arXiv 2013, arXiv:1306.6709.
  119. Clémençon, S.; Colin, I.; Bellet, A. Scaling-up empirical risk minimization: Optimization of incomplete U-statistics. J. Mach. Learn. Res. 2016, 17, 76.
  120. Jin, R.; Wang, S.; Zhou, Y. Regularized Distance Metric Learning: Theory and Algorithm. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 7–10 December 2009; Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C., Culotta, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2009; Volume 22.
  121. Bellet, A.; Habrard, A. Robustness and generalization for metric learning. Neurocomputing 2015, 151, 259–267.
  122. Cao, Q.; Guo, Z.C.; Ying, Y. Generalization bounds for metric and similarity learning. Mach. Learn. 2016, 102, 115–132.
  123. Clémençon, S.; Robbiano, S. The TreeRank Tournament algorithm for multipartite ranking. J. Nonparametr. Stat. 2015, 27, 107–126.
  124. Clémençon, S.; Robbiano, S.; Vayatis, N. Ranking data with ordinal labels: Optimality and pairwise aggregation. Mach. Learn. 2013, 91, 67–104.
  125. Dudley, R.M. A course on empirical processes. In École d’été de Probabilités de Saint-Flour, XII—1982; Lecture Notes in Math; Springer: Berlin, Germany, 1984; Volume 1097, pp. 1–142.
  126. Polonik, W.; Yao, Q. Set-indexed conditional empirical and quantile processes based on dependent data. J. Multivar. Anal. 2002, 80, 234–255.
  127. Stute, W. Universally consistent conditional U-statistics. Ann. Statist. 1994, 22, 460–473.
  128. Stute, W. Lp-convergence of conditional U-statistics. J. Multivar. Anal. 1994, 51, 71–82.
  129. Maillot, B.; Viallon, V. Uniform limit laws of the logarithm for nonparametric estimators of the regression function in presence of censored data. Math. Methods Statist. 2009, 18, 159–184.
  130. Kohler, M.; Máthé, K.; Pintér, M. Prediction from randomly right censored data. J. Multivar. Anal. 2002, 80, 73–100.
  131. Carbonez, A.; Györfi, L.; van der Meulen, E.C. Partitioning-estimates of a regression function under random censoring. Statist. Decis. 1995, 13, 21–37.
  132. Brunel, E.; Comte, F. Adaptive nonparametric regression estimation in presence of right censoring. Math. Methods Statist. 2006, 15, 233–255.
  133. Kaplan, E.L.; Meier, P. Nonparametric estimation from incomplete observations. J. Am. Statist. Assoc. 1958, 53, 457–481.
  134. Bouzebda, S.; El-hadjali, T. Uniform convergence rate of the kernel regression estimator adaptive to intrinsic dimension in presence of censored data. J. Nonparametr. Stat. 2020, 32, 864–914.
  135. Datta, S.; Bandyopadhyay, D.; Satten, G.A. Inverse probability of censoring weighted U-statistics for right-censored data with an application to testing hypotheses. Scand. J. Stat. 2010, 37, 680–700.
  136. Stute, W.; Wang, J.L. Multi-sample U-statistics for censored data. Scand. J. Statist. 1993, 20, 369–374.
  137. Chen, Y.; Datta, S. Adjustments of multi-sample U-statistics to right censored data and confounding covariates. Comput. Statist. Data Anal. 2019, 135, 1–14.
  138. Yuan, A.; Giurcanu, M.; Luta, G.; Tan, M.T. U-statistics with conditional kernels for incomplete data models. Ann. Inst. Statist. Math. 2017, 69, 271–302.
  139. Földes, A.; Rejto, L. A LIL type result for the product limit estimator. Z. Wahrsch. Verw. Gebiete 1981, 56, 75–86.
  140. Bouzebda, S.; El-hadjali, T.; Ferfache, A.A. Uniform in bandwidth consistency of conditional U-statistics adaptive to intrinsic dimension in presence of censored data. Sankhya A 2022, 1–59.
  141. Hall, P. Asymptotic properties of integrated square error and cross-validation for kernel estimation of a regression function. Z. Wahrsch. Verw. Gebiete 1984, 67, 175–196.
  142. Härdle, W.; Marron, J.S. Optimal bandwidth selection in nonparametric regression function estimation. Ann. Statist. 1985, 13, 1465–1481.
  143. Rachdi, M.; Vieu, P. Nonparametric regression for functional data: Automatic smoothing parameter selection. J. Statist. Plann. Inference 2007, 137, 2784–2801.
  144. Benhenni, K.; Ferraty, F.; Rachdi, M.; Vieu, P. Local smoothing regression with functional data. Comput. Statist. 2007, 22, 353–369.
  145. Shang, H.L. Bayesian bandwidth estimation for a functional nonparametric regression model with mixed types of regressors and unknown error density. J. Nonparametr. Stat. 2014, 26, 599–615.
  146. Li, Q.; Maasoumi, E.; Racine, J.S. A nonparametric test for equality of distributions with mixed categorical and continuous data. J. Econom. 2009, 148, 186–200.
  147. Horowitz, J.L.; Spokoiny, V.G. An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 2001, 69, 599–631.
  148. Gao, J.; Gijbels, I. Bandwidth selection in nonparametric kernel testing. J. Am. Statist. Assoc. 2008, 103, 1584–1594.
  149. Yu, B. Rates of convergence for empirical processes of stationary mixing sequences. Ann. Probab. 1994, 22, 94–116.
  150. Bernstein, S. Sur l’extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes. Math. Ann. 1927, 97, 1–59.
  151. Giné, E.; Zinn, J. Some limit theorems for empirical processes. Ann. Probab. 1984, 12, 929–998.
  152. Bouzebda, S.; Soukarieh, I. Weak Convergence of the Conditional U-statistics for Locally Stationary Functional Time Series. Stat. Inference Stoch. Process. 2022.
  153. Masry, E. Multivariate local polynomial regression for time series: Uniform strong consistency and rates. J. Time Ser. Anal. 1996, 17, 571–599.
  154. Rio, E. Inequalities and Limit Theorems for Weakly Dependent Sequences. Available online: https://cel.hal.science/cel-00867106/ (accessed on 20 October 2022).
  155. de la Peña, V.H. Decoupling and Khintchine’s inequalities for U-statistics. Ann. Probab. 1992, 20, 1877–1892.