Nonparametric Expectile Shortfall Regression for Complex Functional Structure

Alamari, Mohammed B.; Almulhim, Fatimah A.; Kaid, Zoulikha; Laksaci, Ali

doi:10.3390/e26090798


¹

Department of Mathematics, College of Science, King Khalid University, Abha 62529, Saudi Arabia

²

Department of Mathematical Sciences, College of Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia

^*

Author to whom correspondence should be addressed.

Entropy 2024, 26(9), 798; https://doi.org/10.3390/e26090798

Submission received: 23 August 2024 / Revised: 12 September 2024 / Accepted: 17 September 2024 / Published: 18 September 2024

(This article belongs to the Section Complexity)


Abstract

This paper treats the problem of risk management through a new conditional expected shortfall function. The new risk metric is defined by using the expectile as the shortfall threshold. A nonparametric estimator based on the Nadaraya–Watson approach is constructed. The asymptotic property of the constructed estimator is established under a functional time-series structure. We adopt some concentration inequalities to fit this complex structure and to precisely determine the convergence rate of the estimator. The easy implementation of the new risk metric is shown through real and simulated data. Specifically, we show the feasibility of the new model as a risk tool by examining its sensitivity to the fluctuation in financial time-series data. Finally, a comparative study between the new shortfall and the standard one is conducted using real data.

Keywords:

financial risk; complete consistency; expected shortfall; functional data; kernel method; expectile regression; quantile regression

1. Introduction

With the rapid progress of computer and data science, the statistical modeling of complex and unstructured data is becoming indispensable. In this context, functional statistics constitutes a good mathematical tool for this situation. In this contribution, we aim to develop a new metric for risk management. More precisely, instead of the standard expected shortfall, we define a new expected shortfall regression (ESR) using the conditional expectile. The use of the expectile instead of the quantile in the ESR is motivated by the principal feature of the expectile, namely its high sensitivity to outliers and to extreme risk.

From a historical point of view, the expectile function was introduced by [1]. It constitutes a good alternative to the quantile. In financial time-series analysis, the use of the expectile instead of the VaR function is motivated by its high sensitivity to outliers, which increases its ability to capture financial risk. The expectile function has thus gained popularity in risk analysis and management. For more motivation on this model in financial risk management, we refer the reader to [2,3,4,5]. Furthermore, the expectile function has been used for other statistical modeling tasks, including outlier testing (see [6]) and heteroscedasticity detection (see, for instance, [7,8]). Concerning the use of the expectile in regression analysis, we cite [9] for the multivariate case and [10] for the functional case. The authors of the latter work obtained the asymptotic properties of the nonparametric estimator of the expectile regression with a functional covariate. Alternatively, the functional version of the parametric expectile regression was studied by [11]. They used the reproducing kernel Hilbert space structure to construct their estimator and obtained asymptotic upper and lower bounds of the convergence rate. In parallel, the shortfall function was introduced by [12]. The use of this risk metric in financial time-series data is motivated by its coherency property. We refer to [12] for a comparative study between the Value at Risk (VaR) and the expected shortfall (ES) model in financial time-series analysis. The authors of that paper proved that the VaR is unusable when the profit–loss is not Gaussian. From an analytical point of view, the ES model can be estimated in multiple ways, including parametric, semiparametric, and nonparametric algorithms. Parametric approaches were used by [13,14,15]. Meanwhile, the first results on nonparametric techniques were obtained by [16], who used the kernel method to construct an estimator of the ES model. Ref. [17] established the asymptotic distribution of the kernel estimator of the ES model. Using the Bahadur representation, the authors of [18] constructed an alternative estimator of the ES model. Ref. [19] studied the functional version of the Nadaraya–Watson estimator of the conditional ES model (CESM) under the mixing assumption and proved that the constructed estimator converges almost completely. An alternative functional time-series setting was developed by [20]. In particular, they obtained the almost complete consistency of the kernel estimator of the CESM under quasi-associated dependency. While in all the previously cited works the loss threshold of the expected shortfall is defined by the VaR level, in this paper we introduce an alternative risk threshold, namely the expectile regression.

As mentioned above, the main aim of the present contribution is to develop a new risk metric based on the expectile regression. Specifically, we define the expected shortfall with respect to the tail expectile. Such a new risk metric combines the advantages of both functions. Indeed, it is well known that the expectile is an elicitable and coherent risk metric. Moreover, it is very sensitive to the magnitude of the lower tail, unlike the VaR, which is not influenced by outliers. In parallel, the ES model fulfills the condition of spectral risk measures (see [21]). Thus, the ES model based on a tail expectile significantly improves risk management. The particularity of the present contribution is the treatment of this model under a functional time-series structure. Thus, the principal achievement of this paper is the construction of a computable kernel estimator and the study of its asymptotic property under the mixing assumption. It should be noted that the functional time-series case is more realistic than the case of independent functional data. The practical implementation of this risk metric is evaluated using artificial and real data. To the best of our knowledge, no attempt has been made so far to estimate the functional ES regression based on the tail expectile. We refer to [22,23,24,25,26,27] for more recent advances in functional time-series analysis (FTSA).

The paper is organized as follows. We present our risk metric, its estimator, and the functional time-series framework in Section 2. The almost complete convergence of the constructed estimator is stated in Section 3. Section 4 is devoted to the computational behavior of the estimator on artificial and real data. Section 5 concludes, and the proofs of the auxiliary results are given in Section 6.

2. Model and Estimator

Let $(X_{1}, Y_{1}), \dots, (X_{n}, Y_{n})$ be n random pairs in $\mathcal{F} \times \mathbb{R}$ that are identically distributed as $(X, Y)$. Moreover, we suppose that a regular version of the conditional distribution of Y given X exists. The standard ES regression is defined through the tail quantile as, for $z \in \mathcal{F}$,

$RES_{p} (z) = \mathbb{E} [Y \mid Y > RVaR_{p} (z), X = z], \quad p \in (0, 1),$

where $RVaR_{p}$ is the quantile regression of order $1 - p$. As an alternative to this tail quantile, we introduce the ES regression based on the tail expectile, which is defined, for $z \in \mathcal{F}$, by

$REX_{p} (z) = \mathbb{E} [Y \mid Y > REXP_{p} (z), X = z], \quad p \in (0, 1),$

where $REXP_{p}$ is the expectile regression. The latter is defined by

$REXP_{p} (z) = \arg\min_{t \in \mathbb{R}} \{\mathbb{E} [p (Y - t)^{2} 𝟙_{\{Y - t > 0\}} \mid X = z] + \mathbb{E} [(1 - p) (Y - t)^{2} 𝟙_{\{Y - t \leq 0\}} \mid X = z]\}, \quad p \in (0, 1),$

where $𝟙_{A}$ is the indicator function of the set A.
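To make the expectile definition concrete, here is a minimal Python sketch of its unconditional sample analogue: the minimiser of the asymmetric squared loss with weight p on observations above t and 1 - p on those below. The function name `expectile` and the iteratively reweighted averaging scheme are illustrative choices, not taken from the paper. The scheme relies on the first-order condition of the minimisation, which makes the solution a weighted mean with asymmetric weights.

```python
def expectile(values, p, weights=None, tol=1e-10, max_iter=200):
    """Weighted sample expectile (illustrative sketch).

    Convention of the text: weight p on (y - t)^2 when y > t,
    weight 1 - p when y <= t.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if weights is None:
        weights = [1.0] * len(values)
    # Start from the weighted mean, which is the expectile for p = 0.5.
    t = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    for _ in range(max_iter):
        # Asymmetric weights given the current guess t.
        a = [w * (p if y > t else 1.0 - p) for w, y in zip(weights, values)]
        # First-order condition: t is the a-weighted mean of the values.
        t_new = sum(ai * y for ai, y in zip(a, values)) / sum(a)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t
```

For p = 0.5 the weights are symmetric and the function returns the sample mean. Larger p gives more weight to observations above the threshold, so the returned value moves toward the upper tail.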

It is worth noting that replacing $RVaR_{p}$ with $REXP_{p}$ remedies the insensitivity of $RVaR_{p}$ to extreme values. This characteristic is very important in practice because catastrophic losses are located at the extreme values.

Now, for the estimation step, we assume that $F (\cdot)$ is a known measurable function (the kernel), d is the semimetric on $\mathcal{F}$, and $r = r_{n}$ is a sequence of positive real numbers tending to zero as n tends to infinity. The estimation procedure involves two steps. In the first step, we estimate the expectile regression $REXP_{p}$ by the kernel estimator $\widehat{REXP}_{p}$, defined as the root of

$\hat{G} (\widehat{REXP}_{p} (z); z) = \frac{p}{1 - p}, \quad p \in (0, 1),$

with

$\hat{G} (t; z) = \frac{- \sum_{i = 1}^{n} F_{n i} (z) (Y_{i} - t) 𝟙_{\{Y_{i} - t \leq 0\}}}{\sum_{i = 1}^{n} F_{n i} (z) (Y_{i} - t) 𝟙_{\{Y_{i} - t > 0\}}}, \quad t \in \mathbb{R},$

where

$F_{n i} (z) = \frac{F_{i}}{\sum_{j = 1}^{n} F_{j}} \quad \text{and} \quad F_{i} = F [r^{- 1} d (z, X_{i})] .$

The second step is the estimation of the ES regression, which is naturally estimated by

$\widehat{REX}_{p} (z) = \frac{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})] Y_{i} 𝟙_{\{Y_{i} > \widehat{REXP}_{p} (z)\}}}{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})]}, \quad p \in (0, 1) .$

(1)
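The two-step procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: curves are stored as lists of values on a common grid, the semimetric is a discretised L2 distance, and the kernel is the indicator of [0, 1). All function names are hypothetical, and the paper's own experiments use a different kernel and a PCA metric. Step 1 solves the root equation through the equivalent first-order condition, using reweighted averaging.

```python
import math


def l2_distance(x, z):
    """Discretised L2 distance between two curves on a common grid (assumed semimetric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)) / len(x))


def uniform_kernel(u):
    """Assumed kernel: 1 on [0, 1) and 0 elsewhere."""
    return 1.0 if 0.0 <= u < 1.0 else 0.0


def kernel_weights(z, curves, r):
    """Unnormalised weights F_i = F(d(z, X_i) / r)."""
    return [uniform_kernel(l2_distance(z, x) / r) for x in curves]


def expectile_step(ys, ws, p, tol=1e-10, max_iter=200):
    """Step 1: solve G(t; z) = p / (1 - p) by iteratively reweighted averaging."""
    t = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
    for _ in range(max_iter):
        a = [w * (p if y > t else 1.0 - p) for w, y in zip(ws, ys)]
        t_new = sum(ai * y for ai, y in zip(a, ys)) / sum(a)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t


def expectile_shortfall(z, curves, ys, p, r):
    """Step 2: kernel average of Y_i restricted to Y_i above the fitted expectile."""
    ws = kernel_weights(z, curves, r)
    if sum(ws) == 0.0:
        raise ValueError("no curve within bandwidth r of z")
    threshold = expectile_step(ys, ws, p)
    shortfall = sum(w * y for w, y in zip(ws, ys) if y > threshold) / sum(ws)
    return threshold, shortfall
```

The normalisation in $F_{n i} (z)$ cancels in the ratio defining $\hat{G}$, which is why the sketch works with unnormalised weights.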

The main purpose of the theoretical part of this work is to establish the almost complete consistency of the estimator $\widehat{REX}_{p} (\cdot)$ to $REX_{p} (\cdot)$ for strongly mixing functional time-series data. For the reader not familiar with functional time-series analysis, we devote the rest of this section to recalling the definition of the strong mixing property, which requires the following notation. Firstly, let $\{Z_{i}, i = 1, 2, \dots\}$ be a strictly stationary sequence of random variables, and denote by $S_{i}^{k} (Z)$ the $σ$-algebra generated by $\{Z_{j}, i \leq j \leq k\}$. Secondly, for a positive integer n, we define

$α (n) = \sup \{| \mathbb{P} (A \cap B) - \mathbb{P} (A) \mathbb{P} (B) | : A \in S_{1}^{k} (Z), \ B \in S_{k + n}^{\infty} (Z), \ k \text{ a strictly positive integer}\} .$

The sequence $\{Z_{i}, i = 1, 2, \dots\}$ is said to be $α$-mixing (strongly mixing) if the mixing coefficient $α (n) \to 0$ as $n \to \infty$.

It is well documented that this condition is verified by many processes including the usual ARMA processes (with innovations satisfying some existing moment conditions) (see [28]), the EXPAR models (see [29]), the ARCH models (see [30]), and the GARCH model (see [31]), among others.

3. Main Asymptotic Result

Before stating the asymptotic properties of the estimator $\widehat{REX}_{p}$, we need to introduce some notation and assumptions. Firstly, $C_{z}$ and $C_{z}^{'}$ denote strictly positive generic constants, $N_{z}$ is a given neighborhood of $z$, and, for all $t \in \mathbb{R}$, we define $ES (t, z) = \mathbb{E} [Y 𝟙_{\{Y > t\}} \mid X = z]$. Now, to formulate our main results, we will use the hypotheses listed below:

(P1): $\mathbb{P} (X \in B (z, r)) = ϕ (z, r) > 0$, where $B (z, r) = \{z^{'} \in \mathcal{F} : d (z^{'}, z) < r\}$ .
(P2): $\exists δ > 0, \forall (t_{1}, t_{2}) \in [REXP_{p} (z) - δ, REXP_{p} (z) + δ]^{2}$ , $\forall (z_{1}, z_{2}) \in N_{z}^{2}$ ,

$| ES (t_{1}, z_{1}) - ES (t_{2}, z_{2}) | \leq C_{z} (d^{b} (z_{1}, z_{2}) + | t_{1} - t_{2} |), \quad b > 0 .$
(P3): The sequence $(X_{i}, Y_{i})_{i \in \mathbb{N}}$ is a strongly mixing process with coefficient $α (n)$ satisfying $\exists a > 2, \exists c > 0 : \forall n \in \mathbb{N}, α (n) \leq c n^{- a}$ and

$\begin{cases} \forall i \neq j, \ \forall t \in [θ_{x} - δ, θ_{x} + δ], \ \mathbb{E} [Y_{i} Y_{j} \mid X_{i}, X_{j}] \leq C < \infty, \\ \mathbb{P} ((X_{i}, X_{j}) \in B (z, r) \times B (z, r)) = φ (z, r) > 0, \\ \mathbb{E} [|Y|^{2} \mid X] < C < \infty \ \text{and} \ \mathbb{E} [|Y|^{p}] < C < \infty, \ p > 1 . \end{cases}$
(P4): $F$ is a function with support $(0, 1)$ such that

$0 < C 𝟙_{(0, 1)} (t) < F (t) < C^{'} 𝟙_{(0, 1)} (t) < \infty .$
(P5): There exists a sequence of positive real numbers $γ_{n}$ and $η > 0$ such that

$\begin{cases} \sum_{n} n^{3 p / 2} γ_{n}^{- p} < \infty, \\ \sum_{n} n^{\frac{2 - a}{a} - η} (χ (z, r))^{- (a + 4)} < \infty, \end{cases}$

where $χ (z, r) = \max (φ (z, r), ϕ^{2} (z, r))$ .

Comments on the hypotheses.

All these conditions are standard in functional time-series analysis (FTSA). In particular, condition (P1) holds for several continuous-time processes. We refer to [32] for a general Gaussian process viewed as an element of the functional space $L^{2}$. This condition relates the functional structure of the data to the probability measure of the random variable, measuring the concentration of the probability measure of X over a topological ball constructed from the semimetric d. The function $ϕ$ is thus affected by two principal factors: the probability measure and the semimetric d. Such an assumption can be viewed as a generalization of the multivariate case ($X \in \mathbb{R}^{k}$), where $ϕ (z, r) = f_{X} (z) r^{k} + o (r^{k})$ and $f_{X}$ is the density of X. In this situation, the function $ϕ$ is positive as long as the density $f_{X}$ is strictly positive. A mild regularity condition (P2) is assumed on the function $ES (\cdot, \cdot)$. Such a condition characterizes the nonparametric nature of the studied model and is used to evaluate the bias term of the estimator. Condition (P3) defines the mixing structure of our FTSA framework. The first part of this condition allows for obtaining a convergence rate comparable to that of the independent case. In conclusion, the assumed conditions (P1)–(P4) are sufficiently weak to obtain an improved convergence rate, comparable to the ideal situation of the independent case. Of course, these assumptions can be weakened if we are interested only in the consistency of the estimator without a convergence rate.

Now, we state the following results.

Theorem 1.

Under assumptions (P1)–(P5), we have

$|\widehat{REX}_{p} (z) - REX_{p} (z)| = O (r^{b} + \sqrt{\frac{χ^{1 / 2} (z, r) \ln n}{n ϕ^{2} (z, r)}}), \quad a.co.$

(2)

Proof of Theorem 1

For $t \in \mathbb{R}$, we define

$\widehat{ES} (t, z) = \frac{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})] Y_{i} 𝟙_{\{Y_{i} > t\}}}{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})]} .$

So,

$\widehat{ES} (\widehat{REXP}_{p} (z), z) = \widehat{REX}_{p} (z) \quad \text{and} \quad ES (REXP_{p} (z), z) = REX_{p} (z) .$

Then,

$\widehat{REX}_{p} (z) - REX_{p} (z) = [\widehat{ES} (\widehat{REXP}_{p} (z), z) - ES (\widehat{REXP}_{p} (z), z)] + [ES (\widehat{REXP}_{p} (z), z) - ES (REXP_{p} (z), z)] .$

Thus, as soon as $\widehat{REXP}_{p} (z)$ belongs to $[REXP_{p} (z) - δ, REXP_{p} (z) + δ]$, assumption (P2) gives

$| \widehat{REX}_{p} (z) - REX_{p} (z) | \leq \sup_{t \in [REXP_{p} (z) - δ, REXP_{p} (z) + δ]} | \widehat{ES} (t, z) - ES (t, z) | + C | \widehat{REXP}_{p} (z) - REXP_{p} (z) | .$

So, Theorem 1 follows from

$\sup_{t \in [REXP_{p} (z) - δ, REXP_{p} (z) + δ]} | \widehat{ES} (t, z) - ES (t, z) | = O (r^{b} + \sqrt{\frac{χ^{1 / 2} (z, r) \ln n}{n ϕ^{2} (z, r)}}), \quad a.co.$

(3)

and

$| \widehat{REXP}_{p} (z) - REXP_{p} (z) | = O (r^{b} + \sqrt{\frac{χ^{1 / 2} (z, r) \ln n}{n ϕ^{2} (z, r)}}), \quad a.co.$

(4)

The result (4) is proved in [10]. So, we concentrate only on (3). We write $\widehat{ES} (t, z) = \widehat{ES}_{N} (t, z) / \widehat{ES}_{D} (z)$, with

$\widehat{ES}_{N} (t, z) = \frac{1}{n \mathbb{E} [F_{1}]} \sum_{i = 1}^{n} F_{i} Y_{i} 𝟙_{\{Y_{i} > t\}} \quad \text{and} \quad \widehat{ES}_{D} (z) = \frac{1}{n \mathbb{E} [F_{1}]} \sum_{i = 1}^{n} F_{i} .$

As $\mathbb{E} [\widehat{ES}_{D} (z)] = 1$, we have, for $t \in \mathbb{R}$,

$\widehat{ES} (t, z) - ES (t, z) = \frac{1}{\widehat{ES}_{D} (z)} [(\widehat{ES}_{N} (t, z) - \mathbb{E} [\widehat{ES}_{N} (t, z)]) - (ES (t, z) - \mathbb{E} [\widehat{ES}_{N} (t, z)])] - \frac{ES (t, z)}{\widehat{ES}_{D} (z)} [\widehat{ES}_{D} (z) - \mathbb{E} [\widehat{ES}_{D} (z)]] .$

The proof is completed through Lemmas 1–3. □
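The decomposition behind (3) can be checked in two lines. The sketch below assumes the usual normalised numerator $\widehat{ES}_{N} (t, z) = (n \mathbb{E} [F_{1}])^{- 1} \sum_{i} F_{i} Y_{i} 𝟙_{\{Y_{i} > t\}}$, which is not spelled out in the text at this point, together with $\widehat{ES}_{D} (z)$ as introduced in Section 6, so that $\widehat{ES} = \widehat{ES}_{N} / \widehat{ES}_{D}$ and $\mathbb{E} [\widehat{ES}_{D} (z)] = 1$.

```latex
\begin{align*}
\widehat{ES}(t,z) - ES(t,z)
  &= \frac{\widehat{ES}_{N}(t,z) - ES(t,z)\,\widehat{ES}_{D}(z)}{\widehat{ES}_{D}(z)} \\
  &= \frac{1}{\widehat{ES}_{D}(z)}
     \Big[ \big(\widehat{ES}_{N}(t,z) - \mathbb{E}[\widehat{ES}_{N}(t,z)]\big)
         - \big(ES(t,z) - \mathbb{E}[\widehat{ES}_{N}(t,z)]\big) \Big]
     - \frac{ES(t,z)}{\widehat{ES}_{D}(z)}
       \big(\widehat{ES}_{D}(z) - 1\big).
\end{align*}
```

Read this way, the three brackets are controlled by Lemma 3 (dispersion of the numerator), Lemma 2 (bias), and Lemma 1 (dispersion of the denominator), while the second part of Lemma 1 keeps $1 / \widehat{ES}_{D} (z)$ bounded.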

Lemma 1.

Under assumptions (P1) and (P3)–(P5), we have

$\widehat{ES}_{D} (z) - \mathbb{E} [\widehat{ES}_{D} (z)] = O (\sqrt{\frac{χ^{1 / 2} (z, r) \ln n}{n ϕ^{2} (z, r)}}), \quad a.co.$

Moreover,

$\sum_{n} \mathbb{P} (\widehat{ES}_{D} (z) < \frac{1}{2}) < \infty .$

Lemma 2.

Under assumptions (P1)–(P2) and (P4)–(P5), we have

$\sup_{t \in [REXP_{p} (z) - δ, REXP_{p} (z) + δ]} |ES (t, z) - \mathbb{E} [\widehat{ES}_{N} (t, z)]| = O (r^{b}) .$

Lemma 3.

Under assumptions (P1)–(P5), we have

$\sup_{t \in [REXP_{p} (z) - δ, REXP_{p} (z) + δ]} |\widehat{ES}_{N} (t, z) - \mathbb{E} [\widehat{ES}_{N} (t, z)]| = O (\sqrt{\frac{χ^{1 / 2} (z, r) \ln n}{n ϕ^{2} (z, r)}}), \quad a.co.$

4. Empirical Analysis

This section examines the practical use of the model studied in this work. It is divided into three subsections. In the first subsection, we discuss the selection of the smoothing parameter, which is the pivotal parameter in our estimation and is therefore of primary importance for the computational aspect. After the smoothing parameter selection, we examine the usefulness of the estimator through two examples. The first one concerns artificial data, and the second one treats real environmental data on air quality in Beijing.

4.1. Smoothing Parameter Selection: Cross-Validation

As mentioned above, the choice of the smoothing parameter r is crucial in this nonparametric framework. As our estimation procedure is based on the expectile regression, a natural cross-validation (CV) rule is the mean squared error, which is standard in nonparametric functional data analysis:

$r_{CV_{opt}} = \arg\min_{r} \sum_{i = 1}^{n} (Y_{i} - \widehat{REXP}_{0.5} (z_{i}))^{2} .$

(5)

This rule is motivated by the fact that the conditional mean $\mathbb{E} [Y \mid X]$ coincides with $REXP_{p}$ for $p = 0.5$. The popularity of this approach comes from its easy implementation. However, we can employ a more accurate rule that generalizes (5). It is explicitly expressed by

$r_{opt} = \arg\min_{r} \sum_{i = 1}^{n} ρ (Y_{i} - \widehat{REXP}_{p} (z_{i}))^{2},$

(6)

where $ρ$ is the scoring function defining $REXP_{p}$. The main advantage of this last rule is its dependence on the level p, which is very beneficial in financial risk analysis. Indeed, in risk analysis, we are interested in the tail, which corresponds to small or large values of p. Observe that a challenging issue for the expected shortfall is the absence of a scoring function or backtesting measure. Thus, using the optimal smoothing parameter associated with the expectile regression is more adequate for the model $\widehat{REX}_{p}$. Moreover, this choice reduces the computational cost of the ES expectile regression.
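A bandwidth search in the spirit of rules (5) and (6) can be sketched as follows. Several points are assumptions made for the illustration rather than details given in the text: the fit at each curve is computed leave-one-out, the score is read as the asymmetric squared loss (weight p on positive residuals, 1 - p otherwise), the kernel is the indicator of [0, 1), and the distances between curves are supplied as a precomputed matrix `dist`. The function names are hypothetical.

```python
def weighted_expectile(ys, ws, p, tol=1e-10, max_iter=200):
    """Weighted expectile by iteratively reweighted averaging."""
    t = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
    for _ in range(max_iter):
        a = [w * (p if y > t else 1.0 - p) for w, y in zip(ws, ys)]
        t_new = sum(ai * y for ai, y in zip(a, ys)) / sum(a)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t


def expectile_score(u, p):
    """Asymmetric squared loss: p * u^2 if u > 0, (1 - p) * u^2 otherwise."""
    return (p if u > 0 else 1.0 - p) * u * u


def loo_cv_bandwidth(dist, ys, p, candidates):
    """Return (best_r, best_score) over the candidate bandwidths.

    dist[i][j] is the semimetric between curves i and j. A candidate is
    skipped when some curve has no neighbour within that bandwidth.
    """
    n = len(ys)
    best_r, best_score = None, float("inf")
    for r in candidates:
        total, valid = 0.0, True
        for i in range(n):
            # Leave-one-out uniform-kernel weights around curve i.
            ws = [1.0 if (j != i and dist[i][j] < r) else 0.0 for j in range(n)]
            if sum(ws) == 0.0:
                valid = False
                break
            fit = weighted_expectile(ys, ws, p)
            total += expectile_score(ys[i] - fit, p)
        if valid and total < best_score:
            best_r, best_score = r, total
    return best_r, best_score
```

With p = 0.5 the score is half the squared error, so the selected bandwidth coincides with the one given by the mean squared rule (5).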

4.2. Artificial Data

In this empirical analysis, we aim to examine the applicability of the constructed estimator $\widehat{REX}_{p}$ as well as its finite-sample behavior. The particularity of this work is the treatment of the dependent case. For this reason, we compare the behavior of the estimator $\widehat{REX}_{p} (z)$ over different levels of dependency. For this purpose, we generate artificial data from functional autoregressive processes. It is well documented that this kind of process has a strong mixing property and allows controlling the effects of this property on the efficiency of the constructed estimator. The functional autoregressive process is generated using the R-package version 4.3.1 freqdom.fda through the routine fts.rar. This routine, which relies on dynamic functional principal component analysis (see [33]), simulates the p-order functional autoregressive process in the finite-dimensional subspace spanned by given basis functions, such as a Fourier basis or a spline basis. In this artificial study, we employ this routine to generate a functional autoregressive process of order $p = 3$. The functional regressors $(X_{i})_{i}$ are generated as

$X_{i} = Ψ (X_{i - 1}) + ε_{i},$

where $Ψ$ is a kernel operator with kernel $ψ (\cdot, \cdot)$ and $ε_{i}$ is a white noise. So, under this consideration, the functional covariate is

$X_{i} (t) = \sum_{k = 1}^{p} \int_{0}^{1} ψ_{k} (t, s) X_{i - k} (s) d s + ε_{i} (t) .$

In practice, the kernel operator $Ψ_{k}$ is associated with the matrix $(ψ_{k i j})_{i j} = (\langle Ψ_{k} (v_{i}), v_{j} \rangle)_{i j}$. Thus, the dependency level is measured by the values of the coefficients $(ψ_{k i j})_{i j}$. In this empirical analysis, we put $ψ_{k i j} = \frac{k + i + j}{k^{2} + i^{2} + j^{2}}$. Moreover, in the routine fts.rar, the operator $Ψ$ is scaled with respect to its Hilbert–Schmidt norm. The normalization parameter is called op.norms. The dependency degree increases with the value of op.norms: a large value implies strong dependency, and vice versa. So, in order to examine the effect of the dependency on the estimation quality, we generate three levels of dependency (strong, medium and moderate correlation). Specifically, the strong dependency is obtained with op.norms = 0.999, while the moderate and the medium cases are, respectively, associated with op.norms = 0.48 and op.norms = 0.01. In Figure 1, we plot the generated samples of the different functional regressors.
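For readers who want to see the mechanism without the R routine, the following Python sketch simulates a functional autoregressive process of order 1 through its basis coefficients. It is a simplified stand-in and not the freqdom.fda implementation: it uses a single operator (k = 1) instead of order 3, Gaussian coefficient noise, and an orthonormal basis, so that the Hilbert–Schmidt norm of the operator equals the Frobenius norm of its matrix. The names `scaled_operator` and `simulate_far1` are hypothetical.

```python
import math
import random


def scaled_operator(d, op_norm, k=1):
    """Matrix with entries (k + i + j) / (k^2 + i^2 + j^2), rescaled so that
    its Frobenius (Hilbert-Schmidt) norm equals op_norm."""
    raw = [[(k + i + j) / (k * k + i * i + j * j) for j in range(1, d + 1)]
           for i in range(1, d + 1)]
    hs = math.sqrt(sum(v * v for row in raw for v in row))
    return [[op_norm * v / hs for v in row] for row in raw]


def simulate_far1(n, d, op_norm, burnin=50, seed=0):
    """Simulate n coefficient vectors of X_i = Psi(X_{i-1}) + eps_i in dimension d."""
    rng = random.Random(seed)
    psi = scaled_operator(d, op_norm)
    x = [0.0] * d
    sample = []
    for step in range(n + burnin):
        noise = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # One autoregressive step on the basis coefficients.
        x = [sum(psi[i][j] * x[j] for j in range(d)) + noise[i] for i in range(d)]
        if step >= burnin:
            sample.append(x)
    return sample
```

Each returned vector holds the coefficients of one curve in the chosen basis. Since the spectral norm is bounded by the Frobenius norm, any op_norm below 1 keeps the recursion stable, and values close to 1 produce strongly dependent curves.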

Next, the response variable is generated using the nonparametric regression formula

$Y_{i} = \int_{0}^{1} \sin (2 + X_{i}^{2} (t)) d t + \int_{0}^{1} \cos (2 + X_{i}^{2} (t)) d t + ϵ_{i},$

where $ϵ_{i}$ is a white noise independent of $X_{i}$. Obviously, the conditional distribution is determined by the distribution of the random variable $ϵ_{i}$. Thus, to cover different situations, we consider two types of white noise distributions. The first one is the normal distribution, which is light-tailed. The second one is the Lévy distribution, which is heavy-tailed. We point out that there are several definitions of light- and heavy-tailed distributions. However, the most common criterion to identify a heavy-tailed distribution is an infinite variance. Thus, the Lévy distribution is certainly heavy-tailed, and it has many applications in financial time-series analysis (see [34]). Furthermore, the choice of this distribution is also motivated by its stability under linear transformations: if U has a standard Lévy distribution $Levy (0, 1)$, then $a + b U$ is $Levy (a, b)$. For this reason, in both situations (normal or Lévy distribution), the true conditional expected shortfall can be explicitly obtained by shifting the distribution of the white noise. In order to highlight the importance of $\widehat{REX}_{p}$ in practice, we compare it to the VaR-based expected shortfall, which is estimated by

$\overline{VEX}_{p} (z) = \frac{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})] Y_{i} 𝟙_{\{Y_{i} > \overline{VaR}_{p} (z)\}}}{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})]} .$

Of course, the performance of $\overline{VEX}_{p}$ is strongly linked to the estimation of the function $\overline{VaR}$. Thus, in order to provide a more comprehensive comparison with $\widehat{REX}_{p}$, we calculate $\overline{VaR}$ with two alternative approaches: the kernel method,

$\widehat{VaR} (z) = \arg\min_{a} \frac{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})] Q_{p} (Y_{i} - a)}{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})]},$

and the local linear method,

$\widetilde{VaR} (z) = (\tilde{a}, \tilde{b}) (1, 0)^{t} = \tilde{a},$

where

$(\tilde{a}, \tilde{b}) = \arg\min_{a, b} \frac{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})] Q_{p} (Y_{i} - a - b d (X_{i}, z))}{\sum_{i = 1}^{n} F [r^{- 1} d (z, X_{i})]}$

and $Q_{p} (u) = (2 p - 1) u + | u |$. Furthermore, we denote by $\widehat{VEX}_{p}$ the estimator associated with the kernel estimator $\widehat{VaR}$, while $\widetilde{VEX}_{p}$ denotes the estimator associated with the local linear approach.
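The kernel VaR and the associated shortfall can be sketched as below, assuming the loss is read as $Q_{p} (u) = (2 p - 1) u + | u |$ applied to the residual $Y_{i} - a$ and that the kernel weights have already been computed. Because the weighted objective is convex and piecewise linear in a, its minimum is attained at one of the observed responses, so the sketch simply evaluates the objective at every observation with positive weight. The function names are hypothetical, and the local linear variant is not covered.

```python
def check_loss(u, p):
    """Q_p(u) = (2p - 1) u + |u|, i.e. twice the usual quantile check function."""
    return (2.0 * p - 1.0) * u + abs(u)


def kernel_var(ys, ws, p):
    """Minimise the kernel-weighted check loss over the observed responses."""
    candidates = sorted(y for y, w in zip(ys, ws) if w > 0)
    if not candidates:
        raise ValueError("all kernel weights are zero")

    def objective(a):
        return sum(w * check_loss(y - a, p) for y, w in zip(ys, ws))

    # With ties, min() keeps the smallest minimiser because candidates are sorted.
    return min(candidates, key=objective)


def var_based_es(ys, ws, p):
    """Kernel average of Y_i restricted to Y_i above the kernel VaR."""
    q = kernel_var(ys, ws, p)
    return sum(w * y for y, w in zip(ys, ws) if y > q) / sum(ws)
```

The exhaustive search costs O(n²) evaluations, which is acceptable for a few hundred curves. A sorted cumulative-weight scan would give the same minimiser faster.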

Clearly, the applicability of all these estimators $\widehat{REX}_{p}$, $\widehat{VEX}_{p}$ and $\widetilde{VEX}_{p}$ rests on the easy selection of the different parameters defining them. We choose the bandwidth parameter r, the semimetric d and the kernel $F$ according to the principal assumptions. For this empirical analysis, we select the optimal bandwidth by the mean squared cross-validation rule. The optimization is performed over a discrete set defined by the $k^{th}$ smallest distance from the location point $z$ to the sample curves, where k is an integer in $\{5, 10, 15, 20, 25, 30, \dots, 50\}$. For the kernel $F$, we choose the $β$-kernel on $(0, 1)$, which is adequate for this kind of nonparametric approach. We point out that this kernel satisfies the technical assumption (P4). On the other hand, the metric choice is closely related to the nature of the functional variable and its smoothness. It appears that the PCA metric is more suitable for this type of discontinuous functional regressor.

Finally, the performance of the estimators is evaluated by computing

$Mse = \frac{1}{n} \sum_{i = 1}^{n} (Mo (X_{i}) - \widehat{Mo} (X_{i}))^{2} 𝟙_{\{Y_{i} > \widehat{Mb}_{p} (X_{i})\}},$

where $Mo$ denotes either the expectile-based or the VaR-based shortfall, and $Mb$ denotes the corresponding threshold (the expectile or the VaR, respectively).
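The error criterion translates directly into code. In this hypothetical helper, `truth` and `estimate` hold the values of the true and estimated shortfall at each curve, and `thresholds` holds the fitted expectile or VaR at each curve. Only observations whose response exceeds its threshold contribute, while the sum is still divided by n as in the formula.

```python
def tail_mse(truth, estimate, ys, thresholds):
    """Mean squared error restricted to observations above their threshold."""
    n = len(ys)
    total = sum((m - e) ** 2
                for m, e, y, t in zip(truth, estimate, ys, thresholds)
                if y > t)
    return total / n
```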

So, for this comparative study, we report the estimation error $Mse$ in different situations (degrees of dependency and types of conditional distribution, heavy-tailed or light-tailed). The results are reported in Table 1.

It is clear that the efficiency of the three estimators is strongly affected by the different factors of this study, such as the dependency degree, the nature of the model, and the conditional distribution. In particular, the performance of the three estimators deteriorates as the level of dependency increases. On the other hand, the type of the conditional distribution also affects the behavior of the three estimators. However, this sensitivity is less pronounced for the VaR-based expected shortfalls $\widehat{VEX}_{p}$ and $\widetilde{VEX}_{p}$, since their variability is small compared to that of $\widehat{REX}_{p}$. This conclusion confirms the high sensitivity of the expectile to extreme values, which is very beneficial for risk management.

4.3. Real Data Application

This section is devoted to the applicability of our model to real time-series data. Specifically, we compare the efficiency of the new ES expectile model to the standard one using environmental time-series data. Although financial time series are the main area where the ES model is used, the environment is also an interesting applied domain of risk management. In fact, air quality has an important impact on the quality of life. In particular, it is well known that exposure to ground-level ozone for a period of more than 8 h affects the pulmonary functions as well as the tissues of the respiratory tract. Therefore, it is very important to control excessive levels of ozone concentration. In this context, extreme value theory has been employed to address this issue. We cite, for instance, [35] and the references therein, which use some financial tools to model the risk of air quality. We point out that the standard expected shortfall model is based on the quantile function, which can be viewed as a tail probability, while the expectile is a tail expectation. The quantile measures only the frequency of the risk, whereas the expectile measures both the frequency of the risk and its severity. For this reason, the expectile seems more informative, since it is highly sensitive to extreme values, which is very beneficial in risk analysis. It permits better detection of the risk of excessive levels of ozone concentration. In this real data analysis, we use the air quality data available at https://dataverse.harvard.edu/dataverse/beijing-air (accessed on 20 April 2024). The data concern the air quality in Beijing, in the northeast of China. We focus on two important indices of air quality: sulfur dioxide (SO₂) and the ozone concentration (O₃). Recall that SO₂ and ultraviolet rays have a great impact on the stratospheric ozone. So, the functional sample is defined by $X_{i}$, the $i^{th}$ daily curve of SO₂, and by $Y_{i}$, the total ozone measured on the $(i + 1)^{th}$ day. The sulfur dioxide and the ozone concentration curves are shown in Figure 2.

As discussed above, the principal aim of this real data analysis is to compare the ES expectile regression $\widehat{REX}_{p}$ and the expected shortfall based on the quantile regression, which is defined by

$V_{p} (z) = \inf \{y \in \mathbb{R} : F (y \mid z) \geq p\},$

where $F (\cdot \mid z)$ is the conditional cumulative distribution function of Y given $X = z$. We point out that the conditional cumulative distribution function is estimated by

$\hat{F} (y \mid z) = \frac{\sum_{i = 1}^{n} 𝟙_{\{Y_{i} - y < 0\}} F (\frac{d (z, X_{i})}{r_{n}})}{\sum_{i = 1}^{n} F (\frac{d (z, X_{i})}{r_{n}})} .$

Then, the Value-at-Risk function is estimated by

$\hat{V}_{p} (z) = \inf \{y \in \mathbb{R} : \hat{F} (y \mid z) \geq p\} .$

Thus, the kernel estimator of the standard expected shortfall regression is

$\widetilde{RES}_{p} (z) = \frac{\sum_{i = 1}^{n} F (r^{- 1} d (z, X_{i})) [r G (r^{- 1} (\hat{V}_{p} (z) - Y_{i})) + Y_{i} (1 - H (r^{- 1} (\hat{V}_{p} (z) - Y_{i})))]}{p \sum_{i = 1}^{n} F (r^{- 1} d (z, X_{i}))},$

where

$G (s) = \int_{s}^{\infty} u F (u) d u \quad \text{and} \quad H (s) = \int_{- \infty}^{s} F (u) d u .$

So, our aim is to compare $\widehat{REX}_{p}$ and $\widetilde{RES}_{p}$ using the real data $(X_{i}, Y_{i})_{i = 1, \dots, 365}$. Both estimators are computed using the same techniques as in the artificial data section. Specifically, we use the same kernel, select the smoothing parameters by the rule (5), and employ the PCA metric. We refer to Ferraty and Vieu [36] for more details on the mathematical formulation of these metrics. The performance of both estimators is evaluated by computing

$Mse (p) = \frac{1}{n} \sum_{i = 1}^{n} (Y_{i} - \widehat{Θ}_{p} (X_{i}))^{2} 𝟙_{\{Y_{i} > \widehat{REX}_{p} (X_{i})\}},$

where $\widehat{Θ}_{p}$ represents either $\widehat{REX}_{p}$ or $\widetilde{RES}_{p}$. This error is evaluated as a function of p. In Figure 3, we show the values of $Mse$ for both estimators, $\widehat{REX}_{p}$ (black line) and $\widetilde{RES}_{p}$ (red line).

The graphs show the superiority of the ES expectile regression over the ES quantile model: in several cases, the black line is under the red line. This superiority is confirmed by the Mse values for some selected p, which are given in Table 2.

It appears that the ES expectile is more accurate for various values of p, which favors $\widehat{REX}_{p}$ as a risk metric.

5. Conclusions and Prospects

In this work, we have investigated the nonparametric estimation of the ES expectile regression. We have constructed an estimator by the kernel-smoothing approach. This contribution covers the functional time-series case. The theoretical part of this work establishes the pointwise almost complete (Borel–Cantelli) convergence under strong mixing assumptions. This theoretical development constitutes good mathematical support for the use of the new risk metric in risk management. Moreover, the asymptotic results were established under standard conditions and with a precise convergence rate. In practice, the estimator is easy to apply and gives better results than the standard one. Specifically, we applied the new model to environmental time-series data. The result confirms the superiority of the ES expectile over the ES quantile in two directions. The first one is that the ES expectile has a smaller error than the ES quantile. The second one is the variability of the error of the ES expectile, which reflects its high sensitivity to outliers. This feature is very important in risk analysis, because the risk is often located in the extreme values. Therefore, the robustness of the quantile is not beneficial in this kind of area. For this reason, the ES expectile is more adequate than the ES quantile. The importance of our contribution can also be viewed through the numerous open questions for the future. For instance, we will treat other dependence structures, such as the quasi-associated case or the spatial case. Let us point out that the mixing assumption is very difficult to verify in practice. Thus, this condition can be considered the principal practical limitation of the present contribution. For this reason, treating other types of correlation is very important in practice. It allows for covering alternative financial time-series data that are easier to handle in practice. In addition, the determination of the uniform UNN convergence of the estimator is also a very important prospect for the future, as it permits resolving the problem of the smoothing parameter selection. Furthermore, we can also estimate the model using the additive or the linear case.

6. The Demonstration of Asymptotic Results

This section is devoted to the proofs of our results. To do that, we start by recalling the principal inequalities used to prove the intermediate lemmas:

Lemma 4 ([36]). Let $(Z_{i})_{i \in \mathbb{N}}$ be an α-mixing process. For $k \in \mathbb{N}$, consider two random variables $T$ and $T'$ measurable with respect to $σ(Z_{i}, -\infty < i \leq k)$ and $σ(Z_{i}, n + k \leq i \leq +\infty)$, respectively:

(1): If $T$ and $T'$ are bounded, then

$$\exists C > 0, \quad cov(T, T') \leq C α(n).$$

(2): If there exist three positive integers p, q and r, such that $p^{-1} + q^{-1} + r^{-1} = 1$ and $\mathbb{E}[T^{p}] < \infty$ and $\mathbb{E}[{T'}^{q}] < \infty$, then

$$\exists C > 0, \quad cov(T, T') \leq C \, \mathbb{E}[T^{p}]^{1/p} \, \mathbb{E}[{T'}^{q}]^{1/q} \, α(n)^{1/r}.$$

Lemma 5 ([36]). Let $(Z_{i})_{i \in \mathbb{N}}$ be an algebraic α-mixing process, which is identically distributed:

(1): If there exist $p > 2$ and $M > 0$ such that for all $t > M$, $\mathbb{P}(|Z_{1}| > t) \leq t^{-p}$, then for all $r \geq 1$ and $ϵ > 0$, and for some $C < \infty$:

$$\mathbb{P}\left(\left|\sum_{i=1}^{n} Z_{i}\right| > ϵ\right) \leq C \left[\left(1 + \frac{ϵ^{2}}{r s_{n}^{2}}\right)^{-r/2} + n r^{-1} \left(\frac{r}{ϵ}\right)^{(a+1)p/(a+p)}\right].$$

(2): If there exists $M < \infty$ such that $|Z_{1}| \leq M$, then for all $r \geq 1$ and $ϵ > 0$, and for some $C < \infty$:

$$\mathbb{P}\left(\left|\sum_{i=1}^{n} Z_{i}\right| > ϵ\right) \leq C \left[\left(1 + \frac{ϵ^{2}}{r s_{n}^{2}}\right)^{-r/2} + n r^{-1} \left(\frac{r}{ϵ}\right)^{a+1}\right],$$

where $s_{n}^{2} = \sum_{i=1}^{n} \sum_{j=1}^{n} |cov(Z_{i}, Z_{j})|$.
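To get a feeling for how the two terms of Lemma 5(2) trade off, the bracketed expression can be evaluated numerically. The following is a minimal sketch only: the function name `fuk_nagaev_terms` and the input values are hypothetical, and the unspecified constant $C$ of the lemma is left out, so the numbers describe the shape of the bound and not an actual probability.

```python
def fuk_nagaev_terms(eps, n, s_n2, a, r=1.0):
    """Return the two bracketed terms of Lemma 5(2), without the constant C.

    eps  : deviation threshold (epsilon > 0)
    n    : sample size
    s_n2 : covariance sum s_n^2 (assumed known here, for illustration)
    a    : algebraic mixing rate
    r    : free parameter of the inequality (r >= 1)
    """
    if eps <= 0 or n <= 0 or s_n2 <= 0 or r < 1:
        raise ValueError("need eps > 0, n > 0, s_n2 > 0 and r >= 1")
    # First term: (1 + eps^2 / (r s_n^2))^(-r/2), driven by the covariance sum
    variance_term = (1.0 + eps**2 / (r * s_n2)) ** (-r / 2.0)
    # Second term: n r^(-1) (r / eps)^(a + 1), driven by the mixing rate a
    mixing_term = n / r * (r / eps) ** (a + 1)
    return variance_term, mixing_term


# Illustrative values only: both terms shrink as the threshold eps grows.
for eps in (50.0, 100.0, 200.0):
    print(eps, fuk_nagaev_terms(eps, n=1000, s_n2=1000.0, a=3.0, r=4.0))
```

In the proofs below, $r$ plays the role of $ℓ$ and is allowed to grow like $(\ln n)^{2}$, which is what makes the first term decay polynomially in $n$.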

Proof of Lemma 1.

We put

$$\widehat{ES}_{D}(z) = \frac{1}{n} \sum_{i=1}^{n} \frac{F_{i}}{\mathbb{E}[F_{1}]}.$$

We use the Fuk–Nagaev inequality (Lemma 5) to obtain, for all $ℓ > 0$ and $ε > 0$,

$$\begin{aligned} \mathbb{P}\left\{\left|\mathbb{E}[\widehat{ES}_{D}(z)] - \widehat{ES}_{D}(z)\right| > ε\right\} &= \mathbb{P}\left\{\left|\frac{1}{n \mathbb{E}[F_{1}]} \sum_{i=1}^{n} (F_{i} - \mathbb{E}[F_{i}])\right| > ε\right\} \\ &\leq \mathbb{P}\left\{\left|\sum_{i=1}^{n} (F_{i} - \mathbb{E}[F_{i}])\right| > ε n \mathbb{E}[F_{1}]\right\} \\ &\leq C (A_{1} + A_{2}) \end{aligned} \tag{7}$$

with

$$A_{1} = \left(1 + \frac{ε^{2} n^{2} (\mathbb{E}[F_{1}])^{2}}{S_{n}^{2} ℓ}\right)^{-ℓ/2} \quad \text{and} \quad A_{2} = n ℓ^{-1} \left(\frac{ℓ}{ε n \mathbb{E}[F_{1}]}\right)^{a+1},$$

where

$$S_{n}^{2} = \sum_{i=1}^{n} \sum_{j=1}^{n} Cov(F_{i}, F_{j}) = S_{n}^{2*} + n Var[F_{1}]$$

and

$$S_{n}^{2*} = \sum_{i=1}^{n} \sum_{j \neq i} Cov(F_{i}, F_{j}).$$

Now, we must determine the asymptotic order of $S_{n}^{2*}$. For this, we apply the technique of [37]. So, we define

$$S_{1} = \{(i, j) \text{ such that } 1 \leq i - j \leq m_{n}\}$$

and

$$S_{2} = \{(i, j) \text{ such that } m_{n} + 1 \leq i - j \leq n - 1\}$$

with $m_{n} \to \infty$ as $n \to \infty$.

Denote by $J_{1,n}$ and $J_{2,n}$ the covariance sums over $S_{1}$ and $S_{2}$, respectively. Then,

$$J_{1,n} = \sum_{S_{1}} |Cov(F_{i}, F_{j})| \leq \sum_{S_{1}} \left|\mathbb{E}[F_{i} F_{j}] - \mathbb{E}[F_{i}] \mathbb{E}[F_{j}]\right|.$$

Because of (P1), (P3) and (P5), we can write

$$J_{1,n} \leq C n m_{n} χ(z, r).$$

Now, the covariance over $S_{2}$ can be treated using Davydov–Rio's inequality (see Lemma 4). Thus, for all $i \neq j$, it leads to

$$|Cov(F_{i}, F_{j})| \leq C α(|i - j|).$$

Therefore, using

$$\sum_{j \geq x+1} j^{-a} \leq \int_{u \geq x} u^{-a} \, du = \left[(a - 1) x^{a-1}\right]^{-1},$$

we obtain

$$|J_{2,n}| = \left|\sum_{(i,j) \in S_{2}} Cov[F_{i}, F_{j}]\right| \leq C \frac{n m_{n}^{-a+1}}{a - 1}. \tag{8}$$

Choosing $m_{n} = (χ(z, r))^{-1/a}$, we obtain

$$\sum_{i=1}^{n} \sum_{j \neq i} Cov(F_{i}, F_{j}) = O\left(n χ^{(a-1)/a}(z, r)\right).$$

Now, the variance part is

$$Var(F_{1}) \leq C \left(ϕ(z, r) + (ϕ(z, r))^{2}\right) \leq C χ^{1/2}(z, r).$$

Finally, $a > 2$ gives $(a - 1)/a > 1/2$, so that $χ^{(a-1)/a}(z, r) \leq χ^{1/2}(z, r)$ whenever $χ(z, r) \leq 1$, and therefore

$$S_{n}^{2} = O\left(n χ^{1/2}(z, r)\right). \tag{9}$$
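The splitting of the covariance sum into a near-lag part and a far-lag part can be illustrated on simulated data. This is only a sketch under simplifying assumptions that are not in the paper: a scalar AR(1) covariate stands in for the functional one, the kernel is a triangular one, covariances are replaced by empirical autocovariances (using stationarity), and both orderings of $(i, j)$ are counted. The name `split_covariance_sum` is hypothetical.

```python
import numpy as np


def split_covariance_sum(F, m):
    """Split an empirical version of sum_{i != j} |Cov(F_i, F_j)| at lag m.

    Returns (near, far, variance): the sums over pairs with |i - j| <= m,
    over pairs with |i - j| > m, and the diagonal part n * Var(F_1).
    """
    F = np.asarray(F, dtype=float)
    n = F.size
    centered = F - F.mean()
    # Empirical autocovariance at lags 0, ..., n - 1 (stationarity assumed)
    gamma = np.array(
        [np.dot(centered[: n - k], centered[k:]) / n for k in range(n)]
    )
    lags = np.arange(1, n)
    pair_counts = 2.0 * (n - lags)  # ordered pairs (i, j) with |i - j| = lag
    abs_cov = np.abs(gamma[1:])
    near = float(np.sum(pair_counts[lags <= m] * abs_cov[lags <= m]))
    far = float(np.sum(pair_counts[lags > m] * abs_cov[lags > m]))
    variance = float(n * gamma[0])
    return near, far, variance


# Simulated AR(1) covariate (an assumption made for illustration only)
rng = np.random.default_rng(0)
n = 400
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()

# Kernel weights F_i = K(|X_i - z| / r) with z = 0, r = 1, K(u) = (1 - u)_+
F = np.clip(1.0 - np.abs(x - 0.0) / 1.0, 0.0, None)

near, far, variance = split_covariance_sum(F, m=10)
print(near, far, variance)
```

Moving the cut-off `m` shifts mass between the two parts without changing their total, which is why the proof is free to choose $m_{n}$ so that the two bounds balance.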

We now take $ε = λ \frac{\sqrt{S_{n}^{2} \ln n}}{n \mathbb{E}[F_{1}]}$ and $ℓ = C (\ln n)^{2}$. Thus,

$$A_{2} \leq C n^{1 - (a+1)/2} χ(z, r)^{-(a+1)/4} (\ln n)^{(3a-1)/2}.$$

Next, from (P5),

$$A_{2} \leq C n^{-1 - η(a+1)/2} (\ln n)^{(3a-1)/2}.$$

So, $\exists ν > 0$ such that

$$A_{2} \leq C n^{-1-ν}. \tag{10}$$

By (9),

$$A_{1} \leq C \left(1 + \frac{λ^{2} \ln n}{ℓ}\right)^{-ℓ/2} = C \exp\left(-\frac{ℓ}{2} \ln\left(1 + \frac{λ^{2} \ln n}{ℓ}\right)\right).$$

Since $ℓ = C (\ln n)^{2}$, we obtain

$$A_{1} \leq C \exp\left(-λ^{2} \frac{\ln n}{2}\right) = C n^{-λ^{2}/2}.$$

Thus, for large $λ$,

$$\exists ν' > 0, \quad A_{1} \leq C n^{-λ^{2}/2} \leq C n^{-1-ν'}. \tag{11}$$

In conclusion, for $η$ large enough that $C η^{2} > 1$,

$$\sum_{n} \mathbb{P}\left(\left|\widehat{ES}_{D}(z) - \mathbb{E}[\widehat{ES}_{D}(z)]\right| > η \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right) < \infty.$$

Moreover, since $\mathbb{E}[\widehat{ES}_{D}(z)] = 1$,

$$\sum_{n \geq 1} \mathbb{P}\left(|\widehat{ES}_{D}(z)| \leq 1/2\right) \leq \sum_{n \geq 1} \mathbb{P}\left(\left|\widehat{ES}_{D}(z) - \mathbb{E}[\widehat{ES}_{D}(z)]\right| > 1/2\right) < \infty.$$

□
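The step from the bound on $A_{1}$ to the rate $n^{-λ^{2}/2}$ in the proof of Lemma 1 can be made explicit. The following is a sketch, writing $ℓ = c (\ln n)^{2}$ with a constant $c > 0$ (a name introduced here only to avoid a clash with the generic constant $C$) and using the elementary inequality $\ln(1 + x) \geq x - x^{2}/2$ for $x \geq 0$:

```latex
\begin{aligned}
x_{n} &:= \frac{\lambda^{2} \ln n}{\ell} = \frac{\lambda^{2}}{c \ln n}, \\
\frac{\ell}{2} \ln(1 + x_{n})
  &\geq \frac{\ell}{2} \left( x_{n} - \frac{x_{n}^{2}}{2} \right)
   = \frac{\lambda^{2}}{2} \ln n - \frac{\lambda^{4}}{4c}, \\
A_{1} \leq C (1 + x_{n})^{-\ell/2}
  &\leq C \, e^{\lambda^{4}/(4c)} \, n^{-\lambda^{2}/2}.
\end{aligned}
```

The factor $e^{λ^{4}/(4c)}$ does not depend on $n$ and is absorbed into the constant, so the bound is summable as soon as $λ^{2}/2 > 1$, that is, $λ > \sqrt{2}$.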

Proof of Lemma 2.

We use the identity

$$ES(t, z) - \mathbb{E}[\widehat{ES}_{N}(t, z)] = \frac{1}{\mathbb{E}[F_{1}(z)]} \mathbb{E}\left[F_{1}(z) 𝟙_{B(z, r)}(X_{1}) \left(ES(t, z) - ES(t, X_{1})\right)\right].$$

Condition (P2) gives

$$𝟙_{B(z, r)}(X_{1}) \, |ES(t, z) - ES(t, X_{1})| \leq C r^{b}.$$

So,

$$\sup_{t \in [REXP_{p}(z) - δ, REXP_{p}(z) + δ]} \left|ES(t, z) - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| \leq C r^{b},$$

which gives

$$\sup_{t \in [REXP_{p}(z) - δ, REXP_{p}(z) + δ]} \left|ES(t, z) - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| = O(r^{b}).$$

□

Proof of Lemma 3.

Since $[REXP_{p}(z) - δ, REXP_{p}(z) + δ]$ is compact, it can be covered by a finite number of intervals:

$$[REXP_{p}(z) - δ, REXP_{p}(z) + δ] \subset \bigcup_{j=1}^{l_{n}} \left] y_{j} - d_{n}, y_{j} + d_{n} \right[ \tag{12}$$

for $d_{n} = O\left(\frac{1}{\sqrt{n}}\right)$ and $l_{n} = O(\sqrt{n})$. The two functions $\mathbb{E}[\widehat{ES}_{N}(\cdot, z)]$ and $\widehat{ES}_{N}(\cdot, z)$ are increasing. Thus, for $1 \leq j \leq l_{n}$,

$$\begin{aligned} \mathbb{E}\widehat{ES}_{N}(y_{j} - d_{n}, z) &\leq \sup_{t \in ] y_{j} - d_{n}, y_{j} + d_{n} [} \mathbb{E}\widehat{ES}_{N}(t, z) \leq \mathbb{E}\widehat{ES}_{N}(y_{j} + d_{n}, z), \\ \widehat{ES}_{N}(y_{j} - d_{n}, z) &\leq \sup_{t \in ] y_{j} - d_{n}, y_{j} + d_{n} [} \widehat{ES}_{N}(t, z) \leq \widehat{ES}_{N}(y_{j} + d_{n}, z). \end{aligned} \tag{13}$$

Now, by (P2), for all $t_{1}, t_{2} \in [REXP_{p}(z) - δ, REXP_{p}(z) + δ]$, we have

$$\left|\mathbb{E}\widehat{ES}_{N}(t_{1}, z) - \mathbb{E}\widehat{ES}_{N}(t_{2}, z)\right| \leq C |t_{1} - t_{2}|.$$

Hence,

$$\sup_{t \in [REXP_{p}(z) - δ, REXP_{p}(z) + δ]} \left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| \leq \max_{1 \leq j \leq l_{n}} \max_{t \in \{y_{j} - d_{n}, y_{j} + d_{n}\}} \left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| + C d_{n}.$$

Clearly,

$$d_{n} = n^{-1/2} = o\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right).$$

Therefore, it suffices to demonstrate that

$$\max_{1 \leq j \leq l_{n}} \max_{t \in \{y_{j} - d_{n}, y_{j} + d_{n}\}} \left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| = O\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right), \quad a.co.$$

For all $η > 0$,

$$\begin{aligned} &\mathbb{P}\left(\max_{1 \leq j \leq l_{n}} \max_{t \in \{y_{j} - d_{n}, y_{j} + d_{n}\}} \left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| > η \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right) \\ &\quad \leq 2 l_{n} \max_{1 \leq j \leq l_{n}} \max_{t \in \{y_{j} - d_{n}, y_{j} + d_{n}\}} \mathbb{P}\left(\left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| > η \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right). \end{aligned}$$

It remains to assess

$$\mathbb{P}\left(\left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| > η \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right).$$

To this end, we put

$$\tilde{F}_{i}(z) = \frac{1}{\mathbb{E}[F_{1}(z)]} \left[F_{i}(z) Y_{i} 𝟙_{\{Y_{i} \leq t\}} - \mathbb{E}\left[F_{i}(z) Y_{i} 𝟙_{\{Y_{i} \leq t\}}\right]\right].$$

We write, for all $ε > 0$,

$$\mathbb{P}\left[\left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| > ε\right] = \mathbb{P}\left[\frac{1}{n} \left|\sum_{i=1}^{n} \tilde{F}_{i}(z)\right| > ε\right]$$

and

$$\mathbb{P}\left(\max_{t \in G_{n}} \left|\widehat{ES}_{N}(t, z) - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| > ε\right) \leq \sum_{t \in G_{n}} \mathbb{P}\left(\left|\widehat{ES}_{N}(t, z) - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| > ε\right), \tag{14}$$

where $G_{n} = \{y_{j} - d_{n}, y_{j} + d_{n} : 1 \leq j \leq l_{n}\}$ is the set of the $2 l_{n}$ grid points.

Because Y is not necessarily bounded, we use the truncation method by introducing

$$\widehat{ES}_{N}^{*}(t, z) = \frac{1}{n \mathbb{E}[F(r^{-1} d(z, X_{1}))]} \sum_{i=1}^{n} F(r^{-1} d(z, X_{i})) Y_{i}^{*}$$

with $Y_{i}^{*} = Y_{i} 𝟙_{\{Y_{i} < γ_{n}\}}$. Thus, the result is a consequence of the intermediate results

$$l_{n} \max_{t \in G_{n}} \left|\mathbb{E}[\widehat{ES}_{N}^{*}(t, z)] - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| = O\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right), \tag{15}$$

$$l_{n} \max_{t \in G_{n}} \left|\widehat{ES}_{N}^{*}(t, z) - \widehat{ES}_{N}(t, z)\right| = O_{a.co.}\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right) \tag{16}$$

and

$$l_{n} \max_{t \in G_{n}} \left|\widehat{ES}_{N}^{*}(t, z) - \mathbb{E}[\widehat{ES}_{N}^{*}(t, z)]\right| = O_{a.co.}\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right). \tag{17}$$

Here the factor $l_{n} = O(\sqrt{n})$ accounts for the number of grid points in $G_{n}$.

We start by proving (15). We have, for all $t \in G_{n}$,

$$\left|\mathbb{E}[\widehat{ES}_{N}^{*}(t, z)] - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| \leq C \frac{1}{ϕ(z, r)} \mathbb{E}\left[|Y| 𝟙_{\{Y \geq γ_{n}\}} F(r^{-1} d(z, X))\right].$$

By the Hölder inequality, for $α'$ and $β$ such that $\frac{1}{α'} + \frac{1}{β} = 1$ and $α' = \frac{p}{2}$, we have, for all $t \in G_{n}$,

$$\begin{aligned} \mathbb{E}\left[|Y| 𝟙_{\{Y \geq γ_{n}\}} F(r^{-1} d(z, X_{1}))\right] &\leq \mathbb{E}^{1/α'}\left[|Y|^{α'} 𝟙_{\{Y \geq γ_{n}\}}\right] \mathbb{E}^{1/β}\left[F^{β}(r^{-1} d(z, X_{1}))\right] \\ &\leq γ_{n}^{-1} \mathbb{E}^{1/α'}\left[|Y|^{2α'}\right] \mathbb{E}^{1/β}\left[F^{β}(r^{-1} d(z, X_{1}))\right] \\ &\leq γ_{n}^{-1} \mathbb{E}^{1/α'}\left[|Y|^{p}\right] \mathbb{E}^{1/β}\left[F^{β}(r^{-1} d(z, X_{1}))\right] \\ &\leq C γ_{n}^{-1} ϕ^{1/β}(z, r). \end{aligned}$$

Thus,

$$l_{n} \max_{t \in G_{n}} \left|\mathbb{E}[\widehat{ES}_{N}^{*}(t, z)] - \mathbb{E}[\widehat{ES}_{N}(t, z)]\right| \leq C n^{1/2} γ_{n}^{-1} ϕ^{(1-β)/β}(z, r).$$

Finally, (15) is a consequence of (P5).

Now, for (16), we use Markov's inequality to show that, for all $t \in G_{n}$ and all $ϵ > 0$,

$$\begin{aligned} \mathbb{P}\left(\left|\widehat{ES}_{N}^{*}(t, z) - \widehat{ES}_{N}(t, z)\right| > ϵ\right) &\leq \sum_{i=1}^{n} \mathbb{P}(Y_{i} > γ_{n}) \\ &\leq n \mathbb{P}(Y > γ_{n}) \\ &\leq n γ_{n}^{-p} \mathbb{E}[Y^{p}]. \end{aligned}$$

Choosing $ϵ = ϵ_{0} \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}$,

$$l_{n} \max_{t \in G_{n}} \mathbb{P}\left(\left|\widehat{ES}_{N}(t, z) - \widehat{ES}_{N}^{*}(t, z)\right| > ϵ_{0} \sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right) \leq n^{3/2 - a} < C n^{-1-ν}.$$

Now, we prove (17). Define, for $t \in G_{n}$,

$$Λ_{i} = F_{i} Y_{i}^{*} - \mathbb{E}[F_{i} Y_{i}^{*}].$$

Therefore, for all $ε > 0$,

$$\begin{aligned} \mathbb{P}\left\{\left|\widehat{ES}_{N}^{*}(t, z) - \mathbb{E}[\widehat{ES}_{N}^{*}(t, z)]\right| > ε\right\} &= \mathbb{P}\left\{\left|\frac{1}{n \mathbb{E}[F_{1}]} \sum_{i=1}^{n} Λ_{i}\right| > ε\right\} \\ &\leq \mathbb{P}\left\{\left|\sum_{i=1}^{n} Λ_{i}\right| > ε n \mathbb{E}[F_{1}]\right\}. \end{aligned}$$

We calculate

$$S_{n}^{'2} = \sum_{i=1}^{n} \sum_{j=1}^{n} Cov(Λ_{i}, Λ_{j}) = \sum_{i=1}^{n} \sum_{j \neq i} Cov(Λ_{i}, Λ_{j}) + n Var[Λ_{1}].$$

We define

$$S_{1}^{'} = \{(i, j) \text{ such that } 1 \leq i - j \leq u_{n}\}$$

and

$$S_{2}^{'} = \{(i, j) \text{ such that } u_{n} + 1 \leq i - j \leq n - 1\}.$$

Let $J_{1,n}^{'}$ and $J_{2,n}^{'}$ be the sums of covariances over these two sets, respectively. On $S_{1}^{'}$, we have

$$\begin{aligned} J_{1,n}^{'} &= \sum_{S_{1}^{'}} |Cov(Λ_{i}, Λ_{j})| \\ &\leq C \sum_{S_{1}^{'}} \left(|\mathbb{E}[F_{i} F_{j}]| + |\mathbb{E}[F_{i}] \mathbb{E}[F_{j}]|\right). \end{aligned}$$

Because of (P1), (P3) and (P5), we have

$$J_{1,n}^{'} \leq C n u_{n} χ(z, r).$$

By Davydov–Rio's inequality (see Lemma 4) in the $L^{\infty}$ case, we have

$$|Cov(Λ_{i}, Λ_{j})| \leq C γ_{n}^{2} α(|i - j|).$$

Hence,

$$J_{2,n}^{'} = \sum_{S_{2}^{'}} |Cov(Λ_{i}, Λ_{j})| \leq \frac{n γ_{n}^{2} u_{n}^{-a+1}}{a - 1}.$$

Choosing $u_{n} = \left(\frac{γ_{n}^{2}}{χ(z, r)}\right)^{1/a}$, we prove that

$$\sum_{i=1}^{n} \sum_{j \neq i} Cov(Λ_{i}, Λ_{j}) = O\left(n γ_{n}^{2/a} χ(z, r)^{(a-1)/a}\right).$$

On the other hand,

$$Var(Λ_{1}) \leq \mathbb{E}\left[(F_{1} Y_{1}^{*})^{2}\right] \leq \mathbb{E}\left[(F_{1} Y_{1})^{2}\right] = O(ϕ(z, r)).$$

Thus,

$$S_{n}^{'2} = O\left(n χ^{1/2}(z, r)\right). \tag{18}$$

The Fuk–Nagaev inequality (see Lemma 5) applied to $Λ_{i}$ implies that, for all $ℓ > 0$ and $ε > 0$,

$$\begin{aligned} \mathbb{P}\left\{\left|\mathbb{E}[\widehat{ES}_{N}^{*}(t, z)] - \widehat{ES}_{N}^{*}(t, z)\right| > ε\right\} &\leq \mathbb{P}\left\{\left|\sum_{i=1}^{n} Λ_{i}\right| > ε n \mathbb{E}[F_{1}]\right\} \\ &\leq C (A_{1}^{'} + A_{2}^{'}) \end{aligned}$$

where

$$A_{1}^{'} = \left(1 + \frac{ε^{2} n^{2} (\mathbb{E}[F_{1}])^{2}}{S_{n}^{'2} ℓ}\right)^{-ℓ/2} \quad \text{and} \quad A_{2}^{'} = n ℓ^{-1} \left(\frac{ℓ}{ε n \mathbb{E}[F_{1}]}\right)^{a+1}.$$

Taking $ε = λ' \frac{\sqrt{n \ln n \, χ^{1/2}(z, r)}}{n \mathbb{E}[F_{1}]}$ and $ℓ = C (\ln n)^{2}$, we obtain

$$l_{n} A_{2}^{'} \leq C n^{3/2 - (a+1)/2} χ(z, r)^{-(a+1)/4} (\ln n)^{(3a-1)/2} \leq C n^{-1-ν_{1}^{'}} \tag{19}$$

for some $ν_{1}^{'} > 0$. Similarly to (11), we prove for $ℓ = O((\ln n)^{2})$ that

$$l_{n} A_{1}^{'} \leq C l_{n} \left(1 + \frac{λ^{'2} \ln n}{ℓ}\right)^{-ℓ/2} \leq C n^{-1-ν_{2}^{'}}, \quad ν_{2}^{'} > 0. \tag{20}$$

Then, (19) and (20) allow us to conclude. Therefore, for $τ > \frac{1}{\sqrt{C}}$,

$$\left|\widehat{ES}_{N}(t, z) - \mathbb{E}\widehat{ES}_{N}(t, z)\right| = O_{a.co.}\left(\sqrt{\frac{χ^{1/2}(z, r) \ln n}{n ϕ^{2}(z, r)}}\right). \tag{21}$$

□
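The grid-and-truncation argument of the proof of Lemma 3 can be mimicked numerically. The sketch below rests on assumptions that are not part of the paper: the covariate is scalar with $d(z, x) = |z - x|$, the kernel is triangular, the normalisation $\mathbb{E}[F_{1}]$ is replaced by the sample mean of the kernel weights, and the indicator $𝟙_{\{Y_{i} \leq t\}}$ from the definition of $\tilde{F}_{i}$ is kept in the truncated version. The function name `es_numerator` and all numerical values are hypothetical.

```python
import numpy as np


def es_numerator(t, z, X, Y, r, gamma=np.inf):
    """Sample analogue of the numerator estimator at threshold t and point z.

    gamma is the truncation level: responses with Y_i >= gamma are set to 0,
    so gamma = inf gives the untruncated version.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    u = np.abs(X - z) / r
    F = np.where(u <= 1.0, 1.0 - u, 0.0)  # triangular kernel weights F_i
    if F.mean() == 0.0:
        raise ValueError("no observation falls in the ball B(z, r)")
    Y_star = np.where(Y < gamma, Y, 0.0)  # truncated responses Y_i^*
    return float(np.mean(F * Y_star * (Y <= t)) / F.mean())


# Simulated data with nonnegative responses (illustration only)
rng = np.random.default_rng(1)
n = 500
X = rng.uniform(-1.0, 1.0, n)
Y = np.abs(X + rng.normal(size=n))

# Grid G_n: endpoints y_j - d_n and y_j + d_n with d_n = n^(-1/2)
d_n = n ** -0.5
centers = np.linspace(0.5, 1.5, 5)
grid = np.sort(np.concatenate([centers - d_n, centers + d_n]))

values = [es_numerator(t, 0.0, X, Y, r=0.3) for t in grid]
truncated = [es_numerator(t, 0.0, X, Y, r=0.3, gamma=2.0) for t in grid]
print(list(zip(grid.round(3), np.round(values, 4), np.round(truncated, 4))))
```

With nonnegative responses the map $t \mapsto \widehat{ES}_{N}(t, z)$ is nondecreasing, which is the monotonicity used in (13) to reduce the supremum over $t$ to a maximum over the grid endpoints.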

Author Contributions

The authors contributed approximately equally to this work. Formal analysis, A.L.; Validation, M.B.A.; Writing—review and editing, Z.K. and F.A.A. All authors have read and agreed to the final version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; and the Deanship of Scientific Research and Graduate Studies at King Khalid University through the Research Groups Program under grant number R.G.P. 1/128/45.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data used in this study are available through the link https://dataverse.harvard.edu/dataverse/beijing-air (accessed on 20 April 2024).

Acknowledgments

The authors are indebted to the Editor-in-Chief, Associate Editor and the three referees for their very generous comments and suggestions on the first version of our article, which helped us improve the content, presentation, and layout of the manuscript. The authors thank and extend their appreciation to the funders of this work. This work was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, and the Deanship of Scientific Research and Graduate Studies at King Khalid University through the Research Groups Program under grant number R.G.P. 1/128/45.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Newey, W.K.; Powell, J.L. Asymmetric least squares estimation and testing. Econom. J. Econom. Soc. 1987, 55, 819–847.
2. Waltrup, L.S.; Sobotka, F.; Kneib, T.; Kauermann, G. Expectile and quantile regression—David and Goliath. Stat. Model. 2015, 15, 433–456.
3. Bellini, F.; Di Bernardino, E. Risk management with expectiles. Eur. J. Financ. 2017, 23, 487–506.
4. Bellini, F.; Negri, I.; Pyatkova, M. Backtesting VaR and expectiles with realized scores. Stat. Methods Appl. 2019, 28, 119–142.
5. Farooq, M.; Steinwart, I. Learning rates for kernel-based expectile regression. Mach. Learn. 2019, 108, 203–227.
6. Chakroborty, S.; Iyer, R.; Trindade, A.A. On the use of the M-quantiles for outlier detection in multivariate data. arXiv 2024, arXiv:2401.01628.
7. Gu, Y.; Zou, H. High-dimensional generalizations of asymmetric least squares regression and their applications. Ann. Stat. 2016, 44, 2661–2694.
8. Zhao, J.; Chen, Y.; Zhang, Y. Expectile regression for analyzing heteroscedasticity in high dimension. Stat. Probab. Lett. 2018, 137, 304–311.
9. Kneib, T. Beyond mean regression. Stat. Model. 2013, 13, 275–303.
10. Mohammedi, M.; Bouzebda, S.; Laksaci, A. The consistency and asymptotic normality of the kernel type expectile regression estimator for functional data. J. Multivar. Anal. 2021, 181, 104673.
11. Girard, S.; Stupfler, G.; Usseglio-Carleve, A. Functional estimation of extreme conditional expectiles. Econom. Stat. 2022, 21, 131–158.
12. Artzner, P.; Delbaen, F.; Eber, J.M.; Heath, D. Coherent measures of risk. Math. Financ. 1999, 9, 203–228.
13. Righi, M.B.; Ceretta, P.S. A comparison of expected shortfall estimation models. J. Econ. Bus. 2015, 78, 14–47.
14. Lazar, E.; Pan, J.; Wang, S. On the estimation of Value-at-Risk and Expected Shortfall at extreme levels. J. Commod. Mark. 2024, 34, 100391.
15. Moutanabbir, K.; Bouaddi, M. A new non-parametric estimation of the expected shortfall for dependent financial losses. J. Stat. Plan. Inference 2024, 232, 106151.
16. Scaillet, O. Nonparametric estimation and sensitivity analysis of expected shortfall. Math. Financ. Int. J. Math. Stat. Financ. Econ. 2004, 14, 115–129.
17. Cai, Z.; Wang, X. Nonparametric estimation of conditional VaR and expected shortfall. J. Econom. 2008, 147, 120–130.
18. Wu, Y.; Yu, W.; Balakrishnan, N.; Wang, X. Nonparametric estimation of expected shortfall via Bahadur-type representation and Berry–Esséen bounds. J. Stat. Comput. Simul. 2022, 92, 544–566.
19. Ferraty, F.; Quintela-Del-Río, A. Conditional VAR and expected shortfall: A new functional approach. Econom. Rev. 2016, 35, 263–292.
20. Ait-Hennani, L.; Kaid, Z.; Laksaci, A.; Rachdi, M. Nonparametric estimation of the expected shortfall regression for quasi-associated functional data. Mathematics 2022, 10, 4508.
21. Fuchs, S.; Schlotter, R.; Schmidt, K.D. A review and some complements on quantile risk measures and their domain. Risks 2017, 5, 59.
22. Almanjahie, I.M.; Bouzebda, S.; Kaid, Z.; Laksaci, A. The local linear functional kNN estimator of the conditional expectile: Uniform consistency in number of neighbors. In Metrika; Springer: Berlin/Heidelberg, Germany, 2024; pp. 1–29.
23. Litimein, O.; Laksaci, A.; Ait-Hennani, L.; Mechab, B.; Rachdi, M. Asymptotic normality of the local linear estimator of the functional expectile regression. J. Multivar. Anal. 2024, 202, 105281.
24. Aneiros, G.; Cao, R.; Fraiman, R.; Genest, C.; Vieu, P. Recent advances in functional data analysis and high-dimensional statistics. J. Multivar. Anal. 2019, 170, 3–9.
25. Goia, A.; Vieu, P. An introduction to recent advances in high/infinite dimensional statistics [Editorial]. J. Multivar. Anal. 2016, 170, 1–6.
26. Yu, D.; Pietrosanu, M.; Mizera, I.; Jiang, B.; Kong, L.; Tu, W. Functional Linear Partial Quantile Regression with Guaranteed Convergence for Neuroimaging Data Analysis. In Statistics in Biosciences; Springer: Berlin/Heidelberg, Germany, 2024; pp. 1–17.
27. Di Bernardino, E.; Laloe, T.; Pakzad, C. Estimation of extreme multivariate expectiles with functional covariates. J. Multivar. Anal. 2024, 202, 105292.
28. Jones, D.A.; Cox, D.R. Nonlinear autoregressive processes. Proc. R. Soc. Lond. A Math. Phys. Sci. 1978, 360, 71–95.
29. Ozaki, T. Non-linear time series models for non-linear random vibrations. J. Appl. Prob. 1980, 17, 84–93.
30. Engle, R.F. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 1982, 50, 987–1007.
31. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327.
32. Li, W.V.; Shao, Q.M. Gaussian processes: Inequalities, small ball probabilities and applications. Handb. Stat. 2001, 19, 533–597.
33. Hörmann, S.; Kidziński, Ł.; Hallin, M. Dynamic functional principal components. J. R. Stat. Soc. Ser. B Stat. Methodol. 2015, 77, 319–348.
34. Kanellopoulou, S.; Panas, E. Empirical distributions of stock returns: Paris stock market, 1980–2003. Appl. Financ. Econ. 2008, 16, 1289–1302.
35. Bersimis, S.; Degiannakis, S.; Georgakellos, D. Real-time monitoring of carbon monoxide using value-at-risk measure and control charting. J. Appl. Stat. 2017, 44, 89–108.
36. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006.
37. Masry, E. Recursive probability density estimation for weakly dependent stationary processes. IEEE Trans. Inf. Theory 1986, 32, 254–267.

Figure 1. A sample of 100 dependent curves with different lags (colors).

Figure 2. The SO₂ and O₃ daily curves.

Figure 3. Comparison between ES expectile and ES quantile using MSE.

Table 1. MSE of the estimators for different dependency levels.

Conditional Distribution	Level of Dependency	p	$\widehat{REX_{p}}$	$\widehat{VEX_{p}}$	$\widetilde{VEX_{p}}$
Normal distribution	Strong dependency
		0.01	0.138	0.554	0.446
		0.05	0.125	0.447	0.436
		0.90	0.102	0.428	0.414
		0.95	0.162	0.374	0.367
Normal distribution	Medium dependency
		0.01	0.098	0.311	0.308
		0.05	0.081	0.302	0.293
		0.90	0.075	0.282	0.176
		0.95	0.093	0.203	0.199
Normal distribution	Moderate dependency
		0.01	0.049	0.161	0.154
		0.05	0.062	0.181	0.171
		0.90	0.051	0.168	0.160
		0.95	0.073	0.192	0.182
Lévy distribution	Strong dependency
		0.01	0.610	0.581	0.472
		0.05	0.630	0.532	0.423
		0.90	0.310	0.442	0.309
		0.95	0.280	0.364	0.251
Lévy distribution	Medium dependency
		0.01	0.290	0.271	0.235
		0.05	0.090	0.182	0.111
		0.90	0.051	0.113	0.102
		0.95	0.154	0.117	0.106
Lévy distribution	Moderate dependency
		0.01	0.151	0.241	0.192
		0.05	0.128	0.214	0.189
		0.90	0.033	0.217	0.195
		0.95	0.038	0.143	0.117

Table 2. Comparison between the MSE of ES expectile and ES quantile.

Cases	p = 0.99	p = 0.5	p = 0.1	p = 0.05	p = 0.01
ES expectile	1.76	0.14	0.53	0.48	0.56
ES quantile	1.79	0.18	0.38	0.68	0.88


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
