1. Introduction
Stochastic models with fractional Brownian motion (fBm) as the noise source have become increasingly popular in recent years. This is because fBm is a continuous Gaussian process whose increments are positively or negatively correlated if the Hurst parameter satisfies $H > 1/2$ or $H < 1/2$, respectively. If $H = 1/2$, fBm coincides with classical Brownian motion and its increments are independent. The ability of fBm to incorporate memory into the noise process makes it possible to build more realistic models in such diverse fields as biology, neuroscience, hydrology, climatology, finance and many others. The interested reader may consult the monographs [1, 2] or the more recent paper [3] and the references therein for more information.
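Let $B^H = (B^H_t)_{t \ge 0}$ be a fractional Brownian motion with Hurst parameter $H$ defined on an appropriate probability space $(\Omega, \mathcal{F}, P)$. The fractional Ornstein–Uhlenbeck process (fOU) is the unique solution to the following linear stochastic differential equation
$$dX_t = -\lambda X_t\,dt + \sigma\,dB^H_t, \qquad X_0 = x_0, \tag{1}$$
where $\lambda > 0$ is a drift parameter (we consider the ergodic case only) and $\sigma > 0$ is a noise intensity (or volatility). Recall that the solution to Equation (1) can be expressed by the exact analytical formula:
$$X_t = x_0 e^{-\lambda t} + \sigma \int_0^t e^{-\lambda (t-s)}\,dB^H_s. \tag{2}$$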
A single realization of the random process $(X_t(\omega))_{t \ge 0}$ for a particular $\omega \in \Omega$ is the model for a single real-valued trajectory, part of which is observed. Two examples of such trajectories are given in Figure 1. We assume $H > 1/2$ throughout this paper so that the fOU exhibits long-range dependence. For an example of an application, see the neuronal model based on fOU described in the recent work [4].
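Trajectories of this kind can be sketched numerically. The following is a minimal illustration only, not the simulation procedure used in this paper: it assumes the dynamics dX = -λX dt + σ dB^H, draws the fBm increments by a Cholesky factorization of their covariance matrix and applies an Euler scheme on a fine grid. The function names `fgn_cov` and `simulate_fou` and all parameter values are hypothetical.

```python
import numpy as np


def fgn_cov(n, hurst, dt):
    """Covariance matrix of n consecutive fBm increments over steps of length dt."""
    k = np.arange(n, dtype=float)
    # Autocovariance of the increments (fractional Gaussian noise) at lag k.
    gamma = 0.5 * dt ** (2 * hurst) * (
        np.abs(k + 1) ** (2 * hurst)
        - 2 * np.abs(k) ** (2 * hurst)
        + np.abs(k - 1) ** (2 * hurst)
    )
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return gamma[lags]


def simulate_fou(lam, sigma, hurst, x0, dt, n, rng):
    """Euler approximation of dX = -lam * X dt + sigma dB^H on the grid 0, dt, ..., n * dt."""
    chol = np.linalg.cholesky(fgn_cov(n, hurst, dt))
    increments = chol @ rng.standard_normal(n)  # correlated fBm increments
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = x[k] - lam * x[k] * dt + sigma * increments[k]
    return x


# Illustrative values only (not taken from the paper's experiments).
rng = np.random.default_rng(0)
path = simulate_fou(lam=0.5, sigma=1.0, hurst=0.7, x0=10.0, dt=0.01, n=500, rng=rng)
```

The Cholesky approach costs O(n^3) operations, so it is suitable only for short grids; it is chosen here for transparency rather than speed.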
The aim of this paper is to study the problem of estimating the drift parameter $\lambda$ based on an observation of a single trajectory of a fOU at discrete time instants $t = 0, h, 2h, \ldots, Nh$ with fixed mesh size $h > 0$ and increasing time horizon $T = Nh \to \infty$ (long-span asymptotics). Estimating the drift parameter of a fOU observed in continuous time has been considered in [
5,
6], where the least-squares estimator (LSE) and the ergodic-type estimator are studied. These have advantageous asymptotic properties: they are strongly consistent and, if
, also asymptotically normal. The ergodic-type estimator is easy to implement, but it has greater asymptotic variance than the LSE, requires a priori knowledge of $H$ and $\sigma$, and does not provide acceptable results for non-stationary processes with limited time horizon.
A straightforward discretization of the least-squares estimator for a fOU has been introduced and studied in [
7] for
and in [
8] for
. For the precise formula, see (8). This estimator is consistent provided that both the time horizon $T \to \infty$ and the mesh size $h \to 0$ (mixed in-fill and long-span asymptotics). However, it is not consistent when $h$ is fixed and $T \to \infty$. This has led us to construct and study LSE-type estimators that converge in this long-span setting.
An easy modification of the ergodic-type estimator to discrete-time setting with fixed time step was given in [
9], see (
10) for precise formula, and its strong consistency (assuming
) and asymptotic normality (for
) when
were proved, but with possibly incorrect technique (as pointed out in [
10]). Correct proofs of asymptotic normality for
and strong consistency for
of this estimator were provided (in more general setup) in [
10]. Note that the use of this discrete ergodic estimator requires knowledge of the parameter $\sigma$ (in contrast to the estimators of least-squares type introduced below). Other works related to estimating the drift parameter for a discretely observed fOU include [
11,
12,
13], but this list is by no means complete.
This work contributes to the problem of estimating the drift parameter of fOU by introducing three new LSE-type estimators: the least-squares estimator from the exact solution, the asymptotic least-squares estimator and the conditional least-squares estimator. These estimators are tailored to discrete-time observations with a fixed time step. We provide proofs of their asymptotic properties and identify situations in which these new estimators perform better than the already known ones. In particular, we eliminate the discretization error (the LSE from the exact solution), construct strongly consistent estimators in the long-span regime without assuming the in-fill condition (the asymptotic LSE and the conditional LSE), and eliminate the bias in the least-squares procedure caused by autocorrelation of the noise term (the conditional LSE). The conditional LSE in particular demonstrates outstanding performance in all studied scenarios. This suggests that the concept of conditioning in the least-squares procedure applied to models with fractional noise, which is new to the best of our knowledge, provides a powerful framework for parameter estimation in this type of model. The proof of its strong consistency, presented in this paper, is rather non-trivial and may serve as a starting point for the investigation of similar estimators in possibly different settings. A certain disadvantage of the conditional LSE is its complicated implementation (involving an optimization procedure), in contrast to the other studied estimators.
Let us explain the strength of the conditional least-squares estimator in more detail. Comparison of the two trajectories in Figure 1 demonstrates the effect of different values of $\lambda$ on trajectories of fOU. In particular, it affects the speed of exponential decay in the initial non-stationary phase and the variability in the stationary phase. As we illustrate below, the discretized least-squares estimator, cf. (8), utilizes information about $\lambda$ from the exponential decay in the initial phase, but cannot make use of the information contained in the variability in the stationary phase. As a consequence, it is not consistent (in the long-span setting). In contrast, the ergodic-type estimator, cf. (10), is derived from the variance of the stationary distribution of the process. It works well for stationary processes (and is consistent), but it does not exploit (and, even worse, is corrupted by) the observation of the process in its initial non-stationary phase. As a result, neither of these estimators can efficiently estimate the drift from long trajectories with far-from-stationary initial values. This gap is best filled by the conditional least-squares estimator, cf. (
25), which effectively utilizes both information stored in non-stationary phase and in stationary phase of the observed process. This unique property is demonstrated in Results and Discussion, where the conditional LSE (denoted by
) dominates the other estimators.
For the three newly introduced estimators, the value of the Hurst parameter $H$ is considered to be known a priori, whereas knowledge of the volatility parameter $\sigma$ is not required, which is an advantage of these methods. If $H$ is not known, it can be estimated in advance by one of many methods, such as methods based on quadratic variations (cf. [14]), sample quantiles or trimmed means (cf. [15]), or a wavelet transform (cf. [16]), to name just a few. Other useful works in this direction include simultaneous estimation of $\sigma$ and $H$ using the powers of the second-order variations (see [17], Chapter 3.3). The estimates of $H$ (obtained independently of $\lambda$) can subsequently be used in the LSE-type estimators of $\lambda$ introduced below in a way similar to [18].
In
Section 2, some elements of stochastic calculus with respect to fBm are recalled, stationary fOU is introduced and precise formulas for two existing drift estimators
and
are provided.
Section 3 is devoted to construction of a new LSE type estimator (
) based on exact formula for fOU. A certain modification of
(denoted as
), which ensures long-span consistency, is introduced in
Section 4. In
Section 5, we rewrite the linear model using conditional expectations to overcome the bias in LSE caused by autocorrelation of the noise. Least-squares method, applied to the conditional model with explicit formulas for conditional expectations, results in the conditional least-squares estimator (
). We prove strong consistency of this estimator. The actual performance of the newly introduced estimators
,
and
as well as their comparison to the already-known
and
, is studied by Monte Carlo simulations in various scenarios and reported in
Section 6. The simulated trajectories were generated in the software R with the YUIMA package (see [19]).
Section 7 summarizes key points of the article and provides possible future extensions.
2. Preliminaries
For the reader's convenience, we briefly review in this section the basic concepts from the theory of stochastic models with fractional noise, including the definition of fBm, the Wiener integral of deterministic functions w.r.t. fBm and the stationary fOU. This exposition follows [2, 20]. For further reading, see also the monograph [1]. At the end of this section, we also recall the formulas for the discretized LSE and the discrete ergodic estimator.
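Fractional Brownian motion with Hurst parameter $H \in (0, 1)$ is a centered (zero-mean) continuous Gaussian process $B^H = (B^H_t)_{t \ge 0}$ starting from zero ($B^H_0 = 0$) and having the following covariance structure
$$E\left[B^H_t B^H_s\right] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right), \qquad s, t \ge 0.$$
Note that for the purpose of construction of the stationary fOU, we need a two-sided fBm $(B^H_t)_{t \in \mathbb{R}}$ with $t$ ranging over the whole $\mathbb{R}$. In this case we have
$$E\left[B^H_t B^H_s\right] = \frac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t-s|^{2H}\right), \qquad s, t \in \mathbb{R}.$$
As a consequence, the increments of fBm are negatively correlated for $H < 1/2$, independent for $H = 1/2$ and positively correlated for $H > 1/2$.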
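The sign of the correlation between increments can be checked directly from the covariance function. The sketch below assumes the standard two-sided fBm covariance, (|t|^{2H} + |s|^{2H} - |t-s|^{2H})/2, and evaluates the covariance of two adjacent increments of length h by bilinearity; the helper names `fbm_cov` and `adjacent_increment_cov` are hypothetical.

```python
def fbm_cov(s, t, hurst):
    """Covariance of two-sided fBm at times s and t (standard formula, assumed here)."""
    return 0.5 * (abs(t) ** (2 * hurst) + abs(s) ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def adjacent_increment_cov(hurst, h=1.0):
    """Cov(B_h - B_0, B_2h - B_h), expanded by bilinearity of the covariance."""
    return (
        fbm_cov(h, 2 * h, hurst)
        - fbm_cov(h, h, hurst)
        - fbm_cov(0.0, 2 * h, hurst)
        + fbm_cov(0.0, h, hurst)
    )


for hurst in (0.3, 0.5, 0.8):
    print(hurst, adjacent_increment_cov(hurst))
```

In closed form the result equals h^{2H}(2^{2H-1} - 1), which is negative, zero or positive according to whether H is below, equal to or above 1/2.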
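Consider a two-sided fBm with $H > 1/2$ and define the Wiener integral of a deterministic step function with respect to the fBm by the formula
$$\int_{\mathbb{R}} \sum_{i=1}^{N} \alpha_i \mathbf{1}_{[t_i, t_{i+1})}(s)\,dB^H_s = \sum_{i=1}^{N} \alpha_i \left(B^H_{t_{i+1}} - B^H_{t_i}\right)$$
for any positive integer $N$, real-valued coefficients $\alpha_1, \ldots, \alpha_N$ and a partition $-\infty < t_1 < t_2 < \cdots < t_{N+1} < \infty$. This definition yields the following isometry for any pair of deterministic step functions $f$ and $g$:
$$E\left[\int_{\mathbb{R}} f(s)\,dB^H_s \int_{\mathbb{R}} g(s)\,dB^H_s\right] = \int_{\mathbb{R}}\int_{\mathbb{R}} f(u)\,g(v)\,\phi(u-v)\,du\,dv = \langle f, g \rangle_H, \tag{3}$$
where $\phi(u) = H(2H-1)|u|^{2H-2}$. Using this isometry, we can extend the definition of the Wiener integral w.r.t. fBm to all elements of the space $L^2_H(\mathbb{R})$, defined as the completion of the space of deterministic step functions w.r.t. the scalar product $\langle \cdot, \cdot \rangle_H$ defined above. As a result, formula (3) holds true for any $f, g \in L^2_H(\mathbb{R})$, see also [21]. We will frequently use this formula in what follows, mainly to calculate the covariances of Wiener integrals.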
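Let $(B^H_t)_{t \in \mathbb{R}}$ be again a two-sided fBm with $H > 1/2$. Define
$$Z_0 = \sigma \int_{-\infty}^{0} e^{\lambda s}\,dB^H_s$$
and denote by $(Z_t)_{t \ge 0}$ the solution to (1) with initial condition $Z_0$ in the sense that it satisfies
$$Z_t = Z_0 - \lambda \int_0^t Z_s\,ds + \sigma B^H_t, \qquad t \ge 0.$$
This process is referred to as the stationary fOU and it can be expressed as
$$Z_t = \sigma \int_{-\infty}^{t} e^{-\lambda (t-s)}\,dB^H_s.$$
Note that the stationary fOU is an ergodic stationary Gaussian process (its autocorrelation function vanishes at infinity).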
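Consider now a stationary fOU $(Z_t)_{t \ge 0}$ observed at discrete time instants $t_k = kh$, $k = 0, 1, 2, \ldots$. The ergodicity and the formula for the second moment of the stationary fOU (see e.g., [20]) imply
$$\frac{1}{N} \sum_{k=0}^{N-1} Z_{kh}^2 \;\longrightarrow\; E\left[Z_0^2\right] = \sigma^2 \lambda^{-2H} H\,\Gamma(2H) \quad \text{a.s. as } N \to \infty. \tag{4}$$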
Analogously,
and the expectation can be calculated using (
3) and the change-of-variables formula
The rest of this section is devoted to the two popular estimators of the drift parameter of a fOU observed at discrete time instants that were described in the Introduction: the discretized LSE and the discrete ergodic estimator. We start with the former. Consider a straightforward discrete approximation of Equation (1):
$$X_{t+h} - X_t \approx -\lambda X_t h + \sigma \left(B^H_{t+h} - B^H_t\right). \tag{7}$$
Application of the standard least-squares procedure to the linear approximation above provides the discretized LSE studied in [
7,
8], which takes the form
where
h is the mesh size (time step) and
with
and
being the observations at adjacent time instants
and
respectively, of the process
defined by (
1) or (
2). Note that having
expressed in terms of
simplifies its comparison with the estimators newly constructed in this paper. Recall that for consistency of
mixed in-fill and long-span asymptotics is required due to the approximation error in (
7).
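The least-squares step behind the discretized LSE can be sketched in a few lines. The snippet assumes the regression form X_{(k+1)h} - X_{kh} ≈ -λ h X_{kh} + noise, whose least-squares solution can be written as (1 - β)/h, with β the no-intercept regression slope of X_{(k+1)h} on X_{kh}. The function name `discretized_lse` and the test trajectory are hypothetical.

```python
import numpy as np


def discretized_lse(x, h):
    """Least-squares drift estimate from the Euler-type linear approximation."""
    x = np.asarray(x, dtype=float)
    # Slope of the no-intercept regression of X_{(k+1)h} on X_{kh}.
    beta = np.sum(x[:-1] * x[1:]) / np.sum(x[:-1] ** 2)
    return (1.0 - beta) / h


# Noise-free trajectory X_{kh} = exp(-lam * k * h) with lam = 0.5 (illustrative).
h = 0.1
x_demo = np.exp(-0.5 * h * np.arange(200))
lam_hat = discretized_lse(x_demo, h)
```

Even without noise the estimate equals (1 - e^{-λh})/h rather than λ, which makes the approximation error visible: it disappears only as h tends to zero.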
The discrete ergodic estimator is derived from the asymptotic behavior of the (stationary) fOU. Recall the convergence in (4). Rearranging the terms provides an asymptotic formula for the drift parameter $\lambda$ expressed in terms of the limit of the second sample moment of the stationary fOU. Substituting the observed fOU $X$ for the stationary fOU in the asymptotic formula results in the discrete ergodic estimator:
which was studied in [
9,
10]. Recall that this estimator is strongly consistent in the long-span regime (no in-fill condition needed). However, it relies heavily on the asymptotic (stationary) behavior of the process and fails for processes with a non-stationary initial phase (as illustrated by numerical experiments below).
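This moment-matching idea can be sketched as follows. The snippet assumes that the limit in (4) is the well-known stationary second moment σ²λ^{-2H}HΓ(2H) of the fOU and solves this relation for λ. The function name `ergodic_estimator` and the artificial data are hypothetical, and both σ and H must be supplied by the user.

```python
import math

import numpy as np


def ergodic_estimator(x, sigma, hurst):
    """Solve sigma^2 * lam^(-2H) * H * Gamma(2H) = second sample moment for lam."""
    second_moment = np.mean(np.square(np.asarray(x, dtype=float)))
    return (sigma ** 2 * hurst * math.gamma(2 * hurst) / second_moment) ** (1.0 / (2 * hurst))


# Artificial data whose second sample moment matches the stationary one exactly.
lam, sigma, hurst = 0.5, 2.0, 0.7
level = math.sqrt(sigma ** 2 * lam ** (-2 * hurst) * hurst * math.gamma(2 * hurst))
stationary_like = level * np.array([1.0, -1.0] * 50)
est_stationary = ergodic_estimator(stationary_like, sigma, hurst)

# A single far-from-stationary initial value inflates the second sample moment.
shifted = np.concatenate(([100.0], stationary_like))
est_shifted = ergodic_estimator(shifted, sigma, hurst)
```

In this toy example `est_stationary` recovers the drift, while `est_shifted` is pulled downwards by the large initial observation, in line with the failure for non-stationary initial phases described above.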
5. Conditional Least-Squares Estimator
Non-stationary trajectories with a long time horizon contain a lot of information about $\lambda$, which is encoded mainly in two aspects: the speed of decay in the initial non-stationary phase and the variance in the stationary phase (see
Figure 1). However, neither of the estimators
,
,
or
can utilize all the information effectively. This motivates us to introduce another estimator. Recall that
fails to be consistent because of bias in LSE caused by the correlation between
and
in Equation (
11). To eliminate the correlation between explanatory variable and noise term in the linear model, we switch to conditional expectations. Start from the following equation, which defines
:
where
is the true value of the unknown drift parameter and
the (conditional) expectation with respect to the measure generated by the fOU
with drift value
and initial condition
. (
stands for an unknown throughout this section). In other words,
means the conditional expectation of
, conditioned by
, where the process
X is given by (
2) with drift
. Hence,
has the same meaning as
in previous sections.
Obviously
and, consequently,
and
are uncorrelated. Indeed,
As a result, we apply the least-squares technique to Equation (
18), where
is to be estimated, i.e., we would like to minimize
To calculate
explicitly, use (
11) and obtain
Note that random vector
has 2-dimensional normal distribution (dependent on parameter
)
and we can use explicit expression for its conditional expectation to write
With respect to the exact formula for
given by (
2) and relation (
3) we get
where we used change-of-variable formula in the last step. Analogously
Using the expressions for
and
in (
20) we obtain
Combining formula (
21) with (
19) yields
with
and
We can thus reformulate the Equation (
18) for the observed process
X as the following model (linear in
, but non-linear in
):
Now we aim to apply the least-squares method to the reformulated model to get the conditional least-squares estimator
. To ensure the existence of global minima, we choose a closed interval
) and define
as the minimizer of sum-of-squares function on this interval:
with criterion function
defined as
where we used (
24) with
.
Note that
is continuous in
and therefore a minimum on the compact interval
exists. Although model (
24) is linear in
, the coefficients
A and
B depend on
t and that complicates the numerical minimization of
.
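The numerical side of this minimization can be sketched independently of the explicit coefficients. The snippet below is a simplified illustration, not the implementation used in this paper: it minimizes a sum-of-squares criterion over a closed interval by a plain grid search, with the conditional mean passed in as a function. The names `conditional_lse` and `markov_cond_mean` are hypothetical, and the stand-in conditional mean e^{-λh}X_t is valid only in the Markov case H = 1/2, which lies outside the range H > 1/2 considered here; in the actual estimator it would be replaced by the conditional mean built from the time-dependent coefficients A and B.

```python
import numpy as np


def conditional_lse(x, h, cond_mean, lam_lo, lam_hi, n_grid=2001):
    """Grid minimizer on [lam_lo, lam_hi] of sum_k (X_{(k+1)h} - E_lam[X_{(k+1)h} | X_{kh}])^2."""
    x = np.asarray(x, dtype=float)
    t = h * np.arange(len(x) - 1)  # observation times t_k = k * h
    grid = np.linspace(lam_lo, lam_hi, n_grid)
    crit = np.array(
        [np.sum((x[1:] - cond_mean(lam, t, x[:-1], x[0], h)) ** 2) for lam in grid]
    )
    return grid[np.argmin(crit)]


def markov_cond_mean(lam, t, x_t, x0, h):
    # Stand-in (assumption): E[X_{t+h} | X_t] = exp(-lam * h) * X_t, valid for H = 1/2 only.
    return np.exp(-lam * h) * x_t


# Noise-free trajectory with drift 0.5 (illustrative values).
x_demo = 10.0 * np.exp(-0.5 * 0.1 * np.arange(51))
lam_cls = conditional_lse(x_demo, 0.1, markov_cond_mean, lam_lo=0.01, lam_hi=2.0)
```

A grid search is used because it does not assume that the criterion is unimodal; its output can serve as a starting point for a local optimizer if higher accuracy is needed.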
Remark 3. Let be the stationary solution to (1). Then
where f is defined in (13) and is arbitrary. Since the coefficient does not depend on t, it is possible to calculate LSE for explicitly and to construct the estimator of λ by applying . Such an estimator coincides with introduced in the previous section. Thus can be understood as the special case of the conditional LSE for the stationary solution.
In order to prove strong consistency of the estimator we need to verify uniform convergence of to a function specified below. Let us start with the following proposition on uniform convergence of and . This proposition will help us in the sequel to investigate the limiting behaviour of the two terms and in the sum-of-squares function .
Proposition 1. Consider and defined by (
22)
and (
23),
and f defined by (
13)
. Fix arbitrary . Then
Proof. In order to simplify the notation, denote
and notice that
Choose any
and recall that
is fixed. The uniform convergences in (
27) and (
28) imply the following convergences uniformly in
:
and
respectively. Indeed, set
and
and fix any
. There is
such that for any
,
If
, then
for any
. Consequently
which proves (
29). The convergence in (
30) can be shown analogously. These uniform convergences will be helpful in the proof of the following Lemma, which provides uniform convergence of
to a limiting function
. This uniform convergence is the key ingredient for the convergence of the minimizers
.
Lemma 3. Let f be defined by (
13)
and let be defined by (
26),
where is the observed process with drift value λ. Denote
Proof. First consider the stationary solution
to (
1) corresponding to drift value
. Comparison of (
6) with (
13) yields
It enables us to write
for any
, and, consequently
Recall that
is ergodic and
vanishes at infinity. Using Lemma 1 in the same way as in the proof of Theorem 1 implies
For the second term, write
Application of Lemma 1, the convergence in (
29) and the continuity of
f ensure the convergence with probability one of both summands to zero as
.
The uniform convergence of the third term can be shown analogously:
where we use
which follows directly from (
29) and the continuity of
f.
The last term in (
33) can be treated similarly:
where
Lemma 1 concludes the proof:
□
Previous considerations lead to the convergence of , being the minimizers of to the minimizer of . Next lemma ensures that this minimizer coincides with the true drift value .
Lemma 4. defined by (
31)
is continuous on and λ is the unique minimizer of , i.e.,
Proof. The claim follows immediately, because f is one-to-one (it is strictly decreasing).
Continuity of is a direct consequence of the continuity of f. □
Now we are in a position to prove the strong consistency of .
Theorem 3. Consider bounds so that they cover the true drift λ of the observed solution to Equation (
1)
, i.e., . Then defined in (
25)
is strongly consistent, i.e., Proof. The proof follows standard argumentation from nonlinear regression and utilizes Lemma 3 and Lemma 4. Choose
sufficiently small so that
and set
Consider a set of full measure on which the uniform convergence (
32) holds and take
such that
Fix any
. Then for arbitrary
we get
As
minimizes
, for all
we have
Since
was arbitrary (if small enough), we obtain the convergence
on a set of full measure. □
6. Results and Discussion
In
Table 1 we present comparison of the root mean square errors (RMSE) of all considered estimators for
and several combinations of
,
T and
H. Estimators
and
demonstrate good performance in scenarios with far-from-zero initial (
) condition and short time horizon (
). This illustrates that these estimators mainly reflect the speed of convergence to zero of the observed process in its initial phase. Increasing the time horizon to
adds a stationary phase to the observed trajectories, which distorts the estimators
and
.
Estimators and perform well in settings with stationary-like initial condition () and long time horizon (). This is because they are constructed from the stationary behavior of the process. A far-from-zero initial condition ruins these estimators unless the trajectory is very long.
The conditional LSE,
, shows reasonable performance in all studied scenarios and it significantly outperforms the other estimators in scenario with far-from-stationary initial condition (
) and long time horizon (
). This results from the unique ability of this estimator to reflect and utilize information about the drift from both the non-stationary (decreasing) phase and the stationary (oscillating) phase. This is also illustrated in
Figure 5. On the other hand, evaluation of
is numerically the most demanding among the studied estimators.
If
and
,
shows greater RMSE than
and
in
Table 1 due to
being relatively close to zero. As a consequence,
and
have smaller variance (although greater bias) compared to
(see
Figure 6). To illustrate this effect, we have calculated the RMSE for simulations in the same scenario but with
(see
Table 2).
provides smaller RMSE than the other estimators in this setting.
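The interplay between bias and variance in the RMSE discussed above follows from the identity RMSE² = bias² + variance. The sketch below computes the three quantities from a set of Monte Carlo estimates; the function name `rmse_bias_variance` is hypothetical, and the numbers are invented for illustration and are not taken from Table 1 or Table 2.

```python
import numpy as np


def rmse_bias_variance(estimates, true_value):
    """Return (RMSE, bias, variance) of Monte Carlo estimates of a known parameter."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    variance = est.var()  # population variance (ddof=0), so RMSE^2 = bias^2 + variance
    rmse = np.sqrt(np.mean((est - true_value) ** 2))
    return rmse, bias, variance


# Invented estimates of a true drift of 0.5.
tight_but_biased = [0.40, 0.42, 0.38, 0.41, 0.39]
unbiased_but_noisy = [0.30, 0.70, 0.50, 0.65, 0.35]
rmse_a, bias_a, var_a = rmse_bias_variance(tight_but_biased, 0.5)
rmse_b, bias_b, var_b = rmse_bias_variance(unbiased_but_noisy, 0.5)
```

In this toy example the biased but tightly concentrated estimates have the smaller RMSE, which is the same mechanism by which an estimator with greater bias can show a smaller RMSE than a less biased competitor.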