4.1. A Premise on the Proposed Method
Since it is analytically straightforward, the classical statistical approach to the estimation of the CIR model is only hinted at in the next section, while in this section the Bayesian method of inference is summarized and then applied to the CIR model. As is well known, in the classical statistical approach [41], featuring (among others) the method of moments and the maximum likelihood (ML) method of estimation, any parameter to be estimated is seen as a fixed but unknown quantity. Instead, in Bayesian parameter estimation [39,40,41,43], each unknown parameter is seen as a random variable, so that it possesses a "prior" PDF representing the probabilistic information, or the "belief", of the observer about the parameter.
Then, denoting the parameter (possibly a vector) by w, its prior PDF g(w) is integrated and updated with field data, denoted by D, through the well-known Bayes' theorem reported below, which allows one to obtain the posterior distribution g(w|D) of w, conditional on the observed data D:
g(w|D) = C L(D|w) g(w)    (29)
where L(D|w) is the likelihood function of the observed data and C is the normalizing constant:
C = [ ∫Sw L(D|w) g(w) dw ]^(−1)    (30)
The above integral is a multiple one in the case that w is a vector of parameters, say w = (w1, w2, …, wp), so that dw = dw1 dw2 … dwp.
In a Bayesian analysis, the posterior distribution sums up the current state of knowledge about the unknown parameters of probability distributions, integrating the prior knowledge (before data observation), or the initial belief about the parameter, represented by the prior distribution, with the knowledge produced by the data in terms of the likelihood function of the observed data, which in a sense measures the information contained in the data. Once the posterior distribution, here in terms of the posterior PDF g(w|D) of (29), has been deduced, there are various possible choices for the Bayes point estimate of the parameter w. A very common choice for the "best" Bayes estimate, in the mean square error sense, is the posterior mean of such a PDF. As is well known [39], this estimate minimizes the posterior mean squared error.
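The claim that the posterior mean minimizes the posterior mean squared error follows from a short standard computation, sketched here for a scalar parameter: expanding the posterior MSE of a candidate estimate ŵ,

```latex
E\left[(w-\hat{w})^{2}\mid D\right]
   = E\left[w^{2}\mid D\right] - 2\,\hat{w}\,E\left[w\mid D\right] + \hat{w}^{2},
\qquad
\frac{d}{d\hat{w}}\,E\left[(w-\hat{w})^{2}\mid D\right]
   = -2\,E\left[w\mid D\right] + 2\,\hat{w} = 0
\;\Longrightarrow\;
\hat{w} = E\left[w\mid D\right].
```

Since the second derivative equals 2 > 0, the posterior mean is indeed the unique minimizer.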
Another popular choice, adopted here, is the so-called "Maximum a Posteriori Probability" (MAP) estimation, in which the point estimate w° of the parameter w is the value which maximizes the posterior PDF g(w|D):
w° = arg max g(w|D), w ∈ Sw    (31)
where Sw is the parameter space, i.e., the set of all possible values that w can assume. This choice, which by definition maximizes the probability of a correct estimation (or minimizes the estimation error probability), constitutes a "classical" method in Bayesian analysis and has gained renewed interest in recent years, being also adopted in the framework of recent machine learning methods [44,45,46].
Both the posterior mean and the MAP estimator, as will be apparent from the following equations, require numerical methods; the MAP estimator, however, is somewhat simpler to compute, while remaining very close to the posterior mean.
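As a concrete numerical illustration of Equation (29) and of the two estimators, the sketch below evaluates a posterior on a grid for a scalar parameter. The exponential stand-in likelihood, the data values, and the prior parameters are all hypothetical illustrations, not the paper's CIR model:

```python
import math

# Hypothetical illustration of Equation (29): lognormal prior g(w),
# exponential stand-in likelihood L(D|w); NOT the CIR model of the paper.
data = [0.8, 1.1, 0.6, 1.4, 0.9]  # invented observations

def prior(w, scale=1.0, shape=0.5):
    # Lognormal prior PDF g(w)
    return math.exp(-(math.log(w / scale)) ** 2 / (2 * shape ** 2)) \
           / (w * shape * math.sqrt(2 * math.pi))

def likelihood(w):
    # L(D|w): product of exponential densities w * exp(-w * x)
    return math.prod(w * math.exp(-w * x) for x in data)

step = 0.01
grid = [step * i for i in range(1, 1001)]          # parameter space Sw, truncated
unnorm = [prior(w) * likelihood(w) for w in grid]  # g(w) * L(D|w)

# Normalizing constant of (29), computed by the trapezoidal rule
norm = sum(step * 0.5 * (unnorm[i] + unnorm[i + 1]) for i in range(len(unnorm) - 1))
posterior = [u / norm for u in unnorm]

# Posterior mean (minimum-MSE estimate) and MAP estimate (grid maximum)
post_mean = sum(step * 0.5 * (grid[i] * posterior[i] + grid[i + 1] * posterior[i + 1])
                for i in range(len(grid) - 1))
w_map = max(grid, key=lambda w: prior(w) * likelihood(w))
print(post_mean, w_map)  # the two estimates are close for this unimodal posterior
```

Note that the MAP estimate only requires maximizing the unnormalized product g(w)L(D|w), while the posterior mean needs the full normalized posterior.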
In view of the estimation of the CIR model, the parameter w to be estimated, appearing in Equations (29)–(31), i.e., the "input" of the estimation process, may naturally be the parameter η of (2)–(4), also reported in the following Equation (32). The parameter η also represents the median of the CIR distribution. However, any other parameter τ = τ(η) related to the unknown parameter η may also be chosen as the input parameter. It is recalled that, in the Bayesian methodology, any function of the random parameter η is itself an RV, described by an appropriate distribution. In particular, here, once a given WS value x has been fixed, the attention may be focused on the inference on the value of the CIR CDF at x:
Indeed, the CDF, as well as other parameters (e.g., moments and quantiles), may also be considered as the input of the estimation process. It is remarked that, although formally equal to Equation (2), the expression in (32) should not be seen as a function of the WS x, which is instead a fixed constant there. It is indeed a function of the unknown parameter η, thus constituting a new parameter, on which it should be easy to assess a prior PDF. Alternatively, for mathematical convenience, by introducing the new RV:
the above expression of the CIR CDF can be re-parametrized (omitting for clarity the dependence on x, which is a constant, and so writing F instead of F(x)):
Of course, assessing a prior PDF on the random parameter η implies, x being a given constant, assessing a prior PDF on the random parameter Y, as well as on F and other related parameters; the converse is also true. The choice of the most convenient formulation depends on the kind of information that the observer possesses.
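This equivalence between the alternative prior assessments rests on the usual change-of-variable rule for densities: for a monotone re-parametrization τ = τ(η), a prior g(η) on one parameter induces a prior on the other via

```latex
q(\tau) \;=\; g\!\left(\eta(\tau)\right)\left|\frac{d\eta}{d\tau}\right|,
```

so that specifying either density determines the other once the transformation is fixed.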
For instance, a reasonable prior assessment could be one assessing the probability that the WS is higher than a given value x: this seems indeed a realistic piece of information, which should be available to the engineer based on past WS data. By the above relation, such an "exceedance probability" is expressed by:
with S(x) = 1 − F(x) the "survival function" of the WS RV. It is interesting to remark that this can be considered an "un-safety index", accounting for the most severe EWS amplitudes as disturbances to wind tower safety.
In summary, for the above estimation, two methods are considered in the paper for establishing an appropriate prior distribution, as illustrated in the following sub-sections: assigning a Lognormal prior distribution to the parameter η (Section 4.2), or a Beta prior distribution to the parameter S (Section 4.3). Although, in general, the two prior assessments (i.e., assigning a prior distribution to the parameter η or to the parameter S) can be made equivalent by an appropriate choice of the relevant prior PDF parameters, choosing one or the other as the basic input information has some peculiar analytical aspects that are highlighted in the following, since, for instance, a Lognormal prior distribution on the parameter η does not imply a Beta prior distribution on the parameter S, but rather a so-called "Beta-Lognormal" distribution, discussed below. Such a "Beta-Lognormal" distribution can nonetheless be approximated to a certain extent by a Beta distribution.
Finally, it is remarked that the choice of assigning a prior distribution to the parameter S is in accordance with so-called "practical" Bayes estimation, which is widely adopted nowadays in various applications, after being first proposed in [36] in the framework of reliability analyses.
4.2. “First Approach”: Choice of a Lognormal Prior Distribution of the Parameter η
Let us examine the first choice, for which a Lognormal prior PDF is suggested for its great flexibility and for its easy transformation into the corresponding prior PDFs for the above-introduced RVs Y and S, when needed. Let η be a Lognormal RV, with the following PDF [33]:
The function g(η) is zero for η < 0; the parameters γ and β are, respectively, the scale and the shape parameters of the Lognormal PDF. Then, by virtue of (33), Y is also a Lognormal RV, because of known properties of the Lognormal model [33] (among which the fact that, given a Lognormal RV Y, any power Y^k is still a Lognormal RV, whatever the value of the real constant k). The parameters of the PDF of Y are simply deducible from those of η. Then, the RV S of (35) possesses a known PDF, which can be expressed as follows. First, the new RV:
is introduced. It is again a Lognormal RV, with a PDF, here denoted by h(t), simply expressible in terms of the Lognormal PDF of (36):
Then, the PDF of S is easily expressible in terms of that of T, since:
Then, denoting by q(s) the PDF of S, it can be expressed in terms of the above PDF h(t) of T by means of a "Beta-Lognormal" PDF (introduced in [47]), which, by well-known rules of PDF transformations, is given by:
Of course, since F = 1 − S, the prior PDF of F is also simply expressible. The above PDF can be conveniently approximated by a Beta PDF, as discussed in [47]. Indeed, it is known that a Beta RV, say Z, may be characterized, under some hypotheses, as the ratio Z = U/(U + V) = 1/(1 + W), with U and V independent Gamma RVs [33,43], and thus W = V/U the ratio of two independent Gamma RVs. Because of the known similarity between the Gamma and the Lognormal PDF (for given common values of the mean and standard deviation [46]), it may be expected that the PDF of S in (39) may also be at least approximated by a Beta PDF, as was successfully verified in our application. Indeed, in the following Section 5 it is shown that the estimation results are not very different when assuming a Lognormal PDF on η (as discussed here) or a Beta PDF on S, as discussed in the following sub-section.
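The closure of the Lognormal family under power transformations, invoked above for Y, can be checked by simulation: if log Y is normal with mean μ and standard deviation σ, then log(Y^k) = k log Y is normal with mean kμ and standard deviation |k|σ. The parameter values below are arbitrary illustrations:

```python
import math
import random

random.seed(42)
mu, sigma, k = 0.3, 0.4, 2.0   # arbitrary illustrative values

# Sample a Lognormal RV Y and form the power Y**k;
# log(Y**k) = k * log(Y) must again be normally distributed.
ys = [random.lognormvariate(mu, sigma) for _ in range(100_000)]
logs = [k * math.log(y) for y in ys]

n = len(logs)
mean = sum(logs) / n
std = math.sqrt(sum((v - mean) ** 2 for v in logs) / n)
print(mean, std)  # close to k*mu = 0.6 and k*sigma = 0.8
```

The sample mean and standard deviation of log(Y^k) match the predicted kμ and |k|σ, confirming that Y^k is again Lognormal with simply rescaled parameters.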
4.3. “Second Approach”: Choice of a Beta Prior Distribution of the Parameter S
With the purpose of assessing the prior information on S, a Beta PDF [33,39,41,43] may be successfully adopted, as generally done for RVs in (0,1) [39], due to its great flexibility in this interval. The Beta PDF for S, f(s), with positive parameters (p, q), can be expressed as follows:
It is noticeable that in this case F = 1 − S is also a Beta RV [33], whose PDF can be obtained from (41) by simply swapping the roles of the parameters (p, q). This choice also implies that the RV Y of (35), which is obtained as a function of S (as simply deducible from (35) itself) by:
is an RV with support (0, ∞), which possesses a so-called "Beta Prime" distribution [33], with the same positive parameters (p, q) as the Beta PDF of S. This means that the PDF u(y) of Y can be expressed as follows:
Such a PDF of Y implies in turn an analytical expression also for the parameter η, which is also an RV with support (0, ∞), and by means of (33) is proportional to the square root of Y:
This means that the PDF of η can still be expressed, with an adequate transformation, in terms of the Beta Prime distribution of (43). That is, given the PDF u(y; p, q) of (43), the implied prior PDF of η, say v(η), is expressed by:
The applications of the above Equations (36) and (45), expressing the prior PDF of η according to the first and the second approach, respectively, have been performed in the course of the computations of the following section, as described afterwards.
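The stated Beta-Prime property can also be checked by simulation. Since the defining equation for Y is not reproduced in this excerpt, the standard Beta-to-Beta-Prime transformation Y = S/(1 − S) is assumed here; for q > 1 the Beta Prime distribution with parameters (p, q) has mean p/(q − 1). The parameter values are illustrative:

```python
import random

random.seed(7)
p, q = 3.0, 5.0   # illustrative positive parameters

# Sample S ~ Beta(p, q) and map it through the assumed standard
# transformation Y = S / (1 - S), which yields a Beta Prime RV
# with the same parameters (p, q).
ss = [random.betavariate(p, q) for _ in range(200_000)]
ys = [s / (1.0 - s) for s in ss]

emp_mean = sum(ys) / len(ys)
print(emp_mean)  # close to the Beta Prime mean p/(q-1) = 0.75
```

The empirical mean of the transformed sample agrees with the theoretical Beta Prime mean, consistent with Y having support (0, ∞) and a Beta Prime distribution.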
Then, for both the first and the second approach, in order to obtain the Bayes estimate (be it the posterior mean or the MAP estimate), the posterior PDF shall be computed according to (29), i.e., by multiplying the above prior PDF ((36) or (45), according to the case under study) by the likelihood function L(D|w). This likelihood function, which can be denoted by L(D|η), since η is in this case the only parameter to be estimated, is the joint PDF of the CIR RVs which constitute the sample of n WS values (x1, x2, …, xn); i.e., it is the product of the n PDFs f(xj|η), j = 1, …, n, such as the one in (3), evaluated at xj (j = 1, …, n), n being the sample size.
As far as the MAP estimate is concerned, it is not necessary to evaluate the constant C in (29), since its value does not change the location of the maximum of the product g(η)L(D|η). However, the MAP estimate must be obtained numerically (and the same would be true for the posterior mean or other possible estimates).
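A minimal numerical sketch of this MAP computation is given below. Since the CIR density f(x|η) of (3) is not reproduced in this excerpt, a Rayleigh density parametrized by its median η is used as a purely illustrative stand-in, together with a Lognormal prior on η as in the first approach; the data sample and prior parameters are hypothetical:

```python
import math

data = [4.2, 5.1, 3.8, 6.0, 4.7, 5.5]   # hypothetical WS sample (m/s)

def prior(eta, gamma=5.0, beta=0.4):
    # Lognormal prior g(eta), scale gamma and shape beta, as in (36)
    return math.exp(-(math.log(eta / gamma)) ** 2 / (2 * beta ** 2)) \
           / (eta * beta * math.sqrt(2 * math.pi))

def likelihood(eta):
    # L(D|eta): product of n stand-in densities f(x|eta). A Rayleigh PDF
    # parametrized by its median eta replaces the CIR PDF of (3):
    # f(x|eta) = (2*ln2 * x / eta^2) * exp(-ln2 * (x/eta)^2)
    ln2 = math.log(2.0)
    return math.prod((2 * ln2 * x / eta ** 2) * math.exp(-ln2 * (x / eta) ** 2)
                     for x in data)

# MAP estimate: maximize g(eta) * L(D|eta) on a grid over the parameter
# space; the constant C of (29) is irrelevant, as it does not move the maximum.
grid = [0.05 * i for i in range(1, 400)]
eta_map = max(grid, key=lambda e: prior(e) * likelihood(e))
print(eta_map)
```

Here the grid maximization is the simplest possible numerical scheme; in practice any one-dimensional optimizer applied to the log of the unnormalized posterior would serve equally well.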