*4.3. Other Conformity Indices*

First, we consider the conformity constraint (9) in more detail. It admits the following interpretation. Let $\overline{F} \in \mathbb{F}$ be some reference distribution. The constraint $\mathfrak{L}(y, F) \geqslant \mathfrak{L}(y, \overline{F})$ is a specific case of (9): the feasible distributions $F$ should be no less relevant to the available observations $Y = y$ than the reference distribution $\overline{F}$ is. Another interpretation is also possible. Let $\overline{q} \in \mathcal{C}$ be some "guess" value of the uncertain parameter $\gamma$, and let $\alpha > 0$ be a fixed value. The constraint:

$$\frac{\mathfrak{L}(y, F)}{\mathcal{L}(y|\overline{q})} \geqslant \alpha \tag{33}$$

is a specific case of (9): it means that the likelihood ratio of any feasible distribution $F$ to the one-point distribution at $\overline{q}$ should be no less than the level $\alpha$. Obviously, the guess value $\overline{q}$ could be chosen among the maximizers of the function $\mathcal{L}$, i.e., $\overline{q} \in \operatorname{Argmax}_{q \in \mathcal{C}} \mathcal{L}(y|q)$, but computing these maximizers is itself a nontrivial likelihood maximization problem. In Section 5 we use a modification of (33):

$$\frac{\mathfrak{L}(y, F) - \min\_{q \in \mathcal{C}\_n} \mathcal{L}(y|q)}{\max\_{q \in \mathcal{C}\_n} \mathcal{L}(y|q) - \min\_{q \in \mathcal{C}\_n} \mathcal{L}(y|q)} \geqslant r \tag{34}$$

where $\mathcal{C}_n \subseteq \mathcal{C}$ is a known subset, and $r \in (0, 1)$ is a fixed parameter. This form is important because, in the case $\mathcal{C} = \mathcal{C}_n$, it guarantees that the constraint (34) is active in the considered minimax optimization problem for each $r \in (0, 1)$.
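As an illustration, the left-hand side of (34) can be evaluated directly when $F$ is a discrete distribution concentrated on a finite mesh. The sketch below assumes that the conformity index has the form $\mathfrak{L}(y, F) = \sum_i \mathcal{L}(y|q_i) p_i$ for such a distribution; the function names `normalized_conformity` and `satisfies_34` are hypothetical and serve only this example.

```python
from typing import Sequence


def normalized_conformity(lik: Sequence[float], weights: Sequence[float]) -> float:
    """Left-hand side of (34) for a discrete F on a finite mesh (illustrative).

    lik[i]     -- likelihood value L(y | q_i) at the mesh point q_i
    weights[i] -- probability mass p_i that F assigns to q_i
    """
    if len(lik) != len(weights):
        raise ValueError("lik and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must form a probability vector")
    lo, hi = min(lik), max(lik)
    if hi == lo:
        raise ValueError("likelihood is constant on the mesh; the ratio is undefined")
    # Assumed form of the conformity index for a discrete F.
    index = sum(w * l for w, l in zip(weights, lik))
    return (index - lo) / (hi - lo)


def satisfies_34(lik: Sequence[float], weights: Sequence[float], r: float) -> bool:
    """Check the constraint (34) for a given level r in (0, 1)."""
    return normalized_conformity(lik, weights) >= r
```

Under this assumption, a point mass at a likelihood maximizer yields the value 1 and a point mass at a minimizer yields 0, so for every $r \in (0, 1)$ the constraint admits some distributions on the mesh and excludes others.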

Furthermore, the proposed conformity index $\mathfrak{L}(y, F)$ in (9) is not the only numerical characteristic that describes the interconnection between $F$ and $Y$. For example, an alternative conformity index can be defined as $\int_{\mathcal{C}} f(\mathcal{L}(y|q)) F(dq)$, where $f(\cdot): \mathbb{R} \to \mathbb{R}$ is some continuous nondecreasing function. Another way to introduce this index is to set it as $\int_{\mathcal{S}(y)} \mathfrak{L}(y', F)\,dy' = \mathbb{P}_F\{Y \in \mathcal{S}(y)\}$, i.e., as the probability that the observation $Y$ lies in the confidence set $\mathcal{S}(y) \in \mathcal{B}(\mathbb{R}^k)$.

For a particular case of the observation model (1) we can propose one more conformity index that is based on the EDF. Let us consider the observation model with the "pure uncertain" estimated parameter *γ*:

$$Y\_t = A(\gamma) + B(\gamma)V\_t, \quad t = \overline{1,T}.\tag{35}$$

Here:


If the value $\gamma$ is known, the observations $\{Y_t\}_{t=\overline{1,T}}$ can be considered as i.i.d. random values whose pdf coincides with $\phi_V(v)$ up to a shift and scaling. The EDF of the sample $\{Y_t\}_{t=\overline{1,T}}$ has the form:

$$F\_T^\*(y) \stackrel{\triangle}{=} \frac{1}{T} \sum\_{t=1}^T \mathbf{I}(y - Y\_t). \tag{36}$$

On the other hand, the cdf $F^Y(y)$ of any observation $Y_t$ for a fixed distribution $F$ can be calculated as:

$$F^Y(y) \stackrel{\Delta}{=} \int\_{-\infty}^{y} \int\_{\mathcal{C}} \phi\_V \left( \frac{u - A(q)}{B(q)} \right) F(dq) du. \tag{37}$$

*The sample conformity index based on the EDF* is the following value:

$$\mathfrak{M}(\mathbf{Y}\_{T}, F) \stackrel{\triangle}{=} \|F\_{T}^\* - F^{Y}\|\_{\infty} = \sup\_{y \in \mathbb{R}} |F\_{T}^\*(y) - F^{Y}(y)|.\tag{38}$$

The new uncertainty set $\mathbb{F}_M$ describing all admissible distributions $F$ satisfies conditions (i), (ii) and (iv) above, but condition (iii) is replaced by the following one:

(x) the constraint

$$\mathfrak{M}(\mathbf{Y}\_{T}, F) \leqslant M \tag{39}$$

holds for all $F \in \mathbb{F}_M$ and some fixed level $M > 0$. It is called *the constraint based on the EDF*.

The proposed conformity index is the well-known Kolmogorov distance used in goodness-of-fit testing. The asymptotic distribution of $\mathfrak{M}(\mathbf{Y}_T, F)$ is also known:

$$\lim\_{T \to \infty} \mathbb{P}\left\{ \mathfrak{M}(\mathbf{Y}\_T, F) < \frac{x}{\sqrt{T}} \right\} = \sum\_{j=-\infty}^{+\infty} (-1)^j e^{-2j^2 x^2}.$$
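One possible use of this limit, shown here only as a hedged sketch, is to calibrate the level $M$ in (39): pick $x$ such that the series equals a desired confidence level and set $M = x/\sqrt{T}$. The text does not prescribe this choice; the helper names `kolmogorov_cdf` and `edf_level`, the truncation at 100 terms, and the bisection bracket are assumptions of the example.

```python
import math


def kolmogorov_cdf(x: float, terms: int = 100) -> float:
    """Truncated series sum_{j=-inf}^{+inf} (-1)^j exp(-2 j^2 x^2)."""
    if x < 0.2:
        # The limit distribution is below 1e-12 here, and the truncated
        # series converges slowly for small x, so return 0 directly.
        return 0.0
    tail = sum((-1) ** j * math.exp(-2.0 * j * j * x * x) for j in range(1, terms + 1))
    # The j = 0 term equals 1; terms with +j and -j coincide.
    return min(1.0, max(0.0, 1.0 + 2.0 * tail))


def edf_level(T: int, confidence: float) -> float:
    """Level M = x / sqrt(T), where x solves kolmogorov_cdf(x) = confidence."""
    if T <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need T > 0 and 0 < confidence < 1")
    lo, hi = 0.2, 5.0  # bracket assumed wide enough for practical confidence levels
    for _ in range(60):  # bisection on a nondecreasing function
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / math.sqrt(T)
```

For example, `edf_level(100, 0.95)` gives a level close to $1.358/\sqrt{100}$, the familiar 95% critical value of the Kolmogorov test scaled by the sample size.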

Furthermore, the value $\mathfrak{M}(\mathbf{Y}_T, F)$ can be easily calculated, because the function $F_T^{\ast}$ is piecewise constant while $F^Y$ is continuous:

$$\mathfrak{M}(\mathbf{Y}\_{T}, F) = \max\_{1 \le t \le T} \max(|F\_T^\*(Y\_t -) - F^Y(Y\_t)|, |F\_T^\*(Y\_t) - F^Y(Y\_t)|),$$

and the cdf $F^Y$ is calculated by (37).
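The calculation can be sketched for a discrete distribution $F$ with masses $p_i$ at mesh points $q_i$. The example assumes standard Gaussian noise $V_t$ and $B(q_i) > 0$, so that the cdf of $A(q_i) + B(q_i)V_t$ is $\Phi((y - A(q_i))/B(q_i))$; both assumptions, as well as the names `mixture_cdf` and `edf_conformity`, are illustrative and not taken from the text.

```python
import math
from typing import Callable, Sequence


def std_normal_cdf(z: float) -> float:
    """Standard Gaussian cdf (assumed noise distribution)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_cdf(y: float, a: Sequence[float], b: Sequence[float],
                weights: Sequence[float],
                noise_cdf: Callable[[float], float] = std_normal_cdf) -> float:
    """cdf of Y_t for a discrete F: sum_i p_i * noise_cdf((y - A(q_i)) / B(q_i))."""
    return sum(p * noise_cdf((y - ai) / bi) for p, ai, bi in zip(weights, a, b))


def edf_conformity(sample: Sequence[float], a: Sequence[float], b: Sequence[float],
                   weights: Sequence[float],
                   noise_cdf: Callable[[float], float] = std_normal_cdf) -> float:
    """Kolmogorov distance (38) between the EDF of the sample and the mixture cdf."""
    ys = sorted(sample)
    T = len(ys)
    dist = 0.0
    for t, y in enumerate(ys, start=1):
        fy = mixture_cdf(y, a, b, weights, noise_cdf)
        # The EDF equals t/T at the t-th order statistic and (t-1)/T just before it.
        dist = max(dist, abs(t / T - fy), abs((t - 1) / T - fy))
    return dist
```

Only the $T$ sample points need to be visited, because the supremum in (38) is attained at a jump of the EDF.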

The distribution set determined by (39) takes the form:

$$\left\{ F \in \mathbb{F} \colon -M + F\_T^\*(Y\_t) \leqslant \int\_{-\infty}^{Y\_t} \int\_{\mathcal{C}} \phi\_V \left( \frac{u - A(q)}{B(q)} \right) F(dq) du \leqslant M + F\_T^\*(Y\_t -), \quad t = \overline{1, T} \right\}. \tag{40}$$

Using the variational series $\mathbf{Y}_{(T)} \triangleq \operatorname{col}(Y_{(1)}, \ldots, Y_{(T)})$ of the sample $\mathbf{Y}_T$, and recalling that $F_T^{\ast}(Y_{(t)}) = \frac{t}{T}$ and $F_T^{\ast}(Y_{(t)}-) = \frac{t-1}{T}$, (40) can be rewritten in the form:

$$\left\{ F \in \mathbb{F} \colon -M + \frac{t}{T} \leqslant \int\_{-\infty}^{Y\_{(t)}} \int\_{\mathcal{C}} \phi\_V \left( \frac{u - A(q)}{B(q)} \right) F(dq) du \leqslant M + \frac{t - 1}{T}, \quad t = \overline{1, T} \right\}. \tag{41}$$

It can be seen that this set is a convex closed polyhedron lying in $\mathbb{F}$, with at most $2T$ facets. All assertions formulated in Section 3 remain valid after replacing the uncertainty set $\mathbb{F}_L$, generated by the likelihood function, by the set $\mathbb{F}_M$, generated by the EDF. Moreover, the mesh algorithm for solving the dual optimization problem, presented above in Section 4.1, can also be applied to this case.
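To make the polyhedral structure concrete, the sketch below writes the $2T$ inequalities of (41) for a discrete distribution with masses $p_i$ at mesh points $q_i$, where they become linear in the vector $p$. It assumes standard Gaussian noise and $B(q_i) > 0$; the names `edf_constraint_rows` and `is_feasible` are hypothetical, and the sketch is not the mesh algorithm of Section 4.1.

```python
import math
from typing import List, Sequence, Tuple


def std_normal_cdf(z: float) -> float:
    """Standard Gaussian cdf (assumed noise distribution)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def edf_constraint_rows(sample: Sequence[float], a: Sequence[float],
                        b: Sequence[float], M: float
                        ) -> Tuple[List[List[float]], List[float], List[float]]:
    """Return (G, lower, upper) so that (41) reads lower[t] <= (G p)[t] <= upper[t]."""
    ys = sorted(sample)  # order statistics Y_(1) <= ... <= Y_(T)
    T = len(ys)
    # G[t][i] is the cdf of A(q_i) + B(q_i) V evaluated at the t-th order statistic.
    G = [[std_normal_cdf((y - ai) / bi) for ai, bi in zip(a, b)] for y in ys]
    lower = [t / T - M for t in range(1, T + 1)]
    upper = [(t - 1) / T + M for t in range(1, T + 1)]
    return G, lower, upper


def is_feasible(p: Sequence[float], G: List[List[float]], lower: List[float],
                upper: List[float], tol: float = 1e-12) -> bool:
    """Check whether the probability vector p satisfies all 2T inequalities."""
    for row, lo, hi in zip(G, lower, upper):
        value = sum(g * pi for g, pi in zip(row, p))
        if value < lo - tol or value > hi + tol:
            return False
    return True
```

Since each pair of bounds requires $t/T - M \leqslant (t-1)/T + M$, the set is empty whenever $M < 1/(2T)$, which gives a quick sanity check on the chosen level.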

Let us consider the observation model (35) again. We can use the sample mean $\overline{Y} \triangleq \frac{1}{T}\sum_{t=1}^{T} Y_t$ as one more conformity index. Recall that, by the model property, the random parameter $\gamma(\omega)$ is constant for each sample $\mathbf{Y}_T$. For sufficiently large $T$, the central limit theorem allows us to treat the normalized value $\frac{\sqrt{T}(\overline{Y} - A(\gamma))}{|B(\gamma)|}$ as a standard Gaussian random variable. We then fix a standard Gaussian quantile $c_\alpha$ of the confidence level $\alpha$ and single out the subset:

$$\mathcal{C}\_{\alpha} \stackrel{\triangle}{=} \left\{ q \in \mathcal{C} : \overline{Y} - \frac{c\_{\alpha}|B(q)|}{\sqrt{T}} \leqslant A(q) \leqslant \overline{Y} + \frac{c\_{\alpha}|B(q)|}{\sqrt{T}} \right\} \subseteq \mathcal{C}.$$

If $\mathcal{C}_\alpha$ is compact, then the set $\mathbb{F}_\alpha$ of all probability distributions whose support lies in $\mathcal{C}_\alpha$ is called *the set of admissible distributions satisfying the sample mean conformity constraint of the level* $\alpha$.
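On a finite mesh, the subset $\mathcal{C}_\alpha$ can be approximated by filtering the mesh points. The sketch below assumes a two-sided interval, so the quantile is taken as $c_\alpha = \Phi^{-1}((1+\alpha)/2)$, and it evaluates the half-width at the candidate point, i.e., with $|B(q)|$; the function name `sample_mean_subset` and these conventions are assumptions of the example.

```python
import math
from statistics import NormalDist, fmean
from typing import Callable, List, Sequence


def sample_mean_subset(sample: Sequence[float], mesh: Sequence[float],
                       A: Callable[[float], float], B: Callable[[float], float],
                       alpha: float) -> List[float]:
    """Mesh points q with |A(q) - mean(sample)| <= c_alpha * |B(q)| / sqrt(T)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    T = len(sample)
    y_bar = fmean(sample)
    # Two-sided standard Gaussian quantile (an assumption of this sketch).
    c_alpha = NormalDist().inv_cdf(0.5 * (1.0 + alpha))
    return [q for q in mesh
            if abs(A(q) - y_bar) <= c_alpha * abs(B(q)) / math.sqrt(T)]
```

For instance, with $A(q) = q$, $B(q) = 1$, four observations equal to 1, and $\alpha = 0.95$, the half-width is about $1.96/2 = 0.98$, so only mesh points within that distance of 1 are retained.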

The comparison of the minimax estimates, calculated under various types of the conformity constraints, is presented in the next section.
