*Article*

**Statistical Parameters Based on Fuzzy Measures**

#### **Fernando Reche, María Morales \* and Antonio Salmerón**

Department of Mathematics and Center for the Development and Transfer of Mathematical Research to Industry (CDTIME), University of Almería, 04120 Almería, Spain; fernando.reche@ual.es (F.R.); antonio.salmeron@ual.es (A.S.)

**\*** Correspondence: maria.morales@ual.es

Received: 21 October 2020; Accepted: 9 November 2020; Published: 12 November 2020

**Abstract:** In this paper, we study the problem of defining statistical parameters when the uncertainty is expressed using a fuzzy measure. We extend the concept of monotone expectation in order to define a monotone variance and monotone moments. We also study parameters that allow the joint analysis of two functions defined over the same reference set. Finally, we propose some parameters over product spaces, considering the case in which a function over the product space is available and also the case in which such function is obtained by combining those in the marginal spaces.

**Keywords:** monotone statistical parameters; fuzzy measures; monotone measures; product spaces; fuzzy statistics

### **1. Introduction**

Fuzzy measures [1], also known as capacities [2], non-additive measures or monotone measures [3], have been shown to be a valuable tool for representing uncertainty, since they can cope with more general scenarios than probability measures. Even though fuzzy measures have been successfully applied in a wide range of applications [4], no theory analogous to mathematical statistics has emerged around them in the general case, due to the difficulty of defining statistical parameters with a clear interpretation when additivity is replaced by monotonicity.

A remarkable exception is the case of the so-called imprecise probabilities [5,6], characterized by upper and lower expectations that provide rich semantics and interpretability. Dempster–Shafer belief functions [7,8], for instance, can be formulated as special cases of imprecise probabilities.

The field of fuzzy probability and statistics [9–14] has received significant attention during the last two decades. The contributions in this field can be classified into two basic groups according to the underlying approach they follow [15]. One group includes methods that analyze classical (non-fuzzy) data using techniques based on fuzzy set theory, while the other group focuses on analyzing fuzzy data using statistical methods. In this context, fuzzy data refers to data whose values are fuzzy numbers [16], characterized by a membership function that returns a value between 0 and 1 indicating to which extent a given real number matches a given fuzzy number.

Examples within the first group include fuzzy clustering [17], fuzzy linear regression [18], testing fuzzy hypothesis from non-fuzzy data [19], fuzzy statistical quality control [20], time series forecasting based on fuzzy logic [21] and making statistical decisions with fuzzy utilities [22].

The second group includes methods for maximum likelihood estimation from fuzzy data [23], classification when data are labeled with Dempster–Shafer belief functions [24], distance-based statistical analysis [25], statistical hypothesis testing from fuzzy data [26], principal component analysis [27], discriminant analysis [28] and clustering [29].

In this paper, we are interested in the definition of statistical parameters when the uncertainty is represented by a general fuzzy measure. More precisely, our starting point is a measurable space and a measurable real-valued function defined on the reference set of the space. We also assume that the measurable space is endowed with a fuzzy measure, and we will study the definition of statistical parameters over the measurable function, in a similar way as statistical parameters over a random variable can be defined from a probability measure. In this way, we attempt to handle more general scenarios than the ones covered by probability measures. To achieve this, we rely on the concept of monotone expectation [30]. We consider the case of marginal spaces as well as product spaces, and take advantage of recent advances in the construction of fuzzy measures over product spaces [31]. Our study is restricted to discrete reference sets.

The rest of the paper is organized as follows. Section 2 establishes the basic notation and definitions, and highlights the fundamental properties of product measures that are used throughout the paper. Section 3 contains the original contributions in this paper, in what concerns the definition of parameters in a marginal measurable space, while Section 4 describes our proposals for product spaces. The paper ends with conclusions in Section 5.

#### **2. Preliminaries and Notation**

**Definition 1.** *[1] Let $X$ be a set and $\mathcal{A}$ be a non-empty class of subsets of $X$ such that $X \in \mathcal{A}$ and $\emptyset \in \mathcal{A}$. A function $\mu : \mathcal{A} \longrightarrow [0,1]$ is a* fuzzy measure *if:*

*1. $\mu(\emptyset) = 0$.*
*2. $\mu(X) = 1$.*
*3. If $A, B \in \mathcal{A}$ and $A \subseteq B$, then $\mu(A) \le \mu(B)$.*
*4. If $\{A_n\}_{n\in\mathbb{N}} \subseteq \mathcal{A}$ is such that $A_1 \subseteq A_2 \subseteq \dots$ and $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$, then $\lim_n \mu(A_n) = \mu\left(\bigcup_{n=1}^{\infty} A_n\right)$.*
*5. If $\{A_n\}_{n\in\mathbb{N}} \subseteq \mathcal{A}$ is such that $A_1 \supseteq A_2 \supseteq \dots$ and $\bigcap_{n=1}^{\infty} A_n \in \mathcal{A}$, then $\lim_n \mu(A_n) = \mu\left(\bigcap_{n=1}^{\infty} A_n\right)$.*

The triplet (*X*, A, *μ*) is a *measurable space*, and *X* is called the *reference set*. We will only work with finite reference sets [4] in this paper. By default, we will assume that A is the power set of *X*.

**Example 1** (Modified from [32])**.** *Imagine there is a vehicle covering the connection between the harbor and the railway station in a city. This vehicle has four compartments: one for a car, one for a van, one for a motor-bike and another one for a bike. Assume that the gas tank of this vehicle has exactly the capacity necessary to carry the vehicle, with the four compartments busy, from the harbor to the railway station. Then we can regard this capacity to be equal to 1 unit. In this example, X* = {*c*, *v*, *m*, *b*}*, where c stands for* car compartment busy*, v for* van compartment busy*, m for* motor-bike compartment busy *and b for* bike compartment busy*. Assume also that the vehicle does not start the trip unless at least one of the compartments is busy. All the possible transportation situations are then the elements in* A = P(*X*) *(*P(*X*) *stands for the power set of X). In these conditions, for every A* ⊆ *X, μ*(*A*) *can be interpreted as the proportion of gas consumed if A happens. A possible specification of a fuzzy measure for this problem is as follows.*

$$\mu(\{b\}) = 0.1, \,\mu(\{v\}) = 0.4, \,\mu(\{c\}) = 0.3, \,\mu(\{m\}) = 0.2,$$

$$\mu(\{c, v\}) = 0.6, \,\mu(\{c, b\}) = 0.35, \,\mu(\{c, m\}) = 0.45,$$

$$\mu(\{b, v\}) = 0.42, \,\mu(\{b, m\}) = 0.21, \,\mu(\{v, m\}) = 0.68,$$

$$\mu(\{c, v, b\}) = 0.7, \,\mu(\{c, v, m\}) = 0.75, \,\mu(\{c, b, m\}) = 0.5, \,\mu(\{v, b, m\}) = 0.69.$$

Note how the fuzzy measure in Example 1 is non-additive. Therefore, the same information cannot be represented by a single probability distribution.
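As a sanity check, the measure in Example 1 can be encoded directly, for instance as a Python dictionary keyed by subsets. The sketch below uses the values of Example 1, plus $\mu(\emptyset) = 0$ and $\mu(X) = 1$, which follow from Definition 1; it verifies monotonicity and exhibits the failure of additivity:

```python
# Fuzzy measure of Example 1 over X = {c, v, m, b}, with subsets
# represented as frozensets so they can be used as dictionary keys.
X = frozenset("cvmb")
mu = {frozenset(): 0.0, X: 1.0,
      frozenset("b"): 0.1, frozenset("v"): 0.4,
      frozenset("c"): 0.3, frozenset("m"): 0.2,
      frozenset("cv"): 0.6, frozenset("cb"): 0.35, frozenset("cm"): 0.45,
      frozenset("bv"): 0.42, frozenset("bm"): 0.21, frozenset("vm"): 0.68,
      frozenset("cvb"): 0.7, frozenset("cvm"): 0.75,
      frozenset("cbm"): 0.5, frozenset("vbm"): 0.69}

# Monotonicity (condition 3 of Definition 1): A ⊆ B implies mu(A) <= mu(B).
monotone = all(mu[a] <= mu[b] for a in mu for b in mu if a <= b)

# Non-additivity: mu({c, v}) differs from mu({c}) + mu({v}) (0.6 vs. 0.7).
additive_cv = mu[frozenset("cv")] == mu[frozenset("c")] + mu[frozenset("v")]

print(monotone, additive_cv)  # True False
```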

Every fuzzy measure over a reference set of cardinality *n* can be characterized by *n*! probability functions (not necessarily different) [33], each one of them corresponding to one possible permutation of the reference set. Given a permutation $\sigma$ of the set of indices $\{1, \ldots, n\}$, we will denote by $X_\sigma$ the ordering of the elements of $X$ according to permutation $\sigma$, i.e., $X_\sigma = \{x_{\sigma(1)}, \ldots, x_{\sigma(n)}\}$. When it is clear from the context, we will drop $\sigma$ from the subscripts and write $X_\sigma = \{x_{(1)}, \ldots, x_{(n)}\}$.

**Definition 2.** *[33] Let $(X, \mathcal{A}, \mu)$ be a measurable space. The* probability function associated with *$\mu$ and $X_\sigma$ is defined as the set $P_\sigma = \{p_\sigma(x_{(1)}), \ldots, p_\sigma(x_{(n)})\}$ such that*

$$p_{\sigma}(x_{(i)}) = \begin{cases} \mu(A_{(i)}) - \mu(A_{(i+1)}) & \text{if } i < n, \\ \mu(\{x_{(n)}\}) & \text{if } i = n, \end{cases} \tag{1}$$

*where A*(*i*) = {*x*(*i*),..., *x*(*n*)}*.*

**Definition 3.** *[33] Let* (*X*, A, *μ*) *be a measurable space and let P<sup>σ</sup> be the probability function associated with μ and Xσ. The* probability measure generated by *μ and X<sup>σ</sup> is*

$$P_{\sigma}(A) = \sum_{x \in A} p_{\sigma}(x), \quad \forall A \in \mathcal{A}. \tag{2}$$

We will use *Pσ* for both the probability function and the probability measure when it is clear from the context.
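A minimal sketch of Definition 2, in Python: the probability function is obtained by successive differences of the measure along the chain $A_{(1)} \supseteq A_{(2)} \supseteq \dots$. The measure values below are taken from Example 1 (with $\mu(X) = 1$), and the ordering $(b, m, c, v)$ is an arbitrary choice for illustration:

```python
def probability_function(mu, ordering):
    """Probability function of Definition 2, Equation (1):
    p(x_(i)) = mu(A_(i)) - mu(A_(i+1)), with A_(i) = {x_(i), ..., x_(n)}
    and p(x_(n)) = mu({x_(n)})."""
    n = len(ordering)
    p = {}
    for i in range(n):
        tail = frozenset(ordering[i:])         # A_(i)
        rest = frozenset(ordering[i + 1:])     # A_(i+1)
        p[ordering[i]] = mu[tail] - (mu[rest] if i < n - 1 else 0.0)
    return p

# Measure values from Example 1 needed for the ordering (b, m, c, v);
# mu(X) = 1 holds by Definition 1.
mu = {frozenset("bmcv"): 1.0, frozenset("mcv"): 0.75,
      frozenset("cv"): 0.6, frozenset("v"): 0.4}

p = probability_function(mu, ("b", "m", "c", "v"))
print({x: round(q, 10) for x, q in p.items()})
# {'b': 0.25, 'm': 0.15, 'c': 0.2, 'v': 0.4}
```

The resulting values are non-negative and sum to 1, i.e., they do define a probability function.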

We will consider measures over marginal spaces (*X*, A) as well as product spaces (*X*<sup>1</sup> × *X*2, A*X*1×*X*<sup>2</sup> ) resulting from composing the marginal spaces (*X*1, A*X*<sup>1</sup> ) and (*X*2, A*X*<sup>2</sup> ), with A*X*1×*X*<sup>2</sup> = P(*X*<sup>1</sup> × *X*2), which is not the same as P(*X*1) × P(*X*2).

Of particular interest are the elements of a product class that can be obtained from sets in the marginal space. They are called *rectangles* and are formally defined as follows:

**Definition 4.** *Let* (*X*1, A*X*<sup>1</sup> ) *and* (*X*2, A*X*<sup>2</sup> ) *be two spaces where* A*X*<sup>1</sup> *and* A*X*<sup>2</sup> *are classes defined on X*<sup>1</sup> *and X*2*, respectively. The class of* rectangles *of* A*X*1×*X*<sup>2</sup> *is*

$$\mathcal{R} = \{ H \in \mathcal{A}\_{X\_1 \times X\_2} \mid H = A \times B, \text{ where } A \in \mathcal{A}\_{X\_1}, B \in \mathcal{A}\_{X\_2} \}. \tag{3}$$

Our proposals in this paper will be based on the product measures described in [31], which make use of the concept of triangular norm and conorm.

**Definition 5.** *[34] An operator $T : [0,1]^2 \longrightarrow [0,1]$ is a* triangular norm*, or* t-norm *for short, if it satisfies the following conditions:*

*1. Commutativity: $T(x, y) = T(y, x)$.*
*2. Associativity: $T(x, T(y, z)) = T(T(x, y), z)$.*
*3. Monotonicity: $T(x, y) \le T(x', y')$ whenever $x \le x'$ and $y \le y'$.*
*4. Neutral element 1: $T(x, 1) = x$.*
**Definition 6.** *[34] An operator $S : [0,1]^2 \longrightarrow [0,1]$ is a* triangular conorm*, or* t-conorm *for short, if it satisfies the following properties:*

*1. Commutativity: $S(x, y) = S(y, x)$.*
*2. Associativity: $S(x, S(y, z)) = S(S(x, y), z)$.*
*3. Monotonicity: $S(x, y) \le S(x', y')$ whenever $x \le x'$ and $y \le y'$.*
*4. Neutral element 0: $S(x, 0) = x$.*
The usual way of integrating real functions with respect to a fuzzy measure is by means of the so-called Choquet integral, which is a generalization of the Lebesgue integral to monotone measures.

**Definition 7.** *[2] Let* (*X*, A, *μ*) *be a measurable space, and let h be a measurable real function of X. The* Choquet integral *of h with respect to μ is*

$$\oint_{A} h \circ \mu = \int_{-\infty}^{0} \left( \mu(H_{\alpha} \cap A) - 1 \right) d\alpha + \int_{0}^{\infty} \mu(H_{\alpha} \cap A) \, d\alpha \tag{4}$$

*where $A \in \mathcal{A}$ and $H_{\alpha}$ are the* α-cuts *of $h$, defined as*

$$H_{\alpha} = \{ x \in X \mid h(x) \ge \alpha \}. \tag{5}$$

If the reference set is finite, the integral can be expressed as

$$\oint h \circ \mu = h(\mathbf{x}\_{(1)})\mu(A\_{(1)}) + \sum\_{i=2}^{n} \mu(A\_{(i)}) [h(\mathbf{x}\_{(i)}) - h(\mathbf{x}\_{(i-1)})],\tag{6}$$

where $X_\sigma$ is an ordering such that $h(x_{(1)}) \le h(x_{(2)}) \le \dots \le h(x_{(n)})$ and the sets $A_{(i)}$ are of the form $\{x_{(i)}, x_{(i+1)}, \ldots, x_{(n)}\}$. Furthermore, if *h* is non-negative, it can be computed as

$$\oint h \circ \mu = \sum_{i=1}^{n} h(x_{(i)}) p_{\sigma}(x_{(i)}), \quad p_{\sigma} \in P_{h}, \tag{7}$$

where $P_h$ is the probability function associated with the ordering $X_\sigma$ induced by *h* (see Definition 2).
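For a finite reference set, Equations (6) and (7) can be implemented in a few lines. The sketch below uses an illustrative three-element measure (the values are assumptions, not taken from the paper) and checks that both expressions of the Choquet integral agree:

```python
def choquet(mu, h):
    """Discrete Choquet integral, Equation (6): sort X by h, then compute
    h(x_(1))*mu(A_(1)) + sum of mu(A_(i)) * (h(x_(i)) - h(x_(i-1)))."""
    xs = sorted(h, key=h.get)                # ordering induced by h
    total = h[xs[0]] * mu[frozenset(xs)]     # A_(1) = X
    for i in range(1, len(xs)):
        total += mu[frozenset(xs[i:])] * (h[xs[i]] - h[xs[i - 1]])
    return total

# Illustrative fuzzy measure on X = {x1, x2, x3} (assumed values).
mu = {frozenset({"x1", "x2", "x3"}): 1.0,
      frozenset({"x1", "x3"}): 0.6,
      frozenset({"x3"}): 0.3}
h = {"x1": 0.4, "x2": 0.1, "x3": 0.7}

# Equation (7): the same value via the associated probability function,
# built with the differences of Definition 2 along the chain of tails.
p = {"x2": 1.0 - 0.6, "x1": 0.6 - 0.3, "x3": 0.3}
via_probs = sum(h[x] * p[x] for x in h)

print(round(choquet(mu, h), 10), round(via_probs, 10))  # 0.37 0.37
```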

Given two measurable spaces $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$, the concept of product fuzzy measure is defined as follows.

**Definition 8.** *[31] A* product fuzzy measure *of $\mu_1$ and $\mu_2$ is a function $\mu_{12} : \mathcal{A}_{X_1 \times X_2} \longrightarrow [0,1]$ satisfying:*

*1. $\mu_{12}$ is a fuzzy measure over $(X_1 \times X_2, \mathcal{A}_{X_1 \times X_2})$.*
*2. $\mu_{12}(A \times X_2) = \mu_1(A)$ for all $A \in \mathcal{A}_{X_1}$.*
*3. $\mu_{12}(X_1 \times B) = \mu_2(B)$ for all $B \in \mathcal{A}_{X_2}$.*
The next definitions particularize the concept of product fuzzy measure so that it is guaranteed to be compatible with the intuitive idea of independence, in the sense that if two fuzzy measures are independent, it should be possible to obtain their product fuzzy measure exclusively from the two original fuzzy measures.

**Definition 9.** *[31] Let $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$ be measurable spaces. $\mu_1$ and $\mu_2$ are* $\odot$-independent fuzzy measures *if there exists a product fuzzy measure $\mu_{12}^{\odot}$ such that for any $H \in \mathcal{R}$,*

$$
\mu\_{12}^{\odot}(H) = \mu\_1(A) \odot \mu\_2(B),
\tag{8}
$$

*where $H = A \times B$ and $\odot$ is a t-norm. $\mu_{12}^{\odot}$ is called the* $\odot$-independent product *of $\mu_1$ and $\mu_2$.*

**Definition 10.** *[31] Let $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$ be measurable spaces. The* $\odot$-exterior product measure *for any $H \in \mathcal{A}_{X_1 \times X_2}$ is defined as*

$$\overline{\mu}\_{12}^{\odot}(H) = \min\_{A \times B \supseteq H} \mu\_1(A) \odot \mu\_2(B),\tag{9}$$

*where $\odot$ is a t-norm.*

**Definition 11.** *[31] Let $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$ be measurable spaces. The* $\odot$-interior product measure *for any $H \in \mathcal{A}_{X_1 \times X_2}$ is defined as*

$$\underline{\mu}\_{12}^{\odot}(H) = \max\_{A \times B \subseteq H} \mu\_1(A) \odot \mu\_2(B),\tag{10}$$

*where $\odot$ is a t-norm.*

Both measures provide lower and upper bounds for any $\odot$-independent product fuzzy measure.

**Proposition 1.** *[31] Let $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$ be measurable spaces. Given any $\odot$-independent product of $\mu_1$ and $\mu_2$, it holds that for all $C \in \mathcal{A}_{X_1 \times X_2}$,*

$$
\underline{\mu}\_{12}^{\odot}(\mathbb{C}) \le \mu\_{12}^{\odot}(\mathbb{C}) \le \overline{\mu}\_{12}^{\odot}(\mathbb{C}).\tag{11}
$$

Note that, for the particular case of the class $\mathcal{R}$, both measures coincide [31], i.e., for all $H \in \mathcal{R}$,

$$
\underline{\mu}_{12}^{\odot}(H) = \mu_{12}^{\odot}(H) = \overline{\mu}_{12}^{\odot}(H). \tag{12}
$$
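Over small finite spaces, the exterior and interior products of Definitions 10 and 11 can be computed by brute force over all rectangles. The sketch below uses the minimum t-norm as $\odot$ and illustrative marginal measures (the values are assumptions); on a rectangle the two products coincide, as Equation (12) states, while on arbitrary sets they only bound each other, as in Proposition 1:

```python
from itertools import combinations

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

t_min = min  # the minimum t-norm, one common choice for the operator ⊙

def exterior(mu1, mu2, X1, X2, H):
    """⊙-exterior product (Equation (9)): min over rectangles A×B ⊇ H."""
    return min(t_min(mu1[A], mu2[B])
               for A in subsets(X1) for B in subsets(X2)
               if H <= {(a, b) for a in A for b in B})

def interior(mu1, mu2, X1, X2, H):
    """⊙-interior product (Equation (10)): max over rectangles A×B ⊆ H."""
    return max(t_min(mu1[A], mu2[B])
               for A in subsets(X1) for B in subsets(X2)
               if {(a, b) for a in A for b in B} <= H)

# Illustrative marginal fuzzy measures (assumed values).
X1, X2 = {"a", "b"}, {"0", "1"}
mu1 = {frozenset(): 0.0, frozenset("a"): 0.5, frozenset("b"): 0.2,
       frozenset("ab"): 1.0}
mu2 = {frozenset(): 0.0, frozenset("0"): 0.3, frozenset("1"): 0.6,
       frozenset("01"): 1.0}

# On the rectangle H = {a} × X2 the two products coincide (Equation (12)).
H = {("a", "0"), ("a", "1")}
print(interior(mu1, mu2, X1, X2, H), exterior(mu1, mu2, X1, X2, H))  # 0.5 0.5

# On a non-rectangle, only the inequality of Proposition 1 is guaranteed.
H2 = {("a", "0"), ("b", "1")}
print(interior(mu1, mu2, X1, X2, H2) <= exterior(mu1, mu2, X1, X2, H2))  # True
```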

Product fuzzy measures can also be defined in terms of the associated probability measures [31].

**Definition 12.** *[31] Let $(X_1, \mathcal{A}_{X_1}, \mu_1)$ and $(X_2, \mathcal{A}_{X_2}, \mu_2)$ be measurable spaces and let $P_{\sigma_1}^{\mu_1}$ and $P_{\sigma_2}^{\mu_2}$ be the probability functions associated with $X_{\sigma_1}^{1}$ and $X_{\sigma_2}^{2}$, respectively. The* lower product p-measure *is defined as*

$$\underline{\mathfrak{m}}_{12}(C) = \min_{\sigma_1, \sigma_2} \left[ P_{\sigma_1}^{\mu_1} \otimes P_{\sigma_2}^{\mu_2}(C) \right], \tag{13}$$

*for all $C \in \mathcal{A}_{X_1 \times X_2}$, where $\otimes$ is the standard probabilistic product, i.e., $P_{\sigma_1}^{\mu_1} \otimes P_{\sigma_2}^{\mu_2}(C) = P_{\sigma_1}^{\mu_1}(C) \, P_{\sigma_2}^{\mu_2}(C)$.*

**Definition 13.** *[31] Given the conditions in Definition 12, the* upper product p-measure *is defined as*

$$\overline{\mathfrak{m}}_{12}(C) = \max_{\sigma_1, \sigma_2} \left[ P_{\sigma_1}^{\mu_1} \otimes P_{\sigma_2}^{\mu_2}(C) \right], \tag{14}$$

*where* ⊗ *is the standard probabilistic product.*

#### **3. Parameters over One Measurable Space**

In this section we propose statistical parameters aimed at characterizing the behavior of functions defined on a measurable space endowed with a fuzzy measure. We address separately the case of analyzing a single function and the case of simultaneously analyzing two functions.

#### *3.1. The Case of Only One Function*

Our proposals rely on the extension of the concept of mathematical expectation associated with probability measures, to the more general case of fuzzy measures. Consider a measurable space (*X*, A, *μ*) where *μ* is a fuzzy measure, and the class P of all the additive measures over *X*. One way to extend the concept of mathematical expectation [5,35] is based on defining the set

$$\mathcal{M}_P(\mu) = \{ P \in \mathcal{P} \mid P(A) \ge \mu(A), \ \forall A \in \mathcal{A} \} \tag{15}$$

of all the probability measures that dominate the fuzzy measure *μ*.

Since all the elements in M*P*(*μ*) are additive measures, the expectation of a function *h* with respect to a fuzzy measure *μ* can be defined as

$$E_{\mu}(h) = \min_{P \in \mathcal{M}_P(\mu)} E_P(h), \tag{16}$$

where *EP*(*h*) is the mathematical expectation of *h* with respect to the probability measure *P*.

The problem with this definition is that it is not always well defined, since there can exist a fuzzy measure *μ* for which M*P*(*μ*) = ∅. This happens, for instance, when the sum of the fuzzy measure *μ* over the unitary subsets of *X* is greater than 1, as it is then not possible to find a probability measure bounding *μ* from above.

A class of fuzzy measures that are compatible with the definition of expectation in Equation (16) are those that constitute a lower envelope of a set of probability measures [6], i.e., $\mu(A) = \min\{P(A) \mid P \in \mathcal{M} \subseteq \mathcal{P}\}$, because in that case M*P*(*μ*) ≠ ∅.

A more general definition of expectation, based on Choquet integral [2], was given in [30] with the aim of extending the probabilistic concept of expectation to non-additive settings.

**Definition 14.** *[30] Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative, real valued measurable function of X. The monotone expectation of h with respect to the fuzzy measure μ is defined as*

$$E\_{\mu}(h) = \oint h \circ \mu.\tag{17}$$

Since a fuzzy measure can always be characterized by a set of probability measures, it is clear from Definition 14 and Equation (7) that the monotone expectation is equal to the mathematical expectation obtained with the probability function associated with the fuzzy measure *μ* and the ordering induced by the function *h* (see Definition 2), i.e.,

$$E\_{\mu}(h) = E\_{P\_{\mu,h}}(h),\tag{18}$$

where *Pμ*,*<sup>h</sup>* denotes the probability function associated with *μ* and the ordering induced by *h*. In the particular case of considering a finite reference set, the monotone expectation can be expressed as

$$E\_{\mu}(h) = \sum\_{i=1}^{n} h(\mathbf{x}\_{(i)}) p\_{\sigma}(\mathbf{x}\_{(i)}), \ p\_{\sigma} \in P\_{\mu, h}. \tag{19}$$

The relation between the monotone expectation and the mathematical expectation is also illustrated in Proposition 2.

**Proposition 2.** *[30] Let* (*X*, A, *μ*) *be a measurable space and let* {*Pσ*, *σ* ∈ *Sn*} *be the set of all the probability functions associated with the fuzzy measure μ. Then, for any non-negative real valued, measurable function h of X it holds that*

$$\min\_{\sigma} E\_{\mathcal{P}\_{\sigma}}(h) \le E\_{\mu}(h) \le \max\_{\sigma} E\_{\mathcal{P}\_{\sigma}}(h). \tag{20}$$
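Proposition 2 can be checked numerically by enumerating the $n!$ probability functions associated with a fuzzy measure. The sketch below uses an illustrative measure on a three-element set (the values are assumptions); in this instance the monotone expectation attains the lower bound:

```python
from itertools import permutations

def probability_function(mu, ordering):
    # Definition 2, Equation (1): p(x_(i)) = mu(A_(i)) - mu(A_(i+1)).
    n = len(ordering)
    return {ordering[i]:
            mu[frozenset(ordering[i:])] -
            (mu[frozenset(ordering[i + 1:])] if i < n - 1 else 0.0)
            for i in range(n)}

# Illustrative fuzzy measure on X = {x1, x2, x3} (assumed values).
mu = {frozenset({"x1"}): 0.2, frozenset({"x2"}): 0.4, frozenset({"x3"}): 0.3,
      frozenset({"x1", "x2"}): 0.5, frozenset({"x1", "x3"}): 0.6,
      frozenset({"x2", "x3"}): 0.7, frozenset({"x1", "x2", "x3"}): 1.0}
h = {"x1": 0.4, "x2": 0.1, "x3": 0.7}

def expectation(p):
    return sum(h[x] * p[x] for x in h)

# Monotone expectation: the ordering induced by h (Equation (19)).
E_mu = expectation(probability_function(mu, tuple(sorted(h, key=h.get))))

# Proposition 2: E_mu lies between the extreme expectations over all
# n! associated probability functions.
Es = [expectation(probability_function(mu, s)) for s in permutations(h)]
print(round(min(Es), 10), round(E_mu, 10), round(max(Es), 10))  # 0.37 0.37 0.46
```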

3.1.1. Monotone Variance

In the same way as the monotone expectation extends in a natural way the concept of mathematical expectation to non-additive measures, we will pursue the extension of other statistical parameters in a similar way.

We start by considering the extension of the concept of variance to a non-additive context. A direct approach is to define an extension of the variance using the Choquet integral, as in the case of the monotone expectation, which yields

$$\text{Var}\_{\mu}(h) = E\_{\mu}[(h - E\_{\mu}(h))^2]. \tag{21}$$

However, the definition of variance in Equation (21) is problematic, since the distribution associated with $\mu$ and the ordering induced by $h$ is not, in general, the same as the one induced by $(h - E_{\mu}(h))^2$. The reason is that the functions $h$ and $(h - E_{\mu}(h))^2$ are not comonotone, and therefore they may induce different orderings of the reference set. Hence, the monotone variance defined in this way cannot be regarded as a measure of dispersion with respect to the monotone expectation, as the underlying probability distribution can be different (see Definition 2).

Taking this into account, we propose a definition of monotone variance that preserves the underlying probability measure associated with *μ* and the ordering induced by *h*.

**Definition 15.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X. We define the monotone variance of h with respect to the fuzzy measure μ as*

$$Var\_{\mu}(h) = Var\_{P\_{\mu,h}}(h),\tag{22}$$

*where Pμ*,*<sup>h</sup> is the probability function associated with μ and the ordering induced by h.*

It is clear from the definition that Var*μ*(*h*) ≥ 0 and that it is equal to the traditional variance when *μ* is a probability measure.

**Example 2.** *Consider the fuzzy measure over the reference set X* = {*x*1, *x*2, *x*3} *and its associated probability distributions in Table 1, and the function h defined as h*(*x*1) = 0.4, *h*(*x*2) = 0.1 *and h*(*x*3) = 0.7*. The ordering of X induced by h is thus* (*x*2, *x*1, *x*3)*, i.e., the ordering induced by permutation σ* = (2, 1, 3)*, which corresponds to the probability distribution P*(2,1,3)*. Therefore, according to Equation* (22)*, the monotone variance of h is just the variance of h computed using probability distribution P*(2,1,3)*, resulting in*

$$Var_{\mu}(h) = 0.0621.$$

**Table 1.** A fuzzy measure and the associated probability distributions corresponding to all the possible permutations of the indices (1, 2, 3).


Our definition of monotone variance preserves some properties of the traditional variance, just as the monotone expectation preserves some properties of the mathematical expectation. In particular, the result in Theorem 1 is of practical value, as it simplifies the calculation, and it is also of interest because it links the concepts of monotone variance and monotone expectation.

**Theorem 1.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X, then it holds that*

$$Var\_{\mu}(h) = E\_{\mu}(h^2) - E\_{\mu}^2(h). \tag{23}$$

**Proof.** According to Equation (22),

$$\text{Var}_{\mu}(h) = \text{Var}_{P_{\mu,h}}(h),$$

i.e., the variance of *h* computed according to probability distribution *Pμ*,*h*, which can be calculated as

$$\text{Var}_{P_{\mu,h}}(h) = E_{P_{\mu,h}}(h^2) - \left[ E_{P_{\mu,h}}(h) \right]^2,$$

and thus

$$\text{Var}_{\mu}(h) = E_{P_{\mu,h}}(h^2) - \left[ E_{P_{\mu,h}}(h) \right]^2. \tag{24}$$

The functions *h* and *h*<sup>2</sup> are comonotone, and therefore they induce the same ordering of the reference set and hence yield the same associated probability distribution (see Definition 2). Thus, it holds that *Pμ*,*<sup>h</sup>* = *Pμ*,*h*<sup>2</sup> and therefore,

$$E_{P_{\mu,h}}(h^2) = E_{P_{\mu,h^2}}(h^2).$$

In addition, according to Equation (18), $E_{\mu}(h^2) = E_{P_{\mu,h^2}}(h^2) = E_{P_{\mu,h}}(h^2)$ and $E_{\mu}(h) = E_{P_{\mu,h}}(h)$. Now, replacing $E_{P_{\mu,h}}(h^2)$ by $E_{\mu}(h^2)$ and $E_{P_{\mu,h}}(h)$ by $E_{\mu}(h)$ in Equation (24), we obtain Equation (23).

**Example 3.** *As a continuation of Example 2, we will compute Varμ*(*h*) *using Equation* (23)*.*

$$\begin{aligned} E\_{\mu}(h^2) &= \quad E\_{P\_{(2,1,3)}}(h^2) \\ &= \quad 0.3 \cdot 0.4^2 + 0.4 \cdot 0.1^2 + 0.3 \cdot 0.7^2 = 0.199. \end{aligned}$$

$$\begin{array}{rcl} E\_{\mu}(h) &=& E\_{P\_{(2,1,3)}}(h) \\ &=& 0.3 \cdot 0.4 + 0.4 \cdot 0.1 + 0.3 \cdot 0.7 = 0.37. \end{array}$$

*Hence,*

$$Var_{\mu}(h) = 0.199 - 0.37^2 = 0.0621.$$
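The computations in Examples 2 and 3 can be reproduced directly from the probability distribution $P_{(2,1,3)} = (0.3, 0.4, 0.3)$, confirming that Definition 15 and Theorem 1 give the same value:

```python
# Probability distribution P_(2,1,3) and the function h of Examples 2 and 3.
p = {"x1": 0.3, "x2": 0.4, "x3": 0.3}
h = {"x1": 0.4, "x2": 0.1, "x3": 0.7}

E_h = sum(p[x] * h[x] for x in h)         # monotone expectation, Eq. (19)
E_h2 = sum(p[x] * h[x] ** 2 for x in h)   # E_mu(h^2)

# Monotone variance two ways: Definition 15 and Theorem 1, Eq. (23).
var_direct = sum(p[x] * (h[x] - E_h) ** 2 for x in h)
var_theorem = E_h2 - E_h ** 2

print(round(E_h, 4), round(var_direct, 4), round(var_theorem, 4))
# 0.37 0.0621 0.0621
```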

The next result shows that the monotone variance behaves in a similar way as traditional variance in relation to affine transformations.

**Proposition 3.** *Assume the conditions in Theorem 1 and let $t$ be a function defined as $t = ah + b$ with $a \in \mathbb{R}_0^+$ and $b \in \mathbb{R}$. It holds that*

$$Var\_{\mu}(t) = a^2 Var\_{\mu}(h). \tag{25}$$

**Proof.** First, we have to show that *t* and *h* are comonotone, i.e., that for all *x*, *y* ∈ *X*, (*h*(*x*) − *h*(*y*)) and (*t*(*x*) − *t*(*y*)) have the same sign:

$$(h(\mathbf{x}) - h(\mathbf{y}))(t(\mathbf{x}) - t(\mathbf{y})) = (h(\mathbf{x}) - h(\mathbf{y}))(ah(\mathbf{x}) + b - ah(\mathbf{y}) - b) = a(h(\mathbf{x}) - h(\mathbf{y}))^2 \ge 0,$$

since $a \in \mathbb{R}_0^+$. Therefore, the probability distribution associated with the measure $\mu$ is the same for both functions, i.e., $P_{\mu,t} = P_{\mu,h}$, and thus

$$\text{Var}_{\mu}(t) = \text{Var}_{P_{\mu,t}}(t) = \text{Var}_{P_{\mu,h}}(t) = a^2 \text{Var}_{P_{\mu,h}}(h) = a^2 \text{Var}_{\mu}(h).$$

The next results analyze when the monotone variance is equal to 0.

**Theorem 2.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X. Let Pμ*,*<sup>h</sup> be the probability function associated with μ and h. Then, the following three conditions are equivalent:*

*1. Varμ*(*h*) = 0*.*

*2. $\exists! i$ $(1 \le i \le n)$ such that $p_\sigma(x_i) = 1$ and $p_\sigma(x_j) = 0$, $\forall j \neq i$, with $p_\sigma \in P_{\mu,h}$.*

*3.* ∃*i* (1 ≤ *i* ≤ *n*) *such that*

$$
\mu(H_{\alpha_j}) = 1, \quad \forall j \le i \quad \text{and} \quad \mu(H_{\alpha_j}) = 0, \quad \forall j > i,
$$

*where $H_{\alpha_i} = \{x \in X \mid h(x) \ge h(x_i)\}$, $i = 1, \ldots, n$.*

**Proof.** Let us assume without loss of generality that

$$h(\mathbf{x}\_1) \le h(\mathbf{x}\_2) \le \cdots \le h(\mathbf{x}\_n). \tag{26}$$

(1) =⇒ (2)

Since $p_\sigma(x_i) \ge 0$, $i = 1, \ldots, n$, and $\sum_{i=1}^{n} p_\sigma(x_i) = 1$, there must be at least one $i \in \{1, \ldots, n\}$ such that $p_\sigma(x_i) \neq 0$.

Suppose that $\text{Var}_{\mu}(h) = 0$ and that there exist two different $j, k \in \{1, \ldots, n\}$, $j < k$, such that $p_{\sigma}(x_j) \neq 0$ and $p_{\sigma}(x_k) \neq 0$. Then it holds that

$$\text{Var}_{\mu}(h) = \text{Var}_{P_{\sigma}}(h) = p_{\sigma}(x_j)(h(x_j) - E_{\mu}(h))^2 + p_{\sigma}(x_k)(h(x_k) - E_{\mu}(h))^2 = 0,$$

which means that $h(x_j) = E_{\mu}(h)$ and $h(x_k) = E_{\mu}(h)$. However, according to the assumption in Equation (26), it holds that $h(x_j) \le h(x_{j+1}) \le \dots \le h(x_{k-1}) \le h(x_k)$. Hence,

$$E\_{\mu}(h) = h(\mathbf{x}\_{j}) \le h(\mathbf{x}\_{j+1}) \le \dots \le h(\mathbf{x}\_{k-1}) \le h(\mathbf{x}\_{k}) = E\_{\mu}(h),$$

which means that

$$\begin{aligned} h(x_j) = h(x_{j+1}) = \dots = h(x_k) = E_{\mu}(h) &\Rightarrow H_{\alpha_j} = H_{\alpha_{j+1}} = \dots = H_{\alpha_k} \\ &\Rightarrow p_{\sigma}(x_j) = p_{\sigma}(x_{j+1}) = \dots = p_{\sigma}(x_{k-1}) = 0, \end{aligned}$$

which contradicts the assumption that $p_{\sigma}(x_j) \neq 0$. Thus, there is only one $p_{\sigma}(x_i) \neq 0$ and, furthermore, $p_{\sigma}(x_i) = 1$.

(2) =⇒ (3)

Assume $\exists! i$ such that $p_{\sigma}(x_i) \neq 0$. Then,

$$p\_{\sigma}(\mathbf{x}\_1) = p\_{\sigma}(\mathbf{x}\_2) = \dots = p\_{\sigma}(\mathbf{x}\_{i-1}) = p\_{\sigma}(\mathbf{x}\_{i+1}) = \dots = p\_{\sigma}(\mathbf{x}\_n) = 0$$

and therefore

$$\mu(H\_{\alpha\_1}) = \mu(H\_{\alpha\_2}) = \dots = \mu(H\_{\alpha\_i}),$$

and

$$
\mu(H_{\alpha_{i+1}}) = \mu(H_{\alpha_{i+2}}) = \dots = \mu(H_{\alpha_n}).
$$

On the other hand, since $\mu(H_{\alpha_i}) - \mu(H_{\alpha_{i+1}}) = p_{\sigma}(x_i) = 1$, it follows that $\mu(H_{\alpha_j}) = 1$ if $j \le i$ and $\mu(H_{\alpha_j}) = 0$ if $j > i$.

(3) =⇒ (1)

It is straightforward from the definition of monotone variance.

**Corollary 1.** *If h is constant, then Varμ*(*h*) = 0*, for any fuzzy measure μ.*

**Example 4.** *Assume a function h defined on X* = {*x*1, *x*2, *x*3} *as h*(*x*1) = 0.4*, h*(*x*2) = 0.1 *and h*(*x*3) = 0.7*, and an associated probability distribution p<sup>σ</sup> such that pσ*(*x*1) = 1*, pσ*(*x*2) = 0 *and pσ*(*x*3) = 0*. We will see how the monotone variance is equal to 0. However, first we need to calculate the monotone expectation.*

$$E\_{\mu}(h) = 1 \cdot 0.4 + 0 \cdot 0.1 + 0 \cdot 0.7 = 0.4.$$

*Thus,*

$$\text{Var}_{\mu}(h) = 1 \cdot (0.4 - 0.4)^2 + 0 \cdot (0.1 - 0.4)^2 + 0 \cdot (0.7 - 0.4)^2 = 0.$$

*Now we will calculate the value of the measure μ over the sets Hα<sup>i</sup>* = {*x* ∈ *X*|*h*(*x*) ≥ *h*(*xi*)}, *i* = 1, 2, 3*, i.e., Hα*<sup>1</sup> = {*x*1, *x*3}*, Hα*<sup>2</sup> = {*x*1, *x*2, *x*3} *and Hα*<sup>3</sup> = {*x*3}*.*

*We can obtain the values of μ from pσ using Definition 2. The result is*

$$\begin{aligned} \mu(\{\mathbf{x}\_3\}) &=& p\_\sigma(\mathbf{x}\_3) = 0, \\ \mu(\{\mathbf{x}\_1, \mathbf{x}\_3\}) &=& p\_\sigma(\mathbf{x}\_1) + \mu(\{\mathbf{x}\_3\}) = 1 + 0 = 1, \\ \mu(\{\mathbf{x}\_1, \mathbf{x}\_2, \mathbf{x}\_3\}) &=& p\_\sigma(\mathbf{x}\_2) + \mu(\{\mathbf{x}\_1, \mathbf{x}\_3\}) = 0 + 1 = 1. \end{aligned}$$

*Therefore, μ*(*Hα*<sup>1</sup> ) = 1*, μ*(*Hα*<sup>2</sup> ) = 1 *and μ*(*Hα*<sup>3</sup> ) = 0*.*
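A quick numerical check of Example 4 and of condition 3 of Theorem 2: with all the probability mass on a single point, the monotone variance vanishes, and the measures of the α-cuts, taken along the ordering induced by $h$, drop from 1 to 0 at a single index:

```python
# Example 4: degenerate probability function p with all mass on x1.
p = {"x1": 1.0, "x2": 0.0, "x3": 0.0}
h = {"x1": 0.4, "x2": 0.1, "x3": 0.7}

E_h = sum(p[x] * h[x] for x in h)                  # monotone expectation
var = sum(p[x] * (h[x] - E_h) ** 2 for x in h)     # monotone variance

# Condition 3 of Theorem 2: along the ordering induced by h, the measures
# of the alpha-cuts mu(A_(i)) (tail sums of p, by Definition 2) are 1 up
# to some index and 0 afterwards.
xs = sorted(h, key=h.get)                          # (x2, x1, x3)
mu_tail = {}
acc = 0.0
for i in range(len(xs) - 1, -1, -1):
    acc += p[xs[i]]
    mu_tail[xs[i]] = acc
print(E_h, var, [mu_tail[x] for x in xs])  # 0.4 0.0 [1.0, 1.0, 0.0]
```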

3.1.2. Monotone Moments

Following the same idea underlying the definition of monotone variance, we can extend the concepts of central and non-central moments from a probabilistic setting to a monotone one.

**Definition 16.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X. We define the k-th non-central monotone moment of h with respect to μ as*

$$g_{\mu}^{k}(h) = E_{\mu}(h^{k}). \tag{27}$$

Note that Equation (27) is well defined, since *h* and *h<sup>k</sup>* are comonotone, and therefore the corresponding probability function is the same for both of them, regardless of the value of *k*.

The definition of central monotone moments is, however, more problematic. If we follow the same idea as in Definition 16 and define the central monotone moment as $E_{\mu}\left[(h - E_{\mu}(h))^k\right]$, we find the problem that the functions $h$ and $(h - E_{\mu}(h))^k$ are not comonotone, which would mean that different underlying probability distributions would be used to compute $E_{\mu}(h)$ and $E_{\mu}\left[(h - E_{\mu}(h))^k\right]$. We will therefore generalize the definition of monotone variance to values of $k \neq 2$, utilizing the probability function associated with $\mu$ and $h$.

**Definition 17.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X. We define the k-th central monotone moment of h with respect to μ as*

$$\gamma_{\mu}^{k}(h) = E_{P_{\mu,h}}\left[ (h - E_{\mu}(h))^{k} \right], \tag{28}$$

*where Pμ*,*<sup>h</sup> is the probability function associated with μ and h.*

The following result establishes the relation between central and non-central monotone moments.

**Proposition 4.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a non-negative real valued measurable function of X. It holds that*

$$\gamma\_{\mu}^{k}(h) = \sum\_{j=0}^{k} (-1)^{j} \binom{k}{j} \left[ E\_{\mu}(h) \right]^{j} g\_{\mu}^{k-j}(h). \tag{29}$$

**Proof.** Assume *X* = {*x*1,..., *xn*}.

$$\begin{split} \gamma\_{\mu}^{k}(h) &= E\_{P\_{\mu,h}}\left[ \left( h - E\_{\mu}(h) \right)^{k} \right] = \sum\_{i=1}^{n} (h(x\_{i}) - E\_{\mu}(h))^{k} P\_{\mu,h}(x\_{i}) \\ &= \sum\_{i=1}^{n} \sum\_{j=0}^{k} (-1)^{j} \binom{k}{j} E\_{\mu}^{j}(h) h^{k-j}(x\_{i}) P\_{\mu,h}(x\_{i}) \\ &= \sum\_{j=0}^{k} (-1)^{j} \binom{k}{j} E\_{\mu}^{j}(h) \sum\_{i=1}^{n} h^{k-j}(x\_{i}) P\_{\mu,h}(x\_{i}) \\ &= \sum\_{j=0}^{k} (-1)^{j} \binom{k}{j} E\_{\mu}^{j}(h) E\_{\mu}(h^{k-j}) = \sum\_{j=0}^{k} (-1)^{j} \binom{k}{j} \left[ E\_{\mu}(h) \right]^{j} g\_{\mu}^{k-j}(h). \end{split}$$
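Proposition 4 can be checked numerically. The sketch below assumes that the probability function $P\_{\mu,h}$ is the usual cut-increment construction behind the Choquet integral (so that $E\_{P\_{\mu,h}}(h) = E\_{\mu}(h)$); that assumption, the helper names, and the toy measure are ours, not taken from the paper.

```python
from math import comb

def p_mu_h(h, mu):
    """Probability function associated with mu and h.  Assumed construction:
    with points sorted by increasing h, P(x_(i)) = mu(H_i) - mu(H_{i+1}),
    where H_i is the alpha-cut {x : h(x) >= h(x_(i))}."""
    xs = sorted(h, key=h.get)
    p = {}
    for i, x in enumerate(xs):
        upper, nxt = frozenset(xs[i:]), frozenset(xs[i + 1:])
        p[x] = mu[upper] - (mu[nxt] if nxt else 0.0)
    return p

def central_moment(h, mu, k):
    """gamma^k_mu(h) as in Equation (28)."""
    p = p_mu_h(h, mu)
    m = sum(h[x] * p[x] for x in h)          # equals E_mu(h)
    return sum((h[x] - m) ** k * p[x] for x in h)

def central_via_eq29(h, mu, k):
    """Right-hand side of Equation (29)."""
    p = p_mu_h(h, mu)
    g = lambda r: sum(h[x] ** r * p[x] for x in h)   # g^r_mu(h)
    return sum((-1) ** j * comb(k, j) * g(1) ** j * g(k - j)
               for j in range(k + 1))

# Toy fuzzy measure and function (our own choice)
mu = {frozenset(): 0.0,
      frozenset({'x1'}): 0.2, frozenset({'x2'}): 0.3, frozenset({'x3'}): 0.4,
      frozenset({'x1', 'x2'}): 0.5, frozenset({'x1', 'x3'}): 0.6,
      frozenset({'x2', 'x3'}): 0.8, frozenset({'x1', 'x2', 'x3'}): 1.0}
h = {'x1': 0.1, 'x2': 0.4, 'x3': 0.7}

for k in (2, 3, 4):
    assert abs(central_moment(h, mu, k) - central_via_eq29(h, mu, k)) < 1e-9
```

For $k = 2$ the identity reduces to the familiar $\gamma\_{\mu}^{2}(h) = g\_{\mu}^{2}(h) - \left[E\_{\mu}(h)\right]^{2}$.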

#### *3.2. The Case of Two Functions*

In this section we approach the simultaneous analysis of two functions *h*<sup>1</sup> and *h*<sup>2</sup> over the same reference set, *X*. Our goal is to model the information that both functions have in common, or the way in which they interact with one another.

Generalizing the concept of covariance, for instance by using *Eμ*[(*h*<sup>1</sup> − *Eμ*(*h*1))(*h*<sup>2</sup> − *Eμ*(*h*2))], raises the problem that the underlying probability distribution used to compute the monotone expectation is induced neither by *h*<sup>1</sup> nor by *h*<sup>2</sup> for the same fuzzy measure *μ*, and therefore it is not clear that such a monotone covariance would actually measure the relationship between the two functions. We will therefore explore a different approach, in which we model the degree of similarity between *h*<sup>1</sup> and *h*2 by measuring the common region determined by both functions.

**Definition 18.** *Let* (*X*, A, *μ*) *be a measurable space and let h*<sup>1</sup> *and h*<sup>2</sup> *be non-negative real valued measurable functions of X. We define the common expectation of h*<sup>1</sup> *and h*<sup>2</sup> *with respect to μ as*

$$
\psi\_{\mu}(h\_1, h\_2) = E\_{\mu}[\min\{h\_1, h\_2\}].\tag{30}
$$

The concept of common expectation is illustrated in Figure 1. More precisely, the common expectation of *h*<sup>1</sup> and *h*<sup>2</sup> is the monotone expectation, according to *μ*, of the function min{*h*1, *h*2}, whose graph bounds the shaded area.

**Figure 1.** An illustration of the concept of common expectation of *h*<sup>1</sup> and *h*2.

**Example 5.** *We want to obtain the global grade for two students out of the individual grades they obtained in four different courses* {*x*1, *x*2, *x*3, *x*4}*. In the final grade we want to reflect whether a student shows a good performance in the two scientific courses,* {*x*1, *x*2}*, the humanistic ones,* {*x*3, *x*4}*, or in the combination* {*x*2, *x*3}*, corresponding to a social sciences profile. These criteria are encoded in the fuzzy measure in Table 2, while the grades obtained by both students (between 0 and 1) in each of the courses are shown in Table 3.*


**Table 2.** A fuzzy measure matching the criteria in Example 5.

**Table 3.** Grades obtained by the students in Example 5 in the individual courses.


*The calculation of the respective monotone expectations and variances result in*

$$\begin{aligned} E\_{\mu}(h\_1) &= 0.61, & E\_{\mu}(h\_2) &= 0.65, \\ Var\_{\mu}(h\_1) &= 0.0769, & Var\_{\mu}(h\_2) &= 0.0765, \end{aligned}$$

*which are quite similar, while the common expectation is ψμ*(*h*1, *h*2) = 0.25*.*

The next proposition states the basic properties of the common expectation.

**Proposition 5.** *Let* (*X*, A, *μ*) *be a measurable space and let h*<sup>1</sup> *and h*<sup>2</sup> *be non-negative real valued measurable functions of X. Then, ψμ satisfies the following properties:*


#### **Proof.**


The common expectation is not normalized, and therefore its value alone is not enough to determine whether it should be regarded as high or low. For instance, in Example 5 we obtained *ψμ*(*h*1, *h*2) = 0.25, but that value does not tell us if it is high or low. However, the common expectation can be bounded from above, since for any positive real numbers *a* and *b* it holds that $\min\{a, b\} \le \sqrt{a \cdot b} \le \max\{a, b\}$, with equality only when *a* = *b*. Hence, we can normalize the common expectation using these bounds, which yields three possible definitions of coefficients of concordance between *h*<sup>1</sup> and *h*2.

**Definition 19.** *Let* (*X*, A, *μ*) *be a measurable space and let h*<sup>1</sup> *and h*<sup>2</sup> *be non-negative real valued measurable functions of X. We define the coefficients of concordance ρ*1, *ρ*<sup>2</sup> *and ρ*<sup>3</sup> *between h*<sup>1</sup> *and h*<sup>2</sup> *with respect to μ as*

$$\rho\_1^{\mu}(h\_1, h\_2) = \frac{\psi\_{\mu}(h\_1, h\_2)}{\sqrt{E\_{\mu}(h\_1)E\_{\mu}(h\_2)}},\tag{31}$$

$$\rho\_2^{\mu}(h\_1, h\_2) = \frac{\psi\_{\mu}(h\_1, h\_2)}{E\_{\mu}(\max\{h\_1, h\_2\})},\tag{32}$$

$$\rho\_3^{\mu}(h\_1, h\_2) = \frac{\psi\_{\mu}(h\_1, h\_2)}{\min\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\}}. \tag{33}$$
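For finite reference sets, the common expectation and the three coefficients can be computed directly from the Choquet integral. The sketch below is ours: the helper names, the toy fuzzy measure over four courses and the two grade profiles are invented for illustration (the actual data of Tables 2 and 3 are not reproduced here).

```python
from itertools import combinations

def choquet(h, mu):
    """Discrete Choquet integral (monotone expectation) of h w.r.t. mu."""
    xs = sorted(h, key=h.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        total += (h[x] - prev) * mu[frozenset(xs[i:])]
        prev = h[x]
    return total

def concordance(h1, h2, mu):
    """psi_mu (Equation (30)) and rho_1, rho_2, rho_3 (Equations (31)-(33))."""
    psi = choquet({x: min(h1[x], h2[x]) for x in h1}, mu)
    e1, e2 = choquet(h1, mu), choquet(h2, mu)
    emax = choquet({x: max(h1[x], h2[x]) for x in h1}, mu)
    return psi, psi / (e1 * e2) ** 0.5, psi / emax, psi / min(e1, e2)

# Toy fuzzy measure on four courses; proportional baseline with a bonus for
# the "scientific" pair {x1, x2} (our own numbers, still monotone).
X = ['x1', 'x2', 'x3', 'x4']
mu = {frozenset(s): len(s) / 4 for r in range(5) for s in combinations(X, r)}
mu[frozenset({'x1', 'x2'})] = 0.6

h1 = {'x1': 0.9, 'x2': 0.8, 'x3': 0.3, 'x4': 0.2}   # scientific profile
h2 = {'x1': 0.2, 'x2': 0.4, 'x3': 0.9, 'x4': 0.8}   # humanistic profile

psi, r1, r2, r3 = concordance(h1, h2, mu)
assert 0 <= r2 <= r1 <= r3 <= 1     # ordering stated in the next proposition
```

As in Example 5, two students with similar monotone expectations but opposite profiles produce a small *ψμ* and hence small coefficients.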

The next proposition shows the basic properties of the three concordance coefficients (when it is clear from the context, we will drop the measure and the functions, thus denoting $\rho\_i^{\mu}(h\_1, h\_2)$ by $\rho\_i$).

**Proposition 6.** *Assume the conditions in Definition 19. The coefficients of concordance satisfy the following conditions:*


*5. If h*<sup>1</sup> ≤ *h*<sup>2</sup>*, then*

$$\begin{array}{rcl} \rho\_1 &=& \sqrt{\frac{E\_\mu(h\_1)}{E\_\mu(h\_2)}},\\ \rho\_2 &=& \frac{E\_\mu(h\_1)}{E\_\mu(h\_2)},\\ \rho\_3 &=& 1. \end{array}$$

*6. If h*<sup>1</sup> = *kh*2*, with k* > 1*, then*

$$\begin{array}{rcl} \rho\_1 &=& \frac{1}{\sqrt{k}},\\ \rho\_2 &=& \frac{1}{k},\\ \rho\_3 &=& 1. \end{array}$$

**Proof.**

1. It is clear that

$$\min\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\} \le \sqrt{E\_{\mu}(h\_1)E\_{\mu}(h\_2)} \le \max\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\}.$$

Furthermore, since *E<sup>μ</sup>* is a monotone functional, max{*Eμ*(*h*1), *Eμ*(*h*2)} ≤ *Eμ*(max{*h*1, *h*2}). According to property 2 in Proposition 5, *ψμ*(*h*1, *h*2) ≤ min{*Eμ*(*h*1), *Eμ*(*h*2)}, and thus

$$\psi\_{\mu}(h\_1, h\_2) \le \min\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\} \le \sqrt{E\_{\mu}(h\_1)E\_{\mu}(h\_2)} \le \max\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\}.$$

Therefore

$$\begin{array}{rcl}\rho\_1^{\mu}(h\_1, h\_2) &=& \frac{\psi\_{\mu}(h\_1, h\_2)}{\sqrt{E\_{\mu}(h\_1)E\_{\mu}(h\_2)}} \le 1, \\\rho\_2^{\mu}(h\_1, h\_2) &=& \frac{\psi\_{\mu}(h\_1, h\_2)}{E\_{\mu}(\max\{h\_1, h\_2\})} \le 1, \\\rho\_3^{\mu}(h\_1, h\_2) &=& \frac{\psi\_{\mu}(h\_1, h\_2)}{\min\{E\_{\mu}(h\_1), E\_{\mu}(h\_2)\}} \le 1. \end{array}$$

On the other hand, since *h*<sup>1</sup> and *h*<sup>2</sup> are non-negative, so is *ψμ*(*h*1, *h*2), which means that $\rho\_1^{\mu}(h\_1, h\_2) \ge 0$, $\rho\_2^{\mu}(h\_1, h\_2) \ge 0$ and $\rho\_3^{\mu}(h\_1, h\_2) \ge 0$.


2. Furthermore,

$$\min\{E\_{\mu}(h\_{1}), E\_{\mu}(h\_{2})\} \le \sqrt{E\_{\mu}(h\_{1})E\_{\mu}(h\_{2})} \le E\_{\mu}(\max\{h\_{1}, h\_{2}\}) \Rightarrow$$

$$\frac{\psi\_{\mu}(h\_{1}, h\_{2})}{E\_{\mu}(\max\{h\_{1}, h\_{2}\})} \le \frac{\psi\_{\mu}(h\_{1}, h\_{2})}{\sqrt{E\_{\mu}(h\_{1})E\_{\mu}(h\_{2})}} \le \frac{\psi\_{\mu}(h\_{1}, h\_{2})}{\min\{E\_{\mu}(h\_{1}), E\_{\mu}(h\_{2})\}} \Rightarrow$$

$$\rho\_{2} \le \rho\_{1} \le \rho\_{3}.$$


5. If *h*<sup>1</sup> ≤ *h*2, then *ψμ*(*h*1, *h*2) = *Eμ*(*h*1) and max{*h*1, *h*2} = *h*2. Therefore,

$$\rho\_1 = \frac{E\_\mu(h\_1)}{\sqrt{E\_\mu(h\_1)E\_\mu(h\_2)}} = \sqrt{\frac{E\_\mu(h\_1)}{E\_\mu(h\_2)}}, \quad \rho\_2 = \frac{E\_\mu(h\_1)}{E\_\mu(h\_2)} \quad \text{and } \rho\_3 = \frac{E\_\mu(h\_1)}{\min\{E\_\mu(h\_1), E\_\mu(h\_2)\}} = 1.$$

6. If *h*<sup>1</sup> = *kh*<sup>2</sup> with *k* > 1, then min{*h*1, *h*2} = *h*2, max{*h*1, *h*2} = *kh*<sup>2</sup> and *Eμ*(*h*1) = *kEμ*(*h*2). Therefore,

$$\rho\_1 = \frac{E\_\mu(h\_2)}{\sqrt{k E\_\mu(h\_2) E\_\mu(h\_2)}} = \frac{1}{\sqrt{k}}, \quad \rho\_2 = \frac{E\_\mu(h\_2)}{E\_\mu(kh\_2)} = \frac{1}{k} \quad \text{and } \rho\_3 = \frac{E\_\mu(h\_2)}{\min\{k E\_\mu(h\_2), E\_\mu(h\_2)\}} = 1.$$
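This last property is easy to verify numerically. The following sketch (toy fuzzy measure and function of our own) checks that *h*<sup>1</sup> = *kh*<sup>2</sup> with *k* > 1 yields *ρ*<sup>1</sup> = 1/√*k*, *ρ*<sup>2</sup> = 1/*k* and *ρ*<sup>3</sup> = 1.

```python
def choquet(h, mu):
    """Discrete Choquet integral (monotone expectation) of h w.r.t. mu."""
    xs = sorted(h, key=h.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        total += (h[x] - prev) * mu[frozenset(xs[i:])]
        prev = h[x]
    return total

# Toy fuzzy measure on X = {x1, x2, x3} (our own numbers)
mu = {frozenset(): 0.0,
      frozenset({'x1'}): 0.2, frozenset({'x2'}): 0.3, frozenset({'x3'}): 0.4,
      frozenset({'x1', 'x2'}): 0.5, frozenset({'x1', 'x3'}): 0.6,
      frozenset({'x2', 'x3'}): 0.8, frozenset({'x1', 'x2', 'x3'}): 1.0}

k = 4.0
h2 = {'x1': 0.05, 'x2': 0.10, 'x3': 0.20}
h1 = {x: k * v for x, v in h2.items()}      # h1 = k * h2, with k > 1

psi = choquet({x: min(h1[x], h2[x]) for x in h1}, mu)   # = E_mu(h2)
e1, e2 = choquet(h1, mu), choquet(h2, mu)               # e1 = k * e2
emax = choquet({x: max(h1[x], h2[x]) for x in h1}, mu)  # = E_mu(k * h2)

rho1 = psi / (e1 * e2) ** 0.5
rho2 = psi / emax
rho3 = psi / min(e1, e2)
assert abs(rho1 - 1 / k ** 0.5) < 1e-9
assert abs(rho2 - 1 / k) < 1e-9
assert abs(rho3 - 1.0) < 1e-9
```

The check relies on the positive homogeneity of the Choquet integral, which gives *Eμ*(*kh*2) = *kEμ*(*h*2).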

**Example 6.** *As a continuation of Example 5, we can use the data in Tables 2 and 3 to compute the coefficients of concordance, obtaining*

$$
\rho\_1(h\_1, h\_2) = 0.397, \quad \rho\_2(h\_1, h\_2) = 0.301, \quad \rho\_3(h\_1, h\_2) = 0.410.
$$

*Note how the three coefficients have low values, which is consistent with the data in the example: in spite of the similar values of the monotone expectation and variance for both students, they have clearly different profiles, scientific in the case of h*<sup>1</sup> *and humanistic in the case of h*2*.*

#### **4. Parameters Defined over Product Spaces**

In this section we explore scenarios where we have two measurable spaces, each of them equipped with a different fuzzy measure. We will consider the definition of statistical parameters on the product space.

As in Section 3, we will study separately the cases of one and two real functions. In both cases, it is necessary to obtain a fuzzy measure over the product space, and we will rely on the proposals in [31] to construct such product measures.

#### *4.1. The Case of One Function*

The methods proposed in [31] for constructing fuzzy measures over product spaces usually yield, rather than a single measure, a set of them bounded by an upper and a lower measure. Accordingly, our proposals here will consist of intervals of parameters rather than single values.

We will start by defining the concept of joint expectation, making use of the interior and exterior product measures (see Definitions 10 and 11).

**Definition 20.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces, h* : *X*<sup>1</sup> × *X*<sup>2</sup> → [0, 1] *and* $\underline{\mu}\_{12}^{\odot}$, $\overline{\mu}\_{12}^{\odot}$ *the* ⊙*-interior and* ⊙*-exterior product measures. We define the joint lower and upper* ⊙*-expectations as*

$$\underline{E}\_{12}^{\odot}(h) \quad = \oint h \circ \underline{\mu}\_{12}^{\odot} \quad \text{(lower)},\tag{34}$$

$$\overline{E}\_{12}^{\odot}(h) = \oint h \circ \overline{\mu}\_{12}^{\odot} \quad \text{(upper)}.\tag{35}$$

**Proposition 7.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces, h* : *X*<sup>1</sup> × *X*<sup>2</sup> → [0, 1] *and* $\underline{\mu}\_{12}^{\odot}$, $\overline{\mu}\_{12}^{\odot}$ *the* ⊙*-interior and* ⊙*-exterior product measures. It holds that*

$$\underline{E}\_{12}^{\odot}(h) \le \overline{E}\_{12}^{\odot}(h). \tag{36}$$

*Furthermore, if* $\mu\_{12}^{\odot}$ *is any* ⊙*-independent product measure of μ*<sup>1</sup> *and μ*<sup>2</sup> *(see Definition 9), it also holds that*

$$
\underline{E}\_{12}^{\odot}(h) \le E\_{12}^{\odot}(h) \le \overline{E}\_{12}^{\odot}(h), \tag{37}
$$

*where* $E\_{12}^{\odot}(h) = \oint h \circ \mu\_{12}^{\odot}$*.*

**Proof.** Note that $\underline{E}\_{12}^{\odot}$, $E\_{12}^{\odot}$ and $\overline{E}\_{12}^{\odot}$ are monotone expectations, namely $E\_{\underline{\mu}\_{12}^{\odot}}$, $E\_{\mu\_{12}^{\odot}}$ and $E\_{\overline{\mu}\_{12}^{\odot}}$ respectively. Therefore, Equations (36) and (37) are a direct consequence of the monotonicity of the monotone expectation and Proposition 1.

The concept of joint ⊙-expectations is analogous to the concept of monotone expectation in a marginal space, with the difference that, in the case of the product space, the underlying fuzzy measure is not known; instead, we have an interval of measures bounded by the interior and exterior ⊙-product measures.

We can define joint expectations using other product measures, such as the *p*-measures given in Definitions 12 and 13.

**Definition 21.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces, h* : *X*<sup>1</sup> × *X*<sup>2</sup> → [0, 1] *and* m12*,* m<sup>12</sup> *the lower and upper product p-measures respectively. We define the lower and upper joint probabilistic expectations as*

$$E\_{\underline{\mathfrak{m}}\_{12}}(h) = \oint h \circ \underline{\mathfrak{m}}\_{12} \quad \text{(lower)}, \tag{38}$$

$$E\_{\overline{\mathfrak{m}}\_{12}}(h) \quad = \oint h \circ \overline{\mathfrak{m}}\_{12} \quad \text{(upper)}.\tag{39}$$

Since we have a function defined over the product space and fuzzy measures defined over the marginal spaces, it is natural to define marginal expectations. We will utilize the concept of *⊕*-marginal of a function [31].

**Definition 22.** *[31] Let h be a function defined on X*<sup>1</sup> × *X*<sup>2</sup> *and taking values on* [0, 1]*. We define the ⊕*-marginals *of h as*

$$h\_{X\_1}^{\oplus}(x\_{1i}) = \bigoplus\_{x\_{2j} \in X\_2} h(x\_{1i}, x\_{2j}) = h(x\_{1i}, x\_{21}) \oplus h(x\_{1i}, x\_{22}) \oplus \dots \oplus h(x\_{1i}, x\_{2m}), \tag{40}$$

$$h\_{X\_2}^{\oplus}(x\_{2j}) = \bigoplus\_{x\_{1i} \in X\_1} h(x\_{1i}, x\_{2j}) = h(x\_{11}, x\_{2j}) \oplus h(x\_{12}, x\_{2j}) \oplus \dots \oplus h(x\_{1n}, x\_{2j}), \tag{41}$$

*where* ⊕ *is a t-conorm (see Definition 6), n is the cardinality of X*<sup>1</sup> *and m is the cardinality of X*2*.*
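For a finite grid, the ⊕-marginals amount to folding each row and column of *h* with the chosen t-conorm. The sketch below is ours: the function name, the grid values and the choice of the max and probabilistic-sum t-conorms are illustrative assumptions.

```python
from functools import reduce
from itertools import product

def oplus_marginals(h, X1, X2, conorm):
    """⊕-marginals of h : X1 x X2 -> [0, 1] (Equations (40) and (41)),
    obtained by folding each row/column of h with the t-conorm ⊕."""
    hX1 = {x1: reduce(conorm, (h[x1, x2] for x2 in X2)) for x1 in X1}
    hX2 = {x2: reduce(conorm, (h[x1, x2] for x1 in X1)) for x2 in X2}
    return hX1, hX2

prob_sum = lambda a, b: a + b - a * b    # probabilistic-sum t-conorm

# Toy function on a 2 x 3 grid (values are our own choice)
X1, X2 = ['x11', 'x12'], ['x21', 'x22', 'x23']
h = {p: v for p, v in zip(product(X1, X2), [0.1, 0.5, 0.3, 0.7, 0.2, 0.6])}

hX1_max, hX2_max = oplus_marginals(h, X1, X2, max)       # max t-conorm
hX1_ps, hX2_ps = oplus_marginals(h, X1, X2, prob_sum)
print(hX1_max)   # row-wise maxima of h
```

Since every t-conorm dominates max, the probabilistic-sum marginals are pointwise at least as large as the max marginals.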

**Definition 23.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces and let h be a function defined on X*<sup>1</sup> × *X*<sup>2</sup> *and taking values on* [0, 1]*. We define the marginal ⊕-expectations as*

$$E\_{X\_i}^{\oplus}(h) = \oint h\_{X\_i}^{\oplus} \circ \mu\_{i}, \quad i = 1, 2,\tag{42}$$

*where* $h\_{X\_i}^{\oplus}$ *are the* ⊕*-marginals of h.*

#### *4.2. The Case of Two Functions*

We will now assume that we have two different functions, one for each marginal space, and define parameters that combine the information provided by the marginal spaces.

**Definition 24.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces, and let h*1*, h*<sup>2</sup> *be functions defined on X*<sup>1</sup> *and X*<sup>2</sup> *respectively, taking values on* [0, 1]*. We define the upper and lower global expectation of h*<sup>1</sup> *and h*<sup>2</sup> *as*

$$\underline{\phi}\_{\odot}^{\star}(h\_1, h\_2) = \oint h\_{12}^{\star} \circ \underline{\mu}\_{12}^{\odot} \quad \text{(lower)},\tag{43}$$

$$\overline{\phi}\_{\odot}^{\star}(h\_1, h\_2) = \oint h\_{12}^{\star} \circ \overline{\mu}\_{12}^{\odot} \quad \text{(upper)},\tag{44}$$

*where* ⋆ *and* ⊙ *are arbitrary t-norms (see Definition 5),* $h\_{12}^{\star}(x\_1, x\_2) = h\_1(x\_1) \star h\_2(x\_2)$, ∀(*x*1, *x*2) ∈ *X*<sup>1</sup> × *X*<sup>2</sup>*, and* $\underline{\mu}\_{12}^{\odot}$ *and* $\overline{\mu}\_{12}^{\odot}$ *are the* ⊙*-interior and* ⊙*-exterior product measures of μ*<sup>1</sup> *and μ*<sup>2</sup> *respectively.*

The next proposition shows that both expectations coincide when ⋆ is the min t-norm.

**Proposition 8.** *Assume the conditions in Definition 24. If* ⋆ *is the* min *t-norm, it holds that*

$$
\underline{\phi}\_{\odot}^{\star}(h\_1, h\_2) = \overline{\phi}\_{\odot}^{\star}(h\_1, h\_2). \tag{45}
$$

**Proof.** According to ([31], Proposition 8), the *α*-cuts generated by $h\_{12}^{\star}$ belong to R when ⋆ is the min t-norm. Furthermore, Equation (12) establishes that $\underline{\mu}\_{12}^{\odot} = \overline{\mu}\_{12}^{\odot}$ for the elements of R, which proves the result.

As a consequence of Proposition 8, when using the min t-norm we will just write $\phi\_{\odot}^{\star}$ for both $\underline{\phi}\_{\odot}^{\star}$ and $\overline{\phi}\_{\odot}^{\star}$.

The global expectation is in fact an extension of the monotone expectation in the sense expressed by the next theorem.

**Theorem 3.** *Let* (*X*, A, *μ*) *be a measurable space and let h be a function defined on X and taking values on* [0, 1]*. Consider the product space X* × *X and let both* ⋆ *and* ⊙ *be the* min *t-norm. Then,*

$$
\phi\_{\odot}^{\star}(h, h) = \phi\_{\min}^{\min}(h, h) = E\_\mu(h). \tag{46}
$$

**Proof.** Assume *X* = {*x*1, *x*2,..., *xn*}. Then

$$h^{\min}(x\_1, x\_2) = \min\{h(x\_1), h(x\_2)\}, \quad \forall (x\_1, x\_2) \in X \times X.$$

Without loss of generality, we can assume that

$$h(\mathbf{x}\_1) < h(\mathbf{x}\_2) < \dots < h(\mathbf{x}\_n),$$

in which case the *α*-cuts generated by *h*min are of the form

$$H\_{\alpha\_i} = \{ (x\_{k}, x\_{l}) \in X \times X \mid k, l \ge i \},$$

which are elements of the class R with their two projections being identical.

Since we are using the min t-norm to construct the product measure, the measure of each *α*-cut of the product space is equal to the measure assigned by *μ* to the corresponding *α*-cut in the marginal space, and thus

$$\begin{aligned} \phi\_{\min}^{\min}(h,h) &= \oint h^{\min} \circ \mu\_{12}^{\min} = \sum\_{i=1}^{n} \mu\_{12}^{\min}(H\_{\alpha\_i})(\alpha\_i - \alpha\_{i-1}) \\ &= \sum\_{i=1}^{n} \mu(H\_{\alpha\_i}^{\downarrow X})(\alpha\_i - \alpha\_{i-1}) = E\_{\mu}(h). \end{aligned}$$

**Example 7.** *Consider a reference set X* = {*x*1, *x*2, *x*3} *and the function defined as h*(*x*1) = 0.1*, h*(*x*2) = 0.4*, h*(*x*3) = 0.7*. The function h*<sup>min</sup> *is displayed in Table 4.*

**Table 4.** Values of the function *h*min.


*It can be seen how the diagonal contains the original values of h, and its α-cuts are*

$$\begin{aligned} H\_{0.1} &= X \times X, & \mu\_{12}^{\min}(H\_{0.1}) &= \min\{\mu(X), \mu(X)\}, \\ H\_{0.4} &= \{x\_{2}, x\_{3}\} \times \{x\_{2}, x\_{3}\}, & \mu\_{12}^{\min}(H\_{0.4}) &= \min\{\mu(\{x\_{2}, x\_{3}\}), \mu(\{x\_{2}, x\_{3}\})\}, \\ H\_{0.7} &= \{x\_{3}\} \times \{x\_{3}\}, & \mu\_{12}^{\min}(H\_{0.7}) &= \min\{\mu(\{x\_{3}\}), \mu(\{x\_{3}\})\}, \end{aligned}$$

*and therefore*

$$\begin{aligned} \phi\_{\min}^{\min}(h,h) &= \mu\_{12}^{\min}(H\_{0.1})(0.1-0) + \mu\_{12}^{\min}(H\_{0.4})(0.4-0.1) + \mu\_{12}^{\min}(H\_{0.7})(0.7-0.4) \\ &= \mu(X)(0.1-0) + \mu(\{x\_2, x\_3\})(0.4-0.1) + \mu(\{x\_3\})(0.7-0.4) \\ &= E\_\mu(h). \end{aligned}$$
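The computation in Example 7 can be replayed programmatically: since the *α*-cuts of *h*<sup>min</sup> are the squares *H* × *H* and the min-product measure assigns them min{*μ*(*H*), *μ*(*H*)} = *μ*(*H*), the global expectation collapses to the marginal Choquet integral, as Theorem 3 states. In the sketch below, *h* is taken from Example 7, while the fuzzy measure values and the function names are our own.

```python
def choquet(h, mu):
    """Marginal Choquet integral E_mu(h)."""
    xs = sorted(h, key=h.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        total += (h[x] - prev) * mu[frozenset(xs[i:])]
        prev = h[x]
    return total

def global_min_expectation(h, mu):
    """phi_min^min(h, h): Choquet integral of h^min over X x X, exploiting that
    its alpha-cuts are the squares H_alpha x H_alpha, to which the min-product
    measure assigns min{mu(H_alpha), mu(H_alpha)} = mu(H_alpha)."""
    xs = sorted(h, key=h.get)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        H = frozenset(xs[i:])               # projection of the square alpha-cut
        total += (h[x] - prev) * min(mu[H], mu[H])
        prev = h[x]
    return total

# Function from Example 7; the fuzzy measure values are our own choice.
h = {'x1': 0.1, 'x2': 0.4, 'x3': 0.7}
mu = {frozenset(): 0.0,
      frozenset({'x1'}): 0.2, frozenset({'x2'}): 0.3, frozenset({'x3'}): 0.4,
      frozenset({'x1', 'x2'}): 0.5, frozenset({'x1', 'x3'}): 0.6,
      frozenset({'x2', 'x3'}): 0.8, frozenset({'x1', 'x2', 'x3'}): 1.0}

assert abs(global_min_expectation(h, mu) - choquet(h, mu)) < 1e-12  # Theorem 3
```

The equality holds for any fuzzy measure on the marginal space, since the argument only uses the square shape of the *α*-cuts.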

As with the common expectation, the global expectation is not normalized, but it can easily be normalized in the same way, as stated in the next definition.

**Definition 25.** *Let* (*X*1, A*X*<sup>1</sup> , *μ*1) *and* (*X*2, A*X*<sup>2</sup> , *μ*2) *be measurable spaces and let h*<sup>1</sup> *and h*<sup>2</sup> *be functions defined on X*<sup>1</sup> *and X*<sup>2</sup> *respectively and taking values on* [0, 1]*. We define the global coefficients of concordance of h*<sup>1</sup> *and h*<sup>2</sup> *as*

$$\Phi\_1(h\_1, h\_2) = \frac{\phi\_{\min}^{\min}(h\_1, h\_2)}{\sqrt{E\_{\mu\_1}(h\_1)E\_{\mu\_2}(h\_2)}},\tag{47}$$

$$\Phi\_2(h\_1, h\_2) = \frac{\phi\_{\min}^{\min}(h\_1, h\_2)}{\min\{E\_{\mu\_1}(h\_1), E\_{\mu\_2}(h\_2)\}}. \tag{48}$$

**Example 8.** *(Continuation of Example 5)*

*Using the data in Table 3 we can obtain the function* $h\_{12}^{\min}$*, the values of which are given in Table 5.*


**Table 5.** Values of the function $h\_{12}^{\min}$.

*Using the fuzzy measure in Table 2, we find that* $\phi\_{\min}^{\min}(h\_1, h\_2) = 0.6$ *and the global concordance coefficients are* Φ1(*h*1, *h*2) = 0.953 *and* Φ2(*h*1, *h*2) = 0.984*.*

*The value of the global expectation (*0.6*) is very close to the values of the monotone expectations for each student in Example 5 (0.61 and 0.65 respectively). This can be interpreted as meaning that the grades of both students are acceptable individually and also globally, which is reflected in high values of the global coefficients of concordance. Note how the global expectation does not detect the fact that both students have different profiles (scientific and humanistic), while the common expectation detected this fact, yielding a much lower value (0.25) and consequently lower values of the coefficients of concordance as well.*

#### **5. Conclusions**

With the introduction of the concept of monotone variance, we have complemented the already known concept of monotone expectation. The monotone variance can be regarded as a measure of dispersion with respect to a central position measure. We have also introduced the concepts of central and non-central monotone moments, which can serve as a vehicle for defining further statistical parameters based on fuzzy measures, such as shape measures. The potential application scope is wide, as it covers non-additive scenarios like the ones described in the examples in this paper, which arise, for instance, in Engineering and Social Sciences applications.

The common expectation and the concordance coefficients can be interpreted as measures of the match between two functions and, in that sense, can provide information about the extent to which one function explains the other. A possible application of these concepts is the development of prediction models when the measures are not additive.

Thanks to the developments in [31] we have been able to extend the concept of monotone expectation to product spaces, where, in addition, we have shown how to marginalize the information provided by a function over a product space using the marginal ⊕-expectations.

All the developments in this paper are restricted to finite reference sets. Even though this setting covers a wide variety of practical applications, extending the results obtained here to uncountable reference sets seems to be a promising research line. The first step in this direction would be the extension of the results in [31] to continuous domains.

**Author Contributions:** Investigation, F.R., M.M. and A.S.; writing—original draft, F.R., M.M. and A.S.; writing—review and editing, F.R., M.M. and A.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Spanish Ministry of Science and Innovation through grants TIN2016-77902-C3-3-P, PID2019-106758GB-C32 and by ERDF-FEDER funds.

**Conflicts of Interest:** The authors declare no conflict of interest.
