*3.1. Fuzzy Numbers*

A fuzzy number is a fuzzy set *A* on the referential set R that satisfies (i) *A* is normal, (ii) *A* - is convex, and (iii) the *α*-cuts of *A* - , *<sup>A</sup>*(*α*) = *A*(*α*), *<sup>A</sup>*(*α*), are closed and bounded (compact) intervals ∀*α* ∈ [0, 1]. The lower and upper bounds of the FN *<sup>A</sup>*(*α*) are:

$$A(\mathfrak{a}) = \left[ \underline{A}(\mathfrak{a}), \overline{A}(\mathfrak{a}) \right] = \left[ \inf\_{\mathbf{x} \in \mathbb{R}} \{ \mu\_{\bar{A}}(\mathbf{x}) \ge \mathfrak{a} \}, \sup\_{\mathbf{x} \in \mathbb{R}} \{ \mu\_{\bar{A}}(\mathbf{x}) \ge \mathfrak{a} \} \right] \tag{8}$$


FNs can be interpreted as the extension of the concept of a real number.


A triangular fuzzy number represented as *A*- = (*AL*/*AC*/*AU*), is a FN whose *α*-cuts, ∀*α* ∈ [0, 1], are, from Equation (8):

$$A(a) = \left[\underline{A}(a), \overline{A}(a)\right] = \left[A\_L + (A\_C - A\_L)a, A\_{ll} - (A\_{ll} - A\_C)a\right] \tag{9}$$

from where, if needed, the membership of *A*- could be obtained. The core of *A*- is *AC* and can be understood as the most reliable value of this TFN. i.e., the possibility of *AC* is 1. The support of *A*- is [*AL*, *AU*]. TFNs are used in countless practical applications including actuarial ones [22] because they are easy to handle arithmetically and they are well adapted to the way humans think of uncertain quantities. Moreover, when the information about a variable is vague and imprecise, the parsimony principle leads us to represent that information as simply as possible. The linear shape of TFNs meets that requirement. For instance, the uncertain quantity "approximately 0.04" can be represented in a very natural way as the TFN (0.038/0.04/0.042).

Likewise, let it be a TFN *A*-:

$$\tilde{A} \rhd \mathbf{x} \left( \text{or } \succeq \mathbf{x} \right) \text{ if } A\_L \rhd \mathbf{x} \left( A\_L \rhd \mathbf{x} \right) \forall \mathbf{x} \in \mathbb{R} \tag{10}$$

$$\tilde{A} < \mathbf{x} \left( \text{or } \le \mathbf{x} \right) \text{ if } A\_{L} < \mathbf{x} \left( A\_{L} \le \mathbf{x} \right) \forall \mathbf{x} \in \mathbb{R} \tag{11}$$

Let *f* be a continuous real-valued function of *n*-real variables *x* = (*<sup>x</sup>*1, *x*2,..., *xn*), *f*(*x*) = *f*(*<sup>x</sup>*1, *x*2,..., *xn*). If *xjs* are not crisp numbers, but FNs *Aj* with *α*–cuts *Aj*(*α*) = *Aj*(*α*), *Aj*(*α*) , *j* = 1, 2, ... , *n*, a FN *B*- is induced via *f* such that *B*- = *f A*-1, *A*-2,..., *An* . It is often difficult to obtain a closed expression for the membership function of *B*-. However, following [33], the *α*-cuts of *B*-, *<sup>B</sup>*(*α*), in the usual case where *<sup>A</sup>*1(*α*), ... , *An*(*α*) are not interactive, i.e., the variables that they quantify have an independent behavior, can be obtained as:

$$B(\mathfrak{a}) = \{ f(\mathfrak{x}) | \mathfrak{x} = (\mathfrak{x}\_1, \mathfrak{x}\_2, \dots, \mathfrak{x}\_n) \in \text{Dom}(\mathfrak{a}) \}\tag{12}$$

where *Dom*(*α*) stands for the rectangular domain:

$$Dom(a) = \left\{ \mathbf{x}\_{\bar{j}} \in \left[ \underline{A}\_{\bar{j}}(a), \overline{A}\_{\bar{j}}(a) \right], j = 1, 2, \dots, n \right\} \tag{13}$$

So, the lower (upper) bounds of *<sup>B</sup>*(*α*), *<sup>B</sup>*(*α*) (*B*(*α*)), are the global minimum (maximum) of *f*(*x*) within the rectangular domain in Equation (13), that is to say:

$$\underline{B}(\boldsymbol{a}) = \min\_{j,k} \{ f(V\_j), f(E\_k) \} \text{ and } \overline{B}(\boldsymbol{a}) = \max\_{j,k} \{ f(V\_j), f(E\_k) \} \tag{14}$$

being *Vj*, *j* = 1, 2, ... , 2*<sup>n</sup>*, a vertex of the domain (13) and *f*(*Ek*), *k* = 1, 2, ... , *K*, an extreme value of the function *f* within this domain that takes this value at point *x* = *Ek*.

Therefore, if *f*(*<sup>x</sup>*1, *x*2,..., *xn*) is monotonic, the lower and upper bounds of *<sup>B</sup>*(*α*), *<sup>B</sup>*(*α*) and *<sup>B</sup>*(*α*), are in one of the 2*n* vertexes of (13). Without loss of generality, let us suppose that *f* increases with respect to *xj*, *j* = 1, 2, ... , *m, m* ≤ *n*, and decreases in the last *n* − *m* variables, [34] demonstrates that:

$$\mathcal{B}(a) = \left[ \mathcal{B}(a), \overline{\mathcal{B}}(a) \right] = \left[ f\left( \Delta\_1(a), \Delta\_2(a), \dots, \Delta\_m(a), \overline{A}\_{m+1}(a), \overline{A}\_{m+2}(a), \dots, \overline{A}\_n(a) \right), \tag{15}$$

$$f\left( \overline{A}\_1(a), \overline{A}\_2(a), \dots, \overline{A}\_m(a), \underline{A}\_{m+1}(a), \underline{A}\_{m+2}(a), \dots, \underline{A}\_n(a) \right) \right]$$

If *<sup>A</sup>*1(*α*), ... , *An*(*α*) are interactive, (15) cannot be used to evaluate *<sup>B</sup>*(*α*). However, according to [35], the general formulation to obtain the lower and upper bounds of *<sup>B</sup>*(*α*) from Equations (12)–(14) is still valid but now the number of vertexes, *j*, is less than 2*<sup>n</sup>*. In [35], the authors study the role of interactive fuzzy variables in decision-making problems and analyze some particular cases. Concretely, when *f*(*x*) is the mathematical

expectation function, *xj* is the probability of the *j*th outcome, and it is quantified as a FN, the domain in Equation (13) turns into:

$$Dom(a) = \left\{ x\_{\vec{j}} \in \left[ \underline{A}\_{\vec{j}}(a), \overline{A}\_{\vec{j}}(a) \right], \sum\_{j=1}^{n} x\_{\vec{j}} = 1 \right\} \tag{16}$$


and Equation (14) becomes:

$$\underline{B}(\mathfrak{a}) = \min\_{j} \{ f(V\_{j}) \} \text{ and } \overline{B}(\mathfrak{a}) = \max\_{j} \{ f(V\_{j}) \} \tag{17}$$

due to the fact that *f*(*x*) is a linear function.

It is worth noting that the result *B* of evaluating a non-linear *f* with the TFNs *A j* is not necessarily a TFN. However, often *B* - admits a good triangular approximation through the secant approach. It builds up the shape of the triangular approximate FN *B* - to *B* - by means of the secant lines that unite the 0-cut and the 1-cut of *B* - . Such that *B* - is a TFN as:


$$
\dot{B} \approx \dot{B}' = (B\_L / B\_C / B\_{\rm II}) = \left( \underline{B}(0) / \underline{B}(1) = \overline{B}(1) / \overline{B}(0) \right) \tag{18}
$$

This approximation, as shown in [36], works pretty well for nonlinear monotonic functions of TFNs such as product, division, power, etc. Likewise, [37,38] show that this approach fits satisfactorily common actuarial and financial calculations with TFN parameters, e.g., the present value of a stream of fuzzy cash-flows. Keeping the triangular shape of the initial data when handling FNs is quite interesting. According to [39], complex shapes of FNs can generate problems with calculations in computer work or interpreting results intuitively. [40] state that a triangular approximate is a kind of defuzzification that is richer than just transforming a FN into a crisp representative value. If defuzzification is carried out too early, a grea<sup>t</sup> loss of information occurs, so it is preferable to drag all the fuzzy information in the calculations for as long as possible. The triangular approximation involves a compromise between simplification in computation and interpretation, and not oversimplifying the value of fuzzy parameters. In addition, TFNs have a very intuitive interpretation and, therefore, from the insurance industry point of view, a triangular approximate to actuarial variables and parameters could be very useful in decision-making processes.

## *3.2. Fuzzy Markov Chains*

Fuzzy set literature has provided three approaches to MC under fuzziness, namely FMC. The first one, due to [41], supposes fuzzy probabilities and proposes calculating the matrices *Pn*, ∀*n* > 1, by applying Zadeh's extension principle [42]. The second approach, in [43], consists of defining the matrix that governs the transition between states by means of a fuzzy relation. In the third one, [2], like [41], suppose that the probabilities of the one-step transition matrix are FNs. However, Buckley and Eslami's framework of FMCs uses restricted matrix multiplication to operate with probabilities in such a way that the constraint of being a well-formed probability distribution always holds. That is to say, they take into account the interdependence between the probabilities of a distribution function, similarly to Equations (16) and (17). This paper follows this last approach.

Let us assume that some probabilities *pij* in the one-step transition matrix *P* = *pij* are uncertain and are quantified by means of the FNs *pij*, with *α*-cuts *pij*(*α*). Now we have a fuzzy transition matrix *P* - = *pij*, with 0 ≤ *pij* ≤ 1. See Equations (10) and (11). Of course, some elements in *P* - may be crisp since crisp numbers are a particular case of a FN. FMCs defined by [2] have uncertainty in the transition probabilities but not in the set of outcomes, that is discrete. So, the following constraint on *pij*is added: *r* ∑ *pij*(1) = 1.

To compute the *n*-period transition matrix, *P* -(*n*), and the fuzzy stationary distribution, *π* -, the following process is implemented:

*j*=1

Step 1. For a given *α* ∈ [0, 1], obtain the matrix of intervals *<sup>P</sup>*(*α*) = *pij*(*α*) = *pij*(*α*), *pij*(*α*).

Step 2. Define the domain of a row *i* of this matrix, *Domi*(*α*), as:

$$Dom\_i(a) = \left( \times\_{j=1}^r p\_{ij}(a) \right) \cap D \tag{19}$$

where × stands for the cartesian product and *D* = \$(*x*1, *x*2,..., *xr*) *r*∑*j*=1 *pij* = 1, *pij* ≥ 0'.

Step 3. Define the domain of the matrix *<sup>P</sup>*(*α*), for the given *α* ∈ [0, 1], as:

$$Dom(\mathfrak{a}) = \left( \times\_{j=1}^{r\_s} Dom\_i(\mathfrak{a}) \right) \tag{20}$$

This domain defines a set of matrices that satisfy that each row sums up 1 with a possibility level of at least *α*. So, each matrix *P* = *pij*, *pij* ∈ *Dom*(*α*), is a crisp MC.

Step 4. Since *Domi*(*α*) in Equation (19) are compact sets, *Dom*(*α*) in Equation (20) is also compact. So, any continuous function applied to its elements has a compact image. Then, if Equation (2) is applied, such an image for that *α* is a compact interval. This interval is set as the *α*-cut of the FN *p*-(*n*) *ij* , i.e., *p*(*n*) *ij* (*α*) = *f* (*n*) *ij* (*Dom*(*α*)), which is surely normal [2].

To determine *p*(*n*) *ij* (*α*) we must find the lower and upper bounds, *p*(*n*) *ij* (*α*) and *p*(*n*) *ij* (*α*) by solving:

$$\underbrace{p\_{ij}^{(n)}} (\boldsymbol{\kappa}) = \min \left\{ f\_{ij}^{(n)} (\boldsymbol{p}) \, \middle| \, \boldsymbol{p} \in \operatorname{Dom}(\boldsymbol{\kappa}) \right\} \tag{21}$$

and

$$\overline{p\_{ij}^{(n)}}(a) = \max \left\{ f\_{ij}^{(n)}(p) \, \middle| \, p \in Dom(a) \right\} \tag{22}$$

Notice that Equations (21) and (22) can be easily solved in low-dimensional problems. However, in more complex problems it is necessary to use an algorithm (see, e.g., [44]) or a heuristic constrained optimization technique [45].

Finally, by performing Steps 1–4 ∀*α* ∈ [0, 1], the FNs *p*-(*n*) *ij*can be obtained.

In regards to the fuzzy stationary state, (*π*-1,..., *πr*), its *α*-cuts *<sup>π</sup>j*(*α*) = *<sup>π</sup>j*(*α*), *<sup>π</sup>j*(*α*)can be determined from Equation (3) as:

$$\pi\_j(u) = \min\{w\_j|w = wP, \ (p\_{11}, \dots, p\_{rr}) \in \text{Dom}(u)\}\tag{23}$$

$$
\overline{\pi\_j}(\mathfrak{a}) = \max \{ w\_j | w = wP\_{\prime\prime} \: (p\_{11\prime}, \dots, p\_{rr}) \in \text{Dom}(\mathfrak{a}) \}\tag{24}
$$

#### **4. Implementing a Markovian Fuzzy Bonus-Malus System Governed by a Fuzzy Poisson Discrete Random Variable**

In this Section, we propose an integral methodology to develop a FBMS under the hypothesis that *N* ∼ Po*λ*. It embeds the fitting of the risk parameter *λ* as a TFN, the obtaining of fuzzy transition matrix and the triangular approximate of the stationary distribution calculated by using Equations (23) and (24), and also the determination of the fuzzy mean asymptotic premium in Equation (4).

Step 1. Fit the risk factor *λ* as a TFN.

We point out three different options to estimate this parameter.

Option 1

Given that *λ* has an intuitive interpretation since it is the mean number of claims in one period, it may be quantified as a FN based on experts' opinions. For example, an expert may judge that a concrete type of driver generates approximately one claim every 5 years and so the TFN - *λ* = 0.2 # = (0.18/0.2/0.22) can be considered. Imprecise or subjective quantitative predictions can often come from a pool of experts, leading to a set of fuzzy quantifications. This set of fuzzy opinions can be aggregated simply by their arithmetic mean or other more sophisticated methods (see [46–48] for full details).

Option 2

Papers [49,50] consider a standard 1 − *α* statistical confidence interval as the observed *α*-cut of the FN, for some increasing values of *ε* ≤ *α* < 1, where *ε* is an arbitrary value near 0 (it is often chosen to be 0.001, 0.005 or 0.01). In [50] it is suggested that by placing those confidence intervals one on top of the other, a FN close to triangular-shaped is obtained. So, we point out two alternatives to apply that idea:

(a) Given that *λ* is the mean value of a Poisson RV, the interval estimates of *λ* can be used as the *α*-cuts of - *λ*. Let us denote as *N*∗ the mean number of claims in a pool of similar contracts, *SN* the standard deviation of *N* and *n* the number of policies in the pool. The 1 − *α* statistical confidence interval for the mean number of claims is:

$$\left[N^\* - t\_{\left(1-\frac{\mathfrak{g}}{2}, n-1\right)} \frac{\mathcal{S}\_N}{\sqrt{n}}, N^\* + t\_{\left(1-\frac{\mathfrak{g}}{2}, n-1\right)} \frac{\mathcal{S}\_N}{\sqrt{n}}\right] \tag{25}$$

where *t*( *α*2 ,*n*−<sup>1</sup>) stands for the (1 − *α*2 )-percentile of a Student *t* with *n* − 1 degrees of freedom and *SN* the standard deviation of the sample. So, - *λ* can be fitted through its *α*-cuts by doing, from Equation (25):

$$\lambda(a) = \left[\underline{\lambda}(a), \overline{\lambda}(a)\right] = \left[N^\* - t\_{\left(1-\frac{\mathfrak{s}}{2}, n-1\right)} \frac{S\_N}{\sqrt{n}}, N^\* + t\_{\left(1-\frac{\mathfrak{s}}{2}, n-1\right)} \frac{S\_N}{\sqrt{n}}\right] \tag{26}$$

(b) Papers [49,51] propose making fuzzy predictions from statistical linear regression models. In [49] it is stated that a 1 − *α* statistical confidence interval of coefficients adjusted with a linear regression may be interpreted as the *α*-cut of a FN for these coefficients. Therefore, let us suppose that a GLM estimate of *λ* is determined, as usual, by:

$$\ln \lambda = \sum\_{i=1}^{m} a\_i x\_i \tag{27}$$

being *ai*, *i* = 1, ... , *m*, the coefficients and *xi* the explanatory variables (e.g., age, gender, and driving experience in a car insurance context) that are crisp non-negative observations (in fact, they are usually modeled as dichotomic variables). For the estimate of each coefficient, it is possible to generate a FN *ai* whose *α*-cuts, *ai*(*α*), *i* = 1, . . . , *m*, are:

$$a\_i(\boldsymbol{a}) = \begin{bmatrix} \underline{a}\_i(\boldsymbol{a}), \overline{a}\_i(\boldsymbol{a}) \end{bmatrix} = \begin{bmatrix} a\_i^\* - t\_{\left(\frac{\mathfrak{g}}{2}, n-m-1\right)} S\_{a\_i \prime} a\_i^\* + t\_{\left(\frac{\mathfrak{g}}{2}, n-m-1\right)} S\_{a\_i} \end{bmatrix} \tag{28}$$

where *a*∗*i* is the GLM point estimate of *ai* and *Sai* the standard deviation of that estimate. So,fromthefunction - *λ* = *em* ∑ *i*=1 - *ai xi* andinmind and

 fuzzy , bearing Equations (15) (28),the following FN - *λ* is induced:

$$\lambda(\mathfrak{a}) = \left[ \underline{\lambda}(\mathfrak{a}), \overline{\lambda}(\mathfrak{a}) \right] = \left[ \epsilon^{\underline{m}} \underline{\omega}^{\underline{a}} \underline{\mathfrak{a}}^{(a)x\_{\bar{i}}}, \epsilon^{\underline{m}} \underline{\mathfrak{a}}^{\underline{m}} \overline{\mathfrak{a}}(\mathfrak{a}) x\_{\bar{i}} \right] \tag{29}$$

A similar approach may be developed from the results in [51]. However, in this case, it must be taken into account that their approach to making fuzzy predictions from a statistical regression is built up from the interval predictions of residuals instead of

using interval estimates of coefficients. So - *λ* = *e* ∑ *i*=1 *aixi*+ - *ε*where *ε* is a fuzzy error term induced from the residuals of the conventional regression and, so:

$$\lambda(\mathfrak{a}) = \left[ \underline{\lambda}(\mathfrak{a}), \overline{\lambda}(\mathfrak{a}) \right] \\ = \left[ \mathfrak{e}^{\frac{m}{i+1}a\_i \mathbf{x}\_i + \mathfrak{g}(\mathfrak{a})}, \mathfrak{e}^{\frac{m}{i+1}a\_i \mathbf{x}\_i + \mathfrak{f}(\mathfrak{a})} \right] \tag{30}$$




*m*

Equations (26), (29) and (30) do not give a TFN. However, *λ* can be approximated as a TFN simply by using Equation (18).
