**2. Preliminaries**

*2.1. Pythagorean Fuzzy Set (PFS)*

**Definition 1** ([18,23])**.** *Suppose X is a fixed set. A PFS takes the form of:*

$$P = \{ \langle x,\; P(\mu_P(x), \nu_P(x)) \rangle \mid x \in X \}$$

*where $\nu_P(x)\colon X \to [0,1]$ and $\mu_P(x)\colon X \to [0,1]$ represent the non-membership function and membership function of $x \in X$, respectively, with $\mu_P^2(x) + \nu_P^2(x) \le 1$. In addition, $\pi_P(x) = \sqrt{1 - \mu_P^2(x) - \nu_P^2(x)}$ denotes the hesitation degree of $x \in X$.*

For the sake of simplicity, Zhang and Xu [19] named $P(\mu_P(x), \nu_P(x))$ the Pythagorean fuzzy number (PFN), expressed by $\beta = P(\mu_\beta, \nu_\beta)$, where $\mu_\beta, \nu_\beta \in [0, 1]$, $\pi_\beta = \sqrt{1 - \mu_\beta^2 - \nu_\beta^2}$ and $\mu_\beta^2 + \nu_\beta^2 \le 1$.

**Definition 2** ([9])**.** *Assume $S = \{s_i \mid i = 0, \cdots, t\}$ is a linguistic term set, where $s_i$ is the linguistic evaluation value and $t$ determines the granularity of $S$. Take the seven-level linguistic term set as an example: $S = \{s_0$ = Extremely low, $s_1$ = Very low, $s_2$ = Low, $s_3$ = Fair, $s_4$ = High, $s_5$ = Very high, $s_6$ = Extremely high$\}$.*

*S* must satisfy the following two properties:

(1) The set is ordered: $s_i \le s_j$ if and only if $i \le j$;
(2) There is a negation operator: $neg(s_i) = s_{t-i}$.

**Definition 3** ([22])**.** *Based on the definitions of linguistic term set and PFS, the Pythagorean fuzzy linguistic set* (*PFLS*) *takes the form of $D = \{\langle s_{\tau(x)}, P(\mu_P(x), \nu_P(x))\rangle \mid x \in X\}$, and the Pythagorean fuzzy linguistic number* (*PFLN*) *is denoted as $\langle s_{\tau(x)}, P(\mu_P(x), \nu_P(x))\rangle$, where $s_{\tau(x)}$ is the linguistic evaluation value.*

When the attribute values are represented as linguistic terms in MADM problems, the linguistic variables cannot be calculated directly. Therefore, Xu used the subscript of the linguistic term for computation [24], Wang put forward a linguistic scale function to convert linguistic terms into crisp numbers [11], and Herrera converted linguistic terms into fuzzy numbers [25]. However, all these methods still essentially express linguistic variables through their subscripts. To express the fuzziness and uncertainty of linguistic terms, we introduce the shadowed set method to handle linguistic terms, and further put forward a new Pythagorean shadowed set.

## *2.2. Shadowed Set*

**Definition 4** ([13,14])**.** *A shadowed set S is a set-valued mapping as follows:*

$$S: \mathcal{U} \to \{0, [0, 1], 1\}$$

*where U is a given universe of discourse.*

The core of the shadowed set *S* is the area where the mapping values of the elements are equal to 1:

$$core(S) = \{ \mathbf{x} \in \mathcal{U} | S(\mathbf{x}) = 1 \}$$

The elements of *U* whose mapping values are unit intervals in *S* compose the shadowed zone of the shadowed set and are expressed as follows,

$$CLI(S) = \{ \mathbf{x} \in \mathcal{U} | S(\mathbf{x}) = [0, 1] \}$$

The elements of *U* whose mapping values are equal to 0 will be excluded from the shadowed set *S*.

**Definition 5.** *$A = [a, b, c, d]$ is called a shadowed number* (*SN*)*, where $a$, $b$ are the lower and upper bounds of the left-shoulder shadowed part, and $c$, $d$ are the lower and upper bounds of the right-shoulder shadowed part. Figure 1 shows an illustration of a shadowed number.*

**Figure 1.** Shadowed number.

#### *2.3. Pythagorean Shadowed Set (PSS)*

In this section, we will define the Pythagorean shadowed set and give some properties for it.

**Definition 6.** *Suppose X is a fixed set, a Pythagorean shadowed set T over X takes the form of*

$$T = \{ \langle A,\; P(\mu_P(x), \nu_P(x)) \rangle \mid x \in X \}$$

*where $A = [a, b, c, d]$; $a$, $b$ are the lower and upper bounds of the left-shoulder shadowed part, respectively, and $c$, $d$ are the lower and upper bounds of the right-shoulder shadowed part, respectively. The functions $\mu_P(x)\colon X \to [0,1]$ and $\nu_P(x)\colon X \to [0,1]$ denote the membership function and non-membership function, respectively, with $\mu_P^2(x) + \nu_P^2(x) \le 1$, and $\pi_P(x) = \sqrt{1 - \mu_P^2(x) - \nu_P^2(x)}$ denotes the hesitation degree of $x \in X$.*

**Definition 7.** *A Pythagorean shadowed number* (*PSN*) *takes the form of:*

$$V = \langle A,\; P(\mu_P(x), \nu_P(x)) \rangle$$

*where $a$, $b$ are the lower and upper bounds of the left-shoulder shadowed part, $c$, $d$ are the lower and upper bounds of the right-shoulder shadowed part, and $\mu_P(x)\colon X \to [0,1]$ and $\nu_P(x)\colon X \to [0,1]$ represent the membership function and non-membership function, respectively.*

Let $V_1 = \langle A_1, P(\mu_P(x_1), \nu_P(x_1))\rangle$ and $V_2 = \langle A_2, P(\mu_P(x_2), \nu_P(x_2))\rangle$ be two PSNs, where $A_1 = [a_1, b_1, c_1, d_1]$ and $A_2 = [a_2, b_2, c_2, d_2]$; then the operation rules are as follows:

$$\text{(1)} \quad V_1 + V_2 = \left\langle \left[a_1 + a_2,\, b_1 + b_2,\, c_1 + c_2,\, d_1 + d_2\right],\; P\!\left(\sqrt{\mu_P^2(x_1) + \mu_P^2(x_2) - \mu_P^2(x_1)\,\mu_P^2(x_2)},\; \nu_P(x_1)\,\nu_P(x_2)\right) \right\rangle$$

$$\text{(2)} \quad V_1 \times V_2 = \left\langle \left[a_1 a_2,\, b_1 b_2,\, c_1 c_2,\, d_1 d_2\right],\; P\!\left(\mu_P(x_1)\,\mu_P(x_2),\; \sqrt{\nu_P^2(x_1) + \nu_P^2(x_2) - \nu_P^2(x_1)\,\nu_P^2(x_2)}\right) \right\rangle$$

$$\text{(3)} \quad \lambda V_1 = \left\langle \left[\lambda a_1,\, \lambda b_1,\, \lambda c_1,\, \lambda d_1\right],\; P\!\left(\sqrt{1 - \left(1 - \mu_P^2(x_1)\right)^{\lambda}},\; \left(\nu_P(x_1)\right)^{\lambda}\right) \right\rangle, \; \lambda \ge 0$$

$$(4)\quad V\_1^{\lambda} = \left\langle \left[a\_1^{\lambda}, b\_1^{\lambda}, c\_1^{\lambda}, d\_1^{\lambda}\right], P\left(\left(\mu\_p(\mathbf{x}\_1)\right)^{\lambda}, \sqrt{1 - \left(1 - \left(\nu\_p(\mathbf{x}\_1)\right)^2\right)^{\lambda}}\right) \right\rangle, \lambda \ge 0.$$
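The four operation rules above can be sketched in code. This is a minimal sketch, assuming a PSN is represented as a tuple `(A, u, v)` with `A = [a, b, c, d]`; the function names are illustrative, not from the paper:

```python
import math

# A PSN is modeled here as (A, u, v): the shadowed number A = [a, b, c, d],
# the membership degree u and the non-membership degree v.

def psn_add(V1, V2):
    # Rule (1): interval parts add; membership degrees combine Pythagorean-style.
    (A1, u1, v1), (A2, u2, v2) = V1, V2
    A = [x + y for x, y in zip(A1, A2)]
    u = math.sqrt(u1**2 + u2**2 - u1**2 * u2**2)
    return (A, u, v1 * v2)

def psn_mul(V1, V2):
    # Rule (2): interval parts multiply; non-membership degrees combine Pythagorean-style.
    (A1, u1, v1), (A2, u2, v2) = V1, V2
    A = [x * y for x, y in zip(A1, A2)]
    v = math.sqrt(v1**2 + v2**2 - v1**2 * v2**2)
    return (A, u1 * u2, v)

def psn_scale(lam, V):
    # Rule (3), for lambda >= 0.
    A, u, v = V
    return ([lam * x for x in A], math.sqrt(1 - (1 - u**2)**lam), v**lam)

def psn_power(V, lam):
    # Rule (4), for lambda >= 0.
    A, u, v = V
    return ([x**lam for x in A], u**lam, math.sqrt(1 - (1 - v**2)**lam))
```

By construction, `psn_add` and `psn_mul` are symmetric in their arguments, which is exactly the commutativity stated in Theorem 1 below.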

**Theorem 1.** *For any two PSNs $V_1 = \langle A_1, P(\mu_P(x_1), \nu_P(x_1))\rangle$ and $V_2 = \langle A_2, P(\mu_P(x_2), \nu_P(x_2))\rangle$, where $A_1 = [a_1, b_1, c_1, d_1]$ and $A_2 = [a_2, b_2, c_2, d_2]$, the calculation rules satisfy the following properties:*


*(1)* $V_1 + V_2 = V_2 + V_1$;  *(2)* $V_1 \times V_2 = V_2 \times V_1$.


#### **3. Shadowed Set Model of Linguistic Terms**

We collect interval data for each word of the seven-level linguistic term set listed in Definition 2 and use the collected interval data to construct the shadowed set models for the seven-level linguistic terms. The interval data are obtained by means of a questionnaire survey. The main part of our questionnaire is designed to get a proper interval value for each word of the seven-level linguistic term set from the respondents according to their experience, habits and common sense. The filled-in numbers are required to be accurate to the first decimal place.

We handed out our questionnaires via leaflets, emails and online survey websites to people in different fields, especially to those with a bachelor's degree or above. In the end, we got 1205 valid questionnaires, and the questionnaire data were processed by the following interval data preprocessing method to obtain the shadowed numbers of the seven-level linguistic terms.

#### *3.1. Interval Data Preprocessing*

Wu and Liu [26,27] proposed an efficient method to preprocess interval data, and we preprocessed the *n* interval endpoint data [*ak*, *bk*](*k* = 1, 2, . . . , *n*) based on this method, as follows:

**Step 1:** Bad data processing. This aims to remove unreasonable results from the surveyed people, whose answers were beyond the range of the universe of discourse *U*. If the interval endpoints satisfy the following conditions, the interval data are acceptable. Otherwise, they will be rejected.

$$\begin{cases} 0 \le a_k \le 10 \\ 0 \le b_k \le 10 \\ b_k \ge a_k \end{cases}, \quad k = 1, 2, \dots, n$$

By this step, some data will be abandoned, and *n*<sup>∗</sup> < *n* interval data will be preserved.

**Step 2:** Outlier Processing. By using the Box and Whisker test [28], the data that are extremely large or small, i.e., outliers, can be eliminated. Outlier tests can be applied to process the endpoints of interval data and the lengths of interval data *Lk* = *bk* − *ak*, respectively. Consequently, only the interval endpoints and lengths satisfying the following conditions are kept:

$$\begin{cases} a_k \in \left[Q_a(0.25) - 1.5IQR_a,\ Q_a(0.75) + 1.5IQR_a\right] \\ b_k \in \left[Q_b(0.25) - 1.5IQR_b,\ Q_b(0.75) + 1.5IQR_b\right] \\ L_k \in \left[Q_L(0.25) - 1.5IQR_L,\ Q_L(0.75) + 1.5IQR_L\right] \end{cases}, \quad k = 1, 2, \dots, n^*$$

where $Q_a$ and $IQR_a$ are the quartiles and interquartile range of the left endpoints, $Q_b$ and $IQR_b$ those of the right endpoints, and $Q_L$ and $IQR_L$ those of the interval lengths. $Q(0.25)$ and $Q(0.75)$ are the first and third quartiles, below which 25% and 75% of the data fall, respectively. The interquartile range $IQR$ is the difference between $Q(0.75)$ and $Q(0.25)$; that is, the interval between $Q(0.25)$ and $Q(0.75)$ contains the middle 50% of the data. Points more than $1.5IQR$ below the first quartile or more than $1.5IQR$ above the third quartile are regarded as outliers.

After this step, *m*<sup>∗</sup> < *n*<sup>∗</sup> interval data will remain.
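Steps 1 and 2 are straightforward to automate. A minimal sketch, assuming the universe of discourse is $U = [0, 10]$ and intervals are plain `(a_k, b_k)` pairs (function names are ours):

```python
import statistics

def step1_bad_data(intervals, lo=0.0, hi=10.0):
    # Step 1: keep only intervals with both endpoints inside U and b_k >= a_k.
    return [(a, b) for a, b in intervals
            if lo <= a <= hi and lo <= b <= hi and b >= a]

def _fences(values):
    # Box-and-whisker fences: [Q(0.25) - 1.5*IQR, Q(0.75) + 1.5*IQR].
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def step2_outliers(intervals):
    # Step 2: apply the fence test to left endpoints, right endpoints and lengths.
    lo_a, hi_a = _fences([a for a, _ in intervals])
    lo_b, hi_b = _fences([b for _, b in intervals])
    lo_L, hi_L = _fences([b - a for a, b in intervals])
    return [(a, b) for a, b in intervals
            if lo_a <= a <= hi_a and lo_b <= b <= hi_b and lo_L <= b - a <= hi_L]
```

Note that `statistics.quantiles` uses the exclusive method by default; the exact quartile convention of [28] may differ slightly.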

Then, the following statistics of the $m^*$ interval data are calculated: $m_l$ and $\sigma_l$ are the mean value and standard deviation of the $m^*$ left endpoints, respectively. Similarly, $m_r$ and $\sigma_r$ represent the mean value and standard deviation of the $m^*$ right endpoints, and $m_L$ and $\sigma_L$ denote the mean value and standard deviation of the lengths of the $m^*$ interval data.

**Step 3:** Tolerance limit processing. If the remaining intervals satisfy the following conditions, then they will be accepted; otherwise, they will be rejected.

$$\begin{cases} a\_k \in \left[ m\_l - \eta \sigma\_l, m\_l + \eta \sigma\_l \right] \\ b\_k \in \left[ m\_r - \eta \sigma\_r, m\_r + \eta \sigma\_r \right] \\ L\_k \in \left[ m\_L - \eta \sigma\_L, m\_L + \eta \sigma\_L \right] \end{cases} , k = 1, 2, \dots, m^\*$$

where $\eta$ is the tolerance factor, which guarantees that the given limits include at least the proportion $1 - \alpha$ of the measurements with $100 \cdot (1 - \gamma)\%$ confidence. The value of the tolerance factor can be obtained from Table 1 [29].


**Table 1.** Tolerance factor *η* for several collected data.

After the processing of Step 3, $m^{**} < m^*$ ($1 \le m^{**} \le n$) interval data will be left, and the statistical characteristics $m_l$, $\sigma_l$, $m_r$, $\sigma_r$, $m_L$ and $\sigma_L$ of the left endpoints, right endpoints and lengths of the $m^{**}$ interval data will be recomputed.
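Step 3 can be sketched the same way; `eta` comes from Table 1 (e.g. $\eta = 1.709$ for the setting used later), and the population standard deviation matches the division by $m$ in Equation (2):

```python
import statistics

def step3_tolerance(intervals, eta):
    # Step 3: keep intervals whose endpoints and lengths lie within
    # mean +/- eta * standard deviation.
    def limits(xs):
        m, s = statistics.mean(xs), statistics.pstdev(xs)
        return m - eta * s, m + eta * s
    lo_a, hi_a = limits([a for a, _ in intervals])
    lo_b, hi_b = limits([b for _, b in intervals])
    lo_L, hi_L = limits([b - a for a, b in intervals])
    return [(a, b) for a, b in intervals
            if lo_a <= a <= hi_a and lo_b <= b <= hi_b and lo_L <= b - a <= hi_L]
```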

**Step 4:** Reasonable-interval processing. If the intervals satisfy the following conditions, they will be kept; otherwise, they will be rejected.

$$2m\_l - \phi^\* \le a\_k < \phi^\* < b\_k \le 2m\_r - \phi^\*$$

where

$$\phi^\* = \frac{\left(m\_r \sigma\_l^2 - m\_l \sigma\_r^2\right) \pm \sigma\_l \sigma\_r \left[\left(m\_l - m\_r\right)^2 + 2\left(\sigma\_l^2 - \sigma\_r^2\right) \ln\left(\sigma\_l / \sigma\_r\right)\right]^{1/2}}{\sigma\_l^2 - \sigma\_r^2}$$

After this step, there will be *m* interval data.
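Step 4 can also be sketched in code. The sign in the $\phi^*$ formula is resolved here by keeping the root that lies between the endpoint means, and the fallback to the midpoint when $\sigma_l = \sigma_r$ is our assumption (the printed formula is undefined in that case):

```python
import math
import statistics

def step4_reasonable(intervals):
    # Step 4: keep intervals with 2*m_l - phi <= a_k < phi < b_k <= 2*m_r - phi.
    a_s = [a for a, _ in intervals]
    b_s = [b for _, b in intervals]
    m_l, m_r = statistics.mean(a_s), statistics.mean(b_s)
    s_l, s_r = statistics.pstdev(a_s), statistics.pstdev(b_s)
    if abs(s_l - s_r) < 1e-12:
        phi = (m_l + m_r) / 2  # assumed fallback for sigma_l == sigma_r
    else:
        disc = (m_l - m_r)**2 + 2 * (s_l**2 - s_r**2) * math.log(s_l / s_r)
        root = s_l * s_r * math.sqrt(disc)
        cands = [((m_r * s_l**2 - m_l * s_r**2) + sgn * root) / (s_l**2 - s_r**2)
                 for sgn in (1.0, -1.0)]
        # Keep the root lying between the endpoint means.
        phi = next((p for p in cands if m_l <= p <= m_r), cands[0])
    return [(a, b) for a, b in intervals
            if 2 * m_l - phi <= a < phi < b <= 2 * m_r - phi]
```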

In a word, there will be *m* interval data after the four processing steps above, which is not greater than the *n* interval data at the beginning, as shown in Figure 2.

$$n \xrightarrow{\text{Bad data}} n^* \xrightarrow{\text{Outliers}} m^* \xrightarrow{\text{Tolerance limit}} m^{**} \xrightarrow{\text{Reasonable interval}} m$$

**Figure 2.** The process of data preprocessing.

#### *3.2. Shadowed Set Model of Seven-Level Language Terms*

After data preprocessing, the distribution of the remaining interval data is obtained as shown in Figure 3. The intervals of the left-end points and right-end points can reflect the linguistic word's uncertainties from different surveyed persons. Therefore, it is necessary to determine the representative intervals for the left-end points and right-end points to express the uncertainties. As shown in Figure 3, the core area can be determined even if the surveyed people cannot give accurate representative intervals. As a result, the core of the shadowed set is the core area and the uncertain bound of the shadowed set is the representative intervals.

**Figure 3.** Distribution of the remaining interval data.

Next, we will estimate the representative intervals by the tolerance limit method via the following steps.

**Step 1:** Calculate the mean *ml* and standard deviation *σl* of the remaining left-end points

$$m_l = \frac{\sum_{k=1}^{m} l_k}{m} \tag{1}$$

$$\sigma_l = \sqrt{\frac{\sum_{k=1}^{m} \left(l_k - m_l\right)^2}{m}} \tag{2}$$

where $l_k$ denotes the left-end point of each remaining interval and $m$ is the number of remaining intervals.

**Step 2:** Determine the representative interval. Let [*Ll*, *Lr*] and [*Rl*, *Rr*] be the representative intervals of the left-end points and right-end points, respectively.

$$L_l = m_l - \eta \sigma_l \tag{3}$$

$$L_r = m_l + \eta \sigma_l \tag{4}$$

where *η* is the tolerance factor in Table 1.

Then, the representative interval $[R_l, R_r]$ for the right-end points is calculated in the same way.

The parameters *γ* and *α* are set to 0.05 and 0.1 in this paper, respectively, and we can obtain a tolerance factor *η* of 1.709 from Table 1. Take the seven-level language terms as an example: based on the results above, the shadowed set models for seven-level language terms can be constructed as shown in Figure 4.

**Figure 4.** The shadowed set models for seven-level language terms.
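Equations (1)–(4) with $\eta = 1.709$ amount to a few lines of code. A sketch, where the endpoint data are hypothetical rather than taken from the survey:

```python
import statistics

def representative_interval(endpoints, eta=1.709):
    # Eqs. (1)-(4): tolerance-limit interval around the mean endpoint.
    m = statistics.mean(endpoints)
    s = statistics.pstdev(endpoints)  # population form, matching Eq. (2)
    return m - eta * s, m + eta * s

# Hypothetical left endpoints collected for one linguistic word:
L_l, L_r = representative_interval([5.6, 5.8, 5.7, 5.9, 5.8, 5.7])
```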

#### **4. The Score Function of Pythagorean Shadowed Number**

Based on the concepts of shadowed number and Pythagorean shadowed number in Section 2, we will further present the score functions of shadowed number and Pythagorean shadowed number, respectively. Numerical examples will also be given to illustrate the specific calculation process of the two score functions.

According to the central limit theorem, the attribute value *rij* given by the decision-maker is stable and tends to be the most likely attribute value at a certain point, so it is believed that *rij* obeys the normal distribution within the fuzzy interval. From the tolerance limit method in Section 3.2, we can obtain the distribution of attribute value in the shadowed set *S* = {*Ai*|*U*}, *Ai* = [*ai*, *bi*, *ci*, *di*], as shown in Figure 5.

**Figure 5.** Normal distribution of attribute value.

**Definition 8.** *The score function of shadowed number A is defined as follows:*

$$\text{score}(A) = a + \int\_{a}^{b} f(\mathbf{x}) \, d\mathbf{x} + c - b + \int\_{c}^{d} f(\mathbf{x}) \, d\mathbf{x} + d \tag{5}$$

*where $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}$.*

According to the $3\sigma$ principle of the normal distribution:

$$p(r \in [\mu - 3\sigma, \mu + 3\sigma]) = 0.9974, \ p(r \in [a, d]) = 0.9974$$

Then $\mu = \frac{a+d}{2}$ and $\sigma = \frac{d-a}{6}$.

**Example 1.** *The score function value of the shadowed number $A_0$ for 'High' in Figure 3 can be calculated as follows:*

$$\mu = \frac{a+d}{2} = \frac{5.77 + 7.96}{2} = 6.87, \quad \sigma = \frac{d-a}{6} = \frac{7.96 - 5.77}{6} = 0.37$$

$$f(x) = 1.08\, e^{-(x - 6.87)^2/0.27}$$

Then, we can obtain the figure of the shadowed number 'High' as shown in Figure 6.

**Figure 6.** The shadowed number of 'High' and its normal distribution.

In the same way, we can get the score functions of the shadowed sets for the other six language terms in Figure 3.

**Definition 9.** *The score function of a Pythagorean shadowed number V is denoted as:*

$$\text{score}(V) = \left( a + \int\_{a}^{b} f(\mathbf{x}) \, d\mathbf{x} + c - b + \int\_{c}^{d} f(\mathbf{x}) \, d\mathbf{x} + d \right) \cdot (u/v) \tag{6}$$

**Example 2.** *For a Pythagorean shadowed number V*1 = {[5.77, 6.48, 7.21, 7.96], *P*(0.7, 0.4)}*, the score function value is:*

$$\text{score}(V_1) = \left(5.77 + \int_{5.77}^{6.48} f(x)\,dx + 7.21 - 6.48 + \int_{7.21}^{7.96} f(x)\,dx + 7.96\right) \cdot (0.7/0.4) = 31.33$$

*where $f(x) = 1.08\, e^{-(x - 6.87)^2/0.27}$.*
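Because $f$ is a normal density with $\mu = (a+d)/2$ and $\sigma = (d-a)/6$, the integrals in Equations (5) and (6) have closed forms via the standard normal CDF. A minimal sketch (function names are ours; small numerical differences from the worked values arise from the rounding of $\sigma$):

```python
import math

def _norm_cdf(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def score_sn(a, b, c, d):
    # Eq. (5): a + integral_a^b f + (c - b) + integral_c^d f + d.
    mu, sigma = (a + d) / 2, (d - a) / 6
    left = _norm_cdf((b - mu) / sigma) - _norm_cdf((a - mu) / sigma)
    right = _norm_cdf((d - mu) / sigma) - _norm_cdf((c - mu) / sigma)
    return a + left + (c - b) + right + d

def score_psn(A, u, v):
    # Eq. (6): shadowed-number score weighted by the ratio u / v.
    return score_sn(*A) * (u / v)

s = score_psn([5.77, 6.48, 7.21, 7.96], 0.7, 0.4)
```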

#### **5. MADM Method Based on the Pythagorean Shadowed Set**

With the concept of PSS in mind, we can put forward a novel MADM approach under Pythagorean fuzzy linguistic circumstances. The diagram of the proposed method is shown in Figure 7. Firstly, we present a description of the MADM problem under Pythagorean fuzzy linguistic circumstances. Secondly, we transform the PFLS into a PSS through a data-driven method. Thirdly, we determine the ranking order of all alternatives, and thus obtain the best choice(s), by means of the score function of PSNs and the OWA operator. The whole decision-making process is carried out in the following steps.

**Figure 7.** Diagram of the proposed method.

**Step 1:** Standardize the decision matrix. For PFLNs $P_{ij} = \langle s_{\tau_{ij}}, P(\mu_P(x_{ij}), \nu_P(x_{ij}))\rangle$: for beneficial attributes, $\overline{P}_{ij} = P_{ij} = \langle s_{\tau_{ij}}, P(\mu_P(x_{ij}), \nu_P(x_{ij}))\rangle$; for cost attributes, $\overline{P}_{ij} = (P_{ij})^{-1} = \langle s_{(\tau_{ij})^{-1}}, P(\nu_P(x_{ij}), \mu_P(x_{ij}))\rangle$, where $(\tau_{ij})^{-1} = l + 1 - \tau_{ij}$ and $l$ is the number of language terms.

**Step 2:** Collect the data by questionnaire and obtain the shadowed sets of the language terms by processing the data. Transform the Pythagorean fuzzy linguistic numbers into PSNs using Figure 4.

**Step 3:** Transform the PSN decision matrix into the score function matrix based on Equation (6).

**Step 4:** Using the OWA operator, the attribute values $r_{ij}$ of each alternative $a_i$ are aggregated to obtain the comprehensive attribute values $z_i$:


$$z_i = \text{OWA}_w(r_{i1}, r_{i2}, \dots, r_{im}) = \sum_{j=1}^{m} w_j r_{ij}, \quad i = 1, 2, \dots, n$$

where $w = (w_1, w_2, \dots, w_m)$ is the criterion weight vector, $n$ is the number of alternatives, and $m$ is the number of attributes.
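Step 4 is easy to sketch in code. Note that, as printed, the formula is a plain weighted sum; a classical OWA operator would first sort the arguments in descending order before applying the weights, so both variants are shown (the score row below is hypothetical):

```python
def weighted_sum(r, w):
    # The aggregation exactly as printed in Step 4.
    return sum(wj * rj for wj, rj in zip(w, r))

def owa(r, w):
    # Classical OWA: weights applied to the arguments sorted in descending order.
    return sum(wj * rj for wj, rj in zip(w, sorted(r, reverse=True)))

w = (0.3, 0.2, 0.4, 0.1)                       # criterion weights from Section 6.1
z = weighted_sum([25.0, 28.0, 30.0, 26.0], w)  # hypothetical score row r_i
```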

**Step 5:** Determine the order of all the alternatives in the light of the comprehensive attribute values *zi*.

## **6. Numerical Study**

The proposed algorithm will be demonstrated by solving the problem of how to select the most suitable supplier for a company under various evaluation factors. At the same time, comparisons with the linguistic term subscript method and the linguistic scale function method are performed to show the advantages of our approach.

#### *6.1. Supplier Selection Problem*

A car company needs to choose an appropriate supplier of spare parts. A total of five alternative suppliers are denoted as $a_1, a_2, a_3, a_4, a_5$. After comprehensive consideration, four main factors are taken into account: $c_1$ supply capacity, $c_2$ delivery timeliness, $c_3$ service quality, and $c_4$ scientific research ability. The criterion weight vector is $w = (0.3, 0.2, 0.4, 0.1)$. The linguistic evaluation of the four attributes adopts the seven-level linguistic term set $S = \{s_0, s_1, s_2, s_3, s_4, s_5, s_6\}$ = {Extremely low, Very low, Low, Fair, High, Very high, Extremely high}. The decision matrix given by the experts is shown in Table 2:

**Table 2.** Decision matrix.


**Step 1:** $c_1$, $c_2$, $c_3$, $c_4$ are all beneficial attributes. Therefore, the standardized decision matrix is the same as Table 2.

**Step 2:** Transform PFLNs into Pythagorean shadowed numbers using Figure 4, and the result is shown in Table 3.


**Table 3.** Decision matrix with PSNs.

**Step 3:** Transform the PSN decision matrix into the score function matrix (shown in Table 4) based on Equation (6).


**Table 4.** Score function matrix.

**Step 4:** By OWA operator, the attribute values *rij* of each alternative *ai* are aggregated to obtain the comprehensive attribute values *zi*.

$z_1 = 25.87$, $z_2 = 25.15$, $z_3 = 31.65$, $z_4 = 27.01$, $z_5 = 29.91$.

**Step 5:** Rank the alternatives and obtain the best alternative(s) according to the comprehensive attribute values $z_i$ in Step 4.

$z_3 > z_5 > z_4 > z_1 > z_2$; that is, $a_3 \succ a_5 \succ a_4 \succ a_1 \succ a_2$.

Thus, the alternative $a_3$ is the best choice for the supplier selection problem.

## *6.2. Comparison Analysis*

To verify the superiority of our method, comparisons are made between our approach and two other approaches, i.e., the linguistic term subscript method [22] and the linguistic scale function method [11,19].

In [22], the score function of $p = \langle s_{\tau(x)}, P(\mu_\beta, \nu_\beta)\rangle$ is:

$$score(p) = \frac{\tau(x)}{t+1} \cdot \left(\mu_\beta^2 - \nu_\beta^2\right) \tag{7}$$

where *τ*(*x*) is the subscript of the linguistic term, and *t* is the number of linguistic terms.

We can obtain the comprehensive attribute values $z_i$ based on Equation (7) and the OWA operator: $z_1 = 0.094$, $z_2 = 0.104$, $z_3 = 0.098$, $z_4 = 0.115$, $z_5 = 0.147$, and $z_5 > z_4 > z_2 > z_3 > z_1$.

Therefore, the alternative $a_5$ is the best choice.

In [19], the score function of the PFN $\beta = P(\mu_\beta, \nu_\beta)$ is:

$$score(\beta) = \mu\_{\beta}^{2} - \nu\_{\beta}^{2} \tag{8}$$

In [11], the improved linguistic scale function is calculated as follows:

$$f(s_i) = \theta_i = \begin{cases} \dfrac{m^{\alpha} - (m-i)^{\alpha}}{2m^{\alpha}} & (i = 0, 1, 2, \dots, m) \\[4pt] \dfrac{m^{\beta} + (i-m)^{\beta}}{2m^{\beta}} & (i = m+1, m+2, \dots, t) \end{cases} \tag{9}$$

where $\alpha, \beta \in (0, 1]$, $m = t/2$, and $t$ is the number of linguistic terms.
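For the seven-level term set ($t = 6$, $m = 3$), Equation (9) with $\alpha = \beta = 0.5$ is easy to tabulate. A sketch, with an illustrative function name:

```python
def scale(i, t=6, alpha=0.5, beta=0.5):
    # Improved linguistic scale function, Eq. (9), with m = t / 2.
    m = t / 2
    if i <= m:
        return (m**alpha - (m - i)**alpha) / (2 * m**alpha)
    return (m**beta + (i - m)**beta) / (2 * m**beta)

values = [scale(i) for i in range(7)]  # f(s_0) = 0, f(s_3) = 0.5, f(s_6) = 1
```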

According to the improved linguistic scale function (9) and the score function (8), we can obtain the score function of $p = \langle s_{\tau(x)}, P(\mu_\beta, \nu_\beta)\rangle$ as:

$$score(p) = f(s_i) \cdot \left(\mu_\beta^2 - \nu_\beta^2\right) \tag{10}$$

Let *α* = *β* = 0.5. We can obtain the comprehensive attribute values *zi* based on Equation (10) and the OWA operator.

*z*1 = 0.07, *z*2 = 0.15, *z*3 = 0.2, *z*4 = 0.14, *z*5 = 0.16, and *z*3 > *z*5 > *z*2 > *z*4 > *z*1. Therefore, the alternative *a*3 is the best choice.

From Table 5, it can be observed that the ranking result obtained via our algorithm differs from those of the other two methods. With the linguistic term subscript method, the ranking order is $a_5 \succ a_4 \succ a_2 \succ a_3 \succ a_1$, which is totally different from the results of our method and the linguistic scale function method. The reason is that replacing linguistic words simply with linguistic subscripts leads to distortion of information; in fact, the linguistic subscript cannot effectively reflect the original decision information. Compared with the linguistic term subscript approach, the linguistic scale function method seems more reasonable for describing linguistic term information. However, the linguistic scale function still replaces linguistic words with numbers in nature, and information loss or distortion remains inevitable. Moreover, different people may have different viewpoints on the same word, but the linguistic subscript and the linguistic scale function can only express a single meaning for a word. Compared with the other two methods, we utilize a data-driven method to construct the shadowed set models for the linguistic terms, which can not only maintain the original decision information as far as possible, but also take different views of a single word into account.


