Article

A Novel Method Based on the Fuzzy Entropy Measure to Optimize the Fuzziness in Trapezoidal Strong Fuzzy Partitions

by Barbara Cardone 1 and Ferdinando Di Martino 1,2,*
1 Department of Architecture, University of Naples Federico II, Via Toledo 402, 80134 Napoli, Italy
2 Center for Interdepartmental Research “Alberto Calza Bini”, University of Naples Federico II, Via Toledo 402, 80134 Napoli, Italy
* Author to whom correspondence should be addressed.
Information 2024, 15(10), 615; https://doi.org/10.3390/info15100615
Submission received: 10 September 2024 / Revised: 4 October 2024 / Accepted: 6 October 2024 / Published: 7 October 2024

Abstract

Analyzing the uncertainty of outcomes based on estimates of the data’s membership degrees to fuzzy sets is essential for making decisions. These fuzzy sets are often designated by experts as strong fuzzy partitions of the data domain with trapezoidal fuzzy numbers. Some indices of the fuzzy set’s fuzziness provide an assessment of the degree of uncertainty of the results. It is feasible to bring the fuzzy sets’ fuzziness below a tolerable level by suitably redefining the strong fuzzy partition. Significant changes to the original fuzzy partition, however, affect the decision maker’s approximate reasoning and the interpretability of the results. In light of this, we provide in this study a technique applied to trapezoidal strong fuzzy partitions that, while not appreciably altering the original fuzzy partition, reduces the fuzziness of its fuzzy sets. The fuzziness of the fuzzy sets is assessed using the De Luca and Termini fuzzy entropy. An iterative process is then executed, with the aim of modifying the cores of the trapezoidal fuzzy sets of the partition to decrease their fuzziness. This technique is tested on datasets containing average daily temperatures measured in various cities. The findings demonstrate that this approach strikes a good balance between the goal of lessening the fuzziness of the fuzzy sets and the goal of not appreciably altering the original fuzzy partition.

1. Introduction

Fuzzy logic and approximate reasoning models [1,2] are today a fundamental tool in eXplainable Artificial Intelligence (XAI) techniques [3], a recent research field that aims to provide understandable and reliable results for complex problems.
Fuzzy logic allows one to model knowledge in a human-oriented vision, adopting a paradigm that allows the use of natural language terms. This capability allows users to be provided with readable explanations of the embodied knowledge, the results and strategies adopted, and the inference processes used [4].
Strong fuzzy partitions [5] are frequently employed to simulate human knowledge in approximate reasoning and fuzzy systems. Because trapezoidal fuzzy numbers facilitate the construction and interpretation of fuzzy concepts and fuzzy rules by decision makers, they are the types of fuzzy sets most widely used to create strong fuzzy partitions [6].
The level of uncertainty and vagueness of fuzzy-based models is assessed by measuring the fuzziness of the fuzzy sets, where the fuzziness of a fuzzy set represents the quantity of fuzzy information gained from the fuzzy set. A high level of fuzziness causes uncertainty in assigning linguistic terms to variables, which can influence the interpretability of the results.
The measure of the fuzziness of a fuzzy set is called fuzzy entropy [7], a concept extending the Shannon entropy of a crisp set [8].
In many data analysis applications, such as feature selection [9,10], feature reduction [11,12], multicriteria decision aiding [13], multiple attribute decision making [14,15], real-time data recognition [16], fuzzy clustering [17,18], thematic data classification [19], and emotion detection [20], some fuzzy entropy functions have recently been used to assess the fuzziness of fuzzy sets.
A variety of probabilistic, non-probabilistic, and hybrid fuzzy entropy functions were suggested in the literature. A summary of the fuzzy entropy types can be found in [21,22,23]. The properties of a fuzzy entropy function that are necessary to gauge the degree of fuzziness of fuzzy sets are defined in [24].
The most widely recognized fuzzy entropy measure is the De Luca and Termini fuzzy entropy, a variant of the Shannon entropy that was first presented in [24]. It has been applied in numerous data analysis problems.
In [15], a knowledge measure based on the De Luca and Termini fuzzy entropy is introduced; it is used to calculate the weights in a Multiple Attribute Decision-Making model when they are either unknown or only partially known.
A technique for determining the ideal strong fuzzy partition for thematic classification, in which the fuzziness of the fuzzy sets is quantified using the De Luca and Termini fuzzy entropy, is presented in [19]. In order to lower each fuzzy set’s fuzziness below a predetermined level, the fuzzy set with the highest fuzziness is suitably split at each iteration.
In a preprocessing step, the De Luca and Termini fuzzy entropy is employed in [17,20] to choose the best initial fuzzy clusters for the Fuzzy C-Means algorithm. The centroids of the first fuzzy clusters with the lowest mean fuzziness are found iteratively once the number C of clusters has been chosen.
The primary critical point with these techniques is that the fuzzy partition that is ultimately produced by decreasing the fuzziness of fuzzy sets may differ significantly from the one that was first established by a human expert. This may have a considerable impact on how easily the results can be interpreted. Consequently, by modifying a fuzzy partition to reduce the fuzziness of its fuzzy sets, these techniques simultaneously make the processes and the results harder to comprehend.
As highlighted in [25], in order for the acquired knowledge in a fuzzy-based model to be easily verified, refined, and integrated with the domain knowledge of a human expert, it is necessary to guarantee a high interpretability of the model and of the results that it produces. Therefore, while on the one hand, approaches to reduce the fuzziness of fuzzy partitions, such as those carried out in [18,19], allow the reduction in the uncertainty of the results, on the other hand, they affect their interpretability, as they modify, even significantly, the initial fuzzy partitions. It is therefore necessary to use a method to reduce the fuzziness of fuzzy partitions, which represents a trade-off between the need to reduce the uncertainty of fuzzy models and that of maintaining their interpretability.
This study aims to lessen the fuzziness of the fuzzy sets in strong fuzzy partitions without significantly altering the original ones created by the expert. In contrast to the approach suggested in [18], the number of fuzzy sets and the terms assigned to them remain unchanged in order to preserve the expert’s approximate reasoning model.
Unlike the methods used in [18,19], the proposed method has the advantage of reducing the fuzziness of the fuzzy sets of strong fuzzy partitions without, however, deforming the structure of the fuzzy partition modeled by the expert, so as not to affect the interpretability of the results.
Our suggested approach is iterative and involves measuring the fuzziness values of the fuzzy sets of a strong trapezoidal fuzzy partition with respect to the dataset at each cycle. The De Luca and Termini fuzzy entropy is used to quantify the fuzziness. The core of the trapezoidal fuzzy set with the largest fuzziness, given by the real interval where its membership degree equals one, is steadily increased until the fuzziness of the fuzzy set is less than or equal to a predetermined threshold. When all fuzzy sets have fuzziness levels below or equal to the threshold, the algorithm ends.
A data source that holds the average daily temperatures observed in many metropolitan cities between 1995 and 2020 was utilized to test the methodology. In order to reduce the fuzziness of each fuzzy set below the threshold without significantly changing the initial partition, the same strong trapezoidal fuzzy partition is first created, and a fuzziness threshold is fixed. Then, the method is applied to the dataset of each city to find the best fuzzy partition with respect to the dataset.
The paper is structured as follows. Section 2 describes the basic concepts, namely the De Luca and Termini fuzzy entropy and the trapezoidal strong fuzzy partition; the details of the proposed method are discussed in Section 3. The results of the tests executed to evaluate the performance of the method are provided in Section 4. Finally, Section 5 concludes the study by summarizing the main points of the proposed fuzzy entropy-based method to optimize the fuzziness and the future perspectives of the research.

2. Basic Concepts

The concepts of a trapezoidal strong fuzzy partition and De Luca and Termini fuzzy entropy are covered in this section. As an extension of the Shannon entropy used in information theory, fuzzy entropy is introduced in Section 2.1. The trapezoidal strong fuzzy partition and the different kinds of trapezoidal fuzzy numbers that make it up are discussed in Section 2.2.

2.1. The Fuzzy Entropy Measure

Formally, let X be a random variable in the discrete domain U = {x1, …, xi, …, xn} and let pi be the probability that X = xi. The term
$$h(p_i) = -\log(p_i), \qquad i = 1, 2, \ldots, n, \tag{1}$$
denotes the probabilistic information gain function; it indicates the gain in information obtained when the outcome X = xi occurs. Since the logarithm is increasing and negative on the open interval (0, 1), the Shannon information gain −log(pi) is positive and grows as the probability pi decreases: the less likely the outcome X = xi, the greater the information gained when it occurs.
The probabilistic Shannon entropy for the random variable X is given by:
$$H(X) = K \sum_{i=1}^{n} p_i \, h(p_i), \tag{2}$$
where H(X) is a quantity useful to evaluate the mean information gain obtained following the occurrence of an event in which the random variable X assumes a value. It is defined, up to a positive constant of proportionality K, as the weighted average of the information gains with respect to their probability pi [8].
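Equations (1) and (2) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names, the choice K = 1, and the base-2 logarithm are assumptions made for the example, since the text leaves the base of the logarithm and the constant K unspecified.

```python
import math

def information_gain(p: float, base: float = 2.0) -> float:
    """Information gain h(p) = -log(p) for a probability p in (0, 1], Eq. (1)."""
    return -math.log(p, base)

def shannon_entropy(probs, K: float = 1.0, base: float = 2.0) -> float:
    """H(X) = K * sum_i p_i * h(p_i), Eq. (2); outcomes with p_i = 0 are skipped."""
    return K * sum(p * information_gain(p, base) for p in probs if p > 0)

# Two hypothetical distributions over four outcomes
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print(information_gain(0.1), information_gain(0.9))  # rarer outcome, larger gain
print(shannon_entropy(uniform))  # about 2 bits for four equally likely outcomes
print(shannon_entropy(skewed))   # smaller: the outcome is easier to predict
```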
In fuzzy logic, fuzzy uncertainty is expressed by the concept of fuzziness, given by the uncertainty in the assignment of an element to a fuzzy set [21].
Let A be a fuzzy set defined on the domain of a variable that takes the n values {x1, …, xi, …, xn}, and let μi be the membership degree of xi to A. The closer μi is to 0.5, the greater the uncertainty about whether xi should be assigned to the fuzzy set A.
A fuzzy entropy measure quantifies the fuzziness of a fuzzy set; it quantifies the degree of presence or absence of a fuzzy entity or a fuzzy property in relation to a variable. Unlike the Shannon entropy, which handles probabilistic uncertainty, fuzzy entropy instead deals with vagueness and ambiguous uncertainties [3].
The definition of information gain function has been adopted in the popular De Luca and Termini fuzzy entropy [24], a Shannon-based measure of the degree of fuzziness of a fuzzy set.
Let µ be the membership degree of an element to the fuzzy set A. The information gain function proposed in [24] is given by:
$$h(\mu) = \begin{cases} 0 & \text{if } \mu = 0 \\ -\mu \log_2 \mu - (1-\mu)\log_2(1-\mu) & \text{if } 0 < \mu < 1 \\ 0 & \text{if } \mu = 1, \end{cases} \tag{3}$$
where $-\log_2(\mu)$ and $-\log_2(1-\mu)$ denote, in base-2 logarithms, the information gains corresponding, respectively, to the membership degree μ and to the complement membership degree 1 − μ.
The measure of the grade of fuzziness of the fuzzy set A is given by:
$$H(A) = K \sum_{i=1}^{n} h(\mu_i), \tag{4}$$
where μi is the membership degree of the element xi to A, and K is a positive constant of proportionality, generally set to 1/n.
The De Luca and Termini fuzzy entropy (4) satisfies the following four properties:
(1) H(A) = 0 iff A is a crisp set;
(2) H(A) is maximum iff μi = 0.5 for each xi;
(3) H(C(A)) = H(A), where C(A) is the complement of the fuzzy set A;
(4) H(A) ≥ H(A’), where A’ is a sharpened version of A, i.e., any fuzzy set such that A’(x) ≥ A(x) if A(x) ≥ 0.5 and A’(x) ≤ A(x) if A(x) ≤ 0.5.
The greater the fuzziness, the greater the average degree of uncertainty in assigning a piece of data to the fuzzy set, and therefore the less significant the fuzzy set is with respect to the data.
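The information gain function (3), the fuzziness measure (4) with K = 1/n, and the four properties listed above can be checked numerically. The following is a minimal sketch; the function names and the membership vectors are hypothetical and chosen only for illustration.

```python
import math

def h(mu: float) -> float:
    """De Luca and Termini information gain of one membership degree, Eq. (3)."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -mu * math.log2(mu) - (1.0 - mu) * math.log2(1.0 - mu)

def fuzzy_entropy(memberships) -> float:
    """Fuzziness H(A) with K = 1/n, Eq. (4)."""
    memberships = list(memberships)
    return sum(h(mu) for mu in memberships) / len(memberships)

# Hypothetical membership degrees of four elements to a fuzzy set A
fuzzy = [0.2, 0.6, 0.9, 0.4]
crisp = [0.0, 1.0, 1.0, 0.0]                 # property (1): zero fuzziness
most_fuzzy = [0.5, 0.5, 0.5, 0.5]            # property (2): maximum fuzziness
complement = [1.0 - mu for mu in fuzzy]      # property (3): same fuzziness as A
sharpened = [0.1, 0.8, 1.0, 0.3]             # property (4): degrees pushed away from 0.5

for name, degrees in [("crisp", crisp), ("most fuzzy", most_fuzzy),
                      ("A", fuzzy), ("C(A)", complement), ("A'", sharpened)]:
    print(f"H({name}) = {fuzzy_entropy(degrees):.4f}")
```

With K = 1/n the measure is normalized to [0, 1], so the set with all membership degrees equal to 0.5 has fuzziness 1.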

2.2. Trapezoidal Strong Fuzzy Partitions

Let {A1, A2, …, AN+1} be a collection of normal and convex fuzzy sets defined for a domain U. It constitutes a Strong Fuzzy Partition (SFP) if the following constraint holds:
$$\forall x \in U: \quad \sum_{k=1}^{N+1} \mu_k(x) = 1. \tag{5}$$
Equation (5) is called the Ruspini condition [26]. It requires that the sum of the degrees of membership of an element x to the fuzzy sets of the fuzzy partition must always equal one; this means that element x certainly belongs to the union of all fuzzy sets in the fuzzy partition.
Let U = [m, M], with m < M, be a numerical domain given by a real closed interval. A simple way to obtain SFPs on it is to generate fuzzy partitions given by trapezoidal fuzzy numbers (TFNs). A TFN T is built by assigning four real numbers, a, b, c, d, where a ≤ b ≤ c ≤ d. We obtain:
$$T(x) = \begin{cases} 0 & \text{if } m \le x < a \\ \dfrac{x-a}{b-a} & \text{if } a \le x < b \\ 1 & \text{if } b \le x < c \\ \dfrac{x-d}{c-d} & \text{if } c \le x < d \\ 0 & \text{if } d \le x \le M. \end{cases} \tag{6}$$
When b = c, the trapezoidal fuzzy number becomes a triangular fuzzy number, given by:
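$$Tr(x) = \begin{cases} 0 & \text{if } m \le x < a \\ \dfrac{x-a}{b-a} & \text{if } a \le x < b \\ \dfrac{x-d}{b-d} & \text{if } b \le x < d \\ 0 & \text{if } d \le x \le M. \end{cases} \tag{7}$$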
Moreover, if a = b = m, then the TFN becomes a semi-trapezoidal fuzzy number called the R-function. It is given by:
$$R(x) = \begin{cases} 1 & \text{if } m \le x < c \\ \dfrac{x-d}{c-d} & \text{if } c \le x < d \\ 0 & \text{if } d \le x \le M. \end{cases} \tag{8}$$
Finally, if c = d = M, the TFN becomes a semi-trapezoidal fuzzy set called the L-function. It is given by:
$$L(x) = \begin{cases} 0 & \text{if } m \le x < a \\ \dfrac{x-a}{b-a} & \text{if } a \le x < b \\ 1 & \text{if } b \le x \le M. \end{cases} \tag{9}$$
Let the quadruple (ak, bk, ck, dk) be the set of parameters assigned to create, as a TFN, the kth fuzzy set, where k = 1, 2, …, N + 1. The collection of N + 1 trapezoidal fuzzy numbers {A1, A2, …, AN+1} defined on U = [m, M] is an SFP if the following constraints hold:
$$\begin{cases} a_1 = b_1 = m \\ a_{k+1} = c_k & k = 1, \ldots, N \\ b_{k+1} = d_k & k = 1, \ldots, N \\ c_{N+1} = d_{N+1} = M, \end{cases} \tag{10}$$
where the first and the last fuzzy sets, A1 and AN+1, are, respectively, an R-function and an L-function fuzzy number.
In this case, the intersection point of two consecutive fuzzy sets, Ak and Ak+1, where k = 1, …, N, is a point $\bar{x}_k$, called the cut-point, where $A_k(\bar{x}_k) = A_{k+1}(\bar{x}_k) = 0.5$ and $A_h(\bar{x}_k) = 0$ for each h ≠ k, k + 1.
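The definitions of this section can be sketched in Python as follows. The helper names and the three-set partition of U = [0, 100] are hypothetical; the cut-point formula (ck + dk)/2 is not stated in the text but follows from setting the descending branch of Ak in (6) equal to 0.5.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x to the TFN (a, b, c, d), following Eq. (6)."""
    if b <= x <= c:
        return 1.0                      # core; also covers the shoulders a = b and c = d
    if a < x < b:
        return (x - a) / (b - a)        # ascending branch
    if c < x < d:
        return (x - d) / (c - d)        # descending branch
    return 0.0

def is_strong_partition(tfns, m, M, tol=1e-9):
    """Check the constraints of Eq. (10) on a list of (a, b, c, d) tuples."""
    first, last = tfns[0], tfns[-1]
    ends_ok = (abs(first[0] - m) <= tol and abs(first[1] - m) <= tol
               and abs(last[2] - M) <= tol and abs(last[3] - M) <= tol)
    links_ok = all(abs(nxt[0] - cur[2]) <= tol and abs(nxt[1] - cur[3]) <= tol
                   for cur, nxt in zip(tfns, tfns[1:]))
    return ends_ok and links_ok

# Hypothetical SFP on U = [0, 100]: R-function, trapezoid, L-function
sfp = [(0, 0, 20, 40), (20, 40, 60, 80), (60, 80, 100, 100)]
print(is_strong_partition(sfp, 0, 100))

# Ruspini condition (5): the membership degrees sum to one at every sampled point
print(all(abs(sum(trapezoid(x, *t) for t in sfp) - 1.0) < 1e-12 for x in range(101)))

# Cut-point between A1 and A2: midpoint of [c1, d1]
cut = (sfp[0][2] + sfp[0][3]) / 2
print(cut, [trapezoid(cut, *t) for t in sfp])
```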

3. The Proposed Method

Let Φ = {A1, A2, …, AN+1} be a trapezoidal SFP. Let X = {x1, …, xi, …, xn} be a set of measures of a variable X defined in a real domain U = [m, M].
Applying the De Luca and Termini fuzzy entropy, the fuzziness of a fuzzy set Ak in the SFP is given by:
$$H(A_k) = \frac{1}{n} \sum_{i=1}^{n} h(\mu_{ik}), \tag{11}$$
where $\mu_{ik}$ is the membership degree of the element xi to Ak, and the function h is the De Luca and Termini fuzzy entropy function (3).
High fuzziness values imply a high average uncertainty in assigning data to the fuzzy set, so it is necessary to modify the values of the TFN parameters in order to reduce its fuzziness.
In this paper, in order to reduce the fuzziness of the fuzzy set with respect to the data, an optimization process is executed that modifies the parameters of Ak and of the TFNs close to it: Ak-1 and Ak+1.
To further illustrate this concept, let us consider a SFP Φ = {A1, A2, A3} defined for a domain U = [0%, 100%], where the parameters of the three TFNs are shown in Table 1.
Fifteen data points are included in the dataset X; the membership degrees to the three TFNs and the accompanying fuzzy entropy are displayed for each data point in Table 2.
Figure 1 shows the three TFNs; the points with the membership degrees of each measure to the TFN A2 are shown as black triangles.
The fuzziness values of the three TFNs, obtained by (11), are, respectively, H(A1) = 0.46, H(A2) = 0.98, and H(A3) = 0.52.
As can be seen from Figure 1, for all data points, the membership degrees to the TFN A2 are approximately between 0.4 and 0.6, providing high fuzzy entropy measures.
In order to reduce the fuzziness of A2, an incremental process is performed in which the central parameters, b2 and c2, of the TFN are altered in order to increase its core length, given by the interval [b2, c2] in which the membership degree to A2 is one.
Table 3 shows the new values of the parameters of the three TFNs obtained to reduce the fuzziness of all TFNs to values less than or equal to a threshold of 0.6. The changed parameters are highlighted in bold.
The new SFP is shown in Figure 2.
Now, the membership degrees of all the measures are greater than 0.5. The fuzziness values of the three TFNs of the new SFP are, respectively, H(A1) = 0.31, H(A2) = 0.54, and H(A3) = 0.23.
To measure the deviation of the final SFP from the initial one, the Root Mean Squared Error (RMSE) of the values of the four parameters, a, b, c, d, of the three TFNs A1, A2, and A3 in the final SFP with respect to the values of the four parameters of the corresponding TFNs in the initial SFP is calculated.
For a fuzzy partition given by N + 1 TFNs, the RMSE is given by:
$$RMSE = \sqrt{\frac{1}{4(N+1)} \sum_{k=1}^{N+1} \left[ (a_k - a'_k)^2 + (b_k - b'_k)^2 + (c_k - c'_k)^2 + (d_k - d'_k)^2 \right]}, \tag{12}$$
where ak, bk, ck, and dk are the parameters of the k-th TFN in the original SFP, and a’k, b’k, c’k, and d’k are the parameters of the k-th fuzzy sets in the final SFP.
The RMSE obtained in the example is 5.77%, which corresponds to 5.77% of the length of the domain U = [0%, 100%].
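The RMSE of Equation (12) can be computed with a short function. This is a sketch under stated assumptions: the function name is hypothetical, and the two partitions below are illustrative values on U = [0, 100], not the parameters reported in Tables 1 and 3.

```python
import math

def sfp_rmse(original, final):
    """RMSE between the (a, b, c, d) parameters of two SFPs, Eq. (12)."""
    if len(original) != len(final):
        raise ValueError("both partitions must contain the same number of TFNs")
    squared = sum((p - q) ** 2
                  for tfn_o, tfn_f in zip(original, final)
                  for p, q in zip(tfn_o, tfn_f))
    return math.sqrt(squared / (4 * len(original)))

# Hypothetical initial SFP and a final SFP whose middle core is 10 units wider per side
initial = [(0, 0, 20, 40), (20, 40, 60, 80), (60, 80, 100, 100)]
final = [(0, 0, 20, 30), (20, 30, 70, 80), (70, 80, 100, 100)]

# Four parameters change by 10 each, so RMSE = sqrt(400 / 12)
print(round(sfp_rmse(initial, final), 2))  # 5.77
```

Because the TFNs of an SFP share parameters through the constraints (10), widening one core changes four parameters in total (b and c of the widened TFN, d of its left neighbor, and a of its right neighbor), and all four enter the sum.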
The proposed method consists of an iterative algorithm that, at each cycle, finds the TFN with the highest fuzziness. If this fuzziness is higher than a predetermined threshold, HTS, the TFN core is enlarged in increments until the fuzziness is less than or equal to the threshold. When the fuzziness of every TFN in the SFP is less than or equal to HTS, the algorithm terminates.
Formally, let Ak be the TFN with the highest fuzziness, and let H(Ak) be its fuzziness. Let bk and ck be the two central parameters of Ak, which determine its core. The core length ck − bk is enlarged incrementally by setting $b_k \leftarrow b_k - 0.01\,(c_k - b_k)$ and $c_k \leftarrow c_k + 0.01\,(c_k - b_k)$, where $c_k - b_k$ is the core length at the start of the cycle, until H(Ak) ≤ HTS. This process is repeated until the fuzziness of each TFN of the SFP is less than or equal to the threshold HTS.
Algorithm 1 shows the pseudocode of the algorithm. At each cycle, the core of the TFN with the highest fuzziness is extended iteratively on the left and on the right by one hundredth of its initial length until the fuzziness of the TFN is less than or equal to HTS. The cycle is repeated until the fuzziness of each TFN is less than or equal to HTS.
Algorithm 1: Optimize SFP via fuzziness
  Input:   Dataset given by n measures
  Output:  Optimized SFP
  Create the SFP given by N + 1 TFNs
  Set the fuzziness threshold HTS
  Repeat
    HMAX := HTS   // initialize the maximum fuzziness to HTS
    j := 0        // index of the TFN with maximum fuzziness (0 if none exceeds HTS)
    For k = 1 to N + 1
      Calculate the fuzziness H(Ak) by (11)
      If H(Ak) > HMAX Then
        HMAX := H(Ak)
        j := k
      End if
    Next k
    If j > 0 Then
      CL := (cj − bj)/100   // (cj − bj) is the length of the core of the jth TFN
      While H(Aj) > HTS
        If j = 1 Then
          cj := cj + CL
          aj+1 := cj
        Else if j = N + 1 Then
          bj := bj − CL
          dj−1 := bj
        Else
          bj := bj − CL
          cj := cj + CL
          aj+1 := cj
          dj−1 := bj
        End if
        Calculate the fuzziness H(Aj) by (11)
      End while
    End if
  Until j = 0   // stop when no TFN has fuzziness above HTS
  Return the final SFP
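Algorithm 1 can be sketched in Python as shown below. This is an illustrative reading of the pseudocode, not the authors' implementation: the function names, the toy dataset, and the threshold 0.2 are hypothetical. The check that stops a core from crossing a neighboring core is an added safeguard that does not appear in Algorithm 1.

```python
import math

def h(mu):
    """De Luca and Termini information gain, Eq. (3)."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -mu * math.log2(mu) - (1.0 - mu) * math.log2(1.0 - mu)

def trapezoid(x, a, b, c, d):
    """Membership degree of x to the TFN (a, b, c, d), Eq. (6)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (x - d) / (c - d)
    return 0.0

def fuzziness(tfn, data):
    """Fuzziness of one TFN with respect to the data, Eq. (11)."""
    return sum(h(trapezoid(x, *tfn)) for x in data) / len(data)

def optimize_sfp(sfp, data, h_ts):
    """Enlarge the cores of the TFNs until every fuzziness is <= h_ts."""
    sfp = [list(t) for t in sfp]           # work on a copy
    last = len(sfp) - 1
    while True:
        values = [fuzziness(t, data) for t in sfp]
        j = max(range(len(sfp)), key=lambda k: values[k])
        if values[j] <= h_ts:              # no TFN exceeds the threshold
            return [tuple(t) for t in sfp]
        step = (sfp[j][2] - sfp[j][1]) / 100.0   # CL: 1/100 of the current core
        while fuzziness(sfp[j], data) > h_ts:
            a, b, c, d = sfp[j]
            # Safeguard (assumption): do not let the core cross a neighboring core
            blocked = (j > 0 and b - step < a) or (j < last and c + step > d)
            if step <= 0 or blocked:
                raise RuntimeError("threshold not reachable by enlarging this core")
            if j > 0:                      # extend the core to the left
                sfp[j][1] = b - step
                sfp[j - 1][3] = sfp[j][1]  # d_{j-1} := b_j
            if j < last:                   # extend the core to the right
                sfp[j][2] = c + step
                sfp[j + 1][0] = sfp[j][2]  # a_{j+1} := c_j

# Hypothetical data on U = [0, 100], with several points in the transition zones
data = [5, 10, 28, 32, 45, 50, 55, 68, 72, 90, 95]
initial = [(0, 0, 20, 40), (20, 40, 60, 80), (60, 80, 100, 100)]
optimized = optimize_sfp(initial, data, h_ts=0.2)
for before, after in zip(initial, optimized):
    print(before, round(fuzziness(before, data), 3), "->",
          tuple(round(p, 2) for p in after), round(fuzziness(after, data), 3))
```

In this toy run only the core of the middle TFN is enlarged; the parameters d of the first TFN and a of the last TFN follow it, so the constraints (10) remain satisfied.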

4. Experiment and Results

The proposed algorithm was tested by using a data repository containing average temperature measurements recorded in the major cities of the world from 1995 to 2020 and available at https://www.kaggle.com/datasets/sudalairajkumar/daily-temperature-of-major-cities (accessed on 1 August 2024). In this section, the test cases performed are presented, and the results obtained are discussed.

4.1. Case Study

The algorithm was executed on 26 datasets containing the average daily temperatures of 26 cities located in Europe in different climatic zones. Each dataset was extracted from the data repository and consists of 9266 daily mean temperature measurements taken during the period from 1 January 1995 to 14 May 2020. The data were converted from degrees Fahrenheit to degrees Celsius. The infimum and supremum of the domain of the daily mean temperature are, respectively, m = −30 °C and M = 45 °C.
The initial SFP was provided by the seven TFNs shown in Figure 3. The first TFN is an R-function, and the last is an L-function.
The parameters of the seven TFNs are shown in Table 4.
To select an optimal value for the fuzziness threshold, the algorithm was run on a sample of data randomly taken from the repository, varying HTS in the range from 0.05 to 0.3. In every test, the fuzziness of the TFNs was less than 0.15. Moreover, for HTS values less than 0.1, at least one TFN’s core length in the final SFP was more than 1.5 times that of the corresponding TFN in the initial SFP, producing an SFP that differed significantly from the initial one. Therefore, the selected value for the fuzziness threshold was HTS = 0.1.
To test the algorithm, two empirical applications were carried out. In the first application, the algorithm was tested to verify its effectiveness in reducing the fuzziness of all the TFNs of the SFP to values less than or equal to the threshold. The tests were performed by calculating the fuzziness of the TFNs of the initial and final SFPs by (4).
In the second application, to evaluate how effective the proposed method was in keeping the final SFP close to the original one, comparison tests with the method in [19] were executed, in which the RMSE of the final SFP was measured by (12). The two methods were executed on all 26 datasets, and the RMSEs obtained were compared. The comparison tests were executed by setting HTS to 0.1 and calculating the RMSE of the final SFP with respect to the initial SFP in Table 4. The RMSE obtained using the algorithm in [19] was calculated by considering the TFNs with equivalent labels in the initial and final SFP.

4.2. Results

The results of the tests performed on the 26 datasets are now discussed. Section 4.2.1 shows the results of the tests carried out to measure the fuzziness of the final SFP. Section 4.2.2 presents the results of the comparison tests carried out to measure the similarity between the initial and final SFP.

4.2.1. Test Results

For the sake of brevity, the results obtained for the datasets of the daily temperatures measured in three European cities located in different climate zones are discussed in detail: Paris in France, Rome in Italy, and Stockholm in Sweden.
Table 5 shows the fuzziness values measured for each TFN while taking into account the dataset of daily temperatures recorded in Paris.
The TFN Medium low has the highest degree of fuzziness (0.14). The fuzziness of the TFN Very high is zero since in Paris, during the whole period, average daily temperatures above 35 °C were never recorded, and the membership degree of each measure to this TFN is 0.
The values of the parameters of the TFNs in the final SFP obtained following algorithm execution are shown in Table 6, where the modified parameter values are highlighted in bold.
Table 7 displays the fuzziness values of the TFNs of the final SFP. The fuzziness values that changed with respect to those of the initial SFP are shown in bold.
The fuzziness of the TFN Medium low has been brought down to a level below the threshold. Additionally, the fuzziness of the two neighboring TFNs, Cold and Medium, has decreased further.
The fuzziness values of the TFNs of the initial SFP with regard to the dataset of the average daily temperature recorded in Rome are now displayed in Table 8.
As in the previous test, the TFN Very high’s fuzziness is zero because there were no recorded average daily temperatures in Rome over 35 °C for the whole period.
The TFN with the greatest fuzziness (0.1353) is the TFN Medium. The fuzziness of the TFN Medium low (0.125) is also greater than the threshold HTS = 0.1.
The final SFP obtained after executing the algorithm is shown in Table 9, where the modified parameter values are shown in bold.
Table 10 displays the fuzziness values of the TFNs of the final SFP. The fuzziness values that changed with respect to those of the original SFP are displayed in bold.
The fuzziness levels of the TFNs Medium low and Medium are now below the threshold. Furthermore, the fuzziness values of the neighboring TFNs Cold and Medium high were also reduced.
The results obtained for the average daily temperature measured in Stockholm are now presented. The fuzziness values calculated for each TFN in relation to the daily temperature data recorded in Stockholm are shown in Table 11.
The TFN Cold has the highest level of fuzziness (0.1351). Additionally, the TFN Medium low’s fuzziness (0.1066) exceeds the threshold HTS.
Furthermore, the fuzziness calculated for the TFNs High and Very high is zero: the maximum average daily temperature in Stockholm over the entire period was 26.7 °C, below the value of 28 °C above which the membership degree to at least one of these two TFNs is greater than zero.
The final SFP is displayed in Table 12, with the modified parameters marked in bold.
According to the results in Table 12, extending the core of the TFN Cold was enough to obtain fuzziness values below the threshold for the TFNs Cold and Medium low. The fuzziness values for each TFN in the final SFP are displayed in Table 13.
The new fuzziness values of the two TFNs Cold and Medium low are both less than the threshold value 0.1. As shown in Table 13, extending the core of the TFN Cold also brought the fuzziness below the threshold for the neighboring TFN Medium low, whose b and c parameters remained unchanged.

4.2.2. Results of the Comparison Tests

This section presents the results of the tests performed to compare the performance of the algorithm with that of the algorithm proposed in [19].
The RMSE values calculated for each of the 26 datasets with the two methods are displayed in Table 14.
These findings show that, for all 26 datasets, the RMSE obtained by running the proposed method is lower than that obtained by running the method in [19]. In particular, using the proposed algorithm, the maximum value of the RMSE is 0.46 °C, or about 0.61% of the domain length. In contrast, the method in [19] yields RMSE values in a range between 0.92 and 3.18 °C, which are at least twice as large as the values obtained with the proposed algorithm. Thus, while both methods are effective in reducing the fuzziness of all the TFNs of the initial SFP to values below or equal to the predefined threshold, the proposed method is significantly more effective in containing the deviation of the final SFP from the original one.
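As a quick check of the percentage quoted above, using the domain bounds m = −30 °C and M = 45 °C given in Section 4.1, the largest RMSE of the proposed method relative to the domain length is

$$\frac{0.46}{M - m} = \frac{0.46}{45 - (-30)} = \frac{0.46}{75} \approx 0.0061 = 0.61\%,$$

and the smallest RMSE reported for the method in [19] is exactly twice the largest RMSE of the proposed method, since 0.92 / 0.46 = 2.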
Therefore, our approach is a good compromise between a reduction in the fuzziness of the TFNs of an SFP with respect to the data and the interpretability of the results, which is guaranteed by the conformity of the final SFP to the original one created by the expert.

5. Conclusions

The objective of this study was to reduce the fuzziness of the fuzzy sets forming trapezoidal SFPs with respect to a dataset while striking a balance with the requirement to preserve the interpretability of the model outputs by not making significant changes to the original SFP.
Datasets of daily mean temperatures recorded over a range of cities in varying climate zones between 1995 and 2020 were used to test the method. To perform the testing, an initial SFP including seven TFNs was built. The fuzziness threshold was fixed by selecting a criterion for the acceptance of the fuzziness of the fuzzy sets.
The findings show that in none of the case studies did the technique significantly alter the original SFP. The RMSE between the parameter values of the final SFP and those of the original SFP is at most about 0.61% of the domain’s length. Comparisons with the algorithm proposed in [19] highlighted that, for every dataset used, our method produces the smaller distortion of the initial SFP.
Therefore, in contrast to existing fuzziness reduction strategies, the suggested method permits the maintaining of the results’ interpretability. As a result, it might be a useful tool to assist experts and decision makers to decrease the fuzziness of the created TFNs while also offering comprehensible and trustworthy outcomes from the applied approximate reasoning model.
The choice of the fuzziness threshold is a critical aspect of the process. An excessively low threshold would result in a more notable modification of the parameter values of the TFNs of the initial SFP.
In future studies, we intend to apply our TFN fuzziness reduction technique in a variety of decision-making scenarios. Additionally, we want to experiment with other strategies for optimizing the fuzziness threshold in order to further enhance the performance of the method.

Author Contributions

Conceptualization, B.C. and F.D.M.; methodology, B.C. and F.D.M.; software, B.C. and F.D.M.; validation, B.C. and F.D.M.; formal analysis, B.C. and F.D.M.; investigation, B.C. and F.D.M.; resources, B.C. and F.D.M.; data curation, B.C. and F.D.M.; writing—original draft preparation, B.C. and F.D.M.; writing—review and editing, B.C. and F.D.M.; visualization, B.C. and F.D.M.; supervision, B.C. and F.D.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zadeh, L.A. Fuzzy Sets. Inf. Control. 1965, 8, 338–353.
  2. Zadeh, L.A. Fuzzy logic and approximate reasoning. Synthese 1975, 30, 407–428.
  3. Hagras, H. Toward human-understandable Explainable AI. Computer 2018, 51, 28–36.
  4. Zadeh, L.A. Toward human level machine intelligence: Is it achievable? The need for a paradigm shift. IEEE Comput. Intell. Mag. 2008, 3, 11–22.
  5. Loquin, K.; Strauss, O. Fuzzy histograms and density estimation. In Soft Methods for Integrated Uncertainty Modelling; Lawry, J., Miranda, E., Bugarin, A., Li, S., Gil, M.A., Grzegorzewski, P.A., Hryniewicz, O., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 45–52.
  6. Casalino, G.; Castellano, G.; Castiello, C.; Mencar, C. Effect of fuzziness in fuzzy rule-based classifiers defined by strong fuzzy partitions and winner-takes-all inference. Soft Comput. 2022, 26, 6519–6527.
  7. Criado, F.; Gachechiladze, T. Entropy of fuzzy events. Fuzzy Sets Syst. 1997, 88, 99–106.
  8. Shannon, C.E. A mathematical theory of communication. In ACM SIGMOBILE Mobile Computing and Communications Review; Association for Computing Machinery: New York, NY, USA, 2001; Volume 5, pp. 3–55.
  9. Pandey, K.; Mishra, A.; Rani, P.; Ali, J.; Chakrabortty, R. Selecting features by utilizing intuitionistic fuzzy Entropy method. Decis. Mak. Appl. Manag. Eng. 2023, 6, 111–133.
  10. Wang, Z.; Chen, H.; Yuan, Z.; Wan, J.; Li, T. Multiscale Fuzzy Entropy-Based Feature Selection. IEEE Trans. Fuzzy Syst. 2023, 31, 3248–3262.
  11. Yang, M.; Nataliani, Y. A feature-reduction fuzzy clustering algorithm based on feature-weighted entropy. IEEE Trans. Fuzzy Syst. 2018, 26, 817–835.
  12. Gao, C.; Lai, Z.; Zhou, J.; Wen, J.; Wong, W.K. Granular maximum decision entropy-based monotonic uncertainty measure for attribute reduction. Int. J. Approx. Reason. 2019, 104, 9–24.
  13. Aggarwal, M. Decision aiding model with entropy-based subjective utility. Inf. Sci. 2019, 501, 558–572.
  14. Sait Gul, A.A. A novel entropy proposition for spherical fuzzy sets and its application in multiple attribute decision-making. Int. J. Intell. Syst. 2020, 35, 1354–1374.
  15. Arya, V.; Kumar, S. Knowledge measure and entropy: A complementary concept in fuzzy theory. Granul. Comput. 2020, 6, 631–643.
  16. Raghu, S.; Sriraam, N.; Kumar, G.P.; Hegde, A.S. A novel approach for real-time recognition of epileptic seizures using minimum variance modified fuzzy entropy. IEEE Trans. Biomed. Eng. 2018, 65, 2612–2621.
  17. Cardone, B.; Di Martino, F. A Novel Fuzzy Entropy-Based Method to Improve the Performance of the Fuzzy C-Means Algorithm. Electronics 2020, 9, 554.
  18. D’Urso, P.; De Giovanni, L.; Alaimo, L.S.; Mattera, R.; Vitale, V. Fuzzy clustering with entropy regularization for interval-valued data with an application to scientific journal citations. Ann. Oper. Res. 2023, 1–24.
  19. Cardone, B.; Di Martino, F. A Fuzzy Entropy-Based Thematic Classification Method Aimed at Improving the Reliability of Thematic Maps in GIS Environments. Electronics 2022, 11, 3509.
  20. Cardone, B.; Di Martino, F.; Senatore, S. Emotion-based classification through fuzzy entropy-enhanced FCM clustering. In Statistical Modeling in Machine Learning; Goswami, T., Sinha, G.R., Eds.; Academic Press: Cambridge, MA, USA, 2023; pp. 205–225.
  21. Al-Sharhan, S.; Karray, F.; Gueaieb, W.; Basir, O. Fuzzy entropy: A brief survey. In Proceedings of the 10th IEEE International Conference on Fuzzy Systems (Cat. No.01CH37297), Melbourne, VIC, Australia, 2–5 December 2001; Volume 2, pp. 1135–1139.
  22. Aggarwal, M. Bridging the Gap Between Probabilistic and Fuzzy Entropy. IEEE Trans. Fuzzy Syst. 2020, 28, 2175–2184.
  23. Sahmi, P.; Kumar, R. A survey on fuzzy entropy measures and their applications. Int. J. Adv. Sci. Res. 2022, 7, 32–36.
  24. De Luca, A.; Termini, S. A definition of a non-probabilistic entropy in the setting of fuzzy sets theory. Inf. Control. 1972, 20, 301–312.
  25. Alonso, J.M.; Castiello, C.; Mencar, C. Interpretability of Fuzzy Systems: Current Research Trends and Prospects. In Springer Handbook of Computational Intelligence; Kacprzyk, J., Pedrycz, W., Eds.; Springer Handbooks; Springer: Berlin/Heidelberg, Germany, 2015; pp. 219–238.
  26. Ruspini, E.H. A new approach to clustering. Inf. Control. 1969, 15, 22–32.

Figure 1. SFP of the example. The points with the membership degree of the measures of the dataset X to the TFN A2 are shown as black triangles.
Figure 2. New SFP obtained by changing the values of the parameters.
Figure 3. SFP of the daily temperature domain.
Table 1. Parameters of the three TFNs of the SFP in the example.

| TFN | a | b | c | d |
|---|---|---|---|---|
| A1 | 0% | 0% | 25% | 50% |
| A2 | 25% | 50% | 60% | 85% |
| A3 | 60% | 85% | 100% | 100% |

Table 2. Membership degrees and corresponding fuzzy entropies of the data in the example.

| x | µ1 | µ2 | µ3 | h(µ1) | h(µ2) | h(µ3) |
|---|---|---|---|---|---|---|
| 34.00% | 0.64 | 0.36 | 0.00 | 0.94 | 0.94 | 0.00 |
| 35.00% | 0.60 | 0.40 | 0.00 | 0.97 | 0.97 | 0.00 |
| 37.00% | 0.52 | 0.48 | 0.00 | 1.00 | 1.00 | 0.00 |
| 37.50% | 0.50 | 0.50 | 0.00 | 1.00 | 1.00 | 0.00 |
| 38.00% | 0.48 | 0.52 | 0.00 | 1.00 | 1.00 | 0.00 |
| 38.50% | 0.46 | 0.54 | 0.00 | 1.00 | 1.00 | 0.00 |
| 39.00% | 0.44 | 0.56 | 0.00 | 0.99 | 0.99 | 0.00 |
| 69.00% | 0.00 | 0.64 | 0.36 | 0.00 | 0.94 | 0.94 |
| 70.00% | 0.00 | 0.60 | 0.40 | 0.00 | 0.97 | 0.97 |
| 70.50% | 0.00 | 0.58 | 0.42 | 0.00 | 0.98 | 0.98 |
| 71.00% | 0.00 | 0.56 | 0.44 | 0.00 | 0.99 | 0.99 |
| 72.00% | 0.00 | 0.52 | 0.48 | 0.00 | 1.00 | 1.00 |
| 72.50% | 0.00 | 0.50 | 0.50 | 0.00 | 1.00 | 1.00 |
| 73.50% | 0.00 | 0.46 | 0.54 | 0.00 | 1.00 | 1.00 |
| 74.50% | 0.00 | 0.42 | 0.58 | 0.00 | 0.98 | 0.98 |

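
The rows of Table 2 can be reproduced with a short sketch. It assumes that the membership degrees come from the trapezoidal parameters of Table 1 and that h is the base-2 Shannon function h(µ) = −µ log2 µ − (1 − µ) log2(1 − µ), which matches the tabulated values to two decimals. The names `trapezoid`, `h`, `SFP` and `row` are illustrative and do not come from the article.

```python
import math

def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if b <= x <= c:          # core: full membership (also covers shoulders a == b or c == d)
        return 1.0
    if a < x < b:            # rising edge
        return (x - a) / (b - a)
    if c < x < d:            # falling edge
        return (d - x) / (d - c)
    return 0.0               # outside the support

def h(mu):
    """Base-2 Shannon function; h(0) = h(1) = 0 by convention (assumed form)."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -mu * math.log2(mu) - (1.0 - mu) * math.log2(1.0 - mu)

# Parameters of Table 1, in percent.
SFP = {"A1": (0, 0, 25, 50), "A2": (25, 50, 60, 85), "A3": (60, 85, 100, 100)}

def row(x):
    """Return the rounded membership degrees and entropies of x for A1, A2, A3."""
    mus = [trapezoid(x, *params) for params in SFP.values()]
    return [round(m, 2) for m in mus], [round(h(m), 2) for m in mus]

print(row(34.0))   # first row of Table 2
print(row(72.5))   # a point where A2 and A3 are maximally ambiguous
```

The entropy peaks at 1 where two adjacent fuzzy sets both have membership 0.5, which is why data falling midway along an edge contribute most to the fuzziness.
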
Table 3. New values of the parameters of the three TFNs of the SFP in the example.

| TFN | a | b | c | d |
|---|---|---|---|---|
| A1 | 0% | 0% | 25% | 40% |
| A2 | 25% | 40% | 70% | 85% |
| A3 | 70% | 85% | 100% | 100% |

Table 4. Parameters of the seven TFNs of the SFP for the daily temperature domain, expressed in degrees Celsius.

| TFN | a | b | c | d |
|---|---|---|---|---|
| Very cold | −30 | −30 | −10 | 0 |
| Cold | −10 | 0 | 5 | 10 |
| Medium low | 5 | 10 | 12 | 15 |
| Medium | 12 | 15 | 20 | 25 |
| Medium high | 20 | 25 | 28 | 30 |
| High | 28 | 30 | 35 | 40 |
| Very high | 35 | 40 | 45 | 45 |

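
As a sanity check on Table 4, the following sketch tests numerically whether the seven TFNs satisfy the strong fuzzy partition (Ruspini) condition, that is, whether the membership degrees of every point of the domain sum to 1. The function names, the sampling grid of 0.25 °C and the tolerance are assumptions made for illustration only.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

# Parameters of Table 4, in degrees Celsius, from "Very cold" to "Very high".
TEMPERATURE_SFP = [
    (-30, -30, -10, 0),
    (-10, 0, 5, 10),
    (5, 10, 12, 15),
    (12, 15, 20, 25),
    (20, 25, 28, 30),
    (28, 30, 35, 40),
    (35, 40, 45, 45),
]

def is_strong_partition(tfns, samples, tol=1e-9):
    """True if the membership degrees sum to 1 (within tol) at every sample."""
    return all(
        abs(sum(trapezoid(x, *p) for p in tfns) - 1.0) <= tol
        for x in samples
    )

# Hypothetical sampling grid: -30 to 45 degrees Celsius in steps of 0.25.
samples = [-30 + 0.25 * k for k in range(301)]

print(is_strong_partition(TEMPERATURE_SFP, samples))
```

The condition holds because each TFN's c and d coincide with the next TFN's a and b, so any adjustment of one core boundary has to be mirrored in the neighbouring TFN to keep the partition strong.
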
Table 5. Fuzziness of the seven TFNs in the original SFP calculated using the Paris daily temperature dataset.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0063 |
| Cold | 0.0947 |
| Medium low | 0.1400 |
| Medium | 0.0898 |
| Medium high | 0.0340 |
| High | 0.0006 |
| Very high | 0.0000 |

Table 6. Paris daily temperature dataset. Values in °C of the parameters of the seven TFNs of the final SFP. In bold are shown the changed parameters.

| TFN | a | b | c | d |
|---|---|---|---|---|
| Very cold | −30 | −30 | −10 | 0 |
| Cold | −10 | 0 | 5 | 8.9 |
| Medium low | 5 | 8.9 | 13.1 | 15 |
| Medium | 13.1 | 15 | 20 | 25 |
| Medium high | 20 | 25 | 28 | 30 |
| High | 28 | 30 | 35 | 40 |
| Very high | 35 | 40 | 45 | 45 |

Table 7. Paris daily temperature dataset. Fuzziness of the seven TFNs of the final SFP. In bold are shown the new fuzziness values.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0063 |
| Cold | 0.0728 |
| Medium low | 0.0984 |
| Medium | 0.0722 |
| Medium high | 0.0340 |
| High | 0.0006 |
| Very high | 0.0000 |

Table 8. Fuzziness of the seven TFNs in the original SFP calculated using the Rome daily temperature dataset.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0002 |
| Cold | 0.0610 |
| Medium low | 0.1250 |
| Medium | 0.1353 |
| Medium high | 0.0841 |
| High | 0.0006 |
| Very high | 0.0000 |

Table 9. Rome daily temperature dataset. Values in °C of the parameters of the seven TFNs of the final SFP. In bold are shown the changed parameters.

| TFN | a | b | c | d |
|---|---|---|---|---|
| Very cold | −30 | −30 | −10 | 0 |
| Cold | −10 | 0 | 5 | 9.9 |
| Medium low | 5 | 9.9 | 12.1 | 13.7 |
| Medium | 12.1 | 13.7 | 21.3 | 25 |
| Medium high | 21.3 | 25 | 28 | 30 |
| High | 28 | 30 | 35 | 40 |
| Very high | 35 | 40 | 45 | 45 |

Table 10. Rome daily temperature dataset. Fuzziness of the seven TFNs of the final SFP. In bold are shown the new fuzziness values.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0063 |
| Cold | 0.0589 |
| Medium low | 0.0981 |
| Medium | 0.0955 |
| Medium high | 0.0655 |
| High | 0.0006 |
| Very high | 0.0000 |

Table 11. Fuzziness of the seven TFNs in the original SFP calculated using the Stockholm daily temperature dataset.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0501 |
| Cold | 0.1351 |
| Medium low | 0.1066 |
| Medium | 0.0587 |
| Medium high | 0.0145 |
| High | 0.0000 |
| Very high | 0.0000 |

Table 12. Stockholm daily temperature dataset. Values in °C of the parameters of the seven TFNs of the final SFP. In bold are shown the changed parameters.

| TFN | a | b | c | d |
|---|---|---|---|---|
| Very cold | −30 | −30 | −10 | −1.2 |
| Cold | −10 | −1.2 | 6.2 | 10 |
| Medium low | 6.2 | 10 | 12 | 15 |
| Medium | 12 | 15 | 20 | 25 |
| Medium high | 20 | 25 | 28 | 30 |
| High | 28 | 30 | 35 | 40 |
| Very high | 35 | 40 | 45 | 45 |

Table 13. Stockholm daily temperature dataset. Fuzziness of the seven TFNs of the final SFP. In bold are shown the new fuzziness values.

| TFN | Fuzziness |
|---|---|
| Very cold | 0.0501 |
| Cold | 0.0981 |
| Medium low | 0.0883 |
| Medium | 0.0587 |
| Medium high | 0.0145 |
| High | 0.0000 |
| Very high | 0.0000 |

Table 14. RMSE calculated for the datasets containing the daily mean temperature of 26 cities.

| Country | City | RMSE (°C), Method [19] | RMSE (°C), Proposed Method |
|---|---|---|---|
| Austria | Vienna | 1.12 | 0.44 |
| Belgium | Brussels | 2.89 | 0.42 |
| Croatia | Zagreb | 0.98 | 0.41 |
| Cyprus | Nicosia | 1.05 | 0.40 |
| Czech Republic | Prague | 1.56 | 0.39 |
| Denmark | Copenhagen | 3.18 | 0.43 |
| Finland | Helsinki | 0.97 | 0.45 |
| France | Paris | 1.06 | 0.42 |
| France | Bordeaux | 0.94 | 0.41 |
| Germany | Frankfurt | 1.02 | 0.40 |
| Germany | Hamburg | 1.21 | 0.45 |
| Greece | Athens | 0.98 | 0.38 |
| Hungary | Budapest | 1.03 | 0.44 |
| Iceland | Reykjavik | 1.11 | 0.45 |
| Ireland | Dublin | 0.92 | 0.43 |
| Italy | Milan | 0.97 | 0.42 |
| Italy | Rome | 1.34 | 0.43 |
| The Netherlands | Amsterdam | 2.29 | 0.46 |
| Norway | Oslo | 2.15 | 0.43 |
| Portugal | Lisbon | 1.02 | 0.40 |
| Russia | Moscow | 2.67 | 0.45 |
| Spain | Barcelona | 1.19 | 0.41 |
| Spain | Madrid | 1.98 | 0.42 |
| Sweden | Stockholm | 2.11 | 0.45 |
| Switzerland | Geneva | 1.34 | 0.43 |
| United Kingdom | London | 1.85 | 0.44 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Cardone, B.; Di Martino, F. A Novel Method Based on the Fuzzy Entropy Measure to Optimize the Fuzziness in Trapezoidal Strong Fuzzy Partitions. Information 2024, 15, 615. https://doi.org/10.3390/info15100615
