Article

A Design and Its Application of Multi-Granular Fuzzy Model with Hierarchical Tree Structures

Department of Electronics Engineering, IT-Bio Convergence System, Chosun University, Gwangju 61452, Republic of Korea
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(20), 11175; https://doi.org/10.3390/app132011175
Submission received: 7 August 2023 / Revised: 3 September 2023 / Accepted: 10 October 2023 / Published: 11 October 2023
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)

Abstract
This paper is concerned with the design of a context-based fuzzy C-means (CFCM)-based multi-granular fuzzy model (MGFM) with hierarchical tree structures. For this purpose, we propose three types of hierarchical tree structures (incremental, aggregated, and cascaded) for the design of the MGFM. In general, the conventional fuzzy inference system (FIS) suffers from problems such as long processing times and an exponential increase in the number of if–then rules when processing large-scale multivariate data. Meanwhile, the existing granular fuzzy model (GFM) reduces the number of rules that would otherwise grow exponentially. However, the GFM not only produces overlapping rules as the cluster centers move closer together but also yields rules that are difficult to interpret when there are many input variables. To solve these problems, the CFCM-based MGFM can be designed as a smaller tree of interconnected GFMs, in which the inputs of the high-level GFMs are taken from the outputs of the low-level GFMs. The hierarchical tree structure is more computationally efficient and easier to understand than a single GFM. Furthermore, since the output of the CFCM-based MGFM is a triangular fuzzy number, it is evaluated using a performance measure suited to the GFM. The prediction performance is analyzed on the automobile fuel consumption and Boston housing databases to demonstrate the validity of the proposed approach. The experimental results show that the proposed CFCM-based MGFM with a hierarchical tree structure creates a small number of meaningful rules and makes prediction problems explainable.

1. Introduction

In general, fuzzy sets can effectively represent ambiguous and uncertain information inherent in real-world nonlinear systems. Fuzzy sets also represent dynamic or static characteristics as a membership function and represent the degree of fuzzy membership for given data. A fuzzy inference system (FIS) can be expressed qualitatively to make it easy to understand the system and have robust characteristics for systems with uncertain information. However, FIS has to rely on the knowledge or experience of experts to obtain fuzzy rules. Meanwhile, neural networks (NNs) can analyze the input and output relationships of a system through learning and have a processing function. Hence, NN can perform processing tasks quickly. However, NNs have difficulty understanding a system because they do not have information about the given system. A complementary model, called a neuro-fuzzy system, was proposed by combining the advantages of fuzzy models and neural networks to address the abovementioned issues [1,2,3,4,5]. Studies are actively being conducted on neuro-fuzzy inference systems [6,7,8,9,10,11,12,13,14,15]. Similar to FIS, neuro-fuzzy inference systems are expressed by fuzzy rules. As the number of input variables increases, the number of fuzzy rules increases exponentially in the grid partitioning.
Meanwhile, granular computing (GC) is a paradigm for processing an increasing amount of information in computational intelligence and human-centric systems. Researchers have been investigating several design methods for GC [16,17,18,19,20]. The term GC was defined as “subsets computed with words” in a study by Zadeh [21]. GC is a set of computational methodologies and approaches derived from property to solve complex and realistic problems, and it encompasses techniques, theories, and methodologies that use the concept of granules to solve real-world complex problems. Among many studies, Pedrycz [22] proposed a granular fuzzy model (GFM) in which the output of this model is expressed as triangular contexts rather than numerical values for a human-centric system. The GFM uses context-based fuzzy c-means (CFCM) clustering [23] to generate the contexts in the output variable and produces cluster centers in the input space with the aid of the generated contexts. Unlike conventional fuzzy c-means (FCM) clustering, CFCM clustering efficiently produces clusters by representing the properties of the data in the input and output variables. Studies are actively being conducted on GFM [24,25,26,27,28,29,30,31,32,33]. However, increasing the number of inputs to GFM can make it difficult for complex system designs to understand the resulting rules. The GFM has overlapping rules as the cluster centers become closer. It also has problems that are difficult to interpret fuzzy if–then rules due to many input variables.
Therefore, we propose a CFCM-based multi-granular fuzzy model (MGFM) with hierarchical tree structures to design smaller interconnected GFMs that have a smaller number of meaningful rules. The advantage of the proposed model is that it is more computationally efficient and easier to understand fuzzy if–then rules than the GFM itself. Furthermore, we use a new performance index suitable for the output characteristics of GFM. The performance index is obtained by the unique concept of coverage and specificity. Studies on various hierarchical structures are also actively being conducted [34,35,36,37,38]. This paper is organized as follows. Section 2 describes the concept and procedure of CFCM clustering and CFCM-based GFM. Section 3 provides three hierarchical tree structures and CFCM-based MGFM with hierarchical tree structure. In Section 4, experiments on automobile fuel consumption and the Boston housing database are described. Finally, conclusions are provided in Section 5.

2. Generation of Information Granules and Design of the Granular Model

2.1. Context-Based Fuzzy C-Means Clustering

Context-based fuzzy C-means (CFCM) clustering was proposed by Pedrycz [39], and it is an effective method of creating clusters that utilizes the correlation of the data in the input and output variables. Furthermore, it is possible to preserve the properties of the output variable by using CFCM clustering; hence, the clusters produced by CFCM clustering are more homogeneous than those of conventional FCM clustering [40]. Equation (1) defines the context created by using the characteristics of the output data. Here, $\mathbf{D}$ denotes the dataset in the output space, and it is assumed that the context takes a value for every given datum. $f_k = T(d_k)$ denotes the degree of membership of the $k$-th datum in the context $T$ defined over the output variable. The value of $f_k$ lies between 0 and 1, indicating the degree of membership. The requirements on the membership matrix can be expressed by Equation (2) based on these characteristics.
$T : \mathbf{D} \rightarrow [0, 1]$ (1)
$U(f) = \left\{ u_{ik} \in [0,1] \;\middle|\; \sum_{i=1}^{c} u_{ik} = f_k \;\; \forall k, \;\; 0 < \sum_{k=1}^{N} u_{ik} < N \right\}$ (2)
The modified membership matrix U can be expressed by Equation (3).
$u_{ik} = \dfrac{f_k}{\sum_{j=1}^{c} \left( \dfrac{\| x_k - c_i \|}{\| x_k - c_j \|} \right)^{2/(m-1)}}$ (3)
The fuzzification factor $m \in (1, \infty)$; $m = 2$ is generally used as the weight exponent. The contexts are generated by triangular membership functions in the output space to obtain $f_k$, the degree of membership. The membership functions generate contexts that overlap at uniform intervals, and the interval and shape of the generated contexts can be changed according to the user's settings. Here, we use a method of generating contexts by evenly dividing the data over the output variable and a method of flexibly generating contexts from the Gaussian distribution. The procedure for the CFCM clustering algorithm is as follows:
[Step 1]
Set the number of contexts to be generated and the cluster’s number to be estimated for each context. In addition, initialize the membership matrix U having a value between 0 and 1.
[Step 2]
Generate the contexts via triangular fuzzy sets that are flexibly distributed in the output variable. The generation method of the context can be changed based on the user’s settings.
[Step 3]
Calculate the cluster centers for each context using Equation (4).
$c_i = \dfrac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}$ (4)
[Step 4]
Compute the objective function using Equation (5). The iteration stops when the change in the objective function from the previous iteration is less than the threshold $\epsilon$, as in Equation (6).
$J = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} d_{ik}^{2}$ (5)
$\left| J^{(p)} - J^{(p-1)} \right| \le \epsilon$ (6)
Here, $d_{ik}$ represents the Euclidean distance between the $i$-th cluster center and the $k$-th datum, and $p$ denotes the number of iterations.
[Step 5]
Calculate a new membership matrix U using Equation (3). Then, go to [Step 3].
The pseudo-code for the CFCM clustering algorithm is given in Algorithm 1.
Algorithm 1. The context-based fuzzy C-means clustering algorithm.
Begin
  Fix p (number of contexts), 2 ≤ p < n;
  Fix c (number of clusters per context), 2 ≤ c < n;
  Fix m, 1 < m < ∞, e.g., m = 2;
  Fix max_iterations, e.g., max_iterations = 100;
  Choose the type of context to be created in the output space
  (e.g., type 1 = uniform, type 2 = flexible);
  Create p contexts;
  Randomly initialize the c cluster centers;
  for t = 1 to max_iterations do
    Update the membership matrix U;
    Calculate the new cluster centers V;
    Calculate the new objective function J_t;
    if |J_t − J_{t−1}| < ε then
      break;
    else
      J_{t−1} = J_t;
    end if
  end for
End
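As a complement to the pseudo-code, the core update steps can be sketched in NumPy for a single context. The function names, the uniform context-generation scheme, and the tiny example values are illustrative assumptions, not taken from the paper; the sketch implements Equations (3)–(5):

```python
import numpy as np

def uniform_contexts(y, p):
    """Generate p overlapping triangular contexts over the output range
    (one simple reading of Step 2; illustrative, not the paper's exact scheme)."""
    peaks = np.linspace(y.min(), y.max(), p)
    half = peaks[1] - peaks[0]          # neighbouring contexts overlap by half
    return [(pk - half, pk, pk + half) for pk in peaks]

def context_membership(y, ctx):
    """Triangular membership f_k = T(d_k) of each output datum in one context."""
    lo, md, hi = ctx
    return np.clip(np.minimum((y - lo) / (md - lo), (hi - y) / (hi - md)), 0.0, 1.0)

def cfcm_step(X, centers, f, m=2.0):
    """One CFCM iteration for a single context:
    memberships (Eq. (3)), centers (Eq. (4)), objective (Eq. (5))."""
    # distances ||x_k - c_i||, shape (c, N)
    dist = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
    dist = np.fmax(dist, 1e-12)         # guard against division by zero
    # u_ik = f_k / sum_j (d_ik / d_jk)^{2/(m-1)}
    ratio = (dist[:, None, :] / dist[None, :, :]) ** (2.0 / (m - 1.0))
    U = f[None, :] / ratio.sum(axis=1)  # columns sum to f_k by construction
    Um = U ** m
    centers_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
    J = np.sum(Um * dist ** 2)
    return U, centers_new, J
```

Note that, unlike FCM, each column of U sums to the context membership f_k rather than to 1, which is exactly the constraint of Equation (2).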

2.2. CFCM-Based Granular Fuzzy Model

As explained in Section 2.1, the granular fuzzy model (GFM) can be designed using CFCM clustering. The premise parameters are determined by the cluster centers estimated by CFCM clustering. The linguistic contexts, which are the consequent parameters, are produced in the output variable, as shown in Algorithm 1. The CFCM-based GFM consists of four layers. The first layer receives the input data, and the second layer represents the set of activation levels for all clusters related to the linguistic contexts. In the third layer, the CFCM clustering for each context is performed. Hence, when a linguistic context is given, clustering occurs in the input variable corresponding to the context. For each context, the number of clusters is set by the user. The fourth layer is the output layer, and it is implemented in a single granulated form.
The output value Y of the CFCM-based GFM is expressed by Equation (7).
$Y = \bigoplus_{t} \left( W_t \otimes z_t \right)$ (7)
Here, the generalized addition ($\oplus$) and multiplication ($\otimes$) symbols represent operations on information granules. $W_t$ and $z_t$ are the $t$-th context and $t$-th firing strength, respectively. The GFM has a single hidden layer formed by the clusters obtained by CFCM clustering. The space between the hidden and output layers can be expressed linguistically using contexts, unlike the design method of conventional neural networks. The final value of the output layer can be computed as follows:
$Y = (z_{11} \otimes A_1) \oplus (z_{12} \otimes A_1) \oplus \cdots \oplus (z_{1 n_1} \otimes A_1) \oplus \cdots \oplus (z_{c1} \otimes A_c) \oplus (z_{c2} \otimes A_c) \oplus \cdots \oplus (z_{c n_c} \otimes A_c)$ (8)
The generalized additions and multiplications in Equation (8) are carried out using fuzzy arithmetic. Here, $A_i$ is the $i$-th fuzzy set characterized by a context. When the linguistic context is given as a triangular fuzzy number, it can be expressed using the following equation:
$A_i = \left( a_i^{-},\ a_i,\ a_i^{+} \right)$ (9)
Here, $a_i^{-}$, $a_i$, and $a_i^{+}$ denote the lower limit, the modal (granular model output) value, and the upper limit of the triangular fuzzy set, respectively. The lower limit, modal value, and upper limit of the prediction of the GFM can be expressed using Equations (10)–(12).
$y^{-} = z_{11} a_1^{-} + z_{12} a_1^{-} + \cdots + z_{1 n_1} a_1^{-} + \cdots + z_{c1} a_c^{-} + z_{c2} a_c^{-} + \cdots + z_{c n_c} a_c^{-}$ (10)
$y = z_{11} a_1 + z_{12} a_1 + \cdots + z_{1 n_1} a_1 + \cdots + z_{c1} a_c + z_{c2} a_c + \cdots + z_{c n_c} a_c$ (11)
$y^{+} = z_{11} a_1^{+} + z_{12} a_1^{+} + \cdots + z_{1 n_1} a_1^{+} + \cdots + z_{c1} a_c^{+} + z_{c2} a_c^{+} + \cdots + z_{c n_c} a_c^{+}$ (12)
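Given the firing strengths and the triangular contexts, Equations (10)–(12) reduce to three weighted sums. A minimal sketch follows; the data layout (one array of firing strengths per context) and the function name are assumptions for illustration:

```python
import numpy as np

def granular_output(z, contexts):
    """Lower limit, modal value, and upper limit of the GFM output
    (Eqs. (10)-(12)).

    z        : list of length c; z[i] holds the firing strengths z_it of the
               clusters associated with the i-th context
    contexts : (c, 3) array of triangular fuzzy numbers (a_i^-, a_i, a_i^+)
    """
    lo = md = hi = 0.0
    for z_i, (a_lo, a_md, a_hi) in zip(z, contexts):
        s = float(np.sum(z_i))   # all clusters of context i share the granule A_i
        lo += s * a_lo
        md += s * a_md
        hi += s * a_hi
    return lo, md, hi
```

With firing strengths normalized to sum to 1 across all clusters, the triple (lo, md, hi) is itself a triangular fuzzy number bracketing the prediction.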

3. CFCM-Based Granular Model with Hierarchical Tree Structure

3.1. Hierarchical Tree Structures

Increasing the number of input variables in fuzzy-related and granular models exponentially increases the number of fuzzy rules, resulting in poor computational efficiency and performance. It also makes it difficult to understand the operation of the model and challenging to adjust the parameters of the membership functions and fuzzy rules. Moreover, the generalizability of fuzzy and granular models can be reduced when processing large multivariate databases. To solve these problems, fuzzy and granular models are designed in an interconnected hierarchical structure [41] rather than in a single monolithic form; models designed in this way are referred to as hierarchical fuzzy and granular models. In this structure, the output of a low-level model is used as an input to a high-level model. A hierarchical fuzzy or granular model designed in this manner is easier to understand and can perform computations more efficiently than a single monolithic model that takes all inputs at once.
In this paper, we present three types of hierarchical structure: the incremental structure, the aggregated structure, and the cascaded structure [42,43,44]. Figure 1 shows these three hierarchical structures. In the incremental structure, input values are integrated over multiple levels, and output values can be read at each level. When assigning input variables to levels, each variable is placed according to its contribution to the final output: typically, the input variable with the highest contribution is used in the lowest-level model, while the variable with the lowest contribution is used in the highest-level model. In other words, the input values of the low-level model take precedence over those of the high-level model. In the aggregated structure, on the other hand, the raw input variables are used only in the lowest-level models, and the outputs of these models are used as inputs to the higher-level models. Input variables that are naturally grouped for a specific decision are fed into a common low-level model, whose output is then passed to the high-level model. Finally, the cascaded structure is designed by combining the incremental and aggregated structures. This structure is suitable for models that contain both correlated and uncorrelated input variables. Thus, in this paper we use a hierarchical tree model with the cascaded structure combining the incremental and aggregated structures.
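The three structures differ only in how sub-models are wired together, which can be made concrete as function composition. The sub-models f1, f2, g, g2 below are hypothetical placeholders for individual GFMs:

```python
def incremental(g2, g1, x1, x2, x3):
    # incremental: each level combines the previous level's output
    # with one additional raw input variable
    return g2(g1(x1, x2), x3)

def aggregated(g, f1, f2, x1, x2, x3, x4):
    # aggregated: low-level models consume grouped raw inputs;
    # the high-level model consumes their outputs
    return g(f1(x1, x2), f2(x3, x4))

def cascaded(g2, g, f1, f2, x1, x2, x3, x4, x5):
    # cascaded: an aggregated stage for the correlated inputs followed by
    # an incremental stage for a remaining (uncorrelated) input
    return g2(aggregated(g, f1, f2, x1, x2, x3, x4), x5)
```

Replacing each placeholder with a trained GFM yields the tree-structured model; the wiring itself carries no parameters.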

3.2. CFCM-Based Multi-Granular Fuzzy Model (MGFM) with Hierarchical Tree Structure

The tree-structured CFCM-based MGFM is proposed to resolve the problems that arise when a single granular model processes a large-scale multivariate database. The CFCM-based MGFM is designed as a multi-granular model in the form of a tree by stacking models in aggregated and incremental structures. Figure 2 and Figure 3 show the hierarchical tree structure and the diagram of the CFCM-based MGFM, respectively. This structure separates the correlated input variables from the uncorrelated ones. First, the low-level granular models compute their outputs by grouping the correlated input variables in an aggregated form. Then, these outputs and the uncorrelated input variables are processed by the granular model designed in an incremental structure. Figure 4 shows the process of generating fuzzy rules in the CFCM-based MGFM. The contexts are created in the output variable, and the clusters are created in the input regions corresponding to each context. Fuzzy rules are created from the contexts and clusters. The rank of each input variable is determined by its correlation with the output attribute of the database, and the variables are assigned as inputs to the low-level granular models in descending order of positive and negative correlation. The procedure of the tree-structured CFCM-based MGFM is as follows:
[Step 1]
Calculate the positive and negative correlations of the input variables based on the correlations with the output variables in the database to be used.
[Step 2]
By using the positive and negative correlation ranks, designate input variables (input variables with high correlation ranks) to be input into the low-level granular model of the aggregated structure and input variables (input variables with low correlation ranks) to be input into the granular model of the incremental structure.
[Step 3]
Set the number of contexts to create in the low-level granular model and the number of clusters to be produced per context. In addition, initialize the membership matrix U.
[Step 4]
Create contexts and clusters using context-based FCM clustering to generate fuzzy rules automatically. The generated fuzzy rules are then used to calculate the output of the lower-level granular model.
[Step 5]
Process the output of the low-level granular model, the output variables of the database, and the input variables with low correlation ranks that will be used in the incremental structure so that they can be used as inputs to the high-level granular model.
[Step 6]
Use the processed database as input to the high-level granular model and generate fuzzy rules using CFCM clustering to calculate the final output.
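Steps 1 and 2 amount to ranking the inputs by their correlation with the output and splitting them into two groups. A minimal sketch follows; the function name and the n_low parameter (how many variables go to the aggregated stage) are illustrative assumptions:

```python
import numpy as np

def split_by_correlation(X, y, n_low):
    """Rank input variables by |correlation with y| (Step 1); the n_low most
    correlated variables go to the aggregated low-level models, the rest to
    the incremental high-level stage (Step 2)."""
    corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(-np.abs(corr))   # descending absolute correlation
    return order[:n_low], order[n_low:], corr
```

The returned index arrays then select the columns fed to each stage of the tree.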

3.3. Performance Evaluation of CFCM-Based MGFM

The GFM (granular fuzzy model) is constructed from information granules generated using linguistic concepts and information. An information granule is realized as a fuzzy set rather than a single numerical entity. Conventional numerical data are evaluated by the standard deviation as well as the mean or median value; information granules, by contrast, must be generated rationally so that they optimally capture the information in the data. Accordingly, the GFM is evaluated based on the specific concepts of coverage and specificity. The first criterion, coverage, concerns whether sufficient experimental evidence is accumulated within the information granule to support its existence. The second criterion, specificity, requires that the generated information granule maintain high specificity.
The production of rational information granules focuses on generating meaningful information granules using the original data. To generate rational information granules, the coverage and specificity of the information granules should be satisfied [45]. Figure 5 illustrates a conceptual diagram of the coverage and specificity required to generate rational information granules.
Coverage is a criterion indicating whether the data are included in the information granules generated by the GFM; it shows the degree to which the output values fall within the range of the triangular information granules. When $Y_k$ is a triangular context, the inclusion measure equals 1 if $y_k$ is included in $Y_k = [y_k^{-}, y_k^{+}]$ and 0 otherwise. In other words, the coverage counts the target data included in the output of the GFM and averages this over all data. When all data are included in the output of the GFM, the coverage is close to 1.
$\text{Coverage} = \dfrac{1}{N} \sum_{k=1}^{N} \text{incl}\left( y_k,\ Y_k \right)$ (13)
Specificity indicates the precision of the constructed information granules, as shown in Equation (14). Shorter information granules indicate higher specificity, meaning that the resulting information granules have a well-defined meaning. The narrower the interval between the upper and lower limits, the higher the specificity value. When the output $Y_k$ of the GFM narrows, the specificity approaches 1.
$\text{Specificity} = \dfrac{1}{N} \sum_{k=1}^{N} \exp\left( -\left( y_k^{+} - y_k^{-} \right) \right)$ (14)
The GFM is evaluated by the coverage and specificity of its information granules, and we need a method that maximizes both simultaneously. The two characteristics are in a trade-off: when the coverage is high, the specificity tends to be low. A rational information granule can be expressed by Equation (15), which is referred to as the performance index (PI).
$\text{Performance index} = \text{coverage} \times \text{specificity}$ (15)
The performance index plays an important role in evaluating the accuracy and clarity of models. Several studies have been conducted on various methods of evaluating the performance of models. Common performance evaluation methods include root mean square error (RMSE) and mean absolute percentage error (MAPE). These methods are mainly used when the output of the model is a numerical value. However, the output of GFM is expressed by fuzzy numbers in a linguistic form. In this sense, the performance evaluation method of GFM is challenging. Thus, we use the performance index as a new performance evaluation method for GFM. The higher the value of the performance index, the more meaningful the information granule, making it possible to design the GFM with excellent performance. Furthermore, we can verify the relationship between the coverage and specificity by the performance index calculated from the GFM.
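Equations (13)–(15) can be computed directly from the granular outputs. A short sketch under the stated definitions (the helper names are my own):

```python
import numpy as np

def coverage(y, y_lo, y_hi):
    # Eq. (13): fraction of targets falling inside their granular interval
    return float(np.mean((y >= y_lo) & (y <= y_hi)))

def specificity(y_lo, y_hi):
    # Eq. (14): exp(-width); narrower intervals score closer to 1
    return float(np.mean(np.exp(-(y_hi - y_lo))))

def performance_index(y, y_lo, y_hi):
    # Eq. (15): the coverage/specificity trade-off in a single number
    return coverage(y, y_lo, y_hi) * specificity(y_lo, y_hi)
```

Widening every interval raises coverage toward 1 while driving specificity toward 0, which is exactly the trade-off the performance index balances.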

4. Experimental Results and Comments

To verify the validity of the tree-structured CFCM-based MGFM proposed in this paper, we performed experiments using the automobile fuel consumption database and the Boston housing price database, which are used as benchmarking databases in the prediction field.

4.1. Database

The automobile fuel consumption database [46] was collected from the late 1970s to the early 1980s to predict automobile fuel economy. This database consists of eight variables and 398 instances. The inputs consist of the number of cylinders, engine displacement, weight, horsepower, acceleration, model year, production region, and model name, and the output is fuel economy. In this experiment, we use six input variables, excluding the model name.
The Boston housing price database [47] contains information collected on housing prices in the city of Boston. This database has 14 variables and 506 instances. The inputs include the per capita crime rate by municipality, the proportion of residential land zoned for lots over 25,000 square feet, the proportion of land occupied by non-retail commercial districts, a dummy variable for the Charles River, the nitric oxide concentration (parts per 10 million), the average number of rooms per dwelling, the proportion of owner-occupied houses built before 1940, an index of accessibility to five Boston employment centers, an index of accessibility to radial roads, the property tax rate per USD 10,000, the ratio of students to teachers by municipality, the proportion of people of color by municipality, and the proportion of the lower-status population. The output is the price of owner-occupied houses (MEDV).
The energy efficiency database [48] was collected to predict the cooling and heating load of buildings and consists of eight input variables and 768 instances. The input variables consist of relative compactness, surface area, walled space, roofed space, total height, direction, glazed area, and glazed area distribution map, and the outputs are cooling load and heating load.

4.2. Experimental Methods and Results Analysis

The performance in this paper is evaluated by the performance index method explained in the previous section. Each database was normalized to values between 0 and 1. The total data are divided into training data (50%) and testing data (50%). The experiment was conducted by increasing the number of contexts (P) and clusters (C) from two to six in each tree-structured CFCM-based MGFM. Furthermore, fuzzification coefficients of 1.5 and 2 were used in the experiment. In addition, the contexts were generated by both the uniform and the flexible methods.
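The preprocessing protocol described above (min–max normalization to [0, 1] followed by a 50/50 train/test split) can be sketched as follows; the function name and the fixed random seed are my own choices for reproducibility:

```python
import numpy as np

def normalize_and_split(X, y, seed=0):
    """Min-max normalize inputs and output to [0, 1], then split 50/50
    into training and testing sets (the paper's stated protocol)."""
    Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    yn = (y - y.min()) / (y.max() - y.min())
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))       # shuffle before splitting
    half = len(y) // 2
    train = (Xn[idx[:half]], yn[idx[:half]])
    test = (Xn[idx[half:]], yn[idx[half:]])
    return train, test
```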
In the case of the automobile fuel efficiency prediction experiment, Table 1 and Table 2 list the performance index of the tree-structured CFCM-based MGFM with uniformly generated contexts for the training data and validation data, respectively. Here, m denotes the fuzzification coefficient, and P and C represent the number of contexts and clusters, respectively. In addition, PI denotes the performance index. As listed in Table 2, the best prediction performance is achieved when four contexts and four clusters are created. However, when the number of contexts is set to 2, the specificity of the information granules is not secured, and the specificity value is close to 0. Consequently, the performance index is 0. Figure 6 and Figure 7 show the prediction results for the training data and validation data, respectively. In these figures, the solid black line represents the actual output values of the data, and the dotted red line represents the predicted values of the proposed method. As shown in these figures, the tree-structured CFCM-based MGFM exhibits good prediction performance on the automobile fuel consumption data.
In what follows, we performed the experiments for the Boston housing price prediction. Table 3 and Table 4 list the performance index of the tree-structured CFCM-based MGFM for the training data and validation data, respectively. In the same manner, the experiment was conducted by increasing the number of contexts and clusters from two to six in each tree-structured CFCM-based MGFM. As listed in Table 4, the best performance is achieved when three contexts and four clusters are created. As in the previous experiment, the specificity is not secured when the number of contexts is set to 2, and the resulting specificity value is close to 0. Hence, the performance index is 0. Figure 8 and Figure 9 show the prediction results for the training data and validation data, respectively. As shown in these figures, the tree-structured CFCM-based MGFM achieves valid prediction performance on the Boston housing data.
Table 5 lists the experimental results for the automobile fuel consumption prediction, the Boston housing price prediction, and the additional energy efficiency example. Figure 10 shows the performance comparison between the CFCM-based MGFM and the GFM itself. As listed in Table 5, the experimental results for the automobile fuel consumption data demonstrated that the best performance was achieved with 18 fuzzy rules each (P = 6, C = 3) for four GFMs with two inputs in the hierarchical tree structure. Here, the performance index of the best model in the hierarchical tree structure is 0.4909, and the processing time of the proposed method is 0.0104 s. In contrast, the performance index and processing time of the GFM itself with six inputs are 0.3986 and 0.4 s, respectively. The numbers of premise and consequent parameters in the explainable simple GFM are 36 (18 centers × 2 inputs) and 18 (6 contexts × 3 points), respectively, whereas those of the GFM itself, which is difficult to explain, are 108 (18 centers × 6 inputs) and 18 (6 contexts × 3 points), respectively. The experiment was performed with MATLAB 2022a on a desktop with an Intel Core i7-7700 CPU, 16 GB of RAM, and an NVIDIA GeForce GTX 1060. As listed in Table 5, both the performance index and the processing time compare favorably with those of the GFM itself. Furthermore, the proposed method has the advantage of being explainable, since it simplifies the model into hierarchical sub-models with two inputs each.
In the case of the Boston housing price data, the experimental results showed that the best performance was achieved with 24 fuzzy rules each (P = 6, C = 4) for seven GFMs with two inputs in the hierarchical tree structure. Here, the performance index of the best model in the hierarchical tree structure is 0.4603, and the processing time of the proposed method is 1.2834 s. In contrast, the performance index and processing time of the GFM itself with 12 inputs are 0.3043 and 1.3417 s, respectively. Although the proposed method has a similar processing time, it was confirmed to achieve a better performance index than the GFM itself. Moreover, the GFM itself was found to be difficult to render explainable because its rules were generated with too many inputs.
In the case of energy efficiency data, the experimental results showed that the best performance was achieved by each of the 24 fuzzy rules (P = 6, C = 4) for a single GFM with two inputs for a hierarchical tree structure. Here, the performance index of the best model in a hierarchical tree structure is 0.4952. The processing time of the proposed method is 0.2834 s. In contrast, the performance index and processing time of GFM itself with six inputs are 0.4561 and 1.9887 s, respectively. As listed in Table 5, the performance index and processing time showed good results in comparison to those of GFM itself. Furthermore, the proposed CFCM-based MGFM has the characteristics of representing that it can be explained by simplifying hierarchical models with each of the two inputs.

5. Conclusions

We proposed the CFCM-based MGFM with a hierarchical tree structure. This method has the advantage that it is more computationally efficient and easier to understand than the GFM itself in multivariate system applications. For this, we used a hierarchical tree model with the cascaded structure combining the incremental and aggregated structures.
Furthermore, we used the performance index based on the unique coverage and specificity concepts for the performance evaluation of the proposed model with the aid of information granules. The experiments demonstrated the validity of the prediction performance of the proposed model on the well-known automobile fuel consumption data, Boston housing data, and energy efficiency data. As a result, it was confirmed that the proposed CFCM-based MGFM with the hierarchical tree structure can be expressed as an explainable structure with fewer rules than the existing design and that the proposed model can be meaningfully evaluated using the new performance index. The best configuration of the proposed model was found by trial and error; therefore, in future research we will investigate optimization methods to find the best granular fuzzy model with a hierarchical tree structure.

Author Contributions

Conceptualization, C.-U.Y. and K.-C.K.; Methodology, C.-U.Y. and K.-C.K.; Software, C.-U.Y. and K.-C.K.; Validation, K.-C.K.; Formal Analysis, C.-U.Y. and K.-C.K.; Investigation, C.-U.Y. and K.-C.K.; Resources, K.-C.K.; Data Curation, C.-U.Y.; Writing-Original Draft Preparation, C.-U.Y.; Writing-Review and Editing, C.-U.Y. and K.-C.K.; Visualization, C.-U.Y.; Supervision, K.-C.K.; Project Administration, K.-C.K.; Funding Acquisition, K.-C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by a research fund from Chosun University, 2022.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Hierarchical tree structure: (a) incremental structure, (b) aggregated structure, (c) cascade structure.
Figure 2. Hierarchical tree structure of CFCM-based MGFM.
Figure 3. Diagram of tree-structured CFCM-based MGFM.
Figure 4. Diagram for fuzzy rule generation in design of CFCM-based MGFM.
Figure 5. The concept of coverage and specificity for rational information granule generation.
Figure 6. Prediction results for training data (automobile fuel consumption data, P = 6, C = 2, m = 2, flexible generation method).
Figure 7. Prediction results for validation data (automobile fuel consumption data, P = 6, C = 2, m = 2, flexible generation method).
Figure 8. Prediction results for training data (Boston housing price data, P = 5, C = 5, m = 2, flexible generation method).
Figure 9. Prediction results for validation data (Boston housing price data, P = 5, C = 5, m = 2, flexible generation method).
Figure 10. Performance comparison for CFCM-based MGFM and GFM itself.
Table 1. Performance index of tree-structured CFCM-based MGFM for training data (automobile fuel consumption data).
m = 1.5, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2687   0.2701   0.2855   0.3002   0.2826
4         0.3647   0.3690   0.3838   0.3790   0.3431
5         0.4600   0.4101   0.4177   0.4143   0.3932
6         0.4526   0.4651   0.4304   0.3800   0.4182

m = 1.5, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2871   0.2597   0.2246   0.2805   0.3187
4         0.3893   0.3701   0.3843   0.3690   0.3533
5         0.4105   0.3902   0.4050   0.4099   0.3793
6         0.4346   0.4419   0.3970   0.3890   0.3915

m = 2, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.3027   0.2907   0.2945   0.3069   0.3115
4         0.4294   0.4017   0.4014   0.3759   0.3828
5         0.4278   0.4028   0.4481   0.4036   0.4165
6         0.4091   0.4445   0.4447   0.4213   0.4619

m = 2, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.3248   0.3184   0.2914   0.3055   0.3004
4         0.4494   0.4301   0.4605   0.4197   0.4520
5         0.4771   0.4160   0.4203   0.4689   0.4453
6         0.4219   0.4416   0.4311   0.4534   0.4258
Table 2. Performance index of tree-structured CFCM-based MGFM for validation data (automobile fuel consumption data).
m = 1.5, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2493   0.2766   0.2808   0.2966   0.2747
4         0.3915   0.3568   0.3854   0.3833   0.3573
5         0.4505   0.3713   0.3921   0.4036   0.4398
6         0.4538   0.4602   0.4559   0.4060   0.4247

m = 1.5, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2696   0.2620   0.2132   0.2714   0.3021
4         0.3860   0.3593   0.3432   0.3509   0.3809
5         0.4098   0.4505   0.4110   0.4170   0.3844
6         0.4055   0.4408   0.4080   0.4090   0.3865

m = 2, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.3001   0.2874   0.2909   0.2805   0.2962
4         0.4068   0.4149   0.3938   0.4254   0.4290
5         0.4641   0.4599   0.4435   0.4600   0.4513
6         0.4553   0.4737   0.4530   0.4479   0.4710

m = 2, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2935   0.2837   0.3133   0.3320   0.2877
4         0.4395   0.4266   0.4178   0.4270   0.4334
5         0.4458   0.4610   0.4351   0.4850   0.4734
6         0.4597   0.4909   0.4798   0.4864   0.4724
Table 3. Performance index of tree-structured CFCM-based MGFM for training data (Boston house price data).
m = 1.5, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2679   0.2735   0.2797   0.2616   0.2577
4         0.3767   0.3430   0.3439   0.3382   0.3763
5         0.3823   0.3163   0.3502   0.3501   0.3392
6         0.4212   0.4330   0.4835   0.4001   0.3689

m = 1.5, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2920   0.2473   0.2734   0.2831   0.2607
4         0.3854   0.3441   0.3670   0.3759   0.3284
5         0.3646   0.3487   0.3548   0.3914   0.3714
6         0.3517   0.3646   0.3558   0.4101   0.3671

m = 2, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2723   0.2745   0.2793   0.2834   0.2897
4         0.3823   0.3520   0.3711   0.3909   0.3778
5         0.3387   0.3305   0.3478   0.3670   0.3605
6         0.4158   0.3834   0.4181   0.4238   0.4370

m = 2, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.3031   0.3014   0.2885   0.2898   0.2949
4         0.3839   0.3653   0.3601   0.3878   0.3894
5         0.3518   0.3717   0.3843   0.3775   0.3998
6         0.3828   0.3627   0.3798   0.4072   0.4045
Table 4. Performance index of tree-structured CFCM-based MGFM for validation data (Boston house price data).
m = 1.5, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2481   0.2718   0.2646   0.2651   0.2549
4         0.3917   0.3881   0.3783   0.4073   0.4010
5         0.3862   0.3426   0.3659   0.3699   0.3676
6         0.4478   0.4412   0.4603   0.4233   0.3852

m = 1.5, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2995   0.2966   0.2865   0.2663   0.2860
4         0.3765   0.3275   0.3391   0.3477   0.3425
5         0.3860   0.3396   0.3908   0.3997   0.3744
6         0.3698   0.4054   0.3892   0.3762   0.3348

m = 2, Uniform context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.2836   0.2802   0.2890   0.2834   0.3195
4         0.3665   0.3729   0.3901   0.3969   0.3818
5         0.3668   0.3462   0.3414   0.3759   0.3304
6         0.3208   0.3850   0.4100   0.4161   0.4354

m = 2, Flexible context generation
P \ C     2        3        4        5        6
2         0        0        0        0        0
3         0.3127   0.3237   0.3116   0.3096   0.3310
4         0.3921   0.3912   0.3942   0.3945   0.3951
5         0.3701   0.3498   0.3630   0.3647   0.4252
6         0.3663   0.3749   0.3468   0.3475   0.3679
Table 5. Experimental results of performance index and processing time (validation data).
Table 5. Experimental results of performance index and processing time (validation data).
Database            Model   m     Context    P   C   Rules   PI       Time (s)
Auto MPG            MGFM    1.5   Uniform    6   3   18      0.4602   0.0133
                    MGFM    1.5   Flexible   5   3   15      0.4505   0.0132
                    MGFM    2     Uniform    6   3   18      0.4737   0.0127
                    MGFM    2     Flexible   6   3   18      0.4909   0.0104
                    GFM     2     Uniform    6   3   18      0.3986   0.0400
Boston housing      MGFM    1.5   Uniform    6   4   24      0.4603   1.2834
                    MGFM    1.5   Flexible   6   3   18      0.4053   1.2634
                    MGFM    2     Uniform    6   6   36      0.4353   1.3044
                    MGFM    2     Flexible   5   6   30      0.4253   1.2937
                    GFM     2     Uniform    5   6   30      0.3043   1.3427
Energy efficiency   MGFM    1.5   Uniform    5   6   30      0.4585   0.5562
                    MGFM    1.5   Flexible   5   5   25      0.4774   0.5472
                    MGFM    2     Uniform    5   4   20      0.4824   0.2561
                    MGFM    2     Flexible   6   4   24      0.4952   0.2834
                    GFM     2     Uniform    4   3   12      0.4561   1.9887
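One driver of the efficiency seen in Table 5 is rule count: a grid-partitioned conventional FIS with M membership functions on each of n inputs needs on the order of M^n rules, whereas each tree-structured MGFM module needs only P × C rules (P contexts, C clusters per context). A rough comparison, where the choice of 3 fuzzy sets per input and 7 inputs for the grid FIS is our illustrative assumption:

```python
def fis_rule_count(m_per_input: int, n_inputs: int) -> int:
    """Grid-partitioned FIS: one rule per combination of antecedent fuzzy sets."""
    return m_per_input ** n_inputs

def mgfm_rule_count(p_contexts: int, c_clusters: int) -> int:
    """CFCM-based module: one rule per (context, cluster) pair."""
    return p_contexts * c_clusters

# Auto MPG row of Table 5: P = 6 contexts, C = 3 clusters per context.
print(mgfm_rule_count(6, 3))   # 18
# Hypothetical grid FIS with 3 fuzzy sets on each of 7 inputs.
print(fis_rule_count(3, 7))    # 2187
```

The linear P × C growth is what keeps the rule base small and interpretable as inputs are added.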

Citation: Yeom, C.-U.; Kwak, K.-C. A Design and Its Application of Multi-Granular Fuzzy Model with Hierarchical Tree Structures. Appl. Sci. 2023, 13, 11175. https://doi.org/10.3390/app132011175