Article

Unsupervised Attribute Reduction Algorithm for Mixed Data Based on Fuzzy Optimal Approximation Set

Department of Mathematics and Physics, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(16), 3452; https://doi.org/10.3390/math11163452
Submission received: 14 June 2023 / Revised: 15 July 2023 / Accepted: 7 August 2023 / Published: 9 August 2023
(This article belongs to the Special Issue Data Mining: Analysis and Applications)

Abstract

Fuzzy rough set theory has been successfully applied to many attribute reduction methods, in which the lower approximation set plays a pivotal role. However, the definition of lower approximation used has ignored the information conveyed by the upper approximation and the boundary region. This oversight has resulted in an unreasonable relation representation of the target set. Despite the fact that scholars have proposed numerous enhancements to rough set models, such as the variable precision model, none have successfully resolved the issues inherent in the classical models. To address this limitation, this paper proposes an unsupervised attribute reduction algorithm for mixed data based on an improved optimal approximation set. Firstly, the theory of an improved optimal approximation set and its associated algorithm are proposed. Subsequently, we extend the classical theory of optimal approximation sets to fuzzy rough set theory, leading to the development of a fuzzy improved approximation set method. Finally, building on the proposed theory, we introduce a novel, fuzzy optimal approximation-set-based unsupervised attribute reduction algorithm (FOUAR). Comparative experiments conducted with all the proposed algorithms indicate the efficacy of FOUAR in selecting fewer attributes while maintaining and improving the performance of the machine learning algorithm. Furthermore, they highlight the advantage of the improved optimal approximation set algorithm, which offers higher similarity to the target set and provides a more concise expression.

1. Introduction

Since the introduction of rough sets by Pawlak [1] in 1982, attribute reduction, also known as feature selection, has emerged as an important research direction in rough set applications [2,3,4]. Attribute reduction algorithms rooted in classical rough set theory are often termed positive region reduction algorithms [5,6,7], which use knowledge granules (equivalence classes) in the data attributes to approximate the target set. Among these, the equivalence classes that are wholly contained within the target set form the lower approximation, while those that have a nonempty intersection with the target set form the upper approximation. The difference between the upper and lower approximations is referred to as the boundary region. However, the classical rough set model cannot be directly applied to numerical data types. Concurrently, fuzzy set theory has achieved success in numerous domains, such as fuzzy logic systems [8,9] and pattern recognition. To address the limitation of rough sets in handling numerical data, Dubois and Prade [10,11] integrated rough set models with fuzzy set theory and proposed fuzzy rough sets. Reduction algorithms based on fuzzy rough sets employ similarity kernel functions to achieve more effective attribute reduction without sacrificing the information contained in the data. Consequently, many scholars have conducted in-depth and extensive research on reduction algorithms predicated on fuzzy rough sets [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]. For instance, Hu et al. [16] devised a heuristic algorithm that guarantees convergence by proposing a novel fuzzy rough set. Yuan et al. [20] extended the supervised fuzzy rough set algorithm to unsupervised fuzzy rough set attribute reduction.
However, lower approximation theory, which is widely employed within these models, still possesses significant limitations. Traditionally, lower approximation theory stipulates that knowledge granules must be entirely contained within the target set; failing this, it is inferred that the target set cannot be represented by the extant knowledge granules. This criterion overlooks the information in the boundary region, resulting in an inability to accurately describe the dependencies among many attributes. In order to address these issues, scholars have proposed a plethora of refined models. Ziarko et al. [28] introduced the variable precision rough set model. Its fundamental concept entails establishing a tolerance threshold within the traditional rough set model, which allows knowledge granules that fall under this threshold to be regarded as lower approximations of the target set. Chen et al. [29] incorporated a rough membership function and approximation parameters from variable precision rough sets into the multigranulation environment, establishing a model known as the variable precision multigranulation rough set. Dai and Li [30,31] investigated the VPRS model in the context of a double universe, focusing on its structures and properties. These models share a common characteristic, which is the relaxation of the inclusion criteria for lower approximations to a certain extent by setting parameters. However, when knowledge granules still cannot satisfy the relatively relaxed inclusion criteria, these models crudely consider the knowledge granules to have no association with the target set. In reality, the refined models have not fundamentally addressed the issues faced by rough sets, but have only provided a degree of alleviation to the problem.
Several scholars have proposed the utilization of optimal approximation sets as substitutes for lower approximation sets during the reduction process [32,33,34,35,36,37]. The primary objective is to identify the set that most accurately represents the target set. An optimal approximation set comprises the lower approximation of the target set and a segment of the boundary area, enabling effective utilization of the information derived from this region. By leveraging the similarity between knowledge granules and the target set to replace the lower approximation present in traditional rough sets, their relationship can be precisely described using a similarity function, irrespective of whether the knowledge granules are encompassed within the target set. This approach effectively resolves the issues encountered in rough sets. Furthermore, optimal approximation sets can also mitigate these same challenges faced by variable-precision models and other similar models when applied to practical problems. Despite their potential, however, current models of optimal approximation sets present several limitations. For example, they are rooted in traditional rough set theory and are consequently only applicable to categorical data. Furthermore, the current methods used to find optimal approximation sets fall short of identifying the most similar representation of the target set in a theoretical sense. These shortcomings not only restrict the broad applicability of optimal approximation sets but also prevent them from accurately embodying the target.
To tackle the challenges discussed above, this study modifies and extends the concept of optimal approximation sets and, on that basis, proposes a new unsupervised attribute reduction approach that leverages the improved fuzzy optimal approximation sets. First, we refine the traditional theory of optimal approximation sets, proposing an improved version of this theory along with the associated theorems. We then broaden the traditional scope of optimal approximation set theory to incorporate fuzzy set theory, resulting in a new theory of fuzzy optimal approximation sets. Building on the refined optimal approximation set model, we propose two new algorithms: an improved optimal approximation set algorithm (IOAS) and a fuzzy improved optimal approximation set algorithm (FIOAS). Lastly, leveraging the fuzzy optimal approximation set model, we propose a novel attribute importance function and develop an unsupervised attribute reduction algorithm, namely FOUAR. To test the validity of IOAS and FOUAR, we carried out experiments on 30 different datasets. The results show that IOAS identifies closer matches to the target set while also reducing the size of the optimal approximation sets. FOUAR, in turn, selects fewer yet higher-quality attributes and maintains, or even enhances, the accuracy of subsequent machine learning tasks. Moreover, the parameter experiments show that FOUAR is robust against changes in its parameter. Hypothesis tests suggest that there are notable differences between FOUAR and most other algorithms, with FOUAR consistently achieving the top ranks.
The main contributions of this paper are as follows:
(1)
We extend the existing theory of optimal approximation sets to create a theory of fuzzy optimal approximation sets. This enables their application to mixed data types.
(2)
We introduce improved algorithms for solving optimal approximation sets, known as IOAS and FIOAS. These enhanced methods enable higher similarity to the target set and produce smaller optimal approximation sets.
(3)
We propose an unsupervised attribute reduction algorithm based on fuzzy optimal approximation sets. This algorithm is capable of selecting fewer attributes while maintaining the accuracy of classification and clustering tasks.
The remainder of this paper is organized as follows: Section 2 introduces the preliminary information on rough set theory. Progressing to Section 3, we discuss optimal approximation set theory and its improved algorithm. Thereafter, in Section 4, we extend the classical theory of optimal approximation sets and related definitions to fuzzy set theory, and propose an algorithm for fuzzy optimal approximation sets. Transitioning to Section 5, we present a series of theories and definitions related to fuzzy optimal approximation sets, as well as an unsupervised attribute reduction algorithm for mixed data based on fuzzy optimal approximation sets. Additionally, an example is provided to illustrate the proposed method. Section 6 presents the experimental results. Finally, conclusions are drawn in Section 7.

2. Relevant Concepts

Definition 1 
([1]). Let $U$ be a finite universe and $R$ an equivalence relation on $U$. Then, the pair $(U, R)$ is called an approximation space. The equivalence classes determined by the equivalence relation $R$ are referred to as the basic knowledge on the approximation space $(U, R)$, denoted as $U/R = \{[x_i]_R \mid x_i \in U\}$, where $[x_i]_R = \{x_j \mid (x_i, x_j) \in R\}$.
Definition 2 
([1]). Let $(U, R)$ be an approximation space, where $R$ is an equivalence relation on $U$. For any subset $X \subseteq U$, the lower approximation of $X$ is defined as $\underline{R}X = \{x \in U \mid [x]_R \subseteq X\}$, and the upper approximation of $X$ is defined as $\overline{R}X = \{x \in U \mid [x]_R \cap X \neq \emptyset\}$.
Definition 3 
([38]). Let $S: 2^U \times 2^U \to [0, 1]$. For any $X, Y \subseteq U$, if $S(X, Y)$ satisfies
  • Boundedness: $0 \le S(X, Y) \le 1$, $S(X, X) = 1$, and $S(X, Y) = 0$ only if $X \cap Y = \emptyset$;
  • Symmetry: $S(X, Y) = S(Y, X)$;
  • Monotonicity: if $X \subseteq Y \subseteq Z$, then $S(X, Y) \ge S(X, Z)$ and $S(Y, Z) \ge S(X, Z)$,
then $S(X, Y)$ is called the similarity degree between the sets $X$ and $Y$. In this paper, we let $S(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}$.
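The similarity degree adopted here is the Jaccard index between crisp sets. The following minimal Python sketch (the function name is ours, for illustration only) computes it:

```python
def set_similarity(X, Y):
    """Jaccard similarity S(X, Y) = |X ∩ Y| / |X ∪ Y| between crisp sets."""
    X, Y = set(X), set(Y)
    if not X and not Y:
        return 1.0  # boundedness requires S(X, X) = 1, even for empty sets
    return len(X & Y) / len(X | Y)

# For example, S({1, 2, 3}, {2, 3, 4}) = 2 / 4 = 0.5.
print(set_similarity({1, 2, 3}, {2, 3, 4}))
```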

3. Improvement Based on Optimal Approximation Set

In this section, we mainly introduce the calculation method of the optimal approximation set, the improved algorithm of the optimal approximation set, and its related theorems.

3.1. Optimal Approximation Set

For a target set X U , its lower and upper approximations are denoted by R ̲ X and R ¯ X , respectively. The optimal approximation set is defined as the set with the highest similarity to X between the lower and upper approximations and can be represented using the basic knowledge in the approximation space. Compared to other approximation sets, the optimal approximation set ensures a high degree of approximation accuracy for the target set without necessitating manual parameter adjustments. The definition of the optimal approximation set is provided below.
Definition 4. 
Let $X \subseteq U$ be a target set, and let $X_{apps}$ be the power set of $U/R$. For any $X_2 \in X_{apps}$, if there exists $X_1 \in X_{apps}$ such that $S(X_1, X) \ge S(X_2, X)$, where $X_i = [x_{i_1}]_R \cup [x_{i_2}]_R \cup \cdots \cup [x_{i_t}]_R$ with $[x_{i_j}]_R \in X_i$ and $t \le |U|$, then $X_1$ is called the optimal approximation set of $X$.
For convenience, in this paper, we denote the union $\cup X$ of a family $X$ of equivalence classes as $X_{union}$.
Given an approximation space $(U, R)$ and $X \subseteq U$, let $\alpha = |\underline{R}X| / |X|$, $B_X = \{[x]_R \mid 0 < |[x]_R \cap X| / |[x]_R| < 1,\ x \in U\}$, and $C_X = \{[x]_R \mid |[x]_R \cap X| / |[x]_R \setminus X| \ge \alpha,\ [x]_R \in B_X\}$. Then, for any $A \subseteq B_X$, $X$ can be approximated and characterized by $A_{union} \cup \underline{R}X$. Based on the above description, a theorem for computing the optimal approximation set of the target set $X$ is provided in [36].
Theorem 1 
([36]). Let $(U, R)$ be an approximation space and $X \subseteq U$. If $Y = \{[y]_R \mid |[y]_R \cap X| / |[y]_R \setminus X| \ge \alpha,\ [y]_R \in B_X\}$, then $\underline{R}X \cup Y_{union}$ is called the optimal approximation set of $X$.
Theorem 1 identifies a relatively high-precision approximation of the target set $X$. The reasoning behind Theorem 1 is as follows. If we take $C \cup \underline{R}X$ as the approximation of $X$, where $C$ is a union of knowledge granules from the boundary region, then the similarity is
$$S(C \cup \underline{R}X, X) = \frac{|(C \cup \underline{R}X) \cap X|}{|(C \cup \underline{R}X) \cup X|}.$$
Since $\underline{R}X \subseteq X$ and the granules in $C$ are disjoint from $\underline{R}X$, this simplifies to
$$S(C \cup \underline{R}X, X) = \frac{|C \cap X| + |\underline{R}X|}{|C \setminus X| + |X|}.$$
In this context, if we want the similarity to increase relative to $\frac{|\underline{R}X|}{|X|}$, we need to find a $C_1$ such that $\frac{|C_1 \cap X|}{|C_1 \setminus X|} \ge \frac{|\underline{R}X|}{|X|}$. Given this, we can deduce
$$\frac{|C_1 \cap X| + |\underline{R}X|}{|C_1 \setminus X| + |X|} \ge \frac{|\underline{R}X|}{|X|}, \qquad (1)$$
which, expressed in terms of similarity, gives
$$S(C_1 \cup \underline{R}X, X) \ge S(\underline{R}X, X).$$
Following this idea, we can select from $U/R$ the granules $[x]_R$ that satisfy $\frac{|[x]_R \cap X|}{|[x]_R \setminus X|} \ge \frac{|\underline{R}X|}{|X|}$ and collect them into $C_X$. In this case, the similarity between $C_X \cup \underline{R}X$ and $X$ is greater than the similarity between $\underline{R}X$ and $X$. Consequently, we have found a set $C_X \cup \underline{R}X$ that represents $X$ better than $\underline{R}X$ does.
Theorem 1 identifies a high-precision approximation set that offers a more reasonable representation of the target set. A corresponding example illustrating this theorem is provided below.
Example 1. 
From Table 1, we can infer that the decision attribute $d$ induces the equivalence classes $D_1 = \{x_6, x_7, x_8, x_9, x_{10}, x_{11}, x_{17}, x_{18}\}$ and $D_2 = \{x_1, x_2, x_3, x_4, x_5, x_{12}, x_{13}, x_{14}, x_{15}, x_{16}\}$. Next, we consider $D_1$ as the target set and employ Theorem 1 to compute its optimal approximation set; we then use the obtained results to illustrate the limitations of Theorem 1.
Utilizing the equivalence relation $R$, we derive the partition of $U$ under $R$ as $U/R = \{X_1, X_2, X_3\}$, where $X_1 = \{x_1, x_2, \ldots, x_7\}$, $X_2 = \{x_8, x_9, \ldots, x_{16}\}$, and $X_3 = \{x_{17}, x_{18}\}$. Applying Theorem 1, we first compute
$$B_X = \{[x]_R \mid 0 < |[x]_R \cap D_1| / |[x]_R| < 1,\ x \in U\} = \{X_1, X_2\}.$$
Given that $\alpha = |\underline{R}D_1| / |D_1| = \frac{2}{8}$, and in accordance with the definition of $C_X$, we obtain $|X_1 \cap D_1| / |X_1 \setminus D_1| = \frac{2}{5} > \alpha$ and $|X_2 \cap D_1| / |X_2 \setminus D_1| = \frac{4}{5} > \alpha$. As both values exceed $\alpha$, we deduce that $C_X = \{X_1, X_2\}$. Subsequently, the similarity can be computed as
$$S(\underline{R}D_1 \cup C_{X_{union}}, D_1) = \frac{|(X_1 \cup X_2 \cup \underline{R}D_1) \cap D_1|}{|(X_1 \cup X_2 \cup \underline{R}D_1) \cup D_1|} = \frac{8}{18}.$$
Thus, the optimal approximation set of $D_1$ given by Theorem 1 is $\underline{R}D_1 \cup C_{X_{union}} = X_1 \cup X_2 \cup X_3$.
However, there exists an alternative $C'_X = \{X_2\}$ such that
$$S(\underline{R}D_1 \cup C'_{X_{union}}, D_1) = \frac{|(X_2 \cup \underline{R}D_1) \cap D_1|}{|(X_2 \cup \underline{R}D_1) \cup D_1|} = \frac{6}{13} > \frac{8}{18}.$$
According to the definition of the upper approximation, the upper approximation of $D_1$ is $X_1 \cup X_2 \cup X_3$, which is equal to $\underline{R}D_1 \cup C_{X_{union}}$.
In fact, the subset $C'_X \subset C_X$ allows $\underline{R}D_1 \cup C'_{X_{union}}$ to achieve a higher degree of approximation accuracy for the target set than $\underline{R}D_1 \cup C_{X_{union}}$. Theorem 1 only identifies a set $\underline{R}D_1 \cup C_{X_{union}}$ that characterizes the target set better than $\underline{R}D_1$ does; it does not verify whether this set truly maximizes the similarity. In the next subsection, we present the relevant theorem for optimal approximation sets and an improved algorithm. (A short computational sketch that reproduces this example is given after Table 1.)
Table 1. Decision information system.

U      a1   a2   a3   d
x1     1    0    1    0
x2     1    0    1    0
x3     1    0    1    0
x4     1    0    1    0
x5     1    0    1    0
x6     1    0    1    1
x7     1    0    1    1
x8     0    1    0    1
x9     0    1    0    1
x10    0    1    0    1
x11    0    1    0    1
x12    0    1    0    0
x13    0    1    0    0
x14    0    1    0    0
x15    0    1    0    0
x16    0    1    0    0
x17    0    0    1    1
x18    0    0    1    1
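The following Python sketch reproduces the computations of Example 1 under the partition and target set listed above (set names follow the example; the similarity is the Jaccard degree of Definition 3):

```python
def sim(A, B):
    """Jaccard similarity |A ∩ B| / |A ∪ B| (Definition 3)."""
    return len(A & B) / len(A | B)

X1 = {f"x{i}" for i in range(1, 8)}                      # {x1, ..., x7}
X2 = {f"x{i}" for i in range(8, 17)}                     # {x8, ..., x16}
X3 = {"x17", "x18"}
D1 = {"x6", "x7", "x8", "x9", "x10", "x11", "x17", "x18"}
partition = [X1, X2, X3]

lower = set().union(*[g for g in partition if g <= D1])            # R̲D1 = X3
alpha = len(lower) / len(D1)                                       # 2/8
C_X = [g for g in partition
       if 0 < len(g & D1) < len(g)                                 # boundary granule
       and len(g & D1) / len(g - D1) >= alpha]                     # Theorem 1 filter

print(sim(lower.union(*C_X), D1))   # 8/18 ≈ 0.444: Theorem 1's approximation
print(sim(lower | X2, D1))          # 6/13 ≈ 0.462: the better alternative C'_X = {X2}
```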

3.2. Improvement of Optimal Approximation Set

In this subsection, we present the improved algorithm for computing the optimal approximation set and its related theorems.
Lemma 1 
([17]). For any positive constants $x_1, x_2, y_1, y_2$, if $\frac{x_1}{y_1} \le \frac{x_2}{y_2}$, then we have $\frac{x_1}{y_1} \le \frac{x_1 + x_2}{y_1 + y_2} \le \frac{x_2}{y_2}$.
Now, we introduce the following notation:
  • For the target set $X$, let $P(y_i) = \frac{|[y_i]_R \cap X|}{|[y_i]_R \setminus X|}$;
  • Let $\frac{x_1}{y_1} \oplus \frac{x_2}{y_2}$ denote $\frac{x_1 + x_2}{y_1 + y_2}$, and let $\frac{x_1}{y_1} \ominus \frac{x_2}{y_2}$ denote $\frac{x_1 - x_2}{y_1 - y_2}$;
  • Let $\bigoplus_{i=1}^{k} P(y_i)$ denote $P(y_1) \oplus \cdots \oplus P(y_k)$.
The following is the theorem regarding the existence of the optimal approximation set.
Theorem 2. 
Let $\beta = \frac{|\underline{R}X|}{|X|} \oplus \bigoplus_{j=1}^{k} P(y_j)$. If $P(y_i) \le \frac{|\underline{R}X|}{|X|} \oplus P(y_1) \oplus \cdots \oplus P(y_{i-1}) \oplus P(y_{i+1}) \oplus \cdots \oplus P(y_k)$, then we have $\beta \le \beta \ominus P(y_i)$.
Proof of Theorem 2. 
Since $P(y_i) \le \frac{|\underline{R}X|}{|X|} \oplus P(y_1) \oplus \cdots \oplus P(y_{i-1}) \oplus P(y_{i+1}) \oplus \cdots \oplus P(y_k)$ and $P(y_i) \oplus \left( \frac{|\underline{R}X|}{|X|} \oplus P(y_1) \oplus \cdots \oplus P(y_{i-1}) \oplus P(y_{i+1}) \oplus \cdots \oplus P(y_k) \right) = \beta$, applying Lemma 1, we have
$$P(y_i) \le \beta \le \left( \frac{|\underline{R}X|}{|X|} \oplus \bigoplus_{j=1}^{k} P(y_j) \right) \ominus P(y_i);$$
thus, we can obtain $\beta \le \beta \ominus P(y_i)$. □
From Theorem 2, for $C_X = \{[x]_R \mid \frac{|[x]_R \cap X|}{|[x]_R \setminus X|} \ge \alpha,\ [x]_R \in B_X\}$, if there exists $[y_i]_R \in C_X$ such that
$$\frac{|[y_i]_R \cap X|}{|[y_i]_R \setminus X|} \le \frac{|\underline{R}X| + \sum_{j=1}^{k} |[y_j]_R \cap X| - |[y_i]_R \cap X|}{|X| + \sum_{j=1}^{k} |[y_j]_R \setminus X| - |[y_i]_R \setminus X|},$$
then, according to Lemma 1, the following result can be obtained:
$$\frac{|[y_i]_R \cap X|}{|[y_i]_R \setminus X|} \le \frac{|\underline{R}X| + \sum_{j=1}^{k} |[y_j]_R \cap X|}{|X| + \sum_{j=1}^{k} |[y_j]_R \setminus X|} \le \frac{|\underline{R}X| + \sum_{j=1}^{k} |[y_j]_R \cap X| - |[y_i]_R \cap X|}{|X| + \sum_{j=1}^{k} |[y_j]_R \setminus X| - |[y_i]_R \setminus X|}. \qquad (2)$$
According to the conclusion deduced from Theorem 1 in Section 3.1, the following result can be derived:
$$S(C_{X_{union}} \cup \underline{R}X, X) = \frac{|C_{X_{union}} \cap X| + |\underline{R}X|}{|C_{X_{union}} \setminus X| + |X|}. \qquad (3)$$
Thus, the inequality (Equation (2)) can be expressed in terms of the similarity (Equation (3)), which yields $S(C_{X_{union}} \cup \underline{R}X, X) \le S((C_{X_{union}} \setminus [y_i]_R) \cup \underline{R}X, X)$.
The above conclusion indicates that removing some of the less informative elements from $C_X$ can actually improve the approximation accuracy. Therefore, when we sort all $P(y_i)$, $i = 1, 2, \ldots, k$, in ascending order as $P(y_1) \le P(y_2) \le \cdots \le P(y_k)$, Lemma 1 and Theorem 2 can be employed to derive the following theorem for identifying the optimal approximation set.
Theorem 3. 
For a given $\beta = \frac{|\underline{R}X|}{|X|} \oplus \bigoplus_{j=1}^{k} P(y_j)$, where $P(y_1) \le P(y_2) \le \cdots \le P(y_k)$, if there exists $P(y_i) \le \beta \ominus P(y_i)$, then the following holds:
$$\beta \ominus \bigoplus_{j=i}^{i} P(y_j) \le \beta \ominus \bigoplus_{j=i-1}^{i} P(y_j) \le \cdots \le \beta \ominus \bigoplus_{j=1}^{i} P(y_j).$$
Proof of Theorem 3. 
Mathematical induction can be used to prove that
$$\beta \ominus \bigoplus_{j=i}^{i} P(y_j) \le \beta \ominus \bigoplus_{j=i-1}^{i} P(y_j) \le \cdots \le \beta \ominus \bigoplus_{j=i-t}^{i} P(y_j).$$
When $t = 1$, according to the assumption of the theorem, we know that $P(y_i) \le \beta \ominus P(y_i)$, so we have $P(y_{i-1}) \le P(y_i) \le \beta \ominus P(y_i)$, and $P(y_{i-1}) \oplus \left( \beta \ominus P(y_i) \ominus P(y_{i-1}) \right) = \beta \ominus P(y_i)$. Based on Lemma 1, we have $\beta \ominus P(y_i) \le \beta \ominus P(y_i) \ominus P(y_{i-1})$ and $P(y_{i-1}) \le \beta \ominus P(y_i) \ominus P(y_{i-1})$. When $t > 1$, if $P(y_{i-t}) \le \beta \ominus \left( \bigoplus_{j=i-t}^{i} P(y_j) \right)$ and $P(y_{i-t-1}) \le P(y_{i-t})$, then $P(y_{i-t-1}) \le \beta \ominus \left( \bigoplus_{j=i-t}^{i} P(y_j) \right)$, which implies that $P(y_{i-t-1}) \oplus \left( \beta \ominus \left( \bigoplus_{j=i-t-1}^{i} P(y_j) \right) \right) = \beta \ominus \left( \bigoplus_{j=i-t}^{i} P(y_j) \right)$. From Lemma 1, we have $\beta \ominus \left( \bigoplus_{j=i-t}^{i} P(y_j) \right) \le \beta \ominus \left( \bigoplus_{j=i-t-1}^{i} P(y_j) \right)$ and $P(y_{i-t-1}) \le \beta \ominus \left( \bigoplus_{j=i-t-1}^{i} P(y_j) \right)$. Using mathematical induction, we can obtain $\beta \ominus \left( \bigoplus_{j=i}^{i} P(y_j) \right) \le \beta \ominus \left( \bigoplus_{j=i-1}^{i} P(y_j) \right) \le \cdots \le \beta \ominus \left( \bigoplus_{j=i-t}^{i} P(y_j) \right)$. □
In light of Theorem 3, we now propose an improved algorithm for determining the optimal approximation set, as given in Algorithm 1, namely, the improved algorithm for optimal approximation set (IOAS).
Algorithm 1: IOAS
[Algorithm 1 (IOAS) is given as a pseudocode figure in the original article.]
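Algorithm 1 is available only as a figure here, so the following Python sketch captures its core loop as we read it from Theorem 3 and Example 2 (variable names and the structure are ours, not the authors' pseudocode): starting from the lower approximation, the boundary granule with the largest ratio $P$ is absorbed, the combined ratio is updated with the mediant operation $\oplus$, and granules whose ratio falls below the current combined ratio are discarded.

```python
def ioas(partition, X):
    """Sketch of the IOAS idea: greedily keep boundary granules with high
    P(g) = |g ∩ X| / |g - X| and drop those below the running combined ratio."""
    X = set(X)
    lower = set().union(*[g for g in partition if g <= X])   # lower approximation
    num, den = len(lower), len(X)                            # current ratio |R̲X| / |X|
    alpha = num / den
    P = lambda g: len(g & X) / len(g - X)
    C = [g for g in partition if 0 < len(g & X) < len(g) and P(g) >= alpha]
    D = []
    while C:
        best = max(C, key=P)                                 # granule with largest ratio
        D.append(best)
        num, den = num + len(best & X), den + len(best - X)  # mediant update (⊕)
        C = [g for g in C if g is not best and P(g) >= num / den]
    return lower.union(*D)

# Continuing Examples 1 and 2: ioas([X1, X2, X3], D1) returns X2 ∪ X3,
# whose similarity to D1 is 6/13.
```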
Example 2 
(Continued from Example 1). We will apply the improved algorithm for the optimal approximation set (IOAS) to re-evaluate the following example.
By following step 2 in Algorithm 1, we have
$$C_X = \{[x]_R \mid \frac{|[x]_R \cap D_1|}{|[x]_R \setminus D_1|} \ge \alpha,\ [x]_R \in B_X\} = \{X_1, X_2\}.$$
According to step 4 in Algorithm 1, $|X_1 \cap D_1| / |X_1 \setminus D_1| = \frac{2}{5}$ and $|X_2 \cap D_1| / |X_2 \setminus D_1| = \frac{4}{5}$, so we obtain $[x]_R = X_2$. Then, following step 5 in IOAS, $C_X$ is updated to $C_X = \{X_1\}$ and $D = \emptyset$ to $D = \{X_2\}$. Based on steps 7–9 and step 3, since
$$\delta = \frac{|X_2 \cap D_1|}{|X_2 \setminus D_1|} \oplus \frac{|\underline{R}D_1|}{|D_1|} = \frac{4 + 2}{5 + 8} > \frac{|X_1 \cap D_1|}{|X_1 \setminus D_1|} = \frac{2}{5},$$
$X_1$ is removed from $C_X$ and $C_X = \emptyset$. Finally, we obtain the optimal approximation set $D_{union} \cup \underline{R}D_1 = X_2 \cup X_3$, and the resulting similarity is
$$S(\underline{R}D_1 \cup D_{union}, D_1) = \frac{|(X_2 \cup \underline{R}D_1) \cap D_1|}{|(X_2 \cup \underline{R}D_1) \cup D_1|} = \frac{6}{13} > \frac{8}{18}.$$
The improved algorithm provides a higher similarity value compared to the result obtained through Theorem 1. Additionally, the algorithm has effectively reduced the size of the optimal approximation set. As a result, it provides a more accurate and concise representation of the target set D 1 .

4. Improved Algorithm Based on Fuzzy Optimal Approximation Set

In this section, we extend the classical theory of optimal approximation sets to fuzzy set theory and, based on this extension, propose an algorithm for solving the optimal approximation sets of fuzzy sets.

4.1. Relative Definition

In this paper, we represent a fuzzy decision information system as $FIS = (U, A, V, f)$, where $U$ is a non-empty finite universe, $A$ is a non-empty finite set of attributes satisfying $A = C \cup Dis$ and $C \cap Dis = \emptyset$ (with $C$ the condition attributes and $Dis$ the decision attribute set), $V$ is the range of all attribute values, and $f: U \times A \to V$ is a mapping. Additionally, a fuzzy information system without a decision attribute is denoted as $FIS = (U, C, V, f)$.
When no ambiguity arises, for the sake of simplicity, we write $[x]_{R_C}$ as $[x]_C$, where $R$ denotes the equivalence or similarity relation and $C$ refers to either an attribute set or an attribute subset.
Definition 5 
([10,39]). A fuzzy relation $R$ on $U$ is a mapping $R: U \times U \to [0, 1]$. For any $(x, y) \in U \times U$, the membership degree $R(x, y)$ indicates the degree to which $x$ and $y$ are related by $R$. The set of all fuzzy relations on $U$ is denoted as $F(U \times U)$.
Suppose $R \in F(U \times U)$. For any $x, y, z \in U$, if $R$ satisfies
  • Reflexivity: $R(x, x) = 1$;
  • Symmetry: $R(x, y) = R(y, x)$;
  • Transitivity: $R(x, z) \ge \sup_{y \in U} \min\{R(x, y), R(y, z)\}$,
then $R$ is called a fuzzy equivalence relation on $U$. Furthermore, if $R$ only satisfies reflexivity and symmetry, then $R$ is called a fuzzy similarity relation on $U$. For $R_1, R_2 \in F(U \times U)$, we have
  • $R_1 \subseteq R_2 \Leftrightarrow R_1(x, y) \le R_2(x, y)$ for all $(x, y) \in U \times U$;
  • $(R_1 \cap R_2)(x, y) = \min\{R_1(x, y), R_2(x, y)\}$;
  • $(R_1 \cup R_2)(x, y) = \max\{R_1(x, y), R_2(x, y)\}$.
Definition 6 
([10,39]). Let $FIS = (U, A, V, f)$ be a fuzzy information system, and let $R$ be a fuzzy similarity relation on $U$. For any $X \in F(U)$, the lower approximation $\underline{R}X$ and the upper approximation $\overline{R}X$ of $X$ are a pair of fuzzy sets on $U$ whose membership functions are, respectively,
$$\underline{R}X(x) = \inf_{y \in U} \max\{1 - R(x, y), X(y)\}, \qquad (4)$$
$$\overline{R}X(x) = \sup_{y \in U} \min\{R(x, y), X(y)\}. \qquad (5)$$
Definition 7. 
Let $FIS = (U, A, V, f)$ be a fuzzy information system. For any fuzzy sets $X, Y \in F(U)$, the fuzzy similarity degree $S$ between $X$ and $Y$ is defined as
$$S(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} = \frac{\sum_{x \in U} \min(X(x), Y(x))}{\sum_{x \in U} \max(X(x), Y(x))}, \qquad (6)$$
where $|\cdot|$ denotes the cardinality of a fuzzy set.
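A compact NumPy rendering of Equations (4) and (6) is given below (a sketch under the cardinality-as-sum-of-memberships convention used above; the function names are ours). Applied to the similarity matrix $M_{R_C}$ and the target set $X$ of Example 3 in Section 4.2, `fuzzy_lower` reproduces $\underline{R_C}X = (0.3, 0.3, 0, 0, 1)$.

```python
import numpy as np

def fuzzy_lower(R, X):
    """Equation (4): R̲X(x) = inf_y max{1 - R(x, y), X(y)} for an n x n fuzzy
    similarity matrix R and a fuzzy set X given as a length-n membership vector."""
    return np.min(np.maximum(1.0 - R, np.asarray(X)[np.newaxis, :]), axis=1)

def fuzzy_similarity(X, Y):
    """Equation (6): S(X, Y) = sum(min(X, Y)) / sum(max(X, Y))."""
    X, Y = np.asarray(X), np.asarray(Y)
    return float(np.minimum(X, Y).sum() / np.maximum(X, Y).sum())
```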
Definition 8. 
Let $FIS = (U, A, V, f)$ be a fuzzy information system. For any $B \subseteq A$, $R_B$ is the fuzzy similarity relation on $U$ induced by $B$, $[x_i]_B$ is the fuzzy similarity class of $x_i$ under $B$, and $F(U_B)$ is the power set of all similarity classes under $B$. Given a fuzzy target set $X \in F(U)$, for any $K \in F(U_B)$, if there exists $T \in F(U_B)$, denoted as $T = \{[x_{i_1}]_B, \ldots, [x_{i_t}]_B\}$ with $t \le |U|$, such that $S(\cup T, X) \ge S(\cup K, X)$, then $T$ is called the fuzzy optimal approximation set of $X$, where $\cup T = [x_{i_1}]_B \cup \cdots \cup [x_{i_t}]_B$.
Similarly to the previous section, for a set $T$ of fuzzy sets, we denote $\cup T$ as $T_{union}$.
Theorem 4. 
Let $FIS = (U, A, V, f)$ be a fuzzy information system, where $R$ is the fuzzy equivalence relation on $U$. For $X \in F(U)$, if $C_X = \{[x_i]_R \mid |[x_i]_R \cap X| \,/\, |[x_i]_R \setminus ([x_i]_R \cap X)| \ge \alpha\}$ with $\alpha = |\underline{R}X| / |X|$, then $C_{X_{union}} \cup \underline{R}X$ is the fuzzy optimal approximation set of $X$.
Proof of Theorem 4. 
Similar to Theorem 1. □

4.2. Improved Fuzzy Optimal Approximation Set Model

In this subsection, we propose an enhanced algorithm for optimal approximation sets and illustrate the algorithm using an example.
To ensure that attributes measured on different scales are processed consistently, we adopt the maximum–minimum normalization method [40] for the numerical data in the dataset, which transforms the data into the interval $[0, 1]$. The calculation formula is
$$F(f(x_i, c_k)) = \frac{f(x_i, c_k) - \min_{c_k}}{\max_{c_k} - \min_{c_k}}, \qquad (7)$$
where $\max_{c_k}$ and $\min_{c_k}$ are the maximum and minimum values of attribute $c_k$. This paper employs a kernel function capable of handling mixed data types simultaneously. The specific function is presented below:
$$r_{ij}^{c_k} = \begin{cases} 1, & f(x_i, c_k) = f(x_j, c_k) \text{ and } c_k \text{ is nominal}; \\ 0, & f(x_i, c_k) \ne f(x_j, c_k) \text{ and } c_k \text{ is nominal}; \\ 0, & |f(x_i, c_k) - f(x_j, c_k)| > \varepsilon_{c_k} \text{ and } c_k \text{ is numerical}; \\ 1 - |f(x_i, c_k) - f(x_j, c_k)|, & |f(x_i, c_k) - f(x_j, c_k)| \le \varepsilon_{c_k} \text{ and } c_k \text{ is numerical}, \end{cases} \qquad (8)$$
where $r_{ij}^{c_k}$ represents the fuzzy similarity between objects $x_i$ and $x_j$ under attribute $c_k$, and $\varepsilon_{c_k}$ is the fuzzy similarity radius of the attribute, with the specific calculation formula
$$\varepsilon_{c_k} = \frac{std(c_k)}{\lambda}, \qquad (9)$$
where $std(c_k)$ is the standard deviation of attribute $c_k$ (after normalization), and the preset parameter $\lambda$ is used to adjust the fuzzy similarity radius.
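The sketch below builds the per-attribute fuzzy similarity matrix following Equations (7)–(9); the division by $\lambda$ in the radius is our reading of Equation (9), and the function name is ours. The matrix for a whole attribute set is then the element-wise minimum of the per-attribute matrices, as used in Example 3 below.

```python
import numpy as np

def similarity_matrix(values, nominal, lam=1.0):
    """Fuzzy similarity matrix of one attribute: nominal values match exactly,
    numerical values are min-max normalized (Equation (7)) and compared with the
    kernel of Equation (8) using the radius eps = std / lam (Equation (9))."""
    if nominal:
        v = np.asarray(values)
        return (v[:, None] == v[None, :]).astype(float)
    v = np.asarray(values, dtype=float)
    rng = v.max() - v.min()
    v = (v - v.min()) / rng if rng > 0 else np.zeros(len(v))
    eps = v.std() / lam                      # population standard deviation
    d = np.abs(v[:, None] - v[None, :])
    return np.where(d > eps, 0.0, 1.0 - d)

# similarity_matrix([0.2, 0.3, 0.7, 0.7, 0.8], nominal=False) reproduces the
# matrix M_{R_{c2}} of Example 3; np.minimum over several such matrices gives M_{R_C}.
```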
In order to compute the optimal approximation set of a fuzzy set, similar to IOAS, this paper proposes a fuzzy improved optimal approximation set algorithm (FIOAS) as Algorithm 2.
Algorithm 2: FIOAS
[Algorithm 2 (FIOAS) is given as a pseudocode figure in the original article.]
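Since Algorithm 2 itself is only available as a figure, the following is a simplified greedy sketch of the idea behind FIOAS, not the authors' exact pseudocode: starting from the fuzzy lower approximation, it keeps adding the fuzzy similarity class that most increases the similarity of Equation (6) to the target set and stops when no candidate improves it.

```python
import numpy as np

def fuzzy_optimal_approximation(classes, X):
    """Greedy sketch: grow the approximation from the fuzzy lower approximation
    (Equation (4)) by the class that yields the largest similarity gain."""
    X = np.asarray(X, dtype=float)
    S = lambda A, B: np.minimum(A, B).sum() / np.maximum(A, B).sum()
    R = np.vstack(classes)
    opt = np.min(np.maximum(1.0 - R, X[None, :]), axis=1)    # lower approximation
    best_s, remaining = S(opt, X), [np.asarray(A, dtype=float) for A in classes]
    while remaining:
        s, i = max((S(np.maximum(opt, A), X), i) for i, A in enumerate(remaining))
        if s <= best_s:
            break
        opt, best_s = np.maximum(opt, remaining.pop(i)), s
    return opt, best_s

# On the data of Example 3 below (rows of M_{R_C} as classes, X as the target),
# this returns ({1, 0.8333, 0, 0, 1}, 0.7765), matching the set computed there,
# although the order in which classes are absorbed may differ from FIOAS.
```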
Next, we provide an example to illustrate the effectiveness of FIOAS in computing the optimal approximation set of the target set X.
Example 3. 
Let $FIS = (U, C, V, f)$ be a fuzzy information system, as shown in Table 2. Among the attributes, $c_1$ is nominal, while $c_2$ and $c_3$ are numerical. Given a fuzzy target set $X = \{\frac{0.9}{x_1}, \frac{0.3}{x_2}, \frac{0}{x_3}, \frac{0}{x_4}, \frac{1}{x_5}\}$, we calculate the optimal approximation set of $X$ using FIOAS as follows. First, we normalize the data using min–max normalization (Equation (7)). Next, we calculate the fuzzy similarity radii using Equation (9) with $\lambda = 1$, resulting in $\varepsilon_{c_2} \approx 0.4028$ and $\varepsilon_{c_3} \approx 0.4437$. Finally, we set $D = \emptyset$.
Thus, the fuzzy similarity matrices for each $c_k \in C$ are obtained using Equation (8) as follows:
$$M_{R_{c_1}} = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad M_{R_{c_2}} = \begin{pmatrix} 1 & 0.8333 & 0 & 0 & 0 \\ 0.8333 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0.8333 \\ 0 & 0 & 1 & 1 & 0.8333 \\ 0 & 0 & 0.8333 & 0.8333 & 1 \end{pmatrix},$$
$$M_{R_{c_3}} = \begin{pmatrix} 1 & 0.8947 & 0 & 0 & 0 \\ 0.8947 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0.9895 & 0.9368 \\ 0 & 0 & 0.9895 & 1 & 0.9263 \\ 0 & 0 & 0.9368 & 0.9263 & 1 \end{pmatrix}.$$
Therefore, the fuzzy similarity matrix of $R_C$ can be derived as follows:
$$M_{R_C} = M_{R_{c_1}} \cap M_{R_{c_2}} \cap M_{R_{c_3}} = \begin{pmatrix} 1 & 0.8333 & 0 & 0 & 0 \\ 0.8333 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0.9895 & 0 \\ 0 & 0 & 0.9895 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
From the matrix $M_{R_C}$, we can deduce the following results:
$[x_1]_C = M_{R_C}(1,:) = \{\frac{1}{x_1}, \frac{0.8333}{x_2}, \frac{0}{x_3}, \frac{0}{x_4}, \frac{0}{x_5}\}$,
$[x_2]_C = M_{R_C}(2,:) = \{\frac{0.8333}{x_1}, \frac{1}{x_2}, \frac{0}{x_3}, \frac{0}{x_4}, \frac{0}{x_5}\}$,
$[x_3]_C = M_{R_C}(3,:) = \{\frac{0}{x_1}, \frac{0}{x_2}, \frac{1}{x_3}, \frac{0.9895}{x_4}, \frac{0}{x_5}\}$,
$[x_4]_C = M_{R_C}(4,:) = \{\frac{0}{x_1}, \frac{0}{x_2}, \frac{0.9895}{x_3}, \frac{1}{x_4}, \frac{0}{x_5}\}$,
$[x_5]_C = M_{R_C}(5,:) = \{\frac{0}{x_1}, \frac{0}{x_2}, \frac{0}{x_3}, \frac{0}{x_4}, \frac{1}{x_5}\}$.
Using Equation (4) from Definition 6, we can derive the lower approximation of $X$, which can be represented as a fuzzy set:
$$\underline{R_C}X = \{\tfrac{0.3}{x_1}, \tfrac{0.3}{x_2}, \tfrac{0}{x_3}, \tfrac{0}{x_4}, \tfrac{1}{x_5}\}.$$
Since $\alpha = \frac{|\underline{R_C}X|}{|X|} = \frac{0.3 + 0.3 + 0 + 0 + 1}{0.9 + 0.3 + 0 + 0 + 1} = 0.7273$, we can determine that $C_X = \{[x_1]_C, [x_2]_C, [x_5]_C\}$ by utilizing Theorem 4. Then, according to step 4 in FIOAS, we obtain $[x]_R = [x_5]_C$. Based on steps 5–8, we can ascertain $D = \{[x_5]_C\}$ and $C_X = \{[x_1]_C\}$, which implies that FIOAS excludes $[x_2]_C$ and $[x_5]_C$ from $C_X$ in this iteration. Furthermore, in the next iteration, we obtain $D = \{[x_5]_C, [x_1]_C\}$ and $C_X = \emptyset$. Finally, the optimal approximation set of $X$ can be derived as follows:
$$Opt_C(X) = D_{union} \cup \underline{R_C}X = [x_5]_C \cup [x_1]_C \cup \{\tfrac{0.3}{x_1}, \tfrac{0.3}{x_2}, \tfrac{0}{x_3}, \tfrac{0}{x_4}, \tfrac{1}{x_5}\} = \{\tfrac{1}{x_1}, \tfrac{0.8333}{x_2}, \tfrac{0}{x_3}, \tfrac{0}{x_4}, \tfrac{1}{x_5}\},$$
and the similarity can be calculated as $S(Opt_C(X), X) = \frac{|Opt_C(X) \cap X|}{|Opt_C(X) \cup X|} = 0.7765$.
However, for $C_{X_{union}} \cup \underline{R_C}X = \{\frac{1}{x_1}, \frac{1}{x_2}, \frac{0}{x_3}, \frac{0}{x_4}, \frac{1}{x_5}\}$, the similarity to the target set $X$ is only $S(C_{X_{union}} \cup \underline{R_C}X, X) = 0.7333$. This demonstrates the effectiveness of FIOAS in computing the optimal approximation set of $X$ on fuzzy sets.
Table 2. Mixed Data Table.

U      c1    c2     c3
x1     1     0.2    12.1
x2     1     0.3    11.1
x3     2     0.7    3.2
x4     2     0.7    3.3
x5     3     0.8    2.6

5. Unsupervised Algorithm Based on Fuzzy Optimal Approximation Set

In this section, we propose a fuzzy optimal approximation-based unsupervised attribute reduction algorithm for mixed data (abbreviated as FOUAR). Subsequently, we analyze the algorithmic complexity of FOUAR. Finally, an example is presented to illustrate the computation process of FOUAR, with a comparative analysis of the reduction results alongside those of FRUAR.

5.1. Relative Definitions and Model

In the following, we utilize the optimal approximation set model to define the attribute importance function and attribute relevance function.
Definition 9. 
Let $FIS = (U, C, V, f)$ be a fuzzy information system, where $R$ is the fuzzy equivalence relation on $U$. For any $P, Q \subseteq C$, let $[x_1]_Q, \ldots, [x_n]_Q$ be all the fuzzy similarity classes under $Q$, and $[x_1]_P, \ldots, [x_n]_P$ be all the fuzzy similarity classes under $P$. Then, the optimal approximation set of $[x_i]_Q$ relative to $P$ is denoted as $Opt_P([x_i]_Q)$, and the fuzzy optimal approximation dependency of $P$ to $[x_i]_Q$ is defined as
$$\gamma_P([x_i]_Q) = \frac{|Opt_P([x_i]_Q) \cap [x_i]_Q|}{|Opt_P([x_i]_Q) \cup [x_i]_Q|}.$$
Then, the fuzzy optimal approximation dependency of $P$ relative to $Q$ can be defined as
$$\gamma_P(Q) = \frac{1}{|U|} \sum_{x_i \in U} \gamma_P([x_i]_Q) = \frac{1}{|U|} \sum_{x_i \in U} \frac{|Opt_P([x_i]_Q) \cap [x_i]_Q|}{|Opt_P([x_i]_Q) \cup [x_i]_Q|}.$$
Definition 10. 
Let $FIS = (U, C, V, f)$ be a fuzzy information system, where $R$ is the fuzzy equivalence relation on $U$. For any $B \subseteq C$, the relevance of $B$ with respect to all single-attribute subsets is defined as
$$Rel_B(C) = \frac{1}{|C|} \sum_{c_i \in C} \gamma_B(\{c_i\}).$$
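Definitions 9 and 10 translate directly into the short sketch below (function names are ours; `opt_fn` stands for any routine returning the fuzzy optimal approximation of a target class, e.g., `lambda classes, t: fuzzy_optimal_approximation(classes, t)[0]` using the sketch from Section 4):

```python
import numpy as np

def dependency(opt_sets, target_classes):
    """Definition 9: gamma_P(Q) = (1/|U|) * sum_i S(Opt_P([x_i]_Q), [x_i]_Q),
    where opt_sets[i] is the fuzzy optimal approximation of [x_i]_Q under P."""
    s = lambda A, B: np.minimum(A, B).sum() / np.maximum(A, B).sum()
    return float(np.mean([s(o, t) for o, t in zip(opt_sets, target_classes)]))

def relevance(opt_fn, B_classes, attribute_classes):
    """Definition 10: Rel_B(C) = (1/|C|) * sum_k gamma_B({c_k}).
    B_classes[i] is [x_i]_B; attribute_classes[k][i] is [x_i]_{c_k}."""
    gammas = []
    for classes_ck in attribute_classes:                 # one entry per attribute c_k
        opts = [opt_fn(B_classes, t) for t in classes_ck]
        gammas.append(dependency(opts, classes_ck))
    return float(np.mean(gammas))
```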
Ultimately, we define reduction based on the importance function and relevance function.
Definition 11. 
Let $FIS = (U, C, V, f)$ be a fuzzy information system, where $R$ is the fuzzy equivalence relation on $U$. For any $B \subseteq C$, we say that $B$ is a reduction of $C$ if $B$ satisfies
  • $Rel_B(C) = Rel_C(C)$;
  • $\forall b \in B$, $Rel_{B \setminus \{b\}}(C) < Rel_B(C)$.
According to the aforementioned theorems and definitions, an unsupervised attribute reduction algorithm for mixed data based on fuzzy optimal approximation sets is proposed in this paper as Algorithm 3.

5.2. Complexity Analysis

In this section, we will analyze the complexity of the FOUAR algorithm.
The time complexity of the FOUAR algorithm can be analyzed as follows. In the first for-loop (steps 2–4), we calculate the similarity matrix $M_{c_k}$ for each attribute $c_k$, which takes $O(mn^2)$ time. In steps 5–26, the while loop iterates $h$ times, where $h$ is the size of the current candidate attribute set $B$. Then, in steps 7–17, for each attribute $c_i \in C$, we calculate $\alpha([x_j]_{c_i})$ and $\gamma_{R \cup c_l}([x_j]_{c_i})$ for all $x_j \in U$, which takes $O(mn)$ time. Finally, in steps 18 and 19, we calculate the relevance degree $Rel_{R \cup c_l}(C)$, which takes $O(h)$ time. Therefore, the time complexity of the FOUAR algorithm can be expressed as $O(h(mn^2 + mn + h)) = O(m^2 n^2)$.
Compared to traditional unsupervised attribute reduction algorithms, our method tends to be more time-consuming overall. This is because our algorithm spends more time computing the optimal approximate sets for each fuzzy partition of an attribute, which is more resource-intensive than computing the lower approximations used in traditional methods.
Algorithm 3: FOUAR
[Algorithm 3 (FOUAR) is given as a pseudocode figure in the original article.]
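Because Algorithm 3 is likewise only reproduced as a figure, the sketch below shows the greedy forward-selection loop as we read it from Example 4: in each round, the unused attribute whose addition yields the largest relevance gain $Sig = Rel_{R \cup \{c\}}(C) - Rel_R(C)$ is added, and the loop stops once no candidate has a positive gain. It reuses `relevance` and an `opt_fn` from the sketches above; the names are ours.

```python
import numpy as np

def fouar(attribute_matrices, opt_fn):
    """Greedy forward selection over attributes, following Example 4.
    attribute_matrices[k] is the fuzzy similarity matrix of attribute c_k."""
    m = len(attribute_matrices)
    n = attribute_matrices[0].shape[0]
    all_classes = [[M[i, :] for i in range(n)] for M in attribute_matrices]
    selected, rel_R = [], 0.0
    while len(selected) < m:
        best_k, best_rel = None, rel_R
        for k in (k for k in range(m) if k not in selected):
            # fuzzy similarity matrix of R ∪ {c_k}: element-wise minimum
            M = np.minimum.reduce([attribute_matrices[j] for j in selected + [k]])
            rel = relevance(opt_fn, [M[i, :] for i in range(n)], all_classes)
            if rel > best_rel:
                best_k, best_rel = k, rel
        if best_k is None:       # every remaining Sig_{c_k} <= 0: stop
            break
        selected.append(best_k)
        rel_R = best_rel
    return selected
```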

5.3. Specific Example

Example 4. 
According to Algorithm 3, we apply FOUAR to reduce the data in Table 3. In this table, attributes $c_3$ and $c_4$ represent the same quantity measured on two temperature scales (Celsius and Fahrenheit, respectively). Thus, $c_3$ and $c_4$ are considered redundant attributes.
Let $R$ denote the set of selected attributes (the reduct), initialized as $R = \emptyset$, and let $B = C \setminus R$.
First, we use the max–min normalization method to standardize the numerical data in Table 3 and calculate the fuzzy similarity matrix corresponding to each attribute, denoted as $M_{c_1}, M_{c_2}, M_{c_3}, M_{c_4}, M_{c_5}$. Then, to calculate the optimal approximation sets of all fuzzy sets $[x_i]_R$, we first compute
$$\alpha_{R_{c_1}}([x_1]_{c_1}) = |\underline{R_{c_1}}[x_1]_{c_1}| \, / \, |M_{c_1}(1,:)| = 1,$$
and $\alpha_{R_{c_1}}([x_2]_{c_1}), \ldots, \alpha_{R_{c_1}}([x_{18}]_{c_1})$ are likewise calculated to be 1. Following the same procedure, we obtain
$$\alpha_{R_{c_1}}([x_1]_{c_2}) = |\underline{R_{c_1}}[x_1]_{c_2}| \, / \, |M_{c_2}(1,:)| = 0.9268,$$
$$\alpha_{R_{c_1}}([x_2]_{c_2}) = |\underline{R_{c_1}}[x_2]_{c_2}| \, / \, |M_{c_2}(2,:)| = 0,$$
$$\cdots$$
$$\alpha_{R_{c_1}}([x_{17}]_{c_2}) = |\underline{R_{c_1}}[x_{17}]_{c_2}| \, / \, |M_{c_2}(17,:)| = 0.9268,$$
$$\alpha_{R_{c_1}}([x_{18}]_{c_2}) = |\underline{R_{c_1}}[x_{18}]_{c_2}| \, / \, |M_{c_2}(18,:)| = 0.9268.$$
Similarly, we can obtain $\alpha_{R_{c_1}}([x_i]_{c_3})$, $\alpha_{R_{c_1}}([x_i]_{c_4})$, and $\alpha_{R_{c_1}}([x_i]_{c_5})$ for $i = 1, \ldots, 18$.
Using Algorithm 2, we can obtain the optimal approximation set O p t R c 1 ( [ x i ] c j ) for each fuzzy set [ x i ] c j under each attribute c j , and then use the obtained optimal approximation sets to calculate the degree of dependency of each attribute.
By calculating the optimal approximation dependencies, we have
$$\gamma_{\{c_1\}}(c_1) = \frac{1}{|U|} \sum_{x_i \in U} \frac{|Opt_{c_1}([x_i]_{c_1}) \cap [x_i]_{c_1}|}{|Opt_{c_1}([x_i]_{c_1}) \cup [x_i]_{c_1}|} = 1,$$
$$\gamma_{\{c_1\}}(c_2) = 0.9, \quad \gamma_{\{c_1\}}(c_3) = 0.5177, \quad \gamma_{\{c_1\}}(c_4) = 0.5031, \quad \gamma_{\{c_1\}}(c_5) = 0.6318.$$
Thus, the relevance degree can be calculated as
$$Rel_{\{c_1\}}(C) = \frac{1}{|C|} \sum_{c_i \in C} \gamma_{\{c_1\}}(c_i) = 0.7105.$$
Similarly, we can obtain $Rel_{\{c_2\}}(C) = 0.6798$, $Rel_{\{c_3\}}(C) = 0.7064$, $Rel_{\{c_4\}}(C) = 0.7017$, and $Rel_{\{c_5\}}(C) = 0.6603$. Therefore, $Sig_{c_1}$ can be obtained as follows:
$$Sig_{c_1} = Rel_{R \cup c_1}(C) - Rel_R(C) = 0.7105 - 0 = 0.7105 > 0.$$
Hence, $c_1$ is the attribute corresponding to the maximum attribute relevance and is added to $R$. At this point, $R = \{c_1\}$.
Similarly, after recalculating $\alpha_{R \cup c_l}([x_i]_{c_j})$ for $l = 1, \ldots, |C \setminus R|$, we use FIOAS to obtain the optimal approximation set $Opt_{R \cup c_l}([x_i]_{c_j})$ of each fuzzy class $[x_i]_{c_j}$ under $R \cup \{c_l\}$. Then, we recalculate the dependency degrees, which yields
$$\gamma_{R \cup \{c_2\}}(c_1) = \frac{1}{|U|} \sum_{x_i \in U} \frac{|Opt_{R \cup c_2}([x_i]_{c_1}) \cap [x_i]_{c_1}|}{|Opt_{R \cup c_2}([x_i]_{c_1}) \cup [x_i]_{c_1}|} = 1,$$
$$\gamma_{R \cup \{c_2\}}(c_2) = 1, \quad \gamma_{R \cup \{c_2\}}(c_3) = 0.5619, \quad \gamma_{R \cup \{c_2\}}(c_4) = 0.5562, \quad \gamma_{R \cup \{c_2\}}(c_5) = 0.7107.$$
Then, the relevance is calculated as $Rel_{R \cup c_2}(C) = 0.7658$. We can similarly obtain $Rel_{R \cup c_3}(C) = 0.9133$, $Rel_{R \cup c_4}(C) = 0.9172$, and $Rel_{R \cup c_5}(C) = 0.8374$. Since $Sig_{c_4} = Rel_{R \cup c_4}(C) - Rel_R(C) = 0.2067 > 0$, we select the attribute $c_4$ corresponding to the maximum $Rel$ and obtain $R = \{c_1, c_4\}$.
Similarly, further iteration leads to $R = \{c_1, c_4, c_5\}$. When the final iteration is performed, we have $Sig_{c_2} = Rel_{R \cup c_2}(C) - Rel_R(C) = 0$ and $Sig_{c_3} = Rel_{R \cup c_3}(C) - Rel_R(C) = -0.0029$.
Table 3. Mixed Data Table.

U      c1   c2   c3     c4      c5
x1     A    2    38.0   100.4   2
x2     A    1    36.2   97.2    1
x3     A    1    36.1   97.0    1
x4     A    1    36.3   97.3    1
x5     A    1    36.4   97.5    1
x6     B    2    39.2   102.6   1
x7     B    2    39.3   102.6   1
x8     B    2    38.4   101.1   1
x9     B    2    39.2   102.6   1
x10    B    2    38.1   100.6   1
x11    B    2    36.1   96.9    1
x12    B    2    36.4   97.5    1
x13    B    2    36.3   97.3    1
x14    B    2    36.3   97.3    1
x15    B    2    36.7   98.0    5
x16    B    2    36.2   97.2    5
x17    B    2    36.3   97.9    5
x18    C    2    36.1   100.5   85
Based on steps 21 to 25 in Algorithm 3, the iteration is stopped since all $Sig_{c_i} \le 0$, and we obtain $R = \{c_1, c_4, c_5\}$. However, using the unsupervised positive region reduction method FRUAR [20], we obtain the reduction $R = \{c_1, c_4, c_5, c_3\}$. It can be observed that the reduction method based on the lower approximation is too strict when dealing with the relationships between data, which amplifies the small differences between similar attributes such as $c_3$ and $c_4$, leading to the selection of the redundant attribute $c_3$ in FRUAR. In contrast, FOUAR adopts a looser description of the relationships between attributes, the optimal approximation set, making the algorithm more tolerant to differences between attributes during attribute reduction and ultimately achieving effective removal of redundant attributes.

6. Experiments and Analyses

In this section, we present a detailed evaluation of FOUAR's performance under different learning algorithms, and we additionally conduct validation experiments for Algorithm 1. The evaluation experiments for FOUAR are divided into two categories of learning tasks: classification and clustering. We select 32 datasets from the UCI repository [41] for experimentation, including 22 datasets for the learning-task experiments and 10 datasets for the Algorithm 1 validation experiments. The datasets used for the learning tasks and for the evaluation of IOAS are described in Table 4 and Table 5, respectively. It is worth noting that missing data in the datasets are supplemented using maximum probability estimation, where each missing value of an attribute is replaced by the most frequent value of that attribute.
The experiments are run on a computer with an AMD Ryzen 7 5800H CPU @ 3.20 GHz and 16 GB of memory. The downstream classification task experiments are conducted in Matlab 2021a, while the Algorithm 1 validation experiments are conducted in Python 3.7.

6.1. Experimental Preparation

In the learning task experiments of FOUAR for evaluation, the classification and regression tree (CART) algorithm and K-nearest neighborhood (KNN) are used as learning algorithms for the classification experiments. FOUAR is applied to reduce the dataset, and the reduced dataset is then used to train classifiers and evaluate the model’s accuracy. In the clustering experiment, the K-means clustering algorithm is utilized as the learning algorithm, and clustering accuracy (ACC) is used as the evaluation metric for clustering results [42]. Both experiments involve 10-fold cross-validation, with each experiment repeated 10 times. The mean and standard deviation of the results from the 10 experiments are reported as the final experimental results.
In this study, we compare FOUAR with five other algorithms, including FRUAR [20], FSFS (feature similarity-based feature selection) [43], USQR (unsupervised fast attribute reduction) [44], UEBR (unsupervised entropy-based reduction) [45], and UFRFS (unsupervised fuzzy rough set feature selection) [32]. Since UEBR and USQR are based on classical rough set theory and cannot handle numerical data, data discretization is required for dealing with numerical data. The FCM (fuzzy C-means) discretization method has proven effective in experiments [46]; therefore, in this study, we use FCM [47] to discretize the data into four categories. To save computation time, the optimal approximation sets in these experiments are computed based on Theorem 1.
In the parameter experiment, we analyze and discuss the number of reduced attributes and the accuracy curves of the learning tasks under different values of λ . This analysis aims to showcase FOUAR’s exceptional performance in reduction and stability with respect to the parameter λ . It is important to note that the range of λ adopted in this experiment is an interval of [ 0.1 , 3 ] , with a step size of 0.1 .
In the verification experiment of Algorithm 1, we remove the numerical data from some of the mixed-type datasets in the UCI database since there are only a few purely categorical datasets available. This allows us to focus solely on the categorical data during the verification experiment.

6.2. Validation Experiment of IOAS

This subsection presents an experiment that compares the similarity before and after using the optimization algorithm. Similarity aims to quantify the degree of resemblance between a target set and its optimal approximation set (Definition 3). Higher similarity suggests that knowledge granules provide a more precise representation of the target set. The purpose of this paper is to demonstrate that the optimal approximation set obtained by the improved algorithm can express and characterize the target set more concisely and accurately. It should be noted that the attribute set with the maximum similarity improvement for each dataset is manually selected in this experiment.
Table 6 shows the similarity before optimization in the first column, the similarity after optimization in the second column, the percentage improvement of the optimal approximation set in the third column, and the compression ratio of the size of the optimal approximation set obtained by the improved algorithm relative to the original algorithm in the fourth column.
As shown in Table 6, for some attribute sets the similarity increases by up to a factor of four, while the size of the optimal approximation set is compressed by up to 50%. For the vote dataset, the similarity of the optimal approximation set to the target is increased from 0.39 to 0.91, and the size of the set is reduced by 40%. Moreover, the average similarity of IOAS is significantly higher than that of the compared algorithm involved in the experiment, and the average improvement in similarity is at least twice that of the compared algorithm.
The higher approximation accuracy of the IOAS algorithm for the target set is indeed supported by the results of Theorems 2 and 3, resulting in its superior performance over the original algorithm. Ultimately, the experimental analysis confirms that the improved algorithm effectively and efficiently characterizes the target set.

6.3. Classification Experiments

The classification experiment subsection presents two tables to display the experimental results. Table 7 shows the optimal subset of reduction results of FOUAR, while Table 8 and Table 9 show the accuracy of the two classifiers.
Based on the classification accuracy results presented in Table 8, it can be observed that FOUAR outperforms the other unsupervised attribute reduction algorithms, achieving the highest classification accuracy in 19 datasets. Specifically, compared to FRUAR, an unsupervised positive region reduction algorithm, FOUAR has higher accuracy in 21 datasets. Furthermore, when compared to UFRFS, UEBR, USQR, and FSFS, FOUAR demonstrates higher accuracy in 21, 21, 21, and 20 datasets, respectively. Finally, after comparing the average classification accuracy of the six algorithms across 22 datasets, it can be concluded that FOUAR has the highest classification accuracy among the other five algorithms.
By comparing the original reduction results, it is evident from Table 9 that FOUAR surpasses the other algorithms in terms of accuracy across 15 datasets. Furthermore, in a comparative analysis against FRUAR, UFRFS, UEBR, USQR, and FSFS, FOUAR exhibits superior accuracy in 17, 20, 22, and 20 datasets, respectively. A comparative examination of the average accuracies of the six algorithms in question employing KNN, substantiates that FOUAR boasts the highest mean accuracy.
The above experimental analysis leads to the conclusion that FOUAR exhibits superior performance when the learning task is a classification task compared to the other algorithms used in the experiment. This suggests that FOUAR is more suitable for handling classification experiments.

6.4. Clustering Experiment

In this subsection, we present an improved analysis of the clustering experiment results presented in Table 10 and Table 11. From the results shown in Table 11, it is observed that FOUAR achieves the highest clustering accuracy in 18 out of 22 datasets. This suggests that FOUAR can effectively reduce the number of attributes while preserving essential information for clustering tasks. Moreover, FOUAR outperforms FRUAR, an unsupervised positive region reduction algorithm, in 22 datasets, indicating that FOUAR provides better attribute reduction performance than FRUAR for clustering tasks.
Furthermore, FOUAR outperforms other unsupervised attribute reduction algorithms, such as UFRFS, UEBR, USQR, and FSFS in 21, 21, 21, and 21 datasets, respectively. These results suggest that FOUAR can generate more accurate and concise data representations for clustering tasks compared to these algorithms. Finally, FOUAR has the highest clustering accuracy mean compared to the other algorithms, which confirms its superiority in attribute reduction for clustering tasks.
The experimental results demonstrate that FOUAR is a highly effective feature selection algorithm for clustering tasks in unsupervised attribute reduction. FOUAR produces better clustering accuracy results than other unsupervised attribute reduction algorithms in most datasets. These results suggest that FOUAR has superior performance in attribute reduction and feature selection compared to other attribute reduction algorithms, making it more suitable as a feature selection algorithm for clustering tasks in unsupervised attribute reduction.

6.5. Reduction Experiment

In this section, we report a comparison experiment performed on the original attribute reduction quantities, specifically comparing the ability of the FOUAR algorithm and the traditional rough set attribute reduction algorithm (FRUAR) to eliminate redundant attributes under CART and K-means. In the figures, the blue bar represents the attribute reduction quantity of FOUAR, while the red bar represents the reduction quantity of the traditional attribute reduction model. The two horizontal lines in the figure represent the average reduction quantity of each algorithm.
As observed in Figure 1a, FOUAR outperforms the traditional reduction algorithm in eliminating redundant attributes in most datasets. Among the 20 datasets involved in the comparison, FOUAR demonstrates better reduction capabilities in 12 datasets compared to the traditional reduction algorithm. From Figure 1b, we observe that the number of reduced attributes by FOUAR is less than that of the traditional reduction algorithm in 11 datasets. Notably, in Figure 1a,b, the average reduction quantity of FOUAR is less than that of the traditional attribute reduction model.
From the figures, we can conclude that FOUAR has a stronger capability to remove redundant attributes compared to the traditional attribute reduction algorithm. Given that FOUAR generally performs better in downstream tasks, we deduce that FOUAR is more suitable for reduction experiments where the downstream tasks are classification and clustering.

6.6. Parameter Experiment

In this subsection, we conduct a parameter experiment analysis. In FOUAR, λ serves as a critical adjustable parameter, enabling the control of the granularity of fuzzy-rough data analysis. As a result, FOUAR can fine-tune the reduced attribute subset of the dataset by adjusting λ , allowing the reduced dataset to achieve good performance in various learning algorithms. We showcase the graphs of the accuracy of learning algorithms and the number of reduced attributes as functions of λ in Figure 2, highlighting FOUAR’s reduction capabilities and its ability to preserve data features. It should be noted that the number of reduced attributes equals the total attributes in the dataset minus the attributes retained after reduction.
From Figure 2, we observe that, in most datasets, the number of reduced attributes initially decreases rapidly as λ changes, then gradually increases until stabilizing, as exemplified in the Bands, Autos, Dermatology, Movement libras, Parkinsons, Sonar, and WDBC datasets. For the Annealing and German datasets, the number of reduced attributes remains stable or slowly increases, respectively. However, for the Credit Approve and Ionosphere datasets, the number of reduced attributes initially decreases and remains stable without further increases, while for the Heart dataset, the number of reduced attributes begins to gradually increase after a prolonged period of relatively low reduction. Based on this analysis of the number of reduced attributes, we find that FOUAR achieves more significant reduction effects when λ has a larger value. This is because a larger λ results in smaller granularity of fuzzy-rough data, leading to a more accurate approximation of the target set. Consequently, the dependencies between attributes can be better expressed.
For most datasets in Figure 2, the classification and clustering accuracies remain within a relatively stable range. However, for the Dermatology dataset, the classification and clustering accuracies are initially unstable, becoming stable only when λ exceeds 1.5 . Analyzing the changes in classification and clustering accuracies with respect to λ , we observe that FOUAR is not sensitive to changes in λ when processing most datasets, enabling FOUAR to stably maintain data characteristics.
In summary, the experimental analysis demonstrates that FOUAR exhibits outstanding performance in terms of reduction efficiency and preservation of data features.

6.7. Hypothesis Testing

In this section, we report hypothesis testing experiments on FOUAR using prediction accuracy and clustering accuracy data obtained from classification and clustering experiments. The Friedman test [48] and Nemenyi test [49] were used for this purpose.
Before conducting the Friedman test, we first rank the accuracies of the algorithms on each dataset in ascending order and assign the ranks $1, 2, \ldots, M$ to the sorted accuracies. If there are ties in accuracy, the tied values share the mean of the corresponding ranks.
Suppose we need to compare M algorithms on N datasets, where r i represents the average rank of all corresponding accuracies for each participating algorithm. The Friedman test can be calculated as:
$$\tau_{\chi^2} = \frac{12N}{M(M+1)} \left( \sum_{i=1}^{M} r_i^2 - \frac{M(M+1)^2}{4} \right).$$
However, the original Friedman test statistic is too conservative, so a new statistic, $\tau_F$, is introduced:
$$\tau_F = \frac{(N-1)\,\tau_{\chi^2}}{N(M-1) - \tau_{\chi^2}},$$
where $\tau_F$ follows an F-distribution with degrees of freedom $(M-1)$ and $(M-1)(N-1)$. The null hypothesis of the Friedman test is that 'there is no significant difference among all algorithms in terms of their performance on the datasets'. Thus, if the null hypothesis is rejected, it indicates significant differences among the algorithms. To further differentiate the differences between the algorithms, we conducted post hoc tests. The Nemenyi test, a widely used post hoc test, was adopted in this paper. Its core purpose is to compare the differences between algorithms against a critical difference; by calculating the critical difference, significant differences at the chosen confidence level can be identified. In the Nemenyi test, the critical difference (CD) is calculated as
$$CD_\alpha = q_\alpha \sqrt{\frac{M(M+1)}{6N}},$$
where $q_\alpha$ is the critical value obtained from the corresponding table for the significance level $\alpha$, which can be found in [49]. If the difference between the average ranks of two algorithms is greater than the critical difference, it indicates a significant difference between them.
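The following Python sketch computes the average ranks, the corrected Friedman statistic, and the Nemenyi critical difference for an accuracy table (the function name is ours; the critical value $q_\alpha$ is passed in because it must be read from a table such as the one in [49]):

```python
import numpy as np
from scipy.stats import rankdata

def friedman_nemenyi(acc, q_alpha):
    """acc: N x M array of accuracies (N datasets, M algorithms).
    Returns the average ranks, the corrected statistic tau_F, and the Nemenyi CD."""
    N, M = acc.shape
    ranks = np.apply_along_axis(rankdata, 1, acc)   # ascending ranks, ties averaged
    r = ranks.mean(axis=0)                          # average rank of each algorithm
    tau_chi2 = 12 * N / (M * (M + 1)) * (np.sum(r ** 2) - M * (M + 1) ** 2 / 4)
    tau_F = (N - 1) * tau_chi2 / (N * (M - 1) - tau_chi2)
    cd = q_alpha * np.sqrt(M * (M + 1) / (6 * N))
    return r, tau_F, cd
```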
In hypothesis-testing experiments, we obtained M = 6 and N = 22 in the Friedman test with a significance level of α = 0.1 . The degrees of freedom for τ F were 5 and 105. We obtained a critical value of 1.9029 for each learning algorithm in the Friedman test, and, if τ F was greater than the critical value, we rejected the null hypothesis of no significant difference between the algorithms.
In the case of CART, τ F = 19.6792 , while for KNN and K-means, τ F = 13.4193 and τ F = 12.1507 , respectively. As all of the values exceed the critical value, the null hypothesis was rejected for all algorithms in the Friedman test, indicating a significant difference among them. Further hypothesis testing was performed using the Nemenyi test, and the results are presented in Figure 3.
Based on the results shown in Figure 3a, it is apparent that FOUAR exhibits significant differences when compared to most of the other reduction algorithms. In Figure 3c, significant differences were observed between FOUAR and FRUAR, USQR, UEBR, FSFS, and UFRFS. Additionally, FOUAR had the highest average rank among all the reduction algorithms in (a), (b) and (c). These findings suggest that FOUAR is a superior attribute reduction algorithm in comparison to the other methods tested, making it a suitable choice for learning tasks such as classification and clustering.

7. Conclusions

This paper proposes an unsupervised attribute reduction algorithm for mixed data based on an improved optimal approximation set. First, we propose the theory of an improved optimal approximation set along with its corresponding algorithm, named IOAS. Then, we broaden the classical optimal approximation set theory to encompass fuzzy set theory, leading to the creation of a fuzzy improved optimal approximation set algorithm (FIOAS). Lastly, in order to leverage the information of the upper approximation set, we propose an attribute significance function and a relevance function based on the fuzzy optimal approximation set, culminating in the development of an unsupervised attribute reduction algorithm (FOUAR).
To evaluate the effectiveness of FOUAR and IOAS, 22 datasets from the UCI database were selected for experiments. The results indicate that the improved algorithm for computing the optimal approximation set achieves higher approximation accuracy and smaller approximation set volume. Furthermore, FOUAR maintains or even improves the prediction and clustering accuracy of the reduced dataset and outperforms existing unsupervised attribute reduction algorithms.
In future work, the proposed fuzzy optimal approximation-based reduction algorithm will be extended to supervised positive region reduction algorithms. Additionally, algorithms will be designed to obtain globally optimal reduced attributes.

Author Contributions

Methodology, H.W.; Writing—original draft, H.W.; Writing—review and editing, H.W., S.Z. and M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 72101082), and the Natural Science Foundation of Hebei Province (No. A2020208004). The APC was funded by the National Natural Science Foundation of China (No. 72101082), and the Natural Science Foundation of Hebei Province (No. A2020208004).

Data Availability Statement

The datasets supporting the reported results are available from the publicly archived UCI Machine Learning Repository: http://archive.ics.uci.edu/ml (accessed on 30 March 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Pawlak, Z. Rough sets. Int. J. Parallel. Prog. 1982, 11, 341–356.
2. Ma, X.-A.; Xu, H.; Ju, C. Class-specific feature selection via maximal dynamic correlation change and minimal redundancy. Expert Syst. Appl. 2023, 229, 120455.
3. Zhang, X.; Yao, Y. Tri-level attribute reduction in rough set theory. Expert Syst. Appl. 2022, 190, 116187.
4. Yao, Y.; Zhang, X. Class-specific attribute reducts in rough set theory. Inform. Sci. 2017, 418–419, 601–618.
5. Dong, L.J.; Chen, D.G.; Wang, N.; Lu, Z.H. Key energy-consumption feature selection of thermal power systems based on robust attribute reduction with rough sets. Inf. Sci. 2020, 532, 61–71.
6. Zhang, P.F.; Li, T.R.; Wang, G.Q.; Luo, C.; Chen, H.M.; Zhang, J.B.; Wang, D.X.; Yu, Z. Multi-source information fusion based on rough set theory: A review. Inform. Fusion 2021, 68, 85–117.
7. Zhang, X.Y.; Yao, H.; Lv, Z.Y.; Miao, D.Q. Class-specific information measures and attribute reducts for hierarchy and systematicness. Inf. Sci. 2021, 563, 196–225.
8. Lashin, M.M.A.; Khan, M.I.; Khedher, N.B.; Eldin, S.M. Optimization of Display Window Design for Females' Clothes for Fashion Stores through Artificial Intelligence and Fuzzy System. Appl. Sci. 2022, 12, 11594.
9. Kouatli, I. The Use of Fuzzy Logic as Augmentation to Quantitative Analysis to Unleash Knowledge of Participants' Uncertainty When Filling a Survey: Case of Cloud Computing. IEEE Trans. Knowl. Data Eng. 2022, 34, 1489–1500.
10. Dubois, D.; Prade, H. Rough fuzzy sets and fuzzy rough sets. Int. J. Gen. Syst. 1990, 17, 191–209.
11. Dubois, D.; Prade, H. Putting rough sets and fuzzy sets together. In Intelligent Decision Support; Springer: Berlin/Heidelberg, Germany, 1992; pp. 203–232.
12. Sun, B.Z.; Ma, W.M.; Qian, Y.H. Multigranulation fuzzy rough set over two universes and its application to decision making. Knowl.-Based Syst. 2017, 123, 61–74.
13. Morsi, N.N.; Yakout, M.M. Axiomatics for fuzzy rough sets. Fuzzy Sets Syst. 1998, 100, 327–342.
14. Moser, B. On the T-transitivity of kernels. Fuzzy Sets Syst. 2006, 157, 1787–1796.
15. Jensen, R.; Shen, Q. Fuzzy–rough attribute reduction with application to web categorization. Fuzzy Sets Syst. 2004, 141, 469–485.
16. Hu, Q.H.; Xie, Z.X.; Yu, D.R. Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation. Pattern Recogn. 2007, 40, 3509–3521.
17. Wang, C.Z.; Huang, Y.; Shao, M.W.; Fan, X.D. Fuzzy rough set-based attribute reduction using distance measures. Knowl.-Based Syst. 2019, 164, 205–212.
18. Ganivada, A.; Ray, S.S.; Pal, S.K. Fuzzy rough sets, and a granular neural network for unsupervised feature selection. Neural Netw. 2013, 48, 91–108.
19. Mac Parthaláin, N.; Jensen, R. Unsupervised fuzzy-rough set-based dimensionality reduction. Inf. Sci. 2013, 229, 106–121.
20. Yuan, Z.; Chen, H.; Li, T.; Yu, Z.; Sang, B.; Luo, C. Unsupervised attribute reduction for mixed data based on fuzzy rough sets. Inf. Sci. 2021, 572, 67–87.
21. Hu, M.; Guo, Y.; Chen, D.; Tsang, E.C.C.; Zhang, Q. Attribute reduction based on neighborhood constrained fuzzy rough sets. Knowl.-Based Syst. 2023, 274, 110632.
22. Dai, J.; Wang, Z.; Huang, W. Interval-valued fuzzy discernibility pair approach for attribute reduction in incomplete interval-valued information systems. Inf. Sci. 2023, 642, 119215.
23. Wang, P.; He, J.; Li, Z. Attribute reduction for hybrid data based on fuzzy rough iterative computation model. Inf. Sci. 2023, 632, 555–575.
24. Qu, L.; He, J.; Zhang, G.; Xie, N. Entropy measure for a fuzzy relation and its application in attribute reduction for heterogeneous data. Appl. Soft Comput. 2022, 118, 108455.
25. Zhai, Y.; Li, D. Knowledge structure preserving fuzzy attribute reduction in fuzzy formal context. Int. J. Approx. Reason. 2019, 115, 209–220.
26. Yang, S.; Zhang, H.; Shi, G.; Zhang, Y. Attribute reductions of quantitative dominance-based neighborhood rough sets with A-stochastic transitivity of fuzzy preference relations. Appl. Soft Comput. 2023, 134, 109994.
27. Guo, Y.; Hu, M.; Wang, X.; Tsang, E.C.C.; Chen, D.; Xu, W. A robust approach to attribute reduction based on double fuzzy consistency measure. Knowl.-Based Syst. 2022, 253, 109585.
28. Ziarko, W. Variable precision rough set model. J. Comput. Syst. Sci. 1993, 46, 39–59.
29. Chen, J.; Zhu, P. A variable precision multigranulation rough set model and attribute reduction. Soft Comput. 2023, 27, 85–106.
30. Dai, J.H.; Han, H.F.; Zhang, X.H.; Liu, M.F.; Wan, S.P.; Liu, J.; Lu, Z.L. Catoptrical rough set model on two universes using granule-based definition and its variable precision extensions. Inf. Sci. 2017, 390, 70–81.
31. Li, R.; Wang, Q.H.; Gao, X.F.; Wang, Z.J. Research on fuzzy order variable precision rough set over two universes and its uncertainty measures. Proc. Comput. Sci. 2019, 154, 283–292.
32. Zhang, Q.H.; Wang, G.Y.; Xiao, Y. Approximation sets of rough sets. J. Softw. 2012, 23, 1745–1759.
33. Zhang, Q.H.; Xue, Y.B.; Hu, F.; Yu, H. Research on Uncertainty of Approximation Set of Rough Set. Acta Electron. Sin. 2016, 44, 1574.
34. Zhang, Q.H.; Xue, Y.B.; Wang, G.Y. Optimal approximation sets of rough sets. J. Softw. 2016, 27, 295–308.
35. Luo, L.P.; Liu, E.G.; Fan, Z.Z. Optimal Approximation Rough Set. J. Henan Univ. Sci. Technol. Sci. 2018, 39, 89–93.
36. Luo, L.P.; Liu, E.G.; Fan, Z.Z. Attributes reduction based on optimal approximation set of rough set. Appl. Res. Comput. 2019, 36, 1940–1942.
37. Luo, L.P.; Liu, E.G.; Fan, Z.Z. Matrix Computation for Optimal Approximation Rough Set. J. East China Jiaotong Univ. 2018, 35, 83–88.
38. Yuan, J.X.; Zhang, W.X. The Inclusion Degree and Similarity Degree of Fuzzy Rough Sets. Fuzzy Syst. Math. 2005, 1, 111–115.
39. Yeung, D.S.; Chen, D.G.; Tsang, E.C.C.; Lee, J.W.T.; Wang, X.Z. On the generalization of fuzzy rough sets. IEEE Trans. Fuzzy Syst. 2005, 13, 343–361.
40. Yuan, Z.; Zhang, X.Y.; Feng, S. Hybrid data-driven outlier detection based on neighborhood information entropy and its developmental measures. Expert Syst. Appl. 2018, 112, 243–257.
41. Dheeru, D.; Taniskidou Karra, E. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 30 March 2022).
42. Zhu, P.F.; Zhu, W.C.; Hu, Q.H.; Zhang, C.Q.; Zuo, W.M. Subspace clustering guided unsupervised feature selection. Pattern Recogn. 2017, 66, 364–374.
43. Mitra, P.; Murthy, C.; Pal, S.K. Unsupervised feature selection using feature similarity. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 301–312.
44. Velayutham, C.; Thangavel, K. Unsupervised quick reduct algorithm using rough set theory. J. Electron. Sci. Technol. 2011, 9, 193–201.
45. Velayutham, C.; Thangavel, K. A novel entropy based unsupervised feature selection algorithm using rough set theory. In Proceedings of the IEEE-International Conference on Advances in Engineering, Science and Management (ICAESM-2012), Nagapattinam, India, 30–31 March 2012; IEEE: Manhattan, NY, USA, 2012; pp. 156–161.
46. Hu, Q.H.; Yu, D.R.; Xie, Z.X. Information-preserving hybrid data reduction based on fuzzy-rough techniques. Pattern Recogn. Lett. 2006, 27, 414–423.
47. Yu, D.R.; Hu, Q.H.; Bao, W. Combining rough set methodology and fuzzy clustering for knowledge discovery from quantitative data. Proc. CSEE 2004, 24, 205–210.
48. Friedman, M. A comparison of alternative tests of significance for the problem of m rankings. Ann. Math. Stat. 1940, 11, 86–92.
49. Demšar, J. Statistical comparisons of classifiers over multiple datasets. J. Mach. Learn. Res. 2006, 7, 1–30.
Figure 1. Reduction Number Comparison.
Figure 2. Parameter Experiment.
Figure 3. Post hoc test.
Table 4. Datasets for classification and clustering.

| ID | Dataset | Object | Conditional Attribute | Decision Class | Type |
|----|---------|--------|-----------------------|----------------|------|
| 1 | Anneal | 798 | 38 | 6 | Mixed |
| 2 | Autos | 205 | 25 | 6 | Mixed |
| 3 | Bands | 531 | 39 | 2 | Mixed |
| 4 | Cmc | 1473 | 10 | 3 | Mixed |
| 5 | Cleveland | 303 | 13 | 5 | Mixed |
| 6 | Credit approval | 690 | 15 | 2 | Mixed |
| 7 | German | 1000 | 20 | 2 | Mixed |
| 8 | Heart disease | 270 | 13 | 2 | Mixed |
| 9 | Hepatitis | 155 | 19 | 2 | Mixed |
| 10 | SCADI | 70 | 205 | 7 | Mixed |
| 11 | Dermatology | 366 | 34 | 6 | Numeric |
| 12 | Ecoli | 336 | 7 | 8 | Numeric |
| 13 | Ionosphere | 351 | 33 | 2 | Numeric |
| 14 | Iris | 150 | 5 | 2 | Numeric |
| 15 | Movement libras | 360 | 90 | 15 | Numeric |
| 16 | Sonar | 208 | 60 | 2 | Numeric |
| 17 | Parkinsons | 768 | 22 | 2 | Numeric |
| 18 | WDBC | 569 | 31 | 2 | Numeric |
| 19 | Chess | 3196 | 36 | 2 | Nominal |
| 20 | Lymphography | 148 | 18 | 4 | Nominal |
| 21 | Monks | 432 | 6 | 2 | Nominal |
| 22 | Audiology | 226 | 69 | 6 | Nominal |
Table 5. Datasets for the Improved Optimal Approximation Set algorithm (IOAS).

| ID | Dataset | Object | Conditional Attribute | Decision Class | Type |
|----|---------|--------|-----------------------|----------------|------|
| 1 | Bank | 4521 | 16 | 2 | Mixed |
| 2 | Car | 1728 | 6 | 4 | Nominal |
| 3 | Mushroom | 8124 | 22 | 2 | Nominal |
| 4 | Diagnosis | 120 | 6 | 2 | Nominal |
| 5 | Vote | 435 | 16 | 2 | Nominal |
| 6 | Chess | 3196 | 37 | 2 | Nominal |
| 7 | Flag | 194 | 29 | 8 | Nominal |
| 8 | Flare | 323 | 12 | 2 | Nominal |
| 9 | Letter-recognition | 20,000 | 17 | 1 | Nominal |
| 10 | Lymphography | 148 | 18 | 4 | Nominal |
Table 6. Improved Algorithm for the Optimal Approximation Set.

| Dataset | Similarity before Optimization | Similarity after Optimization | Improvement in Similarity | Compression Rate |
|---------|-------------------------------|-------------------------------|---------------------------|------------------|
| bank | 0.1179 | 0.1732 | 147% | 15% |
| car | 0.1128 | 0.1244 | 110% | 19% |
| mushroom | 0.0948 | 0.4516 | 476% | 50% |
| diagnosis | 0.6120 | 0.6270 | 103% | 24% |
| Vote | 0.392 | 0.9166 | 234% | 40% |
| Chess | 0.5765 | 0.6738 | 117% | 24% |
| Flag | 0.0909 | 0.2380 | 262% | 84% |
| Flare | 0.0287 | 0.1200 | 413% | 67% |
| Letter-recognition | 0.0724 | 0.0961 | 133% | 28% |
| Lymphography | 0.0533 | 0.2222 | 417% | 50% |
| Average | 0.2151 | 0.3643 | 241.2% | 40.1% |
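As a reading aid for Table 6 (our interpretation, since the table does not state it explicitly): the "Improvement in Similarity" column appears to be the ratio of the similarity after optimization to the similarity before optimization, expressed as a percentage, which is consistent with the reported per-column averages. For the Vote dataset, for example,

$$\frac{0.9166}{0.392} \approx 2.34 = 234\%.$$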
Table 7. Attribute subsets and their number selected for the algorithm CART.

| Dataset | Raw | CART (Number) | λ |
|---------|-----|---------------|---|
| Anneal | 38 | 3,12,32,37,1,35,17,7,36,9,5,33,8,13,27,38,4 (17) | 0.4 |
| Autos | 25 | 2,17,6,5,15,4,7,22,18,19,10,11,1,23,13,16,9,12,25 (19) | 0.5 |
| Bands | 39 | 2,22,24,29 (4) | 2.7 |
| Cmc | 10 | 2,7,8,3,6,5,4,1 (8) | 0.9 |
| Cleveland | 13 | 3,12,11,13,7,2,9,6 (8) | 0.2 |
| Credit approval | 15 | 6,4,10,7,12,1,9,13,11,15 (10) | 0.1 |
| German | 20 | 4,7,3,12,2,1,6 (7) | 2.0 |
| Heart disease | 13 | 3,11,13,7,2,9,6,12 (8) | 0.2 |
| Hepatitis | 19 | 6,4,10,18,2,19,12,15,16,7,5,11,17,3 (14) | 1.6 |
| SCADI | 205 | 105,34,48,2,200,179,201,167,193,81,123,145,1,66 (14) | 2.3 |
| Dermatology | 34 | 27,8,11,30,13,10,15,26,9,5,24,18,23,14,4,21,3,34,32,19,28,1,17,7,31,2 (26) | 0.5 |
| Ecoli | 7 | 7,1,5,2,6,3 (6) | 2.3 |
| Ionosphere | 33 | 1,6,27,26,8,16,14,4,18,13,20,12,24,28,22,10,2,9 (18) | 0.3 |
| Iris | 5 | 3,2,4 (3) | 0.8 |
| Movement libras | 90 | 8,26,1,62,59,82,29,89,42,41,90,17,32,48,71,18,47,72,56,39,85,44,86,6,9,65,37,52,74,21,16,46,57,13 (34) | 0.9 |
| Sonar | 60 | 58,46,11,16,30,6,38,34,53,22,5,60,52,7,4 (15) | 1.7 |
| Parkinsons | 22 | 9,18,17,2,21,22,16,3,1,10,2,11 (12) | 2.3 |
| WDBC | 31 | 22,7,23,26,10,16,30,6,19,11,20,2,31,28 (14) | 2.4 |
| Lymphography | 18 | 14,18,12,15,1,2,16,8,11,3 (10) | - |
| Chess | 36 | 9,7,11,35,10,34,13,33,24,6,5,15,36,26,18,12,21,20,4,22,23,16,30,31,1,27,17,2,8,3,19,25,28 (33) | - |
| Monks | 6 | 5,1,2,4,3 (5) | - |
| Audiology | 69 | 64,66,4,6,2,60,15,1,10,5,57,40,7,38,11,25,17,53,24,41,44,59,52,47 (24) | - |
| Average | 36.73 | 14.05 | - |
Table 8. Comparison of classification accuracy of the algorithm CART on reduced data (%).

| Dataset | Raw Data | FSFS | USQR | UEBR | UFRFS | FRUAR | FOUAR |
|---------|----------|------|------|------|-------|-------|-------|
| Anneal | 92.11 ± 0.60 | 92.08 ± 0.82 | 92.18 ± 0.78 | 90.28 ± 0.51 | 92.29 ± 0.72 | 92.07 ± 0.65 | 92.97 ± 0.57 |
| Autos | 72.88 ± 2.76 | 72.09 ± 3.08 | 71.38 ± 2.27 | 74.16 ± 2.55 | 67.63 ± 2.57 | 74.81 ± 2.19 | 75.61 ± 2.24 |
| Bands | 74.24 ± 2.04 | 73.28 ± 1.76 | 68.34 ± 1.54 | 69.09 ± 1.83 | 71.65 ± 1.17 | 72.29 ± 2.16 | 72.77 ± 1.88 |
| Cmc | 50.36 ± 0.83 | 50.46 ± 0.98 | 50.92 ± 1.01 | 51.15 ± 1.09 | 51.08 ± 1.01 | 50.55 ± 0.77 | 51.36 ± 0.50 |
| Cleveland | 51.42 ± 1.78 | 51.68 ± 2.06 | 50.12 ± 2.08 | 50.89 ± 2.85 | 51.00 ± 1.18 | 52.14 ± 1.90 | 53.36 ± 1.33 |
| Credit | 81.39 ± 0.53 | 72.16 ± 1.27 | 81.72 ± 1.35 | 81.30 ± 1.33 | 81.33 ± 1.20 | 85.07 ± 0.66 | 85.59 ± 0.75 |
| German | 70.04 ± 1.40 | 69.51 ± 0.99 | 68.98 ± 0.56 | 70.30 ± 1.35 | 67.59 ± 1.16 | 70.33 ± 1.64 | 71.41 ± 1.26 |
| Heart | 76.56 ± 1.08 | 73.41 ± 2.14 | 76.19 ± 1.86 | 72.00 ± 1.80 | 76.41 ± 1.50 | 80.67 ± 1.91 | 80.78 ± 1.30 |
| Hepatitis | 56.07 ± 2.17 | 60.12 ± 3.37 | 56.25 ± 3.53 | 56.96 ± 3.28 | 59.07 ± 2.56 | 59.09 ± 3.63 | 60.85 ± 4.00 |
| SCADI | 77.14 ± 1.78 | 70.86 ± 1.93 | 75.43 ± 1.76 | 77.71 ± 1.20 | 70.57 ± 1.81 | 80.71 ± 1.81 | 81.29 ± 1.42 |
| Dermatology | 93.96 ± 0.55 | 91.36 ± 0.61 | 94.15 ± 0.47 | 90.57 ± 0.44 | 92.02 ± 0.52 | 95.87 ± 0.64 | 96.21 ± 0.33 |
| Ecoli | 81.39 ± 0.88 | 75.11 ± 1.01 | 81.52 ± 1.34 | 81.14 ± 1.04 | 80.97 ± 1.21 | 81.91 ± 0.84 | 82.61 ± 1.17 |
| Ionosphere | 87.86 ± 0.97 | 90.37 ± 0.88 | 87.77 ± 0.74 | 88.69 ± 1.04 | 86.24 ± 1.51 | 88.04 ± 1.12 | 88.78 ± 0.57 |
| Iris | 95.00 ± 1.05 | 93.93 ± 0.73 | 93.67 ± 0.65 | 95.00 ± 0.72 | 94.53 ± 1.29 | 95.40 ± 0.49 | 95.40 ± 0.49 |
| Movement L | 66.25 ± 1.27 | 58.03 ± 1.74 | 63.28 ± 1.67 | 57.36 ± 1.17 | 40.47 ± 2.04 | 68.33 ± 1.42 | 68.97 ± 1.68 |
| Sonar | 71.11 ± 2.83 | 69.83 ± 2.77 | 69.82 ± 2.47 | 67.64 ± 3.08 | 55.09 ± 1.87 | 76.42 ± 2.04 | 76.68 ± 2.60 |
| Parkinsons | 85.59 ± 1.70 | 84.26 ± 2.05 | 85.38 ± 1.92 | 87.44 ± 0.78 | 85.33 ± 1.21 | 86.93 ± 2.43 | 87.89 ± 2.07 |
| WDBC | 92.64 ± 0.55 | 93.04 ± 0.52 | 92.51 ± 0.58 | 92.79 ± 0.70 | 93.00 ± 0.44 | 93.62 ± 0.84 | 93.94 ± 0.27 |
| Lymphography | 74.93 ± 1.84 | 77.13 ± 1.94 | 76.50 ± 2.55 | 78.05 ± 1.78 | 74.96 ± 2.52 | 77.27 ± 1.78 | 79.48 ± 2.54 |
| Chess | 99.34 ± 0.11 | 80.97 ± 0.32 | 99.07 ± 0.12 | 98.31 ± 0.12 | 99.11 ± 0.09 | 99.14 ± 0.09 | 99.16 ± 0.09 |
| Monks | 100.00 ± 0.00 | 77.77 ± 0.02 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 | 100.00 ± 0.00 |
| Audiology | 77.10 ± 1.23 | 57.77 ± 1.74 | 65.68 ± 1.63 | 64.17 ± 1.63 | 73.76 ± 1.19 | 65.26 ± 1.80 | 74.52 ± 1.01 |
| Average | 78.52 | 74.33 | 77.31 | 77.05 | 75.64 | 79.36 | 80.44 |
Bold data denotes the highest value observed for the corresponding dataset.
Table 9. Comparison of classification accuracy of the algorithm KNN on reduced data (%).

| Dataset | Raw Data | FSFS | USQR | UEBR | UFRFS | FRUAR | FOUAR |
|---------|----------|------|------|------|-------|-------|-------|
| Anneal | 91.38 ± 0.59 | 91.53 ± 0.29 | 91.21 ± 0.54 | 90.85 ± 0.60 | 91.02 ± 0.22 | 91.70 ± 0.41 | 93.00 ± 0.33 |
| Autos | 72.85 ± 0.81 | 72.68 ± 1.60 | 73.99 ± 1.06 | 72.75 ± 1.35 | 71.47 ± 1.16 | 76.06 ± 2.27 | 77.92 ± 1.33 |
| Bands | 78.64 ± 0.93 | 77.78 ± 0.70 | 77.57 ± 0.85 | 77.69 ± 0.82 | 77.65 ± 0.95 | 78.19 ± 0.61 | 78.30 ± 1.09 |
| Cmc | 43.65 ± 0.26 | 47.54 ± 0.67 | 43.56 ± 0.54 | 43.52 ± 0.48 | 43.83 ± 0.25 | 42.75 ± 0.83 | 44.00 ± 0.54 |
| Cleveland | 54.35 ± 0.93 | 51.68 ± 2.06 | 50.12 ± 2.08 | 50.89 ± 2.85 | 54.48 ± 0.76 | 52.14 ± 1.90 | 53.36 ± 1.33 |
| CreditA | 80.83 ± 0.54 | 65.22 ± 0.50 | 80.84 ± 0.78 | 81.20 ± 0.56 | 80.32 ± 0.69 | 82.19 ± 0.67 | 81.19 ± 0.59 |
| German | 68.66 ± 0.58 | 66.76 ± 0.49 | 69.81 ± 0.38 | 68.47 ± 0.74 | 67.95 ± 0.74 | 70.00 ± 0.35 | 70.15 ± 0.57 |
| Heart disease | 75.04 ± 0.93 | 71.37 ± 0.87 | 74.37 ± 0.57 | 73.04 ± 1.29 | 74.70 ± 1.28 | 78.48 ± 0.99 | 78.52 ± 0.83 |
| Hepatitis | 60.35 ± 1.97 | 62.52 ± 1.36 | 53.22 ± 1.30 | 58.58 ± 1.62 | 58.52 ± 2.07 | 60.82 ± 2.13 | 63.51 ± 1.45 |
| SCADI | 79.86 ± 0.45 | 63.57 ± 1.21 | 71.86 ± 1.36 | 79.00 ± 0.69 | 64.43 ± 1.84 | 73.14 ± 0.90 | 79.14 ± 1.00 |
| Dermatology | 95.56 ± 0.12 | 89.05 ± 0.53 | 88.03 ± 0.37 | 86.04 ± 0.66 | 88.49 ± 0.99 | 95.50 ± 0.47 | 95.93 ± 0.36 |
| Ionosphere | 86.47 ± 0.68 | 87.43 ± 0.43 | 87.95 ± 0.45 | 89.06 ± 0.64 | 87.69 ± 0.36 | 87.66 ± 0.48 | 89.34 ± 0.43 |
| Ecoli | 81.08 ± 0.54 | 75.32 ± 0.76 | 80.85 ± 0.75 | 80.90 ± 0.93 | 81.01 ± 1.07 | 81.01 ± 0.62 | 81.17 ± 0.85 |
| Movement L | 85.97 ± 0.76 | 81.67 ± 0.49 | 85.06 ± 0.89 | 85.11 ± 0.51 | 86.11 ± 0.47 | 87.69 ± 1.01 | 87.17 ± 0.79 |
| Iris | 95.20 ± 0.53 | 91.13 ± 1.18 | 90.87 ± 0.55 | 95.40 ± 0.38 | 95.53 ± 0.32 | 95.27 ± 0.49 | 95.27 ± 0.80 |
| Sonar | 87.02 ± 0.87 | 87.15 ± 1.07 | 79.01 ± 1.47 | 77.18 ± 1.74 | 54.12 ± 1.08 | 88.46 ± 0.69 | 86.50 ± 1.06 |
| Parkinsons | 96.26 ± 0.61 | 94.36 ± 0.49 | 93.33 ± 0.72 | 95.98 ± 0.73 | 92.65 ± 0.90 | 96.46 ± 0.71 | 96.72 ± 0.87 |
| WDBC | 95.40 ± 0.25 | 95.19 ± 0.28 | 94.31 ± 0.29 | 94.76 ± 0.33 | 94.15 ± 0.27 | 95.80 ± 0.21 | 96.03 ± 0.34 |
| Lymphography | 78.19 ± 1.39 | 67.24 ± 0.88 | 78.15 ± 1.05 | 78.31 ± 1.40 | 73.23 ± 1.66 | 77.80 ± 1.03 | 78.60 ± 1.26 |
| Chess | 84.23 ± 0.31 | 67.58 ± 0.15 | 84.34 ± 0.16 | 84.56 ± 0.24 | 83.95 ± 0.37 | 84.83 ± 0.24 | 84.48 ± 0.14 |
| Monks | 91.95 ± 0.65 | 77.78 ± 0.02 | 99.44 ± 0.43 | 91.06 ± 0.55 | 90.90 ± 0.51 | 99.45 ± 0.64 | 99.54 ± 0.38 |
| Audiology | 66.07 ± 1.09 | 44.44 ± 0.76 | 58.06 ± 1.10 | 60.19 ± 0.56 | 61.25 ± 1.02 | 57.80 ± 0.64 | 62.18 ± 1.12 |
| Average | 79.50 | 74.04 | 77.54 | 77.93 | 76.07 | 79.69 | 80.55 |
Bold data denotes the highest value observed for the corresponding dataset.
Table 10. Attribute subsets and their number selected for the algorithm K-Means.

| Dataset | Raw Attribute | K-Means (Number) | λ |
|---------|---------------|------------------|---|
| Anneal | 38 | 3,32,34,33,12,8,35,5,4,36,37,1,9,17 (14) | 3.0 |
| Autos | 25 | 2,17,6,5,15,4,7,14 (8) | 0.1 |
| Bands | 39 | 2,3,14,25,24 (5) | 1.4 |
| Cmc | 10 | 2,7,8,3,1,6,4,5 (8) | 2.1 |
| Cleveland | 13 | 3,12,11,13,7,2,9,6 (8) | 1.1 |
| Credit approval | 15 | 6,4,10,7,12,9,1,2,3,14,8,13,11,15 (14) | 1.1 |
| German | 20 | 4,7,3,12,8,1,11,13,6,17,14,9 (12) | 1.8 |
| Heart disease | 13 | 3,11,13,7,2,9,6,12,8,4 (10) | 0.3 |
| Hepatitis | 19 | 6,4,10,12,7,1,11,9,5,8,13,3,14,19,17 (15) | 0.2 |
| SCADI | 205 | 105,34,48,5,123,172,1,2,201,68,81,187,181,145,66,167,178 (17) | 1.0 |
| Dermatology | 34 | 33,28,31,20,15,7,13,11,5,3,4,26,9,21,16,23,10,14,18,34,32,24,1,19 (24) | 0.6 |
| Ecoli | 7 | 3,7,6,5,2,1 (6) | 0.7 |
| Ionosphere | 33 | 1,6,27,26,8,16,14,4,18,13,20,12,24,28,22,10,2,9 (18) | 0.3 |
| Iris | 5 | 3,2,4 (3) | 0.6 |
| Movement libras | 90 | 8,15,55,72,31,44,79,18,85,45,9,38,40,28,80,58,39,48,89,67,1,46,14 (24) | 3.0 |
| Sonar | 60 | 3,60,2,4,31,21,30,20 (8) | 0.1 |
| Parkinsons | 22 | 9,18,17,20,21,22,16,3,1,10,2,11 (12) | 2.3 |
| WDBC | 31 | 1,15,22,18,29,19,30,7,26,9,23,11,10,27,3,6,12,16,4,5,25,24,2 (23) | 0.1 |
| Lymphography | 18 | 14,18,12,15,1,2,16,8,11,3 (10) | - |
| Chess | 36 | 9,7,11,35,10,34,13,33,24,6,5,15,36,26,18,12,21,20,4,22,23,16,30,31,1,27,17,2,8,3,19,25,28 (33) | - |
| Monks | 6 | 5,1,2,4,3 (5) | - |
| Audiology | 69 | 64,66,4,6,2,60,15,1,10,5,57,40,7,38,11,25,17,53,24,41,44,59,52,47 (24) | - |
| Average | 36.72 | 13.68 | - |
Table 11. Comparison of clustering accuracy of the algorithm K-means on reduced data (%).

| Dataset | Raw Data | FSFS | USQR | UEBR | UFRFS | FRUAR | FOUAR |
|---------|----------|------|------|------|-------|-------|-------|
| Anneal | 41.42 ± 6.13 | 35.85 ± 2.91 | 37.02 ± 4.64 | 36.70 ± 6.38 | 37.23 ± 3.45 | 36.50 ± 2.89 | 44.20 ± 0.66 |
| Autos | 34.59 ± 1.55 | 28.05 ± 1.72 | 29.12 ± 1.83 | 28.68 ± 1.28 | 29.02 ± 2.31 | 28.20 ± 1.30 | 31.27 ± 2.64 |
| Bands | 58.76 ± 0.02 | 58.76 ± 0.20 | 58.72 ± 0.19 | 59.19 ± 0.09 | 58.72 ± 0.19 | 58.64 ± 0.15 | 58.87 ± 0.15 |
| Cmc | 41.50 ± 1.96 | 40.35 ± 2.19 | 41.12 ± 1.96 | 41.03 ± 2.26 | 41.89 ± 1.82 | 43.63 ± 1.86 | 43.96 ± 0.99 |
| Cleveland | 35.21 ± 3.82 | 51.68 ± 2.06 | 50.12 ± 2.08 | 50.89 ± 2.85 | 36.80 ± 2.64 | 52.14 ± 1.90 | 53.36 ± 1.33 |
| CreditA | 57.67 ± 3.09 | 56.59 ± 2.89 | 56.42 ± 2.72 | 55.62 ± 2.96 | 56.56 ± 2.89 | 56.72 ± 3.35 | 59.57 ± 3.06 |
| German | 52.70 ± 0.00 | 53.35 ± 2.06 | 52.70 ± 0.00 | 52.70 ± 0.00 | 52.70 ± 0.00 | 52.70 ± 0.00 | 53.59 ± 2.81 |
| Heart disease | 72.89 ± 7.18 | 76.30 ± 0.00 | 74.59 ± 5.39 | 59.78 ± 3.29 | 72.89 ± 7.18 | 76.30 ± 0.00 | 76.30 ± 0.00 |
| Hepatitis | 61.61 ± 0.34 | 60.65 ± 0.00 | 59.68 ± 4.12 | 59.68 ± 0.55 | 61.03 ± 0.33 | 61.55 ± 1.02 | 61.68 ± 1.10 |
| SCADI | 61.29 ± 7.48 | 39.29 ± 3.52 | 47.71 ± 10.57 | 52.71 ± 3.40 | 38.86 ± 2.92 | 47.57 ± 6.74 | 54.57 ± 8.85 |
| Dermatology | 68.03 ± 12.84 | 68.74 ± 10.26 | 68.99 ± 6.72 | 65.46 ± 6.83 | 70.77 ± 12.85 | 75.44 ± 15.70 | 78.17 ± 11.49 |
| Ionosphere | 70.34 ± 2.10 | 68.83 ± 9.12 | 71.23 ± 0.00 | 69.23 ± 0.00 | 70.74 ± 3.64 | 70.37 ± 0.00 | 70.91 ± 0.09 |
| Ecoli | 62.02 ± 9.18 | 54.55 ± 5.34 | 58.13 ± 3.61 | 58.90 ± 6.44 | 61.82 ± 6.87 | 61.10 ± 4.33 | 63.84 ± 11.62 |
| Movement L | 44.42 ± 2.55 | 42.08 ± 2.25 | 42.53 ± 1.89 | 36.44 ± 2.32 | 31.44 ± 1.38 | 42.00 ± 1.71 | 44.31 ± 2.67 |
| Iris | 88.33 ± 0.35 | 83.33 ± 0.00 | 80.73 ± 8.22 | 88.40 ± 0.34 | 79.00 ± 14.95 | 94.67 ± 0.00 | 94.67 ± 0.00 |
| Sonar | 54.37 ± 1.46 | 55.96 ± 0.41 | 67.50 ± 1.71 | 52.02 ± 0.63 | 56.20 ± 0.27 | 65.19 ± 0.99 | 64.76 ± 0.23 |
| Parkinsons | 62.77 ± 0.26 | 69.90 ± 5.89 | 68.51 ± 0.26 | 64.92 ± 4.63 | 74.97 ± 1.02 | 72.56 ± 3.23 | 72.72 ± 4.01 |
| WDBC | 92.79 ± 0.00 | 87.08 ± 0.86 | 90.16 ± 9.70 | 92.86 ± 0.36 | 88.52 ± 9.12 | 92.92 ± 0.08 | 94.20 ± 0.19 |
| Lymphography | 47.70 ± 6.54 | 42.91 ± 5.49 | 48.31 ± 6.72 | 43.38 ± 5.31 | 47.64 ± 6.34 | 50.27 ± 6.61 | 50.34 ± 5.81 |
| Chess | 52.10 ± 1.75 | 58.09 ± 2.67 | 52.40 ± 3.07 | 52.07 ± 2.33 | 50.96 ± 0.04 | 51.34 ± 1.17 | 53.90 ± 3.93 |
| Monks | 62.69 ± 3.81 | 58.52 ± 6.94 | 65.19 ± 4.10 | 63.15 ± 1.79 | 63.19 ± 1.56 | 63.06 ± 2.97 | 65.00 ± 3.51 |
| Audiology | 28.85 ± 2.44 | 25.97 ± 1.46 | 27.35 ± 2.22 | 25.84 ± 1.86 | 27.70 ± 2.08 | 25.93 ± 2.70 | 28.14 ± 2.92 |
| Average | 56.91 | 55.31 | 56.74 | 54.98 | 54.94 | 58.12 | 59.92 |
Bold data denotes the highest value observed for the corresponding dataset.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
