Article

Optimizing Attribute Reduction in Multi-Granularity Data through a Hybrid Supervised–Unsupervised Model

School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212100, China
*
Author to whom correspondence should be addressed.
Mathematics 2024, 12(10), 1434; https://doi.org/10.3390/math12101434
Submission received: 22 March 2024 / Revised: 5 May 2024 / Accepted: 5 May 2024 / Published: 7 May 2024
(This article belongs to the Special Issue Mathematical and Computing Sciences for Artificial Intelligence)

Abstract

Attribute reduction is a core technique in the rough set domain and an important step in data preprocessing. Researchers have proposed numerous innovative methods to enhance the capability of attribute reduction, such as the emergence of multi-granularity rough set models, which can effectively process distributed and multi-granularity data. However, these innovative methods still have numerous shortcomings, such as difficulty in addressing complex constraints and in conducting multi-angle effectiveness evaluations. Based on the multi-granularity model, this study proposes a new method of attribute reduction, namely using the multi-granularity neighborhood information gain ratio as the measurement criterion. This method combines both supervised and unsupervised perspectives, and by integrating multi-granularity technology with neighborhood rough set theory, constructs a model that can adapt to multi-level data features. This novel method stands out by addressing complex constraints and facilitating multi-perspective effectiveness evaluations. It has several advantages: (1) it combines supervised and unsupervised learning methods, allowing for nuanced data interpretation and enhanced attribute selection; (2) by incorporating multi-granularity structures, the algorithm can analyze data at various levels of granularity, which allows for a more detailed understanding of data characteristics at each level and can be crucial for complex datasets; and (3) by using neighborhood relations instead of indiscernibility relations, the method effectively handles uncertain and fuzzy data, making it suitable for real-world datasets that often contain imprecise or incomplete information. The method not only selects the optimal granularity level or attribute set based on specific requirements, but also demonstrates its versatility and robustness through extensive experiments on 15 UCI datasets. Comparative analyses against six established attribute reduction algorithms confirm the superior reliability and consistency of the proposed method. This research not only enhances the understanding of attribute reduction mechanisms, but also sets a new benchmark for future explorations in the field.
MSC:
03B52; 68T37; 62H30; 18B05

1. Introduction

In this era of information explosion, data are growing exponentially in both dimension and volume, which leads to the attributes of data becoming redundant and vague. How to find valuable information from massive data has become challenging. Rough set theory, introduced by Pawlak [1] in 1982 as a simple and efficient method for data mining, can deal with fuzzy, incomplete, and inaccurate data [2].
The traditional model of rough sets mainly focuses on describing the uncertainty and fuzziness of data through binary relations [3]. In recent years, multi-granularity rough set models have been proposed to fully mine the multiple granularity levels of target information, extending the traditional single binary relation to multiple binary relations, with the work of Qian et al. [4] being representative. This model has provided a new solution for rough set theory in dealing with distributed data and multi-granularity data. Afterward, researchers continuously improved Qian’s multi-granularity rough set model. Some of the improvements combine multi-granularity rough sets with decision-theoretic rough sets to form a multi-granularity decision-theoretic rough set model [5]. In addition, there is research combining multi-granularity rough sets with the three-way decision model, proposing a multi-granularity three-way decision model [6]. Targeting the granulation of attributes and attribute values, Xu proposed an improved multi-granularity rough set model [7]. To expand the applicability of multi-granularity rough sets, Lin et al. integrated the neighborhood relation into the multi-granularity rough set model, proposing the neighborhood multi-granularity rough set. The introduction of this model has made the multi-granularity rough set research branch a hot topic of study [8]. These rough set models can effectively reduce data dimensionality, achieved by attribute reduction [9].
Attribute reduction can be achieved through supervised or unsupervised constraints, and research on constraints from both supervised and unsupervised perspectives has been extensively explored [10]. Specifically, some studies propose attribute reduction constraints based on measures from only one perspective, using these constraints to find qualified reductions. For instance, Jiang et al. [11] and Yuan et al. [12] concentrated on attribute reduction through the lens of supervised information granulation and related supervised metrics, respectively; meanwhile, Yang et al. [13] proposed a concept known as fuzzy complementary entropy for attribute reduction within an unsupervised model; the algorithm discussed by Jain and Som [14] introduces a sophisticated multigranular rough set model that utilizes an intuitionistic fuzzy beta covering approach; and Ji et al. [15] developed an extended rough set model based on fuzzy granular balls to enhance attribute reduction effectiveness. However, whether supervised or unsupervised, measures based on a single perspective exhibit inherent limitations. Firstly, measures relying on a single perspective may overlook the multifaceted evaluation of data, leading to the neglect of some important attributes [16]. This is because when only one fixed measure is used for the attribute reduction of data, the importance of each attribute is judged solely by that criterion; if other measures are later needed for evaluation, relying on that single criterion may no longer yield accurate results. Secondly, relying only on a single-perspective measure may not fully capture the characteristics of data under complex conditions, resulting in the selection of attributes that are neither accurate nor complete. For instance, if conditional entropy is used as a measure to evaluate attributes [17], the derived reduct may only satisfy that single evaluation criterion, without fully considering other types of uncertainty and learning capability.
To solve the limitations of the attribute reduction mentioned above, this paper introduces a new measure that merges both supervised and unsupervised perspectives, leading to a novel rough set model. The model proposed in this paper has the following advantages: (1) it integrates multi-granularity and neighborhood rough sets, making the model more adaptable to data features at different levels; and (2) for attribute sets of different granularities, it introduces a fusion strategy, selecting the optimal granularity level or attribute set according to the needs of different tasks and datasets, which can be flexibly adjusted based on specific circumstances.
The rest of this paper is organized as follows. Section 2 reviews related basic concepts. Section 3 provides a detailed introduction to the basic framework and algorithm design of the proposed method. In Section 4, the accuracy of our method is calculated and discussed through experiments. Finally, Section 5 concludes this paper and outlines some future work.

2. Preliminaries

2.1. Neighborhood Rough Sets

Neighborhood rough sets were proposed by Hu et al. as an improvement over traditional rough sets [18]. The key distinction lies in that neighborhood rough sets are established on the basis of neighborhood relations, as opposed to relations of indiscernibility [19]. Hence, the neighborhood rough set model is capable of processing both discrete and continuous data [20]. Moreover, the partitioning of neighborhoods granulates the sample space, which can reflect the discriminative power of different attributes on the samples [21].
Within the framework of rough set theory, a decision system is characterized by a tuple DS = (U, AT), where U denotes a finite collection of samples and AT encompasses a suite of conditional attributes together with a decision attribute d [22]. The attribute d captures the sample labels [23]. For every x in U and every a in AT, a(x) signifies the value of x on the conditional attribute a, and d(x) represents the label of x. Utilizing d, one can derive an equivalence relation on U:
IND(d) = \{(x, y) \in U \times U : d(x) = d(y)\}
Pursuant to IND(d), U is partitioned into U/IND(d) = {X_1, X_2, ..., X_q} (q ≥ 2). Each X_k within U/IND(d) is recognized as the k-th decision category. Notably, the decision category that includes the sample x can also be referred to as [x]_d.
In rough set methods, binary relations are often used for information granulation, among which neighborhood relations, as one of the most effective binary relations, have received extensive attention. The formation of neighborhood relations is as follows:
N_{\delta}^{A} = \{(x, y) \in U \times U : r_A(x, y) \le \delta\}
where r_A is a distance function with respect to A ⊆ AT, and δ ≥ 0 is the neighborhood radius:
r_A(x, y) = \sqrt{\sum_{a \in A} (a(x) - a(y))^2}
In alignment with Equation (2), the vicinity of a sample x is established as follows:
\delta_A(x) = \{y \in U : r_A(x, y) \le \delta\}
From the perspective of granular computing [24,25], both I N D ( d ) and N δ A are derivations of information granules [26]. The most significant difference between these two types of information granules lies in their intrinsic mechanisms, i.e., the binary relations used. Based on the outcomes of these information granules, the concepts of lower and upper approximations within the context of neighborhood rough sets, as the fundamental units, were also proposed by Cheng et al.
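To make the neighborhood construction above concrete, the following Python sketch computes the δ-neighborhood of a sample over an attribute subset using the Euclidean distance r_A. It is only an illustration; the function and variable names are ours, not from the paper.

import numpy as np

def delta_neighborhood(U, A, x_idx, delta):
    # U: (n_samples, n_attributes) numeric data matrix
    # A: list of column indices (the attribute subset); delta: neighborhood radius
    diff = U[:, A] - U[x_idx, A]             # differences on the selected attributes
    dist = np.sqrt((diff ** 2).sum(axis=1))  # r_A(x, y) for every y in U
    return np.flatnonzero(dist <= delta)     # indices of samples inside the neighborhood

# toy usage with 5 samples, 3 attributes, and radius 0.3 (illustrative values only)
U = np.random.rand(5, 3)
print(delta_neighborhood(U, [0, 2], 0, 0.3))

Because r_A(x, x) = 0, every sample belongs to its own neighborhood, which keeps the resulting granules non-empty.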

2.2. Multi-Granularity Rough Sets

For multi-granularity rough sets [27,28], given DS = (U, AT), where AT = {A_k | k ∈ {1, 2, ..., m}} is a set of attributes, the family of attribute subsets on AT is represented by {A_1, A_2, ..., A_m} [29].
Let [x]_{A_i} denote the equivalence class of x under A_i. For any X ⊆ U, the optimistic multi-granularity lower and upper approximations of X are defined as follows:
\underline{\sum_{i=1}^{m} A_i}^{O}(X) = \{x \in U : [x]_{A_1} \subseteq X \vee [x]_{A_2} \subseteq X \vee \cdots \vee [x]_{A_m} \subseteq X\},
\overline{\sum_{i=1}^{m} A_i}^{O}(X) = \left( \underline{\sum_{i=1}^{m} A_i}^{O}(X^c) \right)^c
If the optimistic lower and upper approximations of X are not equal, then they are called optimistic multi-granularity rough sets.
Given DS = (U, AT), where AT = {A_k | k ∈ {1, 2, ..., m}} is a set of attributes, the family of attribute subsets on AT is represented by {A_1, A_2, ..., A_m}.
Let [x]_{A_i} denote the equivalence class of x under A_i. For any X ⊆ U, the pessimistic multi-granularity [30] lower and upper approximations of X are defined as follows:
\underline{\sum_{i=1}^{m} A_i}^{P}(X) = \{x \in U : [x]_{A_1} \subseteq X \wedge [x]_{A_2} \subseteq X \wedge \cdots \wedge [x]_{A_m} \subseteq X\},
\overline{\sum_{i=1}^{m} A_i}^{P}(X) = \left( \underline{\sum_{i=1}^{m} A_i}^{P}(X^c) \right)^c
If the pessimistic lower and upper approximations of X are not equal, then they are called pessimistic multi-granularity rough sets.
In the pursuit of refining data analysis, particularly when addressing complex and heterogeneous data sets, the application of multi-granularity rough sets provides a transformative framework. This approach offers a flexible methodology for representing data across various levels of granularity, allowing analysts to dissect large and diverse datasets into more comprehensible and manageable segments. This adaptability is crucial in environments where data exhibits varying degrees of precision, stemming from different sources or capturing differing phenomena.
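As a small illustration of the two fusion modes, the sketch below computes the multi-granularity lower approximation: the optimistic version requires containment under at least one granularity, the pessimistic version under all of them. The representation (each granularity as a mapping from a sample to its equivalence class) is our own simplification, not the paper's data structure.

def mg_lower_approximation(classes_per_granularity, X, mode="optimistic"):
    # classes_per_granularity[i][x] is the equivalence class (a frozenset) of
    # sample x under the attribute subset A_i; X is a set of samples
    universe = set(classes_per_granularity[0])
    combine = any if mode == "optimistic" else all
    return {x for x in universe
            if combine(classes[x] <= X for classes in classes_per_granularity)}

# toy usage: two granularities over the universe {1, 2, 3, 4}
g1 = {1: frozenset({1, 2}), 2: frozenset({1, 2}), 3: frozenset({3, 4}), 4: frozenset({3, 4})}
g2 = {1: frozenset({1}), 2: frozenset({2, 3}), 3: frozenset({2, 3}), 4: frozenset({4})}
print(mg_lower_approximation([g1, g2], {1, 2, 3}))                  # optimistic -> {1, 2, 3}
print(mg_lower_approximation([g1, g2], {1, 2, 3}, "pessimistic"))   # -> {1, 2}

The corresponding upper approximations then follow from the complement duality in the formulas above.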

2.3. Multi-Granularity Neighborhood Rough Sets

In the literature [31], Lin et al. proposed two types of neighborhood multi-granularity rough sets, which can be applied to deal with incomplete systems containing numerical and categorical attributes [32]. To simplify the problem, when dealing with incomplete systems, only the application of neighborhood multi-granularity rough sets to numerical data is considered.
Given DS = (U, AT), where AT = {A_k | k ∈ {1, 2, ..., m}}, U = {x_i | i ∈ {1, 2, ..., n}}, and X ⊆ U, in the optimistic neighborhood multi-granularity rough sets, the neighborhood multi-granularity lower and upper approximations of X are defined as:
\underline{\sum_{i=1}^{m} N_i}^{O}(X) = \{x_i \in U : \delta_{A_1}(x_i) \subseteq X \vee \delta_{A_2}(x_i) \subseteq X \vee \cdots \vee \delta_{A_m}(x_i) \subseteq X\}
\overline{\sum_{i=1}^{m} N_i}^{O}(X) = \{x_i \in U : \delta_{A_1}(x_i) \cap X \neq \emptyset \wedge \delta_{A_2}(x_i) \cap X \neq \emptyset \wedge \cdots \wedge \delta_{A_m}(x_i) \cap X \neq \emptyset\}
where δ_{A_k}(x_i) is the neighborhood of x_i based on the granularity structure A_k.
Given DS = (U, AT), where AT = {A_k | k ∈ {1, 2, ..., m}}, U = {x_i | i ∈ {1, 2, ..., n}}, and X ⊆ U, in the pessimistic neighborhood multi-granularity rough sets, the neighborhood multi-granularity lower and upper approximations of X are defined as:
\underline{\sum_{i=1}^{m} N_i}^{P}(X) = \{x_i \in U : \delta_{A_1}(x_i) \subseteq X \wedge \delta_{A_2}(x_i) \subseteq X \wedge \cdots \wedge \delta_{A_m}(x_i) \subseteq X\}
\overline{\sum_{i=1}^{m} N_i}^{P}(X) = \{x_i \in U : \delta_{A_1}(x_i) \cap X \neq \emptyset \vee \delta_{A_2}(x_i) \cap X \neq \emptyset \vee \cdots \vee \delta_{A_m}(x_i) \cap X \neq \emptyset\}
where δ_{A_k}(x_i) is the neighborhood of x_i based on the granularity structure A_k.
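A compact sketch of these neighborhood-based variants is given below; compared with the classical case, only the containment and intersection tests change. The representation (neigh[i][x] as the δ-neighborhood of sample x under attribute subset A_i) is our own illustration, and the operator choices follow our reading of the definitions above.

def nmg_approximations(neigh, X, mode="optimistic"):
    # neigh[i][x]: delta-neighborhood (a set of samples) of x under attribute subset A_i
    universe, X = set(neigh[0]), set(X)
    low, up = (any, all) if mode == "optimistic" else (all, any)
    lower = {x for x in universe if low(n[x] <= X for n in neigh)}   # containment test
    upper = {x for x in universe if up(n[x] & X for n in neigh)}     # non-empty intersection test
    return lower, upper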
The incorporation of multi-granularity neighborhood rough sets extends this concept by emphasizing local contexts and the spatial or temporal relationships inherent within the data. By focusing on the neighborhoods around each data point, these sets are particularly adept at mitigating the influence of noise and anomalies, significantly enhancing the robustness of the analysis. The neighborhood-based approach also facilitates adaptive threshold settings, crucial for accurately defining the granularity level in datasets where this parameter is not readily apparent.

2.4. Supervised Attribute Reduction

It is well known that neighborhood rough sets are often used in supervised learning tasks, especially in enhancing generalization performance and reducing classifier complexity [33]. The advantage of attribute reduction lies in its easy adaptation to different practical application requirements, hence a variety of forms of attribute reduction have emerged in recent years. For neighborhood rough sets, information gain and split information value are two metrics that can be used to further explore the forms of attribute reduction.
Given the data DS = (U, AT, d, δ), for any A ⊆ AT, the neighborhood information gain of d based on A is defined as:
IG_{NRS}(d, A) = H_{NRS} - H_{NRS}(d, A)
Here, H_{NRS} is the entropy of the entire dataset, calculated based on the distribution under the neighborhood lower or upper approximation [34]. H_{NRS}(d, A) is the expected value of uncertainty considering attribute A, defined as:
H_{NRS}(d, A) = -\frac{1}{|U|} \sum_{x \in U} |\delta_A(x) \cap [x]_d| \log \frac{|\delta_A(x) \cap [x]_d|}{|\delta_A(x)|}
Given the data DS = (U, AT, d, δ), for any A ⊆ AT, the neighborhood split information value of d based on A is defined as:
SI_{NRS}(d, A) = -\sum_{j=1}^{n} \frac{|\delta_A(X_j)|}{|U|} \log_2 \frac{|\delta_A(X_j)|}{|U|}
Here, δ_A(X_j) represents the sample set X_j within the neighborhoods formed by attribute A, and n is the number of different neighborhoods formed by A [35].
The combination of neighborhood information gain and split information value helps to more comprehensively assess the impact of attributes on dataset classification, thereby making more effective decisions in attribute reduction.
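The two measures can be read off the definitions above almost directly. The sketch below is a minimal Python rendering of one plausible reading of the formulas, with the neighborhoods precomputed as index sets; the function names are ours.

import numpy as np

def neighborhood_conditional_entropy(neigh, labels):
    # neigh[i]: set of indices in the delta-neighborhood of sample i (i is assumed to
    # belong to its own neighborhood, so the intersection below is never empty)
    # labels[i]: decision label d(x_i)
    n, total = len(labels), 0.0
    for i in range(n):
        same = sum(1 for j in neigh[i] if labels[j] == labels[i])   # |delta_A(x) ∩ [x]_d|
        total += -same * np.log(same / len(neigh[i]))
    return total / n

def neighborhood_split_information(block_sizes, n):
    # block_sizes: sizes |delta_A(X_j)| of the neighborhood blocks formed by A
    p = np.asarray(block_sizes, dtype=float) / n
    return float(-(p * np.log2(p)).sum())

The information gain is then the difference between the dataset-level entropy and this conditional entropy; together with the split information, it feeds the ϵ measure introduced in Section 3.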

2.5. Unsupervised Attribute Reduction

It is widely recognized that supervised attribute reduction necessitates the use of sample labels, which are time-consuming and expensive to obtain in many practical tasks [36]. In contrast, unsupervised attribute reduction does not require these labels, hence it has received more attention recently.
In unsupervised attribute reduction, if it is necessary to measure the importance of attributes, one can construct models by introducing pseudo-label strategies and using information gain and split information as metrics.
Given unsupervised data IS = (U, AT) and δ, for any A ⊆ AT, the unsupervised information gain based on A is defined as:
IG_{NRS}(d, A) = H_{NRS} - H_{NRS}(d, A)
where H_{NRS}(d, A) is the expected value of uncertainty considering attribute A, defined as:
H_{NRS}(d, A) = \frac{1}{|A|} \sum_{a \in A} H_{NRS}(d_a, A)
d_a denotes the pseudo-label decision for samples generated using conditional attribute a.
Given unsupervised data IS = (U, AT) and δ, for any A ⊆ AT, the unsupervised split information based on A is defined as:
SI_{NRS}(d, A) = \frac{1}{|A|} \sum_{a \in A} SI_{NRS}(d_a, A)
Here, d_a is the pseudo-label decision obtained by using conditional attribute a to assign pseudo-labels to the samples.
These definitions provide a new method for evaluating attribute importance in an unsupervised setting. Information gain reflects the contribution of an attribute to data classification, while split information measures the degree of confusion introduced by an attribute in the division of the dataset. This approach helps in more effective attribute selection and reduction in unsupervised learning.
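Operationally, the pseudo-label strategy can be sketched as follows. This is our own illustration using scikit-learn's KMeans; the supervised measure is passed in as a callable so the same loop serves both the gain and the split information.

import numpy as np
from sklearn.cluster import KMeans

def unsupervised_measure(U, A, k, supervised_measure):
    # Average a supervised measure over the pseudo-label decisions d_a,
    # one per attribute a in A, as described above.
    values = []
    for a in A:
        d_a = KMeans(n_clusters=k, n_init=10).fit_predict(U[:, [a]])  # pseudo-labels from attribute a
        values.append(supervised_measure(U, A, d_a))
    return float(np.mean(values))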

3. Proposed Method

3.1. Definition of Multi-Granularity Neighborhood Information Gain Ratio

Consider a dataset DS = (U, AT, d, δ), with U representing the sample set, AT the attribute set, d the decision attribute, and δ the neighborhood radius.
For any A ⊆ AT, the multi-granularity neighborhood information gain ratio is defined as:
\epsilon_A(d) = \frac{SI_{NRS}(d, A)}{e^{IG_{NRS}(d, A)}} \times W_A
where SI_{NRS}(d, A) is the neighborhood split information quantity based on A, e^{IG_{NRS}(d, A)} is the information gain for decision attribute d based on attribute A taken as an exponent of the natural base e, and W_A is the granularity space coefficient of attribute A in the multi-granularity structure, reflecting its importance within that structure.
For the calculation of the granularity space coefficient, given a set of granularities G_1, G_2, ..., G_n, the performance of attribute A under each granularity can be measured by a quantitative indicator P_{G_i}(A). The granularity space coefficient W_A is defined as follows:
W_A = \frac{\sum_{i=1}^{n} \beta_i \cdot P_{G_i}(A)}{\sum_{i=1}^{n} \beta_i}
where β_i is the granularity space allocated to each granularity G_i, reflecting the importance of the different granularities. These granularity spaces are usually determined based on specific background knowledge of the problem or on experimental verification.
The granularities G_1, G_2, ..., G_n in the multi-granularity structure are determined according to the data characteristics, problem requirements, etc. [37], and each granularity reflects different levels or details of the data. When calculating the granularity space coefficient, the performance of the attribute under different granularities is considered, in order to more accurately reflect its importance in the multi-granularity structure.
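For instance, with three granularities, the coefficient is simply the β-weighted average of the attribute's per-granularity performance. The sketch below uses illustrative numbers only.

def granularity_space_coefficient(performance, beta):
    # performance[i] = P_{G_i}(A); beta[i] = granularity space allocated to G_i
    return sum(b * p for b, p in zip(beta, performance)) / sum(beta)

# e.g., P_{G_1}(A) = 0.8, P_{G_2}(A) = 0.6, P_{G_3}(A) = 0.9 with spaces 0.5, 0.3, 0.2
print(granularity_space_coefficient([0.8, 0.6, 0.9], [0.5, 0.3, 0.2]))   # -> 0.76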
The neighborhood rough set is a method for dealing with uncertain and fuzzy data, which uses neighborhood relations instead of the indiscernible relations in traditional rough sets. In this method, data are decomposed into different granularities, each representing different levels or details of the data. Information gain ratio is a method for measuring the importance of attributes in data classification. It is based on the concept of information entropy and evaluates the classification capability of an attribute by comparing the entropy change in the dataset with and without the attribute.
Therefore, ϵ combines these two concepts, i.e., neighborhood information gain at different granularities and the split information value of attributes, to evaluate the importance of attributes in multi-granularity data analysis. The structure of the ϵ -reduct part is shown in Figure 1. This method not only considers the information gain of attributes, but also their performance at different granularities, thus providing a more comprehensive method of attribute evaluation.
Given a decision system DS and a threshold θ ∈ [0, 1], an attribute subset A is considered significant if it satisfies the following conditions:
  • ϵ_A(d) / ϵ_{AT}(d) ≥ θ;
  • there is no proper subset A′ ⊂ A such that ϵ_{A′}(d) / ϵ_{AT}(d) ≥ θ.
In this definition, significant attributes are determined based on their contribution to the information gain ratio, aiming to select attributes that are informative, yet not redundant for the decision-making process. This method is based on greedy search techniques for attribute reduction, and helps identify attributes that significantly impact the decision outcome.
Given a dataset DS = (U, AT, d), where U is the set of objects, AT is the set of conditional attributes, and d is the decision attribute, for any attribute subset A ⊆ AT and any a ∈ AT \ A (i.e., any attribute not in A), the significance of attribute a regarding the multi-granularity neighborhood information gain ratio is defined as follows:
Sig_{\epsilon}^{a}(d) = \epsilon_{A \cup \{a\}}(d) - \epsilon_A(d)
The aforementioned significance function suggests that a larger value indicates a more important conditional attribute, making it more likely to be included in the reduct. For example, if Sig_{ϵ}^{a_1}(d) < Sig_{ϵ}^{a_2}(d), where a_1, a_2 ∈ AT \ A, then ϵ_{A ∪ {a_1}}(d) < ϵ_{A ∪ {a_2}}(d). Such a result indicates that choosing a_2 to join A would lead to a higher multi-granularity neighborhood information gain ratio than choosing a_1.
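A toy illustration of how the significance drives the greedy choice is given below; all ϵ values are hypothetical.

def significance(eps_with_a, eps_without):
    # Sig_eps^a(d) = eps_{A ∪ {a}}(d) - eps_A(d)
    return eps_with_a - eps_without

eps_A = 0.25                                 # current subset A (hypothetical value)
candidates = {"a1": 0.375, "a2": 0.5}        # eps_{A ∪ {a}}(d) for the remaining attributes
best = max(candidates, key=lambda a: significance(candidates[a], eps_A))
print(best, significance(candidates[best], eps_A))   # -> a2 0.25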
Given the foregoing, it is not difficult to conclude that the ϵ-reduct has the following benefits.
  • Multi-level data analysis: By incorporating multi-granularity structures, the algorithm can analyze data at various levels of granularity. This allows for a more detailed understanding of data characteristics at each level, which can be crucial for complex datasets.
  • Comprehensive attribute evaluation: The algorithm evaluates attributes not only based on information gain, but also considering their performance across different granularities through the granularity space coefficient. This provides a holistic measure of attribute importance that accounts for varied data resolutions and contexts.
  • Handling uncertainty and fuzziness: by using neighborhood relations instead of indiscernibility relations, the method effectively handles uncertain and fuzzy data, making it suitable for real-world datasets that often contain imprecise or incomplete information.
However, while it has various advantages, it also has certain limitations, such as the computational cost of computing the neighborhood information gain ratio for each attribute across multiple granularities. These limitations also leave considerable room for further development.

3.2. Detailed Algorithm

Based on the significance function, Algorithm 1 is designed to find the ϵ -reduct.
To streamline the analysis of the computational complexity of Algorithm 1, we initiate by applying k-means clustering to generate pseudo-labels for the samples. With T denoting the iteration count for k-means clustering and k indicating the cluster count, the complexity of creating pseudo-labels is O(k · T · |U| · |AT|), where |U| is the total number of samples and |AT| signifies the attribute count. Subsequently, the calculation of ϵ_{A ∪ {a}}(d) occurs no more than (1 + |AT|) · |AT| / 2 times. In conclusion, the computational complexity of Algorithm 1 equates to O\left(\frac{|U|^2 \cdot |AT|^3}{2} + k \cdot T \cdot |U| \cdot |AT|\right).
Algorithm 1 Forward greedy searching for ϵ-reduct with neighborhood rough set (NRS-ϵ)
Input: A decision system DS = (U, AT, d), a neighborhood radius δ, a significance threshold θ.
Output: An ϵ-reduct A.
1: Initialize A = ∅;
2: Calculate initial neighborhood rough set characteristics for DS;
3: for each attribute a ∈ AT do
4:     Generate the neighborhood relation N_δ^a for a;
5: end for
6: repeat
7:     for each a ∈ AT \ A do
8:         Calculate the neighborhood information gain IG_NRS(d, A ∪ {a});
9:         Calculate the neighborhood split information SI_NRS(d, A ∪ {a});
10:        Compute ϵ_{A ∪ {a}}(d) = SI_NRS(d, A ∪ {a}) / e^{|IG_NRS(d, A ∪ {a})|} × W_{A ∪ {a}};
11:    end for
12:    Select attribute b = arg max{ϵ_{A ∪ {a}}(d) : a ∈ AT \ A};
13:    Update A = A ∪ {b};
14: until no attribute can further improve the significance, or ϵ_A(d) / ϵ_AT(d) ≥ θ
15: return A
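For readers who prefer an executable form, the greedy loop of Algorithm 1 can be sketched as follows. Here epsilon is any callable that evaluates the multi-granularity neighborhood information gain ratio of an attribute subset, and the stopping test follows our reading of the definition above; this is an illustration, not the authors' MATLAB implementation.

def nrs_epsilon_reduct(AT, epsilon, theta):
    # AT: iterable of attribute identifiers; epsilon(A): measure of subset A; theta in [0, 1]
    # epsilon(set()) is assumed to be defined by the caller (e.g., 0)
    A = set()
    eps_full = epsilon(set(AT))                    # reference value on the full attribute set
    while True:
        candidates = set(AT) - A
        if not candidates:
            break
        best = max(candidates, key=lambda a: epsilon(A | {a}))
        if epsilon(A | {best}) <= epsilon(A):      # no attribute improves the measure any further
            break
        A.add(best)
        if eps_full > 0 and epsilon(A) / eps_full >= theta:   # significance condition reached
            break
    return A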

4. Experimental Analysis

4.1. Dataset Description

To evaluate the performance of the proposed measure, 15 UCI datasets are used in this experiment. These datasets were carefully selected after a thorough review to meet the multi-granular criteria required by our method, accommodating both supervised and unsupervised learning scenarios. Table 1 summarizes the statistical information of these datasets.

4.2. Experimental Configuration

The experiment was performed on a personal computer running Windows 11, featuring an Intel Core i5-12500H processor (2.50 GHz) with 16.00 GB RAM. MATLAB R2023a served as the development environment.
In this experiment, a double means algorithm was adopted to recursively allocate the attribute granularity space, the k-means clustering method [38] was used to generate pseudo-labels for the samples, and the information gain ratio served as the criterion for evaluating attribute reduction. Notably, the selected k-value needs to match the number of decision categories in the dataset. Moreover, the effect of the neighborhood rough set is significantly influenced by the preset radius size. To demonstrate the effectiveness and applicability of the proposed method, a series of experiments was designed using 20 different radius values, incremented by 0.02, ranging from 0.02 to 0.40. The resulting reducts were validated through 10-fold cross-validation: for each radius, the dataset was divided into ten subsets, nine for training and one for testing, and the process was repeated so that each subset served once as the test set, thereby evaluating the classification performance and ensuring the reliability and stability of the model.
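The evaluation protocol can be reproduced with standard tooling. The snippet below is a sketch using scikit-learn rather than the MATLAB environment of the experiments; it computes the 10-fold cross-validated KNN (K = 3) accuracy on a given reduct and lists the 20 radii mentioned above.

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

def cv_accuracy(X, y, reduct):
    # 10-fold cross-validated accuracy of KNN (K = 3) restricted to the attributes in `reduct`
    accs = []
    for train, test in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
        clf = KNeighborsClassifier(n_neighbors=3).fit(X[train][:, reduct], y[train])
        accs.append(clf.score(X[test][:, reduct], y[test]))
    return float(np.mean(accs))

radii = np.round(np.arange(0.02, 0.41, 0.02), 2)   # 0.02, 0.04, ..., 0.40 as in the text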
In the experiment, the proposed measure is compared with six advanced attribute reduction algorithms, as well as with the case in which no attribute reduction method is applied (no reduct), using Classification and Regression Trees (CART) [20], K-Nearest Neighbors (KNN, K = 3) [39], and Support Vector Machines (SVM) [40]. The performance of the reducts is evaluated in terms of the stability, accuracy, and timeliness of classification, as well as the stability of reduction. The attribute reduction algorithms included for comparison are:
MapReduce-Based Attribute Reduction Algorithm (MARA) [41];
Robust Attribute Reduction Based On Rough Sets (RARR) [42];
Bipolar Fuzzy Relation System Attribute Reduction Algorithms (BFRS) [43];
Attribute Group (AG) [44];
Separability-Based Evaluation Function (SEF) [45];
Genetic Algorithm-based Attribute Reduction (GAAR) [46].

4.3. Comparison of Classification Accuracy

In this part, the classification accuracy of each algorithm is evaluated using KNN, SVM, and CART for predicting test samples. Regarding attribute reduction algorithms, within a decision system D S , the definition of classification accuracy post-reduction is as follows:
Acc_{red} = \frac{|\{x_i \in U : Pre_{red}(x_i) = d(x_i)\}|}{|U|},
where Pre_{red}(x_i) is the predicted label for x_i obtained using the reduct red.
Table 2 and Figure 2 present the specific classification accuracy outcomes for each algorithm across 15 datasets. From these observations, several insights can be readily inferred:
  • For most datasets, the classification accuracy associated with NRS- ϵ is superior to other comparison algorithms, regardless of whether the KNN, SVM, or CART classifier is used. For example, in the “Car Evaluation (ID: 6)” dataset, when using the CART classifier, the classification accuracies of NRS- ϵ , MARA, RARR, BFRS, AG, SEF, and GAAR are 0.5039, 0.4529, 0.4157, 0.494, 0.4886, 0.4909, 0.4719, respectively; when using the KNN classifier, the classification accuracies of NRS- ϵ , MARA, RARR, BFRS, AG, SEF, and GAAR are 0.6977, 0.6584, 0.535, 0.6747, 0.6675, 0.6586, 0.6579, respectively; when using SVM, the classification accuracies of NRS- ϵ , MARA, RARR, BFRS, AG, SEF, and GAAR are 0.5455, 0.4307, 0.368, 0.4737, 0.4698, 0.4718, 0.4923, respectively. Therefore, NRS- ϵ derived simplifications can provide effective classification performance.
  • Examining the average classification accuracy per algorithm reveals that the accuracy associated with NRS- ϵ is on par with, if not exceeding, that of MARA, RARR, BFRS, AG, SEF, and GAAR. When using the CART classifier, the average classification accuracy of NRS- ϵ is 0.8012, up to 29.28% higher than other algorithms; when using the KNN classifier, the average classification accuracy of NRS- ϵ is 0.8169, up to 34.48% higher than other algorithms; when using SVM, the average classification accuracy of NRS- ϵ is 0.80116, up to 36.38% higher than other algorithms.

4.4. Comparison of Classification Stability

Similar to the evaluation of classification accuracy, this section explores the classification stability obtained by analyzing the classification results of the seven different algorithms, including experiments with the CART, KNN, and SVM classifiers. In a decision system DS = (U, AT, d, δ), assume the set U is equally divided into z mutually exclusive groups of the same size (using 10-fold cross-validation, so z = 10), that is, U_1, ..., U_τ, ..., U_z (1 ≤ τ ≤ z). Then, the classification stability based on the reduct red_τ (obtained by removing U_τ from the set U) can be represented as:
Stab_{class} = \frac{2}{z \cdot (z - 1)} \sum_{\tau = 1}^{z - 1} \sum_{\tau' = \tau + 1}^{z} Exa(red_{\tau}, red_{\tau'})
where Exa(red_τ, red_τ′) measures the consistency between two classification results and can be defined according to Table 3.
In Table 3, Pre_{red_τ}(x) represents the predicted label of x obtained by red_τ. The symbols ψ_1, ψ_2, ψ_3, and ψ_4, respectively, represent the numbers of samples that satisfy the corresponding conditions in Table 3. Based on this, Exa(red_τ, red_τ′) is defined as
Exa(red_{\tau}, red_{\tau'}) = \frac{\psi_1 + \psi_4}{\psi_1 + \psi_2 + \psi_3 + \psi_4}.
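The stability computation itself is a short loop over fold pairs. The sketch below follows the Exa definition (ψ_1 and ψ_4 count the samples on which the two reducts agree in correctness); `predictions` holds the z label vectors produced by the fold-wise reducts for the same reference samples, and the names are ours.

from itertools import combinations
import numpy as np

def exa(pred_a, pred_b, d):
    # (psi_1 + psi_4) / (psi_1 + psi_2 + psi_3 + psi_4): fraction of samples on
    # which the two reducts agree about being correct or incorrect w.r.t. d
    a_ok = np.asarray(pred_a) == np.asarray(d)
    b_ok = np.asarray(pred_b) == np.asarray(d)
    return float(np.mean(a_ok == b_ok))

def classification_stability(predictions, d):
    # Stab_class: average Exa over all pairs of the z fold-wise prediction vectors
    pairs = list(combinations(range(len(predictions)), 2))
    return float(np.mean([exa(predictions[i], predictions[j], d) for i, j in pairs]))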
The classification stability index reflects the degree of deviation of the prediction labels when data perturbation occurs. Higher values of classification stability mean more stable prediction labels, indicating a better quality of the corresponding reduct: improvements in classification stability imply more stable prediction results and reduced sensitivity to the choice of training samples. After analyzing the 15 datasets using these three classifiers, Table 4 and Figure 3 present the findings of each algorithm in terms of classification stability.
  • Across many datasets, the NRS-ϵ algorithm exhibits leading performance compared to the other algorithms in terms of classification stability. For example, in the “Iris Plants Database (ID: 2)” dataset, significant differences in classification stability were observed under different classifiers for NRS-ϵ and the other algorithms: when using the CART classifier, the stability of NRS-ϵ reached 0.7364, while MARA, RARR, BFRS, AG, SEF, and GAAR had stabilities of 0.6794, 0.7122, 0.6981, 0.7244, 0.7130, and 0.7276, respectively; when using the KNN classifier, the stability of NRS-ϵ was 0.8357, with MARA, RARR, BFRS, AG, SEF, and GAAR having stabilities of 0.6380, 0.8349, 0.8155, 0.8145, 0.8253, and 0.8246, respectively; when using the SVM classifier, the stabilities of NRS-ϵ, MARA, RARR, BFRS, AG, SEF, and GAAR were 0.8918, 0.6581, 0.8774, 0.8771, 0.8748, 0.8852, and 0.8783, respectively.
  • Regarding average classification stability, NRS-ϵ markedly surpasses the competing algorithms. Specifically, when using the CART classifier, the classification stability of NRS-ϵ was 0.8228, up to 12.51% higher than the other methods; when using the KNN classifier, its classification stability was 0.8972, up to 25.14% higher than the other methods; and with the SVM classifier, the classification stability of NRS-ϵ was 0.9295, up to 14.61% higher than the other methods.

4.5. Comparisons of Elapsed Time

In this section, the time required for attribute reduction by different algorithms is compared. The results are shown in Table 5.
An increase in dimensionality reduction stability tends to be accompanied by a longer reduction time. From an in-depth analysis of Table 5, the following findings can be derived. The reduction time of NRS-ϵ is comparatively long, suggesting that the algorithm's time efficiency throughout the reduction process still needs to be enhanced.
When analyzing the average processing time of the algorithms, it is noteworthy that the value of NRS-ϵ is reduced by 97.23% and 48.86% compared to RARR and GAAR, respectively. Taking the dataset “Car Evaluation (ID: 6)” as an example, the times consumed by NRS-ϵ, MARA, RARR, BFRS, AG, SEF, and GAAR are 122.1212 s, 6.9838 s, 421.1056 s, 154.8219 s, 31.4599 s, 33.3661 s, and 54.0532 s, respectively. Hence, under certain conditions, the time NRS-ϵ takes for attribute reduction is less than that of RARR and BFRS.
Based on the discussion, it is evident that while our novel algorithm exhibits better time efficiency compared to RARR and BFRS on certain datasets, the speed of NRS- ϵ requires further enhancement.

4.6. Comparison of Attribute Dimensionality Reduction Stability

In this section, the attribute dimensionality reduction stability related to 15 datasets is presented. Table 6 shows that the dimensionality reduction stability of NRS- ϵ is slightly lower than GAAR and SEF, but still maintains a leading position. Compared to MARA, RARR, BFRS, and AG, the average dimensionality reduction stability value of NRS- ϵ has increased by 100.2%, 49.89%, 27.19%, and 14.15%, respectively, while it only decreased by 19.323% and 6.677% compared to GAAR and SEF.
Although NRS-ϵ falls short of GAAR's and SEF's results in terms of dimensionality reduction stability on many datasets, in some cases its results in attribute dimensionality reduction are superior to those of all six advanced algorithms. For example, for the “Letter Recognition (ID: 15)” dataset, the dimensionality reduction stabilities of NRS-ϵ, MARA, RARR, BFRS, AG, SEF, and GAAR were 0.8608, 0.6001, 0.4011, 0.7882, 0.6549, 0.7723, and 0.7442, respectively. Compared to the other algorithms, the result of NRS-ϵ improved by 43.47%, 115.2%, 9.211%, 31.44%, 11.46%, and 15.67%, respectively. Thus, it is important to recognize that employing NRS-ϵ favors the selection of attributes that are better aligned with variations in the samples.

5. Conclusions and Future Expectations

In this study, we introduced a novel attribute reduction strategy designed to address the challenges associated with high-dimensional data analysis. This strategy innovatively combines multi-granularity modeling with both supervised and unsupervised learning frameworks, enhancing its adaptability and effectiveness across various levels of data complexity.
This model’s integration of multi-granularity aspects distinguishes it from conventional attribute reduction methods by providing enhanced flexibility and adaptability to different data feature levels. This allows for more precise and effective handling of complex, high-dimensional datasets. The application of our proposed strategy across 15 UCI datasets has demonstrated not only exceptional classification performance, but also robust stability during the dimensionality reduction process. These results substantiate the practical utility and effectiveness of our approach in diverse data scenarios. While the strategy marks a significant advancement in attribute reduction, it does present challenges, primarily related to computational efficiency. The sophisticated nature of the integrated measurement methods, though beneficial for attribute selection quality, substantially increases the computational time required. This aspect can be particularly limiting in time-sensitive applications. To enhance the practicality and efficiency of our attribute reduction strategy, future research efforts could focus on:
  • Implementing acceleration technologies could significantly reduce the computational burden, making the strategy more feasible for larger or more complex datasets.
  • Exploring alternative rough set-based fundamental measurements could provide deeper insights into their impact on classification performance. This exploration may lead to the discovery of even more effective attribute reduction techniques.
By addressing these limitations and exploring these suggested future research directions, we can further refine our attribute reduction strategy, potentially setting a new benchmark in the field. Our findings not only contribute to the existing body of knowledge, but also pave the way for future explorations aimed at enhancing data preprocessing techniques in the era of big data.

Author Contributions

Conceptualization, J.C.; methodology, J.C.; software, Z.F.; validation, J.S.; formal analysis, H.C.; investigation, T.X.; resources, J.S.; data curation, T.X.; writing—original draft preparation, Z.F.; writing—review and editing, T.X.; visualization, Z.F.; supervision, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62006099), Industry-school Cooperative Education Program of the Ministry of Education (Grant No. 202101363034).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare there are no conflicts of interest.

References

  1. Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356. [Google Scholar] [CrossRef]
  2. Chen, H.; Li, T.; Luo, C.; Horng, S.J.; Wang, G. A decision-theoretic rough set approach for dynamic data mining. IEEE Trans. Fuzzy Syst. 2015, 23, 1958–1970. [Google Scholar] [CrossRef]
  3. Dowlatshahi, M.; Derhami, V.; Nezamabadi-pour, H. Ensemble of filter-based rankers to guide an epsilon-greedy swarm optimizer for high-dimensional feature subset selection. Information 2017, 8, 152. [Google Scholar] [CrossRef]
  4. Qian, Y.; Liang, J.; Wei, Z.; Dang, C. Information granularity in fuzzy binary GrC model. IEEE Trans. Fuzzy Syst. 2010, 19, 253–264. [Google Scholar] [CrossRef]
  5. Qian, Y.; Li, F.; Liang, J.; Liu, B.; Dang, C. Space structure and clustering of categorical data. IEEE Trans. Neural Netw. Learn. Syst. 2015, 27, 2047–2059. [Google Scholar] [CrossRef] [PubMed]
  6. Qian, J.; Liu, C.; Miao, D.; Yue, X. Sequential three-way decisions via multi-granularity. Inf. Sci. 2020, 507, 606–629. [Google Scholar] [CrossRef]
  7. Wan, S.; Wang, F.; Dong, J. A preference degree for intuitionistic fuzzy values and application to multi-attribute group decision making. Inf. Sci. 2016, 370, 127–146. [Google Scholar] [CrossRef]
  8. Zhang, Q.; Liu, J.; Yang, F.; Sun, Q.; Yao, Z. Subjective weight determination method of evaluation index based on intuitionistic fuzzy set theory. In Proceedings of the 2022 34th Chinese Control and Decision Conference (CCDC), Hefei, China, 15–17 August 2022; pp. 2858–2861. [Google Scholar] [CrossRef]
  9. Chen, Y.; Wang, P.; Yang, X.; Yu, H. Bee: Towards a robust attribute reduction. Int. J. Mach. Learn. Cybern. 2022, 13, 3927–3962. [Google Scholar] [CrossRef]
  10. Liu, K.; Yang, X.; Fujita, H.; Liu, D.; Yang, X.; Qian, Y. An efficient selector for multi-granularity attribute reduction. Inf. Sci. 2019, 505, 457–472. [Google Scholar] [CrossRef]
  11. Jiang, Z.; Liu, K.; Yang, X.; Yu, H.; Fujita, H.; Qian, Y. Accelerator for supervised neighborhood based attribute reduction. Int. J. Approx. Reason. 2020, 119, 122–150. [Google Scholar] [CrossRef]
  12. Yuan, Z.; Chen, H.; Li, T.; Yu, Z.; Sang, B.; Luo, C. Unsupervised attribute reduction for mixed data based on fuzzy rough sets. Inf. Sci. 2021, 572, 67–87. [Google Scholar] [CrossRef]
  13. Yang, X.; Yao, Y. Ensemble selector for attribute reduction. Appl. Soft Comput. 2018, 70, 1–11. [Google Scholar] [CrossRef]
  14. Jain, P.; Som, T. Multigranular rough set model based on robust intuitionistic fuzzy covering with application to feature selection. Int. J. Approx. Reason. 2023, 156, 16–37. [Google Scholar] [CrossRef]
  15. Ji, X.; Peng, J.H.; Zhao, P.; Yao, S. Extended rough sets model based on fuzzy granular ball and its attribute reduction. Inf. Sci. 2023, 481, 119071. [Google Scholar] [CrossRef]
  16. Yang, Y.; Chen, D.; Wang, H. Active sample selection based incremental algorithm for attribute reduction with rough sets. IEEE Trans. Fuzzy Syst. 2016, 25, 825–838. [Google Scholar] [CrossRef]
  17. Qian, Y.; Liang, J.; Pedrycz, W.; Dang, C. Positive approximation: An accelerator for attribute reduction in rough set theory. Artif. Intell. 2010, 174, 597–618. [Google Scholar] [CrossRef]
  18. Hu, Q.; Pedrycz, W.; Yu, D.; Lang, J. Selecting discrete and continuous features based on neighborhood decision error minimization. IEEE Trans. Syst. Man Cybern. Part (Cybernetics) 2009, 40, 137–150. [Google Scholar] [CrossRef]
  19. Li, J.; Yang, X.; Song, X.; Li, J.; Wang, P.; Yu, D. Neighborhood attribute reduction: A multi-criterion approach. Int. J. Mach. Learn. Cybern. 2019, 10, 731–742. [Google Scholar] [CrossRef]
  20. Wang, J.; Liu, Y.; Chen, J.; Yang, X. An Ensemble Framework to Forest Optimization Based Reduct Searching. Symmetry 2022, 14, 1277. [Google Scholar] [CrossRef]
  21. Xu, E.; Gao, X.; Tan, W. Attributes Reduction Based On Rough Set. In Proceedings of the 2006 International Conference on Machine Learning and Cybernetics, Dalian, China, 13–16 August 2006; pp. 1438–1442. [Google Scholar] [CrossRef]
  22. Xu, S.; Ju, H.; Shang, L.; Pedrycz, W.; Yang, X.; Li, C. Label distribution learning: A local collaborative mechanism. Int. J. Approx. Reason. 2020, 121, 59–84. [Google Scholar] [CrossRef]
  23. Xu, X.; Niu, Y.; Niu, Y. Research on attribute reduction algorithm based on Rough Set Theory and genetic algorithms. In Proceedings of the 2011 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC), Zhengzhou, China, 8–10 August 2011; pp. 524–527. [Google Scholar] [CrossRef]
  24. Yang, X.; Xu, S.; Dou, H.; Song, X.; Yu, H.; Yang, J. Multigranulation rough set: A multiset based strategy. Int. J. Comput. Intell. Syst. 2017, 10, 277–292. [Google Scholar] [CrossRef]
  25. Yang, X.; Liang, S.; Yu, H.; Gao, S.; Qian, Y. Pseudo-label neighborhood rough set: Measures and attribute reductions. Int. J. Approx. Reason. 2019, 105, 112–129. [Google Scholar] [CrossRef]
  26. Dai, J.; Hu, H.; Wu, W.; Qian, Y.; Huang, D. Maximal-discernibility-pair-based approach to attribute reduction in fuzzy rough sets. IEEE Trans. Fuzzy Syst. 2017, 26, 2174–2187. [Google Scholar] [CrossRef]
  27. Qian, Y.; Li, S.; Liang, J.; Shi, Z.; Wang, F. Pessimistic rough set based decisions: A multigranulation fusion strategy. Inf. Sci. 2014, 264, 196–210. [Google Scholar] [CrossRef]
  28. Qian, Y.; Liang, J.; Yao, Y.; Dang, C. MGRS: A multi-granulation rough set. Inf. Sci. 2010, 180, 949–970. [Google Scholar] [CrossRef]
  29. Pan, Y.; Xu, W.; Ran, Q. An incremental approach to feature selection using the weighted dominance-based neighborhood rough sets. Int. J. Mach. Learn. Cybern. 2023, 14, 1217–1233. [Google Scholar] [CrossRef]
  30. Qian, Y.; Zhang, H.; Sang, Y.; Liang, J. Multigranulation decision-theoretic rough sets. Int. J. Approx. Reason. 2014, 55, 225–237. [Google Scholar] [CrossRef]
  31. Lin, G.; Qian, Y.; Li, J. NMGRS: Neighborhood-based multigranulation rough sets. Int. J. Approx. Reason. 2012, 53, 1080–1093. [Google Scholar] [CrossRef]
  32. Song, M.; Chen, J.; Song, J.; Xu, T.; Fan, Y. Forward Greedy Searching to κ-Reduct Based on Granular Ball. Symmetry 2023, 15, 996. [Google Scholar] [CrossRef]
  33. Xing, T.; Chen, J.; Xu, T.; Fan, Y. Fusing Supervised and Unsupervised Measures for Attribute Reduction. Intell. Autom. Soft Comput. 2023, 37, 561. [Google Scholar] [CrossRef]
  34. Dai, J.; Wang, W.; Tian, H.; Liu, L. Attribute selection based on a new conditional entropy for incomplete decision systems. Knowl. Based Syst. 2013, 39, 207–213. [Google Scholar] [CrossRef]
  35. Liang, J.; Zhao, X.; Li, D.; Cao, F.; Dang, C. Determining the number of clusters using information entropy for mixed data. Pattern Recognit. 2012, 45, 2251–2265. [Google Scholar] [CrossRef]
  36. Yin, Z.; Fan, Y.; Wang, P.; Chen, J. Parallel Selector for Feature Reduction. Mathematics 2023, 11, 2084. [Google Scholar] [CrossRef]
  37. Chen, Y.; Wang, P.; Yang, X.; Mi, J.; Liu, D. Granular ball guided selector for attribute reduction. Knowl. Based Syst. 2021, 229, 107326. [Google Scholar] [CrossRef]
  38. Wang, P.; Shi, H.; Yang, X.; Mi, J. Three-way k-means: Integrating k-means and three-way decision. Int. J. Mach. Learn. Cybern. 2019, 10, 2767–2777. [Google Scholar] [CrossRef]
  39. Fukunaga, K.; Narendra, P. A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans. Comput. 1975, 100, 750–753. [Google Scholar] [CrossRef]
  40. Chang, C.; Lin, C. LIBSVM: A library for support vector machines. Acm Trans. Intell. Syst. Technol. (Tist) 2011, 2, 1–27. [Google Scholar] [CrossRef]
  41. Yin, L.; Li, J.; Jiang, Z.; Ding, J.; Xu, X. An efficient attribute reduction algorithm using MapReduce. J. Inf. Sci. 2021, 47, 101–117. [Google Scholar] [CrossRef]
  42. Dong, L.; Chen, D.; Wang, N.; Lu, Z. Key energy-consumption feature selection of thermal power systems based on robust attribute reduction with rough sets. Inf. Sci. 2020, 532, 61–71. [Google Scholar] [CrossRef]
  43. Ali, G.; Akram, M.; Alcantud, J. Attributes reductions of bipolar fuzzy relation decision systems. Neural Comput. Appl. 2020, 32, 10051–10071. [Google Scholar] [CrossRef]
  44. Chen, Y.; Liu, K.; Song, J.; Fujita, H.; Yang, X.; Qian, Y. Attribute group for attribute reduction. Inf. Sci. 2020, 535, 64–80. [Google Scholar] [CrossRef]
  45. Hu, M.; Tsang, E.; Guo, Y.; Xu, W. Fast and robust attribute reduction based on the separability in fuzzy decision systems. IEEE Trans. Cybern. 2021, 52, 5559–5572. [Google Scholar] [CrossRef] [PubMed]
  46. Iqbal, F.; Hashmi, J.; Fung, B.; Batool, R.; Khattak, A.; Aleem, S.; Hung, P. A hybrid framework for sentiment analysis using genetic algorithm based feature reduction. IEEE Access 2019, 7, 14637–14652. [Google Scholar] [CrossRef]
Figure 1. The structure of the ϵ-reduct part.
Figure 2. Classification accuracies of three classifiers.
Figure 3. Classification stabilities of three classifiers.
Table 1. Dataset descriptions.
ID | Datasets | Samples | Attributes | Labels
1 | Adult Income | 48,842 | 14 | 2
2 | Iris Plants Database | 150 | 4 | 3
3 | Wine | 178 | 13 | 3
4 | Breast Cancer Wisconsin (Original) | 699 | 10 | 2
5 | Climate Model Simulation Crashes | 540 | 20 | 2
6 | Car Evaluation | 1728 | 6 | 4
7 | Human Activity Recognition Using Smartphones | 10,299 | 561 | 6
8 | Statlog (Image Segmentation) | 2310 | 18 | 7
9 | Yeast | 1484 | 8 | 10
10 | Seeds | 210 | 7 | 3
11 | Ultrasonic Flowmeter Diagnostics-Meter D | 180 | 43 | 4
12 | Spambase | 4601 | 57 | 2
13 | Mushroom | 8124 | 22 | 2
14 | Heart Disease | 303 | 75 | 5
15 | Letter Recognition | 20,000 | 16 | 26
Table 2. The comparisons of the classification accuracies.

CART
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.8514 | 0.5849 | 0.5911 | 0.8469 | 0.8511 | 0.8494 | 0.8452 | 0.8377
2 | 0.6795 | 0.5078 | 0.6392 | 0.6604 | 0.6749 | 0.6516 | 0.6573 | 0.6715
3 | 0.7312 | 0.6342 | 0.7584 | 0.7213 | 0.7671 | 0.7568 | 0.7687 | 0.7019
4 | 0.9507 | 0.5655 | 0.1499 | 0.9503 | 0.9502 | 0.9466 | 0.9445 | 0.9434
5 | 0.8783 | 0.8057 | 0.8779 | 0.8468 | 0.8532 | 0.8340 | 0.8184 | 0.8528
6 | 0.5039 | 0.4529 | 0.4157 | 0.4940 | 0.4886 | 0.4909 | 0.4719 | 0.5006
7 | 0.9848 | 0.7403 | 0.6217 | 0.9831 | 0.9845 | 0.9825 | 0.9820 | 0.9818
8 | 0.9271 | 0.3540 | 0.3046 | 0.9187 | 0.9262 | 0.9245 | 0.9259 | 0.9007
9 | 0.8101 | 0.7125 | 0.8015 | 0.8014 | 0.8097 | 0.7944 | 0.8075 | 0.7884
10 | 0.8091 | 0.4674 | 0.7979 | 0.7947 | 0.8047 | 0.8023 | 0.8022 | 0.8005
11 | 0.9281 | 0.8612 | 0.8769 | 0.9072 | 0.9140 | 0.9276 | 0.8937 | 0.8917
12 | 0.8206 | 0.5887 | 0.5967 | 0.8161 | 0.8161 | 0.8161 | 0.8169 | 0.8016
13 | 0.8834 | 0.8115 | 0.8828 | 0.8652 | 0.8665 | 0.8670 | 0.8794 | 0.8769
14 | 0.6158 | 0.5784 | 0.6048 | 0.6067 | 0.6095 | 0.6053 | 0.6138 | 0.6117
15 | 0.6434 | 0.6308 | 0.6422 | 0.6430 | 0.6414 | 0.6421 | 0.6425 | 0.6482
Average | 0.8012 | 0.6197 | 0.6374 | 0.7903 | 0.7972 | 0.7927 | 0.7913 | 0.7873
rate | - | 29.27% ↑ | 25.68% ↑ | 1.37% ↑ | 0.49% ↑ | 1.06% ↑ | 1.24% ↑ | 1.77% ↑

KNN
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.8898 | 0.5177 | 0.4930 | 0.8898 | 0.8935 | 0.8891 | 0.8891 | 0.8794
2 | 0.6547 | 0.4723 | 0.6140 | 0.6465 | 0.6529 | 0.6387 | 0.6410 | 0.6438
3 | 0.6802 | 0.5942 | 0.6880 | 0.6703 | 0.6930 | 0.6702 | 0.7005 | 0.6925
4 | 0.9445 | 0.5269 | 0.1590 | 0.9436 | 0.9414 | 0.9419 | 0.9344 | 0.9261
5 | 0.8392 | 0.7890 | 0.8390 | 0.7659 | 0.8015 | 0.7768 | 0.7701 | 0.7349
6 | 0.6977 | 0.6584 | 0.5350 | 0.6747 | 0.6675 | 0.6586 | 0.6579 | 0.6098
7 | 0.8597 | 0.7315 | 0.5796 | 0.8620 | 0.8671 | 0.8691 | 0.8632 | 0.8672
8 | 0.9743 | 0.2733 | 0.2094 | 0.9671 | 0.9684 | 0.9659 | 0.9658 | 0.9176
9 | 0.8730 | 0.7087 | 0.8685 | 0.8671 | 0.8729 | 0.8605 | 0.8650 | 0.8714
10 | 0.7655 | 0.3996 | 0.7533 | 0.7527 | 0.7634 | 0.7577 | 0.7603 | 0.7476
11 | 0.9267 | 0.8927 | 0.8943 | 0.9081 | 0.9121 | 0.9266 | 0.9042 | 0.8905
12 | 0.9132 | 0.6035 | 0.6066 | 0.8998 | 0.8981 | 0.8959 | 0.8972 | 0.9027
13 | 0.8948 | 0.7087 | 0.8945 | 0.8700 | 0.8871 | 0.8791 | 0.8865 | 0.8845
14 | 0.6148 | 0.5105 | 0.6122 | 0.6079 | 0.6115 | 0.6145 | 0.6104 | 0.6112
15 | 0.7260 | 0.7254 | 0.7096 | 0.7064 | 0.7094 | 0.7009 | 0.7032 | 0.6994
Average | 0.8169 | 0.6075 | 0.6304 | 0.8021 | 0.8093 | 0.8030 | 0.8033 | 0.7919
rate | - | 34.47% ↑ | 29.59% ↑ | 1.84% ↑ | 0.94% ↑ | 1.73% ↑ | 1.70% ↑ | 3.16% ↑

SVM
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.8616 | 0.5751 | 0.5741 | 0.8573 | 0.8612 | 0.8613 | 0.8580 | 0.8194
2 | 0.6343 | 0.3989 | 0.5935 | 0.6231 | 0.6314 | 0.6047 | 0.6142 | 0.6241
3 | 0.7388 | 0.4890 | 0.7211 | 0.7270 | 0.7614 | 0.7538 | 0.7400 | 0.5212
4 | 0.9280 | 0.4741 | 0.1499 | 0.9169 | 0.9108 | 0.9150 | 0.9106 | 0.8723
5 | 0.6392 | 0.5834 | 0.6390 | 0.5834 | 0.5912 | 0.5618 | 0.5962 | 0.6231
6 | 0.5455 | 0.4307 | 0.3680 | 0.4737 | 0.4698 | 0.4718 | 0.4923 | 0.5298
7 | 0.6809 | 0.6021 | 0.4575 | 0.6636 | 0.6756 | 0.6573 | 0.6603 | 0.5643
8 | 0.9354 | 0.3234 | 0.3016 | 0.9129 | 0.9193 | 0.9146 | 0.9187 | 0.8759
9 | 0.8718 | 0.6420 | 0.8636 | 0.8672 | 0.8672 | 0.8549 | 0.8651 | 0.8831
10 | 0.7724 | 0.4580 | 0.7565 | 0.7461 | 0.7543 | 0.7453 | 0.7535 | 0.7478
11 | 0.9209 | 0.9075 | 0.9078 | 0.9092 | 0.9105 | 0.9201 | 0.9075 | 0.9052
12 | 0.9402 | 0.6625 | 0.6774 | 0.9267 | 0.9240 | 0.9218 | 0.9249 | 0.9324
13 | 0.8940 | 0.8001 | 0.8938 | 0.8371 | 0.8645 | 0.8367 | 0.8827 | 0.8560
14 | 0.6372 | 0.5436 | 0.6357 | 0.6281 | 0.6286 | 0.6228 | 0.6281 | 0.6405
15 | 0.6662 | 0.6641 | 0.6544 | 0.6533 | 0.6549 | 0.6522 | 0.6544 | 0.6248
Average | 0.7778 | 0.5703 | 0.6129 | 0.75502 | 0.7621 | 0.7530 | 0.7604 | 0.7347
rate | - | 36.38% ↑ | 26.90% ↑ | 3.01% ↑ | 2.05% ↑ | 3.29% ↑ | 2.27% ↑ | 5.87% ↑
Table 3. Joint distribution of classification results.
 | Pre_{red_τ′}(x) = d(x) | Pre_{red_τ′}(x) ≠ d(x)
Pre_{red_τ}(x) = d(x) | ψ_1 | ψ_2
Pre_{red_τ}(x) ≠ d(x) | ψ_3 | ψ_4
Table 4. The comparisons of classification stabilities.

CART
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.8583 | 0.8663 | 0.8369 | 0.8580 | 0.8566 | 0.8538 | 0.8558 | 0.8551
2 | 0.7364 | 0.6794 | 0.7122 | 0.6981 | 0.7244 | 0.7130 | 0.7276 | 0.7130
3 | 0.7222 | 0.7453 | 0.7761 | 0.7546 | 0.7553 | 0.7431 | 0.7500 | 0.7495
4 | 0.9485 | 0.6265 | 0.7733 | 0.9423 | 0.9454 | 0.9480 | 0.9391 | 0.8747
5 | 0.8837 | 0.8279 | 0.8834 | 0.8370 | 0.8473 | 0.8432 | 0.8407 | 0.8518
6 | 0.6469 | 0.6557 | 0.6584 | 0.6690 | 0.6473 | 0.6397 | 0.6632 | 0.6543
7 | 0.9812 | 0.9334 | 0.8934 | 0.9797 | 0.9806 | 0.9809 | 0.9792 | 0.9611
8 | 0.9259 | 0.6674 | 0.9501 | 0.9141 | 0.9041 | 0.9088 | 0.9159 | 0.8837
9 | 0.9017 | 0.7772 | 0.9014 | 0.8727 | 0.8880 | 0.8906 | 0.8948 | 0.8752
10 | 0.8367 | 0.7041 | 0.8151 | 0.8359 | 0.8082 | 0.8269 | 0.8283 | 0.8078
11 | 0.9246 | 0.8853 | 0.8994 | 0.9204 | 0.9054 | 0.9014 | 0.9032 | 0.9056
12 | 0.7724 | 0.5217 | 0.6804 | 0.7532 | 0.7492 | 0.7601 | 0.7565 | 0.7134
13 | 0.9145 | 0.9144 | 0.9037 | 0.8680 | 0.8515 | 0.8331 | 0.8940 | 0.8827
14 | 0.6465 | 0.5331 | 0.6395 | 0.6402 | 0.6372 | 0.6441 | 0.6348 | 0.6250
15 | 0.6420 | 0.6316 | 0.6376 | 0.6383 | 0.6339 | 0.6412 | 0.6393 | 0.6377
Average | 0.8228 | 0.7313 | 0.7974 | 0.8121 | 0.8090 | 0.8085 | 0.8148 | 0.7994
rate | - | 12.51% ↑ | 3.18% ↑ | 1.31% ↑ | 1.71% ↑ | 1.76% ↑ | 0.97% ↑ | 2.93% ↑

KNN
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.9460 | 0.8825 | 0.8377 | 0.9427 | 0.9378 | 0.9428 | 0.9387 | 0.9186
2 | 0.8357 | 0.6380 | 0.8349 | 0.8155 | 0.8145 | 0.8253 | 0.8246 | 0.7841
3 | 0.7977 | 0.6518 | 0.8102 | 0.7494 | 0.7526 | 0.7656 | 0.7867 | 0.7591
4 | 0.9924 | 0.5949 | 0.9908 | 0.9737 | 0.9689 | 0.9771 | 0.9660 | 0.9234
5 | 0.9452 | 0.7945 | 0.9443 | 0.8743 | 0.8709 | 0.8532 | 0.8996 | 0.8830
6 | 0.8031 | 0.7612 | 0.7018 | 0.7555 | 0.7539 | 0.7597 | 0.7825 | 0.7597
7 | 0.9062 | 0.9357 | 0.9044 | 0.9024 | 0.8903 | 0.9065 | 0.9002 | 0.9065
8 | 0.9786 | 0.6906 | 0.8601 | 0.9663 | 0.9592 | 0.9678 | 0.9627 | 0.9122
9 | 0.9341 | 0.6991 | 0.9284 | 0.9102 | 0.9233 | 0.9296 | 0.9245 | 0.8927
10 | 0.8854 | 0.6707 | 0.8761 | 0.8749 | 0.8541 | 0.8823 | 0.8707 | 0.8449
11 | 0.9706 | 0.9705 | 0.9657 | 0.9358 | 0.9309 | 0.9396 | 0.9457 | 0.9514
12 | 0.9299 | 0.5301 | 0.7138 | 0.8746 | 0.8745 | 0.8935 | 0.8808 | 0.8140
13 | 0.9222 | 0.6058 | 0.9362 | 0.8632 | 0.8705 | 0.8662 | 0.8935 | 0.8511
14 | 0.7942 | 0.4714 | 0.7889 | 0.7682 | 0.7727 | 0.7934 | 0.7813 | 0.7386
15 | 0.8164 | 0.8574 | 0.8273 | 0.8067 | 0.8063 | 0.8250 | 0.8141 | 0.8206
Average | 0.8972 | 0.7170 | 0.8614 | 0.8676 | 0.8653 | 0.8752 | 0.8781 | 0.8507
rate | - | 25.14% ↑ | 4.16% ↑ | 3.41% ↑ | 3.68% ↑ | 2.51% ↑ | 2.17% ↑ | 5.47% ↑

SVM
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR | NO REDUCT
1 | 0.9739 | 0.8679 | 0.8877 | 0.9733 | 0.9671 | 0.9698 | 0.9675 | 0.9439
2 | 0.8918 | 0.6581 | 0.8774 | 0.8771 | 0.8748 | 0.8852 | 0.8783 | 0.8490
3 | 0.8754 | 0.6134 | 0.9050 | 0.8575 | 0.8662 | 0.8668 | 0.8477 | 0.8331
4 | 0.9807 | 0.7282 | 0.7456 | 0.9749 | 0.9678 | 0.9797 | 0.9641 | 0.9059
5 | 0.7782 | 0.7557 | 0.7779 | 0.7476 | 0.7543 | 0.7576 | 0.7576 | 0.7613
6 | 0.7926 | 0.6862 | 0.7394 | 0.7609 | 0.7490 | 0.7519 | 0.7729 | 0.7504
7 | 0.9247 | 0.9992 | 1.0001 | 0.9311 | 0.9015 | 0.9376 | 0.9287 | 0.9461
8 | 0.9709 | 0.6467 | 0.9567 | 0.9501 | 0.9267 | 0.9478 | 0.9459 | 0.9064
9 | 0.9670 | 0.7753 | 0.9655 | 0.9536 | 0.9566 | 0.9646 | 0.9639 | 0.9352
10 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9895
11 | 0.9975 | 1.0000 | 0.9995 | 0.9846 | 0.9945 | 0.9974 | 1.0000 | 0.9962
12 | 0.9694 | 0.5583 | 0.8165 | 0.9141 | 0.9117 | 0.9334 | 0.9225 | 0.8608
13 | 0.9338 | 0.9144 | 0.9587 | 0.8968 | 0.8978 | 0.8711 | 0.9314 | 0.9149
14 | 0.9245 | 1.0000 | 0.9240 | 0.9221 | 0.9208 | 0.9323 | 0.9262 | 0.9357
15 | 0.9619 | 0.9611 | 0.9455 | 0.9382 | 0.9260 | 0.9362 | 0.9421 | 0.9444
Average | 0.9295 | 0.8110 | 0.9000 | 0.9121 | 0.9077 | 0.9154 | 0.9166 | 0.8989
rate | - | 14.61% ↑ | 3.30% ↑ | 1.90% ↑ | 2.41% ↑ | 1.54% ↑ | 1.41% ↑ | 3.42% ↑
Table 5. The elapsed time of all seven algorithms (in seconds).
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR
1 | 429.7783 | 0.8807 | 6.6227 | 57.0501 | 7.2915 | 6.5293 | 26.3625
2 | 2.4335 | 0.6513 | 0.8982 | 0.9413 | 0.1933 | 0.2173 | 0.3533
3 | 21.0802 | 0.3232 | 0.7559 | 1.9937 | 0.2217 | 0.1986 | 0.6452
4 | 7.0519 | 1.3462 | 4.2944 | 7.0359 | 1.4769 | 1.1096 | 1.9095
5 | 14.0703 | 0.2167 | 5.3504 | 11.7332 | 1.9639 | 1.8331 | 3.4963
6 | 122.1212 | 6.9838 | 421.1056 | 154.8219 | 31.4599 | 33.3661 | 54.0532
7 | 233.6055 | 81.7964 | 21.1919 | 80.4803 | 20.2694 | 18.3893 | 5.7563
8 | 36.3329 | 0.5928 | 28.5184 | 66.8708 | 11.0547 | 9.1725 | 15.1738
9 | 70.3101 | 5.7716 | 532.7057 | 286.4753 | 49.7015 | 38.0667 | 3.976
10 | 7.9717 | 0.6485 | 1.8745 | 9.1748 | 1.5362 | 1.1173 | 2.0991
11 | 2.3576 | 0.5797 | 0.6115 | 1.0526 | 0.2311 | 0.2115 | 0.3641
12 | 18.2902 | 0.5314 | 6.0787 | 43.3079 | 8.0851 | 5.6702 | 11.0026
13 | 167.9126 | 15.3875 | 1605.0476 | 1466.8222 | 54.7792 | 157.5132 | 327.6303
14 | 8.5942 | 0.6286 | 0.7663 | 3.7765 | 0.5923 | 0.5887 | 0.9772
15 | 345.9941 | 10.3419 | 46.5721 | 1403.8344 | 207.5309 | 167.8453 | 310.1552
Average | 99.1936 | 8.4452 | 305.4930 | 239.6914 | 39.7592 | 29.4552 | 57.5970
rate | - | 1001% ↑ | 97.23% ↓ | 27.45% ↑ | 503.1% ↑ | 34.98% ↑ | 48.86% ↓
Table 6. The stabilities of all seven algorithms.
ID | NRS-ϵ | MARA | RARR | BFRS | AG | SEF | GAAR
1 | 0.6587 | 0.1275 | 0.4888 | 0.1535 | 0.1232 | 0.2059 | 0.6033
2 | 0.5761 | 0.6308 | 0.9277 | 0.2903 | 0.2941 | 0.3647 | 0.5004
3 | 0.8504 | 0.2051 | 0.9506 | 0.6211 | 0.4254 | 0.6466 | 0.9209
4 | 0.9271 | 0.4013 | 0.9224 | 0.7869 | 0.8033 | 0.9356 | 0.898
5 | 0.8246 | 1.0000 | 0.9045 | 0.7659 | 0.7281 | 0.8687 | 0.8007
6 | 0.9006 | 0.1498 | 0.6007 | 0.5958 | 0.5818 | 0.7154 | 0.6374
7 | 0.5577 | 0.4933 | 0.7277 | 0.2996 | 0.1764 | 0.3813 | 0.5346
8 | 0.9153 | 0.3103 | 1.0000 | 0.8698 | 0.6712 | 0.8967 | 0.7945
9 | 0.8959 | 1.0000 | 1.0000 | 0.7955 | 0.7183 | 0.8788 | 0.8257
10 | 0.8246 | 0.4051 | 0.8588 | 0.8033 | 0.7261 | 0.8280 | 0.815
11 | 0.6606 | 0.0546 | 0.3773 | 0.5284 | 0.3484 | 0.3265 | 0.5815
12 | 0.9146 | 0.2001 | 0.9917 | 0.8328 | 0.8259 | 0.9281 | 0.8708
13 | 0.9054 | 0.3037 | 1.0000 | 0.7155 | 0.5962 | 0.7672 | 0.7502
14 | 0.7816 | 0.1602 | 0.8678 | 0.6017 | 0.6038 | 0.7438 | 0.7165
15 | 0.8608 | 0.6001 | 0.4011 | 0.7882 | 0.6549 | 0.7723 | 0.7442
Average | 0.8034 | 0.4014 | 0.8012 | 0.6299 | 0.5518 | 0.6840 | 0.7329
rate | - | 100.2% ↑ | 49.89% ↓ | 27.19% ↑ | 14.15% ↑ | 19.323% ↓ | 6.677% ↓
