Article

Dynamic Variable Precision Attribute Reduction Algorithm

School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
* Author to whom correspondence should be addressed.
Symmetry 2024, 16(9), 1239; https://doi.org/10.3390/sym16091239
Submission received: 29 August 2024 / Revised: 18 September 2024 / Accepted: 19 September 2024 / Published: 21 September 2024

Abstract

Dynamic reduction algorithms have become an important part of attribute reduction research because of their ability to perform dynamic updates without the need to retrain the original model. To enhance the efficiency of variable precision reduction algorithms in processing dynamic data, research has been conducted from the perspective of the construction process of the discernibility matrix. By modifying the decision values of some samples through an absolute majority voting strategy, a connection between variable precision reduction and positive region reduction has been established. Considering the increase and decrease of samples, dynamic variable precision reduction algorithms have been proposed. For sample increase, four scenarios have been analyzed, the corresponding cases have been discussed, and judgment conditions for the construction of the discernibility matrix have been proposed, which has led to the development of a dynamic variable precision reduction algorithm for sample increasing (DVPRA-SI). Similarly, for sample deletion, three scenarios have been analyzed, and the judgment conditions for the construction of the discernibility matrix have been discussed, which has resulted in the development of a dynamic variable precision reduction algorithm for sample deletion (DVPRA-SD). Finally, the two proposed algorithms and existing dynamic variable precision reduction algorithms were compared in terms of running time and classification precision, and the experiments demonstrated that both algorithms are feasible and effective.

1. Introduction

The rough set was proposed by the Polish scholar Pawlak [1]. The main contribution of attribute reduction is to remove redundant attributes while keeping the classification precision essentially unchanged. However, because of the strictness of classical classification in the rough set, it is often sensitive to noise disturbance or errors in practical applications. Subsequently, to enhance robustness in applications, the variable precision rough set (VPRS) [2] was proposed, which allows a certain degree of noise in order to relax the strictness of the classification.
Two main types of reduction algorithms are used to identify reducts: heuristic algorithms and discernibility matrix-based algorithms. Although discernibility matrix-based algorithms have a high time complexity, they remain the only approach that identifies all reducts. Various discernibility matrix-based algorithms have been proposed, such as the axisymmetric positive region discernibility matrix and the symmetric variable precision discernibility matrix. The invariant reduction studied in [3] generates a discernibility matrix for each invariant; the discernibility matrix for method 1 is axisymmetric, while the discernibility matrices for the other two methods are symmetric.
To meet the extensive application needs of VPRS theory, the academic community has extended it: the concept of a general VPRS approximation [4] was proposed, together with an effective matrix method for its calculation, thus simplifying the calculation of the VPRS approximation. Simultaneously, two types of attribute reduction for information systems were also provided [5]: β-reduction and β variable precision reduction (VPR). Researchers also found that only the smallest element in the discernibility matrix is needed to determine a reduction result [6]. Additionally, the VPRS has been combined with other methods to varying degrees, such as the combination of variable precision and granularity [7,8], and variable precision fuzzy rough set models based on covering also exist [9]. Furthermore, various other reduction methods exist, such as neighborhood reduction, covering reduction, multi-label reduction, and information entropy reduction [10,11,12,13,14,15,16,17,18,19,20,21,22,23].
Considering the problems of sample set disturbance, attribute set disturbance, and attribute value disturbance, it is particularly important to develop incremental algorithms because of the relatively low efficiency of non-incremental algorithms. To overcome this problem, a feature selection framework based on the discernibility score generated by eliminating redundant samples in the incremental method was proposed [24]. Considering the usefulness of new samples, an active sample selection method [25] was proposed to select new samples dynamically. For the incremental algorithm for attribute value disturbance, a new compound attribute measure method [26] was proposed. The incremental algorithm was proposed to handle missing attribute values for incomplete decision tables [27]. For the simultaneous disturbance of the sample set and attribute set [28], the incremental establishment process was discussed. Additionally, for different application backgrounds, dynamic attribute reduction algorithms [29,30,31,32,33,34,35,36] were proposed, which enhance the adaptability and precision of dynamic samples while maintaining computational efficiency.
In recent years, incremental approximation updates have attracted the interest of scholars. Some approximation update methods require recalculation when new samples are added or existing ones are removed, inevitably leading to substantial redundant computations.
In ongoing research, an incremental matrix algorithm facilitates dynamic reduction while preserving the integrity of the original samples [37,38,39,40,41,42]. Through matrix partitioning, a thorough analysis is conducted on the dynamic interplay between existing and newly introduced samples. Notably, the matrix formed subsequent to the addition of new samples exhibits symmetrical properties, which is a crucial characteristic for algorithmic design [43]. To enhance computational efficiency and minimize computation time, this paper introduces a discernibility matrix-based dynamic approximation update method.
This paper establishes the relationship between positive region reduction (PRR) and variable precision reduction (VPR) from the perspective of constructing the discernibility matrix, and proposes variable precision dynamic update algorithms based on the dynamic changes in the relationship between the two reductions. The main contributions of this paper are as follows:
  • At present, the discussion of the relationship and dynamic changes between different reduction methods is relatively insufficient. By modifying the decision values of some samples through an absolute majority voting strategy, this paper establishes the relationship between positive region reduction (PRR) and VPR from the perspective of discernibility matrix construction.
  • Four scenarios that appear when the number of samples increases are analyzed, and the dynamic relationship between PRR and VPR is discussed for four cases. In the same manner, the analysis and discussion are conducted on three scenarios when samples are deleted.
  • Based on changes to the number of samples in the VPR model, whether an increase or a decrease, dynamic VPR algorithms using the PRR model are proposed: the dynamic variable precision reduction algorithm for sample increasing (DVPRA-SI) and the dynamic variable precision reduction algorithm for sample deletion (DVPRA-SD).
The remainder of this paper is organized as follows: In Section 2, we review the basic concepts of PRR and VPR. In Section 3, we illustrate the dynamic reduction algorithm when samples are added. In Section 4, we illustrate the dynamic reduction algorithm when samples are deleted. In Section 5, we conduct experiments on the proposed algorithms and illustrate the feasibility of the algorithms. Section 6 contains a conclusion of the paper.

2. Preliminaries

In this section, the related concepts of PRR and VPR are reviewed. To clarify the research content of this paper, the research basis for the VPR algorithm in incremental learning is also introduced.
The quadruple $S = (U, AT, V_a, f)$ is an information table [6,23]. $U$ is the universe, $AT$ is a nonempty finite set of attributes, $\{V_a \mid a \in AT\}$ is a nonempty finite set of values, and $f: U \times AT \to V$ is a function, where $f(x, a)$ denotes the value of object $x$ under attribute $a$. The equivalence relation [4,5] is defined as $R_A = \{(x, y) \in U \times U \mid f(x, a) = f(y, a),\ \forall a \in A\}$, and the equivalence class is $[x]_A = \{y \mid (x, y) \in R_A\}$. Given $S$, where $C$ is the condition attribute set, $D$ is the decision attribute set, and $C \cap D = \emptyset$, $(U, C \cup D)$ is called a decision table.
Definition 1 
([5]). Given $(U, C \cup D)$, $U/D = \{D_1, D_2, \ldots, D_t\}$ is the quotient set by $D$, and the positive region is defined as $\mathrm{Pos}_C(D) = \bigcup_{i=1}^{t} \underline{R_C}(D_i)$, where $\underline{R_C}(D_i) = \{x \mid [x]_C \subseteq D_i\}$.
Definition 2 
([23]). Given $(U, C \cup D)$ and $B \subseteq C$, $B$ is a PRR of $C$ when $B$ satisfies the following conditions:
$$(1)\ \mathrm{Pos}_C(D) = \mathrm{Pos}_B(D) \qquad (2)\ \mathrm{Pos}_C(D) \neq \mathrm{Pos}_{B'}(D)\ \text{for all}\ B' \subset B$$
The PRR's corresponding discernibility matrix $M_{\mathrm{Pos}_C D} = (m_{\mathrm{Pos}_C D}(x_i, x_j))_{s \times n}$ is given by
$$m_{\mathrm{Pos}_C D}(x_i, x_j) = \begin{cases} \{a \in C : (x_i, x_j) \notin R_a\}, & x_i \in \mathrm{Pos}_C(D)\ \text{and}\ \big((x_i, x_j) \notin R_D\ \text{or}\ x_j \notin \mathrm{Pos}_C(D)\big) \\ \emptyset, & \text{otherwise} \end{cases}$$
where $s = |\mathrm{Pos}_C(D)|$, $n = |U|$, and $|\cdot|$ denotes the cardinality of a set. Each row of the discernibility matrix represents any object of an equivalence class in the positive region, so the discernibility matrix can be compressed.
Remark 1. 
If $y \in R_C(x)$ and $R_C(x) \subseteq R_D(x)$, then $M_{\mathrm{Pos}_C D}(x, :) = M_{\mathrm{Pos}_C D}(y, :)$, where $M_{\mathrm{Pos}_C D}(x, :)$ represents the row in which $x$ is located in the discernibility matrix; it is easy to observe that $x$ and $y$ occupy the same row in the discernibility matrix.
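To make Definitions 1 and 2 concrete, the following minimal Python sketch (not the authors' implementation; all function and variable names are illustrative) computes the positive region and the compressed discernibility matrix of Remark 1. The entry rule treats objects outside the positive region as discernible, which is the convention consistent with the worked example later in this section.

```python
def diff_attrs(u, v, attrs):
    """Attributes on which two objects take different values."""
    return {a for k, a in enumerate(attrs) if u[k] != v[k]}

def positive_region(C_vals, D_vals):
    """Objects whose condition class is contained in a single decision class."""
    classes = {}
    for i, key in enumerate(C_vals):
        classes.setdefault(key, []).append(i)
    pos = set()
    for block in classes.values():
        if len({D_vals[i] for i in block}) == 1:
            pos.update(block)
    return pos

def prr_matrix(C_vals, D_vals, attrs):
    """Compressed positive-region discernibility matrix: one row per condition
    class inside the positive region (Remark 1); entries are attribute sets."""
    pos = positive_region(C_vals, D_vals)
    rows = {}
    for xi in sorted(pos):
        rep = C_vals[xi]
        if rep in rows:                      # objects of one class share the same row
            continue
        rows[rep] = [
            diff_attrs(C_vals[xi], C_vals[xj], attrs)
            if (D_vals[xi] != D_vals[xj] or xj not in pos) else set()
            for xj in range(len(C_vals))
        ]
    return rows

# Table 2 of this section: condition values (a1, a2, a3) and modified decisions
C_vals = [(1,0,1), (1,0,1), (0,1,0), (0,0,1), (0,1,0), (0,1,0), (0,0,1), (0,0,1)]
D_vals = [0, 1, 1, 2, 1, 1, 2, 2]
for rep, row in prr_matrix(C_vals, D_vals, ["a1", "a2", "a3"]).items():
    print(rep, row)
```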
Definition 3 
([4,5,23]). $X$ is a subset of $U$, and for each $x \in U$, the characteristic function $\lambda_X$ of $X$ is defined as $\lambda_X(x) = \begin{cases} 1, & x \in X \\ 0, & x \notin X \end{cases}$
Definition 4 
([4,5,23]). Let $B \subseteq C$ and $\beta \in (0, 1]$. For $X \subseteq U$ and $[x_i]_R \subseteq U$, where $[x_i]_R$ is an equivalence class on $R$, $W_R \lambda_X$ is defined as follows:
$$W_R \lambda_X = \left[ \frac{|[x_1]_R \cap X|}{|[x_1]_R|}, \frac{|[x_2]_R \cap X|}{|[x_2]_R|}, \ldots, \frac{|[x_n]_R \cap X|}{|[x_n]_R|} \right]^T$$
where $T$ denotes the transpose and $\lambda_X = (\lambda_X(x_1), \lambda_X(x_2), \ldots, \lambda_X(x_n))^T$ is the column vector with respect to $x_i \in U$. Additionally,
$$\mu_{C \to D} = \begin{pmatrix} p(D_1 \mid [x_1]_C) & p(D_2 \mid [x_1]_C) & \cdots & p(D_t \mid [x_1]_C) \\ p(D_1 \mid [x_2]_C) & p(D_2 \mid [x_2]_C) & \cdots & p(D_t \mid [x_2]_C) \\ \vdots & \vdots & \ddots & \vdots \\ p(D_1 \mid [x_n]_C) & p(D_2 \mid [x_n]_C) & \cdots & p(D_t \mid [x_n]_C) \end{pmatrix} = \begin{pmatrix} \mu_{C \to D}(x_1) \\ \mu_{C \to D}(x_2) \\ \vdots \\ \mu_{C \to D}(x_n) \end{pmatrix}$$
where $p(D_j \mid [x_i]_C) = \frac{|[x_i]_C \cap D_j|}{|[x_i]_C|}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, t$). Here, $(\mu_{C \to D}(x))_\beta$ denotes the $\beta$-cut of the row vector $\mu_{C \to D}(x)$, in which entries not less than $\beta$ are set to 1 and all other entries to 0.
Definition 5 
([4,5,23]). Given $(U, C \cup D)$ and $\beta \in (0, 1]$, $B$ is the VPR of $C$ if $B$ satisfies the following conditions:
$$(1)\ (\mu_{C \to D}(x))_\beta = (\mu_{B \to D}(x))_\beta\ \text{for}\ x \in U \qquad (2)\ \forall B' \subset B,\ (\mu_{C \to D}(x))_\beta \neq (\mu_{B' \to D}(x))_\beta\ \text{for}\ x \in U$$
VPR's corresponding discernibility matrix, $M_{\mu_{C \to D}} = (m_{\mu_{C \to D}}(x_i, x_j))_{n \times n}$, is given by
$$m_{\mu_{C \to D}}(x_i, x_j) = \begin{cases} \{a \mid a \in C, a(x_i) \neq a(x_j)\}, & (\mu_{C \to D}(x_i))_\beta \neq (\mu_{C \to D}(x_j))_\beta \\ \emptyset, & \text{otherwise} \end{cases}$$
Definition 6 
([4,5,23]). Transform the discernibility function $\Phi$ from its Conjunctive Normal Form (CNF) $\Phi = \bigwedge_{i < j} \{\vee\, m(x_i, x_j)\}$ into the Disjunctive Normal Form (DNF) $\Phi = \bigvee_{u=1}^{\nu} \{\wedge\, B_u\}$ ($B_u \subseteq C$); then $Red(C) = \{B_1, B_2, \ldots, B_\nu\}$ and $Core(C) = \bigcap_{u=1}^{\nu} B_u$.
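Definition 6 amounts to enumerating all minimal attribute sets that hit every non-empty matrix entry. The following brute-force sketch (illustrative only, not the authors' code, and exponential in the number of clauses, so it is usable only for small examples such as the one in this section) carries out the CNF-to-DNF transformation:

```python
from itertools import product

def reducts_from_clauses(clauses):
    """Turn the CNF discernibility function into its DNF (Definition 6):
    every minimal set of attributes hitting all clauses is a reduct."""
    clauses = [frozenset(c) for c in clauses if c]      # drop empty entries
    candidates = set()
    for choice in product(*clauses):                    # pick one attribute per clause
        candidates.add(frozenset(choice))
    # absorption law: keep only the minimal conjunctions
    return [b for b in candidates
            if not any(other < b for other in candidates)]

# CNF of the example in this section: (a1 v a2 v a3) ^ (a1) ^ (a2 v a3)
cnf = [{"a1", "a2", "a3"}, {"a1"}, {"a2", "a3"}]
print(reducts_from_clauses(cnf))   # -> [{'a1','a2'}, {'a1','a3'}] in some order
```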
Lemma 1 
([23]). Given $\mu_{C \to D}(x)$ for $(U, C \cup D)$. If $\beta \in (0.5, 1]$ and the updated decision table is $(U, C \cup D')$, which corresponds to $\mu_{C \to D'}(x)$, then $(\mu_{C \to D}(x))_\beta = (\mu_{C \to D'}(x))_1$ for each $x$.
According to the discernibility matrix, the results of VPR with β = 1 are the results of PRR, and combined with Lemma 1, the following theorem follows.
Theorem 1 
([23]). Given $(U, C \cup D)$, if $\beta \in (0.5, 1]$ and the updated decision table is $(U, C \cup D')$, where $\mathrm{Pos}_C(D')$ is its positive region, then $\Phi_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'}$.
Remark 2. 
$\Phi_{\mu_{C \to D'}}$ denotes the VPR's discernibility function with $\beta = 1$ for $(U, C \cup D')$, and $\Phi_{\mathrm{Pos}_C D'}$ denotes the PRR's discernibility function for $(U, C \cup D')$. Although the discernibility matrices satisfy $M_{\mu_{C \to D'}} \neq M_{\mathrm{Pos}_C D'}$, the discernibility functions satisfy $\Phi_{\mu_{C \to D'}} = \Phi_{\mathrm{Pos}_C D'}$ by the proof in [22].
Given $\beta \in (0.5, 1]$, for $\frac{|[x_i]_C \cap D_i|}{|[x_i]_C|} \geq \beta$ in $(U, C \cup D)$, the new decision table $(U, C \cup D')$ is obtained using the absolute majority voting strategy in $(U, C \cup D)$. The specific modification is as follows: if $P(D_j \mid [x_i]_C) \geq \beta$ and $f(y, d_k) \neq f(x, d_j)$ for $y \in R_C(x)$, then let $f(y, d_k) = f(x, d_j)$. The quotient set by $D'$ in the updated decision table is $U/D' = \{D'_1, D'_2, \ldots, D'_s\}$. It follows from Theorem 1 that VPR with $\beta$ in $(U, C \cup D)$ has the same discernibility function and, hence, the same reduction results as PRR in $(U, C \cup D')$. Table 1 illustrates the relationship between VPR and PRR.
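A minimal sketch of the absolute majority voting strategy described above (assuming a single decision attribute; the function name and data layout are illustrative and reuse the conventions of the earlier sketch):

```python
def majority_vote_modify(C_vals, D_vals, beta):
    """Absolute majority voting (beta in (0.5, 1]): if one decision class covers at
    least a beta fraction of a condition class, relabel the whole class with it."""
    classes = {}
    for i, key in enumerate(C_vals):
        classes.setdefault(key, []).append(i)
    new_D = list(D_vals)
    for block in classes.values():
        counts = {}
        for i in block:
            counts[D_vals[i]] = counts.get(D_vals[i], 0) + 1
        label, freq = max(counts.items(), key=lambda kv: kv[1])
        if freq / len(block) >= beta:
            for i in block:
                new_D[i] = label             # modify the deviating decision values
    return new_D

# Table 1 with beta = 0.55: only x5's decision changes from 0 to 1, giving Table 2
C_vals = [(1,0,1), (1,0,1), (0,1,0), (0,0,1), (0,1,0), (0,1,0), (0,0,1), (0,0,1)]
D_vals = [0, 1, 1, 2, 0, 1, 2, 2]
print(majority_vote_modify(C_vals, D_vals, 0.55))   # [0, 1, 1, 2, 1, 1, 2, 2]
```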
From Table 1, $U/C = \{\{x_1, x_2\}, \{x_3, x_5, x_6\}, \{x_4, x_7, x_8\}\}$ and $U/D = \{\{x_1, x_5\}, \{x_2, x_3, x_6\}, \{x_4, x_7, x_8\}\}$ are obtained. When $\beta = 0.55$, $(\mu_{C \to D})_\beta = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ (one row per equivalence class of $U/C$). Then, the VPR's discernibility matrix is
$$M_{\mu_{C \to D}} = \begin{pmatrix} \emptyset & \emptyset & C & a_1 & C & C & a_1 & a_1 \\ C & C & \emptyset & a_2 a_3 & \emptyset & \emptyset & a_2 a_3 & a_2 a_3 \\ a_1 & a_1 & a_2 a_3 & \emptyset & a_2 a_3 & a_2 a_3 & \emptyset & \emptyset \end{pmatrix}$$
Transforming the discernibility function $\Phi_{\mu_{C \to D}}$ from its CNF $\Phi_{\mu_{C \to D}} = \{a_1 \vee a_2 \vee a_3\} \wedge \{a_1\} \wedge \{a_2 \vee a_3\}$ into the DNF $\Phi_{\mu_{C \to D}} = \{a_1 \wedge a_2\} \vee \{a_1 \wedge a_3\}$, the reduction results are $\{a_1, a_2\}$ and $\{a_1, a_3\}$.
When $\beta = 0.55$, Table 1 is modified to obtain the modified decision table (Table 2), where the bold entry represents the modified decision value.
Because $\frac{|[x_5]_C \cap D_2|}{|[x_5]_C|} \geq 0.55$, let $f(x_5, d) = 1$. According to Table 2, $\mathrm{Pos}_C(D') = \{x_3, x_4, x_5, x_6, x_7, x_8\}$, and because $M_{\mathrm{Pos}_C D'}(x_3, :) = M_{\mathrm{Pos}_C D'}(x_5, :) = M_{\mathrm{Pos}_C D'}(x_6, :)$, the PRR's discernibility matrix can be compressed as
$$M_{\mathrm{Pos}_C D'} = \begin{pmatrix} C & C & \emptyset & a_2 a_3 & \emptyset & \emptyset & a_2 a_3 & a_2 a_3 \\ a_1 & a_1 & a_2 a_3 & \emptyset & a_2 a_3 & a_2 a_3 & \emptyset & \emptyset \end{pmatrix}$$
Transforming the discernibility function $\Phi_{\mathrm{Pos}_C D'}$ from its CNF $\Phi_{\mathrm{Pos}_C D'} = \{a_1 \vee a_2 \vee a_3\} \wedge \{a_2 \vee a_3\} \wedge \{a_1\}$ into the DNF $\Phi_{\mathrm{Pos}_C D'} = \{a_1 \wedge a_2\} \vee \{a_1 \wedge a_3\}$, the reduction results are $\{a_1, a_2\}$ and $\{a_1, a_3\}$.
It is worth repeating that, although the discernibility matrix of VPR in $(U, C \cup D)$ and the discernibility matrix of PRR in $(U, C \cup D')$ are different, the reduction results obtained by both are the same. The purpose of this paper is to study the process of obtaining reduction results from the perspective of discernibility matrix construction under dynamic changes of the universe.

3. Incremental Mechanism of Dynamic Data Reduction

When incremental learning encounters new samples, it does not require retraining the original model and enables dynamic updates, which is crucial for adapting to changing environments in a timely manner. Samples are frequently perturbed in many applications, such as stock prices in financial markets and traffic flow monitoring systems. In this section, the dynamic reduction algorithm for increasing samples is proposed. The four scenarios in which new samples are added are listed in Table 3, and the corresponding frameworks for the increasing sample scenarios are shown in Figure 1.
The set $Y = \{y_1, y_2, \ldots, y_p\}$ represents the new samples added to $(U, C \cup D)$. For the value of $\frac{|[x_i]_C \cap D_i|}{|[x_i]_C|}$, and adopting the absolute majority voting strategy, the modified decision table is denoted by $(U^+, C \cup D'')$, where $U^+ = U \cup Y$.
Definition 7. 
Given $(U, C \cup D')$, its PRR's discernibility matrix is $M_{\mathrm{Pos}_C D'}$. The modified table is $(U^+, C \cup D'')$ when the new samples $Y = \{y_1, y_2, \ldots, y_p\}$ are added, and the corresponding PRR's discernibility matrix is denoted by $M^{Add}_{\mathrm{Pos}_C D''}$.
The corresponding discernibility function for $M_{\mathrm{Pos}_C D'}$ is $\Phi_{\mathrm{Pos}_C D'}$, and the corresponding discernibility function for $M^{Add}_{\mathrm{Pos}_C D''}$ is $\Phi^{Add}_{\mathrm{Pos}_C D''}$.
Definition 8. 
Given $(U, C \cup D)$, its VPR's discernibility matrix is $M_{\mu_{C \to D}}$, and the corresponding VPR's discernibility matrix for $(U^+, C \cup D)$ is denoted by $M^{Add}_{\mu_{C \to D}}$ when the new samples $Y = \{y_1, y_2, \ldots, y_p\}$ are added.
The corresponding discernibility functions for $M_{\mu_{C \to D}}$ and $M^{Add}_{\mu_{C \to D}}$ are $\Phi_{\mu_{C \to D}}$ and $\Phi^{Add}_{\mu_{C \to D}}$, respectively.
Remark 3. 
According to Definitions 7–8 and Theorem 1, the discernibility function $\Phi^{Add}_{\mu_{C \to D}}$ in $(U^+, C \cup D)$ and the discernibility function $\Phi^{Add}_{\mathrm{Pos}_C D''}$ in $(U^+, C \cup D'')$ are equivalent.
The positive region for $(U, C \cup D')$ is denoted by $\mathrm{Pos}_C(D')$ and the corresponding positive region for $(U^+, C \cup D'')$ is denoted by $\mathrm{Pos}_C(D'')$. Upon the introduction of new samples, each original sample $x_i$ falls into one of the following four cases (Figure 1):
$$\begin{aligned} Case\ 1&: x_i \in \mathrm{Pos}_C(D'),\ x_i \in \mathrm{Pos}_C(D'') & Case\ 2&: x_i \in \mathrm{Pos}_C(D'),\ x_i \notin \mathrm{Pos}_C(D'') \\ Case\ 3&: x_i \notin \mathrm{Pos}_C(D'),\ x_i \in \mathrm{Pos}_C(D'') & Case\ 4&: x_i \notin \mathrm{Pos}_C(D'),\ x_i \notin \mathrm{Pos}_C(D'') \end{aligned}$$
The four cases under Scenario 1 (Table 3) are discussed as follows. Given $(U, C \cup D)$, the modified decision table is $(U^+, C \cup D'')$, and the first case is discussed.
Theorem 2. 
Let $y$ be a new sample. If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$ for $y \in [x_i]_C$, $y \in R_D(x_i)$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$.
Proof. 
$m_{\mathrm{Pos}_C D'}(x, x_k) = m^{Add}_{\mathrm{Pos}_C D''}(x, x_k)$, where $x \in \mathrm{Pos}_C(D')$ and $x_k \in U$. For $x_i \in \mathrm{Pos}_C(D')$, there exist $m_{\mathrm{Pos}_C D'}(x_i, x_j) \in \Phi_{\mathrm{Pos}_C D'}$. For matrix row changes: for each $x_j \in U$, if $x_i \in \mathrm{Pos}_C(D'')$ and $y \in [x_i]_C$, then $m_{\mathrm{Pos}_C D'}(x_i, x_j) = m^{Add}_{\mathrm{Pos}_C D''}(y, x_j)$. For matrix column changes: if $x_j \in \mathrm{Pos}_C(D')$, then $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y)$, and if $x_j \notin \mathrm{Pos}_C(D')$, then $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y) = \emptyset$. Additionally, since $y \in [x_i]_C$ and $y \in R_D(x_i)$ for $x_i \in \mathrm{Pos}_C(D'')$, $m^{Add}_{\mathrm{Pos}_C D''}(x_i, y) = \emptyset$. Therefore, $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$. □
The following corollary follows from Theorems 1 and 2.
Corollary 1. 
Let $y$ be a new sample. If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$ for $y \in [x_i]_C$, $y \in R_D(x_i)$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$.
As a continuation of Table 1, given a table $(U^+, C \cup D)$ that contains the new samples $Y = \{y_1, y_2\}$ with $y_1 \in [x_3]_C$, $y_1 \in R_D(x_3)$ and $y_2 \in [x_4]_C$, $y_2 \in R_D(x_4)$, it is easy to observe that $x_3, x_4 \in \mathrm{Pos}_C(D')$ for Table 2 when the samples are added. The modified samples are shown in Table 4. Then, $x_3, x_4 \in \mathrm{Pos}_C(D'')$.
It follows that
$$M^{Add}_{\mathrm{Pos}_C D''} = \begin{pmatrix} C & C & \emptyset & a_2 a_3 & \emptyset & \emptyset & a_2 a_3 & a_2 a_3 & \emptyset & a_2 a_3 \\ a_1 & a_1 & a_2 a_3 & \emptyset & a_2 a_3 & a_2 a_3 & \emptyset & \emptyset & a_2 a_3 & \emptyset \end{pmatrix}$$
Clearly, $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$, and then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$ from Theorem 1.
According to Definition 2, upon the addition of the sample set Y = { y 1 , y 2 } , the static algorithm necessitates a recalculation of U / C , U / D , M P o s C D , and Φ P o s C D . The preceding analysis has demonstrated that the reduction outcomes following the incorporation of y 1 and y 2 are consistent. The dynamic reduction algorithm, in contrast, is capable of directly calculating and reflecting the impact of new samples on the existing original sample. In order to effectively address the issue of redundant calculations inherent in static algorithms, this study will further investigate the effects of four distinct scenarios involving the increasing of samples on the reduction process.
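A small sketch of how a new sample can be assigned to one of the four scenarios of Table 3 (a simplification, not the paper's implementation; the Table 2 values are reused, and "same decision" is checked against the whole condition class):

```python
def add_scenario(C_vals, D_vals, y_cond, y_dec):
    """Classify a new sample into the four scenarios of Table 3 (illustrative)."""
    same_cond = [i for i, c in enumerate(C_vals) if c == y_cond]
    if same_cond:
        same_dec = all(D_vals[i] == y_dec for i in same_cond)
        return 1 if same_dec else 2
    return 3 if y_dec in D_vals else 4

# Table 2 values
C_vals = [(1,0,1), (1,0,1), (0,1,0), (0,0,1), (0,1,0), (0,1,0), (0,0,1), (0,0,1)]
D_vals = [0, 1, 1, 2, 1, 1, 2, 2]
print(add_scenario(C_vals, D_vals, (0,1,0), 1))   # y1 of Table 4 -> Scenario 1
print(add_scenario(C_vals, D_vals, (1,1,1), 3))   # unseen condition and decision -> 4
```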
Definition 9. 
$$\delta^{row}(x_i) = \begin{cases} \{\, \{a \in C \mid (x_i, x_j) \notin R_a\} : x_j \in U \,\}, & x_i \in \mathrm{Pos}_C(D'),\ x_i \notin \mathrm{Pos}_C(D'') \\ \{\, \{a \in C \mid (x_i, x_j) \notin R_a\} : x_j \in U^+ \,\}, & x_i \notin \mathrm{Pos}_C(D'),\ x_i \in \mathrm{Pos}_C(D'') \end{cases}$$
and the whole row of elements corresponding to $x_i$ in the positive-region discernibility matrix is denoted by $\delta^{row}_{\mathrm{Pos}_C D}(x_i)$.
The whole row and column elements corresponding to the newly added sample $y$ are defined as $\delta^{row}(y) = \{\, \{a \in C \mid (y, x_j) \notin R_a\} : x_j \in U \,\}$ and $\delta^{col}(y) = \{\, \{a \in C \mid (x_j, y) \notin R_a\} : x_j \in \mathrm{Pos}_C(D'') \,\}$, and the corresponding whole row and column elements of the positive-region discernibility matrix are denoted by $\delta^{row}_{\mathrm{Pos}_C D}(y)$ and $\delta^{col}_{\mathrm{Pos}_C D}(y)$.
Case 2 in Scenario 1 (Table 3) is discussed next.
Theorem 3. 
Let $y$ be a new sample. If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$ for $y \in [x_i]_C$, $y \in R_D(x_i)$, then $\Phi_{\mathrm{Pos}_C D'} = \Phi^{Add}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$.
Proof. 
For $m_{\mathrm{Pos}_C D'}(x_i, x_j) \in \Phi_{\mathrm{Pos}_C D'}$, because $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, $m_{\mathrm{Pos}_C D'}(x, x_k) = m^{Add}_{\mathrm{Pos}_C D''}(x, x_k)$, where $x \in \mathrm{Pos}_C(D')$ and $x \notin [x_i]_C$. For matrix row changes: for each $x_j \in U$, if $x_i \notin \mathrm{Pos}_C(D'')$, then $m^{Add}_{\mathrm{Pos}_C D''}(x_i, x_j) = \emptyset$, where $m_{\mathrm{Pos}_C D'}(x_i, x_j) = \{a \mid (x_i, x_j) \notin R_a \text{ and } (x_i, x_j) \notin R_{D'}\}$. For matrix column changes: for each $x_j \in \mathrm{Pos}_C(D'')$ with $y \in [x_i]_C$, $y \in R_D(x_i)$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y)$, and for each $x_j \notin \mathrm{Pos}_C(D'')$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y) = \emptyset$. Because $x_i \notin \mathrm{Pos}_C(D'')$, $m^{Add}_{\mathrm{Pos}_C D''}(x_i, y) = \emptyset$. Therefore, $\Phi_{\mathrm{Pos}_C D'} = \Phi^{Add}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$. □
Corollary 2. 
Let $y$ be a new sample. For $x_i \in \mathrm{Pos}_C(D')$, $y \in [x_i]_C$ and $y \in R_D(x_i)$, if $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mu_{C \to D}} = \Phi^{Add}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$.
The case $x_i \in \mathrm{Pos}_C(D')$ was discussed in the above two cases, and now the two cases with $x_i \notin \mathrm{Pos}_C(D')$ are discussed. Case 3 is discussed as follows.
Theorem 4. 
Let $y$ be a new sample. If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$ for $y \in [x_i]_C$ and $y \in R_D(x_i)$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$.
Proof. 
$m_{\mathrm{Pos}_C D'}(x, x_k) = m^{Add}_{\mathrm{Pos}_C D''}(x, x_k)$, where $x \in \mathrm{Pos}_C(D')$ and $x_k \in U$. For $x_i \in \mathrm{Pos}_C(D'')$, there exist $m^{Add}_{\mathrm{Pos}_C D''}(x_i, x_j) \in \Phi^{Add}_{\mathrm{Pos}_C D''}$. For matrix row changes: for each $x_j \in U$, if $x_i \notin \mathrm{Pos}_C(D')$, then $m_{\mathrm{Pos}_C D'}(x_i, x_j) = \emptyset$, while $m^{Add}_{\mathrm{Pos}_C D''}(x_i, x_j) = \{a \mid (x_i, x_j) \notin R_a \text{ and } (x_i, x_j) \notin R_{D''}\}$. For matrix column changes: for each $x_j \in \mathrm{Pos}_C(D')$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y)$, and for each $x_j \notin \mathrm{Pos}_C(D')$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y) = \emptyset$. Additionally, since $y \in [x_i]_C$ and $y \in R_D(x_i)$ for $x_i \in \mathrm{Pos}_C(D'')$, $m^{Add}_{\mathrm{Pos}_C D''}(x_i, y) = \emptyset$. Therefore, $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$. □
Corollary 3. 
Let $y$ be a new sample, for $x_i \notin \mathrm{Pos}_C(D')$, $y \in [x_i]_C$ and $y \in R_D(x_i)$. If $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$.
Corollary 3 is obtained according to Theorem 4. The final case within Scenario 1 (Table 3) is now being discussed.
Theorem 5. 
Let $y$ be a new sample. For $x_i \notin \mathrm{Pos}_C(D')$, $y \in [x_i]_C$ and $y \in R_D(x_i)$, if $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$.
Proof. 
For $x_i \notin \mathrm{Pos}_C(D')$, no entry $m_{\mathrm{Pos}_C D'}(x_i, x_j)$ belongs to $\Phi_{\mathrm{Pos}_C D'}$. $m_{\mathrm{Pos}_C D'}(x, x_k) = m^{Add}_{\mathrm{Pos}_C D''}(x, x_k)$, where $x \in \mathrm{Pos}_C(D')$, $x_k \in U$. For matrix column changes: for each $x_j \in \mathrm{Pos}_C(D')$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y)$, and for each $x_j \notin \mathrm{Pos}_C(D')$, $m_{\mathrm{Pos}_C D'}(x_j, x_i) = m^{Add}_{\mathrm{Pos}_C D''}(x_j, y) = \emptyset$. Therefore, $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$. □
Corollary 4. 
Let $y$ be a new sample. For $x_i \notin \mathrm{Pos}_C(D')$, $y \in [x_i]_C$ and $y \in R_D(x_i)$, if $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$.
Theorems 2–5 have detailed the four cases encompassed within Scenario 1 (Table 3); the ensuing discussion is directed towards the cases presented in Scenario 2 (Table 3). Because the decision values of $x$ and $y_i$ differ, some corresponding changes occur. $(U, C \cup D)$ is given, and the modified table is $(U^+, C \cup D'')$ when the new samples are added.
Theorem 6. 
Let $y$ be a new sample. If $y \in [x_i]_C$ and $y \notin R_D(x_i)$, the following conclusions hold:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mathrm{Pos}_C D'} = \Phi^{Add}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$.
Proof. 
The proofs for Theorem 6 are similar to the processes for Theorems 2–5. □
Remark 4. 
From conclusion (1) of Theorem 6, if $x_i \in \mathrm{Pos}_C(D')$, then $R_C(x_i) \subseteq R_{D'}(x_i)$. For $y \in U^+ - U$, because $y \in [x_i]_C$ and $y \notin R_D(x_i)$, $R_C(x_i) \nsubseteq R_D(x_i)$ in the raw table. For $\frac{|[x_i]_C \cap D_i|}{|[x_i]_C|} \geq \beta$, let $f(y, d) = f(x_i, d)$ by the absolute majority voting strategy. Then, $R_C(x_i) \subseteq R_{D''}(x_i)$. Therefore, $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$.
Corollary 5. 
Let $y$ be a new sample. If $y \in [x_i]_C$ and $y \notin R_D(x_i)$, the following conclusions hold:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mu_{C \to D}} = \Phi^{Add}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$.
In Scenario 3 (Table 3), the new samples have condition values different from every $x_i$ in $(U, C \cup D)$; hence, the new samples belong to the positive region in $(U^+, C \cup D'')$. Scenarios 3 and 4 exhibit more distinctive cases; the discussions pertaining to these cases are detailed in Theorems 7 and 8.
Theorem 7. 
Let $y$ be a new sample with $y \notin [x_i]_C$ for each $x_i \in U$. If there exists $x_j \in U$ such that $y \in R_D(x_j)$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(y) \cup \delta^{col}_{\mathrm{Pos}_C D}(y)$.
Proof. 
There exist $m_{\mathrm{Pos}_C D'}(x_i, x_j) \in \Phi_{\mathrm{Pos}_C D'}$ for $x_i \in \mathrm{Pos}_C(D')$, and $m_{\mathrm{Pos}_C D'}(x, x_k) = m^{Add}_{\mathrm{Pos}_C D''}(x, x_k)$ for $x \in \mathrm{Pos}_C(D')$, $x_k \in U$. For each $x_j \in U$, $y \notin R_C(x_j)$ and $[y]_C = \{y\}$; clearly $R_C(y) \subseteq R_{D''}(y)$, i.e., $y \in \mathrm{Pos}_C(D'')$. For matrix row changes, because $[y]_C = \{y\}$, $m^{Add}_{\mathrm{Pos}_C D''}(y, x_j) \in \delta^{row}_{\mathrm{Pos}_C D}(y)$ for each $x_j \in U$. For matrix column changes, $m^{Add}_{\mathrm{Pos}_C D''}(x_j, y) \in \delta^{col}_{\mathrm{Pos}_C D}(y)$ for each $x_j \in \mathrm{Pos}_C(D')$. Because $y \in \mathrm{Pos}_C(D'')$, $m^{Add}_{\mathrm{Pos}_C D''}(y, y) = \emptyset$. Therefore, $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(y) \cup \delta^{col}_{\mathrm{Pos}_C D}(y)$. □
Corollary 6. 
Let $y$ be a new sample. For each $x_i \in U$, if $y \notin [x_i]_C$ and there exists $x_j \in U$ with $y \in R_D(x_j)$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(y) \cup \delta^{col}_{\mathrm{Pos}_C D}(y)$.
Theorem 8. 
Let $y$ be a new sample. If $y \notin [x_i]_C$ and $y \notin R_D(x_i)$ for each $x_i \in U$, then $\Phi^{Add}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(y) \cup \delta^{col}_{\mathrm{Pos}_C D}(y)$.
Proof. 
The proof is similar to that of Theorem 7. □
Corollary 7. 
Let $y$ be a new sample. If $y \notin [x_i]_C$ for each $x_i \in U$ and $y \notin R_D(x_i)$, then $\Phi^{Add}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(y) \cup \delta^{col}_{\mathrm{Pos}_C D}(y)$.
According to the discussion of the four scenarios, a dynamic reduction algorithm, DVPRA-SI, is proposed in this paper for the case where samples are increased. Let $M_{SD} = \{M_i^D\}_{n \times 1} = (M_1^D, M_2^D, \ldots, M_n^D)^T$, where $M_i^D$ is the label of each row in the discernibility matrix.
The time complexity of the algorithm is $O(|Y| \times |U|^2)$, where $|Y|$ is the cardinality of the newly added sample set $Y$. However, this worst case rarely occurs, and the actual running time is generally far less than $O(|Y| \times |U|^2)$. Theoretically, Algorithm 1 (DVPRA-SI) is feasible.
Algorithm 1: Dynamic VPR algorithm for sample increasing (DVPRA-SI)
Input: $(U, C \cup D)$, newly added sample set $Y = \{y_1, y_2, \ldots, y_l\}$
Output: Reduction results // discernibility function $\Phi$ transformed from its CNF into the DNF
1: Begin
2: Compute $U/C = \{C_1, C_2, \ldots, C_q\}$, $U/D = \{D_1, D_2, \ldots, D_t\}$, and $\mathrm{Pos}_C(D)$
3: Modify the decision table
4: Compute the updated positive region $\mathrm{Pos}_C(D'')$
5: for  i = 1 to l do
6:   Compare $y_i$ with label $M_i^D$.
7:      If  $[y_i]_C = [M_i^D]_C$  then
8:         assign ∅ to the columns of $M_{\mathrm{Pos}_C D}(M_i^D, y_i)$ where $y_i$ is located
9:      else
10:         according to $a(x_i) \neq a(x_j)$, the elements are added to $M_{\mathrm{Pos}_C D}(M_i^D, y_i)$
11:      for  j = 1 to $|M_{SD}|$  do
12:        If  $M_i^D \in \mathrm{Pos}_C(D'')$  then
13:          the rows of the discernibility matrix are not modified
14:        If  $M_i^D \notin \mathrm{Pos}_C(D'')$  then
15:          delete $M_i^D$
16:      end for
17:      When  $y_i \in \mathrm{Pos}_C(D'')$
18:        If  $[y_i]_C = [M_i^D]_C$ and $M_i^D \in M_{SD}$  then
19:          leave the row unmodified
20:        If  $[y_i]_C \neq [M_i^D]_C$ and $M_i^D \notin M_{SD}$  then
21:          according to $a(x_i) \neq a(x_j)$, the elements are added to the discernibility matrix
22:        If  $y_i \notin \mathrm{Pos}_C(D'')$  then
23:           $M_i^D$ and $M_{\mathrm{Pos}_C D}$ are not modified
24: end for
25: Compute $\Phi^{Add}_{\mu_{C \to D}}$
26: end
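A simplified illustration of how the four cases map onto row operations on the compressed matrix (reusing `majority_vote_modify` and `positive_region` from the Section 2 sketches; the batch of new samples below is hypothetical, and column updates are omitted, so this is not the full DVPRA-SI procedure):

```python
def case_operations(old_pos_reps, new_pos_reps):
    """Map each condition class to the matrix operation implied by Theorems 2-5."""
    ops = {}
    for rep in old_pos_reps | new_pos_reps:
        before, after = rep in old_pos_reps, rep in new_pos_reps
        if before and after:
            ops[rep] = "Case 1: keep the row (Theorem 2)"
        elif before and not after:
            ops[rep] = "Case 2: delete the row (Theorem 3)"
        elif not before and after:
            ops[rep] = "Case 3: add a row (Theorem 4)"
    return ops                   # Case 4 classes never had a row, so nothing to do

# Table 1 values; three hypothetical new samples break the class [(0,1,0)]
C_vals = [(1,0,1), (1,0,1), (0,1,0), (0,0,1), (0,1,0), (0,1,0), (0,0,1), (0,0,1)]
D_vals = [0, 1, 1, 2, 0, 1, 2, 2]
old_pos = positive_region(C_vals, majority_vote_modify(C_vals, D_vals, 0.55))
C_plus, D_plus = C_vals + [(0,1,0)] * 3, D_vals + [2, 2, 2]
new_pos = positive_region(C_plus, majority_vote_modify(C_plus, D_plus, 0.55))
print(case_operations({C_vals[i] for i in old_pos},
                      {C_plus[i] for i in new_pos}))
# (0,1,0) -> Case 2 (delete its row); (0,0,1) -> Case 1 (keep its row)
```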

4. Deleting Mechanism of Dynamic Data Reduction

In this section, we focus on the cases of deleting samples dynamically. For the case of deleting samples, a dynamic VPR algorithm for sample deletion is proposed. The three frameworks for the deleting sample scenarios are shown in Figure 2. Deleting samples leads to three scenarios (Table 5), in which $U^- = U - Z$ denotes the universe obtained by deleting the samples $Z = \{z_1, z_2, \ldots, z_q\}$ from $U$. For Scenario 3 (Table 5), no sample $x \in U^-$ exists that satisfies $z \in [x]_C$; it is easy to observe that the whole equivalence class $[x]_C$ has been deleted from the discernibility matrix; hence, the decision value of the deleted sample will not affect the discernibility matrix for $(U^-, C \cup D)$. Therefore, this section is confined to a discussion of the cases pertinent to Scenarios 1–2.
In $(U, C \cup D)$, adopting the absolute majority voting strategy for $\frac{|[x_i]_C \cap D_i|}{|[x_i]_C|}$, the modified decision table is denoted by $(U^-, C \cup D'')$, where $U^- = U - Z$.
Definition 10. 
Given $(U, C \cup D)$, the modified decision table is $(U^-, C \cup D'')$ when the sample set $Z$ is deleted. The positive region and variable precision discernibility matrices are denoted by $M^{Del}_{\mathrm{Pos}_C D''}$ and $M^{Del}_{\mu_{C \to D}}$, respectively.
The corresponding discernibility functions for $M^{Del}_{\mathrm{Pos}_C D''}$ and $M^{Del}_{\mu_{C \to D}}$ are $\Phi^{Del}_{\mathrm{Pos}_C D''}$ and $\Phi^{Del}_{\mu_{C \to D}}$, respectively.
To illustrate Scenario 3 (Table 5): when the samples $Z = \{x_3, x_5, x_6\}$ are deleted, no $x \in U^-$ satisfies $z \in [x]_C$. Calculating $U^-/C = \{\{x_1, x_2\}, \{x_4, x_7, x_8\}\}$ and $U^-/D = \{\{x_1\}, \{x_2\}, \{x_4, x_7, x_8\}\}$ results in $\mathrm{Pos}_C(D) = \{x_4, x_7, x_8\}$. As indicated in Remark 1, the number of rows in the discernibility matrix is $|\mathrm{Pos}_C(D) / [x]_C| = 1$, i.e., $M_{\mathrm{Pos}_C D}$ contains only entries equal to $\{a_1\}$; thus $\Phi_{\mathrm{Pos}_C D} = \{a_1\}$. The decision value of the deleted sample, whether $z \in [x]_D$ or $z \notin [x]_D$, results in the removal of the corresponding row from the discernibility matrix. Consequently, the decision attribute values for Scenario 3 are filled with "-" in Table 5, and the processing for Scenario 3 is executed in line 18 of the DVPRA-SD algorithm.
When samples are deleted, the positive region may either increase or decrease. Similar to the analysis of the four cases of the discernibility matrix for adding samples, it is necessary to discuss the four cases of the discernibility matrix when deleting samples. Four cases under Scenario 1 (Table 5) are discussed as follows.
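Before the per-case theorems, the column part of the deletion update is mechanical: the deleted sample's column simply disappears from every compressed row (lines 5–9 of Algorithm 2 below). A short sketch, assuming rows are stored as lists of entries indexed by object position as in the earlier sketches:

```python
def dvpra_sd_columns(rows, col_index):
    """Column step of the deletion update: when sample z is removed, drop its
    column from every compressed row of the discernibility matrix."""
    return {rep: row[:col_index] + row[col_index + 1:] for rep, row in rows.items()}
```

The row operations (keep, delete, or add a row) then follow the same case analysis as `case_operations` above, applied to the positive regions before and after deletion.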
Theorem 9. 
Given $(U, C \cup D)$, let $z$ be a deleted sample, $z \in [x_i]_C$ and $z \in R_D(x_i)$. Then, the following holds:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mathrm{Pos}_C D'} = \Phi^{Del}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$.
Proof. 
The proofs are similar to those of Theorems 2–5, respectively. □
According to Scenario 1 (Table 5), the following corollary can be obtained from Theorems 1 and 9.
Corollary 8. 
Let $z$ be a deleted sample, $z \in [x_i]_C$ and $z \in R_D(x_i)$. Then, the following corollary holds:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mu_{C \to D}} = \Phi^{Del}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$.
For $\frac{|[x_i]_C \cap D_i|}{|[x_i]_C|} \geq \beta$ in $(U, C \cup D)$, the decision values of some samples are modified to reach $(U, C \cup D')$, in which $z \in [x_i]_C$. If sample $z$ is deleted from $(U, C \cup D)$, then $[x_i]_C$ may no longer meet the conditions for modifying decision values; thus, $[x_i]_C$ contains different decision values, which corresponds to Scenario 2 (Table 5). If the deleted sample belongs to Scenario 2, the resulting changes are detailed in the following theorems.
Theorem 10. 
Let $z$ be a deleted sample. If $z \in [x_i]_C$ and $z \notin R_D(x_i)$, then the following holds:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mathrm{Pos}_C D'} = \Phi^{Del}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mathrm{Pos}_C D''} = \Phi_{\mathrm{Pos}_C D'}$.
Proof. 
The proofs are similar to those of Theorems 2–5, respectively. □
Remark 5. 
If $x_i \in \mathrm{Pos}_C(D')$, then $P(D_j \mid [x_i]_C) = 1 \geq \beta$ if and only if $R_C(x_i) \subseteq R_{D'}(x_i)$. After the sample is deleted, because $z \in [x_i]_C$, $z \notin R_D(x_i)$, and $x_i \in \mathrm{Pos}_C(D'')$, there exists $P(D_j \mid [x_i]_C) = 1 \geq \beta$; hence, $R_C(x_i) \subseteq R_{D''}(x_i)$.
Corollary 9. 
Let $z$ be a deleted sample. If $z \in [x_i]_C$ and $z \notin R_D(x_i)$, then the following corollary holds:
(1) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$;
(2) If $x_i \in \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi_{\mu_{C \to D}} = \Phi^{Del}_{\mathrm{Pos}_C D''} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(3) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \in \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mathrm{Pos}_C D'} \cup \delta^{row}_{\mathrm{Pos}_C D}(x_i)$;
(4) If $x_i \notin \mathrm{Pos}_C(D')$ and $x_i \notin \mathrm{Pos}_C(D'')$, then $\Phi^{Del}_{\mu_{C \to D}} = \Phi_{\mu_{C \to D}}$.
The theorems above explain the two scenarios for deleting samples, from which the dynamic reduction algorithm for deleted samples is proposed as follows.
Sample disturbance, such as samples increasing or decreasing, occurs in many applications. Compared with the four scenarios considered for samples increasing in Section 3, sample deletion considers three scenarios, for which Algorithm 2 (DVPRA-SD) is proposed in this section. Both Algorithm 2 DVPRA-SD and Algorithm 1 DVPRA-SI consider the issue of equivalence class changes caused by the decrease (increase) of samples, and the corresponding reduction algorithms based on the discernibility matrices are proposed.
Algorithm 2: Dynamic VPR algorithm for sample deletion (DVPRA-SD)
Input: $(U, C \cup D)$, deleted sample set $Z = \{z_1, z_2, \ldots, z_l\}$
Output: Reduction results // discernibility function $\Phi$ transformed from its CNF into the DNF
1: Begin
2: Compute $U/C = \{C_1, C_2, \ldots, C_q\}$, $U/D = \{D_1, D_2, \ldots, D_t\}$, and $\mathrm{Pos}_C(D)$
3: Modify the decision table
4: Compute the updated positive region $\mathrm{Pos}_C(D'')$
5: for  i = 1 to l do
6:   for  j = 1 to $|M_{SD}|$  do
7:      Delete the column $M_{\mathrm{Pos}_C D}(M_j^D, z_i)$
8:    end for
9: end for
10: for  i = 1 to $|U^-|$  do
11:      When  $x_i \in \mathrm{Pos}_C(D'')$
12:        If  $[x_i]_C = [M_i^D]_C$ and $M_i^D \in M_{SD}$  then
13:           the discernibility matrix is not modified
14:        if  $M_i^D \notin M_{SD}$  then
15:          according to $a(x_i) \neq a(x_j)$, add the elements to $M_{SD}$ as a new row
16:      When  $x_i \notin \mathrm{Pos}_C(D'')$
17:        if  $[x_i]_C = [M_i^D]_C$ and $M_i^D \in M_{SD}$  then
18:           delete the row where label $M_i^D$ is located
19:        if  $M_i^D \notin M_{SD}$  then
20:           because the discernibility matrix conditions are not met, the discernibility matrix is not modified
21: end for
22: Compute $\Phi^{Del}_{\mu_{C \to D}}$
23: end
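As a sanity check on the deletion mechanism, a static recomputation with the helpers sketched in Section 2 reproduces the Scenario 3 illustration (deleting $Z = \{x_3, x_5, x_6\}$ from Table 2 leaves the single reduct $\{a_1\}$); the values below are the Table 2 values and the helper names are the illustrative ones introduced earlier:

```python
# Table 2 values; prr_matrix and reducts_from_clauses come from the Section 2 sketches
C_vals = [(1,0,1), (1,0,1), (0,1,0), (0,0,1), (0,1,0), (0,1,0), (0,0,1), (0,0,1)]
D_vals = [0, 1, 1, 2, 1, 1, 2, 2]
keep = [0, 1, 3, 6, 7]                         # indices remaining after deleting Z
C_del = [C_vals[i] for i in keep]
D_del = [D_vals[i] for i in keep]
rows = prr_matrix(C_del, D_del, ["a1", "a2", "a3"])
cnf = [entry for row in rows.values() for entry in row if entry]
print(reducts_from_clauses(cnf))               # -> [frozenset({'a1'})]
```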

5. Experimental Analysis

Experiments were conducted in which eight datasets (Table 6) were selected to verify the effectiveness of the proposed algorithms, DVPRA-SI and DVPRA-SD, which were compared with existing algorithms [24,36], where $|U|$ represents the number of objects, $|C|$ represents the number of condition attributes, and $|U/D|$ represents the number of classes. For datasets with missing values, each missing entry was filled with the average value of its column.
The experiments were run on a machine with an Intel Core i5-9300H processor, 24 GB of RAM, and the Windows 11 operating system. The algorithms were implemented in Python. The eight datasets from the UCI Repository were pre-processed by transforming all numerical data into integer format.
Running time and classification accuracy were the main evaluation criteria in the experiments. Algorithm DVPRA-SI was compared with two incremental reduction algorithms: IFS-SSFA [24] and IFSA [36]. For classification accuracy, the reduction results of DVPRA-SI with different precisions were compared with those of the original datasets on three classifiers: support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF).
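One way to reproduce such an accuracy comparison with scikit-learn is sketched below; the split proportion, random seed, and default classifier settings are assumptions for illustration, not details taken from the paper:

```python
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_reduct(X, y, reduct_columns, test_size):
    """Compare classification accuracy on the full attribute set and on a reduct."""
    results = {}
    for name, clf in [("SVM", SVC()), ("KNN", KNeighborsClassifier()),
                      ("RF", RandomForestClassifier(random_state=0))]:
        for tag, cols in [("original", slice(None)), ("reduct", reduct_columns)]:
            Xtr, Xte, ytr, yte = train_test_split(
                X[:, cols], y, test_size=test_size, random_state=0)
            clf.fit(Xtr, ytr)
            results[(name, tag)] = accuracy_score(yte, clf.predict(Xte))
    return results
```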
Algorithm DVPRA-SI was compared with Algorithms IFS-SSFA and IFSA in terms of the running time for the datasets in this section. For Algorithms DVPRA-SI and DVPRA-SD, the reduction results were calculated using binary integer programming. For the running time, Algorithms IFS-SSFA and IFSA were compared with Algorithm DVPRA-SI, where the precision was set to 0.65, 0.75, and 0.85, respectively. To ensure a fair comparison, the first 75% of the samples were selected as the raw dataset and the remaining 25% as the newly added samples. The results for the running time are shown in Table 7, where it can be clearly observed that Algorithm DVPRA-SI had an obvious advantage over the other algorithms. As the size of the dataset expanded, the advantage of Algorithm DVPRA-SI became more apparent. Taking the Iris dataset as an example, although the running time of all three algorithms remained relatively short, the advantages of Algorithm DVPRA-SI became particularly prominent on larger datasets. For example, on the Rice dataset, DVPRA-SI obtained the reduction results in 11.84 s, 11.44 s, and 11.43 s when the precision was 0.65, 0.75, and 0.85, respectively.
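The binary integer programming step mentioned above can be viewed as a minimum hitting set over the non-empty discernibility entries. The following sketch is one plausible formulation (not necessarily the authors' exact model), using SciPy's `milp` (SciPy 1.9 or later):

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def minimal_reduct_bip(clauses, attrs):
    """Binary integer program for one minimum-size reduct: minimise the number of
    selected attributes subject to covering every discernibility clause."""
    n = len(attrs)
    A = np.array([[1.0 if a in clause else 0.0 for a in attrs] for clause in clauses])
    res = milp(c=np.ones(n),                               # minimise sum of x_a
               constraints=LinearConstraint(A, lb=1, ub=np.inf),
               integrality=np.ones(n),
               bounds=Bounds(0, 1))
    return {a for a, chosen in zip(attrs, np.round(res.x)) if chosen}

cnf = [{"a1", "a2", "a3"}, {"a1"}, {"a2", "a3"}]
print(minimal_reduct_bip(cnf, ["a1", "a2", "a3"]))         # e.g. {'a1', 'a2'}
```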
For the running time of Algorithm DVPRA-SD (Figure 3), the precision was also set to 0.65, 0.75, and 0.85, respectively. For Algorithm DVPRA-SD, all of the datasets were used as raw datasets, and then 25% of the samples were selected randomly as deleted samples. The running time of Algorithm DVPRA-SD (Figure 3) remained on the same scale as the running time of DVPRA-SI (Table 7). It is noted that, to observe the running time performance more intuitively, six of the eight datasets were selected for comparison in Figure 3 based on the scale of the datasets. Their execution efficiency and stability met the expected standards.
Algorithm DVPRA-SI with different precisions was compared on the classifiers SVM, KNN, and RF in Figure 4, Figure 5 and Figure 6, in which the performance of Algorithm DVPRA-SI varied across the eight datasets. For all datasets in the comparative experiments, the reduction result of Algorithm DVPRA-SI was evaluated on testing sets whose proportions were set to 60%, 70%, 80%, and 90%, respectively. When the precision was 0.65, the eight datasets were used to obtain classification accuracies on the three classifiers; for example, SVM-D.A.-F.F. represents the classification accuracies of D.A., F.F., and those of Algorithm DVPRA-SI on classifier SVM in Figure 4. The result of Algorithm DVPRA-SI exhibited classification accuracy no lower than that of the original datasets on classifier KNN. For example, given precision 0.65, for the proportions 60%, 70%, 80%, and 90%, the classification accuracies obtained by Algorithm DVPRA-SI were 49.53%, 48.54%, 50.31%, and 44.37%, whereas the classification accuracies of the corresponding original W.Q. dataset were 49.53%, 48.13%, 50.31%, and 44.37%, respectively. The classification accuracy of Algorithm DVPRA-SI on the RF classifier was better than that of the corresponding original dataset; for example, on the Wine dataset with precision 0.65, the classification accuracies of the corresponding original dataset were 97.18%, 94.44%, 91.67%, and 100%, whereas the classification accuracies obtained by Algorithm DVPRA-SI were 98.59%, 96.30%, 94.44%, and 100%, respectively.
According to the overall analysis above, the results of the classification accuracies demonstrate Algorithm DVPRA-SI’s robustness and effectiveness on different datasets. Based on the analysis of the above figures and tables, our proposed algorithms have been verified, and have advantages, to some extent.

6. Conclusions

The proposed algorithms can quickly respond to changes in data when dealing with sample set disturbance. When the proposed algorithms encounter scenarios such as data noise and outliers, they can reduce the impact of these factors on the processing results while ensuring the quality of processing, thus improving processing efficiency and robustness. PRR ensures that classification is not affected by sample disturbance while reducing redundant attributes, whereas VPR reduces redundant attributes for different precision requirements while maintaining classification capability. Based on the discernibility matrix construction of PRR and VPR, the process of dynamic VPR was discussed with the aid of PRR, and corresponding algorithms were proposed. In the future, we will continue to explore more dynamic variable-precision simplification problems and expand the application scope of dynamic reduction algorithms.

Author Contributions

Conceptualization, X.L.; methodology, R.D.; software, X.L. and R.D.; validation, X.L., R.D. and Z.C.; formal analysis, Z.C. and J.R.; data curation, J.R.; writing—original draft preparation, X.L. and R.D.; writing—review and editing, X.L., R.D., Z.C. and J.R.; supervision, X.L. and Z.C.; project administration, R.D. and J.R.; funding acquisition, X.L. and J.R. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62072067) and the Xinjiang Tianchi Youth Fund Project and the Xinjiang Social Science Foundation (No. 2024BMZ099).

Data Availability Statement

All data in this article comes from the UCI dataset https://archive.ics.uci.edu/ (accessed on 1 August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356. [Google Scholar] [CrossRef]
  2. Ziarko, W. Variable precision rough set model. J. Comput. Syst. Sci. 1993, 46, 39–59. [Google Scholar] [CrossRef]
  3. Liu, G. Attribute reduction algorithms determined by invariants for decision tables. Cogn. Comput. 2022, 6, 1818–1825. [Google Scholar] [CrossRef]
  4. Liu, G. Matrix approaches for variable precision rough approximations. In International Conference on Rough Sets and Knowledge Technology; Springer: Berlin/Heidelberg, Germany, 2015; pp. 214–221. [Google Scholar]
  5. Liu, G.; Liu, J. A Variable Precision Reduction Type for Information Systems. In International CCF Conference on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2019; pp. 240–247. [Google Scholar]
  6. Yang, Y.-Y.; Chen, D.-G.; Kwong, S. Novel algorithms of attribute reduction for variable precision rough set. In Proceedings of the 2011 International Conference on Machine Learning and Cybernetics, Guilin, China, 10–13 July 2011; Volume 1, pp. 108–112. [Google Scholar]
  7. Feng, T.; Fan, H.-T.; Mi, J.-S. Uncertainty and reduction of variable precision multigranulation fuzzy rough sets based on three-way decisions. Int. J. Approx. Reason. 2017, 85, 36–58. [Google Scholar] [CrossRef]
  8. Ma, Z.; Mi, J.-S.; Lin, Y.; Li, J. Boundary region-based variable precision covering rough set models. Inf. Sci. 2022, 608, 1524–1540. [Google Scholar] [CrossRef]
  9. Zhan, J.; Jiang, H.; Yao, Y. Covering-based variable precision fuzzy rough sets with PROMETHEE-EDAS methods. Inf. Sci. 2020, 538, 314–336. [Google Scholar] [CrossRef]
  10. Yao, Y. Three-way granular computing, rough sets, and formal concept analysis. Int. J. Approx. Reason. 2020, 116, 106–125. [Google Scholar] [CrossRef]
  11. Huang, Z.; Li, J.; Wang, C. Robust feature selection using multigranulation variable-precision distinguishing indicators for fuzzy covering decision systems. IEEE Trans. Syst. Man Cybern. Syst. 2023, 54, 903–914. [Google Scholar] [CrossRef]
  12. Xie, L.; Lin, G.; Li, J.; Lin, Y. A novel fuzzy-rough attribute reduction approach via local information entropy. Fuzzy Sets Syst. 2023, 473, 108733. [Google Scholar] [CrossRef]
  13. Janusz, A.; Ślęzak, D. Rough set methods for attribute clustering and selection. Appl. Artif. Intell. 2014, 28, 220–242. [Google Scholar]
  14. Inuiguchi, M.; Yoshioka, Y.; Kusunoki, Y. Variable-precision dominance-based rough set approach and attribute reduction. Int. J. Approx. Reason. 2009, 50, 1199–1214. [Google Scholar] [CrossRef]
  15. Fang, Y.; Cao, X.-M.; Wang, X.; Min, F. Three-way sampling for rapid attribute reduction. Inf. Sci. 2022, 609, 26–45. [Google Scholar] [CrossRef]
  16. Li, J.-Z.; Yang, X.-B.; Song, X.-N.; Li, J.-H.; Wang, P.-X.; Yu, D.-J. Neighborhood attribute reduction: A multi-criterion approach. Int. J. Mach. Learn. Cybern. 2019, 10, 731–742. [Google Scholar] [CrossRef]
  17. Kang, Y.; Dai, J. Attribute reduction in inconsistent grey decision systems based on variable precision grey multigranulation rough set model. Appl. Soft Comput. 2023, 133, 109928. [Google Scholar] [CrossRef]
  18. Mac Parthaláin, N.; Jensen, R.; Diao, R. Fuzzy-rough set bireducts for data reduction. IEEE Trans. Fuzzy Syst. 2019, 28, 1840–1850. [Google Scholar] [CrossRef]
  19. Ali, A.; Ali, M.I.; Rehman, N. Soft dominance based rough sets with applications in information systems. Int. J. Approx. Reason. 2019, 113, 171–195. [Google Scholar] [CrossRef]
  20. Deepa, N.; Ganesan, K. Decision-making tool for crop selection for agriculture development. Neural Comput. Appl. 2019, 31, 1215–1225. [Google Scholar] [CrossRef]
  21. Wang, C.; Wang, C.; Qian, Y.; Leng, Q. Feature selection based on weighted fuzzy rough sets. IEEE Trans. Fuzzy Syst. 2024, 32, 4027–4037. [Google Scholar] [CrossRef]
  22. Yang, Y.; Chen, D. Improved Algorithm and Further Research of β-reduct for variable precision rough set. In Proceedings of the 2012 IEEE International Conference on Granular Computing, Hangzhou, China, 11–13 August 2012; pp. 783–786. [Google Scholar]
  23. Li, X.; Xiao, H.; Tang, J. New Variable Precision Reduction Algorithm for Decision Tables. IEEE Access 2023, 11, 42701–42712. [Google Scholar] [CrossRef]
  24. Yang, Y.; Chen, D.; Zhang, X.; Ji, Z.; Zhang, Y. Incremental feature selection by sample selection and feature-based accelerator. Appl. Soft Comput. 2022, 121, 108800. [Google Scholar] [CrossRef]
  25. Yang, Y.; Chen, D.; Wang, H. Active sample selection based incremental algorithm for attribute reduction with rough sets. IEEE Trans. Fuzzy Syst. 2016, 25, 825–838. [Google Scholar] [CrossRef]
  26. Qian, W.; Xie, Y.; Yang, B. A dynamic attribute reduction algorithm based on compound attribute measure. In Proceedings of the 2013 IEEE International Conference on Granular Computing (GrC), Beijing, China, 13–15 December 2013; pp. 236–241. [Google Scholar]
  27. Shu, W.; Qian, W. An incremental approach to attribute reduction from dynamic incomplete decision systems in rough set theory. Data Knowl. Eng. 2015, 100, 116–132. [Google Scholar] [CrossRef]
  28. Dong, L.; Wang, R.; Chen, D. Incremental feature selection with fuzzy rough sets for dynamic data sets. Fuzzy Sets Syst. 2023, 467, 108503. [Google Scholar] [CrossRef]
  29. Pan, Y.; Xu, W.; Ran, Q. An incremental approach to feature selection using the weighted dominance-based neighborhood rough sets. Int. J. Mach. Learn. Cybern. 2023, 14, 1217–1233. [Google Scholar] [CrossRef]
  30. Zhang, X.; Li, J.; Mi, J. Dynamic updating approximations approach to multi-granulation interval-valued hesitant fuzzy information systems with time-evolving attributes. Knowl.-Based Syst. 2022, 238, 107809. [Google Scholar] [CrossRef]
  31. Wang, L.; Pei, Z.; Qin, K.; Yang, L. Incremental updating fuzzy tolerance rough set approach in intuitionistic fuzzy information systems with fuzzy decision. Appl. Soft Comput. 2024, 151, 111119. [Google Scholar] [CrossRef]
  32. Yang, X.; Liu, D.; Yang, X.; Liu, K.; Li, T. Incremental fuzzy probability decision-theoretic approaches to dynamic three-way approximations. Inf. Sci. 2021, 550, 71–90. [Google Scholar] [CrossRef]
  33. Gu, S.; Qian, Y.; Hou, C. Incremental feature spaces learning with label scarcity. ACM Trans. Knowl. Discov. Data (TKDD) 2022, 16, 1–26. [Google Scholar] [CrossRef]
  34. Dong, L.-J.; Chen, D.-G. Incremental attribute reduction with rough set for dynamic datasets with simultaneously increasing samples and attributes. Int. J. Mach. Learn. Cybern. 2020, 11, 1339–1355. [Google Scholar] [CrossRef]
  35. Deng, X.; Li, J.; Qian, Y.; Liu, J. An Emerging Incremental Fuzzy Concept-Cognitive Learning Model Based on Granular Computing and Conceptual Knowledge Clustering. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 2417–2432. [Google Scholar] [CrossRef]
  36. Shu, W.-H.; Qian, W.-B.; Xie, Y.-H. Incremental approaches for feature selection from dynamic data with the variation of multiple objects. Knowl.-Based Syst. 2019, 163, 320–331. [Google Scholar] [CrossRef]
  37. Jing, Y.; Li, T.; Fujita, H.; Wang, B.; Cheng, N. An incremental attribute reduction method for dynamic data mining. Inf. Sci. 2018, 465, 202–218. [Google Scholar] [CrossRef]
  38. Su, L.; Yu, F. Matrix approach to spanning matroids of rough sets and its application to attribute reduction. Theor. Comput. Sci. 2021, 893, 105–116. [Google Scholar] [CrossRef]
  39. CaihuiLiu, K.; Miao, D. Novel matrix-based approaches to computing minimal and maximal descriptions in covering-based rough sets. Inf. Sci. 2020, 539, 312–326. [Google Scholar]
  40. Xu, Y.; Wang, Q.; Sun, W. Matrix-based incremental updating approximations in multigranulation rough set under two-dimensional variation. Int. J. Mach. Learn. Cybern. 2021, 12, 1041–1065. [Google Scholar] [CrossRef]
  41. Ma, F.; Ding, M.; Zhang, T.; Cao, J. Compressed binary discernibility matrix based incremental attribute reduction algorithm for group dynamic data. Neurocomputing 2019, 344, 20–27. [Google Scholar] [CrossRef]
  42. Sowkuntla, P.; Prasad, P.S.V.S. MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix. Appl. Intell. 2022, 52, 154–173. [Google Scholar] [CrossRef]
  43. Zhang, X.; Wang, J.; Hou, J. Matrix-based approximation dynamic update approach to multi-granulation neighborhood rough sets for intuitionistic fuzzy ordered datasets. Appl. Soft Comput. 2024, 163, 111915. [Google Scholar] [CrossRef]
Figure 1. Framework for increasing sample scenarios.
Figure 2. Framework for deleting sample scenarios.
Figure 3. Running time of Algorithm DVPRA-SD with different precisions.
Figure 4. Classification accuracy with precision 0.65.
Figure 5. Classification accuracy with precision 0.75.
Figure 6. Classification accuracy with precision 0.85.
Table 1. A decision table.

U     a1    a2    a3    D
x1    1     0     1     0
x2    1     0     1     1
x3    0     1     0     1
x4    0     0     1     2
x5    0     1     0     0
x6    0     1     0     1
x7    0     0     1     2
x8    0     0     1     2
Table 2. Modified decision table (change f(x5, D) = 0 to f(x5, D) = 1).

U     a1    a2    a3    D
x1    1     0     1     0
x2    1     0     1     1
x3    0     1     0     1
x4    0     0     1     2
x5    0     1     0     1
x6    0     1     0     1
x7    0     0     1     2
x8    0     0     1     2
Table 3. Scenarios for increasing the sample.

Original Sample       New Sample      Condition Attribute Value     Decision Attribute Value    Scenario
x ∈ U                 y ∈ U⁺ − U      ∀a, f(x, a) = f(y, a)         f(x, d) = f(y, d)           1
x ∈ U                 y ∈ U⁺ − U      ∀a, f(x, a) = f(y, a)         f(x, d) ≠ f(y, d)           2
∀xi ∈ U, ∃xj ∈ U      y ∈ U⁺ − U      ∃a, f(xi, a) ≠ f(y, a)        f(xj, d) = f(y, d)          3
∀x ∈ U                y ∈ U⁺ − U      ∃a, f(x, a) ≠ f(y, a)         f(x, d) ≠ f(y, d)           4
Table 4. Decision table after increasing the sample.

U     a1    a2    a3    D
x1    1     0     1     0
x2    1     0     1     1
x3    0     1     0     1
x4    0     0     1     2
x5    0     1     0     1
x6    0     1     0     1
x7    0     0     1     2
x8    0     0     1     2
y1    0     1     0     1
y2    0     0     1     2
Table 5. Scenarios for deleting the sample.

Original Sample    Deleted Sample    Condition Attribute Value    Decision Attribute Value    Scenario
x ∈ U⁻             z ∈ U − U⁻        ∀a, f(x, a) = f(z, a)        f(x, d) = f(z, d)           1
x ∈ U⁻             z ∈ U − U⁻        ∀a, f(x, a) = f(z, a)        f(x, d) ≠ f(z, d)           2
∄x ∈ U⁻            z ∈ U − U⁻        ∀a, f(x, a) = f(z, a)        "-"                         3
Table 6. Experimental datasets.

No    DataSets                      Abbr.    |U|     |C|    |U/D|    Missing Values
1     Iris                          -        150     4      3        No
2     Wine                          -        178     12     3        No
3     Seeds                         -        210     7      3        No
4     Forest_Fire                   F.F.     244     13     2        No
5     Credit Approval               C.A.     690     15     2        Yes
6     Raisin                        -        900     7      2        No
7     Wine Quality                  W.Q.     1599    11     4        No
8     Rice (Cammeo and Osmancik)    Rice     3810    7      2        No
Table 7. Running time for the algorithms.

Running Time (s)    DVPRA-SI (β = 0.65)    DVPRA-SI (β = 0.75)    DVPRA-SI (β = 0.85)    IFS-SSFA    IFSA
Iris                0.0156                 0.0156                 0.0156                 0.4023      0.2752
Wine                0.0469                 0.0312                 0.0481                 0.2363      2.4654
Seeds               0.0312                 0.0156                 0.0156                 0.1366      0.8966
F.F.                0.0781                 0.0625                 0.0625                 0.5153      3.5778
C.A.                0.5949                 0.5923                 0.6866                 7.3291      28.7329
Raisin              0.6880                 0.6536                 0.6733                 1.4084      13.05
W.Q.                2.4111                 2.1247                 2.1248                 37.61       94.901
Rice                11.8424                11.4422                11.4310                54.594      2403.0925

Citation: Li, X.; Dong, R.; Chen, Z.; Ren, J. Dynamic Variable Precision Attribute Reduction Algorithm. Symmetry 2024, 16, 1239. https://doi.org/10.3390/sym16091239
