Article

A Novel Hybrid Approach: Instance Weighted Hidden Naive Bayes

Liangjun Yu, Shengfeng Gan, Yu Chen and Dechun Luo
1 College of Computer, Hubei University of Education, Wuhan 430205, China
2 Hubei Co-Innovation Center of Basic Education Information Technology Services, Hubei University of Education, Wuhan 430205, China
3 School of Management, Huazhong University of Science and Technology, Wuhan 430071, China
4 Wuhan Eight Dimension Space Information Technology Co., Ltd., Wuhan 430071, China
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(22), 2982; https://doi.org/10.3390/math9222982
Submission received: 13 October 2021 / Revised: 15 November 2021 / Accepted: 19 November 2021 / Published: 22 November 2021
(This article belongs to the Special Issue Machine Learning and Data Mining: Techniques and Tasks)

Abstract: Naive Bayes (NB) is easy to construct but surprisingly effective, and it is one of the top ten classification algorithms in data mining. The conditional independence assumption of NB ignores the dependencies between attributes, so its probability estimates are often suboptimal. Hidden naive Bayes (HNB) adds a hidden parent to each attribute, which can reflect the dependencies from all the other attributes. Compared with other Bayesian network algorithms, it offers significant improvements in classification performance and avoids structure learning. However, the assumption in HNB that each instance is equally important for probability estimation does not always hold in real-world applications. To reflect the different influences of different instances in HNB, we modify the HNB model into an improved HNB model. This paper proposes a novel hybrid approach called instance weighted hidden naive Bayes (IWHNB), which combines instance weighting with the improved HNB model in one uniform framework. Instance weights are incorporated into the improved HNB model to calculate the probability estimates in IWHNB. Extensive experimental results show that IWHNB obtains significant improvements in classification performance compared with NB, HNB and other state-of-the-art competitors. Meanwhile, IWHNB maintains the low time complexity that characterizes HNB.

1. Introduction

Bayesian network (BN) combines knowledge of network topology and probability. It is a classical method that can be used to predict the class of a test instance [1]. The BN structure is a directed acyclic graph in which each edge reflects a dependency between attributes. Unfortunately, it has been confirmed that finding the optimal BN among arbitrary BNs is a non-deterministic polynomial (NP)-hard problem [2,3]. Naive Bayes (NB) is one of the most classic and efficient BN models. It is easy to construct but surprisingly effective [4]. The NB model is shown in Figure 1a, where $A_1, A_2, \ldots, A_m$ denote the $m$ attributes. The class variable $C$ is the parent node of each attribute, and each attribute $A_i$ is assumed to be independent of the others given the class.
The classification performance of NB is comparable to that of well-known classifiers [5,6]. However, the conditional independence assumption of NB ignores the dependencies between attributes in real-world applications, so its probability estimates are often suboptimal [7,8]. To reduce this primary weakness, many improved approaches have been proposed that relax the attribute independence assumption [9,10]. These approaches fall into five main categories: (1) structure extension, which extends NB's structure to overcome the attribute independence assumption [11,12,13,14]; (2) instance weighting, which builds an NB classifier on an instance weighted dataset [15,16,17,18]; (3) instance selection, which builds an NB classifier on a selected local subset of instances [19,20,21]; (4) attribute weighting, which builds an NB classifier on an attribute weighted dataset [22,23,24,25,26]; and (5) attribute selection, which builds an NB classifier on a selected attribute subset [27,28,29,30].
Structure extension adds a finite number of directed edges to reflect the dependencies between attributes [31]. It is an efficient way to overcome the conditional independence assumption of NB, since probabilistic relationships among attributes can be explicitly denoted by directed arcs [32]. Among the various structure extension approaches, hidden naive Bayes (HNB) is an improved model that essentially combines mixture dependencies of attributes [33]. It represents the Bayesian network topology well and reflects the dependencies from all other attributes. However, HNB regards each instance as equally important when computing probability estimates. This assumption is not always true, because different instances can make different contributions. To improve the classification performance of HNB, it is therefore interesting to study whether better classification performance can be achieved by constructing an improved HNB model on an instance weighted dataset. The resulting model, which combines instance weighting with the improved HNB model in one uniform framework, inherits the effectiveness of HNB and reflects the different influences of different instances.
In this study, we propose a novel hybrid model, referred to as instance weighted hidden naive Bayes (IWHNB), which combines instance weighting with an improved HNB model in one uniform framework. Building on the existing HNB model, we propose an improved HNB model that reflects the different contributions of different instances. In contrast to the existing HNB model, the improved HNB model is built on an instance weighted dataset: instance weights are incorporated into the generation of each hidden parent to reflect mixture dependencies of both attributes and instances. In our IWHNB approach, the improved HNB model is used to approximate the ground-truth attribute dependencies, while instance weights are calculated by the attribute value frequency-based instance weighted filter. Each instance weight is incorporated into the probability estimates and the classification formula of IWHNB.
We have completed experiments to compare IWHNB with NB, HNB, and other state-of-the-art competitors. Empirical studies show that IWHNB obtains more satisfactory classification performance than its competitors. Meanwhile, IWHNB maintains the low time complexity that characterizes HNB. The main contributions of the work presented in this paper can be briefly summarized as follows:
  • We reviewed the related work on structure extension and found that almost no existing method focuses on a hybrid paradigm that combines structure extension with instance weighting.
  • We reviewed the related work on existing instance weighting approaches and found that the Bayesian network in these studies is limited to NB.
  • The IWHNB approach is an improved approach which combines instance weighting with the improved HNB model into one uniform framework. It is a new paradigm to calculate discriminative instance weights for the structure extension model.
  • Although some training time is spent to calculate the weight of each instance, the experimental results show that our proposed IWHNB approach is still simple and efficient. Meanwhile, the classification performance of the IWHNB approach is more satisfactory than its competitors.
The paper is organized as follows. In Section 2, we review the related work with regard to this paper. In Section 3, we propose our IWHNB approach. In Section 4, we describe the experimental setup and results. In Section 5, we give our conclusions and outline suggestions for future research.

2. Related Work

2.1. Structure Extension

Structure extension adds a finite number of directed edges to encode probabilistic relationships. The extended NB structure encodes attribute independence statements, and its directed arcs explicitly characterize the joint probability distribution: given its parents, each attribute is independent of its non-descendants. Given a test instance $x$, represented by an attribute vector $\langle a_1, a_2, \ldots, a_m \rangle$, structure extended NB classifies $x$ by Equation (1):
$c(x) = \arg\max_{c \in C} P(c) \prod_{i=1}^{m} P(a_i \mid \Pi_{a_i}, c)$,  (1)
where $m$ is the number of attributes, $\Pi_{a_i}$ is the attribute value(s) of $\Pi_{A_i}$, and $\Pi_{A_i}$ denotes the set of parent nodes of $A_i$ excluding the class node $C$. The prior probability $P(c)$ is defined by Equation (2) as follows:
$P(c) = \frac{1 + \sum_{t=1}^{n} \delta(c_t, c)}{q + n}$,  (2)
where $n$ is the number of training instances, $c_t$ is the class label of the $t$th training instance, $q$ is the number of classes, and $\delta(\cdot)$ is a binary function that equals 1 when its two arguments are equal and 0 otherwise.
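To make the estimate concrete, here is a minimal Python sketch of the Laplace-smoothed prior of Equation (2); the function and variable names are illustrative assumptions, not part of the original work.

```python
from collections import Counter

def prior(class_labels, c):
    """Laplace-smoothed prior P(c) of Equation (2)."""
    n = len(class_labels)            # number of training instances
    q = len(set(class_labels))       # number of classes
    counts = Counter(class_labels)   # counts[c] = sum_t delta(c_t, c)
    return (1 + counts[c]) / (q + n)

# toy usage: two classes, five instances
labels = ["yes", "no", "yes", "yes", "no"]
print(prior(labels, "yes"))          # (1 + 3) / (2 + 5) = 0.571...
```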
A number of structure extension approaches have been proposed to alleviate the primary weakness of NB [31]. Tree-augmented naive Bayes (TAN) arranges the attribute nodes in a tree-like network structure [14]: the class is the parent node of each attribute, and each attribute may have at most one other attribute as an additional parent. Aggregating one-dependence estimators (AODE) averages the joint probability distributions of all qualified one-dependence classifiers [12]; this Bayesian model makes each attribute in turn the parent node of all other attributes and does not need to learn the topological structure between attributes. Weighted average of one-dependence estimators (WAODE) assigns different weights to different one-dependence estimators [34], where each attribute is set as the root attribute once.
Among the various structure extension approaches, hidden naive Bayes (HNB) [33] is an improved model that essentially combines mixture dependencies of attributes. Since our proposed IWHNB approach is based on the HNB model, we introduce HNB in detail here. The existing HNB model is shown in Figure 1b. $C$ is the class label. A hidden parent $A_{hp_i}$, $i = 1, 2, \ldots, m$, is created for each attribute $A_i$. A dashed directed line from each hidden parent $A_{hp_i}$ to attribute $A_i$ distinguishes it from a regular arc. A test instance $x = \langle a_1, \ldots, a_m \rangle$ is classified by HNB as in Equation (3):
$c(x) = \arg\max_{c \in C} P(c) \prod_{i=1}^{m} P(a_i \mid a_{hp_i}, c)$,  (3)
where the prior probability $P(c)$ is also computed by Equation (2). A hidden parent $A_{hp_i}$ is created for each attribute $A_i$. The probability $P(a_i \mid a_{hp_i}, c)$ is formalized as Equation (4):
$P(a_i \mid a_{hp_i}, c) = \sum_{j=1, j \neq i}^{m} W_{ij} \cdot P(a_i \mid a_j, c)$,  (4)
where $W_{ij}$ ($i, j = 1, 2, \ldots, m$ and $j \neq i$) are the weights calculated to reflect the influences from all other attributes. $W_{ij}$ is calculated as Equation (5):
$W_{ij} = \frac{I_P(A_i; A_j \mid C)}{\sum_{j=1, j \neq i}^{m} I_P(A_i; A_j \mid C)}$,  (5)
where $I_P(A_i; A_j \mid C)$ is the conditional mutual information formalized as Equation (6):
$I_P(A_i; A_j \mid C) = \sum_{a_i, a_j, c} P(a_i, a_j \mid c) \log \frac{P(a_i, a_j \mid c)}{P(a_i \mid c)\,P(a_j \mid c)}$,  (6)
where $P(a_i \mid c)$, $P(a_j \mid c)$ and $P(a_i \mid a_j, c)$ are formalized as Equations (7)–(9), respectively:
$P(a_i \mid c) = \frac{1 + \sum_{t=1}^{n} \delta(a_{ti}, a_i)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} \delta(c_t, c)}$,  (7)
$P(a_j \mid c) = \frac{1 + \sum_{t=1}^{n} \delta(a_{tj}, a_j)\,\delta(c_t, c)}{n_j + \sum_{t=1}^{n} \delta(c_t, c)}$,  (8)
$P(a_i \mid a_j, c) = \frac{1 + \sum_{t=1}^{n} \delta(a_{ti}, a_i)\,\delta(a_{tj}, a_j)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} \delta(a_{tj}, a_j)\,\delta(c_t, c)}$,  (9)
where $a_{ti}$ is the $i$th attribute value of the $t$th training instance, and $n_i$ is the number of values of the $i$th attribute.
In the HNB model, each hidden parent essentially combines mixture dependencies from all other attributes. The HNB model avoids structure learning with its intractable computational complexity and reflects the dependencies from all other attributes, but it regards each instance as equally important when computing probability estimates.
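As an illustration of how the hidden parents are built, the following Python sketch estimates the conditional mutual information of Equation (6) and the normalized weights $W_{ij}$ of Equation (5) from raw frequency counts. It is a simplified sketch under our own assumptions (categorical NumPy arrays, no Laplace smoothing), not the authors' implementation.

```python
import numpy as np
from collections import Counter

def cond_mutual_info(X, y, i, j):
    """Estimate I_P(A_i; A_j | C) as written in Equation (6), using raw
    frequencies (the smoothed estimates of Equations (7)-(9) are omitted
    here for brevity)."""
    mi = 0.0
    for c in np.unique(y):
        xi, xj = X[y == c, i], X[y == c, j]
        m_c = len(xi)                          # instances with class c
        cnt_i, cnt_j = Counter(xi), Counter(xj)
        cnt_ij = Counter(zip(xi, xj))
        for (a, b), n_ab in cnt_ij.items():
            p_ab, p_a, p_b = n_ab / m_c, cnt_i[a] / m_c, cnt_j[b] / m_c
            mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def hidden_parent_weights(X, y):
    """W_ij of Equation (5): conditional mutual information normalized
    over all other attributes j != i (row i gives the weights for A_i)."""
    m = X.shape[1]
    I = np.array([[cond_mutual_info(X, y, i, j) if i != j else 0.0
                   for j in range(m)] for i in range(m)])
    return I / I.sum(axis=1, keepdims=True)
```

In practice, the Laplace-smoothed estimates of Equations (7)–(9) would be used in place of the raw frequencies above.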

2.2. Instance Weighting

Naive Bayes (NB) is one of the most classic and efficient Bayesian network models. The classification performance of the NB classifier is comparable to that of state-of-the-art classifiers. NB uses Equation (10) to classify a test instance $x$:
$c(x) = \arg\max_{c \in C} P(c) \prod_{i=1}^{m} P(a_i \mid c)$,  (10)
where the prior probability $P(c)$ is also computed by Equation (2). In the meantime, the conditional probability $P(a_i \mid c)$ is defined by Equation (11):
$P(a_i \mid c) = \frac{1 + \sum_{t=1}^{n} \delta(a_{ti}, a_i)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} \delta(c_t, c)}$,  (11)
where $n_i$ is the number of values of the $i$th attribute.
Instance weighting is a practical way to improve NB by constructing an NB classifier on an instance weighted dataset [35]. It calculates a discriminative weight for each instance according to the distribution of the instances. Instance weighted NB still uses Equation (10) to classify a test instance $x$, so its classification formula is the same as that of NB. Unlike the NB classifier, however, instance weighted NB calculates a discriminative weight for each instance and incorporates these weights into the prior and conditional probability estimates. The prior probability $P(c)$ is redefined by Equation (12):
$P(c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(c_t, c)}{q + \sum_{t=1}^{n} w_t}$,  (12)
where $w_t$ is the weight of the $t$th training instance. In the meantime, the conditional probability $P(a_i \mid c)$ is redefined by Equation (13):
$P(a_i \mid c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(a_{ti}, a_i)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} w_t\,\delta(c_t, c)}$.  (13)
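For concreteness, the following hedged Python sketch (illustrative names and data layout, not the authors' code) plugs per-instance weights into the estimates of Equations (12) and (13); with all weights set to 1 it reduces to standard NB.

```python
import numpy as np

def weighted_prior(y, w, c, q):
    """Instance-weighted, Laplace-smoothed prior P(c), Equation (12)."""
    return (1.0 + w[y == c].sum()) / (q + w.sum())

def weighted_conditional(X, y, w, i, a_i, c, n_i):
    """Instance-weighted conditional P(a_i | c), Equation (13).
    n_i is the number of distinct values of the ith attribute."""
    match = (X[:, i] == a_i) & (y == c)
    return (1.0 + w[match].sum()) / (n_i + w[y == c].sum())

# toy usage: two binary attributes, binary class, unit weights recover NB
X = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
y = np.array([0, 0, 1, 1])
w = np.ones(len(y))
print(weighted_prior(y, w, 0, q=2))                   # (1 + 2) / (2 + 4) = 0.5
print(weighted_conditional(X, y, w, 0, 0, 0, n_i=2))  # (1 + 1) / (2 + 2) = 0.5
```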
How to calculate a different weight for each instance in order to build an instance weighted NB classifier is crucial. Instance weighting approaches broadly fall into two categories: eager learning and lazy learning. Eager learning uses general characteristics of the instances to calculate instance weights during the training phase, so each instance weight is computed as a preprocessing step before classification. Rather than calculating instance weights from general characteristics of the instances, lazy learning optimizes instance weights at classification time: it first uses search algorithms to search for instance weights, and then optimizes them by building the target classifier on the instance weighted NB. Lazy learning incurs a higher computational cost, so eager learning is normally faster at calculating instance weights, while lazy learning tends to achieve better classification performance.
Discriminatively weighted NB uses the estimated conditional probability loss to calculate discriminative instance weights [15]. It is an eager learning approach and achieves remarkable results in both classification accuracy and ranking. Attribute value frequency weighted NB is a simple and efficient eager learning approach [18]; its instance weighting filter focuses on the frequency of each attribute value, calculating each instance weight from the number of its attribute values and their frequencies. Lazy NB clones each instance in the neighborhood of the test instance [16]. It is a lazy learning approach: it calculates the similarity between the test instance and each training instance, and clones are then made based on that similarity. The improved algorithm called instance weighted NB finds the mode within the training instances, and then calculates each weight according to the similarity between the mode and each instance [17].
These studies reveal that the Bayesian network in existing instance weighting approaches is limited to NB. It is therefore interesting to study whether better classification performance can be obtained by exploiting instance weighting on structure extended NB.

3. Instance Weighted Hidden Naive Bayes

The studies above show that both structure extension and instance weighting can improve classification performance. Structure extension extends the structure to overcome the unrealistic independence assumption, but regards each instance as equally important. Instance weighting weights each instance discriminatively to overcome the conditional independence assumption, and each instance weight is incorporated into the probability estimates, but the Bayesian network of existing instance weighting approaches is limited to NB. Based on the above analysis, we study whether more satisfactory classification results can be obtained by exploiting instance weighting on structure extended NB.
For these reasons, this paper focuses on a new hybrid paradigm that combines structure extension with instance weighting. The extended structure should reflect the dependencies between attributes more accurately. Meanwhile, different instance weights can be incorporated into the probability estimates and the classification formula to give more accurate results than traditional methods, since learned instance weights reflect the different contributions of different instances. Based on these considerations, we propose a new hybrid approach, called instance weighted hidden naive Bayes (IWHNB), which combines an improved hidden naive Bayes model with instance weighting in one hybrid model. We modify the HNB model into the instance weighted hidden naive Bayes model, which is described in detail in the following subsection.

3.1. The Instance Weighted Hidden Naive Bayes Model

Hidden naive Bayes (HNB) generates a hidden parent for each attribute to reflect the dependencies from all other attributes [33]. Figure 1 illustrates the relationships among the related models, as if each had evolved directly from the previous one. As Figure 1a shows, naive Bayes (NB) is one of the most classic and efficient models in BNs. As Figure 1b shows, the HNB model adds a hidden parent to each attribute, but it regards each instance as equally important; it avoids structure learning with its intractable computational complexity, and it can be interpreted as setting the weight of each instance to 1 by default. However, in the training dataset some instances contribute more to classification than others, so they should have more influence than less important instances. Different contributions for different instances can be a very important consideration.
Motivated by the work on HNB [33], we modify the HNB model into the instance weighted hidden naive Bayes model in our IWHNB approach. The instance weighted hidden naive Bayes model is shown in Figure 1c. $C$ is the class label and is the parent node of each attribute. A hidden parent $A_{hp_i}$, $i = 1, 2, \ldots, m$, is also created for each attribute $A_i$. $n$ is the number of training instances and $w_t$ is the weight of the $t$th training instance. In the improved HNB model, each instance weight $w_t$ is integrated into the generation of the hidden parent of each attribute. A dashed directed line from each hidden parent $A_{hp_i}$ to attribute $A_i$ distinguishes it from a regular arc. Different from the existing HNB model, the improved HNB model not only reflects the dependencies from all other attributes but also reflects the different contributions of different instances.
In our IWHNB approach, a test instance $x = \langle a_1, \ldots, a_m \rangle$ is classified by Equation (14):
$c(x) = \arg\max_{c \in C} P(c) \prod_{i=1}^{m} P(a_i \mid a_{hp_i}, c)$.  (14)
Although the classification formula of our IWHNB approach is the same as that of HNB, the calculations of the probabilities $P(c)$ and $P(a_i \mid a_{hp_i}, c)$ are different. We embed each instance weight $w_t$ into the generation of each hidden parent, and instance weights are also incorporated into calculating the probabilities. The detailed processes are described as follows. Firstly, we redefine the prior probability $P(c)$ as Equation (15):
$P(c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(c_t, c)}{q + \sum_{t=1}^{n} w_t}$.  (15)
Secondly, the probability $P(a_i \mid a_{hp_i}, c)$ is formalized as Equation (16):
$P(a_i \mid a_{hp_i}, c) = \sum_{j=1, j \neq i}^{m} W_{ij} \cdot P(a_i \mid a_j, c)$,  (16)
where $P(a_i \mid a_j, c)$ and $W_{ij}$ are both redefined in our IWHNB approach. We redefine the probability $P(a_i \mid a_j, c)$ as Equation (17):
$P(a_i \mid a_j, c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(a_{ti}, a_i)\,\delta(a_{tj}, a_j)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} w_t\,\delta(a_{tj}, a_j)\,\delta(c_t, c)}$,  (17)
where $w_t$ is the weight of the $t$th training instance.
Thirdly, the weights $W_{ij}$ are measured by the conditional mutual information $I_P(A_i; A_j \mid C)$ to reflect the influences from other attributes. $W_{ij}$ is calculated as Equation (18):
$W_{ij} = \frac{I_P(A_i; A_j \mid C)}{\sum_{j=1, j \neq i}^{m} I_P(A_i; A_j \mid C)}$,  (18)
where $I_P(A_i; A_j \mid C)$ is defined as Equation (19):
$I_P(A_i; A_j \mid C) = \sum_{a_i, a_j, c} P(a_i, a_j \mid c) \log \frac{P(a_i, a_j \mid c)}{P(a_i \mid c)\,P(a_j \mid c)}$.  (19)
In the process of computing $I_P(A_i; A_j \mid C)$ and $W_{ij}$, we incorporate instance weights into the probability estimates. The probability $P(a_i \mid a_j, c)$ is redefined as Equation (17), while $P(a_i \mid c)$ and $P(a_j \mid c)$ are redefined as Equations (20) and (21), respectively:
$P(a_i \mid c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(a_{ti}, a_i)\,\delta(c_t, c)}{n_i + \sum_{t=1}^{n} w_t\,\delta(c_t, c)}$,  (20)
$P(a_j \mid c) = \frac{1 + \sum_{t=1}^{n} w_t\,\delta(a_{tj}, a_j)\,\delta(c_t, c)}{n_j + \sum_{t=1}^{n} w_t\,\delta(c_t, c)}$.  (21)
Finally, the probability $P(a_i \mid a_{hp_i}, c)$ is computed by Equation (16), and the test instance is classified by Equation (14). Instance weights are thus incorporated into both the probability estimates and the classification formula.
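A minimal sketch of the resulting scoring step is shown below, assuming the instance-weighted estimates of Equations (15) and (17) and the weights of Equation (18) have already been computed; the container layouts (dictionaries keyed by class and attribute indices) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def classify(x, classes, prior, cond, W):
    """Score a test instance with Equations (14) and (16).

    prior[c]        -- instance-weighted P(c), Equation (15)
    cond[(i, j, c)] -- dict mapping (a_i, a_j) to P(a_i | a_j, c), Equation (17)
    W[i, j]         -- hidden-parent weights, Equation (18)
    """
    m = len(x)
    best_c, best_score = None, -np.inf
    for c in classes:
        log_score = np.log(prior[c])
        for i in range(m):
            # P(a_i | a_hp_i, c) = sum_j W_ij * P(a_i | a_j, c), Equation (16);
            # the Laplace smoothing in Equation (17) keeps these terms positive.
            p = sum(W[i, j] * cond[(i, j, c)].get((x[i], x[j]), 0.0)
                    for j in range(m) if j != i)
            log_score += np.log(p)
        if log_score > best_score:
            best_c, best_score = c, log_score
    return best_c
```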
In our IWHNB approach, the improved HNB model is modified to reflect the influences of both attributes and instances. Different contributions for different instances are considered when generating the improved HNB model. Different influences of different instance weights are embedded to generate a hidden parent of each attribute. Now, the only question is how to quantify different instance weights. To address this question, the next subsection will describe how to quantify the weight of each instance.

3.2. The Weight of Each Instance

To maintain the computational simplicity that characterizes HNB, we exploit an eager learning method, the attribute value frequency-based instance weighted filter, to calculate each instance weight. The frequency of an attribute value is the ratio between the number of occurrences of that value and the number of instances, and it can carry important information for defining instance weights [18]. To quantify it, $f_{ti}$ denotes the frequency of the attribute value $a_{ti}$ (the $i$th attribute value of the $t$th instance). We define the attribute value frequency by Equation (22):
$f_{ti} = \frac{\sum_{r=1}^{n} \delta(a_{ri}, a_{ti})}{n}$,  (22)
where $a_{ri}$ is the $i$th attribute value of the $r$th instance. For the $t$th training instance, its attribute value frequency vector is denoted as $\langle f_{t1}, f_{t2}, \ldots, f_{tm} \rangle$. The more frequently an attribute value appears, the more influence it has on the instance, so the frequencies of the attribute values can reflect the importance of the instance well.
In our IWHNB approach, not only the attribute value frequency but also the number of values of each attribute is considered. $\langle n_1, n_2, \ldots, n_m \rangle$ denotes the numbers of values of the attributes, which reflect the diversity of each attribute. Each instance weight is positively correlated with both its attribute value frequency vector $\langle f_{t1}, f_{t2}, \ldots, f_{tm} \rangle$ and the attribute value number vector $\langle n_1, n_2, \ldots, n_m \rangle$.
Finally, we set the weight of each instance to be the dot product of its attribute value frequency vector and the attribute value number vector. The weight $w_t$ of the $t$th instance is formalized as Equation (23):
$w_t = \langle f_{t1}, f_{t2}, \ldots, f_{tm} \rangle \cdot \langle n_1, n_2, \ldots, n_m \rangle = \sum_{i=1}^{m} f_{ti} \cdot n_i$.  (23)
Based on the simple and efficient attribute value frequency-based instance weighted filter, a proper weight is assigned to each different instance. Discriminative instance weights are embedded to generate a hidden parent of each attribute to reflect the influences of both attributes and instances.
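The filter itself takes only a few lines; a vectorized Python sketch of Equations (22) and (23) (illustrative, not the authors' code) is:

```python
import numpy as np

def instance_weights(X):
    """Attribute value frequency-based instance weights, Equations (22)-(23)."""
    n, m = X.shape
    # f[t, i]: frequency of the ith attribute value of the tth instance
    f = np.empty((n, m))
    # n_vals[i]: number of distinct values of the ith attribute
    n_vals = np.empty(m)
    for i in range(m):
        values, counts = np.unique(X[:, i], return_counts=True)
        freq = dict(zip(values, counts / n))
        f[:, i] = [freq[v] for v in X[:, i]]
        n_vals[i] = len(values)
    return f @ n_vals              # w_t = sum_i f_ti * n_i

# toy usage: 4 instances, 2 attributes
X = np.array([["a", "x"], ["a", "y"], ["b", "x"], ["a", "x"]])
print(instance_weights(X))         # [3. 2. 2. 3.]
```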
Now, the detailed learning algorithm for our instance weighted hidden naive Bayes (IWHNB for short) can be described as Algorithm 1. From Algorithm 1, the time complexity of computing the instance weights is $O(3nm)$, where $n$ is the number of training instances and $m$ is the number of attributes. IWHNB needs to compute the conditional mutual information for each pair of attributes, which takes $O(qm^2v^2)$, where $v$ is the average number of values per attribute and $q$ is the number of class labels. The time complexity of computing the weights $W_{ij}$ is $O(m^2)$. Since the underlying frequencies are collected over the $n$ training instances, the training time complexity of IWHNB is $O(3nm + nm^2 + nqm^2v^2)$. The training procedure of IWHNB is similar to that of HNB, except for the additional procedure for calculating each instance weight. At classification time, Equation (14) is used to classify a test instance, which takes $O(qm^2)$. The total time complexity of the IWHNB algorithm is $O(qm^2 + 3nm + nm^2 + nqm^2v^2)$, which shows that IWHNB is simple and efficient.
Algorithm 1 Instance Weighted Hidden Naive Bayes
Input: TD, a training dataset; a test instance x
Output: the predicted class label of x
1: Initialize all instance weights by the attribute value frequency-based instance weighted filter
2: for each training instance t = 1 to n do
3:   for each of its attribute values, i = 1 to m do
4:     Set the new weight of the tth instance to the dot product of its attribute value frequency vector ⟨f_t1, f_t2, ..., f_tm⟩ and the attribute value number vector ⟨n_1, n_2, ..., n_m⟩
5:   end for
6: end for
7: Incorporate the discriminative instance weights into the probability estimates
8: for each possible class label c that C takes do
9:   Calculate P(c) using Equation (15)
10:   for each attribute A_i, i = 1 to m do
11:     Calculate P(a_i | c) using Equation (20)
12:   end for
13:   for each pair of attributes A_i and A_j (i ≠ j) do
14:     Calculate P(a_i | a_j, c) using Equation (17)
15:     Calculate I_P(A_i; A_j | C) using Equation (19)
16:     Calculate W_ij using Equation (18)
17:   end for
18:   Calculate P(a_i | a_hp_i, c) using Equation (16)
19: end for
20: Predict the class label c(x) of x using Equation (14)

4. Experiments and Results

In order to verify the performance of our proposed IWHNB, we completed experiments to compare IWHNB with NB, HNB and other state-of-the-art competitors. These state-of-the-art competitors and their abbreviations are listed as follows. HNB, AODE and TAN are state-of-the-art structure extension approaches. AVFWNB is an eager instance weighting approach. AIWNB is a new improved approach which combines instance weighting with attribute weighting.
  • NB: Naive Bayes [36].
  • HNB: Hidden naive Bayes [33].
  • AVFWNB: Attribute value frequency weighted NB [18].
  • AIWNB: Attribute and instance weighted NB [35].
  • AODE: Aggregating one-dependence estimators [12].
  • TAN: Tree-augmented NB [14].
We performed our study on 36 University of California, Irvine (UCI) datasets [37], published on the WEKA platform [38]. They come from a wide range of fields and have various data characteristics. During preprocessing, we replace missing attribute values with the modes of the nominal attribute values or the means of the numerical attribute values. We also use Fayyad and Irani's minimum description length (MDL) method [39] to discretize numerical attribute values. If the number of distinct values of an attribute equals the number of instances, the attribute acts as an identifier and is redundant, so we delete attributes of this type. Three such attributes were deleted: "Hospital Number" in the dataset "colic.ORIG", "instance name" in the dataset "splice", and "animal" in the dataset "zoo".
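For illustration, a small Python sketch of the imputation step described above is given below (pandas is an assumed tool here; the MDL discretization of [39] is not reproduced):

```python
import pandas as pd

def impute(df):
    """Replace missing nominal values with the mode and missing numeric
    values with the mean, as described above."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].mean())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out

# toy usage
df = pd.DataFrame({"age": [24.0, None, 30.0], "color": ["red", None, "red"]})
print(impute(df))
```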
Table 1 compares the classification accuracy of each approach on each dataset, averaged over ten runs of 10-fold cross-validation. A two-tailed t-test at the p = 0.05 significance level [40,41] is used to compare the proposed IWHNB with its competitors. The symbol • denotes that our proposed IWHNB achieves a significant improvement over the competitor, and the symbol ∘ denotes a significant degradation. The second-to-last line reports the average accuracy of each algorithm, which provides a gross indicator of its classification performance across all datasets. At the bottom of Table 1, W/T/L indicates that our proposed IWHNB wins on W datasets, ties on T datasets and loses on L datasets against each competitor.
Then, the summary test results based on a corrected paired two-tailed t-test with the p = 0.05 significance level are shown in Table 2. For each entry i ( j ) , i is the number of datasets on which the algorithm in the column achieves higher classification accuracy than the algorithm in the corresponding row, and j is the number of datasets on which the algorithm in the column achieves significant wins with the p = 0.05 significance level with regard to the algorithm in the corresponding row. The ranking results are summarized in Table 3. The first column is the difference between the total number of wins and the total number of losses that the corresponding algorithm achieves compared with all the other algorithms, which is used to generate the ranking. The second column is the total number of winning datasets. The third column is the total number of losing datasets.
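For readers who wish to reproduce the significance tests, the sketch below implements one common form of the corrected resampled paired t-test of Nadeau and Bengio [40]; the variance-correction factor (1/k + n_test/n_train) and all names are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np
from scipy import stats

def corrected_paired_ttest(acc_a, acc_b, n_train, n_test):
    """Corrected resampled paired t-test over k paired accuracy estimates,
    e.g. k = 100 for ten runs of 10-fold cross-validation."""
    d = np.asarray(acc_a) - np.asarray(acc_b)
    k = len(d)
    var = np.var(d, ddof=1)
    t = d.mean() / np.sqrt((1.0 / k + n_test / n_train) * var)
    p = 2 * stats.t.sf(abs(t), df=k - 1)   # two-tailed p-value
    return t, p

# toy usage: 100 paired fold accuracies from ~900/100 train/test splits
rng = np.random.default_rng(0)
a = 0.86 + 0.02 * rng.standard_normal(100)
b = 0.84 + 0.02 * rng.standard_normal(100)
print(corrected_paired_ttest(a, b, n_train=900, n_test=100))
```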
Based on comparison results, the conclusion is evident that our IWHNB approach obtains the best experimental results compared with its competitors. We summarize the conclusions briefly as follows:
  • According to results in Table 1, the averaged classification accuracy of IWHNB across all datasets is 86.37%. It is considerably higher than its competitors, such as NB (83.31%), HNB (85.86%), AVFWNB (84.21%), AIWNB (84.94%), AODE (85.68%) and TAN (84.95%). This suggests that our proposed IWHNB approach is effective.
  • IWHNB obtains the most satisfactory experimental results in accuracy. IWHNB outperforms NB (17 wins, 18 ties and 1 loss), HNB (9 wins, 27 ties and 0 losses), AVFWNB (13 wins, 21 ties and 2 losses), AIWNB (8 wins, 25 ties and 3 losses), AODE (6 wins, 30 ties and 0 losses) and TAN (9 wins, 25 ties and 2 losses).
  • The summary and ranking test results show that IWHNB is overall the best across all datasets (62 wins and 8 losses). The descending sort across all datasets is IWHNB, HNB, AIWNB, AODE, TAN, AVFWNB and NB.
  • Compared with HNB, IWHNB considerably improves the classification accuracy (nine wins and zero losses). This suggests that this improved hybrid approach which combines the improved HNB model with instance weighting improves the classification performance effectively.
Furthermore, we observe the performance of IWHNB in terms of the elapsed training time (in milliseconds). Our experiments were conducted on a Linux machine with a 3.2 GHz processor and 8 GB of RAM. The elapsed training time comparison results are shown in Table 4, Table 5 and Table 6. Note that the meanings of the t-test results in these tables are opposite to those in Table 1, Table 2 and Table 3: for the elapsed training time, a smaller number, indicating lower time complexity, is better than a larger number. Thus, in Table 4, the symbols ∘ and • denote statistically significant improvement or degradation over the competitors, respectively. Each W/T/L entry indicates that, compared to a competitor, our proposed IWHNB wins on W datasets, ties on T datasets, and loses on L datasets. In Table 5, the value i of an entry i (j) denotes the number of datasets on which the algorithm in the column loses to the algorithm in the row. In Table 6, the second and third columns report the total numbers of losses and wins, respectively, and the first column is the difference between them. We summarize the main highlights of these comparisons as follows:
1. According to the results in Table 4, the average elapsed training time of IWHNB is 13.15 milliseconds, only slightly higher than that of HNB (12.56 milliseconds). Therefore, our proposed IWHNB approach maintains the computational simplicity that characterizes HNB. It is a simple, efficient and effective approach.
2. Compared with TAN, IWHNB has a lower training time. The average elapsed training time of IWHNB is smaller than that of TAN (15.84 milliseconds); IWHNB reduces the elapsed training time on 8 datasets and loses on 0 datasets.
3. According to Table 4, Table 5 and Table 6, IWHNB does have a higher training time than NB, AVFWNB, AIWNB, AODE and HNB, but it still maintains computational simplicity. Structure extension and instance weighting are both accomplished within our IWHNB approach.
Table 4. Comparisons of the elapsed training time for IWHNB versus NB, HNB, AVFWNB, AIWNB, AODE and TAN.
Dataset | IWHNB | NB | HNB | AVFWNB | AIWNB | AODE | TAN
anneal | 19.42±12.25 | 0.45±0.88 | 13.28±6.04 | 0.93±1.37 | 6.20±4.53 | 7.84±1.64 | 17.18±7.88
anneal.ORIG | 16.80±3.95 | 0.25±0.44 | 12.32±1.79 | 0.36±0.50 | 4.20±0.65 | 6.95±1.00 | 14.19±1.10
audiology | 63.53±18.79 | 0.13±0.34 | 57.39±4.66 | 0.24±0.43 | 6.53±1.38 | 7.12±1.20 | 95.49±9.07
autos | 4.65±1.25 | 0.03±0.17 | 4.35±0.63 | 0.09±0.29 | 1.15±0.41 | 0.86±0.43 | 4.89±0.65
balance-scale | 0.18±0.39 | 0.06±0.42 | 0.08±0.27 | 0.07±0.26 | 0.17±0.40 | 0.05±0.22 | 0.18±0.46
breast-cancer | 0.52±0.70 | 0.02±0.14 | 0.40±0.49 | 0.01±0.10 | 0.27±0.45 | 0.16±0.37 | 0.38±0.55
breast-w | 0.62±0.49 | 0.03±0.17 | 0.48±0.50 | 0.04±0.20 | 0.48±0.52 | 0.30±0.48 | 0.59±0.49
colic | 1.95±0.50 | 0.09±0.29 | 1.77±0.49 | 0.13±0.34 | 0.83±0.40 | 0.98±0.38 | 2.34±0.52
colic.ORIG | 3.77±1.20 | 0.09±0.29 | 3.28±0.53 | 0.13±0.34 | 1.54±0.54 | 1.44±0.62 | 4.05±0.87
credit-a | 1.83±0.88 | 0.05±0.22 | 1.34±0.50 | 0.18±0.39 | 0.78±0.44 | 0.87±0.46 | 1.59±0.73
credit-g | 3.00±0.59 | 0.15±0.36 | 3.12±0.59 | 0.26±0.44 | 1.66±0.81 | 2.17±0.45 | 3.42±0.81
diabetes | 0.48±0.50 | 0.09±0.29 | 0.36±0.48 | 0.09±0.29 | 0.37±0.49 | 0.31±0.46 | 0.40±0.55
glass | 0.42±0.50 | 0.00±0.00 | 0.38±0.49 | 0.02±0.14 | 0.13±0.34 | 0.07±0.26 | 0.44±0.52
heart-c | 0.70±0.61 | 0.10±0.30 | 0.68±0.49 | 0.04±0.20 | 0.29±0.46 | 0.23±0.42 | 0.90±0.61
heart-h | 0.63±0.49 | 0.07±0.26 | 0.68±0.49 | 0.03±0.17 | 0.31±0.46 | 0.27±0.47 | 0.71±0.56
heart-statlog | 0.60±0.51 | 0.05±0.22 | 0.37±0.51 | 0.03±0.17 | 0.29±0.46 | 0.28±0.45 | 0.43±0.54
hepatitis | 0.81±0.51 | 0.06±0.24 | 0.72±0.51 | 0.03±0.17 | 0.35±0.48 | 0.39±0.49 | 0.95±0.39
hypothyroid | 19.24±1.92 | 1.16±0.60 | 18.46±1.46 | 1.90±0.70 | 10.27±1.63 | 17.96±3.36 | 21.08±2.44
ionosphere | 5.20±0.77 | 0.06±0.24 | 5.26±0.63 | 0.09±0.29 | 2.36±0.50 | 2.78±1.38 | 7.79±1.23
iris | 0.03±0.17 | 0.00±0.00 | 0.05±0.22 | 0.04±0.20 | 0.05±0.22 | 0.05±0.22 | 0.05±0.22
kr-vs-kp | 27.11±5.29 | 1.33±0.64 | 23.27±1.22 | 1.71±0.71 | 23.76±23.79 | 23.15±4.53 | 30.78±5.71
labor | 0.51±0.85 | 0.00±0.00 | 0.38±0.51 | 0.02±0.14 | 0.18±0.39 | 0.05±0.22 | 0.52±0.56
letter | 81.00±21.55 | 4.51±0.86 | 72.44±8.74 | 9.73±1.05 | 74.51±36.78 | 66.65±23.94 | 79.07±13.71
lymphography | 1.16±0.72 | 0.02±0.14 | 1.15±0.67 | 0.04±0.20 | 0.47±0.50 | 0.26±0.44 | 1.18±0.67
mushroom | 24.98±3.41 | 1.85±1.37 | 24.47±1.47 | 4.26±0.79 | 25.80±4.66 | 25.55±3.54 | 27.06±3.80
primary-tumor | 3.28±0.57 | 0.06±0.24 | 3.57±0.76 | 0.09±0.29 | 0.91±0.47 | 0.58±0.50 | 3.76±1.68
segment | 11.00±1.36 | 0.42±0.52 | 12.32±1.52 | 0.78±0.50 | 7.52±1.49 | 6.87±1.19 | 13.28±1.78
sick | 18.14±1.98 | 1.04±0.53 | 17.77±1.64 | 1.85±0.59 | 11.92±2.79 | 16.74±1.54 | 22.35±5.21
sonar | 7.16±0.85 | 0.07±0.26 | 7.76±1.40 | 0.18±0.39 | 2.24±0.51 | 4.17±1.09 | 29.32±3.92
soybean | 17.57±2.01 | 0.30±0.46 | 17.80±1.84 | 0.53±0.50 | 4.85±1.86 | 5.59±1.54 | 20.50±2.12
splice | 81.68±7.16 | 1.79±0.67 | 89.51±4.19 | 3.14±0.75 | 63.97±8.34 | 80.66±11.04 | 101.74±11.40
vehicle | 2.95±0.58 | 0.13±0.37 | 2.87±0.51 | 0.25±0.44 | 1.45±0.52 | 1.63±0.60 | 3.09±0.64
vote | 0.92±0.34 | 0.07±0.26 | 0.90±0.46 | 0.11±0.31 | 0.52±0.54 | 0.64±0.54 | 1.00±0.45
vowel | 2.86±0.59 | 0.15±0.36 | 3.04±0.63 | 0.23±0.42 | 1.14±0.40 | 1.11±0.31 | 2.93±0.57
waveform-5000 | 47.56±2.10 | 1.61±0.58 | 49.06±2.40 | 3.05±0.85 | 29.01±1.84 | 46.70±4.64 | 55.69±2.93
zoo | 0.98±0.38 | 0.00±0.00 | 1.01±0.33 | 0.08±0.27 | 0.19±0.39 | 0.21±0.43 | 1.03±0.36
Average | 13.15 | 0.45 | 12.56 | 0.85 | 7.96 | 9.21 | 15.84
W/T/L | - | 0/3/33 | 1/32/3 | 0/4/32 | 0/15/21 | 0/19/17 | 8/28/0
Table 5. Summary test results on elapsed training time.
Algorithm | IWHNB | NB | HNB | AVFWNB | AIWNB | AODE | TAN
IWHNB | - | 0 (0) | 12 (1) | 1 (0) | 2 (0) | 2 (0) | 27 (8)
NB | 36 (33) | - | 36 (32) | 30 (6) | 36 (25) | 35 (23) | 36 (31)
HNB | 24 (3) | 0 (0) | - | 0 (0) | 5 (0) | 1 (0) | 33 (13)
AVFWNB | 35 (32) | 5 (0) | 36 (31) | - | 36 (24) | 35 (23) | 36 (32)
AIWNB | 34 (21) | 0 (0) | 29 (21) | 0 (0) | - | 17 (6) | 35 (24)
AODE | 34 (17) | 1 (0) | 34 (18) | 1 (0) | 18 (0) | - | 35 (25)
TAN | 8 (0) | 0 (0) | 2 (0) | 0 (0) | 0 (0) | 0 (0) | -
Table 6. Ranking test results on elapsed training time.
Algorithm | Losses-Wins | Losses | Wins
TAN | 133 | 133 | 0
IWHNB | 97 | 106 | 9
HNB | 87 | 103 | 16
AODE | −8 | 52 | 60
AIWNB | −23 | 49 | 72
AVFWNB | −136 | 6 | 142
NB | −150 | 0 | 150

5. Conclusions and Future Work

Hidden naive Bayes (HNB) adds a hidden parent to each attribute to encode attribute dependencies. However, it regards each instance as equally important. In this paper, we propose an improved hybrid approach, called instance weighted hidden naive Bayes (IWHNB), which combines the improved HNB model with instance weighting in one hybrid model. In our IWHNB approach, the different contributions of different instances are considered when generating the improved HNB model. Experiments were conducted to compare IWHNB with NB, HNB and other state-of-the-art competitors in terms of classification accuracy and elapsed training time. The classification accuracy results show that our IWHNB approach obtains the best results compared with its competitors, while the elapsed training time results show that IWHNB maintains the computational simplicity that characterizes HNB. IWHNB is a simple, efficient and effective approach.
How to calculate optimal instance weights to overcome the unrealistic independence assumption remains crucial. More sophisticated algorithms could be used to learn better instance weights and further improve the current version; we leave this for future research.

Author Contributions

Conceptualization, L.Y. and S.G.; methodology, L.Y. and S.G.; software, L.Y., Y.C., D.L. and S.G.; validation, L.Y. and S.G.; formal analysis, L.Y., Y.C., D.L. and S.G.; investigation, L.Y., Y.C., D.L. and S.G.; resources, L.Y. and S.G.; data curation, L.Y. and S.G.; writing—original draft preparation, L.Y. and S.G.; writing—review and editing, L.Y. and D.L.; visualization, L.Y. and S.G.; supervision, S.G.; project administration, L.Y. and S.G.; funding acquisition, L.Y. and S.G. All authors have read and agreed to the published version of the manuscript.

Funding

The work was partially supported by Science and Technology Project of Hubei Province-Unveiling System (2019AEE020), Open Research Project of The Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP-2018A05), Scientific Research Foundation for Talent introduction (20RC07), Teaching Research Project of Hubei University of Education (X2019009).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
NB: Naive Bayes
HNB: Hidden NB
AVFWNB: Attribute Value Frequency Weighted NB
AIWNB: Attribute and Instance Weighted NB
AODE: Aggregating One-Dependence Estimators
TAN: Tree-augmented NB
IWHNB: Instance Weighted HNB
BN: Bayesian Network
WAODE: Weighted Average of One-Dependence Estimators
UCI: University of California, Irvine
WEKA: Waikato Environment for Knowledge Analysis
MDL: Minimum Description Length
NP: Non-deterministic Polynomial

References

  1. Zhang, Y.; Wu, J.; Zhou, C.; Cai, Z. Instance cloned extreme learning machine. Pattern Recognit. 2017, 68, 52–65.
  2. Yu, L.; Jiang, L.; Wang, D.; Zhang, L. Attribute Value Weighted Average of One-Dependence Estimators. Entropy 2017, 19, 501.
  3. Wu, J.; Cai, Z.; Zhu, X. Self-adaptive probability estimation for Naive Bayes classification. In Proceedings of the International Joint Conference on Neural Networks, Dallas, TX, USA, 4–9 August 2013.
  4. Qiu, C.; Jiang, L.; Li, C. Not always simple classification: Learning superparent for class probability estimation. Expert Syst. Appl. 2015, 42, 5433–5440.
  5. Jiang, L.; Wang, S.; Li, C.; Zhang, L. Structure extended multinomial naive Bayes. Inf. Sci. 2016, 329, 346–356.
  6. Yu, L.; Gan, S.; Chen, Y.; He, M. Correlation-Based Weight Adjusted Naive Bayes. IEEE Access 2020, 8, 51377–51387.
  7. Jiang, L.; Li, C.; Cai, Z. Learning decision tree for ranking. Knowl. Inf. Syst. 2009, 20, 123–135.
  8. Bai, Y.; Wang, H.; Wu, J.; Zhang, Y.; Jiang, J.; Long, G. Evolutionary lazy learning for Naive Bayes classification. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 3124–3129.
  9. Bermejo, P.; Gámez, J.A.; Puerta, J.M. Speeding up incremental wrapper feature subset selection with Naive Bayes classifier. Knowl.-Based Syst. 2014, 55, 140–147.
  10. Hall, M. A decision tree-based attribute weighting filter for naive Bayes. Knowl.-Based Syst. 2007, 20, 120–126.
  11. Webb, G.I.; Boughton, J.R.; Fei, Z.; Ting, K.; Salem, H. Learning by extrapolation from marginal to full-multivariate probability distributions: Decreasingly naive Bayesian classification. Mach. Learn. 2012, 86, 233–272.
  12. Webb, G.I.; Boughton, J.R.; Wang, Z. Not so naive Bayes: Aggregating one-dependence estimators. Mach. Learn. 2005, 58, 5–24.
  13. Yang, Y.; Webb, G.; Korb, K.; Boughton, J.; Ting, K. To Select or To Weigh: A Comparative Study of Linear Combination Schemes for SuperParent-One-Dependence Estimators. IEEE Trans. Knowl. Data Eng. 2007, 9, 1652–1665.
  14. Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian network classifiers. Mach. Learn. 1997, 29, 131–163.
  15. Jiang, L.; Wang, D.; Cai, Z. Discriminatively weighted naive bayes and its application in text classification. Int. J. Artif. Intell. Tools 2012, 21, 1250007.
  16. Jiang, L.; Guo, Y. Learning Lazy Naive Bayesian Classifiers for Ranking. In Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence, Hong Kong, China, 14–16 November 2005; pp. 412–416.
  17. Jiang, L.; Cai, Z.; Wang, D. Improving Naive Bayes for Classification. Int. J. Comput. Appl. 2010, 32, 328–332.
  18. Xu, W.; Jiang, L.; Yu, L. An attribute value frequency-based instance weighting filter for naive Bayes. J. Exp. Theor. Artif. Intell. 2019, 31, 225–236.
  19. Kohavi, R. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining; AAAI Press: Portland, OR, USA, 1996; pp. 202–207.
  20. Xie, Z.; Hsu, W.; Liu, Z.; Lee, M. A Selective Neighborhood Based Naive Bayes for Lazy Learning. In Proceedings of the Sixth Pacific Asia Conference on KDD; Springer: Berlin/Heidelberg, Germany, 2002; pp. 104–114.
  21. Frank, E.; Hall, M.; Pfahringer, B. Locally Weighted Naive Bayes. In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2003; pp. 249–256.
  22. Lee, C.H.; Gutierrez, F.; Dou, D. Calculating feature weights in naive bayes with kullback-leibler measure. In Proceedings of the 11th International Conference on Data Mining, Vancouver, BC, Canada, 11–14 December 2011; pp. 1146–1151.
  23. Jiang, L.; Zhang, L.; Li, C.; Wu, J. A Correlation-Based Feature Weighting Filter for Naive Bayes. IEEE Trans. Knowl. Data Eng. 2019, 31, 201–213.
  24. Zhang, H.; Jiang, L.; Yu, L. Class-specific Attribute Value Weighting for Naive Bayes. Inf. Sci. 2020, 508, 260–274.
  25. Zaidi, N.A.; Cerquides, J.; Carman, M.J.; Webb, G.I. Alleviating naive Bayes attribute independence assumption by attribute weighting. J. Mach. Learn. Res. 2013, 14, 1947–1988.
  26. Jiang, L.; Li, C.; Wang, S.; Zhang, L. Deep feature weighting for naive Bayes and its application to text classification. Eng. Appl. Artif. Intell. 2016, 52, 26–39.
  27. Langley, P.; Sage, S. Induction of selective Bayesian classifiers. In Proceedings of the Tenth International Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann: San Francisco, CA, USA, 1994; pp. 339–406.
  28. Jiang, L.; Cai, Z.; Zhang, H.; Wang, D. Not so greedy: Randomly Selected Naive Bayes. Expert Syst. Appl. 2012, 39, 11022–11028.
  29. Jiang, L.; Zhang, H.; Cai, Z.; Su, J. Evolutional naive bayes. In Proceedings of the 2005 International Symposium on Intelligent Computation and Its Application, Wuhan, China, 22–24 October 2005; pp. 344–350.
  30. Chen, S.; Webb, G.; Liu, L.; Ma, X. A novel selective naïve Bayes algorithm. Knowl.-Based Syst. 2020, 192, 105361.
  31. Xiang, Z.; Kang, D. Attribute weighting for averaged one-dependence estimators. Appl. Intell. 2017, 46, 616–629.
  32. Yu, L.; Jiang, L.; Wang, D.; Zhang, L. Toward naive Bayes with attribute value weighting. Neural Comput. Appl. 2019, 31, 5699–5713.
  33. Jiang, L.; Zhang, H.; Cai, Z. A novel Bayes model: Hidden naive Bayes. IEEE Trans. Knowl. Data Eng. 2009, 21, 1361–1371.
  34. Jiang, L.; Zhang, H.; Cai, Z.; Wang, D. Weighted average of one-dependence estimators. J. Exp. Theor. Artif. Intell. 2012, 24, 219–230.
  35. Zhang, H.; Jiang, L.; Yu, L. Attribute and instance weighted naive Bayes. Pattern Recognit. 2021, 111, 107674.
  36. Langley, P.; Iba, W.; Thompson, K. An Analysis of Bayesian Classifiers. In Proceedings of the 10th National Conference on Artificial Intelligence, San Jose, CA, USA, 12–16 July 1992; pp. 223–228.
  37. Frank, A.; Asuncion, A. UCI Machine Learning Repository; University of California: Irvine, CA, USA, 2010.
  38. Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2011.
  39. Fayyad, U.M.; Irani, K.B. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambéry, France, 28 August–3 September 1993; pp. 1022–1027.
  40. Nadeau, C.; Bengio, Y. Inference for the Generalization Error. Mach. Learn. 2003, 52, 239–281.
  41. Jiang, L.; Zhang, L.; Yu, L.; Wang, D. Class-specific attribute weighted naive Bayes. Pattern Recognit. 2019, 88, 321–330.
Figure 1. The different structures of related models.
Table 1. Comparisons of the classification accuracy for IWHNB versus NB, HNB, AVFWNB, AIWNB, AODE and TAN.
Dataset | IWHNB | NB | HNB | AVFWNB | AIWNB | AODE | TAN
anneal | 98.31±1.29 | 96.13±2.16 | 98.33±1.22 | 98.62±1.15 | 98.94±1.05 | 98.01±1.39 | 98.61±1.02
anneal.ORIG | 94.65±2.24 | 92.66±2.72 | 95.29±2.04 | 93.32±2.65 | 95.06±2.23 | 93.35±2.53 | 94.55±2.10
audiology | 78.17±7.15 | 71.40±6.37 | 69.04±5.83 | 78.58±8.44 | 83.93±7.00 | 71.66±6.42 | 65.35±6.84
autos | 85.56±7.93 | 72.30±10.31 | 82.17±8.60 | 77.27±9.43 | 78.04±9.02 | 80.74±8.68 | 80.85±8.99
balance-scale | 69.05±3.74 | 71.08±4.29 | 69.05±3.75 | 71.10±4.30 | 73.75±4.22 | 69.34±3.82 | 70.75±3.99
breast-cancer | 70.47±6.29 | 72.94±7.71 | 73.09±6.11 | 71.41±7.98 | 71.90±7.55 | 72.53±7.15 | 69.53±7.13
breast-w | 96.30±1.94 | 97.25±1.79 | 96.32±2.01 | 97.48±1.68 | 97.17±1.68 | 96.97±1.87 | 96.27±2.08
colic | 81.20±6.00 | 81.39±5.74 | 82.09±5.86 | 81.47±5.86 | 83.45±5.45 | 82.64±5.83 | 81.00±5.86
colic.ORIG | 74.23±6.52 | 73.62±6.83 | 74.06±5.79 | 72.91±6.34 | 73.87±6.40 | 74.62±6.51 | 68.31±6.04
credit-a | 85.23±3.82 | 86.25±4.01 | 85.91±3.70 | 86.23±3.85 | 87.03±3.83 | 86.71±3.82 | 85.39±3.81
credit-g | 75.85±3.69 | 75.43±3.84 | 76.12±3.72 | 75.38±3.90 | 75.81±3.60 | 76.50±3.89 | 73.54±4.16
diabetes | 76.75±4.20 | 77.85±4.67 | 76.81±4.11 | 77.89±4.66 | 77.87±4.86 | 78.07±4.56 | 78.70±4.29
glass | 77.70±8.98 | 74.39±7.95 | 77.80±8.40 | 76.25±8.07 | 74.02±8.41 | 76.08±8.07 | 76.23±8.87
heart-c | 81.52±7.12 | 83.60±6.42 | 82.31±6.81 | 83.04±6.68 | 82.71±6.61 | 83.20±6.20 | 81.62±7.50
heart-h | 84.56±6.05 | 84.46±5.92 | 84.87±6.03 | 84.90±5.68 | 84.29±5.85 | 84.43±5.92 | 84.05±6.66
heart-statlog | 82.33±6.59 | 83.74±6.25 | 82.33±6.55 | 83.78±6.29 | 83.22±6.61 | 83.33±6.61 | 82.44±6.48
hepatitis | 87.38±8.43 | 84.22±9.41 | 88.26±7.28 | 85.38±9.00 | 85.75±8.97 | 84.98±9.26 | 86.01±8.25
hypothyroid | 99.32±0.40 | 98.48±0.59 | 98.95±0.48 | 98.98±0.48 | 99.07±0.48 | 98.76±0.54 | 99.15±0.44
ionosphere | 93.96±3.65 | 90.77±4.76 | 91.82±4.33 | 91.94±4.09 | 92.40±4.13 | 92.79±4.26 | 92.25±4.33
iris | 93.27±5.72 | 94.47±5.61 | 93.80±5.86 | 94.40±5.50 | 94.40±5.50 | 93.20±5.76 | 94.20±5.74
kr-vs-kp | 92.70±1.37 | 87.79±1.91 | 92.36±1.30 | 88.18±1.86 | 93.73±1.28 | 91.01±1.67 | 92.88±1.49
labor | 95.90±9.21 | 93.13±10.56 | 94.87±9.82 | 94.33±10.13 | 94.33±9.30 | 94.70±9.15 | 92.47±10.89
letter | 90.17±0.62 | 74.00±0.88 | 88.20±0.66 | 75.07±0.84 | 75.56±0.89 | 88.76±0.70 | 85.49±0.76
lymphography | 85.89±8.02 | 84.97±8.30 | 85.84±8.86 | 85.49±7.83 | 84.68±7.99 | 86.98±8.32 | 85.30±8.79
mushroom | 99.96±0.06 | 95.52±0.78 | 99.94±0.10 | 99.12±0.31 | 99.53±0.23 | 99.95±0.07 | 99.99±0.04
primary-tumor | 46.14±6.17 | 47.20±6.02 | 47.66±6.21 | 45.85±6.53 | 47.76±5.25 | 47.67±6.30 | 44.77±6.84
segment | 96.87±1.07 | 91.71±1.68 | 95.88±1.19 | 93.69±1.41 | 94.16±1.38 | 95.77±1.23 | 95.58±1.32
sick | 97.52±0.76 | 97.10±0.84 | 97.56±0.74 | 97.02±0.86 | 97.33±0.85 | 97.39±0.79 | 97.40±0.76
sonar | 84.63±7.72 | 85.16±7.52 | 84.63±7.34 | 84.49±7.79 | 82.23±8.65 | 86.60±6.91 | 84.45±8.31
soybean | 94.61±2.18 | 92.20±3.23 | 93.88±2.47 | 94.52±2.36 | 94.74±2.19 | 93.28±2.84 | 94.98±2.38
splice | 96.24±1.00 | 95.42±1.14 | 95.84±1.10 | 95.61±1.11 | 96.21±0.99 | 96.12±1.00 | 94.95±1.18
vehicle | 73.70±3.41 | 62.52±3.81 | 72.37±3.35 | 63.36±3.87 | 63.59±3.92 | 72.31±3.62 | 73.39±3.26
vote | 94.39±3.21 | 90.21±3.95 | 94.43±3.18 | 90.25±3.95 | 92.18±3.76 | 94.52±3.19 | 94.43±3.34
vowel | 90.32±2.71 | 65.23±4.53 | 85.12±3.65 | 67.46±4.62 | 69.98±4.11 | 80.87±3.82 | 86.09±3.91
waveform-5000 | 86.24±1.45 | 80.72±1.50 | 86.21±1.44 | 80.65±1.46 | 82.98±1.37 | 86.03±1.56 | 82.22±1.71
zoo | 98.33±3.72 | 93.98±7.14 | 97.73±4.64 | 96.05±5.60 | 96.05±5.60 | 94.66±6.38 | 95.15±6.68
Average | 86.37 | 83.31 | 85.86 | 84.21 | 84.94 | 85.68 | 84.95
W/T/L | - | 17/18/1 | 9/27/0 | 13/21/2 | 8/25/3 | 6/30/0 | 9/25/2
Table 2. Summary test results on classification accuracy.
Algorithm | IWHNB | NB | HNB | AVFWNB | AIWNB | AODE | TAN
IWHNB | - | 11 (1) | 17 (0) | 12 (2) | 15 (3) | 14 (0) | 11 (2)
NB | 25 (17) | - | 27 (16) | 26 (11) | 27 (15) | 29 (13) | 20 (13)
HNB | 18 (9) | 9 (1) | - | 13 (3) | 16 (4) | 18 (1) | 12 (2)
AVFWNB | 24 (13) | 10 (0) | 23 (11) | - | 25 (10) | 24 (10) | 16 (8)
AIWNB | 21 (8) | 9 (0) | 20 (7) | 8 (0) | - | 21 (8) | 15 (6)
AODE | 22 (6) | 7 (1) | 18 (3) | 12 (3) | 15 (7) | - | 16 (4)
TAN | 25 (9) | 16 (2) | 24 (5) | 20 (2) | 21 (5) | 20 (6) | -
Table 3. Ranking test results on classification accuracy.
Algorithm | Wins-Losses | Wins | Losses
IWHNB | 54 | 62 | 8
HNB | 22 | 42 | 20
AIWNB | 15 | 44 | 29
AODE | 14 | 38 | 24
TAN | 6 | 35 | 29
AVFWNB | −31 | 21 | 52
NB | −80 | 5 | 85

