Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks
Abstract
1. Introduction
1.1. Context of the Study
- RUS: Random undersampling (RUS) is the simplest undersampling method: majority-class samples are removed at random until the data are suitably balanced [23].
- Tomek Links: Some undersampling techniques focus on eliminating class overlap. One example is the Tomek Links method [24], a modification of the Condensed Nearest Neighbor rule.
- One-Sided Selection: One-Sided Selection (OSS) [25] extends the Tomek Links algorithm by combining it with the Condensed Nearest Neighbor algorithm.
- Near Miss: Another popular undersampling method, which removes majority-class samples. When two samples from different classes are very close to each other, it removes the one belonging to the larger class [5].
- ROS: Random oversampling (ROS) is the simplest oversampling algorithm: it randomly selects and duplicates minority-class samples, yielding more balanced data [26].
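The two random resampling schemes in the list above (RUS and ROS) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names `random_undersample` and `random_oversample`, the `seed` parameter, and the choice to balance every class to the smallest (or largest) class size are assumptions made for illustration, not the implementation used in this study.

```python
import random
from collections import defaultdict


def _indices_by_class(y):
    """Group sample indices by their class label."""
    groups = defaultdict(list)
    for i, label in enumerate(y):
        groups[label].append(i)
    return groups


def random_undersample(X, y, seed=0):
    """RUS sketch: randomly keep only as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_keep = min(len(idx) for idx in groups.values())
    keep = []
    for idx in groups.values():
        keep.extend(rng.sample(idx, n_keep))  # sampling without replacement
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """ROS sketch: randomly duplicate samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_target = max(len(idx) for idx in groups.values())
    chosen = []
    for idx in groups.values():
        extra = rng.choices(idx, k=n_target - len(idx))  # sampling with replacement
        chosen.extend(idx + extra)
    return [X[i] for i in chosen], [y[i] for i in chosen]


if __name__ == "__main__":
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2  # 8 majority samples, 2 minority samples
    X_rus, y_rus = random_undersample(X, y)
    X_ros, y_ros = random_oversample(X, y)
    print(len(y_rus), len(y_ros))
```

RUS discards majority-class information, whereas ROS only repeats existing minority samples; neither creates new points, which is the gap that synthetic methods such as SMOTE are meant to fill.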
1.2. Research Gap
1.3. Motivation and Contribution
2. Related Work
3. Methodology
3.1. Dataset Preprocessing
3.1.1. Oversampling Techniques
Random Over-Sampling (ROS)
Synthetic Minority Oversampling Technique
3.1.2. Undersampling Techniques
Random Under-Sampling
Tomek Links
One-Sided Selection (OSS)
NearMiss
3.1.3. Normalization
3.1.4. Split Dataset
3.2. Models
3.2.1. Proposed Deep Neural Network
3.2.2. Proposed Convolutional Neural Network
4. Experimental Results
4.1. Simulation Setup
4.2. Dataset Description
4.3. Evaluation Metrics
- True Positive (TP): The number of samples that belong to the positive class and are correctly predicted as positive by the classifier.
- True Negative (TN): The number of samples that belong to the negative class and are correctly predicted as negative by the classifier.
- False Positive (FP): The number of samples that belong to the negative class but are predicted as positive by the classifier.
- False Negative (FN): The number of samples that belong to the positive class but are predicted as negative by the classifier.
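As a brief illustration of how the four counts above feed the threshold-based metrics, the following sketch computes accuracy, specificity, recall, G-mean, precision, and F1-score using their standard textbook definitions. The function name `binary_metrics`, the returned dictionary keys, and the convention of returning 0.0 when a denominator is zero are assumptions of this sketch.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # G-mean is the geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "recall": recall,
        "g_mean": g_mean,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in binary_metrics(tp=40, tn=45, fp=5, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Because G-mean multiplies recall by specificity, it drops to zero whenever one class is entirely misclassified, which is why it is often preferred over plain accuracy on imbalanced data.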
4.3.1. Accuracy
4.3.2. Specificity
4.3.3. Recall
4.3.4. G-Mean
4.3.5. Precision
4.3.6. F1-Score
4.3.7. Kappa
- Kappa ≥ 0.75: Strong consistency; the accuracy is highly reliable.
- 0.4 ≤ Kappa < 0.75: The accuracy is moderately reliable.
- Kappa < 0.4: The accuracy is unreliable.
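A small sketch may help connect the Kappa bands above to the confusion-matrix counts. It uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function names `cohen_kappa` and `interpret_kappa` are hypothetical, the top band is assumed to start at 0.75 (the upper bound of the middle band), and returning 0.0 when p_e equals 1 is a convention of this sketch only.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + tn + fp + fn
    if n == 0:
        return 0.0
    p_observed = (tp + tn) / n
    # Chance agreement: product of the marginal proportions, summed over both classes.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if p_expected == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is this sketch's convention
    return (p_observed - p_expected) / (1.0 - p_expected)


def interpret_kappa(kappa):
    """Map a kappa value to the three reliability bands (0.75 cut-off assumed)."""
    if kappa >= 0.75:
        return "highly reliable"
    if kappa >= 0.4:
        return "moderately reliable"
    return "unreliable"


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formula.
    k = cohen_kappa(tp=40, tn=45, fp=5, fn=10)
    print(round(k, 4), interpret_kappa(k))
```

Unlike accuracy, kappa discounts the agreement a classifier would reach by always favouring the majority class, so a high accuracy with a low kappa is a warning sign on imbalanced data.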
4.3.8. AUC-ROC
4.4. Classification Results
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 99.11 | 99.38 | 99.13 | 99.40 | 99.11 | 99.38 | 99.11 | 99.38 | 99.11 | 99.38 | 98.83 | 99.17 | 99.00 | 99.25 | 98.21 | 98.77 |
ecoli2 | 99.60 | 99.76 | 99.60 | 99.77 | 99.60 | 99.76 | 99.60 | 99.76 | 99.60 | 99.76 | 99.44 | 99.54 | 99.03 | 99.43 | 99.19 | 99.53 |
ecoli3 | 99.53 | 99.64 | 99.54 | 99.65 | 99.53 | 99.64 | 99.53 | 99.64 | 99.53 | 99.64 | 99.21 | 99.39 | 99.20 | 99.31 | 99.06 | 99.29 |
ecoli-0_vs_1 | 99.91 | 99.83 | 99.92 | 99.84 | 99.91 | 99.83 | 99.91 | 99.83 | 99.91 | 99.83 | 99.93 | 99.79 | 99.27 | 99.10 | 99.83 | 99.66 |
glass0 | 99.26 | 99.19 | 99.27 | 99.26 | 99.26 | 99.19 | 99.26 | 99.19 | 99.26 | 99.19 | 99.27 | 99.00 | 99.31 | 99.53 | 99.27 | 98.38 |
glass1 | 99.31 | 99.23 | 99.32 | 99.25 | 99.31 | 99.23 | 99.31 | 99.23 | 99.31 | 99.23 | 99.32 | 99.64 | 99.25 | 99.34 | 99.30 | 98.46 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 95.60 | 95.44 | 95.61 | 96.30 | 95.60 | 95.77 | 95.60 | 96.03 | 95.60 | 96.10 | 95.61 | 96.45 | 97.03 | 98.05 | 95.00 | 96.01 |
iris0 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.99 | 99.99 | 100.00 | 100.00 |
new-thyroid1 | 99.95 | 99.96 | 99.96 | 99.96 | 99.95 | 99.96 | 99.60 | 99.96 | 99.96 | 99.96 | 99.95 | 99.92 | 99.90 | 99.95 | 99.96 | 99.92 |
new-thyroid2 | 99.95 | 99.97 | 99.96 | 99.97 | 99.95 | 99.97 | 99.95 | 99.97 | 99.95 | 99.97 | 99.96 | 99.94 | 99.38 | 99.61 | 99.94 | 99.94 |
page-blocks0 | 99.42 | 99.39 | 99.42 | 99.39 | 99.42 | 99.39 | 99.42 | 99.39 | 99.42 | 99.39 | 99.38 | 99.25 | 99.27 | 99.28 | 99.23 | 98.97 |
pima | 99.35 | 99.15 | 99.36 | 99.15 | 99.35 | 99.14 | 99.35 | 99.15 | 99.35 | 99.21 | 99.30 | 99.37 | 99.81 | 99.25 | 99.46 | 99.03 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.67 | 99.31 | 99.99 | 99.99 |
vehicle0 | 99.95 | 99.97 | 99.95 | 99.97 | 99.95 | 99.97 | 99.95 | 99.97 | 99.95 | 99.97 | 99.89 | 99.95 | 99.83 | 99.60 | 99.88 | 99.95 |
vehicle1 | 99.81 | 99.76 | 99.81 | 99.76 | 99.81 | 99.76 | 99.81 | 99.76 | 99.81 | 99.76 | 99.87 | 99.67 | 99.90 | 99.15 | 99.90 | 99.52 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.84 | 99.57 | 99.97 | 99.98 |
vehicle3 | 99.85 | 99.74 | 99.86 | 99.75 | 99.85 | 99.74 | 99.85 | 99.74 | 99.85 | 99.74 | 99.82 | 99.62 | 99.80 | 99.20 | 99.91 | 99.48 |
wisconsin | 99.80 | 99.84 | 99.81 | 99.85 | 99.80 | 99.84 | 99.80 | 99.84 | 99.80 | 99.84 | 99.80 | 99.75 | 99.46 | 99.51 | 99.64 | 99.60 |
yeast1 | 98.98 | 98.71 | 98.98 | 98.72 | 98.98 | 98.71 | 98.98 | 98.71 | 98.98 | 98.71 | 98.98 | 99.81 | 98.80 | 98.62 | 98.67 | 98.35 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.90 | 99.94 |
yeast-2_vs_4 | 99.95 | 99.91 | 99.96 | 99.92 | 99.95 | 99.91 | 99.95 | 99.91 | 99.95 | 99.91 | 99.94 | 99.92 | 99.80 | 99.45 | 99.73 | 99.67 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.88 | 99.81 | 99.98 | 99.96 | 99.99 | 99.99 |
nursery | 88.71 | 87.42 | 88.72 | 87.43 | 88.71 | 87.42 | 88.71 | 87.42 | 88.71 | 87.42 | 88.62 | 87.30 | 90.00 | 89.92 | 88.40 | 87.30 |
breast cancer | 99.67 | 98.84 | 99.68 | 98.85 | 99.67 | 98.84 | 99.67 | 98.84 | 99.67 | 98.84 | 99.46 | 98.67 | 99.42 | 99.14 | 99.48 | 98.62 |
Z-Alizadeh Sani | 98.57 | 97.91 | 98.58 | 97.92 | 98.57 | 97.91 | 98.57 | 97.91 | 98.57 | 97.85 | 98.42 | 97.84 | 99.14 | 99.02 | 98.21 | 98.04 |
Average | 99.08 | 98.96 | 99.09 | 99.00 | 99.08 | 98.97 | 99.09 | 98.98 | 99.08 | 98.98 | 99.03 | 98.99 | 99.08 | 99.02 | 98.92 | 98.78 |
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 99.05 | 98.98 | 99.06 | 98.99 | 99.05 | 98.98 | 99.05 | 98.98 | 99.05 | 98.99 | 99.00 | 98.98 | 99.02 | 98.80 | 99.10 | 98.96 |
ecoli2 | 99.51 | 99.76 | 99.52 | 99.77 | 99.51 | 99.76 | 99.51 | 99.76 | 99.51 | 99.76 | 99.45 | 99.54 | 99.03 | 99.70 | 99.19 | 99.53 |
ecoli3 | 99.23 | 99.74 | 99.24 | 99.75 | 99.23 | 99.74 | 99.23 | 99.74 | 99.23 | 99.74 | 99.20 | 99.41 | 99.05 | 99.52 | 99.00 | 99.30 |
ecoli-0_vs_1 | 99.81 | 99.75 | 99.82 | 99.76 | 99.81 | 99.75 | 99.81 | 99.75 | 99.81 | 99.75 | 99.82 | 99.63 | 99.60 | 99.17 | 99.80 | 99.61 |
glass0 | 99.05 | 98.98 | 99.06 | 98.99 | 99.05 | 98.98 | 99.05 | 98.98 | 99.05 | 98.99 | 99.02 | 99.00 | 99.00 | 98.90 | 99.02 | 98.38 |
glass1 | 98.83 | 98.32 | 98.84 | 98.33 | 98.83 | 98.32 | 98.83 | 98.32 | 98.83 | 98.32 | 98.83 | 98.42 | 99.15 | 99.32 | 98.80 | 98.32 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 94.20 | 95.24 | 94.21 | 95.25 | 94.20 | 95.25 | 94.20 | 95.25 | 94.20 | 95.24 | 94.25 | 95.25 | 97.23 | 97.23 | 95.00 | 96.35 |
iris0 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
new-thyroid1 | 99.75 | 99.78 | 99.76 | 99.79 | 99.75 | 99.78 | 99.75 | 99.78 | 99.75 | 99.78 | 99.76 | 99.74 | 99.80 | 99.60 | 99.54 | 99.61 |
new-thyroid2 | 99.89 | 99.85 | 99.90 | 99.86 | 99.89 | 99.85 | 99.89 | 99.85 | 99.89 | 99.85 | 99.88 | 99.82 | 99.51 | 99.74 | 99.80 | 99.74 |
page-blocks0 | 98.62 | 99.39 | 98.63 | 99.39 | 98.62 | 99.39 | 98.62 | 99.39 | 98.62 | 99.39 | 98.60 | 99.25 | 99.37 | 99.28 | 99.21 | 98.97 |
pima | 99.27 | 99.10 | 99.28 | 99.11 | 99.27 | 99.10 | 99.27 | 99.10 | 99.27 | 99.10 | 99.24 | 99.10 | 99.12 | 99.10 | 99.05 | 99.04 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.67 | 99.31 | 99.99 | 99.99 |
vehicle0 | 99.90 | 99.97 | 99.91 | 99.97 | 99.90 | 99.97 | 99.90 | 99.97 | 99.90 | 99.97 | 99.90 | 99.95 | 99.85 | 99.60 | 99.87 | 99.95 |
vehicle1 | 99.70 | 99.75 | 99.71 | 99.76 | 99.70 | 99.75 | 99.70 | 99.75 | 99.70 | 99.75 | 99.77 | 99.65 | 99.78 | 99.66 | 99.68 | 99.50 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.83 | 99.63 | 99.97 | 99.98 |
vehicle3 | 99.80 | 99.76 | 99.81 | 99.75 | 99.80 | 99.74 | 99.81 | 99.74 | 99.80 | 99.74 | 99.80 | 99.62 | 99.81 | 99.20 | 99.91 | 99.56 |
wisconsin | 99.50 | 99.84 | 99.51 | 99.85 | 99.50 | 99.84 | 99.50 | 99.84 | 99.50 | 99.84 | 99.50 | 99.75 | 99.51 | 99.51 | 99.40 | 99.60 |
yeast1 | 98.98 | 98.71 | 98.98 | 98.72 | 98.98 | 98.71 | 98.98 | 98.71 | 98.98 | 98.71 | 98.98 | 99.81 | 98.80 | 98.62 | 98.67 | 98.35 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.92 | 99.93 |
yeast-2_vs_4 | 99.85 | 99.89 | 99.86 | 99.90 | 99.85 | 99.89 | 99.85 | 99.89 | 99.85 | 99.89 | 99.84 | 99.88 | 99.70 | 99.62 | 99.64 | 99.60 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.81 | 99.77 | 99.90 | 99.85 | 99.99 | 99.99 |
nursery | 89.84 | 88.21 | 89.85 | 88.22 | 89.84 | 88.21 | 89.84 | 88.21 | 89.84 | 88.21 | 89.70 | 88.11 | 91.05 | 90.95 | 89.20 | 88.00 |
breast cancer | 99.05 | 98.70 | 99.06 | 98.71 | 99.05 | 98.70 | 99.05 | 98.70 | 99.05 | 98.70 | 98.90 | 98.55 | 99.26 | 99.07 | 98.89 | 98.57 |
Z-Alizadeh Sani | 98.34 | 97.79 | 98.35 | 97.80 | 98.34 | 97.79 | 98.34 | 97.79 | 98.34 | 97.79 | 98.29 | 97.62 | 98.80 | 98.34 | 98.11 | 97.88 |
Average | 98.92 | 98.95 | 98.93 | 98.90 | 98.92 | 98.90 | 98.92 | 98.90 | 98.92 | 98.90 | 98.90 | 98.87 | 99.07 | 98.98 | 98.87 | 98.79 |
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 98.70 | 98.55 | 98.71 | 98.56 | 98.70 | 98.55 | 98.70 | 98.55 | 98.70 | 98.55 | 98.65 | 98.50 | 99.12 | 99.24 | 98.75 | 98.45 |
ecoli2 | 98.41 | 98.62 | 98.42 | 98.63 | 98.41 | 98.62 | 98.41 | 98.62 | 98.41 | 98.62 | 98.36 | 98.60 | 99.13 | 99.50 | 98.25 | 98.54 |
ecoli3 | 99.20 | 99.63 | 99.21 | 99.64 | 99.20 | 99.63 | 99.20 | 99.63 | 99.20 | 99.63 | 99.15 | 99.52 | 99.26 | 99.60 | 99.03 | 99.31 |
ecoli-0_vs_1 | 99.59 | 99.80 | 99.60 | 99.81 | 99.59 | 99.80 | 99.59 | 99.80 | 99.59 | 99.80 | 99.48 | 99.74 | 99.40 | 99.41 | 99.58 | 99.21 |
glass0 | 99.15 | 99.10 | 99.16 | 99.11 | 99.15 | 99.10 | 99.15 | 99.10 | 99.15 | 99.10 | 99.10 | 99.10 | 99.01 | 99.18 | 98.99 | 99.00 |
glass1 | 98.78 | 98.40 | 98.79 | 98.41 | 98.78 | 98.40 | 98.78 | 98.40 | 98.78 | 98.40 | 98.62 | 98.38 | 99.05 | 98.99 | 98.98 | 98.84 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 94.50 | 95.14 | 94.51 | 95.15 | 94.50 | 95.15 | 94.50 | 95.15 | 94.50 | 95.14 | 94.46 | 95.30 | 98.00 | 97.20 | 94.89 | 96.30 |
iris0 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
new-thyroid1 | 99.70 | 99.63 | 99.71 | 99.64 | 99.70 | 99.63 | 99.70 | 99.63 | 99.70 | 99.63 | 99.62 | 99.60 | 99.77 | 99.05 | 99.80 | 99.61 |
new-thyroid2 | 99.80 | 99.80 | 99.81 | 99.81 | 99.80 | 99.80 | 99.80 | 99.80 | 99.80 | 99.80 | 99.74 | 99.74 | 99.20 | 99.55 | 99.22 | 99.74 |
page-blocks0 | 98.43 | 99.21 | 98.44 | 99.22 | 98.43 | 99.21 | 98.43 | 99.21 | 98.43 | 99.21 | 98.21 | 99.15 | 98.50 | 99.40 | 99.20 | 98.84 |
pima | 99.50 | 99.11 | 99.51 | 99.12 | 99.50 | 99.11 | 99.50 | 99.11 | 99.50 | 99.11 | 99.48 | 99.09 | 99.32 | 99.05 | 99.14 | 99.09 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.38 | 99.20 | 99.99 | 99.99 |
vehicle0 | 99.70 | 99.80 | 99.71 | 99.81 | 99.70 | 99.80 | 99.70 | 99.80 | 99.70 | 99.80 | 99.70 | 99.74 | 99.63 | 99.62 | 99.87 | 99.70 |
vehicle1 | 99.30 | 99.48 | 99.31 | 99.49 | 99.30 | 99.48 | 99.30 | 99.48 | 99.30 | 99.48 | 99.26 | 99.40 | 99.12 | 99.20 | 99.40 | 99.20 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.60 | 99.40 | 99.95 | 99.95 |
vehicle3 | 99.60 | 99.70 | 99.61 | 99.71 | 99.60 | 99.70 | 99.60 | 99.70 | 99.60 | 99.70 | 99.58 | 99.56 | 99.74 | 99.22 | 99.54 | 99.47 |
wisconsin | 99.50 | 99.75 | 99.51 | 99.76 | 99.50 | 99.75 | 99.50 | 99.75 | 99.50 | 99.75 | 99.50 | 99.70 | 99.51 | 99.60 | 99.40 | 99.51 |
yeast1 | 98.85 | 98.70 | 98.86 | 98.71 | 98.85 | 98.71 | 98.85 | 98.71 | 98.85 | 98.71 | 98.81 | 99.69 | 98.40 | 98.31 | 98.52 | 98.22 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.81 | 99.70 |
yeast-2_vs_4 | 99.85 | 99.89 | 99.86 | 99.90 | 99.85 | 99.89 | 99.85 | 99.89 | 99.85 | 99.89 | 99.84 | 99.88 | 99.70 | 99.62 | 99.64 | 99.60 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.76 | 99.69 | 99.84 | 99.73 | 99.99 | 99.99 |
nursery | 89.27 | 88.88 | 89.28 | 88.89 | 89.27 | 88.88 | 89.27 | 88.88 | 89.27 | 88.88 | 89.15 | 88.65 | 90.31 | 91.23 | 89.06 | 88.70 |
breast cancer | 99.15 | 98.81 | 99.16 | 98.82 | 99.15 | 98.81 | 99.15 | 98.81 | 99.15 | 98.81 | 99.01 | 98.69 | 99.19 | 98.99 | 99.03 | 98.73 |
Z-Alizadeh Sani | 97.56 | 97.43 | 97.57 | 97.44 | 97.56 | 97.43 | 97.56 | 97.43 | 97.56 | 97.43 | 97.27 | 97.21 | 98.24 | 98.20 | 97.20 | 97.10 |
Average | 98.78 | 98.82 | 98.79 | 98.83 | 98.78 | 98.82 | 98.78 | 98.82 | 98.78 | 98.82 | 98.72 | 98.80 | 98.93 | 98.94 | 98.73 | 98.72 |
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 99.16 | 99.20 | 99.17 | 99.21 | 99.16 | 99.20 | 99.16 | 99.20 | 99.16 | 99.20 | 99.10 | 99.15 | 99.41 | 99.31 | 99.11 | 99.11 |
ecoli2 | 99.60 | 99.88 | 99.61 | 99.89 | 99.60 | 99.88 | 99.60 | 99.88 | 99.60 | 99.88 | 99.52 | 99.80 | 99.23 | 99.55 | 99.33 | 99.88 |
ecoli3 | 99.33 | 99.74 | 99.34 | 99.75 | 99.33 | 99.74 | 99.33 | 99.74 | 99.33 | 99.74 | 99.29 | 99.41 | 99.15 | 99.52 | 99.02 | 99.30 |
ecoli-0_vs_1 | 99.60 | 99.78 | 99.61 | 99.79 | 99.60 | 99.78 | 99.60 | 99.78 | 99.60 | 99.78 | 99.54 | 99.74 | 99.35 | 99.37 | 99.57 | 99.54 |
glass0 | 99.20 | 99.25 | 99.21 | 99.26 | 99.20 | 99.25 | 99.20 | 99.25 | 99.20 | 99.25 | 99.13 | 99.18 | 99.11 | 99.18 | 99.18 | 99.06 |
glass1 | 99.15 | 99.20 | 99.16 | 99.21 | 99.15 | 99.20 | 99.15 | 99.20 | 99.15 | 99.20 | 99.10 | 99.17 | 99.18 | 99.25 | 99.06 | 99.14 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 94.80 | 95.36 | 94.81 | 95.37 | 94.80 | 95.37 | 94.80 | 95.37 | 94.80 | 95.37 | 94.62 | 95.32 | 98.00 | 97.25 | 95.00 | 95.35 |
iris0 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
new-thyroid1 | 99.63 | 99.65 | 99.64 | 99.66 | 99.63 | 99.65 | 99.63 | 99.65 | 99.63 | 99.64 | 99.59 | 99.61 | 99.74 | 99.80 | 99.50 | 99.55 |
new-thyroid2 | 99.90 | 99.87 | 99.91 | 99.88 | 99.90 | 99.87 | 99.90 | 99.87 | 99.90 | 99.87 | 99.84 | 99.87 | 99.50 | 99.70 | 99.87 | 99.78 |
page-blocks0 | 99.20 | 99.43 | 99.21 | 99.44 | 99.20 | 99.43 | 99.20 | 99.43 | 99.20 | 99.43 | 99.15 | 99.39 | 99.35 | 99.24 | 99.11 | 99.24 |
pima | 99.30 | 99.15 | 99.31 | 99.16 | 99.30 | 99.15 | 99.30 | 99.15 | 99.30 | 99.15 | 99.25 | 99.11 | 99.25 | 99.11 | 99.15 | 99.08 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.70 | 99.51 | 99.99 | 99.99 |
vehicle0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.78 | 99.62 | 99.99 | 99.99 |
vehicle1 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.84 | 99.80 | 99.99 | 99.99 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.77 | 99.64 | 99.99 | 99.99 |
vehicle3 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.40 | 99.50 | 99.99 | 99.99 |
wisconsin | 99.42 | 99.50 | 99.43 | 99.51 | 99.42 | 99.50 | 99.42 | 99.50 | 99.42 | 99.50 | 99.35 | 99.42 | 99.20 | 99.55 | 99.32 | 99.41 |
yeast1 | 99.30 | 99.25 | 99.31 | 99.26 | 99.30 | 99.25 | 99.30 | 99.25 | 99.30 | 99.25 | 99.24 | 99.15 | 99.58 | 98.37 | 99.24 | 98.35 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.81 | 99.70 |
yeast-2_vs_4 | 99.90 | 99.91 | 99.91 | 99.92 | 99.90 | 99.91 | 99.90 | 99.91 | 99.90 | 99.91 | 99.87 | 99.92 | 99.74 | 99.85 | 99.62 | 99.67 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.81 | 99.76 | 99.92 | 99.88 | 99.99 | 99.99 |
nursery | 88.37 | 88.18 | 88.38 | 88.19 | 88.37 | 88.18 | 88.37 | 88.18 | 88.37 | 88.18 | 88.21 | 88.02 | 90.37 | 89.86 | 88.02 | 88.00 |
breast cancer | 99.23 | 98.92 | 99.24 | 98.93 | 99.23 | 98.92 | 99.23 | 98.92 | 99.23 | 98.92 | 99.11 | 98.75 | 99.05 | 99.02 | 99.09 | 99.02 |
Z-Alizadeh Sani | 98.50 | 98.24 | 98.51 | 98.25 | 98.50 | 98.24 | 98.50 | 98.24 | 98.50 | 98.10 | 98.39 | 98.05 | 98.27 | 98.10 | 98.21 | 98.13 |
Average | 98.98 | 99.01 | 98.98 | 99.02 | 98.98 | 99.01 | 98.98 | 99.01 | 98.98 | 99.01 | 98.92 | 98.95 | 99.07 | 98.99 | 98.89 | 98.89 |
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 98.52 | 98.49 | 98.53 | 98.50 | 98.52 | 98.49 | 98.52 | 98.49 | 98.52 | 98.49 | 98.34 | 98.28 | 99.08 | 98.57 | 98.87 | 98.64 |
ecoli2 | 98.81 | 98.70 | 98.82 | 98.71 | 98.81 | 98.70 | 98.81 | 98.70 | 98.81 | 98.70 | 98.72 | 98.61 | 99.38 | 99.33 | 98.76 | 98.50 |
ecoli3 | 98.90 | 98.64 | 98.91 | 98.65 | 98.90 | 98.64 | 98.90 | 98.64 | 98.90 | 98.64 | 98.78 | 98.56 | 99.16 | 99.22 | 99.12 | 99.21 |
ecoli-0_vs_1 | 98.81 | 98.64 | 98.82 | 98.65 | 98.81 | 98.64 | 98.81 | 98.64 | 98.81 | 98.64 | 98.70 | 98.53 | 99.06 | 99.27 | 98.54 | 98.40 |
glass0 | 98.30 | 98.17 | 98.31 | 98.18 | 98.30 | 98.17 | 98.30 | 98.17 | 98.30 | 98.17 | 98.19 | 98.08 | 99.25 | 99.15 | 98.11 | 99.16 |
glass1 | 98.27 | 98.16 | 98.28 | 98.17 | 98.27 | 98.16 | 98.27 | 98.16 | 98.27 | 98.16 | 98.20 | 98.10 | 99.08 | 99.19 | 98.15 | 98.02 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 95.45 | 95.50 | 95.46 | 95.51 | 95.45 | 95.50 | 95.45 | 95.50 | 95.45 | 95.50 | 95.30 | 95.40 | 98.00 | 98.26 | 95.24 | 95.40 |
iris0 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
new-thyroid1 | 98.59 | 98.63 | 98.60 | 98.64 | 98.59 | 98.63 | 98.59 | 98.63 | 98.59 | 98.63 | 98.46 | 98.60 | 99.01 | 99.14 | 98.68 | 99.85 |
new-thyroid2 | 98.87 | 98.79 | 98.88 | 98.80 | 98.87 | 98.79 | 98.87 | 98.79 | 98.87 | 98.79 | 98.54 | 98.50 | 99.08 | 99.01 | 98.80 | 98.72 |
page-blocks0 | 98.40 | 98.16 | 98.41 | 98.17 | 98.40 | 98.16 | 98.40 | 98.16 | 98.40 | 98.16 | 98.26 | 98.09 | 99.12 | 99.06 | 98.60 | 98.47 |
pima | 98.32 | 99.02 | 98.33 | 99.03 | 98.32 | 99.02 | 98.32 | 99.02 | 98.32 | 99.02 | 98.24 | 98.95 | 99.20 | 99.03 | 99.23 | 99.01 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.35 | 99.29 | 99.99 | 99.99 |
vehicle0 | 98.98 | 98.90 | 98.99 | 98.91 | 98.98 | 98.90 | 98.98 | 98.90 | 98.98 | 98.90 | 98.86 | 98.80 | 98.60 | 98.57 | 98.86 | 98.80 |
vehicle1 | 98.58 | 98.44 | 98.59 | 98.45 | 98.58 | 98.44 | 98.58 | 98.44 | 98.58 | 98.44 | 98.50 | 98.38 | 99.16 | 99.10 | 98.42 | 98.37 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.55 | 99.34 | 99.95 | 99.95 |
vehicle3 | 99.31 | 99.15 | 99.32 | 99.16 | 99.31 | 99.15 | 99.31 | 99.15 | 99.31 | 99.15 | 99.23 | 99.10 | 99.21 | 99.11 | 99.20 | 99.05 |
wisconsin | 99.72 | 99.70 | 99.73 | 99.71 | 99.72 | 99.70 | 99.72 | 99.70 | 99.72 | 99.70 | 99.60 | 99.55 | 99.23 | 99.19 | 99.50 | 99.45 |
yeast1 | 98.94 | 98.60 | 98.95 | 98.61 | 98.94 | 98.60 | 98.94 | 98.60 | 98.94 | 98.60 | 98.80 | 98.73 | 99.03 | 99.00 | 98.90 | 98.72 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.91 | 99.96 |
yeast-2_vs_4 | 98.93 | 99.00 | 98.94 | 99.00 | 98.93 | 99.00 | 98.93 | 99.00 | 98.93 | 99.00 | 98.81 | 98.90 | 98.86 | 99.10 | 98.99 | 98.92 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.73 | 99.60 | 99.89 | 99.87 | 99.99 | 99.99 |
nursery | 88.19 | 88.03 | 88.20 | 88.04 | 88.19 | 88.03 | 88.19 | 88.03 | 88.19 | 88.03 | 88.02 | 87.93 | 90.19 | 89.95 | 87.94 | 87.83 |
breast cancer | 99.00 | 98.72 | 99.01 | 98.73 | 99.00 | 98.72 | 99.00 | 98.72 | 99.00 | 98.72 | 98.81 | 98.61 | 98.73 | 98.52 | 98.67 | 98.43 |
Z-Alizadeh Sani | 96.87 | 96.60 | 96.88 | 96.61 | 96.87 | 96.60 | 96.87 | 96.60 | 96.87 | 96.60 | 96.72 | 96.50 | 97.19 | 97.00 | 96.72 | 96.49 |
Average | 98.45 | 98.38 | 98.45 | 98.39 | 98.45 | 98.38 | 98.45 | 98.38 | 98.45 | 98.38 | 98.33 | 98.29 | 98.78 | 98.74 | 98.42 | 98.43 |
Dataset | Acc CNN | Acc DNN | Pre CNN | Pre DNN | Rec CNN | Rec DNN | F1 CNN | F1 DNN | G-Mean CNN | G-Mean DNN | Spe CNN | Spe DNN | AUC CNN | AUC DNN | Kap CNN | Kap DNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ecoli1 | 98.26 | 98.17 | 98.27 | 98.18 | 98.26 | 98.17 | 98.26 | 98.17 | 98.26 | 98.17 | 98.19 | 98.07 | 99.02 | 99.00 | 98.64 | 98.60 |
ecoli2 | 98.51 | 98.43 | 98.52 | 98.44 | 98.51 | 98.43 | 98.51 | 98.43 | 98.51 | 98.43 | 98.39 | 98.37 | 99.13 | 99.06 | 98.71 | 98.60 |
ecoli3 | 98.84 | 98.70 | 98.85 | 98.71 | 98.84 | 98.70 | 98.84 | 98.70 | 98.84 | 98.70 | 98.76 | 98.63 | 99.19 | 99.16 | 98.80 | 98.62 |
ecoli-0_vs_1 | 98.77 | 98.60 | 98.78 | 98.61 | 98.77 | 98.60 | 98.77 | 98.60 | 98.77 | 98.60 | 98.70 | 98.51 | 99.03 | 99.01 | 98.65 | 98.57 |
glass0 | 98.36 | 98.13 | 98.37 | 98.14 | 98.36 | 98.13 | 98.36 | 98.13 | 98.36 | 98.13 | 98.24 | 98.03 | 99.28 | 99.06 | 98.17 | 98.11 |
glass1 | 98.49 | 98.53 | 98.50 | 98.54 | 98.49 | 98.53 | 98.49 | 98.53 | 98.49 | 98.53 | 98.33 | 98.44 | 99.16 | 98.39 | 98.27 | 98.45 |
glass6 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
haberman | 94.73 | 94.61 | 94.74 | 94.62 | 94.73 | 94.61 | 94.73 | 94.61 | 94.73 | 94.61 | 94.65 | 94.52 | 99.01 | 98.89 | 94.29 | 94.20 |
iris0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.84 | 99.70 | 99.96 | 99.95 |
new-thyroid1 | 98.29 | 98.43 | 98.30 | 98.44 | 98.30 | 98.43 | 98.30 | 98.43 | 98.30 | 98.43 | 98.05 | 98.27 | 99.32 | 99.43 | 99.27 | 98.43 |
new-thyroid2 | 98.80 | 98.30 | 98.81 | 98.31 | 98.80 | 98.30 | 98.80 | 98.30 | 98.80 | 98.30 | 98.69 | 98.15 | 98.99 | 98.52 | 99.01 | 98.78 |
page-blocks0 | 98.31 | 98.14 | 98.32 | 98.15 | 98.31 | 98.14 | 98.31 | 98.14 | 98.31 | 98.14 | 98.15 | 98.04 | 98.87 | 98.45 | 98.32 | 98.14 |
pima | 98.19 | 98.05 | 98.20 | 98.06 | 98.19 | 98.05 | 98.19 | 98.05 | 98.19 | 98.05 | 98.06 | 97.94 | 98.85 | 98.56 | 98.00 | 97.94 |
segment0 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.29 | 99.21 | 99.99 | 99.99 |
vehicle0 | 98.84 | 98.73 | 98.85 | 98.74 | 98.84 | 98.73 | 98.84 | 98.73 | 98.84 | 98.73 | 98.63 | 98.73 | 99.15 | 98.99 | 98.70 | 98.52 |
vehicle1 | 98.46 | 98.32 | 98.47 | 98.33 | 98.46 | 98.32 | 98.46 | 98.32 | 98.46 | 98.32 | 98.32 | 98.24 | 98.21 | 98.10 | 98.24 | 98.10 |
vehicle2 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.33 | 99.05 | 99.82 | 99.70 |
vehicle3 | 99.28 | 99.13 | 99.29 | 99.14 | 99.28 | 99.13 | 99.28 | 99.13 | 99.28 | 99.13 | 99.18 | 99.05 | 99.08 | 98.99 | 99.11 | 98.94 |
wisconsin | 99.64 | 99.48 | 99.65 | 99.49 | 99.64 | 99.48 | 99.64 | 99.48 | 99.64 | 99.48 | 99.61 | 99.34 | 99.64 | 99.27 | 99.29 | 99.16 |
yeast1 | 98.99 | 98.70 | 99.00 | 98.71 | 98.99 | 98.70 | 98.99 | 98.70 | 98.99 | 98.70 | 98.79 | 98.61 | 99.02 | 98.89 | 98.75 | 98.52 |
yeast3 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.84 | 99.62 |
yeast-2_vs_4 | 98.82 | 98.70 | 98.83 | 98.71 | 98.82 | 98.70 | 98.82 | 98.70 | 98.82 | 98.62 | 98.70 | 98.70 | 98.82 | 98.50 | 98.70 | 98.63 |
penbased | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.99 | 99.92 | 99.87 | 99.90 | 99.87 | 99.99 | 99.99 |
nursery | 88.06 | 88.00 | 88.07 | 88.01 | 88.06 | 88.00 | 88.06 | 88.00 | 88.06 | 88.00 | 87.96 | 87.80 | 89.96 | 89.85 | 87.84 | 87.95 |
breast cancer | 98.91 | 98.70 | 98.92 | 98.71 | 98.91 | 98.70 | 98.91 | 98.70 | 98.91 | 98.70 | 98.70 | 98.60 | 98.50 | 98.43 | 98.36 | 98.27 |
Z-Alizadeh Sani | 96.30 | 96.15 | 96.31 | 96.16 | 96.30 | 96.15 | 96.30 | 96.15 | 96.30 | 96.15 | 96.16 | 96.02 | 96.98 | 97.00 | 96.12 | 96.01 |
Average | 98.33 | 98.22 | 98.34 | 98.23 | 98.33 | 98.22 | 98.33 | 98.22 | 98.33 | 98.22 | 98.23 | 98.15 | 98.75 | 98.59 | 98.26 | 98.14 |
5. Discussion
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Li, Y.; Zhang, J.; Zhang, S.; Xiao, W.; Zhang, Z. Multi-objective optimization-based adaptive class-specific cost extreme learning machine for imbalanced classification. Neurocomputing 2022, 496, 107–120.
2. He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
3. Weiss, G.M. Mining with rarity: A unifying framework. ACM SIGKDD Explor. Newsl. 2004, 6, 7–19.
4. Batista, G.E.A.P.A.; Prati, R.C.; Monard, M.C. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl. 2004, 6, 20–29.
5. Mani, I.; Zhang, I. kNN approach to unbalanced data distributions: A case study involving information extraction. In Proceedings of the Workshop on Learning from Imbalanced Datasets (ICML 2003), Washington, DC, USA, 21 August 2003; pp. 1–7.
6. Liu, W.; Chawla, S. Class confidence weighted knn algorithms for imbalanced data sets. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Shenzhen, China, 24–27 May 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 345–356.
7. Chawla, N.V.; Lazarevic, A.; Hall, L.O.; Bowyer, K.W. SMOTEBoost: Improving prediction of the minority class in boosting. In Proceedings of the European Conference on Principles of Data Mining and Knowledge Discovery, Cavtat-Dubrovnik, Croatia, 22–26 September 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 107–119.
8. Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2009, 40, 185–197.
9. Provost, F. Machine learning from imbalanced data sets 101. In Proceedings of the AAAI’2000 Workshop on Imbalanced Data Sets, Austin, TX, USA, 31 July 2000; AAAI Press: Palo Alto, CA, USA, 2000; pp. 1–3.
10. Sun, Y.; Kamel, M.S.; Wong, A.K.; Wang, Y. Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit. 2007, 40, 3358–3378.
11. Liu, X.-Y.; Wu, J.; Zhou, Z.-H. Exploratory Undersampling for Class-Imbalance Learning. IEEE Trans. Syst. Man Cybern. Part B 2008, 39, 539–550.
12. Barandela, R.; Sánchez, J.S.; García, V.; Rangel, E. Strategies for learning in class imbalance problems. Pattern Recognit. 2003, 36, 849–851.
13. Tahir, M.A.; Kittler, J.; Yan, F. Inverse random under sampling for class imbalance problem and its application to multi-label classification. Pattern Recognit. 2012, 45, 3738–3750.
14. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
15. García, S.; Herrera, F. Evolutionary Undersampling for Classification with Imbalanced Datasets: Proposals and Taxonomy. Evol. Comput. 2009, 17, 275–306.
16. Weiss, G.M.; McCarthy, K.; Zabar, B. Cost-sensitive learning vs. sampling: Which is best for handling unbalanced classes with unequal error costs? DMIN 2007, 7, 24.
17. Zhou, Z.-H.; Liu, X.-Y. Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans. Knowl. Data Eng. 2005, 18, 63–77.
18. Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. A Comparative Study of Data Sampling and Cost Sensitive Learning. In Proceedings of the IEEE International Conference on Data Mining Workshops, Pisa, Italy, 5–19 December 2008; pp. 46–52.
19. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
20. Freund, Y.; Schapire, R.E. Experiments with a new boosting algorithm. ICML 1996, 96, 148–156.
21. Guo, H.; Viktor, H.L. Learning from imbalanced data sets with boosting and data generation: The databoost-im approach. ACM SIGKDD Explor. Newsl. 2004, 6, 30–39.
22. Hido, S.; Kashima, H.; Takahashi, Y. Roughly balanced bagging for imbalanced data. Stat. Anal. Data Min. ASA Data Sci. J. 2009, 2, 412–426.
23. Durahim, A.O. Comparison of sampling techniques for imbalanced learning. Yönet. Bilişim Sist. Derg. 2016, 2, 181–191.
24. Tomek, I. Two Modifications of CNN. IEEE Trans. Syst. Man Cybern. 1976, SMC-6, 769–772.
25. Kubat, M.; Matwin, S. Addressing the curse of imbalanced training sets: One-sided selection. ICML 1997, 97, 179.
26. Azadbakht, M.; Fraser, C.S.; Khoshelham, K. Synergy of sampling techniques and ensemble classifiers for classification of urban environments using full-waveform LiDAR data. Int. J. Appl. Earth Obs. Geoinf. 2018, 73, 277–291.
27. Czarnowski, I. Weighted Ensemble with one-class Classification and Over-sampling and Instance selection (WECOI): An approach for learning from imbalanced data streams. J. Comput. Sci. 2022, 61, 101614.
28. Chen, Q.; Zhang, Z.-L.; Huang, W.-P.; Wu, J.; Luo, X.-G. PF-SMOTE: A novel parameter-free SMOTE for imbalanced datasets. Neurocomputing 2022, 498, 75–88.
29. Mayabadi, S.; Saadatfar, H. Two density-based sampling approaches for imbalanced and overlapping data. Knowl.-Based Syst. 2022, 241, 108217.
30. Buda, M.; Maki, A.; Mazurowski, M.A. A systematic study of the class imbalance problem in convolutional neural networks. Neural Netw. 2018, 106, 249–259.
31. Li, K.; Zhou, G.; Zhai, J.; Li, F.; Shao, M. Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data. Sensors 2019, 19, 1476.
32. Vuttipittayamongkol, P.; Elyan, E.; Petrovski, A. On the class overlap problem in imbalanced data classification. Knowl.-Based Syst. 2020, 212, 106631.
33. Aridas, C.K.; Karlos, S.; Kanas, V.G.; Fazakis, N.; Kotsiantis, S.B. Uncertainty Based Under-Sampling for Learning Naive Bayes Classifiers Under Imbalanced Data Sets. IEEE Access 2019, 8, 2122–2133.
34. Dablain, D.; Krawczyk, B.; Chawla, N.V. DeepSMOTE: Fusing Deep Learning and SMOTE for Imbalanced Data. IEEE Trans. Neural Networks Learn. Syst. 2022, 1–15.
35. Bagui, S.; Li, K. Resampling imbalanced data for network intrusion detection datasets. J. Big Data 2021, 8, 6.
36. Choi, H.-S.; Jung, D.; Kim, S.; Yoon, S. Imbalanced Data Classification via Cooperative Interaction Between Classifier and Generator. IEEE Trans. Neural Networks Learn. Syst. 2021, 33, 3343–3356.
37. Xie, X.; Liu, H.; Zeng, S.; Lin, L.; Li, W. A novel progressively undersampling method based on the density peaks sequence for imbalanced data. Knowl.-Based Syst. 2020, 213, 106689.
38. Zheng, M.; Li, T.; Sun, L.; Wang, T.; Jie, B.; Yang, W.; Tang, M.; Lv, C. An automatic sampling ratio detection method based on genetic algorithm for imbalanced data classification. Knowl.-Based Syst. 2021, 216, 106800.
- Elyan, E.; Moreno-Garcia, C.F.; Jayne, C. CDSMOTE: Class decomposition and synthetic minority class oversampling technique for imbalanced-data classification. Neural Comput. Appl. 2020, 33, 2839–2851. [Google Scholar] [CrossRef]
- Asniar; Maulidevi, N.U.; Surendro, K. SMOTE-LOF for noise identification in imbalanced data classification. J. King Saud Univ. Comput. Inf. Sci. 2021, 34, 3413–3423. [Google Scholar] [CrossRef]
- Abdoli, M.; Akbari, M.; Shahrabi, J. Bagging Supervised Autoencoder Classifier for credit scoring. Expert Syst. Appl. 2023, 213, 118991.
- El Bakrawy, L.M.; Cifci, M.A.; Kausar, S.; Hussain, S.; Islam, A.; Alatas, B.; Desuky, A.S. A Modified Ant Lion Optimization Method and Its Application for Instance Reduction Problem in Balanced and Imbalanced Data. Axioms 2022, 11, 95.
- Yang, M.; Wang, Z.; Li, Y.; Zhou, Y.; Li, D.; Du, W. Gravitation balanced multiple kernel learning for imbalanced classification. Neural Comput. Appl. 2022, 34, 13807–13823.
- Tanimoto, A.; Yamada, S.; Takenouchi, T.; Sugiyama, M.; Kashima, H. Improving imbalanced classification using near-miss instances. Expert Syst. Appl. 2022, 201, 117130.
- Thejas, G.S.; Hariprasad, Y.; Iyengar, S.; Sunitha, N.; Badrinath, P.; Chennupati, S. An extension of Synthetic Minority Oversampling Technique based on Kalman filter for imbalanced datasets. Mach. Learn. Appl. 2022, 8, 100267.
- Wei, G.; Mu, W.; Song, Y.; Dou, J. An improved and random synthetic minority oversampling technique for imbalanced data. Knowl.-Based Syst. 2022, 248, 108839.
- Gao, Y.; Gao, L.; Li, X.; Cao, S. A Hierarchical Training-Convolutional Neural Network for Imbalanced Fault Diagnosis in Complex Equipment. IEEE Trans. Ind. Inform. 2022, 18, 8138–8145.
- Mohammed, R.; Rawashdeh, J.; Abdullah, M. Machine Learning with Oversampling and Undersampling Techniques: Overview Study and Experimental Results. In Proceedings of the 11th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; pp. 243–248.
- Li, W.; Chen, J.; Cao, J.; Ma, C.; Wang, J.; Cui, X.; Chen, P. EID-GAN: Generative Adversarial Nets for Extremely Imbalanced Data Augmentation. IEEE Trans. Ind. Inform. 2022, 19, 3208–3218.
- Zieba, M.; Tomczak, J.M. Boosted SVM with active learning strategy for imbalanced data. Soft Comput. 2014, 19, 3357–3368.
- He, H.; Zhang, W.; Zhang, S. A novel ensemble method for credit scoring: Adaption of different imbalance ratios. Expert Syst. Appl. 2018, 98, 105–117.
- Li, D.-C.; Wang, S.-Y.; Huang, K.-C.; Tsai, T.-I. Learning class-imbalanced data with region-impurity synthetic minority oversampling technique. Inf. Sci. 2022, 607, 1391–1407.
- Fernandez, A.; Garcia, S.; Herrera, F.; Chawla, N.V. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J. Artif. Intell. Res. 2018, 61, 863–905.
- Pereira, R.M.; Costa, Y.M.; Silla, C.N., Jr. MLTL: A multi-label approach for the Tomek Link undersampling algorithm. Neurocomputing 2019, 383, 95–105.
- Hernandez, J.; Carrasco-Ochoa, J.A.; Martínez-Trinidad, J.F. An Empirical Study of Oversampling and Undersampling for Instance Selection Methods on Imbalance Datasets. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, Proceedings of the 18th Iberoamerican Congress, CIARP 2013, Havana, Cuba, 20–23 November 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 262–269.
- Kamei, Y.; Monden, A.; Matsumoto, S.; Kakimoto, T.; Matsumoto, K.-I. The effects of over and under sampling on fault-prone module detection. In Proceedings of the First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007), Madrid, Spain, 20–21 September 2007; pp. 196–204.
- More, A. Survey of resampling techniques for improving classification performance in unbalanced datasets. arXiv 2016, arXiv:1608.06048.
- Liu, W.; Wang, Z.; Liu, X.; Zeng, N.; Liu, Y.; Alsaadi, F.E. A survey of deep neural network architectures and their applications. Neurocomputing 2017, 234, 11–26.
- Caterini, A.L.; Chang, D.E. Deep Neural Networks in a Mathematical Framework; Springer International Publishing: Berlin/Heidelberg, Germany, 2018.
- Pal, S.K.; Mitra, S. Multilayer Perceptron, Fuzzy Sets, Classification. IEEE Trans. Neural Netw. 1992, 3, 683–697.
- Guo, Y.; Du, G.-Q.; Shen, W.-Q.; Du, C.; He, P.-N.; Siuly, S. Automatic myocardial infarction detection in contrast echocardiography based on polar residual network. Comput. Methods Programs Biomed. 2020, 198, 105791.
- Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6.
- O’Shea, K.; Nash, R. An introduction to convolutional neural networks. arXiv 2015, arXiv:1511.08458.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Mulyanto, M.; Faisal, M.; Prakosa, S.W.; Leu, J.-S. Effectiveness of Focal Loss for Minority Classification in Network Intrusion Detection Systems. Symmetry 2020, 13, 4.
- Alcalá-Fdez, J.; Fernández, A.; Luengo, J.; Derrac, J.; García, S.; Sánchez, L.; Herrera, F. KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. J. Mult. Valued Log. Soft Comput. 2011, 17, 1–36.
- Joloudari, J.H.; Azizi, F.; Nematollahi, M.A.; Alizadehsani, R.; Hassannatajjeloudari, E.; Nodehi, I.; Mosavi, A. GSVMA: A Genetic Support Vector Machine ANOVA Method for CAD Diagnosis. Front. Cardiovasc. Med. 2022, 8, 2178.
- Li, J.; Fong, S.; Zhuang, Y. Optimizing SMOTE by metaheuristics with neural network and decision tree. In Proceedings of the 3rd International Symposium on Computational and Business Intelligence (ISCBI), Bali, Indonesia, 7–9 December 2015; pp. 26–32.
- Chowdary, M.K.; Nguyen, T.N.; Hemanth, D.J. Deep learning-based facial emotion recognition for human–computer interaction applications. Neural Comput. Appl. 2021, 1–18.
- Narkhede, S. Understanding AUC ROC curve. Towards Data Sci. 2018, 26, 220–227.
- Zhang, S.; Yuan, Y.; Yao, Z.; Wang, X.; Lei, Z. Improvement of the Performance of Models for Predicting Coronary Artery Disease Based on XGBoost Algorithm and Feature Processing Technology. Electronics 2022, 11, 315.
- Alizadehsani, R.; Hosseini, M.J.; Sani, Z.A.; Ghandeharioun, A.; Boghrati, R. Diagnosis of coronary artery disease using cost-sensitive algorithms. In Proceedings of the 12th International Conference on Data Mining Workshops, Brussels, Belgium, 10 December 2012; pp. 9–16.
- Alizadehsani, R.; Habibi, J.; Hosseini, M.J.; Boghrati, R.; Ghandeharioun, A.; Bahadorian, B.; Sani, Z.A. Diagnosis of coronary artery disease using data mining techniques based on symptoms and ECG features. Eur. J. Sci. Res. 2012, 82, 542–553.
- Alizadehsani, R.; Habibi, J.; Hosseini, M.J.; Mashayekhi, H.; Boghrati, R.; Ghandeharioun, A.; Bahadorian, B.; Sani, Z.A. A data mining approach for diagnosis of coronary artery disease. Comput. Methods Programs Biomed. 2013, 111, 52–61.
- Babič, F.; Olejár, J.; Vantová, Z.; Paralič, J. Predictive and descriptive analysis for heart disease diagnosis. In Proceedings of the 2017 Federated Conference on Computer Science and Information Systems (FedCSIS), Prague, Czech Republic, 3–6 September 2017; pp. 155–163.
- Arabasadi, Z.; Alizadehsani, R.; Roshanzamir, M.; Moosaei, H.; Yarifard, A.A. Computer aided decision making for heart disease detection using hybrid neural network-Genetic algorithm. Comput. Methods Programs Biomed. 2017, 141, 19–26.
- Li, H.; Wang, X.; Li, Y.; Qin, C.; Liu, C. Comparison between medical knowledge based and computer automated feature selection for detection of coronary artery disease using imbalanced data. In Proceedings of the BIBE 2018, International Conference on Biological Information and Biomedical Engineering, Shanghai, China, 6–8 July 2018; pp. 1–4.
- Abdar, M.; Acharya, U.R.; Sarrafzadegan, N.; Makarenkov, V. NE-nu-SVC: A New Nested Ensemble Clinical Decision Support System for Effective Diagnosis of Coronary Artery Disease. IEEE Access 2019, 7, 167605–167620.
- Abdar, M.; Książek, W.; Acharya, U.R.; Tan, R.-S.; Makarenkov, V.; Pławiak, P. A new machine learning technique for an accurate diagnosis of coronary artery disease. Comput. Methods Programs Biomed. 2019, 179, 104992.
- Khan, Y.; Qamar, U.; Asad, M.; Zeb, B. Applying Feature Selection and Weight Optimization Techniques to Enhance Artificial Neural Network for Heart Disease Diagnosis. In Intelligent Systems and Applications, Proceedings of the 2019 Intelligent Systems Conference (IntelliSys), London, UK, 5–6 September 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 340–351.
- Kolukısa, B.; Hacılar, H.; Kuş, M.; Bakır-Güngör, B.; Aral, A.; Güngör, V.Ç. Diagnosis of coronary heart disease via classification algorithms and a new feature selection methodology. Int. J. Data Min. Sci. 2019, 1, 8–15.
- Nasarian, E.; Abdar, M.; Fahami, M.A.; Alizadehsani, R.; Hussain, S.; Basiri, M.E.; Zomorodi-Moghadam, M.; Zhou, X.; Pławiak, P.; Acharya, U.R.; et al. Association between work-related features and coronary artery disease: A heterogeneous hybrid feature selection integrated with balancing approach. Pattern Recognit. Lett. 2020, 133, 33–40.
- Shahid, A.H.; Singh, M. A Novel Approach for Coronary Artery Disease Diagnosis using Hybrid Particle Swarm Optimization based Emotional Neural Network. Biocybern. Biomed. Eng. 2020, 40, 1568–1585.
- Ghiasi, M.M.; Zendehboudi, S.; Mohsenipour, A.A. Decision tree-based diagnosis of coronary artery disease: CART model. Comput. Methods Programs Biomed. 2020, 192, 105400.
- Joloudari, J.H.; Joloudari, E.H.; Saadatfar, H.; Ghasemigol, M.; Razavi, S.M.; Mosavi, A.; Nabipour, N.; Shamshirband, S.; Nadai, L. Coronary Artery Disease Diagnosis; Ranking the Significant Features Using a Random Trees Model. Int. J. Environ. Res. Public Health 2020, 17, 731.
- Zomorodi-Moghadam, M.; Abdar, M.; Davarzani, Z.; Zhou, X.; Pławiak, P.; Acharya, U. Hybrid particle swarm optimization for rule discovery in the diagnosis of coronary artery disease. Expert Syst. 2019, 38, e12485.
- Ashish, L.; Kumar, S.; Yeligeti, S. Ischemic heart disease detection using support vector Machine and extreme gradient boosting method. Mater. Today Proc. 2021.
- Gupta, A.; Kumar, R.; Arora, H.S.; Raman, B. C-CADZ: Computational intelligence system for coronary artery disease detection using Z-Alizadeh Sani dataset. Appl. Intell. 2021, 52, 2436–2464.

Item | Specification |
---|---|
Programming Language | Python 3.9 |
Deep Learning Library | PyTorch 1.9 |
CPU | Intel® Core™ i7-10700 CPU @ 2.90 GHz × 16 |
GPU | NVIDIA Corporation GP104 [GeForce GTX 1070] |
RAM | 64 GB |

Layer | Layer Type | Input Features | Output Features |
---|---|---|---|
1 | Linear (Dense) | Input shape | 64 |
2 | ReLU | N/A | N/A |
3 | Batch normalization | 64 | - |
4 | Linear (Dense) | 64 | 64 |
5 | ReLU | N/A | N/A |
6 | Batch normalization | 64 | - |
7 | Dropout | rate: 0.3 | - |
8 | Linear (Dense) | 64 | 1 |
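
The layer stack in the table above can be sketched in PyTorch, the library listed in the setup table. This is a minimal sketch rather than the authors' code: the helper name `build_dnn`, the 9-feature input, and the choice to return a raw score with no final sigmoid are assumptions made for illustration.

```python
import torch
from torch import nn


def build_dnn(n_features: int) -> nn.Sequential:
    """Stack the eight layers in the order given in the table."""
    return nn.Sequential(
        nn.Linear(n_features, 64),  # layer 1
        nn.ReLU(),                  # layer 2
        nn.BatchNorm1d(64),         # layer 3
        nn.Linear(64, 64),          # layer 4
        nn.ReLU(),                  # layer 5
        nn.BatchNorm1d(64),         # layer 6
        nn.Dropout(p=0.3),          # layer 7
        nn.Linear(64, 1),           # layer 8: one raw score per sample
    )


model = build_dnn(n_features=9)  # 9 input features is an assumed example
model.eval()                     # turn off dropout, use running batch-norm statistics
with torch.no_grad():
    scores = model(torch.randn(4, 9))  # batch of 4 random samples
```

The input width is the only dataset-dependent value, which is why the table lists it as "Input shape".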

Layer | Layer Type | Input Channels | Output Channels | Kernel Size | Stride |
---|---|---|---|---|---|
1 | Convolution | 1 | 16 | 3 | 1 |
2 | ReLU | N/A | N/A | N/A | N/A |
3 | Convolution | 16 | 4 | 2 | 1 |
4 | ReLU | N/A | N/A | N/A | N/A |
5 | Linear (Dense) | 16 | 50 | N/A | N/A |
6 | ReLU | N/A | N/A | N/A | N/A |
7 | Linear (Dense) | 16 | 1 | N/A | N/A |
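
As a rough check on the dense-layer input size, the sketch below works out how many values reach layer 5. It assumes 1D convolutions over the feature vector with no padding and no dilation, none of which the table states; `conv1d_out_len` and `flattened_size` are hypothetical helper names.

```python
def conv1d_out_len(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 1D convolution without padding or dilation."""
    if length < kernel:
        raise ValueError("input is shorter than the kernel")
    return (length - kernel) // stride + 1


def flattened_size(n_features: int) -> int:
    """Number of values entering the first dense layer (assumed layout)."""
    after_conv1 = conv1d_out_len(n_features, kernel=3, stride=1)   # layer 1, 16 channels
    after_conv2 = conv1d_out_len(after_conv1, kernel=2, stride=1)  # layer 3, 4 channels
    return 4 * after_conv2  # channels of layer 3 times remaining length


sizes = {n: flattened_size(n) for n in (7, 9, 18)}
print(sizes)
```

Under these assumptions a 7-feature input yields 4 × 4 = 16 values, which matches the input size listed for layer 5; other feature counts would give a different size.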

Name of Parameter | Description | Choice |
---|---|---|
Sampling strategy | Which classes to resample | All |
Random state | Seed that fixes the random state so the resampled data can be reproduced | None |
K Neighbors | Number of nearest neighbors used to synthesize new samples | 10 |
Number of Jobs | Number of CPU cores to use | None |
Number of features | Number of input features | Set to the number of features in each dataset |
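
To illustrate what the K Neighbors and Random state settings control, here is a simplified sketch of the SMOTE interpolation step in plain Python. It is not the library implementation used in the study; the function name `smote_sketch`, the toy points, and the use of Euclidean distance are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k_neighbors=10, random_state=None):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(random_state)        # a fixed seed makes the output repeatable
    k = min(k_neighbors, len(minority) - 1)  # cannot use more neighbours than exist
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]  # toy minority class
new_points = smote_sketch(minority, n_new=6, k_neighbors=3, random_state=0)
```

Each synthetic point lies on a line segment between two existing minority samples, so a larger K widens the pool of segments a new point can be drawn from.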

No. | Name | Attributes | All Samples | Imbalance Ratio |
---|---|---|---|---|
1 | Wisconsin | 9 | 683 | 1.86 |
2 | Pima | 8 | 768 | 1.87 |
3 | iris0 | 4 | 150 | 2.00 |
4 | glass0 | 9 | 214 | 2.06 |
5 | glass1 | 9 | 214 | 1.82 |
6 | glass6 | 9 | 214 | 6.38 |
7 | yeast1 | 8 | 1484 | 2.46 |
8 | Haberman | 3 | 306 | 2.78 |
9 | vehicle1 | 18 | 846 | 2.90 |
10 | vehicle2 | 18 | 846 | 2.88 |
11 | vehicle3 | 18 | 846 | 2.99 |
12 | ecoli1 | 7 | 336 | 3.36 |
13 | ecoli2 | 7 | 336 | 5.46 |
14 | ecoli3 | 7 | 336 | 8.6 |
15 | new-thyroid1 | 5 | 215 | 5.14 |
16 | new-thyroid2 | 5 | 215 | 5.14 |
17 | segment0 | 19 | 2308 | 6.02 |
18 | yeast3 | 8 | 1484 | 8.10 |
19 | page-blocks0 | 10 | 5472 | 8.79 |
20 | yeast-2_vs_4 | 8 | 514 | 9.08 |
21 | penbased | 10 | 10,992 | 9.41 |
22 | Nursery | 5 | 12,690 | 2.2 |
23 | breast cancer | 117 | 102,294 | 163.2 |
24 | Z-Alizadeh Sani | 55 | 303 | 2.48 |
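
The ratio column is assumed here to follow the usual convention of majority-class size divided by minority-class size; this section does not restate the definition. A small sketch with hypothetical helper names, using a toy 150-sample set split 100 to 50, which gives a ratio of 2.00 like the iris0 row:

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (binary labels)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def minority_size(n_samples, ratio):
    """Approximate minority count implied by a sample total and a ratio."""
    return round(n_samples / (1 + ratio))


labels = [1] * 50 + [0] * 100  # toy labels: 50 minority, 100 majority
ratio = imbalance_ratio(labels)
print(f"{ratio:.2f}", minority_size(150, ratio))
```

Going the other way, `minority_size` turns a row's sample total and ratio back into an approximate minority count, which is handy for judging how few positive examples a dataset offers.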

Models | Acc (%) | Pre (%) | Sen (%) | F1 (%) | G-Mean (%) | Spe (%) | AUC (%) | Kappa (%) |
---|---|---|---|---|---|---|---|---|
TL + NORM. + CNN | 98.92 | 98.93 | 98.92 | 98.92 | 98.92 | 98.90 | 99.07 | 98.87 |
TL + NORM. + DNN | 98.95 | 98.90 | 98.90 | 98.90 | 98.90 | 98.87 | 98.98 | 98.79 |
OSS + NORM. + CNN | 98.78 | 98.79 | 98.78 | 98.78 | 98.78 | 98.72 | 98.93 | 98.73 |
OSS + NORM. + DNN | 98.82 | 98.83 | 98.82 | 98.82 | 98.82 | 98.80 | 98.94 | 98.72 |
NearMiss + NORM. + CNN | 98.98 | 98.98 | 98.98 | 98.98 | 98.98 | 98.92 | 99.07 | 98.89 |
NearMiss + NORM. + DNN | 99.01 | 99.02 | 99.01 | 99.01 | 99.01 | 98.95 | 98.99 | 98.89 |
RUS + NORM. + CNN | 98.33 | 98.34 | 98.33 | 98.33 | 98.33 | 98.23 | 98.75 | 98.26 |
RUS + NORM. + DNN | 98.22 | 98.23 | 98.22 | 98.22 | 98.22 | 98.15 | 98.59 | 98.14 |
SMOTE + NORM. + CNN | 99.08 | 99.09 | 99.08 | 99.09 | 99.08 | 99.03 | 99.08 | 98.92 |
SMOTE + NORM. + DNN | 98.96 | 99.00 | 98.97 | 98.98 | 98.98 | 98.99 | 99.02 | 98.78 |
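
The column abbreviations above correspond to quantities that can be derived from confusion-matrix counts. The sketch below uses the standard binary definitions, which may differ from the averaging used to produce the table; the function name and the example counts are made up for illustration. AUC is omitted because it needs predicted scores, not just counts.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary metrics from confusion-matrix counts (no zero-division guard)."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    pre = tp / (tp + fp)
    sen = tp / (tp + fn)            # sensitivity, also called recall
    spe = tn / (tn + fp)            # specificity
    f1 = 2 * pre * sen / (pre + sen)
    g_mean = math.sqrt(sen * spe)   # geometric mean of sensitivity and specificity
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (acc - p_chance) / (1 - p_chance)
    return {"acc": acc, "pre": pre, "sen": sen, "spe": spe,
            "f1": f1, "g_mean": g_mean, "kappa": kappa}


m = binary_metrics(tp=45, tn=50, fp=2, fn=3)  # illustrative counts only
print({name: round(100 * value, 2) for name, value in m.items()})
```

G-Mean and kappa are the columns most sensitive to imbalance: a classifier that ignores the minority class can still score high accuracy, but its sensitivity, and therefore its G-Mean, collapses.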

Study | Method | Rec (%) | G-Mean (%) | F1 (%) |
---|---|---|---|---|
Mayabadi and Saadatfar [29] | DB_HS + SVM | 95.80 | 88.30 | 81.40 |
Mayabadi and Saadatfar [29] | DB_HS + RF | 98.10 | 92.00 | 83.80 |
Mayabadi and Saadatfar [29] | DB_US + SVM | 92.70 | 88.50 | 81.50 |
Mayabadi and Saadatfar [29] | DB_US + RF | 95.60 | 93.80 | 87.90 |
Current Study | SMOTE + NORM + CNN | 99.00 | 99.00 | 98.98 |

Study | Method | Acc (%) | Rec (%) | F1 (%) | Pre (%) | Spe (%) | AUC (%) |
---|---|---|---|---|---|---|---|
Alizadehsani et al., 2012 [72] | Sequential Minimal Optimization | 92.09 | 97.22 | NC | NC | 79.31 | NC |
Alizadehsani et al., 2012 [73] | Ensemble of Naïve Bayes and Sequential Minimal Optimization | 88.52 | 91.12 | NC | NC | 82.05 | NC |
Alizadehsani et al., 2013 [74] | Information gain + Sequential Minimal Optimization | 94.08 | 96.30 | NC | NC | 88.51 | NC |
Babič et al., 2017 [75] | Support vector machine | 86.67 | NC | NC | NC | NC | NC |
Arabasadi et al., 2017 [76] | Neural network + Genetic algorithm | 93.85 | 97.00 | NC | NC | 92.00 | NC |
Li et al., 2018 [77] | Naïve Bayes + Genetic algorithm | 88.16 | 88.00 | NC | NC | 87.78 | NC |
Abdar et al., 2019 [78] | Nested ensemble nu-Support Vector Classification + genetic search algorithm + multi-step data balancing | 94.66 | 94.70 | 94.70 | 94.70 | NC | 96.60 |
Abdar et al., 2019 [79] | N2Genetic optimizer-nuSupport Vector Machine | 93.08 | NC | 91.51 | NC | NC | NC |
Khan et al., 2019 [80] | Neural network + Gini Index for feature selection + Backward Weight Optimization | 88.49 | NC | NC | NC | NC | NC |
Kolukısa et al., 2019 [81] | Ensemble Classifier with Fisher Linear Discriminant Analysis | 92.07 | 94.00 | 94.40 | NC | 87.40 | 95.30 |
Nasarian et al., 2020 [82] | Heterogeneous hybrid feature selection algorithm + SMOTE + Extreme gradient boosting | 92.58 | 92.99 | 90.62 | 92.59 | NC | NC |
Shahid and Singh, 2020 [83] | Hybrid Particle Swarm Optimization-emotional neural networks coupled with feature selection | 88.34 | 91.85 | 92.12 | 92.37 | 78.98 | NC |
Ghiasi et al., 2020 [84] | Classification and Regression tree | 92.41 | 98.61 | NC | NC | 77.01 | NC |
Joloudari et al., 2020 [85] | Random trees | 91.47 | NC | NC | NC | NC | 96.70 |
Zomorodi-Moghadam et al., 2021 [86] | Hybrid Particle Swarm Optimization | 84.25 | NC | NC | NC | NC | NC |
Ashish et al., 2021 [87] | Support Vector Machine-Extreme gradient boosting + Random forest | 93.86 | NC | 91.86 | 93.86 | NC | NC |
Zhang et al., 2022 [71] | Extreme gradient boosting + Feature construction + SMOTE | 94.70 | 96.10 | 94.60 | 93.40 | 93.20 | 98.00 |
Gupta et al., 2022 [88] | Fixed analysis of mixed data + Binary Bat Algorithm + Ensemble of Random Forest and Extra Trees | 97.37 | 98.15 | 98.15 | NC | 95.45 | 96.80 |
The proposed study | Mixed SMOTE-NORM.-CNN | 98.57 | 98.58 | 98.57 | 98.58 | 98.42 | 99.14 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Joloudari, J.H.; Marefat, A.; Nematollahi, M.A.; Oyelere, S.S.; Hussain, S. Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks. Appl. Sci. 2023, 13, 4006. https://doi.org/10.3390/app13064006