A Semisupervised Concept Drift Adaptation via Prototype-Based Manifold Regularization Approach with Knowledge Transfer
Abstract
1. Introduction
- Concept drifts need to be detected as soon as they occur.
- Overlapping concepts can introduce noise into the drift signal, making concept drift detection uncertain.
- Concept drift detection must run continuously. Therefore, processing the data, deciding whether to trigger a drift alarm, and adapting to drifts must be computationally efficient to maximize the model’s throughput.
- Current approaches require labeled data, which are expensive to obtain, to calculate the performance measure (e.g., the model’s error rate) used as the drift signal.
- Current unsupervised concept drift detection is blind to changes in the decision boundary because it monitors only changes in the data distribution.
- A concept drift detection approach was designed and developed to detect concept drifts by measuring the error made by confident predictions.
- An evaluation metric is proposed to measure the performance of the classifiers in the ensemble without requiring labeled data.
2. Literature Review
2.1. Data Stream Mining
2.2. Concept Drift Detection
3. Preliminaries
3.1. Self-Organizing Incremental Neural Network (SOINN)
Algorithm 1: Enhanced Self-Organizing Incremental Neural Network (ESOINN)
Input: dataset, maximum edge age, noise removal interval
Output: set of prototypes

1: function train_esoinn():
2:   foreach datapoint:
3:     if fewer than two nodes exist:
4:       insert the datapoint as a new node
5:       continue
6:     find the 1st winner (nearest node)
7:     find the 2nd winner (second-nearest node)
8:     if the datapoint lies outside the similarity threshold of either winner:
9:       insert the datapoint as a new node
10:    else:
11:      update the 1st winner
12:      update the 1st winner's neighbors
13:      increment the age of the 1st winner's edges
14:      if the winners belong to the same subclass:
15:        add a connection between the 1st winner and the 2nd winner
16:      else:
17:        remove the connection between them, if any
18:    remove all edges whose age exceeds the maximum edge age
19:    if the number of processed data is a multiple of the noise removal interval:
20:      foreach node a:
21:        if node a has no neighbors: remove node a
22:        else if node a has one neighbor and low density: remove node a
23:        else if node a has two neighbors and low density: remove node a
24: end function
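The core insertion/update logic of Algorithm 1 can be sketched in Python. This is a deliberately reduced, illustrative version of ESOINN: it keeps only the winner search, the distance-threshold insertion test, and the winner update, and omits edge aging, subclass handling, and periodic noise removal; `threshold` and `lr` are hypothetical fixed parameters standing in for ESOINN's adaptive similarity thresholds.

```python
import math

def nearest(prototypes, x):
    """Return index and distance of the prototype closest to x."""
    dists = [math.dist(p, x) for p in prototypes]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]

def update_prototypes(prototypes, x, threshold=1.0, lr=0.1):
    """One simplified SOINN-style step: insert x as a new prototype when it
    is far from every existing one; otherwise pull the winner toward x."""
    if not prototypes:
        prototypes.append(list(x))
        return prototypes
    i, d = nearest(prototypes, x)
    if d > threshold:
        prototypes.append(list(x))      # new node: x lies outside the threshold
    else:
        winner = prototypes[i]
        for j in range(len(winner)):    # move the winner toward x
            winner[j] += lr * (x[j] - winner[j])
    return prototypes

# Two well-separated clusters yield two prototypes.
protos = []
for x in [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]:
    update_prototypes(protos, x)
```

The prototype set grows only when an input falls outside the similarity threshold, which is what keeps the representation compact on a stream.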
3.2. Semisupervised Online Sequential—Extreme Learning Machine
Initialization Phase:
Obtain an initial dataset with true labels, compute the hidden-layer output matrix for it, and solve for the initial output weights (the trainable parameters of the SOS-ELM).
Sequential Learning Phase:
Obtain a chunk of data, which consists of labeled data and unlabeled data, and recursively update the output weights from the chunk.
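The sequential phase can be illustrated with the recursive least-squares update that OS-ELM-family methods build on. The sketch below omits the manifold-regularization (graph Laplacian) term of SOS-ELM and fits a plain regression stream, so it only shows the chunk-by-chunk update of the output weights; the network sizes, ridge constant, and target function are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden(X, W, b):
    """Random-feature hidden layer with a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

n_hidden, n_in = 40, 2
W = rng.normal(size=(n_in, n_hidden))   # random, untrained input weights
b = rng.normal(size=n_hidden)
target = np.array([1.0, 2.0])           # learn y = x1 + 2*x2 from a stream

# Initialization phase: ridge-regularized least squares on an initial chunk.
X0 = rng.uniform(-1, 1, size=(60, n_in))
H0 = hidden(X0, W, b)
P = np.linalg.inv(H0.T @ H0 + 1e-3 * np.eye(n_hidden))
beta = P @ H0.T @ (X0 @ target)[:, None]

# Sequential phase: recursive least-squares update, one chunk at a time.
for _ in range(20):
    X = rng.uniform(-1, 1, size=(30, n_in))
    T = (X @ target)[:, None]
    H = hidden(X, W, b)
    S = np.linalg.inv(np.eye(len(H)) + H @ P @ H.T)
    P = P - P @ H.T @ S @ H @ P
    beta = beta + P @ H.T @ (T - H @ beta)

Xt = rng.uniform(-1, 1, size=(200, n_in))
err = np.abs(hidden(Xt, W, b) @ beta - (Xt @ target)[:, None]).mean()
```

Only the small output-weight matrix is updated per chunk, which is what makes this family of methods cheap enough for streams.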
4. The Proposed Approach
4.1. High Confidence Prediction Conflict-Based Concept Drift Detection
Algorithm 2: Confidence Region Prediction Conflict Concept Drift Detection
Input: set of prototypes, confidence threshold, control chart method (e.g., Page–Hinkley Test (PHT)), partially labeled data chunk
Output: none

1: function drift_detect():
2:   error = 0
3:   for each datapoint in the chunk:
4:     if the datapoint falls within the confidence region of the prototypes:
5:       avg_confidence = calculate_confidence of the prediction (via the confidence equation)
6:       if avg_confidence indicates a prediction conflict against the confidence threshold:
7:         error += 1
8:   control_chart.update(error)
9: end function
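The control chart in Algorithm 2 can be any sequential change detector; a compact version of the Page–Hinkley Test named above is shown here. The `delta` (tolerated magnitude) and `threshold` (alarm level) values are illustrative defaults, not the paper's settings.

```python
class PageHinkley:
    """Page-Hinkley Test for detecting an upward shift in a monitored
    statistic (here, the confident-prediction error of Algorithm 2)."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level
        self.n = 0
        self.mean = 0.0             # running mean of the statistic
        self.cum = 0.0              # cumulative deviation m_t
        self.min_cum = 0.0          # running minimum of m_t

    def update(self, x):
        """Feed one value; return True when a drift alarm fires."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold

# Usage: a stable low-error regime followed by a jump in error.
pht = PageHinkley()
stream = [0.05] * 200 + [0.9] * 200
alarms = [i for i, e in enumerate(stream) if pht.update(e)]
```

The alarm fires only a few observations after the change point at index 200, because the cumulative deviation rises steeply once the monitored error jumps.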
4.2. Concept Drift Adaptation with Knowledge Transfer
Algorithm 3: Trustworthiness of Classifiers in the Ensemble
Input: data, set of classifiers in the ensemble, age of classifiers
Output: trustworthiness of classifiers

1: function calculate_trustworthiness():
2:   for each classifier in the ensemble:
3:     collect the set of data on which the classifier disagrees with the majority vote
4:     estimate its error as the proportion of disagreement in the data
5:     use Equation (19) to calculate the trustworthiness of the classifier
6:   return the trustworthiness of all classifiers
7: end function
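The label-free error estimate in Algorithm 3 (each member's disagreement with the majority vote) can be computed as below; the final mapping from this error to a trustworthiness score via Equation (19) is not reproduced here.

```python
from collections import Counter

def disagreement_rates(predictions):
    """Per-member estimated error as the rate of disagreement with the
    ensemble's majority vote, computed without any true labels.
    predictions[m][i] is member m's prediction for sample i."""
    n = len(predictions[0])
    majority = [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]
    return [sum(p != v for p, v in zip(member, majority)) / n
            for member in predictions]

# Three members voting on four samples: the third member is the outlier.
rates = disagreement_rates([[1, 1, 0, 0],
                            [1, 1, 0, 1],
                            [0, 0, 0, 0]])  # → [0.0, 0.25, 0.5]
```

Members that often diverge from the ensemble consensus receive a high estimated error, which is what lets the ensemble be pruned without labels.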
Algorithm 4: Online Bagging
Input: data, classifier, diversity parameter λ, training method to update the classifier
Output: updated classifier

1: function train_via_online_bagging():
2:   foreach datapoint:
3:     k ← Poisson(λ)  // sample from the Poisson distribution
4:     repeat k times:
5:       train the classifier on the datapoint
6:   return the updated classifier
7: end function
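Algorithm 4 is Oza's Online Bagging: each incoming example is shown to the base classifier k ~ Poisson(λ) times, which approximates bootstrap resampling in a single pass. A stdlib-only sketch follows; the `partial_fit(x, y)` method is an assumed incremental-training interface, not an API from the paper.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method for sampling from Poisson(lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def online_bagging_update(models, x, y, lam=1.0, rng=random):
    """One Online Bagging step: each base model sees the new example
    k ~ Poisson(lam) times, approximating a bootstrap sample online."""
    for model in models:
        for _ in range(poisson(lam, rng)):
            model.partial_fit(x, y)
    return models

# Usage with a stand-in "model" that just counts how often it is trained.
class CountingModel:
    def __init__(self):
        self.updates = 0
    def partial_fit(self, x, y):
        self.updates += 1

rng = random.Random(0)
models = [CountingModel() for _ in range(3)]
for i in range(1000):
    online_bagging_update(models, [i], 0, lam=1.0, rng=rng)
```

Each model is trained roughly λ times per example in expectation, so different members see effectively different bootstrap replicates of the stream, which is the source of ensemble diversity.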
Algorithm 5: An Adaptive Prototype-Based Manifold Regularization Approach for Data Stream Mining
Input: data stream, control chart method (e.g., Page–Hinkley Test), ensemble set, minimum ensemble size

1: while True:
2:   // Sequential learning
3:   obtain a partially labeled data chunk
4:   process the chunk using Algorithm 1 to create or update the prototype set
5:   update the SOS-ELMs via the manifold regularization method
6:   update the control chart statistics via Algorithm 2
7:   if the control chart signals a change:  // if concept drift is detected
8:     mode = 'drift'
9:     create a new classifier and add it to the ensemble
10:  if mode == 'drift':
11:    calculate the trustworthiness of the classifiers using Equation (19)
12:    remove the classifier with the lowest trustworthiness value
13:    if the ensemble has shrunk to the minimum ensemble size:
14:      mode = 'normal'
15: end loop
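The drift-mode ensemble shrinking of Algorithm 5 can be condensed as follows. Note that Algorithm 5 removes one classifier per chunk while in drift mode, whereas this sketch loops until the minimum size is reached; `ensemble` and `trust` are hypothetical parallel lists standing in for the ensemble set and its Equation (19) scores.

```python
def prune_until_stable(ensemble, trust, min_size):
    """Repeatedly drop the least trustworthy classifier until the
    ensemble reaches min_size (condensed drift-mode behavior)."""
    while len(ensemble) > min_size:
        worst = min(range(len(trust)), key=trust.__getitem__)
        ensemble.pop(worst)
        trust.pop(worst)
    return ensemble, trust

# Usage: classifiers 'b' and 'c' have the lowest trust and are pruned.
ensemble = ['a', 'b', 'c', 'd']
trust = [0.9, 0.1, 0.5, 0.7]
prune_until_stable(ensemble, trust, min_size=2)
```

Pruning by trustworthiness rather than by age lets classifiers trained on a still-relevant old concept survive, which is how the knowledge transfer across drifts happens.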
5. Experimental Setup
- STAGGER concepts. STAGGER is an abrupt drift dataset that switches among three concepts by switching among three labeling rules. STAGGER has three Boolean features, i.e., each is either 0 or 1.
- Sine. Sine is also an abrupt drift dataset but has four different concepts. Its decision boundary resembles a sine wave, making it suitable for evaluating performance on nonlinear decision boundaries.
- Abrupt drift adaptation evaluation. This experiment evaluates the performance of this approach against the other baselines on abrupt drift datasets (STAGGER and Sine), where the concept changes rapidly, to investigate whether this approach can detect changes to the concept and adapt to them quickly.
- Mixed magnitude drifts. This experiment is more challenging than the abrupt drift evaluation as it contains a mixture of high- and low-severity drifts. The mixed severity makes it important to tailor the adaptation method to each drift's severity, so this experiment evaluates the performance of the proposed transfer learning adaptation approach.
- Overall performance ranking evaluation. This evaluation aggregates the performances across all experiments and analyzes the overall ranking. It aims to determine whether the proposed approach significantly improves on the comparable manifold regularization approaches and whether it provides a reasonable alternative to the fully supervised approaches. The expectation is a significant improvement over the manifold regularization approaches, with no significant difference in performance relative to the supervised approaches.
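An abrupt-drift stream in the STAGGER style can be generated in a few lines. The three labeling rules below over Boolean features are illustrative stand-ins, not the exact published STAGGER concepts; the generator's shape (fixed features, switched labeling rule) is what matters.

```python
import random

# Three illustrative labeling rules over Boolean features (f0, f1, f2);
# hypothetical stand-ins for STAGGER's concepts.
CONCEPTS = [
    lambda f: f[0] and f[1],
    lambda f: f[0] or f[2],
    lambda f: f[1] != f[2],
]

def stagger_like_stream(n_per_concept, seed=0):
    """Yield (features, label) pairs with an abrupt drift between concepts:
    the feature distribution is fixed, only the labeling rule switches."""
    rng = random.Random(seed)
    for concept in CONCEPTS:
        for _ in range(n_per_concept):
            f = tuple(rng.random() < 0.5 for _ in range(3))
            yield f, int(concept(f))

stream = list(stagger_like_stream(100))
```

Because only the labeling rule changes at the drift points, such streams exercise real (decision-boundary) drift rather than mere covariate shift, which is exactly the case unsupervised distribution monitoring misses.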
6. Results
6.1. Abrupt Drift Evaluation
6.2. Mixed Drift Magnitude Evaluation
6.3. Real-World Dataset Performance Evaluation
6.4. Execution Time Analysis
7. Discussion
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
- (1) Sensorless Drive Diagnosis data set, 2015. UCI Machine Learning Repository: Data set for Sensorless Drive Diagnosis Data Set;
- (2) MAGIC Gamma Telescope data set, 2007. UCI Machine Learning Repository: MAGIC Gamma Telescope Data Set;
- (3) Human Activity Recognition (HAR) data set, 2012. UCI Machine Learning Repository: Human Activity Recognition Using Smartphones’ Data Set;
- (4) Crop Mapping Using Fused Optical Radar data set, 2020. UCI Machine Learning Repository: Crop mapping using fused optical-radar data set;
- (5) Knowledge Discovery and Data Mining Competition (KDDCup) data set, 1999. KDD Cup 1999 Data (uci.edu);
- (6) Physical Activity Monitoring (PAMAP2) data set, 2012. UCI Machine Learning Repository: PAMAP2 Physical Activity Monitoring Data Set;
Acknowledgments
Conflicts of Interest
References
- Aljaaf, A.J.; Al-Jumeily, D.; Hussain, A.J.; Dawson, T.; Fergus, P.; Al-Jumaily, M. Predicting the likelihood of heart failure with a multi level risk assessment using decision tree. In Proceedings of the 2015 Third International Conference on Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE), Beirut, Lebanon, 29 April–1 May 2015.
- Li, J.; Stones, R.J.; Wang, G.; Liu, X.; Li, Z.; Xu, M. Hard drive failure prediction using Decision Trees. Reliab. Eng. Syst. Saf. 2017, 164, 55–65.
- Ko, Y.H.; Hsu, P.Y.; Cheng, M.S.; Jheng, Y.R.; Luo, Z.C. Customer Retention Prediction with CNN. In Data Mining and Big Data; Springer Singapore: Singapore, 2019.
- De Caigny, A.; Coussement, K.; De Bock, K.W.; Lessmann, S. Incorporating textual information in customer churn prediction models based on a convolutional neural network. Int. J. Forecast. 2020, 36, 1563–1578.
- De Francisci Morales, G.; Bifet, A.; Khan, L.; Gama, J.; Fan, W. IoT big data stream mining. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2016.
- Krempl, G.; Žliobaite, I.; Brzeziński, D.; Hüllermeier, E.; Last, M.; Lemaire, V.; Noack, T.; Shaker, A.; Sievi, S.; Spiliopoulou, M.; et al. Open challenges for data stream mining research. ACM SIGKDD Explor. Newsl. 2014, 16, 1–10.
- Mala, A.; Dhanaseelan, F.R. Data stream mining algorithms: A review of issues and existing approaches. Int. J. Comput. Sci. Eng. 2011, 3, 2726–2732.
- Homayoun, S.; Ahmadzadeh, M. A review on data stream classification approaches. J. Adv. Comput. Sci. Technol. 2016, 5, 8–13.
- Alothali, E.; Alashwal, H.; Harous, S. Data stream mining techniques: A review. Telkomnika 2019, 17, 728–737.
- Iwashita, A.S.; Papa, J. An Overview on Concept Drift Learning. IEEE Access 2019, 7, 1532–1547.
- Agrahari, S.; Singh, A. Concept drift detection in data stream mining: A literature review. J. King Saud Univ. Comput. Inf. Sci. 2021, 34, 9523–9540.
- Gaber, M.M.; Zaslavsky, A.; Krishnaswamy, S. Mining data streams: A review. ACM Sigmod Rec. 2005, 34, 18–26.
- Huang, G.B.; Liang, N.Y.; Rong, H.J.; Saratchandran, P.; Sundararajan, N. On-Line Sequential Extreme Learning Machine. Comput. Intell. 2005, 2005, 232–237.
- Oza, N.C. Online bagging and boosting. In Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics, Waikoloa, HI, USA, 12 October 2005.
- Lu, J.; Liu, A.; Dong, F.; Gu, F.; Gama, J.; Zhang, G. Learning under concept drift: A review. IEEE Trans. Knowl. Data Eng. 2018, 31, 2346–2363.
- Khamassi, I.; Sayed-Mouchaweh, M.; Hammami, M.; Ghédira, K. Discussion and review on evolving data streams and concept drift adapting. Evol. Syst. 2018, 9, 1–23.
- Barros, R.S.M.; Santos, S.G.T.C. A large-scale comparison of concept drift detectors. Inf. Sci. 2018, 451, 348–370.
- Gama, J.; Sebastiao, R.; Rodrigues, P. On evaluating stream learning algorithms. Mach. Learn. 2013, 90, 317–346.
- Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A. A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 2014, 46, 1–37.
- Wares, S.; Isaacs, J.; Elyan, E. Data stream mining: Methods and challenges for handling concept drift. SN Appl. Sci. 2019, 1, 1412.
- Ross, G.J.; Adams, N.M.; Tasoulis, D.K.; Hand, D.J. Exponentially weighted moving average charts for detecting concept drift. Pattern Recognit. Lett. 2012, 33, 191–198.
- Page, E.S. Continuous inspection schemes. Biometrika 1954, 41, 100–115.
- Frias-Blanco, I.; del Campo-Ávila, J.; Ramos-Jimenez, G.; Morales-Bueno, R.; Ortiz-Diaz, A.; Caballero-Mota, Y. Online and non-parametric drift detection methods based on Hoeffding’s bounds. IEEE Trans. Knowl. Data Eng. 2014, 27, 810–823.
- Nishida, K.; Yamauchi, K. Detecting concept drift using statistical testing. In Proceedings of the International Conference on Discovery Science, Sendai, Japan, 1–4 October 2007; Springer: Berlin/Heidelberg, Germany, 2007.
- Minku, L.L.; Yao, X. DDD: A new ensemble approach for dealing with concept drift. IEEE Trans. Knowl. Data Eng. 2011, 24, 619–633.
- Liu, A.; Zhang, G.; Lu, J. Fuzzy time windowing for gradual concept drift adaptation. In Proceedings of the 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Naples, Italy, 9–12 July 2017.
- Webb, G.I.; Hyde, R.; Cao, H.; Nguyen, H.L.; Petitjean, F. Characterizing concept drift. Data Min. Knowl. Discov. 2016, 30, 964–994.
- Shen, Y.; Du, J.; Tong, J.; Dou, Q.; Jing, L. A parallel and reverse Learn++.NSE classification algorithm. IEEE Access 2020, 8, 64157–64168.
- Chen, Y.; Zhu, Y.; Chen, H.; Shen, Y.; Xu, Z. A Pruning Optimized Fast Learn++ NSE Algorithm. IEEE Access 2021, 9, 150733–150743.
- Hu, H.; Kantardzic, M.; Sethi, T.S. No Free Lunch Theorem for concept drift detection in streaming data classification: A review. WIREs Data Min. Knowl. Discov. 2020, 10, e1327.
- Dasu, T.; Krishnan, S.; Venkatasubramanian, S.; Yi, K. An information-theoretic approach to detecting changes in multi-dimensional data streams. In Proceedings of the Symposium on the Interface of Statistics, Computing Science, and Applications, Pasadena, CA, USA, 24–27 May 2006.
- Kuncheva, L.I.; Faithfull, W.J. PCA feature extraction for change detection in multidimensional unlabeled data. IEEE Trans. Neural Netw. Learn. Syst. 2013, 25, 69–80.
- Moreno-Torres, J.G.; Raeder, T.; Alaiz-Rodríguez, R.; Chawla, N.V.; Herrera, F. A unifying view on dataset shift in classification. Pattern Recognit. 2012, 45, 521–530.
- Gemaque, R.N.; Costa, A.F.J.; Giusti, R.; Dos Santos, E.M. An overview of unsupervised drift detection methods. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2020, 10, e1381.
- Domingos, P.; Hulten, G. Mining high-speed data streams. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA, 20–23 August 2000.
- Oza, N.C.; Russell, S. Online Ensemble Learning; University of California: Berkeley, CA, USA, 2001.
- Bifet, A.; Zhang, J.; Fan, W.; He, C.; Zhang, J.; Qian, J.; Holmes, G.; Pfahringer, B. Extremely fast decision tree mining for evolving data streams. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017.
- Wang, H.; Fan, W.; Yu, P.S.; Han, J. Mining concept-drifting data streams using ensemble classifiers. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–27 August 2003; Association for Computing Machinery: Washington, DC, USA, 2003; pp. 226–235.
- Brzeziński, D.; Stefanowski, J. Accuracy updated ensemble for data streams with concept drift. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems, Wroclaw, Poland, 23–25 May 2011; Springer: Berlin/Heidelberg, Germany, 2011.
- Brzezinski, D.; Stefanowski, J. Reacting to different types of concept drift: The accuracy updated ensemble algorithm. IEEE Trans. Neural Netw. Learn. Syst. 2013, 25, 81–94.
- McCloskey, M.; Cohen, N.J. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation; Elsevier: Amsterdam, The Netherlands, 1989; pp. 109–165.
- French, R.M. Catastrophic forgetting in connectionist networks. Trends Cogn. Sci. 1999, 3, 128–135.
- Liu, J.; Li, X.; Zhong, W. Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognit. Lett. 2009, 30, 1347–1355.
- Bifet, A.; Gavaldà, R. Adaptive learning from evolving data streams. In Proceedings of the International Symposium on Intelligent Data Analysis, Lyon, France, 31 August–2 September 2009; Springer: Berlin/Heidelberg, Germany, 2009.
- Gomes, H.M.; Bifet, A.; Read, J.; Barddal, J.P.; Enembreck, F.; Pfharinger, B.; Holmes, G.; Abdessalem, T. Adaptive random forests for evolving data stream classification. Mach. Learn. 2017, 106, 1469–1495.
- Lughofer, E.; Angelov, P. Handling drifts and shifts in on-line data streams with evolving fuzzy systems. Appl. Soft Comput. 2011, 11, 2057–2068.
- Lughofer, E.; Pratama, M.; Skrjanc, I. Incremental rule splitting in generalized evolving fuzzy systems for autonomous drift compensation. IEEE Trans. Fuzzy Syst. 2017, 26, 1854–1865.
- Pratama, M.; Lu, J.; Lughofer, E.; Zhang, G.; Er, M.J. An incremental learning of concept drifts using evolving type-2 recurrent fuzzy neural networks. IEEE Trans. Fuzzy Syst. 2016, 25, 1175–1192.
- Lughofer, E.; Pratama, M.; Škrjanc, I. Online bagging of evolving fuzzy systems. Inf. Sci. 2021, 570, 16–33.
- Zhu, X.J. Semi-Supervised Learning Literature Survey; University of Wisconsin: Madison, WI, USA, 2005.
- Chapelle, O.; Scholkopf, B.; Zien, A. Semi-supervised learning. IEEE Trans. Neural Netw. 2009, 20, 542.
- Belkin, M.; Niyogi, P.; Sindhwani, V. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 2006, 7, 2399–2434.
- Moh, Y.; Buhmann, J.M. Manifold regularization for semi-supervised sequential learning. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009.
- Jia, X.; Wang, R.; Liu, J.; Powers, D.M. A semi-supervised online sequential extreme learning machine method. Neurocomputing 2016, 174, 168–178.
- Da Silva, C.A.; Krohling, R.A. Semi-Supervised Online Elastic Extreme Learning Machine for Data Classification. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018.
- Kamiya, Y.; Ishii, T.; Furao, S.; Hasegawa, O. An online semi-supervised clustering algorithm based on a self-organizing incremental neural network. In Proceedings of the 2007 International Joint Conference on Neural Networks, Orlando, FL, USA, 12–17 August 2007.
- Furao, S.; Ogura, T.; Hasegawa, O. An enhanced self-organizing incremental neural network for online unsupervised learning. Neural Netw. 2007, 20, 893–903.
- Chong, Y.; Ding, Y.; Yan, Q.; Pan, S. Graph-based semi-supervised learning: A review. Neurocomputing 2020, 408, 216–230.
- Song, Z.; Yang, X.; Xu, Z.; King, I. Graph-based semi-supervised learning: A comprehensive review. IEEE Trans. Neural Netw. Learn. Syst. 2022, in press.
- Zhou, K.; Martin, A.; Pan, Q.; Liu, Z. SELP: Semi-supervised evidential label propagation algorithm for graph data clustering. Int. J. Approx. Reason. 2018, 92, 139–154.
- Wada, Y.; Su, S.; Kumagai, W.; Kanamori, T. Robust Label Prediction via Label Propagation and Geodesic k-Nearest Neighbor in Online Semi-Supervised Learning. IEICE Trans. Inf. Syst. 2019, 102, 1537–1545.
- Iscen, A.; Tolias, G.; Avrithis, Y.; Chum, O. Label propagation for deep semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
- Kejani, M.T.; Dornaika, F.; Talebi, H. Graph Convolution Networks with manifold regularization for semi-supervised learning. Neural Netw. 2020, 127, 160–167.
- Liu, W.; Fu, S.; Zhou, Y.; Zha, Z.J.; Nie, L. Human activity recognition by manifold regularization based dynamic graph convolutional networks. Neurocomputing 2021, 444, 217–225.
- Din, S.U.; Shao, J.; Kumar, J.; Ali, W.; Liu, J.; Ye, Y. Online reliable semi-supervised learning on evolving data streams. Inf. Sci. 2020, 525, 153–171.
- Casalino, G.; Castellano, G.; Mencar, C. Data stream classification by dynamic incremental semi-supervised fuzzy clustering. Int. J. Artif. Intell. Tools 2019, 28, 1960009.
- Murilo Gomes, H.; Grzenda, M.; Mello, R.; Read, J.; Huong Le Nguyen, M.; Bifet, A. A survey on semi-supervised learning for delayed partially labelled data streams. ACM Comput. Surv. (CSUR) 2022, 55, 75.
- Casalino, G.; Castellano, G.; Mencar, C. Incremental adaptive semi-supervised fuzzy clustering for data stream classification. In Proceedings of the 2018 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), Rhodes, Greece, 25–27 May 2018.
- Roberts, S. Control chart tests based on geometric moving averages. Technometrics 2000, 42, 97–101.
- Hoeffding, W. Probability Inequalities for Sums of Bounded Random Variables. J. Am. Stat. Assoc. 1963, 58, 13–30.
- Baena-Garcıa, M.; del Campo-Ávila, J.; Fidalgo, R.; Bifet, A.; Gavalda, R.; Morales-Bueno, R. Early drift detection method. In Proceedings of the Fourth International Workshop on Knowledge Discovery from Data Streams, Philadelphia, PA, USA, 20 August 2006.
- Bifet, A.; Gavalda, R. Learning from time-changing data with adaptive windowing. In Proceedings of the 2007 SIAM International Conference on Data Mining, Minneapolis, MN, USA; SIAM: Philadelphia, PA, USA, 2007.
- Aminikhanghahi, S.; Cook, D.J. A survey of methods for time series change point detection. Knowl. Inf. Syst. 2017, 51, 339–367.
- Hao, C. Sequential change-point detection based on nearest neighbors. Ann. Stat. 2019, 47, 1381–1407.
- Fearnhead, P.; Rigaill, G. Changepoint Detection in the Presence of Outliers. J. Am. Stat. Assoc. 2019, 114, 169–183.
- Ferrari, A.; Richard, C.; Bourrier, A.; Bouchikhi, I. Online change-point detection with kernels. Pattern Recognit. 2023, 133, 109022.
- Lughofer, E.; Weigl, E.; Heidl, W.; Eitzinger, C.; Radauer, T. Recognizing input space and target concept drifts in data streams with scarcely labeled and unlabelled instances. Inf. Sci. 2016, 355, 127–151.
- Nikzad-Langerodi, R.; Lughofer, E.; Cernuda, C.; Reischer, T.; Kantner, W.; Pawliczek, M.; Brandstetter, M. Calibration model maintenance in melamine resin production: Integrating drift detection, smart sample selection and model adaptation. Anal. Chim. Acta 2018, 1013, 1–12.
- Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
- Huang, G.; Song, S.; Gupta, J.N.; Wu, C. Semi-supervised and unsupervised extreme learning machines. IEEE Trans. Cybern. 2014, 44, 2405–2417.
- Platanios, E.A.; Blum, A.; Mitchell, T.M. Estimating Accuracy from Unlabeled Data. UAI 2014, 14, 10.
- Yang, L.; Yang, S.; Li, S.; Liu, Z.; Jiao, L. Incremental laplacian regularization extreme learning machine for online learning. Appl. Soft Comput. 2017, 59, 546–555.
- Da Silva, C.A.; Krohling, R.A. Semi-Supervised Online Elastic Extreme Learning Machine with Forgetting Parameter to deal with concept drift in data streams. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019.
- Gomes, H.M.; Read, J.; Bifet, A. Streaming Random Patches for Evolving Data Stream Classification. In Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 8–11 November 2019.
- Montiel, J.; Read, J.; Bifet, A.; Abdessalem, T. Scikit-multiflow: A multi-output streaming framework. J. Mach. Learn. Res. 2018, 19, 1–5.
Symbol | Description |
---|---|
| Dataset |
| Datapoint |
| Set of nodes |
| Nearest node to the th winner node |
| Number of times node has been selected as the winner |
| Set of edges in the SOINN |
| Edge between nodes and |
| Distance threshold for node insertion |
| Age of the edge between nodes and |
| Set of neighbors for node |
| Number of standard deviations above the mean for node deletion when a node has only one neighbor. |
| Number of standard deviations above the mean for node deletion when a node has two neighbors. |
| Graph Laplacian matrix |
| Degree matrix |
| Similarity matrix |
| Prediction matrix |
| Trainable parameters for SOS-ELM |
| Input-hidden output matrix for SOS-ELM |
| Regularization term matrix |
| Activation function |
| Kalman gain |
| Labeled–unlabeled importance tradeoff |
| Radial Basis Function (RBF) kernel width parameter |
Algorithm | Reference | Description |
---|---|---|
Semisupervised Online Sequential—Extreme Learning Machine (SOS-ELM) | [54] | A version of Semisupervised—Extreme Learning Machine (SS-ELM) that learns sequentially. |
Incremental Laplacian Regularized—Extreme Learning Machine (ILR-ELM) | [82] | An improvement on SOS-ELM that can be updated without requiring each chunk to contain labeled data. |
Semisupervised Online Elastic—Extreme Learning Machine (SSOE-ELM) | [83] | An adaptable version of SOS-ELM that can adapt to drift by adding a forgetting factor to remove old concepts from the model. |
OzaBagAdwin | [14] | A derivation of the Online Bagging and Boosting classifier that can handle nonstationary data streams by attaching the ADWIN drift detector to the model. |
Adaptive Random Forest (ARF) | [45] | Introduces diversity to the decision tree ensemble by selecting only a subset of features for each decision tree and assigning a drift detector per decision tree for adaptability to concept drifts. |
Leveraging Bagging Classifier (LVB) | [84] | An ensemble version of the OzaBag classifier that also adds the ADWIN concept drift detector to the model for each decision tree. |
Dataset | Equation | Fixed Variables | Before/After Variables | Drift Severity | Change Percent (%) |
---|---|---|---|---|---|
Circle | | | | Low | 16%
| | | | Medium | 38%
| | | | High | 66%
SineV | | | | Low | 15%
| | | | Medium | 45%
| | | | High | 75%
SineH | | | | Low | 36%
| | | | Medium | 57%
| | | | High | 80%
Plane | | | | Low | 14%
| | | | Medium | 44%
| | | | High | 74%
Dataset | #Samples | #Features | #Classes | #Training Samples | #Test Samples | #Training Steps | #Labeled Samples | Data Characteristics | Imbalance Ratio |
---|---|---|---|---|---|---|---|---|---|
Sensorless Drive | 58,509 | 49 | 11 | 46,807 | 11,701 | 469 | 2340 | High Dimensional | - |
Magic Gamma | 19,020 | 11 | 2 | 15,216 | 3804 | 152 | 760 | Noisy | 46:54
Human Activity Recognition (HAR) | 10,299 | 561 | 6 | 5881 | 4418 | 58 | 290 | Noisy/High Dimensional | 1:99 |
Crop Mapping | 325,834 | 175 | 7 | 60,000 | 65,167 | 600 | 3000 | High Dimensional | 1:99 |
KDDCup 1999 | 494,021 | 42 | 12 | 50,000 | 98,805 | 500 | 2500 | Noisy/High Dimensional | 1:99 |
Physical Activity Monitoring (PAMAP2) | 1,942,872 | 52 | 12 | 50,000 | 388,575 | 500 | 2500 | Noisy/High Dimensional | 8:92 |
Dataset | Label Size | This Approach | SOS-ELM | SSOE-ELM | ILR-ELM | OzaBag Adwin | ARF | LVB
---|---|---|---|---|---|---|---|---|
Stagger | 1 | 0.8556 | 0.6195 | 0.6937 | 0.735 | 0.9569 | 0.9828 | 0.9737 |
5 | 0.9653 | 0.6756 | 0.6818 | 0.809 | 0.9569 | 0.9828 | 0.9737 | |
10 | 0.9566 | 0.6991 | 0.6822 | 0.8298 | 0.9569 | 0.9828 | 0.9737 | |
Average | 0.93 | 0.66 | 0.69 | 0.79 | 0.96 | 0.98 | 0.97 | |
Average Rank | 3.67 | 6.67 | 6.33 | 5.00 | 3.33 | 1.00 | 2.00 | |
Sine | 1 | 0.428 | 0.5394 | 0.553 | 0.5135 | 0.8827 | 0.9433 | 0.9309 |
5 | 0.7905 | 0.5318 | 0.5168 | 0.5293 | 0.8827 | 0.9433 | 0.9309 | |
10 | 0.8653 | 0.5192 | 0.5207 | 0.526 | 0.8827 | 0.9433 | 0.9309 | |
Average | 0.69 | 0.53 | 0.53 | 0.52 | 0.88 | 0.94 | 0.93 | |
Average Rank | 5.00 | 5.67 | 5.67 | 5.67 | 3.00 | 1.00 | 2.00 | |
Hyperplane0.001 | 1 | 0.5845 | 0.6809 | 0.6121 | 0.6657 | 0.7649 | 0.776 | 0.6628 |
5 | 0.783 | 0.753 | 0.6071 | 0.7128 | 0.7649 | 0.776 | 0.6628 | |
10 | 0.7875 | 0.753 | 0.6071 | 0.7128 | 0.7649 | 0.776 | 0.6628 | |
Average | 0.72 | 0.73 | 0.61 | 0.70 | 0.76 | 0.78 | 0.66 | |
Average Rank | 3.00 | 3.67 | 6.67 | 4.67 | 2.67 | 1.67 | 5.67 | |
Hyperplane0.01 | 1 | 0.6781 | 0.7005 | 0.7005 | 0.6667 | 0.7692 | 0.7901 | 0.6656 |
5 | 0.7799 | 0.763 | 0.763 | 0.7088 | 0.7692 | 0.7901 | 0.6655 | |
10 | 0.7792 | 0.7747 | 0.7747 | 0.731 | 0.7692 | 0.7901 | 0.6657 | |
Average | 0.75 | 0.75 | 0.75 | 0.70 | 0.77 | 0.79 | 0.67 | |
Average Rank | 3.00 | 3.83 | 3.83 | 6.00 | 3.33 | 1.00 | 7.00 | |
HAR | 1 | 0.7303 | 0.7506 | 0.7381 | 0.7428 | 0.8675 | 0.7091 | 0.8023 |
5 | 0.8621 | 0.8437 | 0.8376 | 0.8596 | 0.8675 | 0.6862 | 0.8023 | |
10 | 0.8621 | 0.8437 | 0.8683 | 0.8597 | 0.8675 | 0.6862 | 0.8023 | |
Average | 0.82 | 0.81 | 0.81 | 0.82 | 0.87 | 0.69 | 0.80 | |
Average Rank | 3.67 | 4.00 | 3.67 | 3.67 | 1.33 | 7.00 | 4.67 | |
Magic Gamma | 1 | 0.7389 | 0.7393 | 0.7261 | 0.6892 | 0.2452 | 0.8363 | 0.7375 |
5 | 0.7928 | 0.79 | 0.7022 | 0.7458 | 0.2452 | 0.8363 | 0.7375 | |
10 | 0.8085 | 0.7948 | 0.6715 | 0.7531 | 0.2452 | 0.8363 | 0.7375 | |
Average | 0.78 | 0.77 | 0.70 | 0.73 | 0.25 | 0.84 | 0.74 | |
Average Rank | 2.33 | 2.67 | 5.67 | 4.67 | 7.00 | 1.00 | 4.67 | |
PAMAP2 | 1 | 0.5643 | 0.4682 | 0.4568 | 0.4633 | 0.7027 | 0.8226 | 0.5817 |
5 | 0.7216 | 0.727 | 0.6733 | 0.7037 | 0.7027 | 0.8226 | 0.5817 | |
10 | 0.798 | 0.7936 | 0.7789 | 0.777 | 0.7027 | 0.8226 | 0.5817 | |
Average | 0.69 | 0.66 | 0.64 | 0.65 | 0.70 | 0.82 | 0.58 | |
Average Rank | 3.00 | 3.33 | 5.67 | 5.00 | 4.33 | 1.00 | 5.67 | |
Sensorless Drive | 1 | 0.6618 | 0.6687 | 0.5054 | 0.6928 | 0.8536 | 0.8905 | 0.6928 |
5 | 0.78 | 0.7814 | 0.7193 | 0.8092 | 0.8536 | 0.8905 | 0.6928 | |
10 | 0.8225 | 0.8185 | 0.8195 | 0.8481 | 0.8536 | 0.8905 | 0.6928 | |
Average | 0.75 | 0.76 | 0.68 | 0.78 | 0.85 | 0.89 | 0.69 | |
Average Rank | 5.00 | 5.00 | 6.00 | 3.17 | 2.00 | 1.00 | 5.83 | |
KDDCup1999 | 1 | 0.9926 | 0.9924 | 0.5054 | 0.9923 | 0.9881 | 0.9955 | 0.985 |
5 | 0.9926 | 0.9924 | 0.7193 | 0.9923 | 0.9881 | 0.9955 | 0.985 | |
10 | 0.9957 | 0.9833 | 0.8195 | 0.9848 | 0.9881 | 0.9955 | 0.985 | |
Average | 0.99 | 0.99 | 0.68 | 0.99 | 0.99 | 1.00 | 0.99 | |
Average Rank | 1.67 | 4.00 | 7.00 | 4.33 | 4.33 | 1.33 | 5.33 | |
Overall Average Accuracy | 0.7911 | 0.74 | 0.6767 | 0.7422 | 0.7811 | 0.8589 | 0.7811 | |
Overall Average Ranking | 3.37 | 4.31 | 5.61 | 4.69 | 3.48 | 1.78 | 4.76 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Muhammad Zaly Shah, M.Z.; Zainal, A.; Elfadil Eisa, T.A.; Albasheer, H.; Ghaleb, F.A. A Semisupervised Concept Drift Adaptation via Prototype-Based Manifold Regularization Approach with Knowledge Transfer. Mathematics 2023, 11, 355. https://doi.org/10.3390/math11020355