Kernel Matrix-Based Heuristic Multiple Kernel Learning
Abstract
1. Introduction
2. Background
2.1. Multiple Kernel
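- Linear: $k(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j$;
- Polynomial: $k(\mathbf{x}_i, \mathbf{x}_j) = (\gamma\, \mathbf{x}_i^T \mathbf{x}_j + c)^d$, with scale $\gamma > 0$, offset $c$, and degree $d$;
- Radial Basis Function (RBF): $k(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)$;
- Hyperbolic Tangent: $k(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\gamma\, \mathbf{x}_i^T \mathbf{x}_j + c)$.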
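As a rough illustration, the four kernels listed above can be evaluated on a small data matrix as follows. This is a minimal sketch that assumes the common LIBSVM-style parameterization (scale `gamma`, offset `coef0`, degree `degree`); the function names and parameter defaults are hypothetical and are not taken from the paper.

```python
import numpy as np

def linear_kernel(X, Y):
    # Inner product between every row of X and every row of Y.
    return X @ Y.T

def polynomial_kernel(X, Y, gamma=1.0, coef0=1.0, degree=2):
    # (gamma * <x, y> + coef0) ** degree; parameter names are assumptions.
    return (gamma * (X @ Y.T) + coef0) ** degree

def rbf_kernel(X, Y, gamma=1.0):
    # Squared Euclidean distances via the expansion |x|^2 + |y|^2 - 2<x, y>.
    sq_dist = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * (X @ Y.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def tanh_kernel(X, Y, gamma=1.0, coef0=0.0):
    # Hyperbolic tangent (sigmoid) kernel.
    return np.tanh(gamma * (X @ Y.T) + coef0)

# Three 2-D points used purely as toy data.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K_lin = linear_kernel(X, X)
K_poly = polynomial_kernel(X, X)
K_rbf = rbf_kernel(X, X, gamma=0.5)
K_tanh = tanh_kernel(X, X)
```

Each call returns an n-by-n kernel matrix for the n rows of `X`. Note that the hyperbolic tangent kernel is not positive semi-definite for every choice of its parameters, so a matrix built from it may need to be checked before it is combined with the others.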
2.2. MKL-SVM
2.3. MKL Optimization Approaches
2.3.1. MKLGLp
Algorithm 1: MKLGLp Classifier Training.
2.3.2. GAMKLp
2.3.3. DeFIMKLp
Algorithm 2: DeFIMKL Classifier Training.
Data: feature vector and label pairs; kernel matrices.
Result: lexicographically ordered g vector.
foreach kernel matrix do
    Compute the normalized SVM decision values.
end
Based on the normalized values and their respective target labels, formulate and solve a quadratic programming problem (see [24]) to obtain the free Choquet integral parameters (g).
Algorithm 3: DeFIMKL Classifier Testing.
1. Compute the normalized SVM decision values.
2. Apply the Choquet integral with respect to the learned g and these inputs.
3. Compute the class label.
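The Choquet integral step in Algorithm 3 can be sketched in a few lines. This is only an illustration of the standard discrete Choquet integral, not the authors' implementation: the measure g is stored here as a dictionary keyed by subsets of source indices (an assumed representation; the paper uses a lexicographically ordered vector), and the helper `make_measure` and the toy decision values are hypothetical.

```python
from itertools import combinations

def choquet_integral(h, g):
    """Discrete Choquet integral of inputs h with respect to measure g.

    h: list of m input values (e.g., normalized decision values).
    g: dict mapping frozenset of source indices -> measure value,
       with g of the full set equal to 1 (assumed representation).
    """
    # Visit the sources from largest input to smallest.
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    total, previous, subset = 0.0, 0.0, set()
    for i in order:
        subset.add(i)
        current = g[frozenset(subset)]
        # Weight each input by the measure gained when its source is added.
        total += h[i] * (current - previous)
        previous = current
    return total

def make_measure(m, value_fn):
    # Hypothetical helper: build g over all non-empty subsets of m sources.
    return {frozenset(c): value_fn(c)
            for r in range(1, m + 1)
            for c in combinations(range(m), r)}

# Toy decision values from three kernels (illustrative only).
h = [0.2, -0.6, 0.9]
g_mean = make_measure(3, lambda c: len(c) / 3)                  # behaves like the average
g_max = make_measure(3, lambda c: 1.0)                          # behaves like the maximum
g_min = make_measure(3, lambda c: 1.0 if len(c) == 3 else 0.0)  # behaves like the minimum

fused_mean = choquet_integral(h, g_mean)
fused_max = choquet_integral(h, g_max)
fused_min = choquet_integral(h, g_min)
```

The three measures show why the choice of g matters: depending on its values, the same operator reproduces the mean, the maximum, or the minimum of the per-kernel decision values, and a learned g can fall anywhere in between.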
2.4. Heuristic MK Approaches
3. Divergence Measures on Kernel Matrices
3.1. Key Factors for the Proposed Weight Assignments
3.2. Index 1 (Class Separation—Non-Normal Distribution): DiMKL
3.3. Index 2 (Class Separation—Normal Distribution): DiMKL
3.4. Index 3 (Class Separation—Euclidean of Overlap): DiMKL
3.5. Index 4 (Class Separation—Euclidean of Means): DiMKL
3.6. Index 5 (Class Separation—Bhattacharyya): DiMKL
4. Divergence Measures in the RKHS
5. Experiments
5.1. Feature Learning for Explosive Hazard Detection
5.2. Benchmark Datasets
5.3. Computational Complexity
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
2. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 27:1–27:27. Available online: http://www.csie.ntu.edu.tw/~cjlin/libsvm (accessed on 20 March 2018).
3. Das, S.; Abraham, A.; Konar, A. Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recognit. Lett. 2008, 29, 688–699.
4. Kim, D.W.; Lee, K.Y.; Lee, D.; Lee, K.H. Evaluation of the performance of clustering algorithms in kernel-induced feature space. Pattern Recognit. 2005, 38, 607–611.
5. Liao, L.; Lin, T.; Li, B. MRI brain image segmentation and bias field correction based on fast spatially constrained kernel clustering approach. Pattern Recognit. Lett. 2008, 29, 1580–1588.
6. Mika, S.; Schölkopf, B.; Smola, A.J.; Müller, K.R.; Scholz, M.; Rätsch, G. Kernel PCA and de-noising in feature spaces. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 29 November–4 December 1999; pp. 536–542.
7. Schölkopf, B.; Smola, A.; Müller, K.R. Kernel principal component analysis. In Proceedings of the International Conference on Artificial Neural Networks, Lausanne, Switzerland, 8–10 October 1997; Springer: Berlin/Heidelberg, Germany, 1997; pp. 583–588.
8. Kim, K.I.; Franz, M.O.; Scholkopf, B. Iterative kernel principal component analysis for image modeling. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1351–1366.
9. Price, S.R.; Anderson, D.T.; Havens, T.C. Fusion of iECO image descriptors for buried explosive hazard detection in forward-looking infrared imagery. Proc. SPIE 2015, 9454, 945405.
10. Price, S.R.; Murray, B.; Hu, L.; Anderson, D.T.; Havens, T.C.; Luke, R.H.; Keller, J.M. Multiple kernel based feature and decision level fusion of iECO individuals for explosive hazard detection in FLIR imagery. In Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XXI; SPIE Defense+ Security; International Society for Optics and Photonics: Bellingham, WA, USA, 2016; p. 98231G.
11. Pinar, A.J.; Rice, J.; Hu, L.; Anderson, D.T.; Havens, T.C. Efficient Multiple Kernel Classification Using Feature and Decision Level Fusion. IEEE Trans. Fuzzy Syst. 2017, 25, 1403–1416.
12. Schölkopf, B.; Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; MIT Press: Cambridge, MA, USA, 2002.
13. Schölkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New support vector algorithms. Neural Comput. 2000, 12, 1207–1245.
14. Hsu, C.W.; Chang, C.C.; Lin, C.J. A practical guide to support vector classification. 2003. Available online: https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf (accessed on 20 March 2018).
15. Price, S.R.; Anderson, D.T.; Luke, R.H. An improved evolution-constructed (iECO) features framework. In Proceedings of the 2014 IEEE Symposium on Computational Intelligence for Multimedia, Signal and Vision Processing (CIMSIVP), Orlando, FL, USA, 9–12 December 2014; pp. 1–8.
16. Lu, K.; Zhao, J.; Zhang, J.; Qin, C. Multiple Kernel Learning via Ensemble Artifice in Reproducing Kernel Hilbert Space. In Proceedings of the 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chongqing, China, 29–30 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 264–267.
17. Varma, M.; Babu, B.R. More generality in efficient multiple kernel learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 1065–1072.
18. Suzuki, T.; Tomioka, R. SpicyMKL: A fast algorithm for multiple kernel learning with thousands of kernels. Mach. Learn. 2011, 85, 77–108.
19. Han, Y.; Yang, Y.; Li, X.; Liu, Q.; Ma, Y. Matrix-Regularized Multiple Kernel Learning via (r,p) Norms. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 4997–5007.
20. Xu, L.; Luo, B.; Tang, Y.; Ma, X. An efficient multiple kernel learning in reproducing kernel Hilbert spaces (RKHS). Int. J. Wavelets Multiresolution Inf. Process. 2015, 13, 1550008.
21. Banerjee, S.; Das, S. Kernel selection using multiple kernel learning and domain adaptation in reproducing kernel hilbert space, for face recognition under surveillance scenario. arXiv 2016, arXiv:1610.00660.
22. Gönen, M.; Alpaydın, E. Multiple kernel learning algorithms. J. Mach. Learn. Res. 2011, 12, 2211–2268.
23. Xu, Z.; Jin, R.; Yang, H.; King, I.; Lyu, M.R. Simple and Efficient Multiple Kernel Learning by Group Lasso; Fürnkranz, J., Joachims, T., Eds.; ICML; Omnipress: Madison, WI, USA, 2010; pp. 1175–1182.
24. Pinar, A.; Havens, T.C.; Anderson, D.T.; Hu, L. Feature and decision level fusion using multiple kernel learning and fuzzy integrals. In Proceedings of the 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Istanbul, Turkey, 2–5 August 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–7.
25. de Diego, I.; Moguerza, J.; Munoz, A. Combining kernel information for support vector classification. In Proceedings of the International Workshop on Multiple Classifier Systems, Cagliari, Italy, 9–11 June 2004; pp. 102–111.
26. de Diego, I.M.; Muñoz, A.; Moguerza, J.M. Methods for the combination of kernel matrices within a support vector framework. Mach. Learn. 2010, 78, 137.
27. Moguerza, J.M.; Munoz, A.; de Diego, I.M. Improving Support Vector Classification via the Combination of Multiple Sources of Information; SSPR/SPR; Springer: Berlin/Heidelberg, Germany, 2004; pp. 592–600.
28. Zhou, S.K.; Chellappa, R. From sample similarity to ensemble similarity: Probabilistic distance measures in reproducing kernel hilbert space. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 917–929.
29. Edelman, S.; Intrator, N.; Poggio, T. Complex Cells and Object Recognition. 1997. Available online: https://shimon-edelman.github.io/Archive/nips97.pdf (accessed on 3 June 2016).
30. Lowe, D.G. Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; Volume 2, pp. 1150–1157.
31. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the International Conference on Computer Vision and Pattern Recognition, CVPR 2005, San Diego, CA, USA, 20–26 June 2005; IEEE Computer Society: Washington, DC, USA, 2005; Volume 1, pp. 886–893.
32. Frigui, H.; Gader, P. Detection and discrimination of land mines in ground-penetrating radar based on edge histogram descriptors and a possibilistic k-nearest neighbor classifier. IEEE Trans. Fuzzy Syst. 2009, 17, 185–199.
33. Stone, K.; Keller, J.; Anderson, D.; Barclay, D. An automatic detection system for buried explosive hazards in FL-LWIR and FL-GPR data. SPIE Def. Secur. Sens. 2012, 8357, 83571E.
34. Lichman, M. UCI Machine Learning Repository. 2013. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 14 October 2017).
Abbreviation | Definition | Abbreviation | Definition |
---|---|---|---|
IoT | Internet of things | GMKL | Generalized MKL |
SVM | Support vector machine | MRMKL | Matrix regularized MKL |
RBF | Radial basis function | HRK | H-reproducing kernel |
MKL | Multiple kernel learning | MFKL | Multi-feature kernel learning |
GAMKLp | $\ell_p$-norm genetic algorithm based MKL | MKLGLp | $\ell_p$-norm MKLGL |
FIFO | Feature-in-feature-out | DeFIMKL | Decision-level FIMKL |
FIMKLp | $\ell_p$-norm fuzzy integral MKL | IQR | Interquartile range |
DIDO | Decision-in-decision-out | DiMKL | Divergence-based MKL |
RKHS | Reproducing kernel Hilbert space | NAUC | Normalized area under the curve |
MKLGL | MKL based group lasso | iECO | improved evolution constructed |
PSD | Positive semi-definite | GA | Genetic algorithm |
LCS | Linear convex sum | HOG | Histogram of oriented gradients |
SKSVM | Single kernel SVM | SD | Statistical descriptor |
MKLEA | MKL via ensemble artifice | EHD | Edge histogram descriptor |
Lane | Number of Targets | Area (m²) | Metal Shallow | Metal Deep | Non-Metal Shallow | Non-Metal Deep |
---|---|---|---|---|---|---|
A | 44 | 3626.9 | 21 | 3 | 11 | 9 |
B | 50 | 4212.7 | 22 | 4 | 14 | 10 |
C | 79 | 3944.8 | 31 | 15 | 21 | 12 |
Learning Strategy | Weight Assignment | Fold-1 | Fold-2 | Fold-3 |
---|---|---|---|---|
Fixed Rule | DiMKL | 0.290 | 0.560 | 0.570 |
Heuristic: Proposed Metrics | DiMKL | 0.336 | 0.643 | 0.570 |
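Heuristic: Proposed Metrics | DiMKL | 0.335 | 0.633 | 0.611 |
Heuristic: Proposed Metrics | DiMKL | 0.333 | 0.633 | 0.598 |
Heuristic: Proposed Metrics | DiMKL | 0.335 | 0.616 | 0.617 |
Heuristic: Proposed Metrics | DiMKL | 0.336 | 0.633 | 0.565 |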
Optimization Function | MKLGL | 0.317 | 0.583 | 0.599 |
Learning Strategy | Weight Assignment | Fold-1 | Fold-2 | Fold-3 |
---|---|---|---|---|
Fixed Rule | Uniform | 0.328 | 0.608 | 0.610 |
Heuristic: Proposed Metrics | DiMKL | 0.290 | 0.571 | 0.570 |
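Heuristic: Proposed Metrics | DiMKL | 0.338 | 0.635 | 0.576 |
Heuristic: Proposed Metrics | DiMKL | 0.344 | 0.635 | 0.608 |
Heuristic: Proposed Metrics | DiMKL | 0.334 | 0.628 | 0.596 |
Heuristic: Proposed Metrics | DiMKL | 0.330 | 0.611 | 0.610 |
Heuristic: Proposed Metrics | DiMKL | 0.334 | 0.633 | 0.571 |
Optimization Function | MKLGL | 0.318 | 0.595 | 0.578 |
Optimization Function | DeFIMKL | 0.317 | 0.607 | 0.614 |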
Dataset | Instances | Features | Classes |
---|---|---|---|
Sonar | 208 | 60 | 2 |
Ionosphere | 351 | 34 | 2 |
Breast Cancer Wisconsin | 683 | 10 | 2 |
Learning Strategy | Method | Sonar | Ionosphere | Breast Cancer |
---|---|---|---|---|
SKSVM | Individual K | 58.39 (11.22) | 71.14 (6.01) | 96.16 (1.55) |
SKSVM | Individual K | 78.17 (6.36) | 92.14 (3.01) | 97.08 (1.29) |
SKSVM | Individual K | 84.63 (5.98) | 94.31 (2.46) | 96.66 (1.43) |
SKSVM | Individual K | 84.56 (5.83) | 94.27 (2.36) | 96.29 (1.44) |
SKSVM | Individual K | 82.15 (7.97) | 92.34 (2.74) | 95.64 (1.53) |
Heuristic: Proposed Metrics | DiMKL | 86.17 (5.54) | 94.07 (2.36) | 96.57 (1.46) |
Heuristic: Proposed Metrics | DiMKL | 81.68 (6.35) | 94.71 (2.39) | 97.13 (1.26) |
Heuristic: Proposed Metrics | DiMKL | 85.22 (5.77) | 94.70 (2.32) | 97.10 (1.27) |
Heuristic: Proposed Metrics | DiMKL | 85.17 (5.93) | 94.69 (2.33) | 97.09 (1.29) |
Heuristic: Proposed Metrics | DiMKL | 83.41 (6.53) | 94.57 (2.34) | 97.05 (1.26) |
Heuristic: Proposed Metrics | DiMKL | 84.75 (6.10) | 94.63 (2.41) | 97.09 (1.28) |
Optimization Strategies | DeFIMKL | 82.60 (7.89) | 93.01 (2.96) | 96.11 (1.63) |
Optimization Strategies | GAMKL | 85.60 (5.22) | 94.49 (2.49) | 97.06 (1.48) |
Optimization Strategies | MKLGL | 83.31 (7.98) | 94.08 (2.32) | 95.68 (1.49) |
Sonar | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | GAMKL | MKLGL | Ind Kernel Perf |
---|---|---|---|---|---|---|---|---|---|
Fused Performance | 92.68% | 87.19% |
Ionosphere | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | GAMKL | MKLGL | Ind Kernel Perf |
---|---|---|---|---|---|---|---|---|---|
Fused Performance | 96.42% | 94.28% | N/A |
Breast Cancer | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | DiMKL | GAMKL | MKLGL | Ind Kernel Perf |
---|---|---|---|---|---|---|---|---|---|
Fused Performance | 96.32% | 98.53% | 96.32% | N/A |
Method | n = 500 | n = 1000 | n = 5000 | n = 10,000 | n = 25,000 |
---|---|---|---|---|---|
DiMKL | |||||
DiMKL | |||||
DiMKL | |||||
DiMKL | |||||
DiMKL | |||||
DiMKL | |||||
MKLGL |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Price, S.R.; Anderson, D.T.; Havens, T.C.; Price, S.R. Kernel Matrix-Based Heuristic Multiple Kernel Learning. Mathematics 2022, 10, 2026. https://doi.org/10.3390/math10122026