ViT–KAN Synergistic Fusion: A Novel Framework for Parameter-Efficient Multi-Band PolSAR Land Cover Classification
Abstract
1. Introduction
- We design a position-encoding-enhanced cross-band ViT module that captures long-range dependencies among multi-band classification probability maps using self-attention mechanisms.
- We propose a spline-basis-expanded KAN classifier that achieves high fusion accuracy with significantly reduced parameter counts, enhancing the model’s practical applicability.
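The cross-band attention idea in the first contribution can be sketched as follows. This is a minimal single-head NumPy illustration with random (untrained) weights and hypothetical dimensions, not the authors' exact architecture: each band's class-probability vector is embedded as one token, a band-position encoding is added, and self-attention mixes information across bands.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_band_attention(band_probs, d_model=16, seed=0):
    """Single-head self-attention over per-band class-probability tokens.

    band_probs: (n_bands, n_classes) array, one probability vector per band.
    Returns fused (n_bands, d_model) features. Weights are random here;
    in training they would be learned.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_classes = band_probs.shape
    # Embed each band's probability map, then add a band-position encoding
    W_embed = rng.normal(0, 0.1, (n_classes, d_model))
    pos = rng.normal(0, 0.1, (n_bands, d_model))
    tokens = band_probs @ W_embed + pos                # (n_bands, d_model)
    # Query/key/value projections and scaled dot-product attention
    Wq, Wk, Wv = (rng.normal(0, 0.1, (d_model, d_model)) for _ in range(3))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_model))         # (n_bands, n_bands)
    return attn @ V

# Four bands (P, L, C, X), six land-cover classes
probs = softmax(np.random.default_rng(1).normal(size=(4, 6)))
fused = cross_band_attention(probs)
print(fused.shape)  # (4, 16)
```

In the full model these fused band features would feed the KAN classifier head; here the output shape simply confirms that each band token attends to every other band.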
2. Methods
2.1. Representation and Preprocessing of PolSAR Images
2.2. Single-Band PolSAR Image Land Cover Classification Methods
2.2.1. Typical CNN-Based Method
2.2.2. Vision-Transformer-Based Method
2.3. The Proposed Multi-Band Land Cover Classification Method
2.3.1. Vision Transformer Module
2.3.2. Kolmogorov–Arnold Network (KAN) Module
3. Results
3.1. Data Description, Data Preprocessing, and Experimental Setup
3.2. Experimental Results
3.2.1. Single-Band Land Cover Classification
3.2.2. Comparative Experiments
1. Robustness: The proposed method outperforms the baselines across all band combinations, regardless of whether the single-band input comes from a CNN or a ViT. For example, with the P + L + C + X band combination and CNN input, the proposed method achieves an accuracy of 87.79%, significantly higher than the modified average approach (85.14%) and the Zhu method (86.87%).
2. Impact of Band Number: While the baselines show significant accuracy improvements when increasing from one band to two bands, their performance gains diminish as more bands are added. For example, with the ViT input, the modified average approach and the Zhu method improve accuracy by 0.68% and 1.01%, respectively, when increasing from two bands (L + C) to four bands (P + L + C + X), whereas the proposed method achieves a 1.37% improvement. This indicates that the baselines face limitations in multi-band fusion, while the proposed method can more effectively utilize multi-band information, continuously improving classification accuracy as the number of bands increases.

Table 2. Multi-band classification overall accuracy with CNN single-band input (unit: %).

Band Combination | SVM | CNN | Modified Average | Zhu Method | Proposed Method |
---|---|---|---|---|---|
C (Single-Band) | 83.84 | 83.84 | 83.84 | 83.84 | 83.84 |
L + C | 84.01 | 85.27 | 84.41 | 85.53 | 85.92 |
L + C + X | 84.23 | 86.18 | 84.82 | 86.23 | 86.96 |
P + L + C + X | 84.96 | 86.88 | 85.14 | 86.87 | 87.79 |

Table 3. Multi-band classification overall accuracy with ViT single-band input (unit: %).

Band Combination | SVM | CNN | Modified Average | Zhu Method | Proposed Method |
---|---|---|---|---|---|
L (Single-Band) | 92.52 | 92.52 | 92.52 | 92.52 | 92.52 |
L + C | 92.67 | 94.13 | 93.14 | 94.21 | 94.87 |
L + C + X | 93.01 | 95.01 | 93.52 | 94.98 | 95.62 |
P + L + C + X | 93.31 | 95.32 | 93.82 | 95.22 | 96.24 |

Table 4. Multi-band classification accuracy with CNN single-band input.

Method | OA (%) | Kappa |
---|---|---|
SVM (RBF kernel) | 84.96 | 0.847 |
CNN | 86.88 | 0.869 |
Modified Average | 85.14 | 0.852 |
Zhu Method | 86.87 | 0.869 |
Proposed Method | 87.79 | 0.879 |

Table 5. Multi-band classification accuracy with ViT single-band input.

Method | OA (%) | Kappa |
---|---|---|
SVM (RBF kernel) | 91.67 | 0.910 |
CNN | 95.32 | 0.939 |
Modified Average | 93.82 | 0.921 |
Zhu Method | 95.22 | 0.934 |
Proposed Method | 96.24 | 0.942 |
3.2.3. Ablation Studies
3.2.4. Theoretical Justification of KAN’s Efficiency
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lee, J.S.; Grunes, M.R.; Kwok, R. Classification of multi-look polarimetric SAR imagery based on complex Wishart distribution. Int. J. Remote Sens. 1994, 15, 2299–2311.
- Freeman, A.; Durden, S. A three-component scattering model for polarimetric SAR data. IEEE Trans. Geosci. Remote Sens. 1998, 36, 963–973.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
- Zhou, Y.; Wang, H.; Xu, F.; Jin, Y.Q. Polarimetric SAR Image Classification Using Deep Convolutional Neural Networks. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1935–1939.
- Zhang, Z.; Wang, H.; Xu, F.; Jin, Y.Q. Complex-Valued Convolutional Neural Network and Its Application in Polarimetric SAR Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7177–7188.
- Dong, H.; Zhang, L.; Zou, B. PolSAR Image Classification with Lightweight 3D Convolutional Networks. Remote Sens. 2020, 12, 396.
- Xie, W.; Ma, G.; Zhao, F.; Liu, H.; Zhang, L. PolSAR image classification via a novel semi-supervised recurrent complex-valued convolution neural network. Neurocomputing 2020, 388, 255–268.
- Liu, F.; Jiao, L.; Tang, X. Task-Oriented GAN for PolSAR Image Classification and Clustering. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 2707–2719.
- Zhao, S.; Zhang, Z.; Zhang, T.; Guo, W.; Luo, Y. Transferable SAR Image Classification Crossing Different Satellites Under Open Set Condition. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4506005.
- Wang, H.; Xing, C.; Yin, J.; Yang, J. Land Cover Classification for Polarimetric SAR Images Based on Vision Transformer. Remote Sens. 2022, 14, 4656.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.U.; Polosukhin, I. Attention is All you Need. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Newry, UK, 2017; Volume 30.
- Lee, J.S.; Pottier, E. Polarimetric Radar Imaging: From Basics to Applications; CRC Press: Boca Raton, FL, USA, 2017.
- Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2404.19756.
- Yong, D.; WenKang, S.; ZhenFu, Z.; Qi, L. Combining belief functions based on distance of evidence. Decis. Support Syst. 2004, 38, 489–493.
- Zhu, J.; Pan, J.; Jiang, W.; Yue, X.; Yin, P. SAR image fusion classification based on the decision-level combination of multi-band information. Remote Sens. 2022, 14, 2243.
Single-band classification overall accuracy per band (unit: %):

Band | CNN (OA) | ViT (OA) |
---|---|---|
P | 82.11 | 89.13 |
L | 80.32 | 92.52 |
C | 83.84 | 90.25 |
X | 80.05 | 90.07 |
Per-class classification accuracy with ViT single-band input, P + L + C + X combination (unit: %):

Class/Metric | SVM | CNN | Modified Average | Zhu Method | Proposed Method |
---|---|---|---|---|---|
Buildings | 93.23 | 95.89 | 94.39 | 95.75 | 96.74 |
Crops | 92.74 | 95.76 | 93.64 | 95.89 | 96.16 |
Moss | 89.73 | 93.01 | 91.10 | 88.20 | 93.42 |
Trees | 90.23 | 93.45 | 92.21 | 91.79 | 94.02 |
Roads | 91.87 | 95.67 | 93.74 | 96.69 | 96.60 |
Water | 92.69 | 96.23 | 95.20 | 96.00 | 97.29 |
Kappa | 0.910 | 0.939 | 0.921 | 0.934 | 0.942 |
OA (%) | 91.67 | 95.32 | 93.82 | 95.22 | 96.24 |
Method | OA (%) | Kappa | Params (k) |
---|---|---|---|
ResNet-18 + KAN | 92.13 | 0.891 | 108.7 |
ViT-Base + KAN | 96.24 | 0.942 | 120.5 |
Method | OA (%) | Kappa | Params (k) |
---|---|---|---|
ViT-Base + MLP | 92.85 | 0.901 | 135.2 |
ViT-Base + KAN | 96.24 | 0.942 | 120.5 |
Method | OA (%) | Kappa | Params (k) |
---|---|---|---|
ViT-Base + MLP (B-Spline) | 96.11 | 0.940 | 143.4 |
ViT-Base + KAN | 96.24 | 0.942 | 120.5 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Han, S.; Ren, D.; Gao, F.; Yang, J.; Ma, H. ViT–KAN Synergistic Fusion: A Novel Framework for Parameter-Efficient Multi-Band PolSAR Land Cover Classification. Remote Sens. 2025, 17, 1470. https://doi.org/10.3390/rs17081470