Multi-Source Remote Sensing Images Semantic Segmentation Based on Differential Feature Attention Fusion
Abstract
1. Introduction
- We propose an end-to-end multi-source semantic segmentation network based on differential feature attention fusion. The multi-source differential feature fusion mechanism strengthens the model's feature representation, and the decoding stage is enriched with additional context information.
- We develop a differential feature fusion module based on spatial attention. By aligning the weight distributions of the multi-source feature maps and assigning higher attention to their differential components, the module improves the model's ability to mine discriminative features (a code sketch follows this list).
- We design a shallow attention-guided upsampling method based on the self-attention mechanism that improves image reconstruction without introducing additional parameters.
- We use an unsupervised loss function to deeply supervise the feature extraction and fusion modules so that the fused features better retain the diversity of the multi-source data.
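To make the second contribution more concrete, the following PyTorch sketch illustrates one plausible way to fuse two modality branches using spatial attention derived from their difference. It is an illustration only: the module name `DifferentialFeatureFusion`, the CBAM-style average/max pooling, the 7×7 convolution, and the `f_rgb`/`f_aux` inputs are assumptions and are not taken from the paper's actual DFAF design (Section 3.1).

```python
import torch
import torch.nn as nn


class DifferentialFeatureFusion(nn.Module):
    """Hypothetical sketch of spatial-attention-based differential fusion.

    Idea: highlight the regions where the two source feature maps disagree
    and reweight both branches there before fusing them.
    """

    def __init__(self, channels: int):
        super().__init__()
        # A small conv over pooled difference maps yields a spatial attention map.
        self.spatial_attn = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_rgb: torch.Tensor, f_aux: torch.Tensor) -> torch.Tensor:
        # Differential part: where the two sources disagree.
        diff = torch.abs(f_rgb - f_aux)
        # Channel-wise average and max pooling of the difference.
        avg_pool = diff.mean(dim=1, keepdim=True)
        max_pool, _ = diff.max(dim=1, keepdim=True)
        attn = self.spatial_attn(torch.cat([avg_pool, max_pool], dim=1))
        # Boost both branches in the differential regions, then fuse.
        f_rgb = f_rgb * (1.0 + attn)
        f_aux = f_aux * (1.0 + attn)
        return self.fuse(torch.cat([f_rgb, f_aux], dim=1))


if __name__ == "__main__":
    fusion = DifferentialFeatureFusion(channels=64)
    rgb = torch.randn(1, 64, 32, 32)   # e.g., optical-branch features
    aux = torch.randn(1, 64, 32, 32)   # e.g., DSM/height-branch features
    print(fusion(rgb, aux).shape)      # torch.Size([1, 64, 32, 32])
```

The `1 + attn` reweighting preserves the original features while amplifying the locations where the sources differ, which matches the stated intent of giving higher attention to the differential part.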
2. Related Work
2.1. Remote Sensing Image Semantic Segmentation
2.2. Multi-Source Fusion Semantic Segmentation
3. Method
3.1. Difference Feature Attention Fusion Module
3.2. Attention-Guided Upsampling Module
3.3. Loss Function
4. Experiments
4.1. Experimental Conditions
4.2. Ablation Studies
4.3. Comparative Experiments and Discussions
4.3.1. Comparative Experiments on US3D Data Set
4.3.2. Comparative Experiments on ISPRS Data Set
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Leyva-Mayorga, I.; Martinez-Gost, M.; Moretti, M.; Pérez-Neira, A.; Vázquez, M.Á.; Popovski, P.; Soret, B. Satellite edge computing for real-time and very-high resolution earth observation. IEEE Trans. Commun. 2023, 71, 6180–6194.
2. Zhou, W.; Jin, J.; Lei, J.; Yu, L. CIMFNet: Cross-layer interaction and multiscale fusion network for semantic segmentation of high-resolution remote sensing images. IEEE J. Sel. Top. Signal Process. 2022, 16, 666–676.
3. Gao, Y.; Luo, X.; Gao, X.; Yan, W.; Pan, X.; Fu, X. Semantic segmentation of remote sensing images based on multiscale features and global information modeling. Expert Syst. Appl. 2024, 249, 123616.
4. Li, Q.; Guo, J.; Wang, F.; Song, Z. Monitoring the Characteristics of Ecological Cumulative Effect Due to Mining Disturbance Utilizing Remote Sensing. Remote Sens. 2021, 13, 5034.
5. Jia, P.; Chen, C.; Zhang, D.; Sang, Y.; Zhang, L. Semantic segmentation of deep learning remote sensing images based on band combination principle: Application in urban planning and land use. Comput. Commun. 2024, 217, 97–106.
6. Chowdhury, T.; Rahnemoonfar, M. Attention based semantic segmentation on UAV dataset for natural disaster damage assessment. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 2325–2328.
7. Feng, J.; Yang, X.; Gu, Z.; Zeng, M.; Zheng, W. SMBCNet: A transformer-based approach for change detection in remote sensing images through semantic segmentation. Remote Sens. 2023, 15, 3566.
8. Wang, W.; Fu, Y.; Dong, F.; Li, F. Semantic segmentation of remote sensing ship image via a convolutional neural networks model. IET Image Process. 2019, 13, 1016–1022.
9. Gao, W.; Chen, N.; Chen, J.; Gao, B.; Xu, Y.; Weng, X.; Jiang, X. A Novel and Extensible Remote Sensing Collaboration Platform: Architecture Design and Prototype Implementation. ISPRS Int. J. Geo-Inf. 2024, 13, 83.
10. Wang, X.; Tan, L.; Fan, J. Performance evaluation of mangrove species classification based on multi-source remote sensing data using extremely randomized trees in Fucheng Town, Leizhou city, Guangdong Province. Remote Sens. 2023, 15, 1386.
11. Ma, J.; Qian, K.; Zhang, X.; Ma, X. Weakly Supervised Instance Segmentation of Electrical Equipment Based on RGB-T Automatic Annotation. IEEE Trans. Instrum. Meas. 2020, 69, 9720–9731.
12. Zhou, W.; Zhang, H.; Yan, W.; Lin, W. MMSMCNet: Modal Memory Sharing and Morphological Complementary Networks for RGB-T Urban Scene Semantic Segmentation. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 7096–7108.
13. Liang, W.; Shan, C.; Yang, Y.; Han, J. Multi-branch Differential Bidirectional Fusion Network for RGB-T Semantic Segmentation. IEEE Trans. Intell. Veh. 2024, 1–11.
14. Ma, J.; Zhou, W.; Lei, J.; Yu, L. Adjacent Bi-Hierarchical Network for Scene Parsing of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5.
15. Li, X.; Xu, F.; Liu, F.; Lyu, X.; Tong, Y.; Xu, Z.; Zhou, J. A synergistical attention model for semantic segmentation of remote sensing images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16.
16. Mostafa, R.R.; Houssein, E.H.; Hussien, A.G.; Singh, B.; Emam, M.M. An enhanced chameleon swarm algorithm for global optimization and multi-level thresholding medical image segmentation. Neural Comput. Appl. 2024, 36, 8775–8823.
17. He, X.; Zhou, Y.; Liu, B.; Zhao, J.; Yao, R. Remote sensing image semantic segmentation via class-guided structural interaction and boundary perception. Expert Syst. Appl. 2024, 252, 124019.
18. Hong, S.; Oh, J.; Lee, H.; Han, B. Learning transferrable knowledge for semantic segmentation with deep convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3204–3212.
19. Freixenet, J.; Munoz, X.; Raba, D.; Martí, J.; Cufí, X. Yet another survey on image segmentation: Region and boundary information integration. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, 28–31 May 2002; pp. 408–422.
20. Kampffmeyer, M.; Salberg, A.B.; Jenssen, R. Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 1–9.
21. Wang, J.; Feng, Z.; Jiang, Y.; Yang, S.; Meng, H. Orientation attention network for semantic segmentation of remote sensing images. Knowl. Based Syst. 2023, 267, 110415.
22. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
23. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
24. Shang, R.; Zhang, J.; Jiao, L.; Li, Y.; Marturi, N.; Stolkin, R. Multi-scale adaptive feature fusion network for semantic segmentation in remote sensing images. Remote Sens. 2020, 12, 872.
25. Liu, R.; Mi, L.; Chen, Z. AFNet: Adaptive fusion network for remote sensing image semantic segmentation. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7871–7886.
26. Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. CCNet: Criss-cross attention for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 603–612.
27. Strudel, R.; Garcia, R.; Laptev, I.; Schmid, C. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 7262–7272.
28. Ding, H.; Xia, B.; Liu, W.; Zhang, Z.; Zhang, J.; Wang, X.; Xu, S. A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation. Remote Sens. 2024, 16, 2620.
29. Zhou, W.; Jin, J.; Lei, J.; Hwang, J.N. CEGFNet: Common extraction and gate fusion network for scene parsing of remote sensing images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–10.
30. Zhang, J. Multi-source remote sensing data fusion: Status and trends. Int. J. Image Data Fusion 2010, 1, 5–24.
31. Guo, Z.; Xu, R.; Feng, C.C.; Zeng, Z. PIF-Net: A Deep Point-Image Fusion Network for Multimodality Semantic Segmentation of Very High-Resolution Imagery and Aerial Point Cloud. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15.
32. Fan, X.; Zhou, W.; Qian, X.; Yan, W. Progressive Adjacent-Layer coordination symmetric cascade network for semantic segmentation of Multimodal remote sensing images. Expert Syst. Appl. 2024, 238, 121999.
33. Ma, X.; Zhang, X.; Pun, M.O.; Liu, M. A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15.
34. Liu, Y.; Chen, K.; Liu, C.; Qin, Z.; Luo, Z.; Wang, J. Structured knowledge distillation for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2604–2613.
35. Berman, M.; Triki, A.R.; Blaschko, M.B. The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4413–4421.
36. Xiao, T.; Liu, Y.; Zhou, B.; Jiang, Y.; Sun, J. Unified perceptual parsing for scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 418–434.
37. Ding, L.; Zheng, K.; Lin, D.; Chen, Y.; Liu, B.; Li, J.; Bruzzone, L. MP-ResNet: Multipath residual network for the semantic segmentation of high-resolution PolSAR images. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
38. Ma, X.; Che, R.; Wang, X.; Ma, M.; Wu, S.; Feng, T.; Zhang, W. DOCNet: Dual-Domain Optimized Class-Aware Network for Remote Sensing Image Segmentation. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5.
39. Hu, X.; Yang, K.; Fei, L.; Wang, K. ACNet: Attention based network to exploit complementary features for RGBD semantic segmentation. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1440–1444.
40. Seichter, D.; Köhler, M.; Lewandowski, B.; Wengefeld, T.; Gross, H.M. Efficient RGB-D semantic segmentation for indoor scene analysis. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 13525–13531.
41. Ma, C.; Zhang, Y.; Guo, J.; Zhou, G.; Geng, X. FusionHeightNet: A Multi-Level Cross-Fusion Method from Multi-Source Remote Sensing Images for Urban Building Height Estimation. Remote Sens. 2024, 16, 958.
42. Liu, B.; Ren, B.; Hou, B.; Gu, Y. Multi-Source Fusion Network for Remote Sensing Image Segmentation with Hierarchical Transformer. In Proceedings of the IGARSS 2023–2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 16–21 July 2023; pp. 6318–6321.
Class distribution of the US3D and ISPRS data sets (percentage per category):

| Dataset | Split | Ground | Vegetation | Building | Water | Road | Background |
|---|---|---|---|---|---|---|---|
| US3D | Training | 63.05% | 15.8% | 15.34% | 4.46% | 1.34% | 0.01% |
| US3D | Testing | 73.65% | 14.46% | 9.08% | 1.93% | 0.86% | 0.02% |

| Dataset | Split | Low vegetation | Tree | Car | Building | Surface | Background |
|---|---|---|---|---|---|---|---|
| ISPRS | Training | 24.28% | 16.11% | 1.86% | 26.91% | 30.83% | 0.01% |
| ISPRS | Testing | 22.28% | 15.95% | 1.89% | 28.26% | 31.61% | 0.01% |
Ablation results on the US3D and ISPRS data sets (%):

| Dataset | DS | DFF | AGU | UAL | PA | mIoU | FWIoU | Kappa |
|---|---|---|---|---|---|---|---|---|
| US3D | ✓ | - | - | - | 93.15 | 79.53 | 87.74 | 84.34 |
| US3D | ✓ | ✓ | - | - | 93.57 | 80.83 | 88.33 | 85.04 |
| US3D | ✓ | ✓ | - | ✓ | 93.69 | 81.22 | 88.50 | 85.26 |
| US3D | ✓ | ✓ | ✓ | ✓ | 93.73 | 82.33 | 88.62 | 85.42 |
| ISPRS | ✓ | - | - | - | 89.59 | 80.73 | 81.49 | 86.02 |
| ISPRS | ✓ | ✓ | - | - | 90.23 | 81.42 | 82.60 | 86.89 |
| ISPRS | ✓ | ✓ | - | ✓ | 90.43 | 81.80 | 82.96 | 87.16 |
| ISPRS | ✓ | ✓ | ✓ | ✓ | 91.07 | 82.73 | 83.95 | 88.01 |
Per-category ablation results on the US3D data set (%):

| Dataset | DS | DFF | AGU | UAL | Ground | Vegetation | Building | Water | Road |
|---|---|---|---|---|---|---|---|---|---|
| US3D | ✓ | - | - | - | 84.68 | 83.52 | 91.14 | 93.74 | 90.62 |
| US3D | ✓ | ✓ | - | - | 85.35 | 84.22 | 91.61 | 94.67 | 91.66 |
| US3D | ✓ | ✓ | - | ✓ | 86.60 | 83.39 | 91.63 | 95.18 | 92.37 |
| US3D | ✓ | ✓ | ✓ | ✓ | 86.93 | 85.28 | 91.41 | 95.71 | 93.05 |
Comparison with other methods on the US3D data set (%):

| Method | PA | MPA | mIoU | FWIoU | Kappa |
|---|---|---|---|---|---|
| UperNet [36] | 87.96 | 75.82 | 67.79 | 79.11 | 70.79 |
| MP-ResNet [37] | 89.14 | 80.73 | 72.54 | 80.88 | 73.57 |
| DOCNet [38] | 88.71 | 79.39 | 70.37 | 81.39 | 75.08 |
| ACNet [39] | 90.78 | 75.87 | 60.97 | 84.38 | 79.05 |
| ESANet [40] | 92.92 | 80.05 | 72.41 | 87.26 | 83.43 |
| FHNet [41] | 91.92 | 87.79 | 79.39 | 85.58 | 82.34 |
| SegFusion [42] | 93.18 | 89.36 | 81.54 | 87.63 | 84.71 |
| DFAFNet | 93.73 | 90.46 | 82.33 | 88.62 | 85.42 |
Comparison with other methods on the ISPRS data set (%):

| Method | PA | MPA | mIoU | FWIoU | Kappa |
|---|---|---|---|---|---|
| UperNet [36] | 88.99 | 88.43 | 79.52 | 80.56 | 85.22 |
| MP-ResNet [37] | 89.76 | 88.96 | 80.39 | 81.80 | 86.27 |
| DOCNet [38] | 89.24 | 89.03 | 80.78 | 81.57 | 86.53 |
| ACNet [39] | 86.99 | 85.57 | 75.31 | 77.49 | 82.57 |
| ESANet [40] | 85.03 | 83.67 | 70.76 | 74.65 | 79.91 |
| FHNet [41] | 88.79 | 85.35 | 79.83 | 80.30 | 85.84 |
| SegFusion [42] | 90.23 | 89.41 | 81.36 | 81.72 | 87.13 |
| DFAFNet | 91.07 | 90.12 | 82.73 | 83.95 | 88.01 |
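For reference, all metrics reported in the tables above (PA, MPA, mIoU, FWIoU, and the Kappa coefficient) can be computed from the pixel-level confusion matrix. The NumPy sketch below uses the conventional definitions of these metrics; it is an assumption that the paper follows the same formulas, and the tabulated values are percentages.

```python
import numpy as np


def segmentation_metrics(conf: np.ndarray) -> dict:
    """Compute standard metrics from an (n_classes x n_classes) confusion
    matrix, where conf[i, j] counts pixels of true class i predicted as j."""
    eps = 1e-12
    total = conf.sum()
    tp = np.diag(conf).astype(float)
    gt_per_class = conf.sum(axis=1).astype(float)    # ground-truth pixels per class
    pred_per_class = conf.sum(axis=0).astype(float)  # predicted pixels per class

    pa = tp.sum() / total                                      # pixel accuracy
    mpa = (tp / (gt_per_class + eps)).mean()                   # mean per-class accuracy
    iou = tp / (gt_per_class + pred_per_class - tp + eps)
    miou = iou.mean()                                          # mean IoU
    fwiou = ((gt_per_class / total) * iou).sum()               # frequency-weighted IoU
    pe = (gt_per_class * pred_per_class).sum() / (total ** 2)  # chance agreement
    kappa = (pa - pe) / (1 - pe + eps)                         # Cohen's kappa

    return {"PA": pa, "MPA": mpa, "mIoU": miou, "FWIoU": fwiou, "Kappa": kappa}


if __name__ == "__main__":
    # Toy 3-class confusion matrix with hypothetical pixel counts.
    conf = np.array([[50, 2, 3],
                     [4, 40, 1],
                     [2, 1, 30]])
    print({k: round(100 * v, 2) for k, v in segmentation_metrics(conf).items()})
```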
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, D.; Yue, P.; Yan, Y.; Niu, Q.; Zhao, J.; Ma, H. Multi-Source Remote Sensing Images Semantic Segmentation Based on Differential Feature Attention Fusion. Remote Sens. 2024, 16, 4717. https://doi.org/10.3390/rs16244717