Deep Layered Network Based on Rotation Operation and Residual Transform for Building Segmentation from Remote Sensing Images
Abstract
1. Introduction
- (1) The proposed DLEF module captures and integrates information from different receptive fields, linking global and detailed information to produce a refined global representation.
- (2) The TA module is introduced to establish multi-directional dependency relationships through dimension rotation and residual transformations.
- (3) The MDC module is proposed to build a bridge between multi-level features, enhancing information exchange and strengthening contextual relationships.
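To make the phrase "dimension rotation and residual transformations" in contribution (2) more concrete, the following is a minimal NumPy sketch in the spirit of rotation-based triplet attention (cf. Misra et al., "Rotate to Attend", in the reference list). It is not the authors' TA module: the function names are hypothetical, the learned convolution of a real attention branch is replaced by a parameter-free average, and the way the residual is added is an assumption made only for illustration.

```python
import numpy as np


def z_pool(x):
    """Stack the max and mean over the leading axis: (D, A, B) -> (2, A, B)."""
    return np.stack([x.max(axis=0), x.mean(axis=0)])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def rotated_branch(x, perm):
    """Rotate x with the axis permutation `perm`, gate it, and rotate it back."""
    inverse = tuple(int(i) for i in np.argsort(perm))
    rotated = x.transpose(perm)
    pooled = z_pool(rotated)
    # Assumption: a plain average of the two pooled maps stands in for the
    # learned convolution that a trainable attention branch would use.
    gate = sigmoid(pooled.mean(axis=0))
    return (rotated * gate[None, :, :]).transpose(inverse)


def rotation_attention_sketch(x):
    """Average three rotated branches of a (C, H, W) map and add a residual."""
    perms = [(0, 1, 2), (1, 0, 2), (2, 1, 0)]  # identity, C<->H, C<->W
    attended = sum(rotated_branch(x, p) for p in perms) / len(perms)
    return x + attended  # residual connection (assumed form)


features = np.random.default_rng(0).random((4, 6, 5))  # toy (C, H, W) input
out = rotation_attention_sketch(features)
print(out.shape)  # (4, 6, 5)
```

Each branch pools over a different leading axis, so the gates couple channel with width, channel with height, and height with width in turn, which is one way to read "multi-directional dependency".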
2. Methodology
2.1. Overview
2.2. Segformer Structure
2.3. DLEF Module
2.4. MDC Module
3. Experimental Datasets and Evaluation
3.1. Datasets
3.2. Data Processing
3.3. Experimental Settings
3.4. Evaluation Metrics
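The result tables report OA, precision, recall, F1-score, and mIoU. As a hedged sketch, these can be computed from the pixel counts of a binary building/background confusion matrix as shown below. The function names are hypothetical, and treating mIoU as the mean of the building and background IoU is an assumption, since the averaging convention is not stated in this extract.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return segmentation metrics (as fractions) from pixel counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    iou_building = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    return {
        "oa": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "iou_building": iou_building,
        # Assumption: mIoU averages the two class-wise IoU values.
        "miou": (iou_building + iou_background) / 2,
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


metrics = binary_metrics(tp=80, fp=20, fn=10, tn=890)
print({name: round(value, 4) for name, value in metrics.items()})
print(round(f1_from(86.47, 84.93), 2))  # 85.69
```

The last line uses a precision of 86.47% and a recall of 84.93%, the values listed for C_ASegformer in the first results table, and reproduces the F1-score of 85.69% given in the same row.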
4. Results
4.1. Experimental Results on the Massachusetts Dataset
4.2. Experimental Results on the INRIA Dataset
4.3. Experimental Results on the WHU Dataset
5. Discussion
5.1. The Main Contribution of This Study
5.2. Ablation Study
5.3. Total Parameters of Different Networks
5.4. Generalization Ability of C_ASegformer
5.5. Module Comparison
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Rathore, M.M.; Ahmad, A.; Paul, A.; Rho, S. Urban Planning and Building Smart Cities Based on the Internet of Things Using Big Data Analytics. Comput. Netw. 2016, 101, 63–80.
- Dong, L.; Shan, J. A Comprehensive Review of Earthquake-Induced Building Damage Detection with Remote Sensing Techniques. ISPRS J. Photogramm. Remote Sens. 2013, 84, 85–99.
- Vakalopoulou, M.; Karantzalos, K.; Komodakis, N.; Paragios, N. Building Detection in Very High Resolution Multispectral Data with Deep Learning Features. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 1873–1876.
- Gao, Y.; Luo, X.; Gao, X.; Yan, W.; Pan, X.; Fu, X. Semantic Segmentation of Remote Sensing Images Based on Multiscale Features and Global Information Modeling. Expert Syst. Appl. 2024, 249, 123616.
- Shao, Z.; Tang, P.; Wang, Z.; Saleem, N.; Yam, S.; Sommai, C. BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction From High-Resolution Remote Sensing Images. Remote Sens. 2020, 12, 1050.
- Hoeser, T.; Kuenzer, C. Object Detection and Image Segmentation with Deep Learning on Earth Observation Data: A Review-Part I: Evolution and Recent Trends. Remote Sens. 2020, 12, 1667.
- Liu, P.; Liu, X.; Liu, M.; Shi, Q.; Yang, J.; Xu, X.; Zhang, Y. Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network. Remote Sens. 2019, 11, 830.
- Liu, Y.; Zhou, J.; Qi, W.; Li, X.; Gross, L.; Shao, Q.; Zhao, Z.; Ni, L.; Fan, X.; Li, Z. ARC-Net: An Efficient Network for Building Extraction from High-Resolution Aerial Images. IEEE Access 2020, 8, 154997–155010.
- Liu, Y.; Gross, L.; Li, Z.; Li, X.; Fan, X.; Qi, W. Automatic Building Extraction on High-Resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder with Spatial Pyramid Pooling. IEEE Access 2019, 7, 128774–128786.
- Zhou, J.; Liu, Y.; Nie, G.; Cheng, H.; Yang, X.; Chen, X.; Gross, L. Building Extraction and Floor Area Estimation at the Village Level in Rural China Via a Comprehensive Method Integrating UAV Photogrammetry and the Novel EDSANet. Remote Sens. 2022, 14, 5175.
- He, D.; Shi, Q.; Liu, X.; Zhong, Y.; Zhang, X. Deep Subpixel Mapping Based on Semantic Information Modulated Network for Urban Land Use Mapping. IEEE Trans. Geosci. Remote Sens. 2021, 59, 10628–10646.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Zeiler, M.D.; Fergus, R. Visualizing and Understanding Convolutional Networks. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2014; Volume 8689, pp. 818–833. ISBN 978-3-319-10589-5.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. High-Resolution Semantic Labeling with Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7092–7103.
- Jin, Y.; Xu, W.; Zhang, C.; Luo, X.; Jia, H. Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images. Remote Sens. 2021, 13, 692.
- Li, R.; Zheng, S.; Zhang, C.; Duan, C.; Su, J.; Wang, L.; Atkinson, P.M. Multiattention Network for Semantic Segmentation of Fine-Resolution Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5607713.
- Huang, Z.; Cheng, G.; Wang, H.; Li, H.; Shi, L.; Pan, C. Building Extraction from Multi-Source Remote Sensing Images via Deep Deconvolution Neural Networks. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 1835–1838.
- Zheng, Z.; Zhong, Y.; Wang, J.; Ma, A. Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4095–4104.
- Xu, Y.; Wu, L.; Xie, Z.; Chen, Z. Building Extraction in Very High Resolution Remote Sensing Imagery Using Deep Learning and Guided Filters. Remote Sens. 2018, 10, 144.
- Tang, K.; Xu, F.; Chen, X.; Dong, Q.; Yuan, Y.; Chen, J. The ClearSCD Model: Comprehensively Leveraging Semantics and Change Relationships for Semantic Change Detection in High Spatial Resolution Remote Sensing Imagery. ISPRS J. Photogramm. Remote Sens. 2024, 211, 299–317.
- Wei, S.; Zhang, T.; Yu, D.; Ji, S.; Zhang, Y.; Gong, J. From Lines to Polygons: Polygonal Building Contour Extraction from High-Resolution Remote Sensing Imagery. ISPRS J. Photogramm. Remote Sens. 2024, 209, 213–232.
- Du, Z.; Sui, H.; Zhou, Q.; Zhou, M.; Shi, W.; Wang, J.; Liu, J. Vectorized Building Extraction from High-Resolution Remote Sensing Images Using Spatial Cognitive Graph Convolution Model. ISPRS J. Photogramm. Remote Sens. 2024, 213, 53–71.
- Li, J.; He, W.; Cao, W.; Zhang, L.; Zhang, H. UANet: An Uncertainty-Aware Network for Building Extraction from Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13.
- Zeiler, M.D.; Taylor, G.W.; Fergus, R. Adaptive Deconvolutional Networks for Mid and High Level Feature Learning. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2018–2025.
- Zhang, Z.; Wang, X.; Jung, C. DCSR: Dilated Convolutions for Single Image Super-Resolution. IEEE Trans. Image Process. 2018, 28, 1625–1635.
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807.
- Shi, F.; Xu, Z.; Yuan, T.; Zhu, S.-C. HUGE2: A Highly Untangled Generative-Model Engine for Edge-Computing. arXiv 2019, arXiv:1907.11210.
- Dang, L.; Pang, P.; Lee, J. Depth-Wise Separable Convolution Neural Network with Residual Connection for Hyperspectral Image Classification. Remote Sens. 2020, 12, 3408.
- Jha, A.; Bose, S.; Banerjee, B. GAF-Net: Improving the Performance of Remote Sensing Image Fusion Using Novel Global Self and Cross Attention Learning. In Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–7 January 2023; pp. 6343–6352.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 833–851. ISBN 978-3-030-01233-5.
- Zhao, B.; Wu, C.; Zou, F.; Zhang, X.; Sun, R.; Jiang, Y. Research on Small Sample Multi-Target Grasping Technology Based on Transfer Learning. Sensors 2023, 23, 5826.
- Li, X.; Xu, F.; Li, L.; Xu, N.; Liu, F.; Yuan, C.; Chen, Z.; Lyu, X. AAFormer: Attention-Attended Transformer for Semantic Segmentation of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2024, 21, 5002805.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
- Wang, L.; Fang, S.; Meng, X.; Li, R. Building Extraction with Vision Transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5625711.
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090.
- Zhang, J.; Li, C.; Yin, Y.; Zhang, J.; Grzegorzek, M. Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to Popular Convolutional Neural Network and Potential Visual Transformer. Artif. Intell. Rev. 2023, 56, 1013–1070.
- Li, M.; Rui, J.; Yang, S.; Liu, Z.; Ren, L.; Ma, L.; Li, Q.; Su, X.; Zuo, X. Method of Building Detection in Optical Remote Sensing Images Based on SegFormer. Sensors 2023, 23, 1258.
- He, X.; Zhou, Y.; Zhao, J.; Zhang, D.; Yao, R.; Xue, Y. Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4408715.
- Wu, H.; Huang, P.; Zhang, M.; Tang, W.; Yu, X. CMTFNet: CNN and Multiscale Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 2004612.
- Chang, H.; Bi, H.; Li, F.; Xu, C.; Chanussot, J.; Hong, D. Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5644115.
- Mnih, V. Machine Learning for Aerial Image Labeling; University of Toronto: Toronto, ON, Canada, 2013; ISBN 0-494-96184-8.
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Can Semantic Labeling Methods Generalize to Any City? The Inria Aerial Image Labeling Benchmark. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 3226–3229.
- Ji, S.; Wei, S.; Lu, M. Fully Convolutional Networks for Multisource Building Extraction from an Open Aerial and Satellite Imagery Data Set. IEEE Trans. Geosci. Remote Sens. 2019, 57, 574–586.
- Misra, D.; Nalamada, T.; Arasanipalai, A.U.; Hou, Q. Rotate to Attend: Convolutional Triplet Attention Module. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 3139–3148.
- Aldughayfiq, B.; Ashfaq, F.; Jhanjhi, N.Z.; Humayun, M. YOLOv5-FPN: A Robust Framework for Multi-Sized Cell Counting in Fluorescence Images. Diagnostics 2023, 13, 2280.
- Li, X.; Xu, F.; Liu, F.; Lyu, X.; Tong, Y.; Xu, Z.; Zhou, J. A Synergistical Attention Model for Semantic Segmentation of Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5400916.
- Li, X.; Yong, X.; Li, T.; Tong, Y.; Gao, H.; Wang, X.; Xu, Z.; Fang, Y.; You, Q.; Lyu, X. A Spectral–Spatial Context-Boosted Network for Semantic Segmentation of Remote Sensing Images. Remote Sens. 2024, 16, 1214.
- Li, X.; Xu, F.; Yu, A.; Lyu, X.; Gao, H.; Zhou, J. A Frequency Decoupling Network for Semantic Segmentation of Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5607921.
- Yildirim, F.S. FwSVM-Net: A Novel Deep Learning-Based Automatic Building Extraction from Aerial Images. J. Build. Eng. 2024, 96, 110473.
- Yu, M.; Chen, X.; Zhang, W.; Liu, Y. AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network. Sensors 2022, 22, 2932.

Framework | Module | Parameter | Value |
---|---|---|---|
Backbone | MiT-B0 | patch_size | 4 |
Backbone | MiT-B0 | embed_dims | [32, 64, 160, 256] |
Backbone | MiT-B0 | num_heads | [1, 2, 5, 8] |
Backbone | MiT-B0 | mlp_ratios | [4, 4, 4, 4] |
Backbone | MiT-B0 | sr_ratios | [8, 4, 2, 1] |
Neck | DLEF | dim_in | [32, 64, 160, 256] |
Neck | DLEF | dim_out | [32, 64, 160, 256] |
Neck | MDC | in_channels | [32, 64, 160, 256] |
Neck | MDC | out_channels | 256 |
Neck | MDC | num_outs | 4 |
Decoder_Head | Segformer_head | in_channels | [64, 128, 320, 512] |
Decoder_Head | Segformer_head | in_index | [0, 1, 2, 3] |
Decoder_Head | Segformer_head | feature_strides | [4, 8, 16, 32] |
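The backbone settings above imply concrete feature-map and attention sizes. The sketch below works them out for a 512 × 512 crop (the crop size listed in the training settings). It assumes the usual SegFormer convention that `sr_ratios` shrinks each side of the key/value map by that factor; the helper name `stage_summary` is hypothetical.

```python
EMBED_DIMS = [32, 64, 160, 256]
NUM_HEADS = [1, 2, 5, 8]
SR_RATIOS = [8, 4, 2, 1]
FEATURE_STRIDES = [4, 8, 16, 32]
CROP_SIZE = 512  # taken from the data-augmentation settings


def stage_summary(crop, dims, heads, sr_ratios, strides):
    """Per-stage feature-map side, token counts, and per-head width."""
    rows = []
    for channels, n_heads, sr, stride in zip(dims, heads, sr_ratios, strides):
        side = crop // stride
        rows.append({
            "channels": channels,
            "side": side,
            "query_tokens": side * side,
            # Assumption: keys/values are spatially reduced by `sr` per side.
            "kv_tokens": (side // sr) ** 2,
            "head_dim": channels // n_heads,
        })
    return rows


summary = stage_summary(CROP_SIZE, EMBED_DIMS, NUM_HEADS, SR_RATIOS, FEATURE_STRIDES)
for stage, row in enumerate(summary, start=1):
    print(stage, row)
```

Under these assumptions every stage attends over 256 key/value tokens with 32 channels per head, while the number of query tokens falls from 16,384 at stride 4 to 256 at stride 32.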

Category | Parameter | Setting |
---|---|---|
Data Augmentation | Flip Probability | 0.5 (Horizontal/Vertical) |
Data Augmentation | Rotation Range | ±15° |
Data Augmentation | Crop Size | 512 × 512 |
Optimizer (AdamW) | Initial Learning Rate | 0.0006 |
Optimizer (AdamW) | Beta Coefficients (β1/β2) | 0.9/0.999 |
Optimizer (AdamW) | Weight Decay | 0.01 |
Training Protocol | Batch Size | 2 |
Training Protocol | Training Iterations | 40,000 |
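To show how the optimizer settings above enter a parameter update, here is a minimal pure-Python sketch of one AdamW step on a single scalar weight. The learning rate, beta coefficients, and weight decay are the tabulated values; `eps = 1e-8` is an assumed default that the table does not list, no learning-rate schedule is modelled, and the function name is hypothetical.

```python
import math

LR, BETA1, BETA2, WEIGHT_DECAY = 0.0006, 0.9, 0.999, 0.01
EPS = 1e-8  # assumption: not listed in the settings table


def adamw_step(w, grad, m, v, t):
    """One AdamW update of weight `w` at step `t` (1-based)."""
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)                # bias correction
    v_hat = v / (1 - BETA2 ** t)
    # Decoupled weight decay: the decay term acts on the weight directly
    # rather than being folded into the gradient.
    w = w - LR * (m_hat / (math.sqrt(v_hat) + EPS) + WEIGHT_DECAY * w)
    return w, m, v


w, m, v = adamw_step(w=1.0, grad=1.0, m=0.0, v=0.0, t=1)
crops_seen = 40_000 * 2  # training iterations x batch size
print(w, crops_seen)
```

With a batch size of 2 and 40,000 iterations, the network sees 80,000 training crops in total.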

Method | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | mIoU (%) |
---|---|---|---|---|---|
AGs-Unet [56] | 93.67 | 87.94 | 73.57 | 78.63 | 68.46 |
DeepLabV3+ [15] | 93.34 | 84.73 | 79.1 | 81.82 | 69.23 |
MA-FCN [17] | 94.12 | 87.07 | 82.89 | 84.93 | 73.8 |
HRNet [16] | 93.98 | 85.82 | 82.01 | 83.87 | 72.22 |
Segformer [41] | 94.36 | 86.93 | 82.38 | 84.59 | 73.60 |
MANet [18] | 94.41 | 86.65 | 82.85 | 84.61 | 73.97 |
ST-UNet [44] | 94.51 | 87.68 | 79.82 | 83.15 | 73.92 |
CMTFNet [45] | 94.61 | 87.64 | 80.58 | 83.64 | 74.19 |
DSymFuser [46] | 94.66 | 87.35 | 81.42 | 84.05 | 74.68 |
C_ASegformer | 95.42 | 86.47 | 84.93 | 85.69 | 75.46 |

Method | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | mIoU (%) |
---|---|---|---|---|---|
AGs-Unet [56] | 91.9 | 90.7 | 86.48 | 88.53 | 68.2 |
DeepLabV3+ [15] | 93.28 | 87.35 | 86.4 | 86.88 | 76.8 |
MA-FCN [17] | 92.48 | 88.25 | 86.11 | 87.17 | 77.14 |
HRNet [16] | 94.12 | 90.33 | 87.53 | 88.91 | 80.03 |
Segformer [41] | 95.21 | 90.67 | 88.77 | 89.69 | 81.16 |
MANet [18] | 95.11 | 91.62 | 87.02 | 89.12 | 81.34 |
ST-UNet [44] | 95.18 | 91.42 | 87.61 | 89.38 | 81.71 |
CMTFNet [45] | 95.24 | 91.33 | 88.03 | 89.58 | 82.00 |
DSymFuser [46] | 95.11 | 91.44 | 88.25 | 89.75 | 82.09 |
C_ASegformer | 95.33 | 90.83 | 89.17 | 89.97 | 82.58 |

Method | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | mIoU (%) |
---|---|---|---|---|---|
AGs-Unet [56] | 97.87 | 95.47 | 93.56 | 94.49 | 89.88 |
DeepLabV3+ [15] | 98.25 | 95.56 | 95.58 | 95.57 | 91.73 |
MA-FCN [17] | 96.8 | 94.8 | 94.4 | 94.2 | 89.1 |
HRNet [16] | 98.15 | 95.67 | 94.89 | 95.27 | 91.21 |
Segformer [41] | 98.31 | 95.78 | 95.66 | 95.72 | 91.99 |
MANet [18] | 98.39 | 96.44 | 95.32 | 95.87 | 91.25 |
ST-UNet [44] | 98.25 | 96.31 | 94.72 | 95.50 | 91.60 |
CMTFNet [45] | 98.31 | 95.55 | 95.93 | 95.74 | 92.02 |
DSymFuser [46] | 98.30 | 95.79 | 95.58 | 95.69 | 91.93 |
C_ASegformer | 98.51 | 96.26 | 96.18 | 96.22 | 92.87 |

Baseline | DLEF | MDC | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | mIoU (%) |
---|---|---|---|---|---|---|---|
√ | × | × | 94.34 ± 0.12 | 86.89 ± 0.35 | 82.21 ± 0.31 | 84.52 ± 0.28 | 73.86 ± 0.21 |
√ | √ | × | 94.60 ± 0.10 | 87.28 ± 0.28 | 81.14 ± 0.33 | 83.78 ± 0.25 | 74.45 ± 0.19 |
√ | × | √ | 94.54 ± 0.11 | 87.33 ± 0.30 | 80.74 ± 0.37 | 83.85 ± 0.31 | 74.19 ± 0.22 |
√ | √ | √ | 94.52 ± 0.15 | 84.74 ± 0.42 | 85.01 ± 0.25 | 84.83 ± 0.34 | 75.39 ± 0.18 |

Method | Params (M) | FLOPs (G) | FPS (f/s) | mIoU (%) |
---|---|---|---|---|
MANet | 48.21 | 44.35 | 18.7 | 75.37 |
Segformer | 3.71 | 7.79 | 45.2 | 73.92 |
DeepLabV3+ | 58.6 | 23.99 | 12.5 | 69.23 |
HRNet | 11.88 | 20.32 | 24.3 | 70.23 |
C_ASegformer | 5.56 | 11.47 | 38.6 | 75.46 |
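The cost/accuracy trade-off in the table above can be read off with a few ratios. The sketch below copies the tabulated values and derives per-frame latency and the change relative to a reference model; the helper names are hypothetical, and latency is simply the reciprocal of the reported FPS.

```python
# (params in M, FLOPs in G, FPS, mIoU in %) copied from the table above
MODELS = {
    "MANet": (48.21, 44.35, 18.7, 75.37),
    "Segformer": (3.71, 7.79, 45.2, 73.92),
    "DeepLabV3+": (58.6, 23.99, 12.5, 69.23),
    "HRNet": (11.88, 20.32, 24.3, 70.23),
    "C_ASegformer": (5.56, 11.47, 38.6, 75.46),
}


def latency_ms(fps):
    """Per-frame latency in milliseconds implied by a frames-per-second rate."""
    return 1000.0 / fps


def relative_to(name, reference):
    """Parameter and FLOPs ratios and the mIoU gain of `name` over `reference`."""
    params, flops, _, miou = MODELS[name]
    ref_params, ref_flops, _, ref_miou = MODELS[reference]
    return {
        "params_ratio": params / ref_params,
        "flops_ratio": flops / ref_flops,
        "miou_gain": miou - ref_miou,
    }


vs_baseline = relative_to("C_ASegformer", "Segformer")
vs_manet = relative_to("C_ASegformer", "MANet")
print(vs_baseline, vs_manet, round(latency_ms(38.6), 1))
```

On these numbers C_ASegformer uses roughly 1.5 times the parameters of Segformer for a gain of about 1.5 points of mIoU, and roughly 12% of the parameters of MANet at a comparable mIoU.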
Statistical Indicators | Massachusetts | INRIA |
---|---|---|
Spatial resolution | 1.0 m | 0.3 m |
Building density | Small | Large |
Main building types | Stand-alone houses | Apartment complexes, commercial buildings |
Percentage of vegetation cover | Large | Small |

Model | Massachusetts IoU (%) | Massachusetts F1-Score (%) | INRIA IoU (%) | INRIA F1-Score (%) |
---|---|---|---|---|
HRNet | 70.23 | 81.96 | 32.87 | 50.25 |
DeepLabV3+ | 69.23 | 81.82 | 33.22 | 47.69 |
Segformer | 73.92 | 84.59 | 65.13 | 74.94 |
MANet | 75.37 | 84.61 | 64.26 | 75.24 |
C_ASegformer | 75.46 | 84.73 | 69.26 | 77.08 |
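One simple way to read the table above is the gap between the Massachusetts and INRIA columns for each model. The sketch below computes that gap for IoU from the tabulated values; the helper name is hypothetical, and the gap is only a descriptive statistic, not a metric defined in the source.

```python
# (Massachusetts IoU, Massachusetts F1, INRIA IoU, INRIA F1), all in %
ROWS = {
    "HRNet": (70.23, 81.96, 32.87, 50.25),
    "DeepLabV3+": (69.23, 81.82, 33.22, 47.69),
    "Segformer": (73.92, 84.59, 65.13, 74.94),
    "MANet": (75.37, 84.61, 64.26, 75.24),
    "C_ASegformer": (75.46, 84.73, 69.26, 77.08),
}


def iou_gap(name):
    """Massachusetts IoU minus INRIA IoU, in percentage points."""
    mass_iou, _, inria_iou, _ = ROWS[name]
    return round(mass_iou - inria_iou, 2)


gaps = {name: iou_gap(name) for name in ROWS}
smallest = min(gaps, key=gaps.get)
print(gaps, smallest)
```

C_ASegformer shows the smallest IoU gap between the two columns (about 6.2 points), whereas HRNet and DeepLabV3+ differ by more than 36 points.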
Module | OA (%) | Precision (%) | Recall (%) | F1-Score (%) | mIoU (%) |
---|---|---|---|---|---|
Baseline | 94.36 | 86.93 | 82.38 | 84.59 | 73.92 |
Baseline + DLEF + SE | 94.55 | 88.3 | 79.34 | 83.05 | 73.49 |
Baseline + DLEF + CA | 94.43 | 86.94 | 80.23 | 83.15 | 73.51 |
Baseline + DLEF + EA | 94.66 | 87.2 | 81.67 | 84.14 | 74.79 |
Baseline + DLEF + CBAM | 94.75 | 85.73 | 81.94 | 83.79 | 75.06 |
Baseline + DLEF + TA | 95.42 | 86.47 | 84.93 | 85.69 | 75.46 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, S.; Chen, T.; Su, F.; Xu, H.; Li, Y.; Liu, Y. Deep Layered Network Based on Rotation Operation and Residual Transform for Building Segmentation from Remote Sensing Images. Sensors 2025, 25, 2608. https://doi.org/10.3390/s25082608