AP-PointRend: An Improved Network for Building Extraction via High-Resolution Remote Sensing Images
Abstract
1. Introduction
- (1) The proposed image annotation method can annotate and crop the entire image to an arbitrary size and degree of overlap, which solves the problem that existing datasets are of fixed sizes and struggle to keep pace with the rapid development of hardware.
- (2) An improved deep learning network, AP-PointRend, is employed to extract building roof outlines. This approach addresses the discrete patches produced by PointRend during building extraction and improves edge quality for large buildings.
2. Related Work
2.1. Segmentation-Based Building Extraction
2.2. Instance Segmentation-Based Building Extraction
3. Materials and Methods
3.1. Datasets
3.2. The Flow of DL-Based Building Outline Extraction from Remote Sensing Images
- (1) The original images are manually annotated to generate single-scene training data for large scenes. All large-scene aerial and satellite images in the dataset are cropped with an arbitrary degree of overlap, and the training data are automatically generated from the manual labels on the original images.
- (2) The building boundary annotation coordinates are recalculated in the cropped training data from the mapping between the coordinates of the cropped image and those of the original image. As shown in Figure 3, when a single building is cut across multiple images, the intersection points of the cutting line and the building boundary must be computed so that vertices can be inserted and reordered.
- (3) Based on the training data generated in the previous step, the network model is trained iteratively from a pre-trained model, and the model file is refined through continuous iteration; training stops when a predefined number of epochs is reached. The network comprises the basic framework of building instance segmentation, which performs multiscale feature extraction, together with mask upsampling based on fine-grained pixel segmentation and an adaptive parameter selection method.
- (4) Buildings are extracted from remote sensing images using the model files produced by iterative training, which meet the accuracy requirements of the training datasets.
- (5) The trained model is used to extract two-dimensional building outlines from multiple cropped test images. The corresponding extraction results for the multiscene test data are obtained, and the per-image results are merged to eliminate seams.
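The coordinate recalculation in step (2) amounts to translating vertices by the tile's pixel offset in the original image. The sketch below illustrates the idea; the function names are ours, and the clamping is a simplification standing in for the exact line–boundary intersection and vertex reordering the paper computes for Figure 3:

```python
def tile_to_original(points, tile_x, tile_y):
    """Map building-boundary vertices annotated in a cropped tile back to
    pixel coordinates in the original (uncropped) image.

    points         -- [(x, y), ...] vertex list in tile coordinates
    tile_x, tile_y -- pixel offset of the tile's top-left corner
    """
    return [(x + tile_x, y + tile_y) for (x, y) in points]


def original_to_tile(points, tile_x, tile_y, tile_w, tile_h):
    """Inverse mapping: shift original-image vertices into tile coordinates
    and clamp them to the tile extent (a stand-in for inserting the exact
    crossing points of the cutting line and the building boundary)."""
    shifted = [(x - tile_x, y - tile_y) for (x, y) in points]
    return [(min(max(x, 0), tile_w - 1), min(max(y, 0), tile_h - 1))
            for (x, y) in shifted]
```

Because the mapping is a pure translation, applying `tile_to_original` to the per-tile predictions is also how the merged result in step (5) is placed back on the full scene.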
3.3. Adaptive Cropping and Automatic Generation of Training Data
- (1) The original large-scale aerial or satellite images are manually labeled to obtain the initial building training data (i.e., the building boundary point set [(x1, y1), (x2, y2), (x3, y3), (x4, y4), …, (xn, yn)]) and the attribute information (e.g., number, category, and extent).
- (2) The original image is then cropped with overlap, and the original building boundary point set and attribute values are recalculated according to the cropping results.
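The overlapped cropping grid in step (2) can be sketched as follows. This is a minimal illustration under our own naming: `overlap` is the fractional overlap between neighboring tiles, matching the "arbitrary overlap degree" described above, and the last row and column are shifted inward so every tile stays inside the image:

```python
def tile_origins(img_w, img_h, tile, overlap):
    """Top-left corners for cropping an image into square tiles of side
    `tile` with fractional overlap 0 <= overlap < 1 between neighbors."""
    stride = max(1, int(tile * (1.0 - overlap)))
    xs = list(range(0, max(img_w - tile, 0) + 1, stride))
    ys = list(range(0, max(img_h - tile, 0) + 1, stride))
    # Shift the final column/row so the right and bottom edges are covered.
    if xs[-1] != img_w - tile:
        xs.append(img_w - tile)
    if ys[-1] != img_h - tile:
        ys.append(img_h - tile)
    return [(x, y) for y in ys for x in xs]
```

For example, a 100 × 100 image cut into 50 × 50 tiles with no overlap yields four tiles, while a 20% overlap yields nine; the annotation point set is then remapped into each tile by subtracting its origin.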
3.4. Automatic Extraction of Building Roof Contour
3.5. Merging Method of Extraction Results of Multiple Remote Sensing Images
- (1) Use the trained model to extract the two-dimensional building outlines from multiple cropped test images and obtain the corresponding extraction results for the multiscene test data.
- (2) Convert each extraction result to grayscale and binarize it to produce a mask: mask pixels are set to 255 and all other pixels to 0.
- (3) Calculate the pixel coordinates of each binary result in the original, uncropped image, merge the multiple cropped (binary) results, and apply a dilation operation to the merged image with a convolution kernel of a specific size. Based on experience, a 5 × 5 or 7 × 7 kernel generally works well.
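Steps (2) and (3) can be sketched as below. This is an illustrative NumPy version under our own naming; overlapping tiles are resolved with a pixel-wise maximum, and the square-kernel dilation is the seam-closing step for which the paper suggests a 5 × 5 or 7 × 7 kernel (`cv2.dilate` would do the same, but the loop avoids the OpenCV dependency):

```python
import numpy as np

def merge_tiles(canvas_shape, tiles):
    """Paste per-tile binary masks (values 0/255) back at their original
    pixel offsets; overlaps are merged with a pixel-wise maximum."""
    canvas = np.zeros(canvas_shape, dtype=np.uint8)
    for mask, (x, y) in tiles:
        h, w = mask.shape
        region = canvas[y:y + h, x:x + w]
        np.maximum(region, mask, out=region)
    return canvas

def dilate(mask, k=5):
    """Binary dilation with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(k):
        for dx in range(k):
            np.maximum(out, padded[dy:dy + h, dx:dx + w], out=out)
    return out
```

A one-pixel seam left between two adjacent tile masks is filled by a single pass of `dilate` with k = 5, which is the effect the expansion operation is used for here.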
4. Experiment and Analysis
4.1. Experiment Environment
4.2. Experiment Results
4.2.1. Generating Datasets of Arbitrary Size and Overlap
4.2.2. Visualization Results
Building Extraction on Vaihingen Dataset
Building Extraction on WHU Dataset
4.3. Quantitative Analysis
5. Discussion
5.1. Influence of Different Cropping Sizes and Overlap Rates on Building Extraction
5.2. Accuracy Evaluation Analysis
- (1) Some extraction results contained discrete image patches, as shown in Figure 16d: the overall edge was smooth, but small discrete patches appeared locally.
- (2) The existing accuracy metric, IoU, measures only the area coverage of the evaluated region and therefore cannot assess the accuracy of the boundary.
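To illustrate point (2): IoU is a pure area ratio, so two predictions with very different boundary quality can score identically. The sketch below (our own minimal example) compares a prediction missing one clean strip of boundary pixels with one missing the same number of scattered boundary pixels; both yield the same IoU, which is why boundary-aware measures are needed to tell them apart:

```python
import numpy as np

def mask_iou(pred, gt):
    """Area-based IoU: intersection over union of foreground pixels.
    Scores coverage only -- boundary shape does not enter the measure."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

# Ground truth: a 6 x 6 building footprint inside an 8 x 8 patch.
gt = np.zeros((8, 8)); gt[1:7, 1:7] = 1

# Prediction A: misses one straight 6-pixel strip along the right edge.
smooth = gt.copy(); smooth[1:7, 6] = 0

# Prediction B: misses 6 scattered boundary pixels (a jagged outline).
jagged = gt.copy()
for r, c in [(1, 1), (1, 3), (1, 5), (6, 2), (6, 4), (6, 6)]:
    jagged[r, c] = 0
```

Both predictions intersect the ground truth in 30 of 36 pixels, so `mask_iou` returns 30/36 for each even though their outlines differ markedly.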
5.3. Merging Method of Extraction Results of Multiple RS Images
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Mayer, H. Automatic Object Extraction from Aerial Imagery—A Survey Focusing on Buildings. Comput. Vis. Image Underst. 1999, 74, 138–149. [Google Scholar] [CrossRef]
- Zhao, K.; Kang, J.; Jung, J.; Sohn, G. Building Extraction from Satellite Images Using Mask R-CNN with Building Boundary Regularization. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 242–2424. [Google Scholar]
- Ji, S.; Wei, S.; Lu, M. Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set. IEEE Trans. Geosci. Remote Sens. 2019, 57, 574–586. [Google Scholar] [CrossRef]
- Shrestha, S.; Vanneschi, L. Improved Fully Convolutional Network with Conditional Random Fields for Building Extraction. Remote Sens. 2018, 10, 1135. [Google Scholar] [CrossRef]
- Bi, Q.; Qin, K.; Zhang, H.; Zhang, Y.; Li, Z.; Xu, K. A Multi-Scale Filtering Building Index for Building Extraction in Very High-Resolution Satellite Imagery. Remote Sens. 2019, 11, 482. [Google Scholar] [CrossRef]
- Li, W.; He, C.; Fang, J.; Zheng, J.; Fu, H.; Yu, L. Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data. Remote Sens. 2019, 11, 403. [Google Scholar] [CrossRef]
- Zhang, B.; Wang, C.; Shen, Y.; Liu, Y. Fully Connected Conditional Random Fields for High-Resolution Remote Sensing Land Use/Land Cover Classification with Convolutional Neural Networks. Remote Sens. 2018, 10, 1889. [Google Scholar] [CrossRef]
- Vakalopoulou, M.; Karantzalos, K.; Komodakis, N.; Paragios, N. Building Detection in Very High Resolution Multispectral Data with Deep Learning Features. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 1873–1876. [Google Scholar]
- Chen, K.; Fu, K.; Gao, X.; Yan, M.; Sun, X.; Zhang, H. Building Extraction from Remote Sensing Images with Deep Learning in a Supervised Manner. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 1672–1675. [Google Scholar]
- Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651. [Google Scholar] [CrossRef]
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 645–657. [Google Scholar] [CrossRef]
- Zuo, T.; Feng, J.; Chen, X. HF-FCN: Hierarchically Fused Fully Convolutional Network for Robust Building Extraction. In Proceedings of the Computer Vision—ACCV 2016, Taipei, Taiwan, 20–24 November 2016; Springer: Cham, Switzerland, 2017; pp. 291–302. [Google Scholar]
- Ji, S.; Wei, S.; Lu, M. A Scale Robust Convolutional Neural Network for Automatic Building Extraction from Aerial and Satellite Imagery. Int. J. Remote Sens. 2019, 40, 3308–3322. [Google Scholar] [CrossRef]
- Bischke, B.; Helber, P.; Folz, J.; Borth, D.; Dengel, A. Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1480–1484. [Google Scholar]
- Wen, Q.; Jiang, K.; Wang, W.; Liu, Q.; Guo, Q.; Li, L.; Wang, P. Automatic Building Extraction from Google Earth Images under Complex Backgrounds Based on Deep Instance Segmentation Network. Sensors 2019, 19, 333. [Google Scholar] [CrossRef]
- Liu, Y.; Chen, D.; Ma, A.; Zhong, Y.; Fang, F.; Xu, K. Multiscale U-Shaped CNN Building Instance Extraction Framework with Edge Constraint for High-Spatial-Resolution Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6106–6120. [Google Scholar] [CrossRef]
- Zhu, Y.; Huang, B.; Gao, J.; Huang, E.; Chen, H. Adaptive Polygon Generation Algorithm for Automatic Building Extraction. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4702114. [Google Scholar] [CrossRef]
- He, S.; Jiang, W. Boundary-Assisted Learning for Building Extraction from Optical Remote Sensing Imagery. Remote Sens. 2021, 13, 760. [Google Scholar] [CrossRef]
- Jin, Y.; Xu, W.; Zhang, C.; Luo, X.; Jia, H. Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images. Remote Sens. 2021, 13, 692. [Google Scholar] [CrossRef]
- Zhang, H.; Liao, Y.; Yang, H.; Yang, G.; Zhang, L. A Local–Global Dual-Stream Network for Building Extraction From Very-High-Resolution Remote Sensing Images. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 1269–1283. [Google Scholar] [CrossRef]
- Yu, Y.; Ren, Y.; Guan, H.; Li, D.; Yu, C.; Jin, S.; Wang, L. Capsule Feature Pyramid Network for Building Footprint Extraction From High-Resolution Aerial Imagery. IEEE Geosci. Remote Sens. Lett. 2021, 18, 895–899. [Google Scholar] [CrossRef]
- Shao, Z.; Tang, P.; Wang, Z.; Saleem, N.; Yam, S.; Sommai, C. BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction From High-Resolution Remote Sensing Images. Remote Sens. 2020, 12, 1050. [Google Scholar] [CrossRef]
- Xu, Y.; Wu, L.; Xie, Z.; Chen, Z. Building Extraction in Very High Resolution Remote Sensing Imagery Using Deep Learning and Guided Filters. Remote Sens. 2018, 10, 144. [Google Scholar] [CrossRef]
- Chen, M.; Mao, T.; Wu, J.; Du, R.; Zhao, B.; Zhou, L. SAU-Net: A Novel Network for Building Extraction From High-Resolution Remote Sensing Images by Reconstructing Fine-Grained Semantic Features. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 6747–6761. [Google Scholar] [CrossRef]
- Awad, B.; Erer, I. FAUNet: Frequency Attention U-Net for Parcel Boundary Delineation in Satellite Images. Remote Sens. 2023, 15, 5123. [Google Scholar] [CrossRef]
- Huang, J.; Zhang, X.; Xin, Q.; Sun, Y.; Zhang, P. Automatic Building Extraction from High-Resolution Aerial Images and LiDAR Data Using Gated Residual Refinement Network. ISPRS J. Photogramm. Remote Sens. 2019, 151, 91–105. [Google Scholar] [CrossRef]
- Xiao, X.; Guo, W.; Chen, R.; Hui, Y.; Wang, J.; Zhao, H. A Swin Transformer-Based Encoding Booster Integrated in U-Shaped Network for Building Extraction. Remote Sens. 2022, 14, 2611. [Google Scholar] [CrossRef]
- Zhao, Y.; Sun, G.; Zhang, L.; Zhang, A.; Jia, X.; Han, Z. MSRF-Net: Multiscale Receptive Field Network for Building Detection From Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5515714. [Google Scholar] [CrossRef]
- Zhu, X.; Zhang, X.; Zhang, T.; Tang, X.; Chen, P.; Zhou, H.; Jiao, L. Semantics and Contour Based Interactive Learning Network for Building Footprint Extraction. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5623513. [Google Scholar] [CrossRef]
- Yuan, W.; Zhang, X.; Shi, J.; Wang, J. LiteST-Net: A Hybrid Model of Lite Swin Transformer and Convolution for Building Extraction from Remote Sensing Image. Remote Sens. 2023, 15, 1996. [Google Scholar] [CrossRef]
- Wang, Y.; Zhao, Q.; Wu, Y.; Tian, W.; Zhang, G. SCA-Net: Multiscale Contextual Information Network for Building Extraction Based on High-Resolution Remote Sensing Images. Remote Sens. 2023, 15, 4466. [Google Scholar] [CrossRef]
- Nie, J.; Wang, Z.; Liang, X.; Yang, C.; Zheng, C.; Wei, Z. Semantic Category Balance-Aware Involved Anti-Interference Network for Remote Sensing Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4409712. [Google Scholar] [CrossRef]
- Zuo, X.; Shao, Z.; Wang, J.; Huang, X.; Wang, Y. A cross-stage features fusion network for building extraction from remote sensing images. Geo-Spat. Inf. Sci. 2024, 27, 1–15. [Google Scholar] [CrossRef]
- Ye, Z.; Li, Y.; Li, Z.; Liu, H.; Zhang, Y.; Li, W. Attention Multiscale Network for Semantic Segmentation of Multimodal Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5610315. [Google Scholar] [CrossRef]
- Dai, X.; Xia, M.; Weng, L.; Hu, K.; Lin, H.; Qian, M. Multiscale Location Attention Network for Building and Water Segmentation of Remote Sensing Image. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5609519. [Google Scholar] [CrossRef]
- Li, Y.; Hong, D.; Li, C.; Yao, J.; Chanussot, J. HD-Net: High-resolution decoupled network for building footprint extraction via deeply supervised body and boundary decomposition. ISPRS J. Photogramm. Remote Sens. 2024, 209, 51–65. [Google Scholar] [CrossRef]
- Zhang, X.; Su, Q.; Xiao, P.; Wang, W.; Li, Z.; He, G. FlipCAM: A Feature-Level Flipping Augmentation Method for Weakly Supervised Building Extraction From High-Resolution Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4402917. [Google Scholar] [CrossRef]
- Wu, Y.; Xu, L.; Chen, Y.; Wong, A.; Clausi, D.A. TAL: Topography-Aware Multi-Resolution Fusion Learning for Enhanced Building Footprint Extraction. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6506305. [Google Scholar] [CrossRef]
- Holail, S.; Saleh, T.; Xiao, X.; Li, D. AFDE-Net: Building Change Detection Using Attention-Based Feature Differential Enhancement for Satellite Imagery. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6006405. [Google Scholar] [CrossRef]
- Holail, S.; Saleh, T.; Xiao, X.; Xiao, J.; Xia, G.; Shao, Z.; Wang, M.; Gong, J.; Li, D. Time-series satellite remote sensing reveals gradually increasing war damage in the Gaza Strip. Natl. Sci. Rev. 2024, 11, 9. [Google Scholar] [CrossRef] [PubMed]
- Zhu, Y.; Huang, B.; Fan, Y.; Usman, M.; Chen, H. Iterative Polygon Deformation for Building Extraction. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4704314. [Google Scholar] [CrossRef]
- Chen, Z.; Liu, T.; Xu, X.; Leng, J.; Chen, Z. DCTC: Fast and Accurate Contour-Based Instance Segmentation with DCT Encoding for High-Resolution Remote Sensing Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 8697–8709. [Google Scholar] [CrossRef]
- Xie, Z.; Wu, Y.; Ma, Z.; Chen, M.; Qian, Z.; Zhang, F.; Sun, L.; Peng, B. An urban building use identification framework based on integrated remote sensing and social sensing data with spatial constraints. Geo-Spat. Inf. Sci. 2024, 27, 1–25. [Google Scholar] [CrossRef]
- Guo, N.; Jiang, M.; Hu, X.; Su, Z.; Zhang, W.; Li, R.; Luo, J. NPSFF-Net: Enhanced Building Segmentation in Remote Sensing Images via Novel Pseudo-Siamese Feature Fusion. Remote Sens. 2024, 16, 3266. [Google Scholar] [CrossRef]
- Saleh, T.; Holail, S.; Zahran, M.; Xiao, X.; Xia, G.-S. LiST-Net: Enhanced Flood Mapping with Lightweight SAR Transformer Network and Dimension-Wise Attention. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5211817. [Google Scholar] [CrossRef]
- Zhang, F.; Liu, K.; Liu, Y.; Wang, C.; Zhou, W.; Zhang, H.; Wang, L. Multitarget Domain Adaptation Building Instance Extraction of Remote Sensing Imagery with Domain-Common Approximation Learning. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4702916. [Google Scholar] [CrossRef]
- Saleh, T.; Holail, S.; Xiao, X.; Xia, G.-S. High-precision flood detection and mapping via multi-temporal SAR change analysis with semantic token-based transformer. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 1569–8432. [Google Scholar] [CrossRef]
- Chen, K.; Liu, C.; Chen, H.; Zhang, H.; Li, W.; Zou, Z.; Shi, Z. RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation Based on Visual Foundation Model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4701117. [Google Scholar] [CrossRef]
- Wang, M.; Su, L.; Yan, C.; Xu, S.; Yuan, P.; Jiang, X.; Zhang, B. RSBuilding: Toward General Remote Sensing Image Building Extraction and Change Detection with Foundation Model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4707417. [Google Scholar] [CrossRef]
- Fang, F.; Wu, K.; Liu, Y.; Li, S.; Wan, B.; Chen, Y.; Zheng, D. A Coarse-to-Fine Contour Optimization Network for Extracting Building Instances from High-Resolution Remote Sensing Imagery. Remote Sens. 2021, 13, 3814. [Google Scholar] [CrossRef]
- Qiu, Y.; Wu, F.; Qian, H.; Zhai, R.; Gong, X.; Yin, J.; Liu, C.; Wang, A. AFL-Net: Attentional Feature Learning Network for Building Extraction from Remote Sensing Images. Remote Sens. 2023, 15, 95. [Google Scholar] [CrossRef]
- Gerke, M.; Rottensteiner, F.; Wegner, J.; Sohn, G. ISPRS Semantic Labeling Contest. 6 September 2014. Available online: https://www.researchgate.net/profile/Markus-Gerke/publication/267150834_ISPRS_Semantic_Labeling_Contest/links/54462b390cf2d62c304da962/ISPRS-Semantic-Labeling-Contest.pdf (accessed on 27 February 2025).
- Kirillov, A.; Wu, Y.; He, K.; Girshick, R. PointRend: Image Segmentation As Rendering. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 9796–9805. [Google Scholar]
- Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J.; et al. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv 2019, arXiv:1906.07155. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9992–10002. [Google Scholar]
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2014; Volume 8693, pp. 740–755. ISBN 978-3-319-10601-4. [Google Scholar]
- Zhu, Q.; Liao, C.; Hu, H.; Mei, X.; Li, H. MAP-Net: Multiple Attending Path Neural Network for Building Footprint Extraction From Remote Sensed Imagery. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6169–6181. [Google Scholar] [CrossRef]
- Li, R.; Wang, L.; Zhang, C.; Duan, C.; Zheng, S. A2-FPN for Semantic Segmentation of Fine-Resolution Remotely Sensed Images. Int. J. Remote Sens. 2022, 43, 1131–1155. [Google Scholar] [CrossRef]
- Li, R.; Zheng, S.; Duan, C.; Wang, L.; Zhang, C. Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network. Geo-Spat. Inf. Sci. 2022, 25, 278–294. [Google Scholar] [CrossRef]
- Bokhovkin, A.; Burnaev, E. Boundary Loss for Remote Sensing Imagery Semantic Segmentation. In Advances in Neural Networks—ISNN 2019; Springer: Cham, Switzerland, 2019; Volume 11555. [Google Scholar] [CrossRef]
- Hosseinpour, H.; Samadzadegan, F.; Javan, F.D. A Novel Boundary Loss Function in Deep Convolutional Networks to Improve the Buildings Extraction From High-Resolution Remote Sensing Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 4437–4454. [Google Scholar] [CrossRef]
- Ma, S.; Li, T.; Zhai, S. Adaptive Layer Selection and Fusion Network for Infrastructure Contour Segmentation Using UAV Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2025, 22, 7500205. [Google Scholar] [CrossRef]
- Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljacic, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2404.19756. [Google Scholar]
- Cheon, M.; Mun, C. Combining KAN with CNN: KonvNeXt’s Performance in Remote Sensing and Patent Insights. Remote Sens. 2024, 16, 3417. [Google Scholar] [CrossRef]
- Jamali, A.; Roy, S.K.; Hong, D.; Lu, B.; Ghamisi, P. How to Learn More? Exploring Kolmogorov–Arnold Networks for Hyperspectral Image Classification. Remote Sens. 2024, 16, 4015. [Google Scholar] [CrossRef]
- Li, Y.; Liu, S.; Wu, J.; Sun, W.; Wen, Q.; Wu, Y.; Qin, X.; Qiao, Y. Multi-Scale Kolmogorov-Arnold Network (KAN)-Based Linear Attention Network: Multi-Scale Feature Fusion with KAN and Deformable Convolution for Urban Scene Image Semantic Segmentation. Remote Sens. 2025, 17, 802. [Google Scholar] [CrossRef]
- Ma, X.; Wang, Z.; Hu, Y.; Zhang, X.; Pun, M. Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation. arXiv 2025, arXiv:2501.07390. [Google Scholar]
- Wu, Z.; Lu, H.; Paoletti, M.E.; Su, H.; Jing, W.; Haut, J.M. KACNet: Kolmogorov-Arnold Convolution Network for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5506514. [Google Scholar] [CrossRef]
- Teymoor, S.S.; Sadegh, M.; Chanussot, J. Kolmogorov–Arnold Network for Hyperspectral Change Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5505515. [Google Scholar] [CrossRef]
- Xu, G.; Yang, S.; Feng, Z. Dual-Semantic Graph Convolution Network for Hyperspectral Image Classification with Few Labeled Samples. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5508815. [Google Scholar] [CrossRef]
Methods | APsegm | APsegm50 | APsegm75 | ARsegm | APbox | APbox50 | APbox75 | ARbox |
---|---|---|---|---|---|---|---|---|
PointRend | 0.558 | 0.732 | 0.625 | 0.735 | 0.558 | 0.721 | 0.628 | 0.655 |
Swin Transformer | 0.635 | 0.820 | 0.710 | 0.743 | 0.635 | 0.821 | 0.710 | 0.750 |
Mask R-CNN | 0.603 | 0.799 | 0.690 | 0.741 | 0.603 | 0.796 | 0.695 | 0.730 |
AP-PointRend | 0.646 | 0.836 | 0.724 | 0.750 | 0.648 | 0.837 | 0.725 | 0.735 |
Methods | APsegm | APsegm50 | APsegm75 | ARsegm | APbox | APbox50 | APbox75 | ARbox |
---|---|---|---|---|---|---|---|---|
PointRend | 0.544 | 0.736 | 0.641 | 0.591 | 0.559 | 0.726 | 0.625 | 0.632 |
Swin Transformer | 0.616 | 0.794 | 0.710 | 0.662 | 0.618 | 0.791 | 0.699 | 0.665 |
Mask R-CNN | 0.579 | 0.778 | 0.670 | 0.628 | 0.578 | 0.774 | 0.658 | 0.634 |
AP-PointRend | 0.637 | 0.825 | 0.725 | 0.670 | 0.641 | 0.814 | 0.727 | 0.675 |
Algorithm | Cropping Sizes | Overlapping Rate (%) | Mask AP50 | Mask AP70 | Mask AR | Box AP50 | Box AP70 | Box AR |
---|---|---|---|---|---|---|---|---|
Mask R-CNN | 448 × 448 | 0 | 51.1 | 31.1 | 52.8 | 51.5 | 28.5 | 50.6 |
 | | 20 | 54.5 | 33.8 | 54.2 | 55.1 | 31.7 | 52.9 |
 | | 50 | 55.5 | 37.0 | 53.3 | 55.4 | 35.2 | 53.4 |
 | 896 × 896 | 0 | 68.4 | 59.9 | 70.1 | 68.9 | 58.8 | 69.0 |
 | | 20 | 69.3 | 60.0 | 70.1 | 69.0 | 59.7 | 69.4 |
 | | 50 | 69.9 | 60.6 | 70.9 | 69.3 | 59.6 | 68.0 |
 | 1792 × 1792 | 0 | 72.4 | 61.5 | 73.0 | 85.9 | 66.3 | 72.0 |
 | | 20 | 73.2 | 62.5 | 73.5 | 79.6 | 69.5 | 73.0 |
 | | 50 | 73.6 | 62.9 | 74.1 | 80.2 | 69.6 | 73.6 |
PointRend | 448 × 448 | 0 | 62.6 | 49.8 | 60.8 | 61.4 | 51.1 | 60.8 |
 | | 20 | 61.0 | 48.9 | 59.4 | 60.7 | 50.2 | 59.7 |
 | | 50 | 61.7 | 48.9 | 59.7 | 60.6 | 50.2 | 60.3 |
 | 896 × 896 | 0 | 66.4 | 51.4 | 70.5 | 66.4 | 51.0 | 70.4 |
 | | 20 | 67.2 | 59.3 | 69.9 | 67.1 | 59.0 | 69.8 |
 | | 50 | 67.5 | 60.1 | 69.9 | 67.5 | 60.6 | 70.0 |
 | 1792 × 1792 | 0 | 72.9 | 61.2 | 73.2 | 71.7 | 61.6 | 73.3 |
 | | 20 | 73.2 | 62.5 | 73.5 | 72.1 | 62.8 | 73.5 |
 | | 50 | 73.8 | 63.2 | 74.1 | 72.9 | 63.1 | 74.9 |
Swin Transformer | 448 × 448 | 0 | 77.7 | 65.3 | 73.7 | 77.7 | 65.5 | 73.6 |
 | | 20 | 78.5 | 65.1 | 73.7 | 78.2 | 65.8 | 74.1 |
 | | 50 | 78.8 | 65.2 | 73.8 | 79.6 | 65.9 | 74.5 |
 | 896 × 896 | 0 | 80.6 | 68.7 | 74.1 | 80.7 | 68.9 | 74.8 |
 | | 20 | 80.9 | 68.9 | 74.8 | 80.9 | 69.1 | 74.9 |
 | | 50 | 81.1 | 69.2 | 74.9 | 81.4 | 69.7 | 74.9 |
 | 1792 × 1792 | 0 | 81.7 | 69.5 | 73.8 | 81.8 | 69.6 | 74.6 |
 | | 20 | 82.0 | 71.0 | 74.3 | 82.1 | 71.0 | 75.0 |
 | | 50 | 82.3 | 71.7 | 74.9 | 82.4 | 71.9 | 76.6 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhu, B.; Yu, D.; Xiao, X.; Shen, J.; Cui, Z.; Su, Y.; Li, A.; Li, D. AP-PointRend: An Improved Network for Building Extraction via High-Resolution Remote Sensing Images. Remote Sens. 2025, 17, 1481. https://doi.org/10.3390/rs17091481