Combining Deep Fully Convolutional Network and Graph Convolutional Neural Network for the Extraction of Buildings from Aerial Images
Abstract
1. Introduction
- (1) A new multi-scale feature fusion module, the atrous attention pyramid (AAP) module, is proposed. It fuses multi-scale features by combining multiple branches with an attention mechanism, which helps the network cope with complex scenes in which building sizes vary widely.
- (2) A dual graph convolution (DGC) module is used to obtain global contextual information in both the spatial and channel dimensions. This module guides the network to perceive effective features from the global context, reduces the influence of the background on building recognition, and allows more accurate classification of edge pixels.
- (3) A new network, the building graph convolutional network (BGC-Net), is proposed. It was evaluated on two different datasets, and the results show that it outperforms the compared CNN-based methods in overall accuracy (OA), recall, F1 score, and intersection over union (IoU).
2. Methodology
2.1. Overview of the Proposed Model
2.2. Feature Extraction Module
2.3. Atrous Attention Pyramid Module
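The contribution list describes the AAP module as fusing multi-scale features through parallel branches combined with attention. The following is a toy 1-D sketch of that general idea only, not the authors' module: the dilation rates, the fixed kernel, and the branch score (mean absolute response) are all assumptions made for illustration, whereas a real attention block would learn its weights.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D atrous (dilated) cross-correlation."""
    center = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(signal, kernel, dilations):
    """Run one branch per dilation rate and fuse them with softmax weights.

    The branch score used here is a hand-picked statistic (assumption);
    it stands in for the learned attention of a real network.
    """
    branches = [atrous_conv1d(signal, kernel, d) for d in dilations]
    scores = [sum(abs(v) for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


# Hypothetical input and dilation rates, chosen only for the demonstration.
signal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0]
kernel = [1.0, 1.0, 1.0]
fused, weights = attention_fuse(signal, kernel, dilations=(1, 2, 3))
```

Larger dilation rates widen the receptive field without adding kernel weights, which is why parallel branches with different rates respond to structures of different sizes.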
2.4. Dual Graph Convolutional Module
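The contribution list states that the DGC module gathers global context in the spatial and channel dimensions. As a hedged illustration of the building block involved, the sketch below implements one generic graph-convolution step with the widely used symmetric normalization, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). This section does not specify the authors' exact formulation, so the propagation rule, the function names, and the tiny example graph are assumptions.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    deg = [sum(row) for row in a_hat]  # >= 1 for non-negative adj (self-loops)
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: aggregate neighbours, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical 2-node graph: each node has one feature channel switched on.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
node_features = [[1.0, 0.0], [0.0, 1.0]]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]
mixed = gcn_layer(adjacency, node_features, identity_weight)
```

One plausible reading of "spatial and channel dimensions" is that the same step is applied twice: once with spatial positions as nodes (features of shape N × C) and once with channels as nodes (the transposed C × N matrix with a C × C adjacency). That reading is an assumption, not something this section confirms.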
2.5. Decoder Module
3. Experimental Datasets and Evaluation
3.1. Datasets
3.2. Evaluation Metrics
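The metrics named in the contribution list and reported in the result tables (OA, precision, recall, F1 score, and IoU) have standard pixel-wise definitions for binary segmentation. The sketch below computes them from confusion-matrix counts, assuming that "building" is the positive class and that counts are accumulated over all evaluated pixels. The function name and the example counts are hypothetical, and the authors' exact averaging scheme is not stated in this section.

```python
def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(tp, fp, fn, tn):
    """Pixel-wise binary segmentation metrics from confusion counts.

    tp: building pixels predicted as building
    fp: background pixels predicted as building
    fn: building pixels predicted as background
    tn: background pixels predicted as background
    """
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    return {
        "oa": _safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": _safe_div(2 * precision * recall, precision + recall),
        # IoU ignores true negatives, so it is stricter than OA on sparse classes.
        "iou": _safe_div(tp, tp + fp + fn),
    }


# Hypothetical counts for a 1000-pixel example.
example = segmentation_metrics(tp=90, fp=10, fn=10, tn=890)
```

Because background pixels usually dominate aerial tiles, OA can stay high even when the building class is segmented poorly, which is why IoU and F1 are reported alongside it.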
3.3. Implementation Details
4. Results
4.1. Visualization Results
4.1.1. Results on WHU Dataset
4.1.2. Results on CHN Dataset
4.2. Quantitative Comparisons
5. Discussion
5.1. Ablation Study
5.2. Comparing with the State-of-the-Art
5.3. The Effects of Channel Dimension GCN Part
5.4. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Wu, T.; Hu, Y.; Peng, L.; Chen, R. Improved Anchor-Free Instance Segmentation for Building Extraction from High-Resolution Remote Sensing Images. Remote Sens. 2020, 12, 2910.
- Zhou, J.; Liu, Y.; Nie, G.; Cheng, H.; Yang, X.; Chen, X.; Gross, L. Building Extraction and Floor Area Estimation at the Village Level in Rural China via a Comprehensive Method Integrating UAV Photogrammetry and the Novel EDSANet. Remote Sens. 2022, 14, 5175.
- Liu, Y.; Zhou, J.; Qi, W.; Li, X.; Gross, L.; Shao, Q.; Zhao, Z.; Ni, L.; Fan, X.; Li, Z. ARC-Net: An Efficient Network for Building Extraction from High-Resolution Aerial Images. IEEE Access 2020, 8, 154997–155010.
- Moya, L.; Perez, L.R.M.; Mas, E.; Adriano, B.; Koshimura, S.; Yamazaki, F. Novel Unsupervised Classification of Collapsed Buildings Using Satellite Imagery, Hazard Scenarios and Fragility Functions. Remote Sens. 2018, 10, 296.
- Sun, S.; Mu, L.; Wang, L.; Liu, P.; Liu, X.; Zhang, Y. Semantic Segmentation for Buildings of Large Intra-Class Variation in Remote Sensing Images with O-GAN. Remote Sens. 2021, 13, 475.
- Liu, Y.; Gross, L.; Li, Z.; Li, X.; Fan, X.; Qi, W. Automatic Building Extraction on High-Resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder with Spatial Pyramid Pooling. IEEE Access 2019, 7, 128774–128786.
- Shackelford, A.K.; Davis, C.H. A Combined Fuzzy Pixel-Based and Object-Based Approach for Classification of High-Resolution Multispectral Data over Urban Areas. IEEE Trans. Geosci. Remote Sens. 2003, 41, 2354–2363.
- Hossain, M.D.; Chen, D. Segmentation for Object-Based Image Analysis (OBIA): A Review of Algorithms and Challenges from Remote Sensing Perspective. ISPRS J. Photogramm. Remote Sens. 2019, 150, 115–134.
- Wang, J.; Yang, X.; Qin, X.; Ye, X.; Qin, Q. An Efficient Approach for Automatic Rectangular Building Extraction from Very High Resolution Optical Satellite Imagery. IEEE Geosci. Remote Sens. Lett. 2014, 12, 487–491.
- Lin, C.; Nevatia, R. Building Detection and Description from a Single Intensity Image. Comput. Vis. Image Underst. 1998, 72, 101–121.
- Huang, D.; Sun, J.; Liu, S.; Xu, S.; Liang, S.; Li, C.; Wang, Z. Multi-dimension and multi-granularity segmentation of remote sensing image based on improved Otsu algorithm. In Proceedings of the 2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC), Calabria, Italy, 16–18 May 2017; IEEE: New York, NY, USA, 2017; pp. 679–684.
- Du, J.; Chen, D.; Wang, R.; Peethambaran, J.; Mathiopoulos, P.T.; Xie, L.; Yun, T. A Novel Framework for 2.5-D Building Contouring from Large-Scale Residential Scenes. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4121–4145.
- Awrangjeb, M.; Zhang, C.; Fraser, C.S. Automatic Extraction of Building Roofs Using LIDAR Data and Multispectral Imagery. ISPRS J. Photogramm. Remote Sens. 2013, 83, 1–18.
- Cui, W.; Zhang, Y. An Effective Graph-Based Hierarchy Image Segmentation. Intell. Autom. Soft Comput. 2011, 17, 969–981.
- Chaokui, L.; Jun, F.; Baiyan, W.; Jianhui, C. Research on the Classification of High Resolution Image Based on Object-Oriented and Class Rule. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 7, 75–80.
- Li, C.; Dong, X.; Zhang, Q. Multi-scale object-oriented building extraction method of Tai’an city from high resolution image. In Proceedings of the 2014 Third International Workshop on Earth Observation and Remote Sensing Applications (EORSA), Changsha, China, 11–14 June 2014; IEEE: New York, NY, USA, 2014; pp. 91–95.
- Yan, Z.; Huazhong, R.; Desheng, C. The research of building earthquake damage object-oriented change detection based on ensemble classifier with remote sensing image. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; IEEE: New York, NY, USA, 2018; pp. 4950–4953.
- Alshehhi, R.; Marpu, P.R.; Woon, W.L.; Dalla Mura, M. Simultaneous Extraction of Roads and Buildings in Remote Sensing Imagery with Convolutional Neural Networks. ISPRS J. Photogramm. Remote Sens. 2017, 130, 139–149.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 84–90.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Noh, H.; Hong, S.; Han, B. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 1520–1528.
- Yu, M.; Zhang, W.; Chen, X.; Liu, Y.; Niu, J. An End-to-End Atrous Spatial Pyramid Pooling and Skip-Connections Generative Adversarial Segmentation Network for Building Extraction from High-Resolution Aerial Images. Appl. Sci. 2022, 12, 5151.
- Jin, Y.; Xu, W.; Zhang, C.; Luo, X.; Jia, H. Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images. Remote Sens. 2021, 13, 692.
- Pan, X.; Gao, L.; Zhang, B.; Yang, F.; Liao, W. High-Resolution Aerial Imagery Semantic Labeling with Dense Pyramid Network. Sensors 2018, 18, 3774.
- Liu, H.; Luo, J.; Huang, B.; Hu, X.; Sun, Y.; Yang, Y.; Xu, N.; Zhou, N. DE-Net: Deep Encoding Network for Building Extraction from High-Resolution Remote Sensing Imagery. Remote Sens. 2019, 11, 2380.
- Ji, S.; Wei, S.; Lu, M. A Scale Robust Convolutional Neural Network for Automatic Building Extraction from Aerial and Satellite Imagery. Int. J. Remote Sens. 2019, 40, 3308–3322.
- Liu, Y.; Zhu, Q.; Cao, F.; Chen, J.; Lu, G. High-Resolution Remote Sensing Image Segmentation Framework Based on Attention Mechanism and Adaptive Weighting. ISPRS Int. J. Geo-Inf. 2021, 10, 241.
- Zhu, Q.; Liao, C.; Hu, H.; Mei, X.; Li, H. MAP-Net: Multiple Attending Path Neural Network for Building Footprint Extraction from Remote Sensed Imagery. IEEE Trans. Geosci. Remote Sens. 2020, 59, 6169–6181.
- Sun, G.; Huang, H.; Zhang, A.; Li, F.; Zhao, H.; Fu, H. Fusion of Multiscale Convolutional Neural Networks for Building Extraction in Very High-Resolution Images. Remote Sens. 2019, 11, 227.
- Zhang, Z.; Huang, J.; Jiang, T.; Sui, B.; Pan, X. Semantic Segmentation of Very High-Resolution Remote Sensing Image Based on Multiple Band Combinations and Patchwise Scene Analysis. J. Appl. Remote Sens. 2020, 14, 16502.
- Guo, M.; Liu, H.; Xu, Y.; Huang, Y. Building Extraction Based on U-Net with an Attention Block and Multiple Losses. Remote Sens. 2020, 12, 1400.
- Zhang, F.; Chen, Y.; Li, Z.; Hong, Z.; Liu, J.; Ma, F.; Han, J.; Ding, E. Acfnet: Attentional class feature network for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6798–6807.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected Crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
- Yang, M.; Yu, K.; Zhang, C.; Li, Z.; Yang, K. Denseaspp for semantic segmentation in street scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3684–3692.
- Yu, M.; Chen, X.; Zhang, W.; Liu, Y. AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network. Sensors 2022, 22, 2932.
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 3146–3154.
- Kang, W.; Xiang, Y.; Wang, F.; You, H. EU-Net: An Efficient Fully Convolutional Network for Building Extraction from Optical Remote Sensing Images. Remote Sens. 2019, 11, 2813.
- Yan, J.; Ji, S.; Wei, Y. A Combination of Convolutional and Graph Neural Networks for Regularized Road Surface Extraction. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13.
- Zhang, J.; Hua, Z.; Yan, K.; Tian, K.; Yao, J.; Liu, E.; Liu, M.; Han, X. Joint Fully Convolutional and Graph Convolutional Networks for Weakly-Supervised Segmentation of Pathology Images. Med. Image Anal. 2021, 73, 102183.
- Ouyang, S.; Li, Y. Combining Deep Semantic Segmentation Network and Graph Convolutional Neural Network for Semantic Segmentation of Remote Sensing Imagery. Remote Sens. 2020, 13, 119.
- Yuan, Y.; Huang, L.; Guo, J.; Zhang, C.; Chen, X.; Wang, J. Ocnet: Object Context Network for Scene Parsing. arXiv 2018, arXiv:1809.00916.
- Zhang, L.; Li, X.; Arnab, A.; Yang, K.; Tong, Y.; Torr, P.H.S. Dual Graph Convolutional Network for Semantic Segmentation. arXiv 2019, arXiv:1909.06121.
- Ji, S.; Wei, S.; Lu, M. Fully Convolutional Networks for Multisource Building Extraction from an Open Aerial and Satellite Imagery Data Set. IEEE Trans. Geosci. Remote Sens. 2018, 57, 574–586.
- Fang, F.; Wu, K.; Zheng, D.; Chen, Y.; Zeng, L.; Zhang, J.; Chai, S.; Xu, W.; Yang, Y.; Li, S.; et al. A Dataset of Building Instances of Typical Cities in China. Chin. Sci. Data 2021, 6, 191–199.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
Dataset | Resolution (m) | Tile size (pixels) | Train | Validation | Test |
---|---|---|---|---|---|
WHU | 0.30 | 512 × 512 | 4736 | 1036 | 2416 |
CHN | 0.29 | 500 × 500 | 5985 | - | 1275 |
Item | Configuration | Item | Configuration |
---|---|---|---|
CPU | Intel(R) Core(TM) i9-10850K | Operating system | Windows 10 |
RAM | 32 GB | CUDA version | CUDA 11 |
Hard disk | 1 TB | Language used | Python 3.6 |
GPU | NVIDIA GeForce RTX 3070 | Deep learning framework | PyTorch 1.8 |
ResNet | AAP | DGC | OA | Precision | Recall | F1-Score | IoU |
---|---|---|---|---|---|---|---|
√ | √ | 0.957 | 0.879 | 0.947 | 0.912 | 0.838 | |
√ | √ | 0.973 | 0.952 | 0.936 | 0.944 | 0.895 | |
√ | √ | √ | 0.976 | 0.951 | 0.952 | 0.951 | 0.908 |
ResNet | AAP | DGC | OA | Precision | Recall | F1-Score | IoU |
---|---|---|---|---|---|---|---|
√ | √ | 0.956 | 0.921 | 0.919 | 0.919 | 0.852 | |
√ | √ | 0.958 | 0.919 | 0.929 | 0.923 | 0.859 | |
√ | √ | √ | 0.961 | 0.926 | 0.939 | 0.932 | 0.871 |
Network | OA | Precision | Recall | F1-Score | IoU |
---|---|---|---|---|---|
ARC-Net [2] | 0.968 | 0.929 | 0.949 | 0.938 | 0.884 |
BARNet [23] | 0.972 | 0.954 | 0.934 | 0.944 | 0.895 |
BGC-Net | 0.976 | 0.959 | 0.949 | 0.953 | 0.909 |
Configuration | OA | Precision | Recall | F1-Score | IoU |
---|---|---|---|---|---|
Without channel dimension DGC | 0.965 | 0.969 | 0.944 | 0.956 | 0.907 |
With channel dimension DGC | 0.975 | 0.957 | 0.966 | 0.961 | 0.918 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, W.; Yu, M.; Chen, X.; Zhou, F.; Ren, J.; Xu, H.; Xu, S. Combining Deep Fully Convolutional Network and Graph Convolutional Neural Network for the Extraction of Buildings from Aerial Images. Buildings 2022, 12, 2233. https://doi.org/10.3390/buildings12122233