Author Contributions
Conceptualization, W.Z. and M.Y.; methodology, W.Z. and X.C.; software, W.Z. and X.C.; validation, W.Z. and M.Y.; formal analysis, M.Y.; investigation, W.Z.; resources, M.Y. and F.Z.; data curation, W.Z. and J.R.; writing—original draft preparation, M.Y.; writing—review and editing, W.Z.; visualization, W.Z.; supervision, M.Y. and H.X.; project administration, M.Y.; funding acquisition, M.Y. and S.X. All authors have read and agreed to the published version of the manuscript.
Figure 1.
The structure of our proposed BGC-Net, consisting of three parts: the FE module, the AAP module, and the DGC module.
Figure 2.
The structure of the residual block.
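For reference, a minimal PyTorch sketch of a standard bottleneck residual block of the kind used in ResNet-50; the channel sizes are illustrative, not taken from the figure:

```python
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """Standard ResNet bottleneck residual block: 1x1 -> 3x3 -> 1x1 with a skip connection."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Project the identity when the shape changes so the addition is valid.
        self.skip = (nn.Identity() if in_ch == out_ch and stride == 1 else
                     nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                                   nn.BatchNorm2d(out_ch)))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.skip(x))
```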
Figure 3.
The structure of ResNet-50.
Figure 4.
Structure of the atrous attention pyramid (AAP) module in the BGC-Net.
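As a rough illustration of the parallel-atrous idea behind the AAP module, a minimal PyTorch sketch follows; the dilation rates are assumptions, and the attention component of the module is omitted, so this is not the paper's specification:

```python
import torch
import torch.nn as nn

class AtrousPyramid(nn.Module):
    """Parallel 3x3 convolutions at increasing dilation rates, fused by a 1x1 conv.
    Dilation rates (1, 6, 12, 18) are illustrative only."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        # Concatenate the multi-scale responses, then fuse channel-wise.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```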
Figure 5.
Structure of the dual graph convolutional module in the BGC-Net. This module consists of two branches, each containing a graph convolutional network (GCN) that models contextual information along the spatial and channel dimensions of a convolutional feature map X.
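A minimal PyTorch sketch of this dual-branch idea, assuming dot-product similarity graphs over pixels and over channels; the paper's exact graph construction, node projection, and fusion may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualGCN(nn.Module):
    """Sketch: one GCN over spatial positions, one over channels, fused residually.
    Note the (HW x HW) spatial adjacency is memory-heavy; real designs usually
    project to fewer nodes first."""
    def __init__(self, c):
        super().__init__()
        self.w_spatial = nn.Conv1d(c, c, 1)  # node-feature transform, spatial graph
        self.w_channel = nn.Conv1d(c, c, 1)  # node-feature transform, channel graph

    def forward(self, x):
        b, c, h, w = x.shape
        nodes = x.flatten(2)                                      # (B, C, HW)
        # Spatial branch: graph whose nodes are the HW positions.
        adj_s = F.softmax(nodes.transpose(1, 2) @ nodes, dim=-1)  # (B, HW, HW)
        out_s = self.w_spatial(nodes @ adj_s.transpose(1, 2))     # propagate + transform
        # Channel branch: graph whose nodes are the C channels.
        adj_c = F.softmax(nodes @ nodes.transpose(1, 2), dim=-1)  # (B, C, C)
        out_c = self.w_channel(adj_c @ nodes)
        # Fuse both branches with the input via a residual connection.
        return x + (out_s + out_c).view(b, c, h, w)
```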
Figure 6.
Structure of the decoder module in the BGC-Net.
Figure 7.
Examples of the images and corresponding labels for the two employed datasets. (a,b) show the WHU dataset and the CHN dataset, respectively. White pixels denote buildings.
Figure 8.
Examples of building extraction results obtained by different networks on the WHU dataset. (a) Original image. (b) Ground truth. (c) FCN8s. (d) DANet. (e) SegNet. (f) U-Net. (g) BGC-Net. Note, in Columns (b–g), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively. The red rectangles in (a) mark the regions selected for close-up inspection in Figure 9.
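For reference, the per-pixel color coding described in the captions of Figures 8–16 can be generated from a binary prediction and its ground truth; a minimal sketch, with the RGB values assumed rather than taken from the paper:

```python
import numpy as np

def error_map(pred, gt):
    """Render the TP/FN/FP convention used in the comparison figures:
    green = true positive, blue = false negative, red = false positive.
    The exact RGB choices are assumptions."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    img = np.zeros((*gt.shape, 3), dtype=np.uint8)
    img[np.logical_and(pred, gt)] = (0, 255, 0)    # green: true positive
    img[np.logical_and(~pred, gt)] = (0, 0, 255)   # blue: false negative
    img[np.logical_and(pred, ~gt)] = (255, 0, 0)   # red: false positive
    return img
```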
Figure 9.
Close-up views of the results obtained by different networks on the WHU dataset. Images and results shown in (a–g) are subsets of the selected regions marked in Figure 8a. (a) Original image. (b) Ground truth. (c) FCN8s. (d) DANet. (e) SegNet. (f) U-Net. (g) BGC-Net. Note, in Columns (b–g), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Figure 10.
Examples of building extraction results obtained by different networks on the CHN dataset. (a) Original image. (b) Ground truth. (c) FCN8s. (d) DANet. (e) SegNet. (f) U-Net. (g) BGC-Net. Note, in Columns (b–g), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively. The red rectangles in (a) mark the regions selected for close-up inspection in Figure 11.
Figure 11.
Close-up views of the results obtained by different networks on the CHN dataset. Images and results shown in (a–g) are subsets of the selected regions marked in Figure 10a. (a) Original image. (b) Ground truth. (c) FCN8s. (d) DANet. (e) SegNet. (f) U-Net. (g) BGC-Net. Note, in Columns (b–g), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Figure 12.
Quantitative results of different networks on the two datasets: (a) the WHU dataset; (b) the CHN dataset.
Figure 13.
Comparison of ablation experiment results on the WHU dataset. (a) Original image. (b) Ground truth. (c) ResNet + AAP. (d) ResNet + DGC. (e) ResNet + AAP + DGC. Note, in Columns (b–e), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Figure 14.
Comparison of ablation experimental results on the CHN dataset. (a) Original image. (b) Ground truth. (c) ResNet + AAP. (d) ResNet + DGC. (e) ResNet + AAP + DGC. Note, in Columns (b–e), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Figure 15.
Building extraction results of different networks on the WHU dataset. (a) Original image. (b) Ground truth. (c) ARC-Net. (d) BARNet. (e) BGC-Net. Note, in Columns (b–e), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Figure 16.
Comparison of the BGC-Net with and without the channel dimension DGC part on the WHU dataset. (a) Original image. (b) Ground truth. (c) Without channel dimension DGC. (d) With channel dimension DGC. Note, in Columns (b–d), black, white, green, blue, and red indicate true, false, true-positive, false-negative, and false-positive, respectively.
Table 1.
Basic parameters and training/validation/test splits of the two datasets.
| Dataset | Resolution (m) | Size | Train | Validation | Test |
|---|---|---|---|---|---|
| WHU | 0.30 | 512 × 512 | 4736 | 1036 | 2416 |
| CHN | 0.29 | 500 × 500 | 5985 | - | 1275 |
Table 2.
Hardware and software configuration.
| Item | Parameter | Item | Parameter |
|---|---|---|---|
| CPU | Intel(R) Core(TM) i9-10850K | Operating system | Windows 10 |
| RAM | 32 GB | CUDA version | CUDA 11 |
| Hard disk | 1 TB | Language used | Python 3.6 |
| GPU | NVIDIA GeForce RTX 3070 | Deep learning framework | PyTorch 1.8 |
Table 3.
Accuracy statistics of the ablation experiment on the WHU dataset. The best scores are highlighted in bold.
| ResNet | AAP | DGC | OA | Precision | Recall | F1-Score | IoU |
|---|---|---|---|---|---|---|---|
| √ | √ | | 0.957 | 0.879 | 0.947 | 0.912 | 0.838 |
| √ | | √ | 0.973 | **0.952** | 0.936 | 0.944 | 0.895 |
| √ | √ | √ | **0.976** | 0.951 | **0.952** | **0.951** | **0.908** |
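The metric columns above (OA, precision, recall, F1-score, IoU) follow the standard confusion-matrix definitions; a minimal sketch for binary building masks (the function name is illustrative, and non-empty masks are assumed so the denominators are nonzero):

```python
import numpy as np

def binary_metrics(pred, gt):
    """Overall accuracy, precision, recall, F1, and IoU for binary masks (1 = building)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # building predicted as building
    fp = np.logical_and(pred, ~gt).sum()   # background predicted as building
    fn = np.logical_and(~pred, gt).sum()   # building predicted as background
    tn = np.logical_and(~pred, ~gt).sum()  # background predicted as background
    oa = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return oa, precision, recall, f1, iou
```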
Table 4.
Accuracy statistics of the ablation experiment on the CHN dataset. The best scores are highlighted in bold.
| ResNet | AAP | DGC | OA | Precision | Recall | F1-Score | IoU |
|---|---|---|---|---|---|---|---|
| √ | √ | | 0.956 | 0.921 | 0.919 | 0.919 | 0.852 |
| √ | | √ | 0.958 | 0.919 | 0.929 | 0.923 | 0.859 |
| √ | √ | √ | **0.961** | **0.926** | **0.939** | **0.932** | **0.871** |
Table 5.
Average accuracy of different networks for building extraction on the WHU dataset. The best scores are highlighted in bold.
| Network | OA | Precision | Recall | F1-Score | IoU |
|---|---|---|---|---|---|
| ARC-Net [2] | 0.968 | 0.929 | **0.949** | 0.938 | 0.884 |
| BARNet [23] | 0.972 | 0.954 | 0.934 | 0.944 | 0.895 |
| BGC-Net | **0.976** | **0.959** | **0.949** | **0.953** | **0.909** |
Table 6.
Average accuracy of the BGC-Net with or without the channel dimension DGC part on the WHU dataset. The best scores are highlighted in bold.
| Configuration | OA | Precision | Recall | F1-Score | IoU |
|---|---|---|---|---|---|
| Without channel dimension DGC | 0.965 | **0.969** | 0.944 | 0.956 | 0.907 |
| With channel dimension DGC | **0.975** | 0.957 | **0.966** | **0.961** | **0.918** |
Table 7.
Parameters, computation, and training time of each model on the WHU and CHN datasets. The best scores are highlighted in bold.
| Model | WHU Training Time (s/Epoch) | CHN Training Time (s/Epoch) | Parameters (M) | Computation (GMac) |
|---|---|---|---|---|
| FCN8s [18] | 275 | 344 | 134.27 | 62.81 |
| DANet [39] | **103** | **119** | 49.48 | **10.93** |
| SegNet [20] | 140 | 179 | 16.31 | 23.77 |
| U-Net [19] | 129 | 187 | **13.4** | 23.77 |
| BGC-Net | 256 | 294 | 79.73 | 29.46 |