Semantic Segmentation Network for Unstructured Rural Roads Based on Improved SPPM and Fused Multiscale Features
Abstract
1. Introduction
2. Related Work
3. Improvement of PP-LiteSeg Rural Road Recognition Methods
3.1. Architecture of Semantic Segmentation Model
3.2. Simple Pyramid Module for Strip Pooling
3.3. Parallel Feature Fusion Module
Bottleneck Attention Module
4. Dataset Construction
4.1. Image Data Acquisition
4.2. Data Preprocessing
5. Experimentation and Analysis
5.1. Experiment Platform and Parameter Settings
5.2. Evaluation Indices
5.3. Experiment: Comparison with State-of-the-Art Methods
5.4. Experiment: Results on the Indian Driving Dataset (IDD)
5.5. Ablation Experiment: Category Accuracy of Semantic Segmentation Models with Different Functional Unit Configurations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Category | Number of Semantics |
---|---|
Fence | 983 |
Barrier | 1202 |
Asphalt road | 8480 |
Non-hardened road | 4957 |
Cement road | 1106 |
Building | 4047 |
Person | 327 |
Sky | 2830 |
Vegetation | 6730 |
Banner | 592 |
Pole | 3903 |
Traffic sign | 724 |
Car | 1172 |
Motorcycle | 294 |
Truck | 516 |
Agricultural machinery | 106 |
Tower | 283 |
Setting | Value |
---|---|
Batch size | 4 |
Crop size | |
Momentum | 0.9 |
Initial learning rate | 0.005 |
Weight decay | 0.000005 |
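The settings above can be read as the hyperparameters of a momentum-based optimizer. The following is a minimal sketch, assuming plain SGD with momentum and L2 weight decay folded into the gradient; the table does not name the optimizer or the learning-rate schedule, so `sgd_momentum_step` and `TRAIN_SETTINGS` are hypothetical names used only for illustration. The crop size is left unset because the table gives no value.

```python
# Values copied from the settings table. Crop size is None because the
# table leaves that cell empty.
TRAIN_SETTINGS = {
    "batch_size": 4,
    "crop_size": None,
    "momentum": 0.9,
    "initial_learning_rate": 0.005,
    "weight_decay": 0.000005,
}


def sgd_momentum_step(weights, grads, velocity, settings=TRAIN_SETTINGS):
    """Apply one SGD-with-momentum update to a flat list of parameters.

    Assumption: weight decay is applied as an L2 term added to the gradient,
    and the learning rate stays at its initial value (no schedule).
    """
    lr = settings["initial_learning_rate"]
    mu = settings["momentum"]
    wd = settings["weight_decay"]
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + wd * w      # L2 weight decay folded into the gradient
        v = mu * v + g      # update the momentum buffer
        new_velocity.append(v)
        new_weights.append(w - lr * v)
    return new_weights, new_velocity
```

Calling `sgd_momentum_step([1.0], [0.5], [0.0])` performs a single update of one parameter starting from a zero momentum buffer.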
Model | MIoU (%) | Kappa (%) | Dice (%) | Parameters | FLOPs (G) | FPS (f·s⁻¹) |
---|---|---|---|---|---|---|
Unet | 39.32 | 86.83 | 62.31 | 118.02 | 6.13 | |
Enet | 33.82 | 85.28 | 47.86 | 2.50 | 28.73 | |
BiSeNetv1 | 50.51 | 88.60 | 65.69 | 52.76 | 22.73 | |
BiSeNetv2 | 40.68 | 88.83 | 51.25 | 7.52 | 43.50 | |
PP-LiteSeg | 51.38 | 88.20 | 64.55 | 5.34 | 41.67 | |
The proposed method | 53.82 | 89.36 | 67.04 | 5.42 | 40.03 |
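The MIoU, Kappa, and Dice columns above are all derived from a pixel-level confusion matrix. The sketch below assumes the standard definitions (per-class IoU and Dice averaged over classes, and Cohen's kappa); the function name `segmentation_metrics` is hypothetical, and the exact averaging used in the experiments may differ.

```python
def segmentation_metrics(conf):
    """Return (mIoU, kappa, mean Dice) as fractions in [0, 1].

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    row_sums = [sum(conf[i]) for i in range(n)]                        # true pixels per class
    col_sums = [sum(conf[i][j] for i in range(n)) for j in range(n)]   # predicted pixels per class

    ious, dices = [], []
    for k in range(n):
        tp = conf[k][k]
        denom = row_sums[k] + col_sums[k]
        if denom == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / (denom - tp))   # TP / (TP + FP + FN)
        dices.append(2 * tp / denom)     # 2TP / (2TP + FP + FN)

    miou = sum(ious) / len(ious)
    dice = sum(dices) / len(dices)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    # Undefined when chance agreement p_e equals 1.
    p_o = sum(conf[k][k] for k in range(n)) / total
    p_e = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    kappa = (p_o - p_e) / (1 - p_e)
    return miou, kappa, dice
```

Multiplying each returned value by 100 gives percentages in the same form as the table columns.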
Model | MIoU (%) | Kappa (%) | Dice (%) | Parameters | FLOPs (G) | FPS (f·s⁻¹) |
---|---|---|---|---|---|---|
Unet | 28.85 | 85.66 | 50.20 | 118.02 | 6.56 | |
Enet | 21.82 | 84.37 | 32.39 | 2.50 | 25.17 | |
BiSeNetv1 | 47.38 | 87.23 | 63.58 | 52.76 | 17.87 | |
BiSeNetv2 | 39.39 | 87.34 | 49.63 | 7.52 | 35.26 | |
PP-LiteSeg | 49.48 | 87.22 | 62.59 | 5.34 | 39.20 | |
The proposed method | 51.08 | 87.81 | 64.66 | 5.42 | 38.16 |
Category | Base Model (%) | +SP (%) | +BAM (%) | +SP+BAM (%) |
---|---|---|---|---|
Building | 82.39 | 82.99 | 81.62 | 84.59 |
Asphalt road | 89.10 | 89.88 | 90.33 | 91.38 |
Non-hardened road | 75.15 | 76.88 | 75.53 | 77.48 |
Sky | 70.44 | 71.56 | 74.63 | 75.07 |
Barrier | 69.07 | 81.39 | 82.06 | 82.75 |
Car | 60.16 | 59.86 | 63.54 | 63.11 |
Tower | 93.35 | 93.73 | 93.20 | 93.58 |
Pole | 65.99 | 65.23 | 65.47 | 68.65 |
Vegetation | 98.73 | 98.74 | 98.71 | 98.69 |
Fence | 68.56 | 70.17 | 70.71 | 75.62 |
Cement road | 62.89 | 61.45 | 64.72 | 69.44 |
Motorcycle | 56.37 | 55.78 | 62.44 | 64.26 |
Agricultural machinery | 85.20 | 76.84 | 85.50 | 86.64 |
Banner | 70.04 | 71.04 | 72.86 | 74.57 |
Person | 75.78 | 70.07 | 75.30 | 82.45 |
Truck | 65.07 | 84.08 | 87.79 | 88.87 |
Traffic sign | 72.19 | 73.83 | 74.69 | 75.59 |
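The per-category figures above can be compared programmatically. The sketch below copies the table values and reports which category gains most from the combined configuration and where the combined configuration is not the best of the four; the names `ACCURACY`, `largest_gain`, and `combined_not_best` are hypothetical helpers, not part of the original work.

```python
# Columns: Base model, +SP, +BAM, +SP+BAM (values in %, copied from the table).
ACCURACY = {
    "Building": (82.39, 82.99, 81.62, 84.59),
    "Asphalt road": (89.10, 89.88, 90.33, 91.38),
    "Non-hardened road": (75.15, 76.88, 75.53, 77.48),
    "Sky": (70.44, 71.56, 74.63, 75.07),
    "Barrier": (69.07, 81.39, 82.06, 82.75),
    "Car": (60.16, 59.86, 63.54, 63.11),
    "Tower": (93.35, 93.73, 93.20, 93.58),
    "Pole": (65.99, 65.23, 65.47, 68.65),
    "Vegetation": (98.73, 98.74, 98.71, 98.69),
    "Fence": (68.56, 70.17, 70.71, 75.62),
    "Cement road": (62.89, 61.45, 64.72, 69.44),
    "Motorcycle": (56.37, 55.78, 62.44, 64.26),
    "Agricultural machinery": (85.20, 76.84, 85.50, 86.64),
    "Banner": (70.04, 71.04, 72.86, 74.57),
    "Person": (75.78, 70.07, 75.30, 82.45),
    "Truck": (65.07, 84.08, 87.79, 88.87),
    "Traffic sign": (72.19, 73.83, 74.69, 75.59),
}


def largest_gain(table):
    """Return (category, gain) for the biggest +SP+BAM improvement over the base model."""
    name = max(table, key=lambda c: table[c][3] - table[c][0])
    return name, round(table[name][3] - table[name][0], 2)


def combined_not_best(table):
    """Return the categories where +SP+BAM is not the highest of the four columns."""
    return sorted(c for c, row in table.items() if row[3] < max(row))


print(largest_gain(ACCURACY))       # ('Truck', 23.8)
print(combined_not_best(ACCURACY))  # ['Car', 'Tower', 'Vegetation']
```

For Car, Tower, and Vegetation the combined configuration trails the best single configuration by less than half a percentage point.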
Model | MIoU (%) | Kappa (%) | Dice (%) | Parameters | FLOPs (G) | FPS (f·s⁻¹) |
---|---|---|---|---|---|---|
Base model | 51.38 | 88.20 | 64.55 | 5.34 | 41.67 | |
+SP | 53.10 | 88.60 | 66.39 | 5.38 | 41.17 | |
+BAM | 53.39 | 89.02 | 64.42 | 5.34 | 41.55 | |
+SP+BAM | 53.82 | 89.36 | 67.04 | 5.42 | 40.03 |
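The trade-off in the ablation summary above can be made explicit by differencing each configuration against the base model. This is a minimal sketch; it treats the last numeric value in each row as FPS, which is an assumption because the rows carry one fewer value than the header has columns, and it omits the ambiguous Parameters/FLOPs value for the same reason. `ABLATION` and `deltas_vs_base` are hypothetical names.

```python
# Values copied from the ablation summary table.
# Assumption: the final numeric value of each row is FPS.
ABLATION = {
    "Base model": {"miou": 51.38, "kappa": 88.20, "dice": 64.55, "fps": 41.67},
    "+SP": {"miou": 53.10, "kappa": 88.60, "dice": 66.39, "fps": 41.17},
    "+BAM": {"miou": 53.39, "kappa": 89.02, "dice": 64.42, "fps": 41.55},
    "+SP+BAM": {"miou": 53.82, "kappa": 89.36, "dice": 67.04, "fps": 40.03},
}


def deltas_vs_base(table, base="Base model"):
    """Return each configuration's change relative to the base row, rounded to 2 decimals."""
    ref = table[base]
    return {
        name: {metric: round(row[metric] - ref[metric], 2) for metric in row}
        for name, row in table.items()
        if name != base
    }


for name, delta in deltas_vs_base(ABLATION).items():
    print(name, delta)
```

Under these assumptions, the combined +SP+BAM configuration gains 2.44 MIoU points over the base model while running 1.64 frames per second slower.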
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cao, X.; Tian, Y.; Yao, Z.; Zhao, Y.; Zhang, T. Semantic Segmentation Network for Unstructured Rural Roads Based on Improved SPPM and Fused Multiscale Features. Appl. Sci. 2024, 14, 8739. https://doi.org/10.3390/app14198739