Attention-Driven and Hierarchical Feature Fusion Network for Crop and Weed Segmentation with Fractal Dimension Estimation
Abstract
1. Introduction
1.1. Considering Homogenous Dataset Environments
1.1.1. ML-Based Methods
1.1.2. Single-Stage CNN-Based Methods
1.1.3. Multi-Stage CNN-Based Methods
1.2. Considering Heterogeneous Dataset Environments
1.2.1. CNN with Image-Based Augmentation Methods
1.2.2. CNN with Patch-Based Augmentation Methods
- This study proposes AHFF-Net, which incorporates patch-based augmentation to improve the accuracy compared to previous methods. This method effectively addresses the challenges of a heterogeneous dataset environment for semantic segmentation of crops and weeds.
- AHFF-Net includes a progressive encoder-stage refinement block (PERB) as part of the encoder. The PERB enhances feature extraction by capturing more sophisticated details, such as the edges and textures of crops and weeds, through deeper convolutions while preserving the feature-map size. It also improves low-level feature retention and helps prevent the loss of critical region-level information on crops and weeds, enabling progressive refinement of spatial features before downsampling at each encoder stage. This is particularly beneficial in heterogeneous field environments, where high intraclass variability (e.g., varying crop textures) and low interclass separability (e.g., weeds mimicking crop morphology) challenge conventional encoders. By enriching the representation before resolution reduction, the PERB preserves the fine-grained cues essential for distinguishing visually similar classes. Its symmetrical design also facilitates smoother gradient flow and stabilizes training, thereby improving the generalization of the model to the unseen or complex conditions that agricultural segmentation tasks generally involve.
- AHFF-Net employs a hierarchical multi-stage fusion block (HMFB) to capture and combine diverse features across multiple semantic levels. Each stage within the HMFB independently learns a distinct representation by focusing on fine-grained spatial details and high-level contextual information, and these features are then combined hierarchically. The result is a rich multiscale representation that strengthens the model's capability to handle variations in object size, shape, and appearance during the semantic segmentation of crops and weeds in heterogeneous environments. Moreover, the HMFB includes residual connections that help preserve low-level spatial details that are generally lost in deeper layers. These low-level details are important for accurately identifying the small or thin structures of crops and weeds, particularly when their visual patterns differ across heterogeneous environments.
- The proposed attention-driven feature enhancement block (AFEB) in the decoder of AHFF-Net uses an attention mechanism to focus on critical regions of crops and weeds, thereby enhancing segmentation performance. This functionality is particularly advantageous in heterogeneous environments because the attention mechanism adaptively highlights the most informative spatial features while suppressing irrelevant or noisy background information. By selectively emphasizing discriminative features, the AFEB contributes to more accurate semantic segmentation across field conditions. Furthermore, the framework integrates an FD estimation component to extract critical information on the spatial distribution patterns of crops and weeds, and AHFF-Net is combined with an open-source pesticide recommendation system based on Large Language Model Meta AI (LLaMA). To support transparency and reproducibility, the implementation of AHFF-Net has been made publicly available on GitHub (https://github.com/) [53]. A simplified code sketch of the three blocks is given after this list.
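For clarity, the following PyTorch-style sketch illustrates the roles of the three blocks described above. The class names mirror the paper's PERB, HMFB, and AFEB, but the layer counts, channel widths, and the sigmoid spatial attention are illustrative assumptions rather than the exact AHFF-Net configuration; the authors' reference implementation is available in the repository cited as [53].

```python
# Minimal sketch of the three block roles described above (illustrative only;
# depths, widths, and the attention form are assumptions, not the exact design).
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch):
    """3x3 convolution that keeps the spatial size, followed by BN and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class PERB(nn.Module):
    """Progressive encoder-stage refinement: stacked same-resolution convolutions
    refine edge/texture cues before the encoder downsamples the feature map."""

    def __init__(self, channels, num_convs=3):  # depth is illustrative
        super().__init__()
        self.refine = nn.Sequential(*[conv_bn_relu(channels, channels) for _ in range(num_convs)])

    def forward(self, x):
        return x + self.refine(x)  # residual form retains low-level details


class HMFB(nn.Module):
    """Hierarchical multi-stage fusion: each stage learns its own representation,
    the stages are fused hierarchically, and a residual path preserves fine detail."""

    def __init__(self, channels, num_stages=4):  # number of stages is illustrative
        super().__init__()
        self.stages = nn.ModuleList([conv_bn_relu(channels, channels) for _ in range(num_stages)])
        self.fuse = nn.Conv2d(channels * num_stages, channels, kernel_size=1)

    def forward(self, x):
        feats, cur = [], x
        for stage in self.stages:  # progressively deeper, more contextual features
            cur = stage(cur)
            feats.append(cur)
        fused = self.fuse(torch.cat(feats, dim=1))
        return x + fused  # residual connection keeps low-level spatial cues


class AFEB(nn.Module):
    """Attention-driven feature enhancement in the decoder: a spatial attention map
    emphasizes informative crop/weed regions and suppresses background clutter."""

    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        self.refine = conv_bn_relu(channels, channels)

    def forward(self, x):
        return self.refine(x * self.attn(x))  # per-pixel weights in [0, 1]


if __name__ == "__main__":
    x = torch.randn(1, 64, 128, 128)            # dummy encoder feature map
    y = AFEB(64)(HMFB(64)(PERB(64)(x)))
    print(y.shape)                              # torch.Size([1, 64, 128, 128])
```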
Category | Type | Method | Strengths | Limitations
---|---|---|---|---
Considering homogeneous dataset | ML-based | RF algorithm with PCA was used to distinguish maize crop from weeds [30] | Effective use of hyperspectral data by leveraging rich spectral information, particularly red-edge NIR bands, to improve weed–crop discrimination | This method was validated only under controlled conditions, not in real field scenarios
 | | This method used both RFC- and MRF-based vegetation detection [31] | It detected both local and object-based features and was evaluated on a real sugar beet farm | For image acquisition, the system requires a specific setup (e.g., a four-channel RGB+NIR camera and artificial halogen lighting), which limits its flexibility in varying field environments
 | | ANN and SVM were evaluated based on shape features [32] | Focusing on shapes and patterns, it precisely detected the weeds and crops | In a sugar beet field, it was evaluated on only four types of common weeds
 | | Background was separated from crops and weeds, and further classified using SVDD [33] | Reduced computation time by using color index features instead of shape-based features | Threshold values may change and become inaccurate if the colors of crops and weeds vary from green
 | | This approach was based on a multiple classifier system (MCS) [35] | It showed better performance with selection-based MCS, which utilized multiple classifiers instead of a single classifier | The model became too heavy due to the combination of multiple classifiers
 | | SVM with radial basis function (RBF) kernel [36] | Real-time applicability achieved through integrated image acquisition, decision-making, and nozzle control for in-field operation | Tiny cabbage plants were often missed by the targeted pesticide spraying mechanism, reducing overall detection accuracy
 | Single-stage CNN-based | CNN model with ROI and two-class classifiers [37] | ROI-based accurate segmentation method for crops and weeds | Light variations across different datasets can reduce overall accuracy
 | | A modified U-Net with data augmentations [40] | This study compared various input sizes for models and augmentation techniques to identify the optimal learning methods for PLC | Crops and weeds that are too small may not be detected in the augmented patches
 | | DeepVeg segmentation focused on damaged crops alongside healthy crops and weeds [41] | This lightweight model effectively addresses complex backgrounds and imbalanced classes | Image segmentation is challenging due to unclear boundaries, poor lighting, low contrast, and limited boundary information, resulting in errors and imprecise region definitions
 | | A modified ENet- and SegNet-based model taking 14 input channels [42] | 14 transformed images were concatenated as input to improve segmentation performance and reduce environmental impact | Generating 14 RGB-based channels increased computational costs and risked overfitting due to isolated features
 | | U-Net with a ResNet-50 backbone and CRF as post-processing [45] | The U-Net with a ResNet-50 backbone and CRF method exhibited enhanced performance across diverse environmental images | Processing took longer because the relationships between all pixels needed to be analyzed
 | | This method utilizes the conventional U-Net and U-Net++ architectures with resizing [46] | Weeds were detected during their early growth stages to enable timely and effective intervention | The study utilized a small dataset, limiting its representation of the diversity and complexity of real-world agricultural environments
 | Multi-stage CNN-based | A multi-stage MTS-CNN method for crop and weed segmentation [49] | It reduced the disparity between crop and weed segmentation accuracy while enhancing overall segmentation performance | The second stage depends on the first stage; poor first-stage training hampers second-stage performance
 | | Modified U-Net based on the conventional U-Net [50] | To address the challenges of data labeling and limited data availability, augmentation is employed, which enables accurate segmentation even in complex backgrounds | The algorithm is limited to green bristlegrass, restricting its general applicability to other weeds
 | | A four-stage model, called CED-Net, with each stage based on a modified U-Net encoder–decoder structure [51] | A lightweight multi-stage model was developed, where training for each stage focused on a specific weed or crop separately | An error in any of the four stages impacted the overall performance
 | | A two-stage encoder–decoder-based semantic segmentation model [52] | It focused on objects rather than the background and delivered good results | Including the background yields varying results across different datasets
Considering heterogeneous dataset | CNN with image-based augmentation | A framework for crop and weed semantic segmentation in heterogeneous environments using conventional CNN methods [22] | Limited training data was used for segmentation and achieved good results on heterogeneous datasets | Overlapping of crop and weed plants causes misclassification and lowers the performance
 | CNN with patch-based augmentation | AHFF-Net (proposed method) | It improves crop and weed segmentation accuracy across diverse shapes and heterogeneous data environments using small training data | Segregating very small crops and weeds poses a significant challenge in heterogeneous data environments
2. Material and Methods
2.1. Experimental Setup
2.2. Overview of the Proposed Method
2.2.1. Pre-Processing
2.2.2. Patch-Based Augmentation
2.3. AHFF-Net Architecture
2.3.1. PERB
2.3.2. HMFB
2.3.3. AFEB
2.4. Loss Function
2.5. Evaluation Metrics
2.6. FD Estimation
Algorithm 1. Pseudo-code to measure FD
Input: image (input image path)
Output: Fractal dimension (FD) value
1: Read the input image and convert it into grayscale
2: Set the maximum box size as a power of 2, q = 2^(ceil(log2(max(size(image))))), and add padding if required to match this dimension
3: Initialize the number of boxes
4: Calculate the number of boxes Y(q) needed to cover the foreground pixels
5: Halve the box size and recalculate Y(q) iteratively while q > 1
6: Calculate log(Y(q)) and log(1/q) for each q
7: Fit a line to the points (log(1/q), log(Y(q)))
8: Return the slope of the fitted line as the FD value
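The following Python sketch is a runnable version of the box-counting procedure in Algorithm 1. It is minimal and illustrative: the binarization threshold and the example file name are assumptions, since Algorithm 1 only specifies the grayscale conversion and the box-counting loop itself.

```python
# Box-counting fractal dimension estimator following Algorithm 1.
import numpy as np
from PIL import Image


def fractal_dimension(image_path, threshold=0.5):
    # Step 1: read the image and convert it to grayscale in [0, 1].
    gray = np.asarray(Image.open(image_path).convert("L"), dtype=float) / 255.0
    binary = gray > threshold                   # foreground pixels to be covered by boxes

    # Step 2: pad to a square whose side q is the next power of two.
    q = 2 ** int(np.ceil(np.log2(max(binary.shape))))
    padded = np.zeros((q, q), dtype=bool)
    padded[: binary.shape[0], : binary.shape[1]] = binary

    # Steps 3-5: count the occupied boxes Y(q) while halving the box size.
    sizes, counts = [], []
    size = q
    while size > 1:
        n = q // size
        # Split the padded mask into (n x n) boxes of side `size` and mark boxes
        # that contain at least one foreground pixel.
        boxes = padded.reshape(n, size, n, size).any(axis=(1, 3))
        sizes.append(size)
        counts.append(boxes.sum())
        size //= 2

    # Steps 6-8: fit log(Y(q)) against log(1/q); the slope is the FD value.
    log_inv_size = np.log(1.0 / np.array(sizes, dtype=float))
    log_counts = np.log(np.array(counts, dtype=float))
    slope, _ = np.polyfit(log_inv_size, log_counts, 1)
    return slope


if __name__ == "__main__":
    # "weed_mask.png" is a hypothetical segmentation-mask file used for illustration.
    print(fractal_dimension("weed_mask.png"))
```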
3. Experimental Results
3.1. Training Details
3.2. Testing of Proposed Method
3.2.1. Performance Evaluation with CWFID Following BoniRob Dataset Training
Ablation Study
Comparison of the Proposed and State-of-the-Art (SOTA) Methods
3.2.2. Performance Evaluation with BoniRob Following CWFID Training
3.2.3. Performance Evaluation with Sunflower Following BoniRob Dataset Training
3.2.4. Performance Evaluation with BoniRob Following Sunflower Dataset Training
3.2.5. Performance Evaluation with Sunflower Following CWFID Training
3.2.6. Performance Evaluation with CWFID Following Sunflower Dataset Training
3.3. Evaluation by FD Estimation
3.4. Comparisons of Algorithm Complexities
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zhang, Q.; Ying, Y.; Ping, J. Recent Advances in Plant Nanoscience. Adv. Sci. 2022, 9, 2103414. [Google Scholar] [CrossRef] [PubMed]
- Kong, L.; Huang, M.; Zhang, L.; Chan, L.W.C. Enhancing Diagnostic Images to Improve the Performance of the Segment Anything Model in Medical Image Segmentation. Bioengineering 2024, 11, 270. [Google Scholar] [CrossRef]
- Usman, M.; Sultan, H.; Hong, J.S.; Kim, S.G.; Akram, R.; Gondal, H.A.H.; Tariq, M.H.; Park, K.R. Dilated Multilevel Fused Network for Virus Classification Using Transmission Electron Microscopy Images. Eng. Appl. Artif. Intell. 2024, 138, 109348. [Google Scholar] [CrossRef]
- Liu, L.; Guo, Z.; Liu, Z.; Zhang, Y.; Cai, R.; Hu, X.; Yang, R.; Wang, G. Multi-Task Intelligent Monitoring of Construction Safety Based on Computer Vision. Buildings 2024, 14, 2429. [Google Scholar] [CrossRef]
- Sultan, H.; Owais, M.; Nam, S.H.; Haider, A.; Akram, R.; Usman, M.; Park, K.R. MDFU-Net: Multiscale Dilated Features Up-sampling Network for Accurate Segmentation of Tumor from Heterogeneous Brain Data. J. King Saud Univ.-Comput. Inf. Sci. 2023, 35, 101560. [Google Scholar] [CrossRef]
- Li, Z.; Xiang, L.; Sun, J.; Liao, D.; Xu, L.; Wang, M. A Multi-Level Knowledge Distillation for Enhanced Crop Segmentation in Precision Agriculture. Agriculture 2025, 15, 1418. [Google Scholar] [CrossRef]
- Tang, Z.; Sun, J.; Tian, Y.; Xu, J.; Zhao, W.; Jiang, G.; Deng, J.; Gan, X. CVRP: A Rice Image Dataset with High-Quality Annotations for Image Segmentation and Plant Phenomics Research. Plant Phenom. 2025, 7, 100025. [Google Scholar] [CrossRef]
- Arsalan, M.; Haider, A.; Hong, J.S.; Kim, J.S.; Park, K.R. Deep Learning-Based Detection of Human Blastocyst Compartments with Fractal Dimension Estimation. Fractal Fract. 2024, 8, 267. [Google Scholar] [CrossRef]
- Rai, H.M.; Omkar Lakshmi Jagan, B.; Rao, N.T.; Mohammed, T.K.; Agarwal, N.; Abdallah, H.A.; Agarwal, S. Deep Learning for Leukemia Classification: Performance Analysis and Challenges Across Multiple Architectures. Fractal Fract. 2025, 9, 337. [Google Scholar] [CrossRef]
- Liu, X.; Gao, N.; He, S.; Wang, L. Application of Fractional Fourier Transform and BP Neural Network in Prediction of Tumor Benignity and Malignancy. Fractal Fract. 2025, 9, 267. [Google Scholar] [CrossRef]
- Tariq, M.H.; Sultan, H.; Akram, R.; Kim, S.G.; Kim, J.S.; Usman, M.; Gondal, H.A.H.; Seo, J.; Lee, Y.H.; Park, K.R. Estimation of Fractal Dimensions and Classification of Plant Disease with Complex Backgrounds. Fractal Fract. 2025, 9, 315. [Google Scholar] [CrossRef]
- Jiang, Y.; Li, C. Convolutional Neural Networks for Image-Based High-Throughput Plant Phenotyping: A Review. Plant Phenom. 2020, 2020, 4152816. [Google Scholar] [CrossRef]
- Zhang, C.; Kong, J.; Wu, D.; Guan, Z.; Ding, B.; Chen, F. Wearable Sensor: An Emerging Data Collection Tool for Plant Phenotyping. Plant Phenom. 2023, 5, 0051. [Google Scholar] [CrossRef]
- Al-Ghaili, A.M.; Gunasekaran, S.S.; Jamil, N.; Alyasseri, Z.A.A.; Al-Hada, N.M.; Ibrahim, Z.-A.B.; Bakar, A.A.; Kasim, H.; Hosseini, E.; Omar, R.; et al. A Review on Role of Image Processing Techniques to Enhancing Security of IoT Applications. IEEE Access 2023, 11, 101924–101948. [Google Scholar] [CrossRef]
- Li, D.; Li, J.; Xiang, S.; Pan, A. PSegNet: Simultaneous Semantic and Instance Segmentation for Point Clouds of Plants. Plant Phenom. 2022, 2022, 9787643. [Google Scholar] [CrossRef]
- Rawat, S.; Chandra, A.L.; Desai, S.V.; Balasubramanian, V.N.; Ninomiya, S.; Guo, W. How Useful Is Image-Based Active Learning for Plant Organ Segmentation? Plant Phenom. 2022, 2022, 9795275. [Google Scholar] [CrossRef] [PubMed]
- Yun, C.; Kim, Y.H.; Lee, S.J.; Im, S.J.; Park, K.R. WRA-Net: Wide Receptive Field Attention Network for Motion Deblurring in Crop and Weed Image. Plant Phenom. 2023, 5, 0031. [Google Scholar] [CrossRef] [PubMed]
- Parven, A.; Meftaul, I.M.; Venkateswarlu, K.; Megharaj, M. Herbicides in Modern Sustainable Agriculture: Environmental Fate, Ecological Implications, and Human Health Concerns. Int. J. Environ. Sci. Technol. 2024, 22, 1181–1202. [Google Scholar] [CrossRef]
- Gupta, S.K.; Yadav, S.K.; Soni, S.K.; Shanker, U.; Singh, P.K. Multiclass Weed Identification Using Semantic Segmentation: An Automated Approach for Precision Agriculture. Ecol. Inform. 2023, 78, 102366. [Google Scholar] [CrossRef]
- Hasan, A.S.M.M.; Diepeveen, D.; Laga, H.; Jones, M.G.K.; Sohel, F. Object-Level Benchmark for Deep Learning-Based Detection and Classification of Weed Species. Crop Prot. 2024, 177, 106561. [Google Scholar] [CrossRef]
- Li, W.; Zhang, Y. DC-YOLO: An Improved Field Plant Detection Algorithm Based on YOLOv7-Tiny. Sci. Rep. 2024, 14, 26430. [Google Scholar] [CrossRef]
- Akram, R.; Hong, J.S.; Kim, S.G.; Sultan, H.; Usman, M.; Gondal, H.A.H.; Tariq, M.H.; Ullah, N.; Park, K.R. Crop and Weed Segmentation and Fractal Dimension Estimation Using Small Training Data in Heterogeneous Data Environment. Fractal Fract. 2024, 8, 285. [Google Scholar] [CrossRef]
- Liu, Y.; Liu, M.; Zhao, X.; Zhu, J.; Wang, L.; Ma, H.; Zhang, M. Real-time Semantic Segmentation Network for Crops and Weeds Based on Multi-branch Structure. IET Comput. Vis. 2024, 18, 1313–1324. [Google Scholar] [CrossRef]
- Chebrolu, N.; Lottes, P.; Schaefer, A.; Winterhalter, W.; Burgard, W.; Stachniss, C. Agricultural Robot Dataset for Plant Classification, Localization and Mapping on Sugar Beet Fields. Int. J. Robot. Res. 2017, 36, 1045–1052. [Google Scholar] [CrossRef]
- Haug, S.; Ostermann, J. A Crop/Weed Field Image Dataset for the Evaluation of Computer Vision Based Precision Agriculture Tasks. In Proceedings of the Computer Vision—ECCV 2014 Workshops, Zurich, Switzerland, 6–7 and 12 September 2014; pp. 105–116. [Google Scholar] [CrossRef]
- Fawakherji, M.; Potena, C.; Pretto, A.; Bloisi, D.D.; Nardi, D. Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming. Robot. Auton. Syst. 2021, 146, 103861. [Google Scholar] [CrossRef]
- Nguyen, D.T.; Nam, S.H.; Batchuluun, G.; Owais, M.; Park, K.R. An Ensemble Classification Method for Brain Tumor Images Using Small Training Data. Mathematics 2022, 10, 4566. [Google Scholar] [CrossRef]
- Wahid, A.; Mahmood, T.; Hong, J.S.; Kim, S.G.; Ullah, N.; Akram, R.; Park, K.R. Multi-Path Residual Attention Network for Cancer Diagnosis Robust to a Small Number of Training Data of Microscopic Hyperspectral Pathological Images. Eng. Appl. Artif. Intell. 2024, 133, 108288. [Google Scholar] [CrossRef]
- Abdalla, A.; Cen, H.; Wan, L.; Rashid, R.; Weng, H.; Zhou, W.; He, Y. Fine-Tuning Convolutional Neural Network with Transfer Learning for Semantic Segmentation of Ground-Level Oilseed Rape Images in a Field with High Weed Pressure. Comput. Electron. Agric. 2019, 167, 105091. [Google Scholar] [CrossRef]
- Gao, J.; Nuyttens, D.; Lootens, P.; He, Y.; Pieters, J.G. Recognising weeds in a maize crop using a random forest machine-learning algorithm and near-infrared snapshot mosaic hyperspectral imagery. Biosyst. Eng. 2018, 170, 39–50. [Google Scholar] [CrossRef]
- Lottes, P.; Hörferlin, M.; Sander, S.; Stachniss, C. Effective Vision-Based Classification for Separating Sugar Beets and Weeds for Precision Farming: Effective Vision-Based Classification. J. Field Robot. 2017, 34, 1160–1178. [Google Scholar] [CrossRef]
- Bakhshipour, A.; Jafari, A. Evaluation of Support Vector Machine and Artificial Neural Networks in Weed Detection Using Shape Features. Comput. Electron. Agric. 2018, 145, 153–160. [Google Scholar] [CrossRef]
- Zheng, Y.; Zhu, Q.; Huang, M.; Guo, Y.; Qin, J. Maize and Weed Classification Using Color Indices with Support Vector Data Description in Outdoor Fields. Comput. Electron. Agric. 2017, 141, 215–222. [Google Scholar] [CrossRef]
- Wu, X.; Xu, W.; Song, Y.; Cai, M. A Detection Method of Weed in Wheat Field on Machine Vision. Procedia Eng. 2011, 15, 1998–2003. [Google Scholar] [CrossRef]
- Kamath, R.; Balachandra, M.; Prabhu, S. Paddy Crop and Weed Discrimination: A Multiple Classifier System Approach. Int. J. Agron. 2020, 2020, 6474536. [Google Scholar] [CrossRef]
- Zhao, X.; Wang, X.; Li, C.; Fu, H.; Yang, S.; Zhai, C. Cabbage and weed identification based on machine learning and target spraying system design. Front. Plant Sci. 2022, 13, 924973. [Google Scholar] [CrossRef] [PubMed]
- Ahmed, A.; Rafique, A.A. Deep Network for Smart Precision Agriculture through Segmentation and Classification of Crops. In Proceedings of the International Bhurban Conference on Applied Sciences and Technology, Islamabad, Pakistan, 16–20 August 2022; pp. 502–507. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556v6. [Google Scholar] [CrossRef]
- Brilhador, A.; Gutoski, M.; Hattori, L.T.; de Souza Inácio, A.; Lazzaretti, A.E.; Lopes, H.S. Classification of Weeds and Crops at the Pixel-Level Using Convolutional Neural Networks and Data Augmentation. In Proceedings of the IEEE Latin American Conference on Computational Intelligence, Guayaquil, Ecuador, 11–15 November 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Das, M.; Bais, A. DeepVeg: Deep Learning Model for Segmentation of Weed, Canola, and Canola Flea Beetle Damage. IEEE Access 2021, 9, 119367–119380. [Google Scholar] [CrossRef]
- Milioto, A.; Lottes, P.; Stachniss, C. Real-Time Semantic Segmentation of Crop and Weed for Precision Agriculture Robots Leveraging Background Knowledge in CNNs. In Proceedings of the IEEE International Conference on Robotics and Automation, Brisbane, Australia, 21–25 May 2018; pp. 2229–2235. [Google Scholar] [CrossRef]
- Paszke, A.; Chaurasia, A.; Kim, S.; Culurciello, E. ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. arXiv 2016, arXiv:1606.02147v1. [Google Scholar] [CrossRef]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef]
- Sahin, H.M.; Miftahushudur, T.; Grieve, B.; Yin, H. Segmentation of Weeds and Crops Using Multispectral Imaging and CRF-Enhanced U-Net. Comput. Electron. Agric. 2023, 211, 107956. [Google Scholar] [CrossRef]
- Fathipoor, H.; Shah-Hosseini, R.; Arefi, H. Crop and Weed Segmentation on Ground-Based Images using Deep Convolutional Neural Network. In Proceedings of the ISPRS Annals of the Photogrammetry Remote Sensing and Spatial Information Sciences, Tehran, Iran, 13 January 2023; pp. 195–200. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar] [CrossRef]
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Granada, Spain, 20 September 2018; pp. 3–11. [Google Scholar] [CrossRef]
- Kim, Y.; Park, K.R. MTS-CNN: Multi-Task Semantic Segmentation-Convolutional Neural Network for Detecting Crops and Weeds. Comput. Electron. Agric. 2022, 199, 107146. [Google Scholar] [CrossRef]
- Zou, K.; Chen, X.; Wang, Y.; Zhang, C.; Zhang, F. A Modified U-Net with a Specific Data Argumentation Method for Semantic Segmentation of Weed Images in the Field. Comput. Electron. Agric. 2021, 187, 106242. [Google Scholar] [CrossRef]
- Khan, A.; Ilyas, T.; Umraiz, M.; Mannan, Z.I.; Kim, H. CED-Net: Crops and Weeds Segmentation for Smart Farming Using a Small Cascaded Encoder-Decoder Architecture. Electronics 2020, 9, 1602. [Google Scholar] [CrossRef]
- Moazzam, S.I.; Nawaz, T.; Qureshi, W.S.; Khan, U.S.; Tiwana, M.I. A W-Shaped Convolutional Network for Robust Crop and Weed Classification in Agriculture. Precis. Agric. 2023, 24, 2002–2018. [Google Scholar] [CrossRef]
- AHFF-Net for Semantic Segmentation of Crops and Weeds in Heterogeneous Dataset Environment. Available online: https://github.com/iamrehanch/AHFF-Net_for_Semantic_Segmentation_of_Crops_and_Weeds (accessed on 2 January 2025).
- Reinhard, E.; Adhikhmin, M.; Gooch, B.; Shirley, P. Color Transfer between Images. IEEE Comput. Graph. Appl. 2001, 21, 34–41. [Google Scholar] [CrossRef]
- Ruderman, D.L.; Cronin, T.W.; Chiao, C.C. Statistics of Cone Responses to Natural Images: Implications for Visual Coding. J. Opt. Soc. Am. A 1998, 15, 2036–2045. [Google Scholar] [CrossRef]
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
- Sudre, C.H.; Li, W.; Vercauteren, T.; Ourselin, S.; Jorge Cardoso, M. Generalised Dice Overlap as a Deep Learning Loss Function for Highly Unbalanced Segmentations. In Proceedings of the International Conference on Communication, Control and Information Sciences, Québec City, QC, Canada, 9 September 2017; pp. 240–248. [Google Scholar] [CrossRef]
- Rezaie, A.; Mauron, A.J.; Beyer, K. Sensitivity analysis of fractal dimensions of crack maps on concrete and masonry walls. Autom. Constr. 2020, 117, 103258. [Google Scholar] [CrossRef]
- Wu, J.; Jin, X.; Mi, S.; Tang, J. An effective method to compute the box-counting dimension based on the mathematical definition and intervals. Results Eng. 2020, 6, 100106. [Google Scholar] [CrossRef]
- Loshchilov, I.; Hutter, F. SGDR: Stochastic Gradient Descent with Warm Restarts. arXiv 2016, arXiv:1608.03983v5. [Google Scholar] [CrossRef]
- Bhatti, M.A.; Syam, M.S.; Chen, H.; Hu, Y.; Keung, L.W.; Zeeshan, Z.; Ali, Y.A.; Sarhan, N. Utilizing Convolutional Neural Networks (CNN) and U-Net Architecture for Precise Crop and Weed Segmentation in Agricultural Imagery: A Deep Learning Approach. Big Data Res. 2024, 36, 100465. [Google Scholar] [CrossRef]
- Arun, R.A.; Umamaheswari, S.; Jain, A.V. Reduced U-Net Architecture for Classifying Crop and Weed Using Pixel-Wise Segmentation. In Proceedings of the IEEE International Conference for Innovation in Technology, Bengaluru, India, 6–8 November 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Habib, M.; Sekhra, S.; Tannouche, A.; Ounejjar, Y. New Segmentation Approach for Effective Weed Management in Agriculture. Smart Agric. Technol. 2024, 8, 100505. [Google Scholar] [CrossRef]
- Yang, Q.; Ye, Y.; Gu, L.; Wu, Y. MSFCA-Net: A Multi-Scale Feature Convolutional Attention Network for Segmenting Crops and Weeds in the Field. Agriculture 2023, 13, 1176. [Google Scholar] [CrossRef]
- Wang, A.; Xu, Y.; Wei, X.; Cui, B. Semantic Segmentation of Crop and Weed Using an Encoder-Decoder Network and Image Enhancement Method Under Uncontrolled Outdoor Illumination. IEEE Access 2020, 8, 81724–81734. [Google Scholar] [CrossRef]
- Kamath, R.; Balachandra, M.; Vardhan, A.; Maheshwari, U. Classification of Paddy Crop and Weeds Using Semantic Segmentation. Cogent Eng. 2022, 9, 2018791. [Google Scholar] [CrossRef]
- Wu, T.; Tang, S.; Zhang, R.; Cao, J.; Zhang, Y. CGNet: A Light-Weight Context Guided Network for Semantic Segmentation. IEEE Trans. Image Process. 2021, 30, 1169–1179. [Google Scholar] [CrossRef]
- Sultan, H.; Ullah, N.; Hong, J.S.; Kim, S.G.; Lee, D.C.; Jung, S.Y.; Park, K.R. Estimation of fractal dimension and segmentation of brain tumor with parallel features aggregation network. Fractal Fract. 2024, 8, 357. [Google Scholar] [CrossRef]
- NVIDIA GeForce GTX 1070. Available online: https://www.nvidia.com/en-us/geforce/10-series/ (accessed on 20 February 2025).
- NVIDIA GeForce RTX 4080 SUPER. Available online: https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/ (accessed on 20 February 2025).
- Jetson TX2. Available online: https://developer.nvidia.com/embedded/jetson-tx2 (accessed on 20 February 2025).
- Mishra, P.; Singh, U.; Pandey, C.M.; Mishra, P.; Pandey, G. Application of student’s t-test, analysis of variance, and covariance. Ann. Card. Anaesth. 2019, 22, 407–411. [Google Scholar] [CrossRef] [PubMed]
- Cohen, J. A power primer. Psychol. Bull. 1992, 112, 155–159. [Google Scholar] [CrossRef] [PubMed]
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar] [CrossRef]
- Large Language Model Meta AI (LLaMA) 3.2. Available online: https://www.llama.com (accessed on 2 January 2025).
Dataset-1 | Training Images | Validation Images | Dataset-2 | Testing Images
---|---|---|---|---
BoniRob | 400 | 30 | CWFID | 30 |
CWFID | 45 | 5 | BoniRob | 246 |
Sunflower | 120 | 17 | BoniRob | 246 |
BoniRob | 400 | 30 | Sunflower | 86 |
CWFID | 45 | 5 | Sunflower | 86 |
Sunflower | 120 | 17 | CWFID | 30 |
PERB | HMFB | AFEB | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---|---|---
 | | | 0.624 | 0.985 | 0.538 | 0.350 | 0.761 | 0.759 | 0.759
√ | 0.626 | 0.986 | 0.600 | 0.289 | 0.772 | 0.756 | 0.763 | ||
√ | 0.625 | 0.986 | 0.598 | 0.288 | 0.770 | 0.752 | 0.759 | ||
√ | 0.624 | 0.986 | 0.592 | 0.294 | 0.770 | 0.753 | 0.761 | ||
√ | √ | 0.633 | 0.986 | 0.578 | 0.334 | 0.787 | 0.765 | 0.774 | |
√ | √ | 0.632 | 0.986 | 0.622 | 0.290 | 0.784 | 0.761 | 0.772 | |
√ | √ | √ | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
With IMBA | 0.644 | 0.986 | 0.62 | 0.326 | 0.797 | 0.765 | 0.780 |
With patch-based augmentation (proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
Number of Patches | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
4 | 0.623 | 0.986 | 0.548 | 0.335 | 0.772 | 0.751 | 0.760 |
5 (Proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
6 | 0.646 | 0.986 | 0.583 | 0.369 | 0.792 | 0.764 | 0.778 |
Number of Convolutions | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
2 | 0.636 | 0.985 | 0.562 | 0.359 | 0.783 | 0.765 | 0.773 |
3 (Proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
4 | 0.645 | 0.985 | 0.598 | 0.353 | 0.795 | 0.769 | 0.781 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
With sequential convolutions (proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
With parallel convolutions | 0.637 | 0.985 | 0.554 | 0.371 | 0.785 | 0.764 | 0.774 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
With residual connection (proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
Without residual connection | 0.631 | 0.986 | 0.564 | 0.341 | 0.792 | 0.754 | 0.772 |
Number of Convolutions | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
3 | 0.636 | 0.985 | 0.623 | 0.299 | 0.796 | 0.761 | 0.778 |
4 (Proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
5 | 0.634 | 0.986 | 0.579 | 0.335 | 0.79 | 0.763 | 0.776 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
With attention (Proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
Without attention | 0.64 | 0.986 | 0.562 | 0.371 | 0.778 | 0.781 | 0.779 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.620 | 0.986 | 0.524 | 0.349 | 0.749 | 0.762 | 0.752 |
U-Net [61] | 0.624 | 0.985 | 0.538 | 0.350 | 0.761 | 0.759 | 0.759 |
Modified U-Net [40] | 0.606 | 0.985 | 0.546 | 0.288 | 0.765 | 0.738 | 0.751 |
Unet++ [46] | 0.547 | 0.982 | 0.514 | 0.146 | 0.684 | 0.654 | 0.668 |
Reduced U-Net [62] | 0.560 | 0.978 | 0.482 | 0.220 | 0.680 | 0.691 | 0.685 |
DWUNet [63] | 0.535 | 0.979 | 0.447 | 0.178 | 0.646 | 0.655 | 0.650 |
MSFCA-Net [64] | 0.538 | 0.982 | 0.465 | 0.166 | 0.647 | 0.658 | 0.652 |
DeepLAB V3 Plus [65] | 0.597 | 0.974 | 0.556 | 0.262 | 0.712 | 0.713 | 0.712 |
SEG-Net [66] | 0.558 | 0.961 | 0.516 | 0.197 | 0.631 | 0.719 | 0.672 |
CG-Net [67] | 0.506 | 0.963 | 0.447 | 0.110 | 0.711 | 0.576 | 0.636 |
AHFF-Net (proposed) | 0.653 | 0.986 | 0.637 | 0.338 | 0.804 | 0.772 | 0.787 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.637 | 0.971 | 0.292 | 0.647 | 0.708 | 0.787 | 0.743 |
U-Net [61] | 0.642 | 0.975 | 0.301 | 0.649 | 0.755 | 0.738 | 0.746 |
Modified U-Net [40] | 0.584 | 0.966 | 0.277 | 0.509 | 0.644 | 0.778 | 0.704 |
Unet++ [46] | 0.576 | 0.958 | 0.230 | 0.541 | 0.633 | 0.789 | 0.702 |
Reduced U-Net [62] | 0.568 | 0.951 | 0.238 | 0.516 | 0.626 | 0.803 | 0.703 |
DWUNet [63] | 0.533 | 0.967 | 0.077 | 0.555 | 0.667 | 0.578 | 0.619 |
MSFCA-Net [64] | 0.570 | 0.972 | 0.225 | 0.513 | 0.652 | 0.723 | 0.685
DeepLAB V3 Plus [65] | 0.541 | 0.969 | 0.208 | 0.446 | 0.661 | 0.644 | 0.652 |
SEG-Net [66] | 0.491 | 0.951 | 0.205 | 0.319 | 0.565 | 0.652 | 0.605 |
CG-Net [67] | 0.475 | 0.930 | 0.211 | 0.285 | 0.553 | 0.735 | 0.631 |
AHFF-Net (proposed) | 0.661 | 0.974 | 0.334 | 0.674 | 0.742 | 0.796 | 0.768 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.581 | 0.993 | 0.503 | 0.248 | 0.679 | 0.694 | 0.685 |
U-Net [61] | 0.587 | 0.992 | 0.514 | 0.256 | 0.686 | 0.695 | 0.690 |
Modified U-Net [40] | 0.575 | 0.990 | 0.577 | 0.158 | 0.651 | 0.700 | 0.674 |
Unet++ [46] | 0.580 | 0.991 | 0.604 | 0.144 | 0.694 | 0.682 | 0.687 |
Reduced U-Net [62] | 0.546 | 0.988 | 0.490 | 0.161 | 0.609 | 0.687 | 0.645 |
DWUNet [63] | 0.427 | 0.980 | 0.000 | 0.300 | 0.503 | 0.486 | 0.494 |
MSFCA-Net [64] | 0.583 | 0.992 | 0.520 | 0.236 | 0.685 | 0.693 | 0.688 |
DeepLAB V3 Plus [65] | 0.566 | 0.99 | 0.586 | 0.121 | 0.657 | 0.636 | 0.646 |
SEG-Net [66] | 0.505 | 0.934 | 0.544 | 0.037 | 0.571 | 0.699 | 0.628 |
CG-Net [67] | 0.554 | 0.986 | 0.559 | 0.116 | 0.800 | 0.572 | 0.667 |
AHFF-Net (proposed) | 0.601 | 0.993 | 0.512 | 0.300 | 0.724 | 0.691 | 0.707 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.561 | 0.973 | 0.172 | 0.538 | 0.659 | 0.671 | 0.664 |
U-Net [61] | 0.558 | 0.973 | 0.136 | 0.565 | 0.678 | 0.667 | 0.672 |
Modified U-Net [40] | 0.579 | 0.973 | 0.217 | 0.548 | 0.671 | 0.701 | 0.685 |
Unet++ [46] | 0.530 | 0.974 | 0.128 | 0.486 | 0.631 | 0.632 | 0.631 |
Reduced U-Net [62] | 0.514 | 0.963 | 0.154 | 0.426 | 0.588 | 0.656 | 0.620 |
DWUNet [63] | 0.424 | 0.966 | 0.077 | 0.230 | 0.521 | 0.501 | 0.510 |
MSFCA-Net [64] | 0.570 | 0.970 | 0.214 | 0.525 | 0.663 | 0.685 | 0.673 |
DeepLAB V3 Plus [65] | 0.471 | 0.971 | 0.124 | 0.319 | 0.580 | 0.558 | 0.568 |
SEG-Net [66] | 0.479 | 0.955 | 0.159 | 0.324 | 0.556 | 0.630 | 0.590 |
CG-Net [67] | 0.378 | 0.960 | 0.009 | 0.164 | 0.600 | 0.395 | 0.476 |
AHFF-Net (proposed) | 0.595 | 0.972 | 0.225 | 0.587 | 0.699 | 0.711 | 0.704 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.568 | 0.99 | 0.487 | 0.229 | 0.646 | 0.710 | 0.676 |
U-Net [61] | 0.582 | 0.991 | 0.514 | 0.241 | 0.686 | 0.678 | 0.682 |
Modified U-Net [40] | 0.569 | 0.986 | 0.509 | 0.211 | 0.617 | 0.759 | 0.680 |
Unet++ [46] | 0.580 | 0.991 | 0.539 | 0.209 | 0.683 | 0.682 | 0.682 |
Reduced U-Net [62] | 0.545 | 0.983 | 0.475 | 0.177 | 0.579 | 0.766 | 0.659 |
DWUNet [63] | 0.498 | 0.988 | 0.416 | 0.090 | 0.621 | 0.550 | 0.583 |
MSFCA-Net [64] | 0.578 | 0.992 | 0.498 | 0.243 | 0.678 | 0.673 | 0.675 |
DeepLAB V3 Plus [65] | 0.569 | 0.987 | 0.622 | 0.098 | 0.746 | 0.598 | 0.663 |
SEG-Net [66] | 0.437 | 0.930 | 0.344 | 0.035 | 0.468 | 0.675 | 0.552 |
CG-Net [67] | 0.441 | 0.962 | 0.278 | 0.082 | 0.599 | 0.594 | 0.596 |
AHFF-Net (proposed) | 0.592 | 0.992 | 0.513 | 0.272 | 0.694 | 0.710 | 0.701 |
Cases | mIoU | IoU (B) | IoU (W) | IoU (C) | Precision | Recall | F1-Score
---|---|---|---|---|---|---|---
U-Net (RHT + small training data + IMBA) [22] | 0.515 | 0.977 | 0.313 | 0.254 | 0.666 | 0.676 | 0.670 |
U-Net [61] | 0.580 | 0.978 | 0.571 | 0.189 | 0.699 | 0.690 | 0.694 |
Modified U-Net [40] | 0.581 | 0.975 | 0.496 | 0.273 | 0.675 | 0.727 | 0.700 |
Unet++ [46] | 0.569 | 0.976 | 0.485 | 0.246 | 0.695 | 0.715 | 0.704 |
Reduced U-Net [62] | 0.590 | 0.970 | 0.567 | 0.234 | 0.703 | 0.718 | 0.710 |
DWUNet [63] | 0.506 | 0.966 | 0.492 | 0.060 | 0.647 | 0.617 | 0.631 |
MSFCA-Net [64] | 0.539 | 0.982 | 0.560 | 0.076 | 0.640 | 0.644 | 0.641 |
DeepLAB V3 Plus [65] | 0.568 | 0.963 | 0.514 | 0.226 | 0.697 | 0.693 | 0.694 |
SEG-Net [66] | 0.483 | 0.857 | 0.523 | 0.067 | 0.554 | 0.731 | 0.627 |
CG-Net [67] | 0.480 | 0.950 | 0.409 | 0.081 | 0.756 | 0.534 | 0.625 |
AHFF-Net (proposed) | 0.595 | 0.982 | 0.485 | 0.319 | 0.714 | 0.750 | 0.731 |
Weed FD | Crop FD |
---|---|
1.60 | 1.09 |
1.57 | 1.16 |
1.58 | 1.10 |
1.56 | 1.11 |
Methods | Desktop (NVIDIA GeForce GTX 1070) | Desktop (NVIDIA GeForce RTX 4080 SUPER) | Jetson Embedded System
---|---|---|---
U-Net [22,61] | 412.05 | 41.50 | 629.12 |
Modified U-Net [40] | 426.51 | 41.08 | 472.55 |
Unet++ [46] | 443.83 | 40.61 | 502.86 |
Reduced U-Net [62] | 411.98 | 33.73 | 453.18 |
DWUNet [63] | 416.37 | 38.28 | 428.06 |
MSFCA-Net [64] | 442.06 | 45.07 | 1319.61 |
DeepLAB V3 Plus [65] | 456.16 | 42.43 | 977.56 |
SEG-Net [66] | 419.40 | 27.72 | 446.93 |
CG-Net [67] | 459.82 | 26.78 | 583.12 |
AHFF-Net (Proposed) | 417.96 | 29.75 | 908.22 |
Methods | Number of Parameters (Unit: Mega) | GPU Memory Requirements (Unit: MB) | FLOPs (Unit: G) |
---|---|---|---|
U-Net [22,61] | 31.04 | 3352 | 378.10 |
Modified U-Net [40] | 31.03 | 3971 | 437.79 |
Unet++ [46] | 26.08 | 3426 | 147.75 |
Reduced U-Net [62] | 7.76 | 1906 | 96.25 |
DWUNet [63] | 3.26 | 1571 | 63.12 |
MSFCA-Net [64] | 30.88 | 6617 | 523.96 |
DeepLAB V3 Plus [65] | 54.6 | 8137 | 201.6 |
SEG-Net [66] | 29.44 | 6647 | 339.6 |
CG-Net [67] | 0.503 | 962 | 30.055 |
AHFF-Net (Proposed) | 47.88 | 5271 | 287.71 |
Input Image | Segmentation Mask | Weed Morphology (Early Stage) | Weed Name | Pesticides
---|---|---|---|---
 | | Narrow and pointed leaves that grow in clusters from the base, spreading outward in a pattern | Barnyard Grass (Echinochloa crus-galli) |
 | | Thin, branching stems with narrow, deeply lobed leaves forming a spreading growth habit | Knotweed (Polygonum aviculare) |
 | | Small, oval-shaped green leaves with smooth edges, arranged in opposite pairs on delicate stems | Chickweed (Stellaria media) |