U-MoEMamba: A Hybrid Expert Segmentation Model for Cabbage Heads in Complex UAV Low-Altitude Remote Sensing Scenarios
Abstract
1. Introduction
- (1) Design of a Heterogeneous Expert Collaboration Framework: This framework features a tri-expert architecture that integrates multi-scale convolutional experts, cross-attention experts, and Mamba path experts. The fusion leverages the strengths of convolutional neural networks (CNNs) for local perception, attention mechanisms for global context modeling, and state-space models (SSMs) for capturing long-range dependencies.
- (2) Proposal of the MambaMoEFusion Module: We construct a hybrid MoE fusion module, which includes three distinct expert pathways and a lightweight gating network for adaptive integration. Unlike traditional fusion strategies such as concatenation or static weighting, this dynamic approach allows context-aware selection of the most suitable expert features.
- (3) Development of the MSCrossDualAttention Module: We propose a dual-path attention mechanism that simultaneously processes features along the spatial and channel dimensions, aligning shallow structural details with deep semantic information and thereby enhancing feature complementarity.
- (4) Lightweight Gating via MOEGatingNetwork: We design a resource-efficient gating mechanism. By generating expert weights through adaptive average pooling and fully connected layers, our approach achieves hardware-friendly dynamic resource allocation with minimal computational overhead.
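The gating-and-fusion idea in contributions (2) and (4) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the three expert branches are assumed to emit feature maps of identical shape, the gate is a single fully connected layer over globally pooled features, and a softmax turns its scores into fusion weights. Shapes and layer sizes are illustrative only.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_gating_fusion(expert_maps, w_fc, b_fc):
    """Fuse E expert feature maps of shape (C, H, W) via a lightweight gate.

    Adaptive (global) average pooling squeezes each expert map to a C-vector;
    a single fully connected layer scores the experts, and a softmax converts
    the scores into fusion weights that sum to 1.
    """
    E, C, H, W = expert_maps.shape
    pooled = expert_maps.mean(axis=(2, 3)).reshape(-1)  # avg pool -> (E*C,)
    gate = softmax(w_fc @ pooled + b_fc)                # expert weights (E,)
    fused = np.tensordot(gate, expert_maps, axes=1)     # weighted sum -> (C, H, W)
    return fused, gate

# Toy example: 3 experts, 8 channels, 4x4 feature maps.
rng = np.random.default_rng(0)
experts = rng.standard_normal((3, 8, 4, 4))
w_fc = rng.standard_normal((3, 3 * 8)) * 0.1
b_fc = np.zeros(3)
fused, gate = moe_gating_fusion(experts, w_fc, b_fc)
```

Because the gate is computed from pooled statistics rather than per-pixel attention, its overhead is a single small matrix-vector product per image, which is what makes the dynamic weighting hardware-friendly.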
2. Materials and Methods
2.1. Data Acquisition
2.2. Data Augmentation
- (5) Horizontal Flipping: Images were randomly flipped horizontally with a probability of 50%.
- (6) Saturation Adjustment: Color saturation was linearly perturbed within the range of −25% to +25% to simulate varying lighting and imaging conditions.
- (7) Salt-and-Pepper Noise Injection: Random salt-and-pepper noise was introduced to 0.1% of image pixels to enhance noise robustness.
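The three augmentations above can be sketched in NumPy as follows. This is an illustrative pipeline, not the authors' code: saturation is perturbed by linearly blending between the image and its grayscale version (an assumption; an HSV-based adjustment would also fit the description), and the 0.1% noise fraction is split evenly between salt and pepper.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img, flip_p=0.5, sat_range=0.25, sp_frac=0.001):
    """Apply flip, saturation jitter, and salt-and-pepper noise.

    img: (H, W, 3) uint8 RGB image. Returns an augmented copy.
    """
    out = img.astype(np.float32)
    # 1. Horizontal flip with probability 50%.
    if rng.random() < flip_p:
        out = out[:, ::-1, :]
    # 2. Linear saturation perturbation in [-25%, +25%]: blend between
    #    the grayscale image (zero saturation) and the original.
    s = 1.0 + rng.uniform(-sat_range, sat_range)
    gray = out.mean(axis=2, keepdims=True)
    out = np.clip(gray + s * (out - gray), 0, 255).astype(np.uint8)
    # 3. Salt-and-pepper noise on 0.1% of pixels.
    h, w, _ = out.shape
    n = int(sp_frac * h * w)
    ys, xs = rng.integers(0, h, n), rng.integers(0, w, n)
    out[ys[: n // 2], xs[: n // 2]] = 0    # pepper
    out[ys[n // 2:], xs[n // 2:]] = 255    # salt
    return out

aug = augment(np.full((100, 100, 3), 128, dtype=np.uint8))
```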
2.3. Methods
2.3.1. U-MoEMamba Semantic Segmentation Network
2.3.2. MambaMoEFusion: Hybrid Expert Gating Fusion Module
2.3.3. MSCrossDualAttention: Multi-Scale Cross Dual Attention Module
2.3.4. MOEGatingNetwork: Gating Fusion Module
2.4. Experimental Platform and Parameter Settings
2.5. Evaluation Metrics
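The metrics reported in the Results tables (per-class IoU and F1, their means mIoU and mF1, and overall accuracy OA) follow standard definitions and can be computed from a confusion matrix. The sketch below uses NumPy and assumes class labels are integers in 0..n_classes−1; it is not the authors' evaluation code.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    # Entry (i, j) counts pixels with ground-truth class i predicted as j.
    idx = n_classes * y_true.ravel() + y_pred.ravel()
    return np.bincount(idx, minlength=n_classes ** 2).reshape(n_classes, n_classes)

def segmentation_metrics(y_true, y_pred, n_classes):
    cm = confusion_matrix(y_true, y_pred, n_classes).astype(float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    # Note: a class absent from both prediction and ground truth yields NaN.
    iou = tp / (tp + fp + fn)          # per-class intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)   # per-class F1 (Dice)
    oa = tp.sum() / cm.sum()           # overall pixel accuracy
    return {"IoU": iou, "mIoU": iou.mean(), "F1": f1, "mF1": f1.mean(), "OA": oa}
```

For example, a perfect prediction gives mIoU = mF1 = OA = 1.0, while a prediction that misses one of four pixels of a two-class mask gives OA = 0.75.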
3. Results
3.1. Comparative Analysis on the Early Heading Stage of Cabbage
3.2. Comparative Analysis on the Compact Heading Stage of Cabbage
3.3. Ablation Study
4. Discussion
- Sensitivity to Extreme Weather Conditions: The current model was primarily trained and evaluated under common weather scenarios such as sunny and cloudy conditions. Its performance in more challenging environments, such as heavy rainfall and strong winds, has not yet been thoroughly examined, which may affect its robustness in real-world deployments.
- Dependence on High-Performance Hardware: Due to the integration of multiple expert modules, the model requires considerable computational resources, particularly during training. This limits its immediate applicability on edge devices, indicating a need for lightweight optimization in future work.
- Limited Generalizability to Other Crops: The current study focuses exclusively on cabbage segmentation. The model’s adaptability and effectiveness for other crops with similar morphological characteristics (e.g., lettuce and romaine) remain to be validated.
5. Conclusions
- U-MoEMamba introduces a heterogeneous expert collaboration architecture, which comprises three complementary expert branches: a multi-scale convolutional expert, a global attention expert (MSCrossDualAttention), and a long-range dependency modeling expert (VSSBlock). These experts are adaptively fused at the pixel level through a gated fusion module (MambaMoEFusion), enabling optimal feature representation for different spatial regions and significantly enhancing segmentation robustness in complex agricultural scenes. The proposed MSCrossDualAttention module, as one of the expert branches, effectively captures global contextual dependencies in the image, thereby strengthening the model’s semantic understanding of entire scenes.
- Compared with mainstream Mamba-based segmentation models (e.g., EfficientPyramidMamba and SegMamba), U-MoEMamba demonstrates superior performance on datasets corresponding to two key growth stages of cabbage. Specifically, it achieves 89.51% mIoU and 97.85% overall accuracy (OA) on the early heading dataset, and 91.88% mIoU and 97.98% OA on the compact heading dataset, with notable reductions in missed and incorrect segmentation.
- Trained and evaluated on UAV datasets collected under diverse real-world conditions—including overcast and sunny weather, varying degrees of leaf occlusion, and different planting densities—U-MoEMamba exhibits strong generalization and environmental adaptability. Its practical potential is significant for smart agriculture, particularly in tasks such as crop growth monitoring, heading stage assessment, and yield prediction. Furthermore, this work aligns with the broader trend of AI-driven agricultural automation and digital phenotyping, supporting labor-efficient, data-informed, and sustainable crop management. It offers a reliable technical foundation for advancing precision farming practices.
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Datasets | Total Number of Pictures | Training Set | Val Set | Test Set
---|---|---|---|---
Early heading stage cabbage image | 1602 | 1101 | 314 | 187
Compact heading stage cabbage image | 3354 | 2347 | 670 | 337
Method | Backgrounds IoU (%) | Backgrounds F1 (%) | Cabbage Fruit IoU (%) | Cabbage Fruit F1 (%) | Cabbage Leaves IoU (%) | Cabbage Leaves F1 (%) | mIoU (%) | mF1 (%) | OA (%)
---|---|---|---|---|---|---|---|---|---
SegMamba | 95.21 | 97.54 | 65.47 | 79.13 | 96.11 | 98.02 | 85.60 | 91.56 | 97.30
MambaUnet | 95.10 | 97.49 | 72.61 | 84.13 | 96.47 | 98.20 | 88.06 | 93.26 | 97.55
CM-UNet | 95.45 | 97.67 | 71.76 | 83.56 | 96.58 | 98.26 | 87.93 | 93.16 | 97.63
RS3-Mamba | 95.44 | 97.67 | 72.20 | 83.85 | 96.55 | 98.24 | 88.06 | 93.25 | 97.61
EfficientPyramidMamba | 95.08 | 97.48 | 72.76 | 84.23 | 96.49 | 98.21 | 88.11 | 93.30 | 97.57
U-MoEMamba | 95.68 | 97.79 | 75.96 | 86.63 | 96.90 | 98.42 | 89.51 | 94.28 | 97.85
Method | Params (M) | FLOPs (G) | mIoU (%)
---|---|---|---
SegMamba | 5.05 | 72.99 | 85.60 |
MambaUnet | 13.88 | 51.86 | 88.06 |
CM-UNet | 13.41 | 51.76 | 87.93 |
RS3-Mamba | 43.32 | 158.24 | 88.06 |
EfficientPyramidMamba | 28.76 | 76.22 | 88.11 |
U-MoEMamba | 20.88 | 165.72 | 89.51 |
Method | Background IoU (%) | Background F1 (%) | Cabbage Fruit IoU (%) | Cabbage Fruit F1 (%) | mIoU (%) | mF1 (%) | OA (%)
---|---|---|---|---|---|---|---
SegMamba | 96.97 | 98.46 | 81.98 | 90.09 | 89.47 | 94.28 | 97.34
MambaUnet | 97.35 | 98.65 | 84.05 | 91.33 | 90.70 | 96.44 | 97.67
CM-UNet | 97.44 | 98.70 | 84.52 | 91.61 | 90.98 | 95.15 | 97.76
RS3-Mamba | 97.56 | 98.76 | 85.28 | 92.05 | 90.52 | 95.40 | 97.86
EfficientPyramidMamba | 97.24 | 98.60 | 83.22 | 90.84 | 90.23 | 94.72 | 98.60
U-MoEMamba | 97.70 | 98.83 | 86.07 | 92.51 | 91.88 | 95.67 | 97.98
Method | Params (M) | FLOPs (G) | mIoU (%)
---|---|---|---
SegMamba | 5.05 | 72.99 | 89.47 |
MambaUnet | 13.88 | 51.86 | 90.70 |
CM-UNet | 13.41 | 51.76 | 90.98
RS3-Mamba | 43.32 | 158.24 | 90.52 |
EfficientPyramidMamba | 28.76 | 76.22 | 90.23 |
U-MoEMamba | 20.88 | 165.72 | 91.88 |
Expert 1 (Multi-Scale Conv) | Expert 2 (MSCrossDualAttention) | Expert 3 (VSSBlock) | MOEGatingNetwork | Params (M) | FLOPs (G) | Backgrounds IoU (%) | Cabbage Fruit IoU (%) | Cabbage Leaves IoU (%) | mIoU (%)
---|---|---|---|---|---|---|---|---|---
 | | | | 16.90 | 128.80 | 95.44 | 72.20 | 96.55 | 88.06
√ | | | | 17.68 | 136.09 | 95.64 | 74.44 | 96.78 | 88.95
 | √ | | | 16.99 | 129.62 | 95.63 | 74.53 | 96.79 | 88.98
 | | √ | | 16.84 | 128.48 | 95.52 | 74.84 | 96.76 | 89.04
√ | √ | | | 19.40 | 151.48 | 95.65 | 74.79 | 96.81 | 89.08
√ | | √ | | 19.25 | 150.34 | 95.60 | 75.22 | 96.50 | 89.01
 | √ | √ | | 18.56 | 143.90 | 95.65 | 74.56 | 96.79 | 89.00
√ | √ | √ | | 20.80 | 165.72 | 95.62 | 74.89 | 96.80 | 89.10
√ | √ | √ | √ | 20.88 | 165.72 | 95.68 | 75.96 | 96.90 | 89.51
Expert 1 (Multi-Scale Conv) | Expert 2 (MSCrossDualAttention) | Expert 3 (VSSBlock) | MOEGatingNetwork | Params (M) | FLOPs (G) | Backgrounds IoU (%) | Cabbage Fruit IoU (%) | mIoU (%)
---|---|---|---|---|---|---|---|---
 | | | | 16.90 | 128.80 | 97.59 | 85.42 | 91.50
√ | | | | 17.68 | 136.09 | 97.65 | 85.90 | 91.77
 | √ | | | 16.99 | 129.62 | 97.19 | 86.07 | 91.63
 | | √ | | 16.84 | 128.48 | 97.62 | 85.67 | 91.64
√ | √ | | | 19.40 | 151.48 | 97.55 | 85.32 | 91.43
√ | | √ | | 19.25 | 150.34 | 97.51 | 85.17 | 91.34
 | √ | √ | | 18.56 | 143.90 | 97.65 | 85.82 | 91.73
√ | √ | √ | | 20.80 | 165.72 | 97.63 | 85.74 | 91.69
√ | √ | √ | √ | 20.88 | 165.72 | 97.70 | 86.11 | 91.88
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, R.; Ding, X.; Peng, S.; Cai, F. U-MoEMamba: A Hybrid Expert Segmentation Model for Cabbage Heads in Complex UAV Low-Altitude Remote Sensing Scenarios. Agriculture 2025, 15, 1723. https://doi.org/10.3390/agriculture15161723