Article

Automatic Lettuce Weed Detection and Classification Based on Optimized Convolutional Neural Networks for Robotic Weed Control

1 College of Engineering, China Agricultural University, 17 Qinghua East Road, Haidian, Beijing 100083, China
2 College of Food Science and Nutritional Engineering, China Agricultural University, 17 Qinghua East Road, Haidian, Beijing 100083, China
* Author to whom correspondence should be addressed.
Agronomy 2024, 14(12), 2838; https://doi.org/10.3390/agronomy14122838
Submission received: 13 October 2024 / Revised: 20 November 2024 / Accepted: 26 November 2024 / Published: 28 November 2024
(This article belongs to the Special Issue Advanced Machine Learning in Agriculture)

Abstract

Weed management plays a crucial role in the growth and yield of lettuce, with timely and effective weed control significantly enhancing production. However, the increasing labor costs and the detrimental environmental impact of chemical herbicides have posed serious challenges to the development of lettuce farming. Mechanical weeding has emerged as an effective solution to address these issues. In precision agriculture, the prerequisite for autonomous weeding is the accurate identification, classification, and localization of lettuce and weeds. This study used an intelligent mechanical intra-row lettuce-weeding system based on a vision system, integrating the newly proposed LettWd-YOLOv8l model for lettuce–weed recognition and lettuce localization. The proposed LettWd-YOLOv8l model was compared with other YOLOv8 series and YOLOv10 series models in terms of performance, and the experimental results demonstrated its superior performance in precision, recall, F1-score, mAP50, and mAP95, achieving 99.732%, 99.907%, 99.500%, 99.500%, and 98.995%, respectively. Additionally, the mechanical component of the autonomous intra-row lettuce-weeding system, consisting of an oscillating pneumatic mechanism, effectively performs intra-row weeding. The system successfully completed lettuce localization tasks with an accuracy of 89.273% at a speed of 3.28 km/h and achieved a weeding rate of 83.729% for intra-row weed removal. This integration of LettWd-YOLOv8l and a robust mechanical system ensures efficient and precise weed control in lettuce cultivation.

1. Introduction

Currently, weeds are widely recognized as the primary biological factor affecting crop growth and causing yield reduction [1]. According to Oerke’s research [2], weeds can reduce crop yields by approximately 34%, resulting in substantial economic losses for agriculture. Lettuce, a staple in human diets, is cultivated extensively worldwide, and weed control is one of the most critical factors determining lettuce production [3]. Weed control in lettuce farming consists of inter-row and intra-row weeding. However, intra-row weed control poses greater challenges than inter-row weeding due to the close proximity of weeds to crop rows [4] and the high planting density of lettuce. At present, manual weeding and chemical herbicides are still the primary methods used [5,6]. Although manual weeding can accurately remove intra-row weeds, this labor-intensive method is not compatible with the demands of smart agriculture and significantly increases production costs [7]. Additionally, the prolonged use of chemical herbicides leads to herbicide resistance and severe environmental degradation [8,9]. In contrast, mechanical weeding has gained attention as a research focus due to its environmental friendliness and cost-effectiveness. Moreover, different types of weeds exhibit varying growth habits, root structures, and resistance, which directly impact the efficiency and effectiveness of weeding systems. Some weeds bear a close resemblance to lettuce in appearance, leading to potential misidentifications that reduce overall operational efficiency. By utilizing weed classification data, it is possible to further analyze the types and distribution of weeds in lettuce fields, providing a scientific basis for decision-making in precision agriculture. Therefore, developing an intelligent, real-time, fast, and cost-effective weeding machine for lettuce fields, equipped with technology for real-time weed identification, localization, and classification, is of great significance for improving lettuce production efficiency and advancing modern agriculture.
The emergence and advancement of deep learning and computer vision technologies have opened new avenues for smart and precision agriculture [10], offering new possibilities for the automation and intelligence of intra-row lettuce weeding. Over the past two decades, deep learning has led to the development of numerous renowned models [11], which can automatically extract features in high-dimensional spaces and have been widely applied in various other agricultural production scenarios [12,13].
Furthermore, with the rapid improvement in camera performance and computational power, computer vision has shown tremendous potential for tasks such as the rapid detection, classification, and localization of weeds and crops [14,15,16]. Earlier methods for detecting and classifying field weeds typically relied on preprocessing, segmentation, feature extraction, and classification techniques. However, these methods often exhibited poor robustness in cases where crops overlapped with weeds or under suboptimal lighting conditions [17], leading to the development of machine-learning-based weed detection algorithms. Rumpf et al. [18] proposed a weed classification method for small-grain crops using a support vector machine (SVM) combined with near-infrared spectroscopy, but it achieved only 80% classification accuracy. Pérez-Ortiz et al. [19] developed a semi-supervised weed detection system for sunflower crops based on SVM and Hough transform. Lottes et al. [20] combined random forest algorithms with near-infrared spectroscopy, using drone-acquired data to classify objects and weeds. However, machine learning classifiers require handcrafted feature vectors created by experts based on the visual texture, spectral characteristics, and spatial context of weeds and crops [13], significantly limiting the applicability of these methods to specific weed or crop types.
Deep learning overcomes these limitations by enabling automatic feature extraction. Tao et al. [21] proposed a hybrid convolutional neural network–support vector machine classifier, achieving 92.1% accuracy in weed detection for winter oilseed rape fields. Mohd Anul Haq [22] developed a CNN-LVQ model for weed classification in soybean fields. Zhang et al. [23] introduced a YOLOv8-based segmentation model for the localization of weed apical meristems, achieving outstanding segmentation accuracy with 97.2%, though the overall weed detection accuracy was only 81%. Hu et al. [24] developed a lightweight multimodule YOLOv7 model for lettuce recognition and weed severity classification by integrating ECA and CA mechanisms, ELAN-B3, and DownC modules, achieving a detection accuracy of 97.5%. Kong et al. [25] proposed the Efficient Crop Segmentation Net (ECSNet) based on the YOLO architecture for weed management in maize fields, achieving 90.9% mIoU50 and 90.2% accuracy in maize segmentation. In intelligent weed control systems, merely achieving classification or detection tasks is insufficient; the ultimate goal is the rapid and precise localization of crops and weeds. Quan et al. [26] developed a YOLOv3-based intelligent weeding robot for inter-row weed removal in cornfields, achieving detection accuracies of 98.5% for maize and 90.9% for weeds, with a weeding rate of 85.91% and seedling injury rate of 1.19%. Ju et al. [27] developed an adaptive weeding robot for paddy fields based on YOLOv5x, achieving an accuracy of 90.05%, a weeding rate of 82.4%, and a seedling injury rate of 2.8%. These studies demonstrate the immense potential of deep learning methods for fast weed and crop detection and localization. Additionally, YOLO object detectors have shown high precision in detecting weeds even in highly dynamic and challenging unstructured environments [28]. However, beyond accurate weed–crop recognition and localization, the development of real-time, fast, and efficient weeding equipment is also critical. Therefore, in addition to proposing a novel deep-learning-based lettuce–weed detection, classification, and localization model, this study also optimized a previously developed weeding system to validate the effectiveness of the proposed model [3]. The main contributions of this study are as follows:
  • This study proposed an optimized YOLOv8l model, which, to the best of our knowledge, is the first to incorporate both the GAM and CA mechanism for the rapid detection of lettuce and weeds, as well as the classification of six common weed species.
  • A high-efficiency vision system was developed to identify the emergence point of lettuce stems (i.e., the center point), integrating the LettWd-YOLOv8l model and a lettuce–weed localization method.
  • The intra-row weeding device, based on a vision system and pneumatic servo technology, was uniquely optimized in this study, marking a significant innovation compared to previous studies.
Overall, this research aims to advance the automation and intelligence of intra-row lettuce weeding by optimizing an intelligent weeding system and introducing a novel deep learning model to enhance the efficiency of lettuce–weed detection, classification, and localization. This provides valuable insights for the future development of precision and smart agriculture. The remainder of this study is organized into five sections: Section 2 describes the creation of the dataset, the structure of LettWd-YOLOv8l, the localization algorithm, and the optimized intelligent intra-row weeding system; Section 3 details the parameter settings, experimental environment, and evaluation metrics for various experiments (including model training and conveyor belt experiments); Section 4 presents and discusses the experimental results; Section 5 discusses the findings and potential future improvements; and Section 6 provides a summary and conclusion of the study. The full forms and annotations of abbreviations mentioned in this article are provided in Table A1, while the symbols and their corresponding meanings are listed in Table A2.

2. Methods and Materials

This study presents an autonomous intra-row weeding system for lettuce that uses the LettWd-YOLOv8l model for weed and crop detection. A dataset of 584 images collected in Haidian, Beijing, is expanded to 3008 images through data augmentation. The proposed model integrates the Global Attention Mechanism (GAM) and Coordinate Attention (CA) for improved feature extraction and localization. A custom lettuce localization algorithm ensures accurate lettuce detection. The system, equipped with pneumatically controlled cutting blades, operates under an STM32-based intelligent control system to selectively remove weeds without damaging the lettuce plants. This section is presented in the following order: (1) dataset; (2) model optimization and lettuce center localization algorithm; (3) autonomous intra-row lettuce-weeding system.

2.1. Dataset

2.1.1. Dataset Source

Based on insights from previous studies [29], the dataset was collected and annotated in three phases to ensure scientific rigor and validity: (1) field image collection, (2) preliminary data filtering, and (3) data annotation. The original dataset of lettuce and weed images was collected under sunny conditions in Haidian, Beijing, China (116.27° E, 40.06° N). The raw field images were then filtered to remove low-quality images, resulting in a dataset comprising 584 images of lettuce and weeds. This dataset includes images of six different weed species: Chenopodium album L. (CL), Portulaca oleracea L. (PL), Cynodon dactylon (L.) Persoon (CP), Amaranthus blitum L. (AL), Galinsoga parviflora Cav. (GC), and Glycine max (Linn.) Merr. (GM). Sample images from the dataset are shown in Figure 1.

2.1.2. Data Augmentation

Data augmentation is crucial for enhancing model performance and improving generalization to new input data [30]. To prevent overfitting and improve model robustness, this study applies five data augmentation techniques: rotation, flipping, noise injection, brightness adjustment, and cutout. In addition to flipping, the following adjustments are made to ensure that the augmented images differ from the original images while preserving their details [31,32,33]: (1) adding noise with an intensity of 0.4, (2) generating images with brightness levels ranging from 0.3 to 1.5 times the original, (3) applying random rotations from −30° to +30°, and (4) generating 15 random cutout blocks, each sized 40 × 40 pixels. After data augmentation, the final dataset comprises 3008 images. Examples of the data augmentation results are shown in Figure 2. Both the augmented and original datasets are split into training and validation sets at an 8:2 ratio, as shown in Table 1.
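For readers who wish to reproduce the augmentation pipeline, the following is a minimal Python/OpenCV sketch of the five operations described above. The parameter values mirror the text (noise intensity 0.4, brightness factors of 0.3–1.5, rotations of ±30°, and fifteen 40 × 40 cutout blocks), but the mapping of the noise intensity to pixel units and the function names are our assumptions, not the authors’ released code.

```python
# Sketch of the five augmentations (flip, rotation, brightness, noise, cutout).
import cv2
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    h, w = image.shape[:2]
    # Random horizontal flip
    if rng.random() < 0.5:
        image = cv2.flip(image, 1)
    # Random rotation in [-30, +30] degrees about the image centre
    angle = rng.uniform(-30, 30)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    image = cv2.warpAffine(image, M, (w, h))
    # Brightness scaling in [0.3, 1.5]
    image = np.clip(image.astype(np.float32) * rng.uniform(0.3, 1.5), 0, 255)
    # Additive Gaussian noise; sigma below is an assumed mapping of the 0.4 intensity
    noise_sigma = 0.4 * 25.5
    image = np.clip(image + rng.normal(0, noise_sigma, image.shape), 0, 255)
    # Cutout: fifteen random 40x40 blocks set to zero
    for _ in range(15):
        y, x = rng.integers(0, h - 40), rng.integers(0, w - 40)
        image[y:y + 40, x:x + 40] = 0
    return image.astype(np.uint8)
```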

2.2. Optimization of YOLOv8l Model and Lettuce Center Localization Algorithm

The YOLO series of networks has now evolved to YOLOv10 [34]. Among the ten versions of the YOLO series, YOLOv5 is the most widely applied in object detection [35], while the YOLOv8 series has also gained extensive use in crop detection [23,36]. Compared to YOLOv5, YOLOv8 features an optimized Backbone, where the C3 module from CSP-Darknet53 is replaced by the C2f module, further reducing the model’s size [37]. Additionally, the introduction of the ELAN module enhances the network’s representational capacity by efficiently aggregating deep features [38]. The YOLOv8 Neck employs a path aggregation network (PANet), which optimizes feature aggregation and information transfer across layers, improving detection performance. The Head section adopts a decoupled Head design, separating the classification and regression tasks to boost detection precision. Compared to YOLOv8n, YOLOv8l employs a more complex model structure, enhancing multi-scale feature fusion, deep Backbone networks, and advanced training strategies, thereby improving the model’s generalization and robustness. In practical applications for field crop detection, detection speed is critical, but accuracy is equally important. Notably, the YOLOv8l model balances detection accuracy with high speed [39]. Therefore, this study adopts YOLOv8l as the baseline model. The proposed LettWd-YOLOv8l model enhances the model’s ability to capture features in complex environments, allowing for more precise identification and localization of similar-looking weeds and lettuce, thereby improving target detection accuracy.

2.2.1. Detection Head and Neck

LettWd-YOLOv8l improves the YOLOv8l Head to better integrate features from different layers, capturing multi-scale information and enhancing detection accuracy. The Head processes multi-scale features received from the Neck and outputs bounding boxes, class probabilities, and confidence scores. By integrating and processing features at different scales, the Head can more effectively identify and localize objects in the image. To further enhance detection performance, the proposed model incorporates the Global Attention Mechanism (GAM) [40] into the Head, which significantly boosts the model’s ability to detect relevant information.
The GAM improves the model’s sensitivity to key information, enhancing feature extraction and object recognition. In the proposed model, the GAM is integrated into the C2f layer within the Head. The C2f layer extracts multi-scale local features through multiple convolutional layers and feature fusion, which are then passed to the GAM. The GAM refines these features using global attention, emphasizing important regions while suppressing background noise. Additionally, we fine-tune the position of the GAM in the model, replacing the sigmoid activation function [41] with ReLU [42], simplifying computations, improving training efficiency, and reducing the vanishing gradient issue. The ReLU function also enhances performance through sparse activations. The optimized GAM structure is shown in Figure 3a.
In the proposed model, the GAM processes the features passed from the C2f layer and forwards them to the Detect layer for predicting object classes and bounding boxes, as well as to the subsequent Conv layer for further convolutional operations. This makes the ReLU activation function more suitable for the GAM network in LettWd-YOLOv8l. By improving both the GAM module and the YOLOv8l Head, the proposed model can effectively combine local and global information, reducing the interference from redundant data and improving detection accuracy, efficiency, and inference speed. This enhances the model’s robustness and overall performance. The integration of the GAM module within the Head is illustrated in Figure 3b. The model’s Neck utilizes a structure similar to the FPN network [43], which improves object detection by combining upsampling and feature fusion.
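As a concrete illustration of the modified GAM described above, the PyTorch sketch below follows the published GAM layout (a channel MLP followed by two 7 × 7 spatial convolutions) with the sigmoid gates replaced by ReLU, as stated in the text. The reduction rate, layer names, and exact placement of the activation swap are our assumptions, not the authors’ implementation.

```python
import torch
import torch.nn as nn

class GAM(nn.Module):
    """Global Attention Mechanism with ReLU gates (sketch of the variant in the text)."""

    def __init__(self, channels: int, rate: int = 4):
        super().__init__()
        hidden = channels // rate
        # Channel attention: an MLP applied to the (H*W, C) view of the feature map
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )
        # Spatial attention: two 7x7 convolutions with a channel bottleneck
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=7, padding=3),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=7, padding=3),
            nn.BatchNorm2d(channels),
        )
        # ReLU gate in place of the sigmoid used by the original GAM (assumed placement)
        self.gate = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Channel attention weights
        att = self.channel_mlp(x.permute(0, 2, 3, 1).reshape(b, -1, c))
        x = x * self.gate(att.reshape(b, h, w, c).permute(0, 3, 1, 2))
        # Spatial attention weights
        return x * self.gate(self.spatial(x))
```

In LettWd-YOLOv8l, a block of this kind would sit between the C2f and Detect layers of the Head, as described above.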

2.2.2. Backbone

Coordinate Attention (CA) [44] is an efficient and lightweight attention mechanism that enhances the model’s representational capability by incorporating both channel and spatial attention while introducing coordinate information. The CA module decouples the spatial information of the input feature map and generates attention maps along the horizontal and vertical directions, which are then combined with channel attention weighting. This approach not only captures inter-channel correlations but also highlights important regions in the feature map, thereby improving the model’s object perception and localization capabilities. To prevent spatial information from being fully compressed into the channels, the model first decomposes the global average pooling operation along the x- and y-directions, enabling long-range spatial interaction with precise positional information. The decomposition is defined by the following formulas:
$z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i)$
$z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w)$
Next, convolution operations are applied separately to the x- and y-directional features, followed by concatenation as expressed by:
$f = \delta\left(F_1\left(\left[z^h, z^w\right]\right)\right)$
where $F_1$ represents dimensionality reduction and activation through a 1 × 1 convolutional kernel, producing the feature $f \in \mathbb{R}^{C/r \times (H+W) \times 1}$.
The processed features are then split along the spatial dimension into $f^h \in \mathbb{R}^{C/r \times H \times 1}$ and $f^w \in \mathbb{R}^{C/r \times 1 \times W}$, both of which undergo processing using 1 × 1 convolutions, followed by a sigmoid activation function, yielding the final attention vectors $g^h = \sigma(F_h(f^h))$ and $g^w = \sigma(F_w(f^w))$, where $\sigma$ represents the sigmoid activation function. Finally, the output of the CA module is given by:
$y_c(i, j) = x_c(i, j) \times g_c^h(i) \times g_c^w(j)$
Through this decomposition and fusion of feature information, the CA mechanism comprehensively learns feature correlations, enhancing the model’s ability to understand complex visual scenes and improving its performance in detecting small objects. In the proposed model, the CA module is integrated after one of the C2f layers in the Backbone, significantly improving the representational capability of the feature map. By focusing on both channel and spatial information, the model is better able to detect small objects. The features processed by the CA module are passed to the next Conv layer, optimizing feature flow and improving detection accuracy. Due to the lightweight nature of the CA module and the minimal number of modules integrated into the Backbone, the inclusion of the CA module does not significantly increase computational overhead or inference time, maintaining a balance between detection accuracy and efficiency. The overall structure of LettWd-YOLOv8l is shown in Figure 4.
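To make the CA formulation above concrete, the sketch below implements the block in PyTorch following Hou et al. [44]; the reduction ratio r and module name are illustrative choices rather than the exact configuration used in LettWd-YOLOv8l.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate Attention block (sketch, following Hou et al., CVPR 2021)."""

    def __init__(self, channels: int, r: int = 32):
        super().__init__()
        hidden = max(8, channels // r)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # z^h: average over width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # z^w: average over height
        self.conv1 = nn.Conv2d(channels, hidden, 1)    # shared 1x1 reduction F1
        self.bn = nn.BatchNorm2d(hidden)
        self.act = nn.ReLU(inplace=True)               # the non-linearity delta
        self.conv_h = nn.Conv2d(hidden, channels, 1)   # F_h
        self.conv_w = nn.Conv2d(hidden, channels, 1)   # F_w

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        zh = self.pool_h(x)                      # (b, c, h, 1)
        zw = self.pool_w(x).permute(0, 1, 3, 2)  # (b, c, w, 1)
        f = self.act(self.bn(self.conv1(torch.cat([zh, zw], dim=2))))
        fh, fw = torch.split(f, [h, w], dim=2)
        gh = torch.sigmoid(self.conv_h(fh))                      # g^h, shape (b, c, h, 1)
        gw = torch.sigmoid(self.conv_w(fw.permute(0, 1, 3, 2)))  # g^w, shape (b, c, 1, w)
        return x * gh * gw
```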

2.2.3. Lettuce Localization Algorithm

To prevent damage to crops during the intra-row weeding process of the autonomous intra-row lettuce-weeding system, this study integrates the LettWd-YOLOv8l model with a localization algorithm to achieve accurate lettuce position detection. In 2D images, traditional center localization algorithms often approximate the crop center by using the center of the bounding box (as shown in Figure 5); however, this does not represent the true center of the crop, leading to deviations in real-world agricultural applications.
To more accurately locate the crop’s center, this study optimizes the traditional localization algorithm. Since weeds often have a similar color to the crops, their presence can affect the model’s ability to accurately locate the crop’s center coordinates. To address this, the proposed method preprocesses the crop using the HSV color space to generate a mask that extracts the crop’s phenotypic color and then normalize the RGB image:
$R' = \frac{R}{255}, \quad G' = \frac{G}{255}, \quad B' = \frac{B}{255}$
Next, the maximum and minimum values of the corresponding grayscale values are calculated:
$C_{\max} = \max(R', G', B'), \quad C_{\min} = \min(R', G', B'), \quad \Delta = C_{\max} - C_{\min}$
Then, the hue H and the brightness V are calculated:
$H = \begin{cases} 0, & \Delta = 0 \\ 60^\circ \times \left(\dfrac{G' - B'}{\Delta} \bmod 6\right), & C_{\max} = R' \\ 60^\circ \times \left(\dfrac{B' - R'}{\Delta} + 2\right), & C_{\max} = G' \\ 60^\circ \times \left(\dfrac{R' - G'}{\Delta} + 4\right), & C_{\max} = B' \end{cases}, \qquad V = C_{\max}$
Based on the crop’s phenotypic color and varying grayscale thresholds, different parameters are selected. By combining the LettWd-YOLOv8l model and the localization algorithm, the autonomous intra-row lettuce-weeding system first detects the crop, generates the corresponding bounding box, and calculates the average area of all marked regions. Subsequently, smaller areas that may represent noise are filtered out as follows:
$ava = \frac{ta}{nr}$
where ta represents the total area of annotated regions in the image, nr denotes the number of valid annotated regions, and ava indicates the average area of each annotated region. If the number of regions is zero ($nr = 0$), then Equation (9) is not calculated.
The weighted sum ctr of the centroids of all regions is calculated by multiplying each region’s centroid by its area and summing the results, while the total area ta is accumulated in parallel. If valid regions are present, then the average centroid ac is computed from them; if not, then the centroid and total area of all regions are used (i.e., the center of the bounding box). The centroid calculation is as follows:
$ctr = \sum_{i=0}^{n} cent[i] \cdot sta[i][area]$
$ta = \sum_{i=0}^{n} sta[i][area]$
$ac = \frac{rc}{rta}$
$ac = \frac{ctr}{ta}$
where cent[i] represents the centroid coordinates of the i-th region and sta[i][area] denotes the area of the i-th region. ctr is the weighted sum of centroids across all regions, where the weight is the region’s area. rc represents the weighted sum of centroids for valid regions, and rta is the total area of valid regions. When rta is not zero ($rta \neq 0$), Equation (11) is applied; otherwise, in the case of no valid regions, Equation (12) is used. ac represents the average centroid.
This study uses the grayscale mask to separate the crop’s center from other parts and applies binarization to the center image. Depending on the crop, different grayscale thresholds are set: pixels with values above the threshold are considered foreground, while others are treated as background, thus extracting the crop’s position. The results of the binarization process for the crop center are shown in Figure 6a. To compare the traditional and proposed center localization algorithms, this study processes images of individual crops as well as images where crops coexist with weeds, marking the localized coordinates on the original images, as shown in Figure 6b.
The proposed localization algorithm determines the centroid of the detected target. The final output is generated at a rate of 120 frames per second and saved in a ‘txt’ file, which is then provided to the weeding mechanism. This file includes the detected crop type and its corresponding center coordinates. The structure of the weed–crop center localization system is depicted in Figure 7. The proposed localization algorithm not only accurately locates lettuce but can also be integrated with mechanical weeding devices to complete intra-row weeding tasks. Furthermore, it has the potential to be combined with laser [24] or precision pesticide-spraying [45] technologies for more refined weeding operations. This study primarily focuses on combining this localization algorithm with mechanical weeding devices for intra-row lettuce-weeding.
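The following Python/OpenCV sketch captures the overall flow of the localization step as we read it from the description above: HSV masking of the crop color, filtering of small regions by the average area, and an area-weighted centroid. The HSV thresholds and the fallback behavior are placeholder assumptions, not the tuned values used in the paper.

```python
import cv2
import numpy as np

def lettuce_center(bgr_roi: np.ndarray,
                   lower=(30, 40, 40), upper=(90, 255, 255)):
    """Return an (x, y) centre estimate for the crop inside a detection ROI."""
    hsv = cv2.cvtColor(bgr_roi, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower), np.array(upper))
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    areas = stats[1:, cv2.CC_STAT_AREA]           # skip the background label 0
    cents = centroids[1:]
    if len(areas) == 0:
        h, w = mask.shape
        return (w / 2.0, h / 2.0)                 # fall back to the box centre
    ava = areas.sum() / len(areas)                # average region area (ava)
    valid = areas >= ava                          # drop small, noisy regions
    sel_areas = areas[valid] if valid.any() else areas
    sel_cents = cents[valid] if valid.any() else cents
    ac = (sel_cents * sel_areas[:, None]).sum(axis=0) / sel_areas.sum()
    return tuple(ac)                              # area-weighted centroid (ac)
```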

2.3. Autonomous Intra-Row Lettuce-Weeding System

2.3.1. Mechanical Weeding Device

This study integrates an object detection method with a weeding mechanism based on cutting blades, designing an intelligent intra-row weed control system for lettuce. This design builds upon previous research [3] and the weeding devices developed by Raja et al. [46] and Pérez-Ruíz et al. [47]. The structure of the mechanical weeding device is shown in Figure 8. The machine’s frame is made of aluminum profiles, and the weeding blades consist of a pair of blades positioned side by side along the row centerline. The blades are driven by two cylinders (model SC40 × 50, Juxiang Intelligent Technology (Shenzhen) Co., Shenzhen City, China) and have a pentagonal shape, with a width of 6.2 cm, a length of 11.4 cm, and a thickness of 0.8 cm. The front ends of the blades are sharpened and are fixed to the mechanical arms of the weeding device. The forward direction of the intra-row weeding device aligns with the triangular tips of the blades. Additionally, the blades are kept parallel to the soil surface and maintain a depth of no more than 2.3 cm during operation. The mechanical arms connected to the blades are pivoted 70 cm above the soil surface, allowing the blades to move in the direction of the weeding device’s travel. When the blades close, they remove weeds in a 12.4 cm wide intra-row area at the center of the weeding device. As the blades approach the crop, they are rapidly pushed away from the centerline at the same time, preventing damage to the crops. The movement of the blades is controlled by a solenoid valve (model 4V210-08 DC24V, Juxiang Intelligent Technology (Shenzhen) Co., Shenzhen City, China), which drives the cylinders by providing air pressure (0.7 MPa).
Figure 9 illustrates the operational process of the intra-row weeding device for lettuce. This study artificially divides the farmland into three areas: Region α is the Inter-Row Area, Region β is the In-Row Area, and Region γ is the Crop Security Zone. In the In-Row Area, the weeding blade (approximately 12.4 cm wide) is used to remove weeds within the crop row. As shown in Figure 9, the weeding blade’s operation can be divided into three stages from left to right: at Position I and Position III, the solenoid valve is activated, the cylinder extends, and the weeding blades close together, moving forward in parallel. At Position II, the solenoid valve deactivates, the cylinder retracts, and the blades open to avoid the lettuce, ensuring that the Crop Security Zone remains undamaged. After passing the lettuce, the cylinder reactivates to close the blades, returning them to the In-Row Area. This process is repeated for each lettuce plant, minimizing the risk of crop damage during weeding.
During the intra-row weeding operation, the intelligent lettuce-weeding system relies on precise visual recognition tools to ensure the blades open at the correct position. The accuracy of the visual system’s positioning directly affects the weeding device’s performance, as precise detection of the lettuce plant spacing can significantly reduce the risk of mechanical damage to the crops. Additionally, achieving this functionality requires an accurate intelligent control system. Therefore, this study integrates the improved YOLOv8l lettuce–weed model and the intra-row weeding system with a real-time control system to ensure both precision and safety during the weeding process.

2.3.2. Intelligent Control System

As shown in Figure 10a, the intelligent control system of the weeding device primarily consists of a power supply, industrial camera (HF868-2, Shenzhen Jierui Microcomputer Electronic Technology Co., Shenzhen City, China), computer, STM32F103C8T6 controller (STM32F103C8T6, STMicroelectronics, 39 Chemin du Champ des Filles, 1228 Plan-les-Ouates, Geneva, Switzerland), solenoid valve, air compressor, and cylinder. The core of the control system is the STM32, which enables parameter settings and fine-tuning through the control core, while the working status of the weeding device is displayed in real time on the computer. The controller we use is a microcontroller based on the ARM Cortex-M3 core, programmed using Keil, and equipped with an I/O interface board. The 220 V power supply provides energy to the laptop, buck module, and air compressor. The industrial camera connects to the laptop via a USB interface, capturing real-time information on crops and weeds and transmitting the video stream to the laptop, which runs the LettWd-YOLOv8l model. This model analyzes the video data from the industrial camera, identifying lettuce and weed types and obtaining the positional information of the crops. The localization algorithm automatically determines the crop center and transmits this information to the controller, which then controls the solenoid valve and cylinder, powering the weeding device to avoid damaging the lettuce during the operation. The solenoid valve is powered by the buck module, while the cylinder is driven by the air compressor. Throughout the entire intra-row weeding process, the visual recognition program runs on the computer, processing the lettuce–weed information captured by the industrial camera in real time and transmitting the precise location of the lettuce to the controller, which adjusts the opening and closing of the weeding blades to achieve precise weeding. The control algorithm flowchart for the intelligent control system of the weeding device is shown in Figure 10b.
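To illustrate how the host computer might hand the localization results to the STM32 controller, here is a minimal pyserial sketch; the serial port, baud rate, and one-line ASCII message format are assumptions for illustration, since the paper does not specify the communication protocol.

```python
import serial  # pyserial

def send_center(port: serial.Serial, cls_id: int, x_px: int, y_px: int) -> None:
    # One ASCII line per detection: "<class>,<x>,<y>\n", parsed on the MCU side
    port.write(f"{cls_id},{x_px},{y_px}\n".encode("ascii"))

if __name__ == "__main__":
    with serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=0.1) as mcu:
        send_center(mcu, 0, 320, 240)  # e.g., lettuce (class 0) near the frame centre
```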

3. Experiment

3.1. Parameter Setting and Experimental Environment

The data training process was conducted on a server equipped with an NVIDIA GeForce RTX 4080 GPU (NVIDIA, Santa Clara, CA, USA, 16 GB VRAM) and an Intel® Core™ i9-14900K @ 6.00 GHz processor (Intel, Santa Clara, CA, USA). To ensure optimal performance for each model, this study set the number of epochs to 800. However, due to varying model complexities, to avoid wasting computational resources while still achieving optimal performance, this study implemented an Early Stopping Algorithm [48], which halts training once the model converges and surpasses the patience parameter. The training parameters were as follows: a learning rate of 0.01, batch size of 6, and patience set to 20. The number of training epochs varied depending on the model.
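Expressed with the Ultralytics training API, the reported configuration would look roughly like the sketch below; the dataset file name is a placeholder, and the GAM/CA-modified LettWd-YOLOv8l would require the authors’ custom model definition rather than the stock yolov8l file used here.

```python
from ultralytics import YOLO

model = YOLO("yolov8l.yaml")   # baseline; the GAM/CA variant needs a custom model YAML
model.train(
    data="lettwd.yaml",        # placeholder dataset config (8:2 train/val split, Table 1)
    epochs=800,                # upper bound; early stopping usually ends training sooner
    patience=20,               # early-stopping patience reported in the text
    lr0=0.01,                  # initial learning rate
    batch=6,                   # batch size
)
```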
The industrial camera used in the experiment (HF868-2, Shenzhen Jierui Microcomputer Electronic Technology Co., Shenzhen City, China) had a maximum resolution of 480 p (640 × 480) and a frame rate of 120 frames per second. The autonomous intra-row lettuce-weeding system used in the experiment is shown in Figure 11, with the camera positioned approximately 67 cm above the ground and about 40 cm horizontally from the weeding blade. The camera’s field of view covered an area of approximately 300 square centimeters.

3.2. Model Evaluation Metrics

To evaluate the effectiveness of the LettWd-YOLOv8l model in object detection tasks, this study utilized a set of performance metrics commonly applied in the object detection field. The primary metrics used included the loss function, precision, recall, F1-score, mean average precision (mAP), and the confusion matrix. The loss function measures the effectiveness of data fitting, precision assesses the accuracy of positive category predictions, recall evaluates the recognition rate of the positive category, and F1-score provides the harmonic mean of recall and precision. mAP serves as a comprehensive indicator of the model’s performance, while the confusion matrix offers a visual representation of the model’s classification performance. The formulas for recall, precision, F1-score, and mAP are as follows:
$\mathrm{Recall} = \frac{tp}{tp + fn}$
$\mathrm{Precision} = \frac{tp}{tp + fp}$
$\mathrm{F1\text{-}Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
$\mathrm{mAP} = \frac{1}{N} \sum_{n=1}^{N} AP_n$
where tp denotes true positives, fn denotes false negatives, fp denotes false positives, and N denotes the number of classes.
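For completeness, the four metrics translate directly into code; the tp, fp, and fn counts are assumed to come from matching predictions to ground-truth boxes at a given IoU threshold, which is not shown here.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of ground-truth positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

def mean_ap(ap_per_class):
    """mAP: mean of per-class average precision values."""
    return sum(ap_per_class) / len(ap_per_class)
```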

3.3. Conveyor Belt Experiment

In the conveyor belt experiment described in this study, we primarily evaluated the weeding effectiveness of the autonomous intra-row lettuce-weeding system under varying weed densities. Following the research methodology of Raja et al. [46], we planted weeds at different densities between lettuce plants to simulate the range of weed densities encountered in real-world production. Weed density was categorized into three levels: light (fewer than 10 weeds per square meter), moderate (11–100 weeds), and heavy (more than 100 weeds). To assess the effectiveness of the proposed autonomous weeding system, we introduced two evaluation metrics: lettuce localization success rate (Loc) [3] and weeding rate (Wr) [49]. The specific formulas are as follows:
$Loc = \left(1 - \frac{Le_w + Le_m}{Le_{total}}\right) \times 100\%$
$Wr = \frac{W_O - W_A}{W_O} \times 100\%$
where $Le_w$ represents the number of misidentified lettuce plants, $Le_m$ refers to the number of undetected lettuce plants, $Le_{total}$ denotes the total number of lettuce plants, $W_O$ represents the total number of weeds before weeding, and $W_A$ signifies the number of remaining weeds after weeding.
In this experiment, the conveyor belt on the test platform operated at a speed of 3.28 km/h to simulate the walking speed of the weeding device during field operations.
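The two conveyor-belt metrics are equally direct to compute from the manually recorded counts; the function names below are ours.

```python
def localization_rate(le_w: int, le_m: int, le_total: int) -> float:
    """Loc: share of lettuce plants neither misidentified (le_w) nor missed (le_m)."""
    return (1 - (le_w + le_m) / le_total) * 100.0

def weeding_rate(w_before: int, w_after: int) -> float:
    """Wr: share of weeds removed, from counts taken before and after a pass."""
    return (w_before - w_after) / w_before * 100.0
```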

4. Results

4.1. Training of LettWd-YOLOv8l Model

In this study, we compared the LettWd-YOLOv8l model with eight other YOLOv8 models and five YOLOv10 models. The trends during training for these models are shown in Figure 12. Figure 12a presents the Box_loss curve of the proposed LettWd-YOLOv8l model for the lettuce–weed recognition task, reflecting its accuracy in locating targets, including both training and validation losses. The losses for all models rapidly decreased within the first 100 epochs and gradually stabilized after the 300th epoch. Notably, the YOLOv10 series models stabilized after only the 200th epoch. Figure 12b displays the DFL_loss curve for the lettuce–weed recognition task, representing the regression accuracy of the bounding boxes, also showing both training and validation losses. Similar to Box_loss, the losses decreased significantly during the first 100 epochs and stabilized after 300 epochs, with the YOLOv10 series models again stabilizing after only the 200th epoch. Although the LettWd-YOLOv8l model converged slightly slower than the other models, its fitting performance was superior, indicating that the LettWd-YOLOv8l model achieved a good balance between accuracy and recognition speed, demonstrating strong generalization ability and robustness in the lettuce–weed detection task.
Figure 13 shows the trends of loss functions and evaluation metrics for LettWd-YOLOv8l during training and validation. As training progressed, both training and validation losses significantly decreased, indicating an improvement in prediction accuracy for these tasks. Moreover, the evaluation metrics on the validation set also showed marked improvement, demonstrating that the proposed model effectively learned the target detection tasks during training and exhibited strong generalization capabilities in the validation set. Table 2 shows the training and validation loss values at model convergence. Although the proposed model is slightly inferior to certain lightweight models in terms of inference time per image, it significantly outperforms other models in both training and validation loss, demonstrating complementary strengths and highlighting its superior overall performance.

4.2. Detection and Classification of LettWd-YOLOv8l Network

In this study, we trained a total of fourteen different models, including LettWd-YOLOv8l for lettuce–weed classification, to evaluate their performance. Table 3 presents the performance metrics for each model. The experimental results show that LettWd-YOLOv8l achieved precision, recall, mAP@0.5, mAP@[0.5:0.95], and F1-score values of 99.732%, 99.907%, 99.500%, 98.995%, and 99.500%, respectively. Although LettWd-YOLOv8l slightly underperformed compared to YOLOv8x and YOLOv8n + GAM + CA in terms of precision and recall, it demonstrated superior performance in mAP@0.5, mAP@[0.5:0.95], and F1-score, indicating that LettWd-YOLOv8l strikes an excellent balance between accuracy and recognition speed, validating the model’s reliability. Notably, the YOLOv10 series models all slightly underperformed compared to the proposed model.
As shown in Table 4, this study also analyzed the performance of the LettWd-YOLOv8l model in classifying lettuce and six common weed species: Chenopodium album L. (CL), Portulaca oleracea L. (PL), Cynodon dactylon (L.) Persoon (CP), Amaranthus blitum L. (AL), Galinsoga parviflora Cav. (GC), and Glycine max (Linn.) Merr. (GM). The proposed LettWd-YOLOv8l model achieved a recognition accuracy of 99.042% for lettuce, and similarly, it also attained 99.042% accuracy for these six weed species, demonstrating its excellent performance in the lettuce–weed classification task. The confusion matrix presented in Figure 14 further supports this conclusion, clearly illustrating the proposed LettWd-YOLOv8l model’s outstanding ability in recognizing and classifying both lettuce and the various weed species.

4.3. Results of the Conveyor Belt Experiment

4.3.1. Efficiency of Lettuce Localization Approach

In this conveyor belt experiment, we first evaluated the efficiency of the lettuce localization approach under three different weed density conditions. Figure 15 illustrates the lettuce localization results. During the experiment, researchers assessed the localization accuracy by observing whether the weeding blades could accurately avoid the lettuce plants, recording the corresponding results. The experimental performance was evaluated using the lettuce localization success rate, with detailed results presented in Table 5.
As shown in Table 5, the average localization accuracy of the lettuce localization approach across the three weed density levels was 86.049%. The experiment demonstrated that under light density and good light conditions, the localization accuracy was the highest, reaching 89.273%. Under the same light conditions (whether good or poor), as weed density increased, the accuracy gradually decreased, but the overall success rate remained relatively stable. During the experiment, it was observed that when weeds were densely distributed around the lettuce, the localization algorithm sometimes misidentified weeds as lettuce, marking incorrect center points. This likely contributed to the observed decrease in accuracy. Under better light conditions, this issue showed improvement. Notably, we observed slight shifts in the position of center points due to changes in the light source’s location. Despite the suboptimal lighting conditions in the laboratory environment, the lettuce localization success rate still exhibited strong performance, demonstrating the robustness of the proposed algorithm.
The lettuce localization experiments based on the LettWd-YOLOv8l model showed that the proposed method is reliable in both lettuce–weed recognition and lettuce localization, highlighting its potential for broader applications. Under the current experimental conditions (3.28 km/h), the model achieved an average localization accuracy of 86.049%. Future improvements in both software and hardware are expected to enhance the system’s performance further, contributing to increased efficiency and yield in lettuce production.

4.3.2. Weeding Efficiency of Autonomous Intra-Row Lettuce-Weeding System

In this experiment, we conducted weeding trials under varying light conditions and three preset weed density conditions on the experimental platform, with weed distribution shown in Figure 16a,c, corresponding to light, medium, and heavy densities. Figure 16b,d presents the weeding results of the autonomous intra-row lettuce-weeding system under different weed densities and different light conditions. The experimental results indicate that as weed density increases, the difficulty of weeding also rises. Under light-weed-density conditions, the proposed system was able to accurately identify weeds and successfully perform the weeding operation, achieving a weeding rate of 86.717% (poor light conditions) and 88.661% (good light conditions). Under medium-weed-density conditions, although the precision of weeding and obstacle avoidance decreased due to the increased weed density, the proposed system still demonstrated strong weed identification and removal capabilities, with a weeding rate of 83.229% (poor light conditions) and 83.856% (good light conditions). Under heavy-weed-density conditions, the large and dense weed coverage posed a greater challenge for the system in distinguishing between lettuce and weeds, resulting in a decrease in weeding efficiency. However, the overall weeding performance remained relatively high, with a weeding rate of 78.909% (poor light conditions) and 81.002% (good light conditions).
Specifically, under light-weed-density conditions, the low-density weeds were effectively turned into the soil by the weeding blades, achieving optimal weeding results. Under medium weed density, some weeds were turned into the soil while others were pushed aside by the blades. Notably, under heavy-weed-density conditions, the weeds were also pushed aside by the blades, but when the density was very high, weeds occasionally became entangled with the blades. Additionally, when weeds were densely concentrated, the later-contacted weeds were only slightly damaged and might not have been fully removed, potentially leading to regrowth. Notably, under favorable light conditions, both the lettuce localization success rate and the weeding rate showed a slight improvement. The experimental findings demonstrate that while this system’s weeding capability varies across different weed densities, it still exhibits robust performance and effective weeding even under high-density conditions. The average weeding rate of the system is 83.729%. The weed removal rates under different light conditions and weed densities are shown in Figure 17.

5. Discussion

In this study, we successfully designed and implemented an efficient and low-cost autonomous intra-row lettuce-weeding system. The system integrates a vision recognition module to control the activation and deactivation of the weeding blade, thereby facilitating effective weed removal between lettuce plants. The proposed vision recognition system is built upon the LettWd-YOLOv8l model, specifically tailored for lettuce–weed identification and localization tasks. Compared to traditional computer vision methods, the YOLOv8 model has demonstrated superior adaptability to the complex field conditions of agricultural environments [28,39]. Leveraging the proposed deep-learning-based intelligent vision system, we simulated the operation of the lettuce-weeding device on a conveyor belt to mimic field conditions. The experimental results indicated that the proposed device and method are both feasible and efficient. However, the damage rate to lettuce plants by the system has yet to be evaluated. In this study, 0.7 MPa was selected as the standard operating pressure for the cylinder. Pressure and installed power may influence the efficiency and effectiveness of the weeding system. Future research will further investigate the specific impact of these factors on overall weeding performance.
A higher precision typically indicates the model performs well in identifying positive samples. However, by analyzing the experimental results, we observed that the recall rate for Amaranthus blitum L. (AL) was relatively lower compared to other weed species, which can be attributed to the smaller number of samples of this species in our dataset. This suggests that the model might be too strict in matching with limited samples, potentially overlooking positive samples from other categories, which may lead to overfitting. To address this, future work should focus on expanding the dataset, particularly by acquiring more images of AL, and applying more data augmentation techniques to enhance the dataset. Additionally, implementing k-fold cross-validation will allow for better evaluation of the model’s performance across different datasets, reducing bias from relying on a single dataset. Introducing regularization methods could further improve the model’s generalization ability.
Under poor light conditions, both the lettuce localization success rate and the weeding rate declined only slightly, likely because data augmentation techniques such as noise injection, cutout, and brightness adjustment applied during dataset creation significantly enhanced the model’s robustness under suboptimal lighting. Notably, under good light conditions, the LettWd-YOLOv8l model showed a substantial reduction in misclassifying weeds as lettuce and extracting their center points. However, we observed that the extracted lettuce center points were slightly affected by changes in the light source’s position.
As illustrated in Figure 18, this study employed the response surface methodology to investigate the effects of light conditions and weed density on lettuce localization success rate and weeding rate. Light conditions positively influenced both metrics, with better lighting significantly improving performance, particularly under lower weed density. Additionally, weed density emerged as a critical factor affecting lettuce localization success, as both metrics declined notably with increasing weed density. Furthermore, an interaction effect was observed between light conditions and weed density, impacting both metrics, though the influence of weed density was more pronounced.
During the experiments, we observed that some weeds tended to cluster in certain areas, which occasionally caused the weeding blade to become entangled with the weeds. This, in turn, affected the precise localization of the lettuce plants and disrupted the normal operation of the weeding blade, thereby reducing the weeding efficiency and effectiveness of the proposed system. The experimental results indicate that excessively high weed density significantly reduces weeding efficiency. Therefore, it is recommended that farmers implement weed control measures early in the lettuce growth stage to enhance the system’s efficiency and weeding effectiveness. Future research should focus on improving the algorithm and optimizing the weeding strategy to enhance the overall weeding rate of the proposed autonomous intra-row lettuce-weeding system. Additionally, further studies should aim to optimize the design of the weeding blade to improve its resistance to weed entanglement and its ability to penetrate the soil. Moreover, enhancements in the motion control algorithms and strategies of the weeding blade [50] will be critical to ensuring high-efficiency weed removal under complex weed conditions. Given that lighting conditions may affect the performance of the vision recognition system, future research should also include experimental validation under different lighting scenarios to assess the system’s robustness and reliability in varying environmental conditions.
Table 6 presents the recent advancements in intelligent weeding equipment for various crops. Notably, the proposed system demonstrated outstanding performance in both weed removal rate and crop detection accuracy. However, it is important to acknowledge that during the conveyor belt testing phase, LettWd-YOLOv8l did not fully achieve the performance level observed during training. These discrepancies may be attributed to both software and hardware factors. On the software side, communication issues between the laptop and the STM32 microcontroller could be the root cause. On the hardware side, the performance of both the laptop and the microcontroller plays a crucial role in overall system accuracy. Although the crop detection accuracy in the conveyor belt experiments did not reach the level achieved during model training, this does not directly indicate poor real-world performance of the model. The actual recognition performance was limited by hardware conditions, and there remains room for improvement through technical enhancements. Several studies cited in the table have conducted field experiments. However, this study was limited to simulation experiments conducted on a conveyor belt platform, which does not fully replicate the complex production conditions in actual lettuce fields. To more comprehensively validate the proposed optimized YOLOv8l model and the mechanical lettuce-weeding system, future research should not be restricted to conveyor belt experiments. Future work should focus on developing a suitable mobile platform, enabling the weeding system to conduct field experiments on real lettuce crops. Through such field trials, the system’s weeding performance and robustness can be better assessed under unstructured environmental factors such as varying terrain and lettuce planting densities. These evaluations will provide a more complete validation of the feasibility and effectiveness of the device and model in real-field applications.

6. Conclusions

In this study, we designed and implemented an intelligent mechanical intra-row lettuce-weeding device based on a deep-learning-powered vision recognition system. The device utilizes the LettWd-YOLOv8l model, which performs tasks such as detecting lettuce and weeds, precisely locating lettuce plants, and classifying weeds. The proposed model is an improvement over the original YOLOv8l, enhanced by the integration of CA (Coordinate Attention) and GAM (Global Attention Mechanism) modules at appropriate layers. As a result, the model achieved outstanding performance across several metrics, with precision, recall, F1-score, mAP50, and mAP95 reaching 99.732%, 99.907%, 99.5%, 99.5%, and 98.995%, respectively. To evaluate the system’s weeding performance, we simulated field conditions using a conveyor belt setup. The experimental results showed that the proposed autonomous intra-row lettuce-weeding system was able to achieve 89.273% accuracy in crop detection and lettuce localization tasks, and an 83.729% weeding rate at a speed of 3.28 km/h under different light conditions and weed densities. The findings of this study provide valuable insights and knowledge for the development of autonomous weeding robots, offering an innovative solution for precision weeding in modern agriculture.

Author Contributions

Conceptualization, W.-H.S. and R.-F.W.; methodology, W.-H.S. and R.-F.W.; software, C.-T.Z.; validation, C.-T.Z., Y.-H.T., and W.-H.S.; formal analysis, R.-F.W. and C.-T.Z.; investigation, X.-X.P.; resources, W.-H.S.; data curation, X.-X.P.; writing—original draft preparation, R.-F.W. and C.-T.Z.; writing—review and editing, C.-T.Z., W.-H.S., and R.-F.W.; visualization, X.-X.P. and Y.-H.T.; supervision, W.-H.S.; project administration, W.-H.S. and R.-F.W.; funding acquisition, W.-H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 32371991; 32101610).

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Full forms and annotations of abbreviations used in this study.

Abbreviation | Full Name | Note
GAM | Global Attention Mechanism | An attention mechanism designed to enhance the performance of deep neural networks.
CA | Coordinate Attention | An attention mechanism aimed at improving the accuracy of lightweight deep learning models.
CL | Chenopodium album L. | A common weed found in lettuce fields.
PL | Portulaca oleracea L. | A common weed found in lettuce fields.
CP | Cynodon dactylon (L.) Persoon | A common weed found in lettuce fields.
AL | Amaranthus blitum L. | A common weed found in lettuce fields.
GC | Galinsoga parviflora Cav. | A common weed found in lettuce fields.
GM | Glycine max (Linn.) Merr. | A common weed found in lettuce fields.
ta | total_area | Total area of annotated regions in an image.
nr | num_regions | Number of valid annotated regions.
ava | average_area | Average area per annotated region.
cent[i] | centroid[i] | Centroid coordinates of the i-th region.
sta[i][area] | stats[i][area] | Area of the i-th region.
ctr | centroid | Weighted sum of centroids across regions, with region area as the weight.
rc | result_centroid | Weighted sum of centroids for valid regions.
rta | result_total_area | Total area of valid regions.
ac | aver_centroid | Average centroid.
mAP | Mean Average Precision | \
AP | Average Precision | \
tp | True Positive | \
fn | False Negative | \
fp | False Positive | \
Table A2. Symbols used in this study and their meanings.

Symbol | Meaning
$z_c^h(h)$ | Global average pooling along the height (h) direction for channel C, preserving only vertical features.
$z_c^w(w)$ | Global average pooling along the width (w) direction for channel C, preserving only horizontal features.
$x_c(h, i)$ | Pixel value at row h and column i of channel C in input feature map x.
$x_c(j, w)$ | Pixel value at row j and column w of channel C in input feature map x.
W | Width of the feature map (number of columns).
H | Height of the feature map (number of rows).
C | Index representing a specific channel.
i | Index of columns in the horizontal direction.
j | Index of rows in the vertical direction.
h | Index of a specific row in the vertical direction.
w | Index of a specific column in the horizontal direction.
$f$ | Feature map after attention module processing, containing both $z^h$ and $z^w$ information, with dimensions $f \in \mathbb{R}^{C/r \times (H+W) \times 1}$.
$\delta$ | Activation function, introducing non-linearity in the output.
$F_1$ | 1 × 1 convolution; a convolution operation to reduce dimensionality and compress channel features.
$z^h$ | Vertical feature calculated by Equation (1).
$z^w$ | Horizontal feature calculated by Equation (2).
$[z^h, z^w]$ | New feature vector formed by concatenating $z^h$ and $z^w$ along the channel dimension.
$\sigma$ | Sigmoid activation function.
$y_c(i, j)$ | Pixel value at row i and column j of channel C in output feature map y.
$x_c(i, j)$ | Pixel value at row i and column j of channel C in input feature map x.
$g_c^h(i)$ | Attention weight for row i in the vertical direction of channel C, generated by 1 × 1 convolution and sigmoid activation on $f^h$.
$g_c^w(j)$ | Attention weight for column j in the horizontal direction of channel C, generated by 1 × 1 convolution and sigmoid activation on $f^w$.
R | Red channel of an image pixel.
G | Green channel of an image pixel.
B | Blue channel of an image pixel.
R' | Normalized red channel of an image pixel.
G' | Normalized green channel of an image pixel.
B' | Normalized blue channel of an image pixel.
$C_{\max}$ | Maximum value among normalized RGB channels for the current pixel.
$C_{\min}$ | Minimum value among normalized RGB channels for the current pixel.
$\Delta$ | Difference between the maximum and minimum values, representing the intensity range in RGB space.
Recall | A deep learning model evaluation metric.
Precision | A deep learning model evaluation metric.
F1-Score | A deep learning model evaluation metric.
N | The number of classes.
Loc | Lettuce localization success rate.
$Le_w$ | The number of misidentified lettuce plants.
$Le_m$ | The number of undetected lettuce plants.
$Le_{total}$ | The total number of lettuce plants.
$Wr$ | Weeding rate.
$W_O$ | The total number of weeds before weeding.
$W_A$ | The number of remaining weeds after weeding.

References

  1. Hu, R.; Niu, L.-T.; Su, W.-H. A Novel Mechanical-Laser Collaborative Intra-Row Weeding Prototype: Structural Design and Optimization, Weeding Knife Simulation, and Laser Weeding Experiment. Front. Plant Sci. 2024, 15, 1469098. [Google Scholar] [CrossRef]
  2. Oerke, E.C. Crop Losses to Pests. J. Agric. Sci. 2006, 144, 31–43. [Google Scholar] [CrossRef]
  3. Jiang, B.; Zhang, J.; Su, W.; Hu, R. A SPH-YOLOv5x-Based Automatic System for Intra-Row Weed Control in Lettuce. Agronomy 2023, 13, 2915. [Google Scholar] [CrossRef]
  4. Pérez-Ruiz, M.; Slaughter, D.C.; Gliever, C.J.; Upadhyaya, S.K. Automatic GPS-Based Intra-Row Weed Knife Control System for Transplanted Row Crops. Comput. Electron. Agric. 2012, 80, 41–49. [Google Scholar] [CrossRef]
  5. Nyamangara, J.; Mashingaidze, N.; Masvaya, E.N.; Nyengerai, K.; Kunzekweguta, M.; Tirivavi, R.; Mazvimavi, K. Weed Growth and Labor Demand Under Hand-Hoe-Based Reduced Tillage in Smallholder Farmers’ Fields in Zimbabwe. Agric. Ecosyst. Environ. 2014, 187, 146–154. [Google Scholar] [CrossRef]
  6. Allmendinger, A.; Spaeth, M.; Saile, M.; Peteinatos, G.G.; Gerhards, R. Precision Chemical Weed Management Strategies: A Review and a Design of a New CNN-Based Modular Spot Sprayer. Agronomy 2022, 12, 1620. [Google Scholar] [CrossRef]
  7. Melander, B.; Rasmussen, G. Effects of Cultural Methods and Physical Weed Control on Intrarow Weed Numbers, Manual Weeding and Marketable Yield in Direct-Sown Leek and Bulb Onion. Weed Res. 2001, 41, 491–508. [Google Scholar] [CrossRef]
  8. Perotti, V.E.; Larran, A.S.; Palmieri, V.E.; Martinatto, A.K.; Permingeat, H.R. Herbicide Resistant Weeds: A Call to Integrate Conventional Agricultural Practices, Molecular Biology Knowledge and New Technologies. Plant Sci. 2020, 290, 110255. [Google Scholar] [CrossRef]
  9. Dai, X.; Xu, Y.; Zheng, J.; Song, H. Analysis of the Variability of Pesticide Concentration Downstream of Inline Mixers for Direct Nozzle Injection Systems. Biosyst. Eng. 2019, 180, 59–69. [Google Scholar] [CrossRef]
  10. Panati, H.S.; Gopika, G.; Andrushia, D.; Neebha T., M. Weeds and Crop Image Classification Using Deep Learning Technique. In Proceedings of the 2023 9th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 17–18 March 2023. [Google Scholar]
  11. Bengio, Y.; Courville, A.; Vincent, P. Representation Learning: A Review and New Perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828. [Google Scholar] [CrossRef]
  12. Yao, M.; Huo, Y.; Tian, Q.; Zhao, J.; Liu, X.; Wang, R.; Xue, L.; Wang, H. FMRFT: Fusion Mamba and DETR for Query Time Sequence Intersection Fish Tracking. arXiv 2024, arXiv:2409.01148. [Google Scholar]
  13. Rui-Feng, W.; Wen-Hao, S. The Application of Deep Learning in the Whole Potato Production Chain: A Comprehensive Review. Agriculture 2024, 14, 1225. [Google Scholar] [CrossRef]
  14. Zhang, J.; Su, W.; Zhang, H.; Peng, Y. SE-YOLOv5x: An Optimized Model Based on Transfer Learning and Visual Attention Mechanism for Identifying and Localizing Weeds and Vegetables. Agronomy 2022, 12, 2061. [Google Scholar] [CrossRef]
  15. Tang, J.; Wang, D.; Zhang, Z.; He, L.; Xin, J.; Xu, Y. Weed Identification Based on K-Means Feature Learning Combined with Convolutional Neural Network. Comput. Electron. Agric. 2017, 135, 63–70. [Google Scholar] [CrossRef]
  16. Mu, Y.; Hu, J.; Wang, H.; Li, S.; Zhu, H.; Luo, L.; Wei, J.; Ni, L.; Chao, H.; Hu, T.; et al. Research on the Behavior Recognition of Beef Cattle Based on the Improved Lightweight CBR-YOLO Model Based on YOLOv8 in Multi-Scene Weather. Animals 2024, 14, 2800. [Google Scholar] [CrossRef]
  17. Almalky, A.M.; Ahmed, K.R. Real Time Deep Learning Algorithm for Counting Weed’s Growth Stages. In Proceedings of the 2023 IEEE 15th International Symposium on Autonomous Decentralized System (ISADS), Mexico City, Mexico, 15–17 March 2023. [Google Scholar]
  18. Rumpf, T.; Römer, C.; Weis, M.; Sökefeld, M.; Gerhards, R.; Plümer, L. Sequential Support Vector Machine Classification for Small-Grain Weed Species Discrimination with Special Regard to Cirsium arvense and Galium aparine. Comput. Electron. Agric. 2012, 80, 89–96. [Google Scholar] [CrossRef]
  19. Pérez-Ortiz, M.; Peña, J.M.; Gutiérrez, P.A.; Torres-Sánchez, J.; Hervás-Martínez, C.; López-Granados, F. A Semi-Supervised System for Weed Mapping in Sunflower Crops Using Unmanned Aerial Vehicles and a Crop Row Detection Method. Appl. Soft Comput. 2015, 37, 533–544. [Google Scholar] [CrossRef]
  20. Lottes, P.; Khanna, R.; Pfeifer, J.; Siegwart, R.; Stachniss, C. UAV-Based Crop and Weed Classification for Smart Farming. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017. [Google Scholar]
  21. Tao, T.; Wei, X. A Hybrid CNN–SVM Classifier for Weed Recognition in Winter Rape Field. Plant Methods 2022, 18, 1. [Google Scholar] [CrossRef]
  22. Anul Haq, M. CNN Based Automated Weed Detection System Using UAV Imagery. Comput. Syst. Sci. Eng. 2022, 42, 837–849. [Google Scholar] [CrossRef]
  23. Zhang, D.; Lu, R.; Guo, Z.; Yang, Z.; Wang, S.; Hu, X. Algorithm for Locating Apical Meristematic Tissue of Weeds Based on YOLO Instance Segmentation. Agronomy 2024, 14, 2121. [Google Scholar] [CrossRef]
  24. Hu, R.; Su, W.; Li, J.; Peng, Y. Real-Time Lettuce-Weed Localization and Weed Severity Classification Based on Lightweight YOLO Convolutional Neural Networks for Intelligent Intra-Row Weed Control. Comput. Electron. Agric. 2024, 226, 109404. [Google Scholar] [CrossRef]
  25. Kong, X.; Liu, T.; Chen, X.; Jin, X.; Li, A.; Yu, J. Efficient Crop Segmentation Net and Novel Weed Detection Method. Eur. J. Agron. 2024, 161, 127367. [Google Scholar] [CrossRef]
  26. Quan, L.; Jiang, W.; Li, H.; Li, H.; Wang, Q.; Chen, L. Intelligent Intra-Row Robotic Weeding System Combining Deep Learning Technology with a Targeted Weeding Mode. Biosyst. Eng. 2022, 216, 13–31. [Google Scholar] [CrossRef]
  27. Ju, J.; Chen, G.; Lv, Z.; Zhao, M.; Sun, L.; Wang, Z.; Wang, J. Design and Experiment of an Adaptive Cruise Weeding Robot for Paddy Fields Based on Improved YOLOv5. Comput. Electron. Agric. 2024, 219, 108824. [Google Scholar] [CrossRef]
  28. Upadhyay, A.; Zhang, Y.; Koparan, C.; Rai, N.; Howatt, K.; Bajwa, S.; Sun, X. Advances in Ground Robotic Technologies for Site-Specific Weed Management in Precision Agriculture: A Review. Comput. Electron. Agric. 2024, 225, 109363. [Google Scholar] [CrossRef]
  29. Wang, Z.; Wang, R.; Wang, M.; Lai, T.; Zhang, M. Self-supervised Transformer-Based Pre-training Method with General Plant Infection Dataset. In Pattern Recognition and Computer Vision, Proceedings of the PRCV 2024; Lin, Z., Ed.; Lecture Notes in Computer Science; Springer: Singapore, 2024; Volume 15032. [Google Scholar]
  30. Mao, M.; Lee, A.; Hong, M. Efficient Fabric Classification and Object Detection Using YOLOv10. Electronics 2024, 13, 3840. [Google Scholar] [CrossRef]
  31. DeVries, T.; Taylor, G.W. Improved Regularization of Convolutional Neural Networks with Cutout. arXiv 2017, arXiv:1708.04552. [Google Scholar]
  32. Yang, S.; Xiao, W.; Zhang, M.; Guo, S.; Zhao, J.; Shen, F. Image Data Augmentation for Deep Learning: A Survey. arXiv 2022, arXiv:2201.07075. [Google Scholar]
  33. Semenova, N.; Larger, L.; Brunner, D. Understanding and Mitigating Noise in Trained Deep Neural Networks. Neural Netw. 2022, 146, 151–160. [Google Scholar] [CrossRef]
  34. Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. arXiv 2024, arXiv:2405.14458. [Google Scholar]
  35. Wang, A.; Peng, T.; Cao, H.; Xu, Y.; Wei, X.; Cui, B. TIA-YOLOv5: An Improved YOLOv5 Network for Real-Time Detection of Crop and Weed in the Field. Front. Plant Sci. 2022, 13, 1. [Google Scholar] [CrossRef] [PubMed]
  36. Chen, G.; Hou, Y.; Cui, T.; Li, H.; Shangguan, F.; Cao, L. YOLOv8-CML: A Lightweight Target Detection Method for Color-Changing Melon Ripening in Intelligent Agriculture. Sci. Rep. 2024, 14, 14400. [Google Scholar] [CrossRef] [PubMed]
  37. Reis, D.; Kupec, J.; Hong, J.; Daoudi, A. Real-Time Flying Object Detection with YOLOv8. arXiv 2023, arXiv:2305.09972. [Google Scholar]
  38. Zhang, X.; Zeng, H.; Guo, S.; Zhang, L. Efficient Long-Range Attention Network for Image Super-Resolution. In Computer Vision—Proceedings of the ECCV 2022: 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland, 2022. [Google Scholar]
  39. Kahya, E.; Özdüven, F.F.; Ceylan, B.C. Application of YOLOv8L Deep Learning in Robotic Harvesting of Persimmon (Diospyros kaki). ISPEC J. Agric. Sci. 2023, 7, 587–601. [Google Scholar]
  40. Liu, Y.; Shao, Z.; Hoffmann, N. Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions. arXiv 2021, arXiv:2112.05561. [Google Scholar]
  41. Elfwing, S.; Uchibe, E.; Doya, K. Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. Neural Netw. 2018, 107, 3–11. [Google Scholar] [CrossRef]
  42. Agarap, A.F. Deep Learning Using Rectified Linear Units (ReLU). arXiv 2018, arXiv:1803.08375. [Google Scholar]
  43. Lin, T.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 1–3 July 2017. [Google Scholar]
  44. Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
  45. Raja, R.; Nguyen, T.T.; Slaughter, D.C.; Fennimore, S.A. Real-Time Weed-Crop Classification and Localisation Technique for Robotic Weed Control in Lettuce. Biosyst. Eng. 2020, 192, 257–274. [Google Scholar] [CrossRef]
  46. Raja, R.; Nguyen, T.T.; Slaughter, D.C.; Fennimore, S.A. Real-Time Robotic Weed Knife Control System for Tomato and Lettuce Based on Geometric Appearance of Plant Labels. Biosyst. Eng. 2020, 194, 152–164. [Google Scholar] [CrossRef]
  47. Pérez-Ruíz, M.; Slaughter, D.C.; Fathallah, F.A.; Gliever, C.J.; Miller, B.J. Co-Robotic Intra-Row Weed Control System. Biosyst. Eng. 2014, 126, 45–55. [Google Scholar] [CrossRef]
  48. Wang, X.X.; Li, J.G. A New Early Stopping Algorithm for Improving Neural Network Generalization. In Proceedings of the 2009 Second International Conference on Intelligent Computation Technology and Automation, Changsha, China, 10–11 October 2009. [Google Scholar]
  49. Jiao, J.; Hu, L.; Chen, G.; Tu, T.; Wang, Z.; Zang, Y. Design and Experiment of an Inter-Row Weeding Equipment Applied in Paddy Field. Trans. Chin. Soc. Agric. Eng. 2023, 39, 11–22, (In Chinese with English Abstract). [Google Scholar]
  50. Wang, Y.; Ye, Y.; Wu, H.; Tao, K.; Qian, M. In Different Weed Distributions, the Dynamic Coverage Algorithm for Mechanical Selective Weeding Robot. Comput. Electron. Agric. 2024, 226, 109486. [Google Scholar] [CrossRef]
  51. Raja, R.; Slaughter, D.C.; Fennimore, S.A.; Tao, K.; Qian, M. Real-Time Control of High-Resolution Micro-Jet Sprayer Integrated with Machine Vision for Precision Weed Control. Biosyst. Eng. 2023, 228, 31–48. [Google Scholar] [CrossRef]
  52. Wang, J.; Weng, W.; Ju, J.; Siemens, M.C. Design and Test of Weeder Between Rows in Rice Field Based on Remote Control Steering. Trans. Chin. Soc. Agric. Mach. 2021, 52, 97–105, (In Chinese with English Abstract). [Google Scholar]
  53. Quan, L.; Zhang, J.; Jiang, W.; Li, H.; Yang, C.; Zhang, X. Development and Experiment of Intra-Row Weeding Robot System Based on Protection of Maize Root System. Trans. Chin. Soc. Agric. Mach. 2021, 52, 115–123, (In Chinese with English Abstract). [Google Scholar]
Figure 1. Dataset sample description: (a) an example of lettuce; (b) an example of CL; (c) an example of PL; (d) an example of CP; (e) an example of AL; (f) an example of GC; (g) an example of GM.
Figure 2. Examples of augmented samples and their effects.
Figure 3. Structural diagram of the optimized GAM module and its position: (a) optimized GAM module structure; (b) GAM module location.
Figure 4. Overall framework of LettWd-YOLOv8l model.
Figure 5. Traditional center localization methods.
Figure 6. (a) Lettuce center binarization treatment effect; (b) lettuce center localization detection schematic: Traditional localization visual coordinate points in blue, optimized localization visual coordinate points in red.
Figure 7. Lettuce–weed center localization system structure.
Figure 8. Mechanical weeding device: (1) conveyor belt; (2) air compressor; (3) electric motor powering the conveyor belt; (4) weeding knives; (5) industrial camera; (6) pneumatic cylinder; (7) mechanical arms; (8) aluminum profile frame.
Figure 9. Schematic of lettuce–weed distribution: (a) working area schematic diagram; (b) schematic diagram of weeding principle.
Figure 10. Autonomous intra-row lettuce-weeding system: (a) components of proposed intra-row weeding system; (b) control algorithm flow chart of proposed intelligent control system.
Figure 11. Photograph of the autonomous intra-row lettuce-weeding system.
Figure 12. Comparison of loss curves for fourteen YOLO models: (a) Box_loss curves of the fourteen models; (b) DFL_loss curves of the fourteen models. An epoch represents one complete iteration of training, signifying one full pass through the training dataset for model parameter updates and learning. Note: YOLOv8l + GAM + CA is LettWd-YOLOv8l.
Figure 13. Performance of LettWd-YOLOv8l model.
Figure 14. Confusion matrix of the trained LettWd-YOLOv8l model for lettuce and six common weed classifications.
Figure 15. Results of lettuce localization: Poor light conditions: (a) light density; (b) moderate density; (c) heavy density. Good light conditions: (d) light density; (e) moderate density; (f) heavy density.
Figure 16. Weeding effect of autonomous intra-row lettuce-weeding system: Poor light conditions: (a) weed distribution diagram; (b) weeding effect of different weed densities. Good light conditions: (c) weed distribution diagram; (d) weeding effect of different weed densities.
Figure 17. Validation results of autonomous intra-row lettuce-weeding system under different weed densities: (a) poor light conditions; (b) good light conditions.
Figure 18. Response surface analysis of the effects of light conditions and weed densities on lettuce localization success rate and weeding rate. The left graph presents the response surface of light conditions and weed densities in relation to the lettuce localization success rate, while the right graph shows the response surface in relation to the weeding rate. The parameters set for this study are as follows: (1) Light conditions: poor = 0; good = 1. (2) Weed density: light density = 1; moderate density = 2; heavy density = 3.
Table 1. Dataset splitting.
Class | Training Set (Original) | Training Set (Augmentation) | Test Set | Total
Lettuce | 784 | 196 | 196 | 980
CL | 490 | 122 | 123 | 612
PL | 313 | 79 | 79 | 392
CD | 180 | 45 | 25 | 225
AL | 149 | 37 | 38 | 186
GC | 282 | 71 | 71 | 353
GM | 208 | 52 | 52 | 260
Original4681165843008
Augmentation2406602
Note: "Training Set" and "Test Set" denote the dataset division after augmentation.
Table 2. Training and validation loss values of YOLO models on our dataset for object detection. Bolding indicates best performance.
Model | Train/Box Loss | Train/Cls Loss | Validation/Box Loss | Validation/Cls Loss | Inference Time per Image (ms)
YOLOv8n | 0.518 | 0.416 | 0.244 | 0.203 | 35.4
YOLOv8n + GAM + CA | 0.348 | 0.272 | 0.201 | 0.150 | 38.6
YOLOv8s | 0.337 | 0.274 | 0.194 | 0.157 | 42.6
YOLOv8m | 0.337 | 0.278 | 0.205 | 0.172 | 44.4
YOLOv8l | 0.338 | 0.315 | 0.222 | 0.184 | 45.4
YOLOv8x | 0.295 | 0.243 | 0.180 | 0.136 | 48.8
YOLOv10n | 0.683 | 0.596 | 0.417 | 0.376 | 30.5
YOLOv10s | 0.524 | 0.428 | 0.305 | 0.245 | 39.6
YOLOv10m | 0.591 | 0.490 | 0.339 | 0.278 | 42.4
YOLOv10l | 0.559 | 0.471 | 0.301 | 0.240 | 44.8
YOLOv10x | 0.590 | 0.514 | 0.322 | 0.256 | 46.7
YOLOv8l + GAM | 0.262 | 0.201 | 0.184 | 0.145 | 47.8
YOLOv8l + CA | 0.302 | 0.285 | 0.214 | 0.165 | 45.7
LettWd-YOLOv8l | 0.246 | 0.198 | 0.167 | 0.131 | 48.6
Table 3. Comparison of LettWd-YOLOv8l with other YOLO models on our dataset for object detection. Bolding indicates best performance.
Model | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@[0.5:0.95] (%) | F1-Score (%)
YOLOv8n | 99.641 | 99.453 | 99.342 | 96.960 | 99.500
YOLOv8n + GAM + CA | 99.658 | 100.000 | 99.499 | 98.159 | 99.495
YOLOv8s | 99.783 | 99.832 | 99.499 | 98.606 | 99.495
YOLOv8m | 99.778 | 99.881 | 99.499 | 98.465 | 99.500
YOLOv8l | 99.731 | 99.902 | 99.500 | 97.967 | 99.500
YOLOv8x | 99.797 | 99.940 | 99.500 | 98.956 | 99.500
YOLOv10n | 83.356 | 86.970 | 93.133 | 81.523 | 86.942
YOLOv10s | 96.358 | 94.835 | 98.784 | 92.608 | 95.596
YOLOv10m | 91.766 | 90.795 | 96.849 | 86.785 | 91.541
YOLOv10l | 93.712 | 92.720 | 97.959 | 91.277 | 94.500
YOLOv10x | 93.528 | 94.598 | 98.313 | 91.292 | 94.672
YOLOv8l + GAM | 99.728 | 99.805 | 99.500 | 98.665 | 99.500
YOLOv8l + CA | 99.712 | 99.801 | 99.428 | 98.421 | 99.500
LettWd-YOLOv8l | 99.732 | 99.907 | 99.500 | 98.995 | 99.500
Table 4. Lettuce and weed classification results of LettWd-YOLOv8l model. Bolding indicates best performance.
Plant Species | Precision (%) | Recall (%) | mAP@0.5 (%) | mAP@[0.5:0.95] (%) | F1-Score (%)
Lettuce | 99.354 | 99.625 | 99.276 | 99.042 | 99.496
CL | 99.876 | 100.00 | 99.045 | 99.042 | 99.500
PL | 99.410 | 100.00 | 99.362 | 99.042 | 99.500
GM | 99.926 | 100.00 | 99.347 | 99.042 | 99.500
AL | 100.00 | 99.703 | 97.266 | 99.042 | 99.500
GC | 99.321 | 100.00 | 99.500 | 99.042 | 99.500
CD | 99.685 | 100.00 | 99.500 | 99.042 | 99.500
Table 5. Validation results of lettuce localization approach under different weed densities.
Light Conditions | Weed Density | Plant Number | Missed/Incorrect Number | Correct Detection Number | Accuracy (%)
Poor | Light | 306 | 36 | 270 | 88.235
Poor | Moderate | 317 | 42 | 275 | 86.751
Poor | Heavy | 298 | 56 | 242 | 81.208
Good | Light | 289 | 31 | 258 | 89.273
Good | Moderate | 294 | 38 | 256 | 87.075
Good | Heavy | 277 | 45 | 232 | 83.754
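As a worked example of how the accuracy column in Table 5 is obtained, the correct detections divided by the total plant count for the poor-light, light-density case give:
\[
\mathrm{Accuracy} = \frac{270}{306} \times 100\% \approx 88.235\%
\]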
Table 6. Research progress of intelligent weeding equipment in recent years.
Crop | Technology | Weed Removal Rate | Crop Detection Accuracy | References
Lettuce | YOLOv5x | \ | 87.80% | Jiang et al. [3]
Rice | YOLOv5s | 82.40% | 90.05% | Ju et al. [27]
Tomato | Crop signal technology | 83.00% | 97.80% | Raja et al. [46]
Lettuce | Crop signal technology | 73.70% | 99.40% | Raja et al. [51]
Rice | Remote control | 77.90% | \ | Wang et al. [52]
Corn | YOLOv4 | 81.00% | 94.04% | Quan et al. [53]
Lettuce | YOLOv8l | 82.95% | 99.40% | This study
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
