1. Introduction
Weed infestation and management are critical production challenges in agricultural fields [1]. Weeds reproduce prolifically and compete with crops for resources such as nutrients, water, and light, directly or indirectly affecting both crop yield and quality [2]. To reduce production losses, timely and effective weed control is particularly important.
Intra-row weed control is challenging because of the high risk of damaging crops when targeting nearby weeds [3]. Although manual weeding can remove weeds precisely, it is labor-intensive, costly, and often ineffective, misidentifying or missing approximately 35% of weeds [4]. Chemical herbicides are widely used because they are efficient, convenient, and simple to apply, effectively reducing weed competition with crops. However, this method wastes large quantities of herbicide, pollutes the environment, and promotes herbicide resistance in weeds, which is not conducive to the sustainable development of agricultural production [5,6,7,8,9]. Mechanical weeding is environmentally friendly and efficient, but it has a higher rate of crop damage during operation, which can hinder crop growth and lead to secondary injuries through pathogen infection, thus reducing yield [10,11]. Laser weeding is an emerging, precise, non-contact physical method of weed control [12]. It uses high-energy laser beams that deliver a high density of energy to selected points, heating and damaging or killing weed tissue so that weeds are targeted and removed rapidly and precisely [13]. Compared with traditional mechanical and chemical methods, laser weeding is environmentally friendly, efficient, flexible, and automated, improving the efficiency and accuracy of weed control [14]. However, the main challenge of laser weed control is to accurately differentiate between weeds and crops and to pinpoint the weeds [15]. Therefore, developing fast and accurate real-time identification and localization technology is crucial for the practical application of intelligent weeding equipment.
The accurate identification of weeds is a prerequisite for precise weed control management in the field, and machine vision technology is an effective means of achieving it [16]. With the continuous development of deep learning and computer vision, the weed recognition accuracy of weeding robots has improved rapidly, and the transition from theory to practical application is accelerating [17]. Traditional methods achieve target detection mainly by combining manually designed features with classifiers, but they are limited in efficiency and generalization ability [18,19,20,21,22] and are gradually being replaced by end-to-end detection methods based on deep learning. While modern approaches such as DETR and SAM demonstrate strong capabilities in open-domain segmentation, their computational demands and hardware requirements limit practical deployment on agricultural edge devices. Although U-Net combined with SAM has demonstrated excellent performance in this setting, its training and inference require substantial hardware resources (such as an NVIDIA RTX A6000 GPU) [23]. DETR [24] achieves 28 FPS on high-end GPUs, and an improved Swin-Unet [25] achieves 15.1 FPS on an Ubuntu 20.04 system with PyTorch 1.6, CUDA 11.6, and cuDNN 8.4.0, whereas You Only Look Once (YOLO) enables real-time weeding on edge devices. EfficientDet-D7 achieves 55.1 AP on the COCO test set with significantly fewer parameters and FLOPs (floating-point operations) than other detectors; EfficientDet-D0 requires only 1/28 of the FLOPs needed by YOLOv3 to achieve a similar level of accuracy. However, for applications that require rapid deployment or have limited computational resources, especially those demanding minimal latency and quick response, YOLO stands out for its simplicity and efficiency of implementation [10]. This highlights the trade-off between model complexity and robustness, where YOLO’s architecture excels at balancing speed and accuracy [26].
YOLO is an advanced object detection algorithm proposed by Joseph Redmon and others that can detect multiple objects in an image within a short time [27]. The YOLO algorithm detects objects in a frame in near real time, making it both fast and accurate, so it is widely used in agriculture to identify and locate crops and weeds. Zhu et al. [14] designed a blue laser weeding robot based on the YOLOX neural network for weed control in cornfields. Compared with traditional convolutional neural networks, YOLOX had an advantage in recognizing small targets; on flat ground, the model achieved average recognition rates of 92.45% for corn and 88.94% for weeds, demonstrating strong robustness and reliability. Hu et al. [28] proposed a YOLOv4 network for detecting 12 types of rice weeds. Experimental results showed that the proposed algorithm performed 11.6% better than YOLOv3 and was suitable for real-time detection in precision agriculture. Chen et al. [29] added an attention mechanism to YOLOv4 to detect weeds among sesame crops; the average accuracy for detecting sesame crops and weeds was 96.16%, with a detection speed of 27.17 FPS. Ying et al. [30] replaced the original backbone of the YOLOv4 network with MobileNetV3-Small and introduced a lightweight attention mechanism, reducing the memory required for image processing and thus improving the efficiency of the detection model. Yong et al. [31] proposed a BEM-YOLOv7-tiny target detection model for peanut and weed identification and localization in different weeding periods; the model’s mAP@0.5 and F1 score were 88.3% and 92.4%, respectively, with a volume of only 14.1 MB, meeting the requirements of real-time seedling and weed detection and positioning. In summary, adding attention mechanisms and optimizing the model architecture are important ways to improve the detection performance and speed of YOLO models. While emerging models such as DETR and SAM offer specialized capabilities, YOLO’s balance of speed, accuracy, and deployability makes it the best choice for real-time agricultural robots. The improvement and development of the YOLO model provide favorable technical conditions for intelligent agricultural robots. Developing lightweight, highly accurate target detection models that can easily be deployed on edge devices is important for achieving effective weed control in lettuce fields.
The primary contributions of this study are as follows: (1) A lightweight object detection model based on the YOLOv8 network was built. (2) A lettuce intra-row laser weeding system based on deep learning was developed. (3) The proposed lightweight YOLOv8s-CBAM model has a size of only 6.2 MB, but its mAP@0.5 is 98.9%. (4) The established laser weeding system has realized real-time and effective weed control, contributing to green and sustainable development.
2. Materials and Methods
2.1. Overall Design of the Lettuce Intra-Row Laser Weeding System
The main work of this study was to develop a lettuce intra-row laser weeding system based on deep learning, which can accurately identify and locate lettuce and weeds and thereby control the laser to kill the weeds. The laser weeding system consisted of a perception module, a decision module, and an execution module. The system structure is shown in Figure 1.
As the core component of the perception module, the industrial camera is the eye of the whole laser weeding system. Its primary function is to capture images of weeds between lettuce plants under the control of an upper-level computer program. The industrial camera is a 1.2-megapixel MV-UB130GM small industrial camera (Mindvision Technology Co., Ltd., Shenzhen, China), with a maximum image resolution of 1280 × 960 and a frame rate of 39 FPS. The camera transmits image data to the computer via a USB interface.
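As an illustration only (the actual system drives the camera through the MindVision vendor SDK, and the device index below is an assumption), a minimal OpenCV sketch of frame grabbing on the host computer might look like this:

```python
import cv2

# Hypothetical frame-grabbing sketch; the real system uses the
# MindVision SDK rather than a generic UVC/OpenCV driver.
cap = cv2.VideoCapture(0)  # device index is an assumption
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)   # camera's maximum resolution
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 960)

ok, frame = cap.read()
if ok:
    cv2.imwrite("intra_row_frame.png", frame)  # hand off to the detector
cap.release()
```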
The decision module primarily consisted of a computer and an MTL decision system. As shown in Figure 2, the computer processed the images of weeds between lettuce plants acquired by the perception module to obtain the positional information of the weeds. Based on this positional information, the MTL decision system performed path planning and output weeding instructions, thereby controlling the weeding execution mechanism to carry out the corresponding actions. The system was equipped with an Intel® Core™ i9-14900K CPU (Intel Corporation, Santa Clara, CA, USA), an NVIDIA GeForce RTX 4080 GPU (NVIDIA Corporation, Santa Clara, CA, USA), and 64 GB of RAM (Kingston, Fountain Valley, CA, USA), which met all processing requirements. The path planning algorithm adopted by the MTL decision system was the 2-opt local search ant colony algorithm, known for its robust performance.
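As a conceptual sketch only (the MTL decision system itself is not published, and the tour representation and cost function here are assumptions), the 2-opt local search step that refines an ant-colony tour over the detected weed coordinates can be illustrated as follows:

```python
import math

def tour_length(tour, pts):
    """Total Euclidean length of a closed tour over weed coordinates."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    """Repeatedly reverse tour segments while any reversal shortens the path."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(candidate, pts) < tour_length(tour, pts):
                    tour, improved = candidate, True
    return tour

# Example: hypothetical weed apex coordinates (mm); initial order from the ACO stage.
weeds = [(12.0, 5.0), (40.0, 8.0), (25.0, 30.0), (60.0, 12.0)]
print(two_opt(list(range(len(weeds))), weeds))
```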
The execution module consisted of a microcontroller, a stepper motor, a linear slide and carriage, a laser, and a laser head (as shown in Figure 3), with the microcontroller, stepper motor, and laser being the core components. The microcontroller received the weed positions and weeding path planned by the host computer and, based on these instructions, moved the laser beam to the designated position to execute the laser weeding operation. The stepper motor and laser were responsible for driving the carriage and switching the laser on and off, respectively. The microcontroller was an Arduino board based on the ATmega328 (Microchip Technology Inc., Chandler, AZ, USA), which operates with an external input voltage of 7–12 V DC. The stepper motor was a model 57CME13D (Leadshine, Shenzhen, Guangdong Province, China) with a holding torque of 1.3 N·m and a rated current of 4 A. The laser was a 450 nm blue semiconductor laser (Han’s TCS Semiconductor Co., Ltd., Beijing, China) with a multi-chip coupled fiber output port connected to the laser emission head and a maximum output power of 50 W; it supported three control modes: local, RS232, and AD. The key functional parameters of the laser are listed in Table 1.
The workflow of the laser weeding system was divided into three parts: image acquisition, recognition and decision, and laser weeding. First, the perception module captured images of weeds between the lettuce plants, which were transmitted to the computer via a serial port. The MTL decision system then processed these images, made decisions, and output weeding instructions. Finally, the Arduino microcontroller controlled the weeding execution mechanism to perform the laser weeding action. The main frame of the laser weeding prototype was built from aluminum profiles and was mounted over a conveyor belt running at 0.91 km/h. The laser head was fixed vertically at a distance of 13 cm from the seedling tray, ensuring that the laser beam was perpendicular (90°) to the plane of the weeds. This setup ensured precise mapping between the laser’s position coordinates and the weed coordinates, as shown in Figure 3. During the experiment, a seedling tray with lettuce and weeds was placed on the conveyor belt to simulate movement along the vegetable patch.
2.2. Acquisition and Processing of Lettuce/Weed Image Dataset
To complete the weed target detection task based on deep learning, a prerequisite is a large amount of high-quality image data for training the model. A high-quality lettuce/weed dataset should cover samples from different angles and under various environmental conditions to help the neural network fully learn the appearance characteristics and distinguishing criteria of lettuce and weeds.
The original dataset of vegetables and weeds was collected in Beijing, China; the shooting environments were mainly outdoor greenhouses and indoor laboratories. The target objects were early-stage lettuce seedlings and a common accompanying weed (purslane), with sample images shown in Figure 4. During image acquisition, the camera position was adjusted manually, and the rotation angle and direction of the camera relative to the target were changed randomly. The resulting original dataset contained 295 lettuce images and 318 purslane weed images.
Since deep learning requires a sufficient amount of data to train the model, data augmentation methods were used to obtain diverse image samples. First, a cycle-consistent generative adversarial network (CycleGAN) was employed to perform image style transfer between lettuce and weeds, generating 80 novel samples in an unsupervised manner that retained key features while exhibiting distinct details. On this basis, randomly selected images were augmented through darkening, mirroring, rotation, and the addition of noise to enhance the generalization capability of the trained model. After augmentation, the new dataset included 1228 images of lettuce and 1445 images of weeds. The image annotation software LabelImg 1.8.6 (https://github.com/tzutalin/labelImg, accessed on 11 October 2024) was used for manual tagging, saving labels in the format required by the model. Finally, the dataset was randomly split into a training set and a validation set in a 7:3 ratio for subsequent model training and validation.
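A minimal sketch of the 7:3 random split, assuming a flat directory of images with matching YOLO-format label files beside them (the directory names are hypothetical):

```python
import random
import shutil
from pathlib import Path

random.seed(0)  # reproducible split

src = Path("dataset/images")            # hypothetical layout
images = sorted(src.glob("*.jpg"))
random.shuffle(images)

n_train = int(0.7 * len(images))        # 7:3 train/validation ratio
for split, subset in (("train", images[:n_train]), ("val", images[n_train:])):
    for img in subset:
        label = img.with_suffix(".txt")  # YOLO-format annotation from LabelImg
        for f in (img, label):
            dst = Path(f"dataset/{split}") / f.name
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(f, dst)
```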
2.3. Lightweight Object Detection Method Based on Improved YOLOv8
To enhance detection accuracy and reduce the size of the YOLOv8 model for deployment on a mobile laser weeding platform, this study adopted the following improvement strategies. On the one hand, various attention mechanism modules were added to the network backbone to help the model more accurately detect and locate weeds between vegetable plants in images. On the other hand, to address the low efficiency and excessive storage resource consumption of object detection algorithms on embedded devices, a class knowledge distillation method based on transfer learning was proposed to train the improved model. This approach aimed to save computational power and compress the model size while maintaining model performance, thereby shortening the model training period and inference time.
2.3.1. YOLOv8 Model
YOLOv8 was the latest state-of-the-art (SOTA) model released by the Ultralytics team, offering network architectures at various scales (N, S, M, L, X) to meet the needs of different scenarios. YOLOv8 was built upon the architectural framework of YOLOv5, incorporating the C2f module in place of the C3 module used in YOLOv5 to improve gradient flow. In addition, YOLOv8 retained the spatial pyramid pooling-fast (SPPF) module, which enabled the network to better handle changes in object size within images. The neck adopted multi-scale feature fusion, fusing feature maps from different stages of the backbone to enhance feature representation. During the prediction phase, YOLOv8 employed an anchor-free detection method and a dynamic matching approach called task alignment learning (TAL). Based on the established index, high-quality prior boxes were dynamically selected as positive samples and incorporated into the loss function design, while a decoupled head structure separated the box loss from the classification (cls) loss.
The latest YOLOv8 model was an improved and efficient version built upon previous generations of YOLO. These improvements allowed YOLOv8 to maintain the advantages of the YOLO series’ network architecture while making more refined adjustments and optimizations, making the model scalable across different scenarios.
2.3.2. Attention Mechanism
An attention mechanism is a technique that enhances the ability of deep learning models to focus on important features. The attention mechanism allows the network to generate a vector of weights that quantifies the importance of each position in the input image data. By introducing attention mechanisms, the model can selectively focus on relevant features of the target while filtering out unrelated background information. Squeeze-and-excitation (SE), convolutional block attention module (CBAM), coordinate attention (CA), and global attention mechanism (GAM) are four mainstream attention mechanism strategies.
(1) SE module
SE Net is a typical channel attention mechanism known for its simplicity and ease of deployment. It enhances feature representation by adaptively emphasizing important feature channels and suppressing less important ones [32]. The core process of the SE mechanism includes global average pooling (GAP), which compresses each channel to a single value to evaluate its importance; based on this assessment, a weight vector is then generated through fully connected layers and activation functions and used to recalibrate the original feature maps. However, the SE module is computationally complex and focuses only on channel information, paying insufficient attention to spatial position information within the feature maps.
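A minimal PyTorch sketch of an SE block as described above (the reduction ratio of 16 is a conventional assumption, not a value from this study):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: GAP -> FC -> ReLU -> FC -> sigmoid -> rescale."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * w                     # recalibrate the feature maps

# Example: recalibrate a 64-channel feature map
print(SEBlock(64)(torch.randn(1, 64, 32, 32)).shape)
```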
(2) CBAM module
The introduction of CBAM addresses the shortcomings of SE Net. Compared to SE Net, CBAM combines both a channel attention module (CAM) and a spatial attention module (SAM), emphasizing key features and their spatial localization, thus providing a comprehensive, accurate, and precise feature representation. The design of this structure takes into account the inter-channel relationships and the correlations in the spatial dimension [33]. Additionally, CBAM can be flexibly integrated into existing CNN architectures with negligible additional parameters and computational cost.
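A compact PyTorch sketch of CBAM’s two sequential sub-modules as described above (channel attention, then spatial attention; the reduction ratio and 7×7 kernel follow the original CBAM paper and are assumptions here):

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention (CAM) followed by spatial attention (SAM)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(  # shared MLP over avg- and max-pooled stats
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # CAM: fuse global average- and max-pooled channel descriptors
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # SAM: 7x7 conv over the channel-wise average and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

print(CBAM(64)(torch.randn(1, 64, 32, 32)).shape)
```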
(3) CA module
Compared to other types of attention mechanisms, CA not only captures differential information between channels but also acquires perceptual information at the directional and positional levels, allowing the detection algorithm to more accurately identify target regions. The computation process of the CA mechanism is divided into two parallel stages. First, the input feature map undergoes global average pooling separately in the X and Y directions. This is followed by concatenation, convolution, and non-linear transformation to obtain feature weights for both directions. Finally, these weights are fused with the original features, enhancing the representation power of the algorithm’s image features. Compared to CBAM, the CA module offers finer spatial localization, making it particularly suitable for tasks that require high spatial precision.
(4) GAM module
As a global attention mechanism, GAM enhances network performance by suppressing information diffusion and reinforcing global interactions [34]. The overall structure of GAM is similar to that of CBAM, as it also employs both a CAM and a SAM, but the processing within the two modules differs. Considering the importance of cross-dimensional interaction, GAM mainly uses convolutional processing in its SAM and introduces a 3D permutation with a multi-layer perceptron in its CAM to improve the model’s understanding of image context.
2.3.3. Class Knowledge Distillation
(1) Transfer learning
Transfer learning is a common training technique in deep learning. It transfers the weight parameters of a model trained on a source domain to a new model that needs to learn a different target, which not only accelerates the convergence of the new model but also helps to mitigate overfitting to some extent [35]. The COCO dataset is a widely recognized benchmark in computer vision, containing more than 330,000 annotated images covering 80 object classes and 91 stuff classes. The selected network model can use pre-trained weights from the COCO dataset, with the network parameters fine-tuned to fit the lettuce/weed dataset. By using transfer learning, random initialization of the network is skipped, improving the efficiency of model training.
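As a hedged illustration of this fine-tuning step using the Ultralytics API (the dataset YAML name and hyperparameters below are assumptions, not the study’s exact settings):

```python
from ultralytics import YOLO

# Start from COCO-pretrained weights instead of random initialization
model = YOLO("yolov8s.pt")

# Fine-tune on the lettuce/weed dataset; "lettuce_weed.yaml" is a
# hypothetical dataset config pointing at the train/val splits
model.train(data="lettuce_weed.yaml", epochs=100, imgsz=640)
```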
(2) Knowledge distillation
Knowledge distillation is a technique that extracts knowledge from a large model and transfers it to a smaller, more lightweight model. The large model is not used directly but serves as a “teacher” guiding the training of the smaller “student” model. Because terminal devices have limited computing power, compressing the large model in this way yields a student that saves computation and is easier to deploy. Compared with other compression methods, knowledge distillation preserves as much of the knowledge learned by the original model as possible without destroying its structure, and transfers that knowledge to the compressed model, resulting in a much smaller and more lightweight model.
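The paper does not publish its exact distillation objective, so the following is only a generic sketch of logit-based (class) knowledge distillation with a temperature-softened KL term; the temperature and weighting are assumptions:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      T: float = 4.0, alpha: float = 0.5):
    """Blend hard-label cross-entropy with softened teacher guidance."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                          # standard temperature scaling
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with 2 classes (lettuce / weed)
s = torch.randn(8, 2); t = torch.randn(8, 2); y = torch.randint(0, 2, (8,))
print(distillation_loss(s, t, y).item())
```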
2.3.4. Lightweight YOLOv8s-CBAM Object Detection Model
In this study, we improved the YOLOv8s model to develop a lightweight YOLOv8s-CBAM object detection model, as shown in Figure 5. To enhance the accuracy of lettuce and weed recognition, the CBAM attention mechanism was incorporated before the SPPF module in the backbone. This allowed the model to learn the weights of each channel before multi-scale pooling, reinforcing key information and reducing the impact of unnecessary features. To address the low efficiency and excessive storage consumption of object detection algorithms on embedded devices, a class knowledge distillation method based on transfer learning was proposed to train the YOLOv8s-CBAM model. The COCO128 object detection dataset (https://ultralytics.com/assets/coco128.zip, accessed on 11 October 2024) was used as the source domain to train the YOLOv8n model, allowing it to learn complex features, and the pre-trained YOLOv8n weights were then loaded. The YOLOv8s network architecture with the embedded CBAM attention mechanism was defined as the configuration file. Finally, the custom lettuce/weed dataset was used as the target domain to train a model combining the YOLOv8n weights with the YOLOv8s-CBAM structure. This gave the model the capability to recognize lettuce and weeds, resulting in a lightweight, high-performance object detection model, and compressed the model to save computational resources while preserving the performance of the original model.
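With the Ultralytics API, this graft-and-fine-tune pipeline might look like the sketch below; the architecture and dataset YAML file names are assumptions, since the study’s exact configuration files are not published:

```python
from ultralytics import YOLO

# "yolov8s-cbam.yaml" stands in for the study's custom architecture file
# (a YOLOv8s backbone with a CBAM block before SPPF); loading "yolov8n.pt"
# transfers the matching pre-trained weights into the new structure.
model = YOLO("yolov8s-cbam.yaml").load("yolov8n.pt")

# Target domain: the custom lettuce/weed dataset (hypothetical YAML name)
model.train(data="lettuce_weed.yaml", epochs=100, imgsz=640)
```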
2.4. Weed Position Information Extraction
Since the working principle of laser weeding relies on burning the apical meristem of the weeds to inhibit their growth, an effective localization algorithm is crucial. Given the opposite leaves of purslane, its symmetrical overall morphology, and the set laser spot diameter of 3 mm, it is simple and feasible to use the center point of the target detection box to represent the apical meristem of the weed. The center point of the object detection box was therefore used to represent the position of the weed’s top, calculated as

$$X = \frac{x_1 + x_2}{2}, \qquad Y = \frac{y_1 + y_2}{2},$$

where (X, Y) are the coordinates of the center of the weed’s top, and $x_1$, $x_2$, $y_1$, and $y_2$ are the X and Y coordinates of the bottom-left and top-right corners of the predicted bounding box for the weed.
As shown in Figure 6, the laser weeding device moved along the ridges, spanning the rows of lettuce. The lettuce field was divided into inter-row areas (white parts) and intra-row areas (blue boxes), with the area around the lettuces defined as the safety zone (yellow boxes). Finally, the apical center coordinates of all weeds between two lettuce plants were output as a point bitmap and a position information txt file, providing information for subsequent work and ensuring readability.
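A minimal sketch of this extraction step with the Ultralytics inference API (the weights file, class index, and output file name are assumptions):

```python
from ultralytics import YOLO

model = YOLO("yolov8s-cbam.pt")            # hypothetical trained weights
results = model("intra_row_frame.png")[0]  # single-image inference

WEED_CLASS = 1  # assumed class index for purslane
with open("weed_positions.txt", "w") as f:
    for box, cls in zip(results.boxes.xyxy, results.boxes.cls):
        if int(cls) != WEED_CLASS:
            continue                           # skip lettuce detections
        x1, y1, x2, y2 = box.tolist()
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2  # weed apex = box center
        f.write(f"{cx:.1f} {cy:.1f}\n")
```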
2.5. Experimental Configuration
To ensure fairness in model comparisons and the reliability of training outcomes, all models were trained and validated multiple times using the same dataset and equipment, with a consistent set of training parameters. Detailed information on the software and hardware configuration as well as the model parameters is shown in Table 2.
2.6. Evaluation Metrics
To comprehensively assess the performance of the model in weed detection, this study established a set of evaluation metrics, including precision, recall, mean average precision (mAP), intersection over union (IoU), and F1 score. Among these, mAP is the primary evaluation metric in object detection, typically denoted as mAP@0.5 and mAP@0.5:0.95, with higher values indicating better algorithm performance. Additionally, the loss value is used to evaluate the error between the predicted and actual values. The training loss reflects the model’s ability to fit the dataset, while the validation loss reflects the model’s generalization capability. The loss value comprises three parameters: obj-loss (objectness loss), cls-loss (classification loss), and box-loss (bounding box regression loss). The equations for calculating each metric are as follows:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

$$mAP = \frac{1}{C}\sum_{i=1}^{C} AP_i, \qquad IoU = \frac{\text{area}(E \cap F)}{\text{area}(E \cup F)}$$
In the equations, TP stands for true positives, which is the number of correctly detected targets; FP stands for false positives, which represents the number of instances incorrectly identified as targets; FN stands for false negatives, which represents the number of targets that were not detected but should have been. Average precision (AP) is the area under the precision-recall curve; C is the total number of classes in the object detection task. E represents the ground truth bounding boxes annotated by humans, and F represents the predicted bounding boxes generated by the model. For mAP@0.5, if the intersection over union (IoU) value of the model’s predicted region is above the preset threshold of 0.5, then the prediction is considered a true positive (TP). Similarly, mAP@0.5:0.95 represents the average mAP over a range of IoU thresholds from 0.5 to 0.95 (with a step size of 0.05), making this metric more stringent.
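For concreteness, a small sketch of the IoU computation between a ground-truth box E and a predicted box F, both in (x1, y1, x2, y2) format:

```python
def iou(e, f):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(e[0], f[0]), max(e[1], f[1])
    ix2, iy2 = min(e[2], f[2]), min(e[3], f[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((e[2] - e[0]) * (e[3] - e[1])
             + (f[2] - f[0]) * (f[3] - f[1]) - inter)
    return inter / union if union else 0.0

# A prediction counts as a true positive at mAP@0.5 when IoU exceeds 0.5
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143 -> not a TP at IoU 0.5
```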
4. Discussion
Real-time and accurate target detection of crops and weeds is a prerequisite for precise weeding [36]. The integration of object detection methods such as Faster R-CNN, DETR, and YOLO with advanced weeding techniques has propelled the modernization and intelligence of agriculture. In this study, deep learning was used to improve the YOLOv8 network, and a lightweight YOLOv8s-CBAM target detection model was deployed on the laser weeding device, achieving good weeding results. In crop and weed target detection, a model that is both lightweight and highly accurate has long been the goal of many researchers, but the two are difficult to achieve simultaneously. Xiangpeng et al. [37] developed a new Faster R-CNN detection model to distinguish between various complex growth states of weeds and cotton seedlings. The experimental results showed that the average inference time of this model was 0.165 s, with a mAP@0.5 of 98.43% for weed and cotton seedling detection. Compared with the improved YOLOv8s-CBAM in this study, that model was heavier and less efficient on mobile platforms. Lingbing et al. [38] constructed a YOLOv5-MobileNet-SE model for real-time recognition of field weeds. The model size was only 7.5 MB, meeting the requirements for lightweight detection, but its mAP@0.5 for weed detection was only 87%, significantly lower than the 98.6% achieved by the lightweight YOLOv8s-CBAM in this study. Currently, direct detection methods are commonly used in crop weeding to identify and locate crops and weeds, which requires large datasets for training. However, the variety of field weeds and the complex, variable conditions of actual agricultural production environments present significant challenges for crop and weed detection. Hui et al. [39] proposed an indirect cabbage weed detection method that detects crops by generating bounding boxes covering them, with any green pixels outside the bounding boxes treated as weeds. Experimental results showed that this indirect method was more effective than direct weed detection and did not require a large dataset, offering a new approach to the target detection of crops and weeds.
Laser weeding is a novel thermal weeding technology that can precisely focus energy on specific parts of weeds, causing them to rapidly disintegrate and die [40]. Compared with traditional mechanical and chemical weeding methods, laser weeding is environmentally friendly, efficient, flexible, and automated [41]. During laser weeding, factors such as the treatment site, weed species, laser power, laser wavelength, and treatment duration all influence weeding effectiveness. Therefore, the optimization of laser parameters and the construction of laser weeding systems are significant challenges currently facing the technology. Lünsmann et al. [42] investigated the impact of the plant treatment point on weeding effectiveness in laser weeding. Through experimental comparisons, they found that irradiating the base of the stem caused more severe damage, leading to plant death or growth inhibition, than irradiating the apical meristems. Their results also showed that the high temperatures generated during laser irradiation could cause the water in plant cells to boil and even evaporate. Marx et al. [43] developed a laser damage model for two types of weeds (the monocot ECHCG and the dicot AMARE) that predicted the probability of successful weed control based on factors such as weed species, growth stage, laser power, laser spot size, and position. This model allowed the laser weeding system to adjust its parameters according to actual conditions, thus achieving precise weed control. Model validation indicated an accuracy of 93% for ECHCG and 84% for AMARE. This research provides an important theoretical foundation for the development of laser weeding technology, but further evaluation of its practical application is still needed. Additionally, the mechanisms by which laser irradiation affects crop growth and the field environment are not yet fully understood and merit further research.
The weed dataset used in this study is limited in both variety and quantity. Although the CycleGAN method was employed to augment the dataset, it may still affect the model’s generalization ability. Additionally, the central localization algorithm adopted in this research is only suitable for symmetrically shaped weeds. When more types of weeds are introduced, this localization method may not be rigorous, which is a primary reason for the laser irradiation point offset in actual weeding experiments. Therefore, in future research, methods such as YOLO-pose anchor regression and refined skeleton extraction algorithms will be considered to more accurately locate weeds.
5. Conclusions
Overall, laser weeding is a highly efficient, environmentally friendly, and promising weed control technology with significant efficiency and environmental advantages over traditional mechanical and chemical weeding, and the real-time, high-precision target detection of the YOLO model provides strong technical support for its practical application. This study enhanced the YOLOv8 architecture by incorporating the CBAM attention mechanism and employed class knowledge distillation to compress the model, ultimately resulting in the lightweight YOLOv8s-CBAM model. The lightweight YOLOv8s-CBAM model has a size of only 6.2 MB, yet it achieved an mAP@0.5 of 98.9%, realizing both performance enhancement and model lightweighting. This research completed the design and construction of a laser weeding prototype, developed the corresponding control program, and conducted lettuce inter-plant laser weeding experiments in a laboratory setting that simulated field conditions. The experimental results showed that the laser weeding system achieved a 100% detection and weeding success rate in low-density weed scenarios. Even under high-density weed distributions, the system successfully identified and located 88.9% of the inter-plant weeds and completed the laser weeding task. The overall weeding success rate across the experiments was 76.9%, demonstrating the value and potential of this work for practical applications. However, given current technological development and industry challenges, the laser weed control system still needs improvement in the following aspects: (1) establishing diverse and rich datasets and optimizing deep learning models to improve the accuracy of weed identification; (2) exploring the effects of different laser action points (the top meristematic tissue of weeds, weed stalks, weed emergence points, etc.) on weed control and further optimizing weed localization methods; (3) exploring the effects of different laser powers on crops and soil and determining safe laser doses; and (4) designing control algorithms to achieve tunable laser power output, minimizing energy loss while ensuring safety. Further research on laser weed control will be carried out with a view to realizing practical applications.