Article

Development of a Premium Tea-Picking Robot Incorporating Deep Learning and Computer Vision for Leaf Detection

by Luofa Wu 1,*, Helai Liu 1, Chun Ye 1 and Yanqi Wu 2
1
Institute of Agricultural Engineering, Jiangxi Academy of Agricultural Sciences, Nanchang 330000, China
2
Department of Electronic Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(13), 5748; https://doi.org/10.3390/app14135748
Submission received: 21 May 2024 / Revised: 28 June 2024 / Accepted: 30 June 2024 / Published: 1 July 2024
(This article belongs to the Special Issue Applied Computer Vision in Industry and Agriculture)

Abstract:
Premium tea holds a significant place in Chinese tea culture, enjoying immense popularity among domestic consumers and an esteemed reputation in the international market, thereby significantly impacting the Chinese economy. To tackle challenges associated with the labor-intensive and inefficient manual picking process of premium tea, and to elevate the competitiveness of the premium tea sector, our research team has developed and rigorously tested a premium tea-picking robot that harnesses deep learning and computer vision for precise leaf recognition. This innovative technology has been patented by the China National Intellectual Property Administration (ZL202111236676.7). In our study, we constructed a deep-learning model that, through comprehensive data training, enabled the robot to accurately recognize tea buds. By integrating computer vision techniques, we achieved exact positioning of the tea buds. From a hardware perspective, we employed a high-performance robotic arm to ensure stable and efficient picking operations even in complex environments. During the experimental phase, we conducted detailed validations on the practical application of the YOLOv8 algorithm in tea bud identification. When compared to the YOLOv5 algorithm, YOLOv8 exhibited superior accuracy and reliability. Furthermore, we performed comprehensive testing on the path planning for the picking robotic arm, evaluating various algorithms to determine the most effective path planning approach for the picking process. Ultimately, we conducted field tests to assess the robot’s performance. The results indicated a 62.02% success rate for the entire picking process of the premium tea-picking robot, with an average picking time of approximately 1.86 s per qualified tea bud. This study provides a solid foundation for further research, development, and deployment of premium tea-picking robots, serving as a valuable reference for the design of other crop-picking robots as well.

1. Introduction

Tea, rich in bioactive substances beneficial for health, is a popular non-alcoholic beverage consumed globally. Originating in China over 5000 years ago, it is now consumed in 71.4% of the world's countries and regions [1]. China remains the largest producer and consumer, accounting for 62.1% of world tea consumption in 2021. The tea market is divided into bulk and premium tea. Bulk tea, cheap and abundant, suits mechanized harvesting, while premium tea demands freshness and high-quality raw materials, making it rare and unique. Its production relies on immature or newly matured tender buds and leaves, rich in key components such as tea polyphenols and catechins, which determine tea quality. However, these components decrease as leaves mature, emphasizing the need for timely harvesting.
Mechanized harvesting, although widespread in bulk tea, faces challenges with premium tea due to limited precision in identifying bud tips, often mixing in old leaves, stems, and broken leaves, as shown in Figure 1. Hence, premium tea harvesting still relies on manual labor, which is affected by factors such as weather and low efficiency. In this context, developing a vision- and deep learning-based premium tea harvesting robot is crucial. Such robots can precisely identify and locate tender buds and leaves meeting premium tea standards, enabling automated, efficient, and precise harvesting. This significantly improves harvesting efficiency, reduces labor costs, and enhances tea quality and yield, supporting the sustainable development of the premium tea industry [2].
In recent years, remarkable advancements have been made in the application of deep learning and computer vision technology within agricultural robotics [3,4]. Deep-learning models, through rigorous training, acquire the ability to learn intricate feature representations, thereby equipping robots with a perception and comprehension akin to human cognition. The integration of computer vision technology has further bolstered robots’ capabilities in environmental perception and target recognition, paving the way for precision picking. Within the scientific community of major tea-producing nations like China, Japan, and South Korea, the design and advancement of tea-picking robots have garnered significant attention [5,6,7,8,9]. These countries, rich in tea culture and vast consumption markets, are witnessing a surge in demand for intelligent robotic solutions aimed at enhancing tea-picking efficiency and minimizing labor costs. The multifaceted research and development of tea-picking robots, spanning mechanical engineering and artificial intelligence, signifies a technological leap for the traditional tea industry.
For instance, Gui et al. [10] introduced the Ghost_conv and BAM modules to address the issue of the large size of the tea bud detection model, which is not conducive to deployment on mobile terminals. They proposed a lightweight model based on YOLOv5, reducing the model size by 86.7 MB. However, the detection accuracy of the model still needs further improvement. Zhu et al. [11] addressed the limitations of solely relying on machine vision for precise spatial positioning of tea buds by developing a premium tea-picking robot incorporated with negative pressure technology. This innovative robot utilizes negative pressure to systematically guide the tea bud through the end actuator, thereby refining its position. In another study, Zhou et al. [12] undertook a comprehensive design and experimentation for a high-end tea-picking robot. They employed a depth camera coupled with the skeleton method to pinpoint the ideal tea-picking location and established a trajectory model for the picking arm, successfully minimizing acceleration mutations during movement. Although these robots have not yet achieved optimal field performance, they offer invaluable insights and impetus for future research and implementation of advanced tea-picking robots.
Our team has developed a premium tea-picking robot based on deep learning and computer vision recognition, which has been patented by the China National Intellectual Property Administration (ZL202111236676.7). In order to verify the performance of the designed robot, a series of experiments was conducted and the following results were obtained: (1) By comparing the performance of the YOLOv8 and YOLOv5 algorithms in tea bud recognition, the superiority of the YOLOv8 algorithm was verified. (2) For the path-planning problem of the harvesting manipulator, the performance of different algorithms was tested, providing an important reference for robot optimization. (3) The premium tea-picking robot was tested in a field picking experiment, and its comprehensive performance was evaluated. The objective of this research was not only to solve the problems of the traditional tea-picking process, but also to provide significant support for further research and application of premium tea-picking robots.

2. Materials and Methods

2.1. Whole Machine Structure

The premium tea-picking robot created by our research team adopts a two-wing crawler structure and integrates the key components of a binocular camera, Real Time Kinematic (RTK) positioning, a multi-function robotic arm, and a large-capacity collecting box. As shown in Figure 2, this design offers a large grounding area, a strong grip, and excellent extension ability, and can meet the picking demands of a tea garden with a gentle slope and varied row spacing. The robot measures 1300 × 780 × 1750 mm, weighs about 400 kg, and has a compact and stable structure. The crawler system provides stable movement and flexible steering; its maximum speed is 8 km/h, and the robot can climb terrain inclines of over 20° and adapt to the changeable tea-garden environment.
In terms of energy, the robot is equipped with a 200 Ah battery pack and a 48 V DC power supply, ensuring continuous operation for 8.5 h and thus meeting the endurance needs of high-intensity harvesting. In terms of computing performance, a high-performance computing platform with an AMD Ryzen 9 7900X 12-core processor (Advanced Micro Devices, Inc., Santa Clara, CA, USA) and an NVIDIA GeForce RTX 3060 graphics card with 12 GB of video memory (NVIDIA Corporation, Santa Clara, CA, USA) provides powerful computing support for precision harvesting.
In addition, the robot control software runs on the Ubuntu 22.04 operating system, ensuring the compatibility, stability, and security of the software system and providing a solid technical foundation for the robot to work efficiently and accurately in a tea garden.

2.2. Picking Process

Based on the actual demands of premium tea picking, the tea-picking robot was endowed with the core functions of moving, picking, and collecting. The control system of the tea-picking robot can be divided into three main modules: the walking system, the vision system, and the picking system, as shown in Figure 3. Through systematic deconstruction of its functional composition, the overall workflow can be described as follows: (1) According to the environmental information and path planning in the tea garden, the robot automatically navigates near the target tea tree. (2) The vision system captures the tea image in the current working range and then realizes the efficient detection and precise location of tea buds through deep learning and computer vision technology. (3) After obtaining the coordinate information of a tea bud, the picking system uses the path-planning algorithm to control the manipulator to perform the precise picking action. At the same time, the negative pressure collector and the mechanical arm work together to ensure that the picked tea buds are successfully sucked into the collector. (4) On completing the picking task in the current area, the robot moves to the next predetermined area and repeats the process to continue picking tea buds.
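To make this four-step work cycle concrete, the following minimal sketch expresses it as a control loop. All function names here (navigate_to, detect_and_locate_buds, pick_bud) are hypothetical placeholders for illustration only, not the robot's actual software interfaces.

```python
# Illustrative sketch of the work cycle described in Section 2.2.
# The functions below are hypothetical placeholders, not the robot's real API.

def navigate_to(point):          # (1) autonomous navigation to a picking point
    print(f"moving to {point}")

def detect_and_locate_buds():    # (2) deep-learning detection + stereo localization
    return [(0.32, 0.15, 0.48)]  # dummy 3D bud coordinate in metres

def pick_bud(xyz):               # (3) path planning, shear cut, and suction collection
    print(f"picking bud at {xyz}")

def harvest(waypoints):
    for point in waypoints:
        navigate_to(point)
        for bud in detect_and_locate_buds():
            pick_bud(bud)
        # (4) repeat at the next predetermined area

harvest([(0.0, 0.0), (0.0, 1.5)])
```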

2.3. Test System Design and Test

2.3.1. Picking Position

Based on differences in tea rating systems and tea frying techniques, tea buds viewed from the front can be subdivided into three types according to the picking position: single-bud picking (point A), one-bud-with-one-leaf picking (point B), and one-bud-with-two-leaves picking (point C). In current practice, picking one bud and two leaves is dominant due to its wide applicability [13]. This study also adopted this point as the main picking position, with the specific picking location being slightly below the junction of the tender leaves and the stem (point C in Figure 4), ensuring the accuracy of picking and the quality of the tea.

2.3.2. The Visual System

The vision system of the premium tea-picking robot is responsible for detecting and locating tea buds in space; the overall principle is illustrated in Figure 5. In the initial phase, the binocular camera captures tea image data in the current operating area. The system then uses the deep-learning algorithm to process these images, accurately detect the distribution of tea buds, and obtain their coordinates in the two-dimensional plane. Next, the vision system estimates the depth of the detected buds. This step relies on the parallax principle of the binocular camera, which computes the disparity between the left and right camera images to obtain the depth information of the tea bud (Figure 6b; A1, A2: optical centers of the left and right cameras; B1, B2: imaging points; Z: distance). Finally, the vision system combines the two-dimensional coordinates and the depth information in order to obtain the three-dimensional coordinates of the tea buds.
The hardware of the vision system designed in this paper is built around a ZED 2 binocular camera (produced by Stereo Labs Corporation, San Francisco, CA, USA). The camera offers 2K resolution, a depth-measuring range of 0.2–20 m, a maximum frame rate of 100 FPS, and an f/1.8 aperture; the camera is shown in Figure 6a.
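The triangulation step described above can be sketched in a few lines. The focal length and principal point used below are hypothetical placeholder values, not the calibrated parameters of the camera used in this study; the baseline is the nominal ZED 2 stereo baseline of roughly 120 mm.

```python
import numpy as np

# Back-of-the-envelope stereo triangulation for Section 2.3.2.
FOCAL_PX = 1400.0        # focal length in pixels (assumed, not calibrated)
BASELINE_M = 0.12        # nominal ZED 2 stereo baseline (~120 mm)
CX, CY = 1104.0, 621.0   # assumed principal point for a 2K image

def bud_to_3d(u_left, v_left, u_right):
    """Convert a matched bud pixel in the left/right images to camera-frame XYZ."""
    disparity = u_left - u_right            # parallax between the two views (pixels)
    z = FOCAL_PX * BASELINE_M / disparity   # depth from the disparity principle
    x = (u_left - CX) * z / FOCAL_PX        # lateral offset
    y = (v_left - CY) * z / FOCAL_PX        # vertical offset
    return np.array([x, y, z])

print(bud_to_3d(1180.0, 700.0, 1130.0))     # a 50 px disparity maps to ~3.36 m depth
```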

2.3.3. Detection Algorithm

The You Only Look Once (YOLO) series of algorithms reframed real-time object detection as a regression task, predicting bounding boxes and categories in a single step, which is crucial for fast, accurate detection [14]. Among the YOLO versions, YOLOv8 stands out due to its optimized network and training, boosting both detection speed and accuracy. For our premium tea-picking robot, we chose YOLOv8 based on several strengths. Its efficient backbone enhances feature extraction, which is vital for detecting tea leaves of varying shapes, sizes, and colors. YOLOv8's improved loss functions and optimization speed up training and adaptation to tea leaf characteristics, boosting accuracy. Compared to YOLOv5, YOLOv8 better handles challenges such as complex backgrounds and object occlusion in tea gardens. Its three-part network structure allows efficient feature extraction, fusion, and detection, ensuring accurate tea leaf detection across various scenarios. In short, YOLOv8's optimized network, efficient feature extraction, and faster training make it well suited to our tea detection needs; its network structure is shown in Figure 7.
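For orientation, a minimal YOLOv8 inference call using the Ultralytics Python API is sketched below. The weight file and image path are placeholders, not the model actually trained on our tea bud data set.

```python
from ultralytics import YOLO

# Minimal YOLOv8 inference sketch; file names are illustrative placeholders.
model = YOLO("yolov8n.pt")                  # pretrained weights, later fine-tuned on tea buds
results = model("tea_canopy.jpg", conf=0.25)

for r in results:
    for box, score in zip(r.boxes.xyxy, r.boxes.conf):
        x1, y1, x2, y2 = box.tolist()       # 2D bounding box of a candidate bud
        print(f"bud at ({x1:.0f},{y1:.0f})-({x2:.0f},{y2:.0f}), confidence {score:.2f}")
```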

2.3.4. Sample Collection

The tea image samples were obtained from the fully mechanized tea research base (28°25′18″ N, 115°13′07.71″ E) of the Institute of Agricultural Engineering, Jiangxi Academy of Agricultural Sciences, as shown in Figure 8. The tea plants at the base are more than 20 years old, and the ZED 2 binocular camera was used to capture images during the peak summer tea-picking season (i.e., in late July). During image acquisition, the camera was positioned at a 45° angle to the horizon while maintaining a distance of 20–50 cm from the canopy. In order to obtain tea images under different illumination conditions, the images were collected on both sunny and cloudy days, under both sufficient and weak sunlight. After systematic collection and processing, a total of 9546 tea image samples were obtained, all of which were saved in JPG format.

2.3.5. Image Processing

In order to improve the generalization ability and robustness of the model, the 9546 tea sample images were augmented. In particular, affine transformations were used to adjust the angle of the images to simulate different shooting angles, and brightness adjustment was used to simulate the effect of different lighting conditions, thereby increasing the light adaptability of the model. Noise was added to simulate possible image interference, such as sensor noise. In addition, a random cropping technique was used to enhance the model's ability to recognize local features of the tea. Each image was augmented three times using this set of techniques in order to simulate the complex and variable field environment.
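A compact sketch of this augmentation pipeline with OpenCV and NumPy is shown below. The rotation range, brightness gains, noise level, and crop ratio are illustrative choices, not the exact parameters used in the study, and the input file name is a placeholder.

```python
import cv2
import numpy as np

# Sketch of the augmentations described above; parameter ranges are assumptions.
def augment(img, rng=np.random.default_rng()):
    h, w = img.shape[:2]
    # affine transform: small random rotation to simulate different shooting angles
    m = cv2.getRotationMatrix2D((w / 2, h / 2), rng.uniform(-15, 15), 1.0)
    img = cv2.warpAffine(img, m, (w, h))
    # brightness/contrast adjustment to simulate different lighting conditions
    img = cv2.convertScaleAbs(img, alpha=rng.uniform(0.7, 1.3), beta=rng.uniform(-20, 20))
    # additive Gaussian noise to simulate sensor interference
    noise = rng.normal(0, 8, img.shape).astype(np.int16)
    img = np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
    # random crop (90% of each side) to emphasize local bud features
    y0, x0 = rng.integers(0, int(0.1 * h)), rng.integers(0, int(0.1 * w))
    return img[y0:y0 + int(0.9 * h), x0:x0 + int(0.9 * w)]

augmented = [augment(cv2.imread("sample_tea.jpg")) for _ in range(3)]  # 3 variants per image
```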
In the image annotation stage, LabelImg 1.7.1 software was used to label the target region of the tea bud samples in the image. A total of 186,580 tea bud targets were marked, which provided an accurate database for subsequent target detection. With regard to data set preparation, the tea sample data set was split into a training set and a test set following an 8:2 ratio in order to ensure that the model was able to learn the tea characteristics completely, as well as to allow for an effective performance evaluation on an independent test set.
The experimental platform was an industrial control computer equipped with an Intel Core i5-12450H CPU (main frequency 3.3 GHz, 12 MB L3 cache; Intel Corporation, Santa Clara, CA, USA), an NVIDIA GeForce RTX 3050 GPU (NVIDIA Corporation, Santa Clara, CA, USA), and 16 GB of memory. Using the PyTorch 2.1 deep-learning framework, the YOLOv8 and YOLOv5 algorithms were used for model training. To ensure a fair comparison of model performance and the validity of subsequent comparison analyses, hyperparameters were set uniformly during training: the number of training cycles (epochs) was set to 600, and the batch size was set to 32. With these settings, it was expected that a tea target detection model with excellent performance would be obtained.
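With the Ultralytics API, a training run with these hyperparameters can be sketched as follows. The dataset configuration file "tea_buds.yaml" and the input image size are assumptions for illustration; only the epoch count and batch size are taken from the text above.

```python
from ultralytics import YOLO

# Training sketch matching the hyperparameters in Section 2.3.5.
model = YOLO("yolov8s.pt")
model.train(
    data="tea_buds.yaml",  # hypothetical config pointing at the 8:2 train/test split
    epochs=600,            # training cycles used in this study
    batch=32,              # batch size used in this study
    imgsz=640,             # assumed input resolution
)
metrics = model.val()      # precision, recall, mAP@0.5, mAP@0.5:0.95 on the held-out set
```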

2.3.6. Evaluation Indicators

For this test, precision, recall, and mean average precision (mAP) were selected as the indices used to evaluate the detection of tea buds. These indices are generally applicable to bounding box-based recognition algorithms and have been commonly used to evaluate the results of YOLO algorithms [15]. The closer the values of these three indices are to 1, the better the performance.
(1)
Precision represents the accuracy of the prediction result and, in this case, represents the probability that the tea image predicted to be positive will actually be positive, as shown in Equation (1) as follows:
$$\mathrm{precision} = \frac{TP}{TP + FP} \times 100\%$$
(2)
Recall is used to measure the coverage of the tea bud data, indicating the proportion of actual positive examples in the data set that are correctly identified as positive by the model, as shown in Equation (2).
$$\mathrm{recall} = \frac{TP}{TP + FN} \times 100\%$$
(3)
mAP stands as one of the primary metrics for assessing the performance of object-detection algorithms. It integrates the precision and recall of the various classes and computes the average of their AP values, as demonstrated in Equation (3). It is important to note that mAP can be calculated under different IoU (Intersection over Union) thresholds. For instance, mAP@0.5 refers to the mean average precision computed at an IoU threshold of 0.5, while mAP@0.5:0.95 represents the average of the mean average precisions calculated at multiple IoU thresholds ranging from 0.5 to 0.95 with a step size of 0.05. This methodology provides a more comprehensive evaluation of the performance of object-detection algorithms.
$$\mathrm{mAP} = \frac{\sum_{i=1}^{n} AP_i}{n}$$
In our model analysis, the loss function gauges the gap between predictions and actual labels. We used four key loss functions in object detection: Box loss (comparing predicted and actual bounding boxes), Validation (Val) Box loss (same as Box loss but on the validation set), Objectness loss (assessing the accuracy of object presence predictions), and Validation Objectness loss (similar to Objectness loss, but on the validation set). These functions output non-negative values, quantifying prediction errors. Lower loss values indicate higher prediction accuracy and model robustness.
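For reference, Equations (1)–(3) translate directly into code. The counts and per-class AP value below are illustrative numbers, not results from this study.

```python
import numpy as np

# Direct NumPy translation of Equations (1)-(3).
def precision(tp, fp):
    return tp / (tp + fp) * 100.0        # share of predicted buds that are real buds

def recall(tp, fn):
    return tp / (tp + fn) * 100.0        # share of real buds that were found

def mean_average_precision(ap_per_class):
    return float(np.mean(ap_per_class))  # mAP = average of the per-class AP values

print(precision(tp=180, fp=12), recall(tp=180, fn=20), mean_average_precision([0.97]))
```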

2.4. Design and Test of Harvesting System

2.4.1. Picking System Design

The picking operation for tea buds is significantly dependent on a complex picking system, which consists of a picking manipulator and a path-planning algorithm. When the binocular camera and deep-learning technology are closely integrated, they can complete the detection and location of all tea buds in the current field of vision. Once the 3D coordinates of the tea buds are captured and transmitted to the manipulator control system, the manipulator’s path-planning algorithm can calculate the optimal picking path based on this information. In this process, the algorithm needs to apply the inverse kinematic solution to each path node, following which the joint state of the manipulator at each node should be defined. Then, with the aid of the interpolation method, the joint state sequences of the manipulator are generated smoothly over the path, where these state sequences constitute the continuous and accurate joint trajectories of the manipulator. The mechanical arm with the shear hand travels along this trajectory to perform the precise tea bud-picking action.
In this study, an LM3 robotic arm (produced by Liby Company, Shanghai, China) and an HS-K40 pneumatic shear hand (produced by Shuteng Automation Technology Co., Ltd., Wenzhou, China) were selected as the main actuators of the harvesting robot (Figure 9). The repeatability of the manipulator was ±0.5 mm, the maximum end-effector speed was 2 m/s, the working radius was 638 mm, and the maximum load was 3 kg, ensuring the stability and high efficiency of the robot in the actual picking operation.

2.4.2. Simulation of Planning Algorithm

A path-planning algorithm was used to generate a collision-free path for the picking manipulator. Compared with apple, strawberry, kiwi, and other crops, tea picking requires more attention to picking efficiency. As the picking robot's path planning directly affects the efficiency of picking operations, it is one of the robot's key technologies. The Rapidly-exploring Random Tree (RRT) algorithm, a search method based on random sampling first proposed by LaValle [16] in 1998, is currently the most popular manipulator path-planning algorithm: it generates a node via random sampling, extends from the tree node closest to the new node by a fixed step size, discards the extension if there is a collision, and otherwise joins the nodes. This process is repeated until the target node is reached. RRT is widely used in picking robots' path planning due to its high efficiency and fast planning performance. However, the traditional RRT algorithm suffers from blind sampling and efficiency problems, which has become a focus of academic research [17]. In order to optimize the algorithm, researchers have focused on improving search efficiency and path quality while maintaining its completeness and flexibility. For example, based on the traditional RRT algorithm, the RRT-connect algorithm has been proposed by introducing a bidirectional search and greedy linking strategy [18]; an RRT-star algorithm has been proposed to optimize the connections between explored nodes while taking into account the dynamic characteristics and constraints of the manipulator [19]; and a heuristic T-RRT algorithm has been proposed to enable the robot arm to perform path planning in complex environments [20].
Therefore, in this study, the RRT, RRT-connect, RRT-star, and T-RRT algorithms were each simulated 50 times. The industrial control computer described in Section 2.3.5 was used as the experimental equipment, with Ubuntu 22.04 as the operating system, and the simulation environment was built on Robot Operating System 2 (ROS2). During the experiment, the MoveIt! motion-planning framework was used: the configuration package loaded the model files of the manipulator and set its trajectory. Specifically, the starting origin (0, 0, 0), the initial position of the manipulator (0.50, 0.29, 0.07), and the target position (−0.20, 0.47, 0.56) were defined. The manipulator was required to move from the starting origin through the initial position to the target position, and the maximum number of searches was set to 1024. If a path was found within the maximum number of searches, the path search was considered successful; otherwise, it was considered a failure. The goal deviation tolerance was set to 0.05, and the step length was set to 0.4 degrees. The 3D visualization tool RViz was used to observe the motion-planning effect of the manipulator during the test, and relevant data were generated during the planning process for statistical analysis, as shown in Figure 10.
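To illustrate the sampling-and-extension loop that all four planners share, a minimal 2D RRT is sketched below. It plans in an obstacle-free unit square, which is a deliberate simplification of the real 6-DOF joint-space problem; the step size and iteration limit are illustrative, except that 1024 mirrors the maximum number of searches used above.

```python
import numpy as np

# Minimal 2D RRT sketch of the sampling/extension loop (collision checks omitted).
def rrt(start, goal, step=0.05, max_iters=1024, goal_tol=0.05, seed=0):
    rng = np.random.default_rng(seed)
    nodes = [np.asarray(start, dtype=float)]
    parents = [-1]
    for _ in range(max_iters):
        sample = rng.uniform(0.0, 1.0, size=2)            # random sample in the unit square
        nearest = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - sample))
        direction = sample - nodes[nearest]
        new = nodes[nearest] + step * direction / (np.linalg.norm(direction) + 1e-9)
        nodes.append(new)                                  # a real planner would reject collisions here
        parents.append(nearest)
        if np.linalg.norm(new - np.asarray(goal)) < goal_tol:
            path, i = [], len(nodes) - 1                   # backtrack from goal region to start
            while i != -1:
                path.append(nodes[i]); i = parents[i]
            return path[::-1]
    return None                                            # search failed within max_iters

path = rrt(start=(0.1, 0.1), goal=(0.9, 0.9))
print("no path found" if path is None else f"path with {len(path)} nodes")
```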

2.4.3. Evaluation Indicators

In order to evaluate the performance of the manipulator path-planning algorithms, the planning time, success rate, root mean square error (RMSE), and coefficient of determination (R²) were used as evaluation indices. When comparing the robustness of different models, the smaller the RMSE, the better the robustness of the model. The R² value reflects how well the prediction model fits the variation of the dependent variable with the independent variable: the closer the coefficient of determination is to 1, the better the regression fit and the more robust the model. The formulas for these two indicators are given in Equations (4) and (5), as follows:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{n}}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where n represents the sample size, yᵢ represents the ith observation, ŷᵢ represents the corresponding model prediction, and ȳ represents the mean of the observations.
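Equations (4) and (5) can be computed directly with NumPy, as in the short sketch below; the sample values are illustrative only, and R² is implemented here in the standard 1 − SS_res/SS_tot form used above.

```python
import numpy as np

# NumPy implementation of Equations (4) and (5) on illustrative data.
def rmse(y, y_hat):
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r_squared(y, y_hat):
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return float(1.0 - ss_res / ss_tot)

y_true = [0.10, 0.20, 0.30, 0.40]              # illustrative reference values
y_pred = [0.11, 0.19, 0.31, 0.40]              # illustrative planner outputs
print(rmse(y_true, y_pred), r_squared(y_true, y_pred))
```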

2.5. Field Test of the Designed Machine

Our research group developed the premium tea-picking robot jointly with its cooperation partner, Island Intelligence Technology Co., Ltd. (Shanghai, China). In order to verify its performance, the tea plantation at the Institute of Agricultural Engineering, Jiangxi Academy of Agricultural Sciences, was selected as the research base for mechanized tea picking. In this plantation, the width of the tea ridges is about 1.5 m, and the distance between ridges is 0.9 m. Under the climatic conditions of an overcast day in August, a row of tea plants was randomly selected as the test object. The actual work scene of the picking robot is shown in Figure 11.
In this experiment, the premium tea-picking robot was guided to move to 10 predetermined picking points in order to collect tea buds. At each picking point, the number of tea buds handled by the robot in each of the three stages of detection, positioning, and picking, together with the corresponding working time, was recorded in detail. The performance of the robot was then evaluated through in-depth analysis of the test data using two key indicators: the success rate of tea bud picking and the picking efficiency.
$$R = \frac{c}{n} \times 100\%$$
$$T = \frac{t}{c}$$
In Equations (6) and (7), R is the picking success rate, c is the number of collected tea buds, n is the number of tea buds in the field of view, T is the average picking time per collected tea bud, and t is the time taken to complete picking in the current field of view.
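Applied to the values recorded at picking point 1 in Table 3 (25 buds in view, 15 collected, 29.6 s of picking time), Equations (6) and (7) give a 60.00% success rate and roughly 1.97 s per collected bud:

```python
# Equations (6) and (7) applied to picking point 1 from Table 3.
def success_rate(collected, in_view):
    return collected / in_view * 100.0      # R: share of visible buds actually collected

def time_per_bud(total_time_s, collected):
    return total_time_s / collected         # T: average picking time per collected bud

print(success_rate(collected=15, in_view=25))          # 60.0 (%)
print(time_per_bud(total_time_s=29.6, collected=15))   # ~1.97 (s per bud)
```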

3. Results

3.1. Comparison and Analysis of Detection Results

During training and validation on the tea bud data set with YOLOv8 and YOLOv5, the loss functions (box loss, objectness loss, val box loss, and val objectness loss) for each epoch were stored in data lists, allowing them to be plotted. As shown in Figure 12, an overall downward trend was observed. In the first 200 iterations, the convergence rate of YOLOv8 was significantly faster than that of YOLOv5, and its loss values were lower. After about 600 iterations, the loss values of the two models stabilized, indicating that they had converged and were able to effectively identify tea bud images. At the same time, the model metrics (precision, recall, mAP@0.5, and mAP@0.5:0.95) changed significantly with the number of iterations, as shown in Figure 13. At the beginning of training, the indices increased rapidly, especially in the first 100 iterations; they then slowed down and became stable by 600 iterations, indicating that the model had reached a stable state with high recognition performance. The four metrics of the YOLOv5 model were 0.969, 0.935, 0.978, and 0.853, respectively. In contrast, those of the YOLOv8 model were improved by 2.4%, 4.1%, 1.2%, and 6.2% over those of the YOLOv5 model, respectively (see Table 1). Under the same conditions, the YOLOv8 model showed significantly improved performance when trained on the tea bud data set.
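A plotting sketch for the stored per-epoch values is shown below. The file name "training_log.csv" and its column names are hypothetical stand-ins for however the per-epoch losses were exported, not the exact files produced in this study.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sketch of how the stored per-epoch loss values can be plotted (Figure 12 style).
log = pd.read_csv("training_log.csv")       # hypothetical file: one row per epoch
fig, axes = plt.subplots(2, 2, figsize=(10, 7))
for ax, column in zip(axes.flat, ["box_loss", "obj_loss", "val_box_loss", "val_obj_loss"]):
    ax.plot(log["epoch"], log[column], label=column)   # assumed column names
    ax.set_xlabel("epoch"); ax.set_ylabel("loss"); ax.legend()
plt.tight_layout()
plt.savefig("loss_curves.png")
```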

3.2. Comparison of Different Path Planning Algorithms

In the simulation experiments on the different picking manipulator path-planning algorithms, namely RRT, RRT-connect, RRT-star, and T-RRT, a ROS2 node was programmed to subscribe to the status topic of the manipulator and record the required data, and Python libraries such as NumPy and Pandas were used for the subsequent data analysis. The detailed data are provided in Table 2. The experimental results show that the success rate of all four algorithms was 100% over the 50 experiments, which proves their effectiveness in the path-planning task. T-RRT outperformed the other three algorithms, achieving an RMSE of 0.002% and an R² of 99.661%. In terms of planning time, the four planning algorithms did not show significant differences in performance; and while the T-RRT algorithm was slightly slower than RRT-connect, in fine-detail operations such as tea picking, accuracy is often more important than speed.
In summary, the T-RRT algorithm—with its excellent precision and acceptable speed performance—achieved the best balance between precision and efficiency for the selection of a path-planning algorithm for a premium tea-picking manipulator. Compared with the other three algorithms, this algorithm should be the first choice for fine-detail operations such as tea picking.

3.3. Picking Accuracy and Picking Efficiency

In this study, we conducted a detailed analysis of the 10 selected picking sites, with the data presented in Table 3. Across the 10 different picking experiments, our findings reveal that the total number of tea buds in the target field was approximately 258. Among these, 219 tea buds were successfully detected, yielding a detection success rate of 84.88%. However, depth information could not be accurately obtained for 17 tea buds, either due to exceeding the set threshold or being occluded. Of the detected tea buds, 186 were precisely located, achieving a location success rate of 84.93%. Ultimately, 160 shoots were successfully picked, resulting in a picking success rate of 86.02%. In summary, our premium tea-picking robot achieved an overall success rate of 62.02%. This rate slightly surpasses the recently reported index level of 61.30% in China. Additionally, the average picking time per tea bud was 1.86 s, which is marginally higher than the reference level of 1.51 s documented in recent Chinese literature [21]. To provide a clearer context for the performance of our proposed robot, we conducted a comparative analysis with existing systems. This analysis highlights the competitiveness of our robot in terms of both success rate and picking time. While our robot’s performance is commendable, there are still opportunities for improvement, particularly in reducing the picking time per tea bud. Such advancements could further enhance the efficiency and effectiveness of our premium tea-picking robot, positioning it as a leading solution in the industry.

4. Discussion

YOLOv8 performs better than YOLOv5 in tea bud-detection tasks. However, our research team has also noticed that the YOLOv8 model did not successfully identify all tea buds, mainly due to the following three factors:
(1)
The morphological characteristics of tea buds affect detection results [21]. Tea buds are small in size and grow in high density, and their color is not significantly different from the background. These factors reduce the visibility and recognizability of tea buds in images, thus increasing the difficulty of the detection algorithm.
(2)
Factors related to image acquisition also interfere with recognition results [22,23,24,25,26]. When collecting tea images, it is difficult to maintain a stable distance and shooting angle between the camera and the tea. Besides the variation in the growth direction of tea buds, many other variables can affect the appearance of tea buds in images to some extent, thereby adversely affecting the accuracy of the recognition algorithm.
(3)
The limitations of the YOLO model are also one of the reasons why some tea buds are not successfully identified. Although the performance of YOLOv8 has improved compared to that of YOLOv5, there are still some challenges when dealing with detection of small targets similar to the surrounding environment. This problem may stem from the limitations of the model in feature extraction and discrimination ability, which needs to be optimized and improved in future research.
In field experiments, we also observed differences in the growth period between the collected tea buds and the samples used in the training model. This difference leads to subtle changes in the color and appearance of tea buds, thereby reducing the confidence of the tea bud-detection model. To address this issue, we suggest that future research should focus on using transfer learning to improve the model’s generalization ability for tea bud detection at different growth stages [27].
Additionally, during the positioning of tea buds, some tea buds are severely occluded, leading to positioning errors or failures. Nevertheless, these occluded buds still retain their morphological characteristics. Therefore, future research can consider utilizing these morphological characteristics to establish a topological model of tea buds to improve positioning accuracy.
Meanwhile, we must acknowledge some potential limitations of the current method. Firstly, our method may not be robust enough for interference factors in unstructured environments, such as breeze. These factors have a significant impact on the process of picking high-quality tea, especially during camera acquisition, where the movement of tea bud positions can severely affect the accuracy of detection and positioning. To reduce these interferences, we plan to adjust camera parameters based on experimental conditions or add equipment such as hoods to improve the acquisition effect in the future.
Another limitation lies in the precision and speed of the robotic arm. The robotic arm plays a crucial role in tea bud picking. However, the effectiveness of the currently proposed high-quality tea-picking robot still needs improvement [10,28,29,30,31]. To enhance picking accuracy and efficiency, we consider adopting a parallel manipulator in future research and improving its movement speed.
In summary, although our method has achieved remarkable results in certain aspects, there is still much room for improvement. Future research will focus on enhancing the model’s generalization ability, optimizing the image-acquisition process and improving the performance of the robotic arm. We believe that these improvements can further enhance the overall performance of the high-quality tea-picking robot.

5. Conclusions

In this study, a premium tea-picking robot based on machine vision recognition and deep learning was designed and tested successfully. In order to realize accurate recognition and picking of tea, after comparison with the YOLOv5 model, the YOLOv8 algorithm was chosen for the detection of tea buds. The tea bud detection results indicated that YOLOv8 was significantly superior to YOLOv5 in terms of precision, recall, mAP@0.5, and mAP@0.5:0.95, reaching values of 0.993, 0.976, 0.992, and 0.915, respectively. In addition, the performances of the RRT, RRT-connect, RRT-star, and T-RRT path planning algorithms were evaluated through simulation of the manipulator. After comparative analysis, the T-RRT algorithm was confirmed to be the optimal path-planning algorithm for the picking manipulator, given that its performance in terms of the RMSE and R² metrics was excellent (reaching 0.002% and 99.661%, respectively).
The picking performance of the robot was further verified through field tests. The experimental results showed that the picking success rate was 62.02%, and the average picking time for each tea bud was 1.86 s. These research results have important theoretical and practical significance regarding the promotion of intelligent premium tea-picking processes.

6. Patents

Based on the results of this research, "Intelligent tea picking equipment based on deep learning and computer vision recognition" was granted a patent by the China National Intellectual Property Administration in January 2023, patent number ZL202111236676.7.

Author Contributions

Conceptualization, C.Y.; Methodology, L.W.; Validation, C.Y.; Formal analysis, Y.W.; Investigation, H.L.; Data curation, H.L.; Writing—original draft, H.L.; Writing—review & editing, Y.W.; Supervision, L.W.; Project administration, L.W.; Funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Agriculture and Rural Affairs, grant number [2022]119. The APC was funded by the Jiangxi Province linkage project.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wu, L. Analysis on the Technical Gap and Key Problems of Tea Picker in China. Significances Bioeng. Biosci. 2023, 6, 1–6. [Google Scholar]
  2. Liu, J.; Zhang, C.; Hu, R.; Zhu, X.; Cai, J. Aging of agricultural labor force and technical efficiency in tea production: Evidence from Meitan County, China. Sustainability 2019, 11, 6246. [Google Scholar] [CrossRef]
  3. Wang, T.; Zhang, K.; Zhang, W.; Wang, R.; Wan, S.; Rao, Y.; Jiang, Z.; Gu, L. Tea picking point detection and location based on Mask-RCNN. Inf. Process. Agric. 2021, 10, 267–275. [Google Scholar] [CrossRef]
  4. Yan, L.; Wu, K.; Lin, J.; Xu, X.; Zhang, J.; Zhao, X.; Tayor, J.; Chen, D. Identification and picking point positioning of tender tea shoots based on MR3P-TS model. Front. Plant Sci. 2022, 13, 962391. [Google Scholar] [CrossRef]
  5. Yang, H.; Chen, L.; Chen, M.; Ma, Z.; Deng, F.; Li, M.; Li, X. Tender tea shoots recognition and positioning for picking robot using improved YOLO-V3 model. IEEE Access 2019, 7, 180998–181011. [Google Scholar] [CrossRef]
  6. Yuan, Y.F.; Zheng, X.Z.; Lin, W.G. Path planning of picking robot for famous tea. J. Anhui Agric. Univ. 2017, 44, 530–535. [Google Scholar]
  7. Tian, J.; Zhu, H.; Liang, W.; Chen, J.; Wen, F.; Long, Z. Research on the application of machine vision in tea autonomous picking. In Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2021; Volume 1952, p. 022063. [Google Scholar]
  8. Yang, H.; Chen, L.; Ma, Z.; Chen, M.; Zhong, Y.; Deng, F.; Li, M. Computer vision-based high-quality tea automatic plucking robot using Delta parallel manipulator. Comput. Electron. Agric. 2021, 181, 105946. [Google Scholar] [CrossRef]
  9. Selsiya, M.S.; Manusa, K.R.; Gopika, S.; Kanimozhi, J.; Varsheni, K.C.; Mohan, M.S. Robotic Arm Enabled Automatic Tea Harvester. In Proceedings of the 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), Coimbatore, India, 8–9 October 2021; pp. 1–5. [Google Scholar]
  10. Gui, Z.; Chen, J.; Li, Y.; Chen, Z.; Wu, C.; Dong, C. A lightweight tea bud detection model based on Yolov5. Comput. Electron. Agric. 2023, 205, 107636. [Google Scholar] [CrossRef]
  11. Zhu, Y.; Wu, C.; Tong, J.; Chen, J.; He, L.; Wang, R.; Jia, J. Deviation Tolerance Performance Evaluation and Experiment of Picking End Effector for Famous Tea. Agriculture 2021, 11, 128. [Google Scholar] [CrossRef]
  12. Zhou, Y.; Wu, Q.; He, L.; Zhao, R.; Jia, J.; Chen, J.; Wu, C. System design and experiment of famous and high-quality tea picking robot. J. Mech. Eng. 2022, 58, 12–23. [Google Scholar]
  13. Lin, G.; Xiong, J.; Zhao, R.; Li, X.; Hu, H.; Zhu, L.; Zhang, R. Efficient detection and picking sequence planning of tea buds in a high-density canopy. Comput. Electron. Agric. 2023, 213, 108213. [Google Scholar] [CrossRef]
  14. Cai, Z.; Vasconcelos, N. Cascade r-cnn: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162. [Google Scholar]
  15. Chai, E.; Ta, L.; Ma, Z.; Zhi, M. ERF-YOLO: A YOLO algorithm compatible with fewer parameters and higher accuracy. Image Vis. Comput. 2021, 116, 104317. [Google Scholar] [CrossRef]
  16. LaValle, S. Rapidly-exploring random trees: A new tool for path planning. Res. Rep. 9811 1998. [Google Scholar]
  17. Wang, Y.; Jiang, W.; Luo, Z.; Yang, L.; Wang, Y. Path planning of a 6-DOF measuring robot with a direction guidance RRT method. Expert Syst. Appl. 2023, 238, 122057. [Google Scholar] [CrossRef]
  18. Kuffner, J.J.; LaValle, S.M. RRT-connect: An efficient approach to single-query path planning. In Proceedings of the 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No. 00CH37065), San Francisco, CA, USA, 24–28 April 2000; Volume 2, pp. 995–1001. [Google Scholar]
  19. Wang, L.; Guan, S.X. Trajectory planning and control for SCARA manipulator of tea picking robot. Inf. Technol. Netw. Secur. 2020, 39, 73–80. [Google Scholar]
  20. Karaman, S.; Frazzoli, E. Sampling-based algorithms for optimal motion planning. Int. J. Robot. Res. 2011, 30, 846–894. [Google Scholar] [CrossRef]
  21. Li, Y.; Zhou, Y.; Wang, S.; Chen, J.; He, Y.; Jia, J.; Wu, C. Experimental Study on High-quality Tea Plucking by Robot. J. Tea Sci. 2024, 44, 75–83. [Google Scholar]
  22. Liu, X.; Zhao, D.; Jia, W.; Ji, W.; Sun, Y. A detection method for apple fruits based on color and shape features. IEEE Access 2019, 7, 67923–67933. [Google Scholar] [CrossRef]
  23. Du, X.; Meng, Z.; Ma, Z.; Zhao, L.; Lu, W.; Cheng, H.; Wang, Y. Comprehensive visual information acquisition for tomato picking robot based on multitask convolutional neural network. Biosyst. Eng. 2024, 238, 51–61. [Google Scholar] [CrossRef]
  24. Hu, H.; Kaizu, Y.; Zhang, H.; Xu, Y.; Imou, K.; Li, M.; Huang, J.; Dai, S. Recognition and localization of strawberries from 3D binocular cameras for a strawberry picking robot using coupled YOLO/Mask R-CNN. Int. J. Agric. Bioeng. 2022, 15, 175–179. [Google Scholar] [CrossRef]
  25. Pal, A.; Leite, A.C.; Johan, P. A novel end-to-end vision-based architecture for agricultural human-robot collaboration in fruit picking operations. Robot. Auton. Syst. 2024, 172, 104567. [Google Scholar] [CrossRef]
  26. Chen, B.; Gong, L.; Yu, C.; Du, X.; Chen, J.; Xie, S.; Le, X.; Li, Y.; Liu, C. Workspace decomposition based path planning for fruit-picking robot in complex greenhouse environment. Comput. Electron. Agric. 2023, 215, 108353. [Google Scholar] [CrossRef]
  27. Ling, X.; Zhao, Y.; Gong, L.; Liu, C.; Wang, T. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision. Robot. Auton. Syst. 2019, 114, 134–143. [Google Scholar] [CrossRef]
  28. Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Yang, C.; Liu, C. A survey on deep transfer learning. In Artificial Neural Networks and Machine Learning–ICANN 2018: 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4–7, 2018; Proceedings Part III 27; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 270–279. [Google Scholar]
  29. Li, Y.; Wu, S.; He, L.; Tong, J.; Zhao, R.; Jia, J.; Chen, J.; Wu, C. Development and field evaluation of a robotic harvesting system for plucking high-quality tea. Comput. Electron. Agric. 2023, 206, 107659. [Google Scholar] [CrossRef]
  30. Yida, N. Research State and Trend of Fruit Picking Robot Manipulator Structure. Int. Equip. Eng. Manag. 2020, 25, 15. [Google Scholar] [CrossRef]
  31. Tavares, N.; Gaspar, P.D.; Aguiar, M.L.; Mesquita, R.; Simões, M.P. Robotic arm and gripper to pick fallen peaches in orchards. Acta Hortic. 2022, 1352, 567–574. [Google Scholar] [CrossRef]
Figure 1. The difference in fresh leaves between bulk tea (a) and premium tea (b).
Figure 2. Structural diagram of the premium tea-picking robot.
Figure 3. The control system diagram of the premium tea-picking robot.
Figure 4. Picking positions on the tea bud.
Figure 5. The schematic diagram of the visual system.
Figure 6. The binocular camera. (a) Physical image of the camera, and (b) depth-measuring principle.
Figure 7. YOLOv8’s network structure.
Figure 8. Tea image collection.
Figure 9. Physical images and workspace map of the picking system: (a) Image of cutting hand, (b) image of physical manipulator, (c) manipulator workspace top view, and (d) manipulator workspace master view.
Figure 10. The motion simulation of the manipulator: (a) starting origin, (b) initial position, and (c) target position.
Figure 11. The premium tea-picking robot in the field work site.
Figure 12. The loss functions: (A) box loss, (B) objectness loss, (C) val box loss, and (D) val objectness loss.
Figure 13. The indicator parameters: (A) precision, (B) recall, (C) mAP@0.5, and (D) mAP@0.5:0.95.
Table 1. Comparison of YOLOv5's and YOLOv8's performances after 600 training iterations.

Parameter | YOLOv5 | YOLOv8
Box | 0.728 | 0.015
Val Box | 0.583 | 0.010
Objectness | 1.000 | 0.005
Val Objectness | 0.927 | 0.021
Precision | 0.969 | 0.993
Recall | 0.935 | 0.976
mAP@0.5 | 0.978 | 0.992
mAP@0.5:0.95 | 0.853 | 0.915
Table 2. Simulation test data for the manipulator under different path-planning algorithms.

Parameter | RRT | RRT-Connect | RRT-Star | T-RRT
RMSE (%) | 0.030 | 0.071 | 0.003 | 0.002
R² (%) | 99.014 | 98.983 | 99.450 | 99.661
Average planning time (s) | 0.001 | 0.0002 | 0.001 | 0.001
Execution success rate | 100% | 100% | 100% | 100%
Table 3. The number of tender shoots and picking time in different stages of the field picking experiment.

Location Number | Number of Shoots | Number of Detections | Number of Locations | Number of Pickings | Overall Success Rate | Picking Time (s)
1 | 25 | 21 | 18 | 15 | 60.00% | 29.6
2 | 23 | 20 | 16 | 14 | 60.87% | 26.5
3 | 33 | 28 | 25 | 22 | 66.67% | 36.3
4 | 29 | 24 | 22 | 20 | 68.67% | 34.8
5 | 18 | 15 | 13 | 10 | 55.56% | 19.8
6 | 27 | 24 | 20 | 16 | 59.26% | 32.4
7 | 25 | 21 | 17 | 15 | 60.00% | 28.8
8 | 17 | 14 | 11 | 10 | 58.82% | 18.7
9 | 26 | 23 | 19 | 16 | 61.84% | 28.6
10 | 35 | 29 | 25 | 22 | 62.86% | 42.3
Total | 258 | 219 | 186 | 160 | 62.02% | 297.8
