**5. Conclusions**

A novel 3D locating system based on binocular vision was proposed for laser pest control, combining Mask R-CNN detection, pest skeleton extraction, and multi-constraint stereo matching. The ResNet50-based Mask R-CNN model was trained and validated on a self-built NIR field *P. rapae* image dataset collected in a real-world agricultural scene. The AP, recall, and *F*<sub>1</sub> values of the Mask R-CNN were 94.24%, 97.47%, and 96.55%, respectively, demonstrating the adaptability of the proposed model.
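
For readers who want the relationship between these metrics spelled out, the brief sketch below computes precision, recall, and the *F*<sub>1</sub> score from raw detection counts; the `detection_metrics` helper and the counts are illustrative assumptions, not the study's evaluation code or data.

```python
def detection_metrics(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from raw detection counts (illustrative only)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts, not the study's data.
p, r, f1 = detection_metrics(tp=154, fp=9, fn=4)
print(f"precision={p:.2%}, recall={r:.2%}, F1={f1:.2%}")
```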

Furthermore, when the working depth varied between 400 and 600 mm, the average location errors of the 3D system in the *X*-, *Y*-, and *Z*-axis directions were 0.40 mm, 0.30 mm, and 0.51 mm, respectively, and the corresponding maximum errors were 0.98 mm, 0.68 mm, and 1.16 mm. These results provide a design basis for follow-up research and development of the laser pest control execution system.
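
As a minimal sketch of how such per-axis statistics can be obtained, assuming the measured and reference 3D coordinates are available as arrays, the snippet below computes the mean and maximum absolute errors along each axis; the sample values and variable names are placeholders rather than the experimental data.

```python
import numpy as np

# Placeholder arrays of shape (n_points, 3): columns are X, Y, Z in millimeters.
measured = np.array([[10.4, 20.1, 500.3], [30.2, 40.5, 550.8]])
reference = np.array([[10.0, 20.0, 500.0], [30.0, 40.0, 550.0]])

abs_err = np.abs(measured - reference)   # per-point, per-axis absolute error
mean_err = abs_err.mean(axis=0)          # average error along X, Y, Z
max_err = abs_err.max(axis=0)            # maximum error along X, Y, Z

for axis, m, mx in zip("XYZ", mean_err, max_err):
    print(f"{axis}-axis: mean {m:.2f} mm, max {mx:.2f} mm")
```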

Since the laser strike point extraction in this paper was limited to two-dimensional image features, there is still room for improvement in both the object point localization method and the accuracy evaluation experiments. In future work, a depth camera could be used to obtain the overall 3D pose of the pests and further improve the target localization accuracy.

**Author Contributions:** Conceptualization, Y.L., Y.X. and Q.F.; methodology, Y.L., Y.X. and Q.F.; software, Y.L.; validation, Y.L., J.L., X.L. and Z.H.; formal analysis, Y.L.; investigation, Y.L., J.L., X.L. and Z.H.; resources, Y.X. and Q.F.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, Y.X. and Q.F.; visualization, Y.L.; supervision, Y.X. and X.L.; project administration, Y.X.; funding acquisition, Y.X. and Q.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Key Research and Development Plan Project (2019YFE0125200), the Natural Science Foundation of Hunan Province of China (2021JJ30363), the BAAFS Innovation Capacity Building Project (KJCX20210414), and the Science and Technology General Project of Beijing Municipal Education Commission (KM202112448001).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All data are presented in this article in the form of figures and tables.

**Conflicts of Interest:** The authors declare no conflict of interest.

**3D Positioning Method for Pineapple Eyes Based on Multiangle Image Stereo-Matching**

**Anwen Liu <sup>1</sup>, Yang Xiang <sup>1,\*</sup>, Yajun Li <sup>1,2</sup>, Zhengfang Hu <sup>1</sup>, Xiufeng Dai <sup>1</sup>, Xiangming Lei <sup>1</sup> and Zhenhui Tang <sup>1</sup>**


**Abstract:** Currently, pineapple processing is a primarily manual task, with high labor costs and low operational efficiency. The ability to precisely detect and locate pineapple eyes is critical to achieving automated pineapple eye removal. In this paper, machine vision and automatic control technology are used to build a pineapple eye recognition and positioning test platform, using the YOLOv5l target detection algorithm to quickly identify pineapple eyes in images. A 3D localization algorithm based on multiangle image matching is used to obtain the 3D position information of the pineapple eyes, and a CNC precision motion system is used to drive a probe into each pineapple eye to verify the effect of the recognition and positioning algorithms. The recognition experiments demonstrate that the mAP reached 98% and the average time required to detect one pineapple eye image was 0.015 s. According to the probe test results, the average deviation between the actual center of a pineapple eye and the penetration position of the probe was 1.01 mm, the maximum deviation was 2.17 mm, and the root mean square value was 1.09 mm, which meets the positioning accuracy requirements of actual pineapple eye-removal operations.

**Keywords:** pineapple eye; three-dimensional; YOLOv5; stereo-matching

**1. Introduction**

Pineapple is a fruit with high added economic value. In 2018, China's annual pineapple production was approximately 1.64 million tons [1], and approximately 30% of pineapples are used for production and processing [2]. The processing of pineapple is complicated, especially because even after the pineapple has been skinned, many eyes remain on its surface and must be removed. Currently, pineapple eyes are mainly removed manually with special tools, which is labor intensive, costly, and inefficient. The key to automating pineapple eye removal is to rapidly and accurately identify and locate the eyes.

Machine vision technology is frequently utilized in fruit recognition and quality inspection because of its noncontact nature, high speed, and high precision [3]. In traditional machine vision technology, targets are primarily recognized based on characteristics such as color, shape, and texture. Li et al. [4] proposed a field recognition system for pineapple based on monocular vision, using threshold segmentation, morphological processing, and other operations to recognize pineapples and obtain pineapple center point information. Lin et al. [5] presented a segmentation method in which Leung-Malik texture and HSV color features were fused to realize the detection and recognition of citrus fruit. Lv et al. [6] proposed a method to deepen the fruit region and improve edge definition by using a histogram equalization algorithm; the R-B color-difference image based on histogram equalization was then obtained, and green apple recognition was realized. Kurtulmus et al. [7] used circular Gabor texture analysis for green citrus object recognition. When the fruit surface is uneven in color, shadowed, or obscured due to environmental factors such as light, the recognition quality of traditional machine vision technology is significantly reduced [8].
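
To make one of these classical pipelines concrete, the sketch below follows the spirit of the histogram-equalization and R-B color-difference approach of Lv et al. [6] using OpenCV; the file names, Otsu thresholding, and morphological cleanup are illustrative choices and not a reproduction of that work.

```python
import cv2
import numpy as np

img = cv2.imread("apple.jpg")      # placeholder image path (BGR)
b, g, r = cv2.split(img)

# Equalize the channels to deepen the fruit region and sharpen edges,
# then form the R-B color-difference image.
r_eq = cv2.equalizeHist(r)
b_eq = cv2.equalizeHist(b)
diff = cv2.subtract(r_eq, b_eq)    # red fruit stands out against green/blue background

# Otsu's threshold as one simple way to binarize the difference image.
_, mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Light morphological cleanup before extracting fruit regions.
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
cv2.imwrite("apple_mask.png", mask)
```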


By applying machine learning technology to fruit image analysis, better application results and higher efficiency can be obtained [9]. Li Han et al. [10] used a naive Bayes classifier to separate fruit and nonfruit regions, eliminating the interference caused by the color similarity between green tomatoes and green foliage backgrounds and thereby improving fruit recognition accuracy. Wang et al. [11] proposed a litchi recognition algorithm based on K-means clustering, which better resists the influence of illumination changes and maintains high recognition accuracy under occlusion and different lighting conditions. Zhao et al. [12] extracted Haar-like features from grayscale images and used an AdaBoost classifier for classification and recognition; in a real environment, the detection accuracy for ripe tomatoes reached 96%, and the classifier structure was simple.
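
As a hedged illustration of such classifier-based pixel labeling (e.g., the fruit/nonfruit classification of Li Han et al. [10]), the sketch below trains a Gaussian naive Bayes model on a few RGB samples with scikit-learn; the toy data and labels are invented for illustration and do not reproduce any cited experiment.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy RGB pixel samples: 1 = fruit, 0 = background (invented for illustration).
X = np.array([[200, 60, 50], [190, 70, 60], [40, 150, 45], [35, 140, 50]], dtype=float)
y = np.array([1, 1, 0, 0])

clf = GaussianNB().fit(X, y)

# Classify new pixels; reshaping an image to (n_pixels, 3) lets the same call label it wholesale.
new_pixels = np.array([[195, 65, 55], [45, 145, 40]], dtype=float)
print(clf.predict(new_pixels))   # expected: [1 0]
```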

In recent years, object detection based on deep learning has shown great advantages in the field of fruit image recognition [13,14]. The convolutional neural network, with its fast detection speed and excellent ability to extract target features, not only reduces the workload but also improves the recognition speed and accuracy [15]. Zhang Xing et al. [16] studied pineapple recognition for picking in a complex field environment based on an improved YOLOv3; a multiscale fusion training network was used to detect the single pineapple category, and a detection and recognition rate of approximately 95% was achieved with this method. Tian et al. [17] proposed an improved YOLOv3 model to identify apples at different growth stages in orchards. The model used the DenseNet method to process low-resolution feature layers, which effectively enhances feature propagation, promotes feature reuse, improves network performance, and yields good recognition performance under apple overlap and occlusion conditions. Yu et al. [18] proposed a Mask R-CNN algorithm to identify 100 wild strawberry images; the results demonstrated an average recognition accuracy of 95.78% and a recall rate of 95.41%. Zhang et al. [19] proposed a real-time detection method for grape clusters based on the YOLOv5s deep learning algorithm. By training and tuning the parameters of the YOLOv5s model on their dataset, fast and accurate detection of grape clusters was realized; the test results showed that the precision, recall, and mAP of the grape cluster detection network were all 99.40%.

Studies related to fruit positioning have mainly focused on the three-dimensional positioning of fruit for robotic picking, and widely used methods include binocular stereo vision, structured-light stereo vision, and monocular stereo vision. Binocular stereo vision provides image information of the target from different angles, and the three-dimensional position of the target can be obtained through stereo matching [20]. It is therefore widely used in fruit and vegetable recognition [21], positioning [22], and the acquisition of phenotypic parameters [23]. Luo et al. [24] proposed a solving and positioning method based on binocular stereo vision; when the depth distance was within 1000 mm, the positioning error was less than 5 mm. However, the calibration process of a binocular camera is complex, and the computational burden of the algorithm is relatively large [25]. Structured-light stereo vision combines structured light technology with binocular stereo vision: through structured light matching, corresponding pixels in the left and right cameras are stereo-matched, the parallax is calculated, and the three-dimensional data of the scene are recovered. Zhang et al. [26] used a machine vision system based on a near-infrared structured-light array and three-dimensional reconstruction technology to recognize and position apple stems and calyxes. However, structured-light stereo vision is easily affected by illumination [27]. Monocular stereo vision positioning can be divided according to whether one, two, or more images from a single camera are used. Positioning from a single image mainly relies on the known mapping relationship between the spatial information of characteristic light points, lines, or other image features and the image to obtain the position coordinates [28]. Generally, images from different perspectives are obtained by changing the position of the camera, and the matching relationships of image feature points across multiple shots are used to recover the relative poses of the camera and thereby realize the positioning of the target. Zhao et al. [29] used a monocular color camera to build a vision system that locates the picking points of litchi clusters and realizes their three-dimensional positioning.
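
To make the multi-view idea concrete, the sketch below triangulates a single 3D point from two camera poses with OpenCV's `triangulatePoints`; the intrinsics, baseline, and matched pixel coordinates are fabricated for illustration and are not taken from any of the cited systems.

```python
import cv2
import numpy as np

# Fabricated intrinsics shared by both views.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# View 2 is the same camera shifted +100 mm along X, so its extrinsic translation is -100 mm.
R = np.eye(3)
t = np.array([[-100.0], [0.0], [0.0]])

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # projection matrix of view 1
P2 = K @ np.hstack([R, t])                         # projection matrix of view 2

# One matched feature point per view, as 2 x N pixel-coordinate arrays.
pts1 = np.array([[480.0], [240.0]])
pts2 = np.array([[320.0], [240.0]])

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)    # homogeneous 4 x N result
X = (X_h[:3] / X_h[3]).ravel()
print(X)  # approximately [100, 0, 500] for these fabricated inputs (units of the baseline)
```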

To date, there have been no research reports on machine-vision recognition or positioning of pineapple eyes. Based on an analysis of the existing research on fruit recognition and positioning, this paper applies deep learning technology based on convolutional neural networks to pineapple eye recognition. On this basis, combined with a method for acquiring images around the entire circumference of the pineapple, the three-dimensional localization of pineapple eyes is realized by stereo-matching monocular multiangle images.
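
As a rough, non-authoritative sketch of the detection step, the snippet below loads a YOLOv5l model through the public Ultralytics torch.hub interface and times inference on one image; the COCO-pretrained weights and placeholder image path stand in for the pineapple-eye model and dataset trained in this work, so neither the detections nor the timing correspond to the results reported later.

```python
import time
import torch

# Generic COCO-pretrained YOLOv5l from the public Ultralytics hub,
# standing in for the pineapple-eye model trained in this paper.
model = torch.hub.load("ultralytics/yolov5", "yolov5l", pretrained=True)

image_path = "pineapple_surface.jpg"  # placeholder path

start = time.perf_counter()
results = model(image_path)           # run detection on one image
elapsed = time.perf_counter() - start

print(f"inference time: {elapsed:.3f} s")
print(results.pandas().xyxy[0])       # bounding boxes, confidences, class labels
```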
