1. Introduction
As the world’s population continues to grow, so does the demand for agricultural products. According to the United Nations World Population Prospects, the global population will reach 9.6 billion in 2050 [1]. Potato is the fourth major food crop in the world and is produced in large quantities, but its yield per unit area is low. The low level of intelligence of agricultural machinery is an important factor limiting that yield. As a result, with resources such as land being limited, a key focus for agricultural researchers is how to apply new technology to increase grain production per unit area [2] and meet the growing demand of the population. Automatic guidance technologies not only reduce the waste of labor resources [3] but also raise the level of intelligence of agricultural machinery, which in turn helps to boost food harvests. At present, most potato machinery with intelligent control technology relies on GPS or inertial navigation for operation. However, the cost of satellite navigation is high, and global path planning of the operating area is required before each use. In contrast, visual navigation stands out among navigation methods because of its low cost and high flexibility [4].
Many researchers have studied visual navigation technology [5,6,7]. The most important issue in visual navigation is extracting navigation information from the acquired images. In conventional methods, navigation information extraction is generally divided into image segmentation, feature extraction, clustering, and navigation line fitting. Image segmentation is the first step of navigation line extraction, and its quality determines the accuracy of the extracted navigation line. Moreover, segmentation quality often varies with the object being segmented. In agriculture, researchers often use crops as the segmentation object during image preprocessing. However, differences in crop appearance across growth periods often affect the segmentation result, and potatoes in particular look markedly different from one period to the next. In addition, illumination and the presence of weeds in the field also affect image segmentation. Therefore, an important focus of study is a method for extracting potato visual navigation lines that adapts to multiple growth periods and is not disturbed by noise, such as illumination and weeds, so as to meet the navigation requirements of potato machinery in different periods.
Image-based guidance technology is mainly divided into two categories: traditional image processing and image processing based on deep learning [8]. In traditional processing, many researchers have worked on improving the classical green feature algorithm and the Otsu algorithm to better separate crops from the background [9,10,11] before using crop features to extract navigation lines. This approach has obtained decent results in field crop environment segmentation [12]. To efficiently identify corn seedlings and weeds in the field, Montalvo et al. [13] used a double-threshold segmentation approach, in which the Otsu threshold method was followed by a second threshold segmentation; this significantly reduced the influence of field weeds on crop row segmentation. In this manner, they realized the identification and detection of straight and curved crop rows. Yue Yu et al. [12] used a triple classification method to segment rice seedlings, and then used a two-dimensional adaptive clustering method to eliminate misleading crop feature points. The experimental results show that this method can achieve better navigation line extraction results in complex paddy field environments with weeds, duckweed, and eutrophication. In addition, several groups have studied stereo vision [14,15,16]. To meet the precise navigation requirements of a cotton harvester, Fue et al. [17] proposed a cotton crop row detection method based on stereo vision, which provided an effective solution for the crop row detection of canopy crops and is expected to assist RTK-GNSS navigation in harvesting cotton bolls. However, the accuracy and real-time performance of stereo vision matching are problems that remain to be solved. Although the above traditional image processing methods are effective in specific situations, they are easily affected by noise, such as light and weeds, and have poor anti-interference ability. Moreover, potatoes vary in appearance across growth periods, so each period places different requirements on the segmentation threshold.
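To make the conventional segmentation step concrete, the following is a minimal sketch of a green-feature index followed by Otsu thresholding. It is an illustration only and is not taken from any of the cited works: the excess-green formula ExG = 2g - r - b is assumed as the green feature, and the function names `excess_green`, `otsu_threshold`, and `segment_crop` are hypothetical.

```python
import numpy as np

def excess_green(rgb):
    """Excess-green index ExG = 2g - r - b on chromaticity-normalized channels (assumed green feature)."""
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    r, g, b = (rgb[..., i] / total for i in range(3))
    return 2.0 * g - r - b

def otsu_threshold(values, bins=256):
    """Return the histogram bin center that maximizes the between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    weight_bg = np.cumsum(hist)                # pixels at or below each bin
    weight_fg = weight_bg[-1] - weight_bg      # pixels above each bin
    cum_sum = np.cumsum(hist * centers)
    mean_bg = cum_sum / np.maximum(weight_bg, 1)
    mean_fg = (cum_sum[-1] - cum_sum) / np.maximum(weight_fg, 1)
    between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    return centers[np.argmax(between)]

def segment_crop(rgb):
    """Boolean mask that is True where a pixel is classified as crop (hypothetical helper)."""
    exg = excess_green(rgb)
    return exg > otsu_threshold(exg.ravel())

# Synthetic example: left half vegetation-like green, right half soil-like brown.
image = np.zeros((4, 8, 3), dtype=np.uint8)
image[:, :4] = (30, 200, 40)
image[:, 4:] = (120, 90, 60)
mask = segment_crop(image)
```

Because the threshold is derived from a single global histogram, a mask produced this way shifts with illumination and weed cover, which is the sensitivity of threshold-based methods noted in the paragraph above.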
In recent years, artificial intelligence [18] and deep learning have made significant progress in the fields of autonomous driving [19], medical image processing [20,21], and speech recognition [22]. The application of transfer learning [23], in particular, addresses the important problem of the lack of relevant datasets in the agricultural field. In agricultural engineering, it is most commonly used in crop identification [24,25], weed identification [26,27], plant pest detection [28,29,30], water quality monitoring [31], and agricultural robot navigation. To reduce the complexity of traditional image segmentation, many researchers utilize object detection and semantic segmentation techniques to locate crop rows [32,33]. Based on the ES-Net network model, Adhikari SP et al. [34] performed segmentation training on the rice line dataset [35] and used a sliding window algorithm to cluster and fit the crop lines within the ROI. Finally, the geometric midline formed by two crop rows was used as the navigation line. The results show that the error was approximately 5 pixels. To adapt to the different row spacing of strawberries, Ponnambalam et al. [36] used SegNet [37] to identify and segment strawberry crop rows. The semantic information was divided into three categories: strawberry row, non-crop row, and background. An adaptive ROI algorithm was then used to achieve autonomous navigation in strawberry fields with various row spacings. Bah et al. proposed the CRowNet model [38], consisting of SegNet and a CNN-based Hough transform, for UAV crop row detection. The performance of this method was quantitatively compared with traditional methods, and a good crop row detection rate of 93.58% was obtained. In addition, object detection algorithms have also been applied to crop row recognition. Jiahui Wang [39] used the YOLO V3 object detection algorithm to identify paddy field seedlings under various working conditions. In that work, segmented labeling and prediction boxes were used to locate paddy rice seedlings, providing a new method for crop row detection. To make navigation line detection suitable for different growth periods of kiwifruit trees, Zongbin Gao [40] identified the kiwifruit trunks based on the Yolo v3 Tiny-3p model and fitted the navigation lines through the midpoints of the trunks on both sides of the road. The results show that the extracted guidelines can be applied to different kiwifruit growing environments. The above research shows that crop row detection based on deep learning is widely used and increasingly favored by researchers because of its strong learning ability and robustness. However, due to the lack of data samples, such methods are strongly applicable only to the specific objects they have learned. For the detection of potato crop rows across different growth periods, in particular, no research exists at present.
The objective of this study was to utilize deep learning-based methods to reduce the impact of illumination, weeds, and other noise on crop row segmentation and to achieve accurate segmentation of potato crop rows in different growth periods, something that has not been fully addressed in the literature. In addition, a feature midpoint adaptive navigation line extraction method is proposed, which can realize the adaptive adjustment of the vision navigation line position according to the growth shape of the potato to ensure that the potato machine always maintains the center position of the row during operation. The main contributions are as follows:
A potato crop row dataset was established under various growth periods and lighting conditions.
Based on improved U-Net, a segmentation and recognition model of potato crop rows was constructed.
A complete detection scheme for the potato visual navigation line suitable for multiple growth periods was proposed.
The remainder of this paper is organized as follows: Section 2 contains the details of potato visual navigation line detection. Section 3 details the model segmentation and vision navigation fitting results and provides the discussion. Finally, Section 4 provides this study’s conclusions.
4. Conclusions
Image-based guidance methods for controlling the navigation of agricultural machinery can greatly improve the automation level of agricultural robots. Moreover, they can operate stably in areas without satellite signals. There are two main methods for image-based guidance: (1) methods based on traditional image processing; (2) methods based on CNN offline training. In practical applications, although guidance based on traditional threshold segmentation has obvious advantages in terms of processing time, it does not transfer well across growth periods. In this paper, a method for the segmentation of potato crop rows based on semantic segmentation is proposed. Data labeling files were created under actual working conditions in navigation operations, and the model learned from these samples to achieve pixel-level segmentation of potato crop rows and backgrounds under different working conditions. In addition, the feature midpoint adaptive algorithm proposed in this paper was used to extract the navigation reference points. Finally, the navigation line was fitted by the least squares method. The experimental results show that the proposed method is robust and adapts well to the field operation environment, which contains many unstructured factors. This provides a reference for the self-adaptive adjustment of agricultural machinery in the field. Furthermore, using VGG16 as the feature extractor of the U-Net network not only improved the model’s convergence speed but also reduced the training time. Our method outperforms the original U-Net, SegNet, PSPNet, and Deeplab V3 methods in terms of segmentation. Because the average running speed of agricultural machinery is 1.5 km/h, the proposed method can meet its actual operational requirements.
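The reference-point extraction and least squares fitting described above can be illustrated with a short sketch under stated assumptions. The midpoint rule used here (the center of the segmented crop-row extent on each image row) is a simplified stand-in and not the feature midpoint adaptive algorithm itself; the function names `row_midpoints` and `fit_navigation_line` and the `step` parameter are hypothetical. The line is fitted as x = a*y + b, regressing the column on the image row so that a nearly vertical navigation line remains well conditioned.

```python
import numpy as np

def row_midpoints(mask, step=1):
    """Return (x, y) midpoints of the foreground extent on every `step`-th image row."""
    points = []
    for y in range(0, mask.shape[0], step):
        cols = np.flatnonzero(mask[y])
        if cols.size:  # skip image rows that contain no crop pixels
            points.append(((cols[0] + cols[-1]) / 2.0, float(y)))
    return np.array(points, dtype=np.float64).reshape(-1, 2)

def fit_navigation_line(points):
    """Least squares fit of x = a*y + b through the reference points."""
    if len(points) < 2:
        raise ValueError("at least two reference points are needed to fit a line")
    a, b = np.polyfit(points[:, 1], points[:, 0], deg=1)
    return a, b

# Synthetic binary mask: a 7-pixel-wide band whose center follows x = y + 5.
mask = np.zeros((20, 40), dtype=bool)
for y in range(20):
    mask[y, y + 2:y + 9] = True

points = row_midpoints(mask, step=2)
slope, intercept = fit_navigation_line(points)
```

In this sketch the slope and intercept are expressed in pixel units of the image; converting them into a lateral offset and heading angle for the machine would additionally require the camera calibration, which is outside the scope of the example.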
However, the size of the dataset collected in this study and the working conditions were limited; thus, they could not completely represent each potato growth period. Therefore, increasing the dataset size should be considered in future research to improve the applicability of the model. Moreover, this paper only segmented the potato crop rows and did not consider the influence of other factors, such as obstacles. In the future, a variety of sensors should be integrated to improve the perception and decision-making capabilities of the machinery. In addition, although this paper has achieved good results in detecting potato navigation lines, the method used is relatively primitive. Future work should therefore study and improve upon our methods by exploring new semantic segmentation models and SVM to improve the system as a whole.