Article

SN-CNN: A Lightweight and Accurate Line Extraction Algorithm for Seedling Navigation in Ridge-Planted Vegetables

1 School of Agricultural Engineering, Jiangsu University, Zhenjiang 212013, China
2 Jiangsu Provincial Key Laboratory of Hi-Tech Research for Intelligent Agricultural Equipment, Jiangsu University, Zhenjiang 212013, China
3 Shanghai Agricultural Machinery Research Institute, Shanghai 201106, China
* Author to whom correspondence should be addressed.
Agriculture 2024, 14(9), 1446; https://doi.org/10.3390/agriculture14091446
Submission received: 21 July 2024 / Revised: 12 August 2024 / Accepted: 23 August 2024 / Published: 24 August 2024
(This article belongs to the Section Agricultural Technology)

Abstract

In precision agriculture, after vegetable transplanters plant the seedlings, field management during the seedling stage is necessary to optimize the vegetable yield. Accurately identifying and extracting the centerlines of crop rows during the seedling stage is crucial for achieving the autonomous navigation of robots. However, the transplanted ridges often experience missing seedling rows. Additionally, due to the limited computational resources of field agricultural robots, a more lightweight navigation line fitting algorithm is required. To address these issues, this study focuses on mid-to-high ridges planted with double-row vegetables and develops a seedling band-based navigation line extraction model, a Seedling Navigation Convolutional Neural Network (SN-CNN). Firstly, we proposed the C2f_UIB module, which effectively reduces redundant computations by integrating Network Architecture Search (NAS) technologies, thus improving the model’s efficiency. Additionally, the model incorporates the Simplified Attention Mechanism (SimAM) in the neck section, enhancing the focus on hard-to-recognize samples. The experimental results demonstrate that the proposed SN-CNN model outperforms YOLOv5s, YOLOv7-tiny, YOLOv8n, and YOLOv8s in terms of the model parameters and accuracy. The SN-CNN model has a parameter count of only 2.37 M and achieves an mAP@0.5 of 94.6%. Compared to the baseline model, the parameter count is reduced by 28.4%, and the accuracy is improved by 2%. Finally, for practical deployment, the SN-CNN algorithm was implemented on the NVIDIA Jetson AGX Xavier, an embedded computing platform, to evaluate its real-time performance in navigation line fitting. We compared two fitting methods: Random Sample Consensus (RANSAC) and least squares (LS), using 100 images (50 test images and 50 field-collected images) to assess the accuracy and processing speed. The RANSAC method achieved a root mean square error (RMSE) of 5.7 pixels and a processing time of 25 milliseconds per image, demonstrating superior fitting accuracy while meeting the real-time requirements for navigation line detection. This performance highlights the potential of the SN-CNN model as an effective solution for autonomous navigation in field cross-ridge walking robots.

1. Introduction

With the advancement of precision agriculture and smart farming, mechanized transplanting has become the mainstream method for large-scale vegetable cultivation [1]. Additionally, ridge cultivation is a common method to enhance vegetable yield [2]. Although mechanized transplanting significantly improves planting efficiency [3], issues such as missing transplants frequently occur due to factors like agricultural machinery design [4,5], agronomic factors [6], and environmental influences, leading to gaps in the seedling rows. Currently, post-transplant seedling inspection and replanting in the early growth stages of vegetables have become highly promising research areas aimed at optimizing the vegetable yield and enhancing the agricultural production efficiency [7,8,9,10]. As agriculture advances toward Agriculture 5.0, the role of advanced AI technologies in fostering sustainable and efficient practices is becoming increasingly important [11]. The accurate extraction of navigation lines is a critical technology for the autonomous operation of agricultural robots. The precision and efficiency of this technology directly impact the automation level and operational quality of the robots. Furthermore, the advancements in navigation line extraction technology not only reflect the progress in precision agriculture but also mark an important milestone in propelling modern agriculture to higher levels of development.
The current navigation technologies for agricultural machinery primarily include Global Navigation Satellite Systems (GNSSs) [12] and machine vision navigation systems [13]. GNSSs provide precise positioning for inspection robots in open fields, but robots guided by satellite positioning alone may damage ridges or seedlings when autonomously navigating narrow furrows, so GNSS-only guidance fails to meet the precise navigation requirements of these robots. In contrast, machine vision systems offer high-precision navigation by directly processing visual information from the environment. This makes them highly suitable for tasks requiring accurate positioning and object recognition, showing significant potential in the navigation of inspection or replanting robots [14,15].
Currently, domestic and international researchers focus on two main types of navigation for ridge-following robots: navigation without crops and navigation with crops [16]. Navigation without crops is primarily applied to vegetable transplanters, while navigation with crops is mainly used for field inspection and management robots. In unstructured environments, the features of seedling rows are more prominent; hence, this study uses seedling row detection for ridge-following navigation. Seedling row detection models using machine vision technology are mainly divided into traditional machine vision techniques and deep learning techniques [17].
Traditional machine vision techniques primarily utilize manually defined features such as color and texture for feature extraction. Chen et al. [18] employed the Otsu method based on the RGB color space to segment crops and furrows, proposing a prediction point Hough transform method to identify the furrow centerline between crops as the navigation line for robots. Ospina and Noguchi [19] used threshold functions and morphological operations to accurately segment crop rows, then applied the least squares method to extract the crop rows as navigation lines for harvesting and plant protection machinery operating on ridges. Rabab et al. [20] proposed a machine vision-based crop row detection algorithm that utilized the Otsu method in the RGB color space to segment crops and furrows and leveraged the color differences between the crops and soil to detect the crop rows. This method enhances the accuracy of crop row detection under field conditions, facilitating the autonomous navigation of agricultural machinery and is particularly suitable for field operations in precision agriculture. Hamuda et al. [21] developed an automatic crop detection algorithm based on the HSV color space and morphological operations. The algorithm segmented the region of interest (ROI) by filtering the HSV channels and further refined the ROI using morphological erosion and dilation operations. While traditional machine learning methods are computationally efficient, simple to implement, and robust, they exhibit weaker adaptability and generalization capabilities in unstructured environments.
One significant advantage of deep learning is its ability to automatically extract complex features from data. For instance, Convolutional Neural Networks (CNNs) have been widely applied in image classification and feature extraction, demonstrating a high speed and accuracy in seedling band detection. Li et al. [22] proposed an end-to-end crop row detection algorithm, E2CropDet, based on deep learning, which combined CNNs and Long Short-Term Memory (LSTM) networks to accurately detect crop rows under varying lighting and background conditions. YOLO, as a one-stage detection algorithm, excels in both speed and accuracy [23]. Yang et al. [24] introduced a real-time maize field crop row detection method based on ROI autonomous extraction, achieving efficient crop row detection in complex backgrounds through image preprocessing and feature point extraction. Diao et al. [25] proposed a maize spraying robot navigation line extraction algorithm based on an improved YOLOv8s network, incorporating a novel Atrous Spatial Pyramid Pooling Fusion (ASPPF) structure to enhance the detection accuracy of core maize plants. Liu et al. [26] improved pineapple navigation line extraction accuracy by adding small target detection layers and modifying the loss function in the neck section of YOLOv5. Diao et al. [27] presented a maize row detection model based on ST-YOLOv8s, which can adapt to crop row recognition tasks at different growth stages. To improve the model accuracy, in addition to adding small target detection layers and enhancing the spatial pyramid structure, incorporating attention mechanisms is also an effective method. Attention mechanisms enable the network model to focus on critical information, with commonly used mechanisms including Squeeze-and-Excitation (SE) networks [28] and a Convolutional Block Attention Module (CBAM) [29]. Currently, domestic and international scholars primarily apply seedling band missing detection to crop yield assessment. Lin et al. [30] utilized drones for detecting missing seedlings in peanut fields, combining YOLOv5 with DeepSort to achieve efficient detection; however, this method is unsuitable for the high-precision detection of missing seedlings on ridge surfaces. Cui et al. [31] developed an improved YOLOv5 model for rice seedling missing detection and counting, achieving an accuracy of 93.2% but with relatively high model parameters.
For the deployment of agricultural robots in field environments, lightweight architectures with fewer parameters are typically utilized, such as GhostConv [32], ShuffleNet [33,34], and MobileNet [35]. Gong et al. [36] proposed the YOLOV5s-M3 model for autonomous navigation in maize seedling weeding. This model employed MobileNetv3 as the backbone network, reducing the number of parameters, and incorporated CBAM to enhance the detection accuracy. Additionally, knowledge distillation was used to improve the model’s recall and precision. Andújar et al. [37] developed the DCGA-YOLOv8 multi-crop row detection model, which enhances crop row detection accuracy by introducing deformable convolutions and the Global Attention Mechanism (GAM).
In summary, current research on crop row detection faces challenges such as large model parameters and the limited computational resources of agricultural robots. Moreover, the presence of missing seedling rows in ridge-cultivated vegetables necessitates high-precision navigation lines for robots to cross ridges accurately. To address these challenges, this paper develops the SN-CNN model for extracting navigation lines in ridge-cultivated vegetables. The main contributions are as follows. (1) We constructed a dual-row ridge-cultivated broccoli seedling navigation line dataset, detailing data sources, collection methods, and data augmentation techniques. (2) We proposed the C2f_UIB module, which effectively reduces redundant computations by integrating NAS and weight transfer technologies, thereby improving the model efficiency. (3) By introducing the parameter-free SimAM attention mechanism into the neck part of the baseline model, we enhanced the detection accuracy. Additionally, using the RANSAC algorithm for the navigation line fitting reduced the fitting error.

2. Materials and Methods

2.1. Research Process and Methods of This Paper

This study employs common vegetable planting methods, focusing on ridge-cultivated broccoli seedlings, specifically targeting seedling band extraction after transplantation by vegetable transplanters. As shown in Figure 1, the detection targets on the ridge surface are classified into two categories: seedling and missed hill. We use the midpoints of the bounding boxes of the actual detection targets and missed seedlings as fitting points for the navigation line. Based on this, we developed the SN-CNN navigation line detection model. Firstly, we provided a detailed description of the image data sources and collection methods. Secondly, we constructed the dataset using data augmentation techniques and Labelme1.8.6 software. Following this, we presented the SN-CNN network model and the proposed C2f_UIB module in detail. Additionally, we integrated the parameter-free SimAM attention mechanism into the neck section. In the navigation line fitting section, we compared the performance of the least squares method with the RANSAC algorithm. The least squares method, a widely used mathematical approach, minimizes the squared error between data points and the fitted line to achieve an optimal result. It is particularly effective in scenarios where the data distribution is uniform and free of significant outliers, offering a high computational efficiency and ease of implementation. However, the least squares method is sensitive to outliers, which can lead to biased fitting in datasets with noise or anomalies. In contrast, the RANSAC algorithm iteratively selects random data subsets for model fitting and evaluates the conformity of each model with the overall dataset, effectively mitigating the influence of outliers and producing more robust fitting results. Despite its higher computational complexity and resource demands due to random sampling and multiple iterations, RANSAC excels in handling noisy data. Therefore, we employed both methods in our analysis to achieve an optimal balance between speed and accuracy. Finally, we evaluated the actual detection performance of the model using the test set, discussed the model’s applicability, advantages, limitations, future directions, and potential applications in field inspection and replanting robots.
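To make the line-fitting step concrete, the sketch below is an illustrative example, not the authors' implementation: it fits a line through hypothetical bounding-box midpoints with ordinary least squares and with a minimal RANSAC loop. The midpoint coordinates, iteration count, and inlier threshold are all assumptions.

```python
import numpy as np

def fit_ls(points):
    """Least-squares line x = a*y + b through (x, y) midpoints.
    Fitting x as a function of y suits near-vertical seedling rows."""
    x, y = points[:, 0], points[:, 1]
    a, b = np.polyfit(y, x, 1)
    return a, b

def fit_ransac(points, n_iters=200, inlier_thresh=8.0, seed=0):
    """Minimal RANSAC: repeatedly fit a line to two random midpoints,
    keep the model with the most inliers, then refit on those inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(y2 - y1) < 1e-6:
            continue  # skip degenerate (horizontal) sample pairs
        a = (x2 - x1) / (y2 - y1)
        b = x1 - a * y1
        residuals = np.abs(points[:, 0] - (a * points[:, 1] + b))
        inliers = residuals < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_ls(points[best_inliers])

# Hypothetical bounding-box midpoints (pixels) for one seedling row,
# including one outlier caused by a lodged seedling.
midpoints = np.array([[312, 80], [318, 210], [309, 340], [360, 470], [315, 600]], float)
print("LS     a, b:", fit_ls(midpoints))
print("RANSAC a, b:", fit_ransac(midpoints))
```

In this toy example the outlier at (360, 470) pulls the least-squares line sideways, while RANSAC rejects it, which mirrors the trade-off between the two methods discussed above.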

2.2. Dataset Construction and Image Preprocessing

In August 2022, data collection was conducted at the broccoli full-process mechanization production demonstration base in Xiangshui County, Jiangsu Province, China. In Figure 2, the dataset sources are detailed. The broccoli seedlings were approximately 30 days old, with an average height of 124 mm and possessing 3 to 5 leaves. The image data were collected from the broccoli transplanted using three different vegetable transplanters, as illustrated in Figure 2. Specifically, C-E correspond to the Jiangsu University Vegetable Transplanter (2ZBA-2) [38] (Jiangsu University, Zhenjiang, China), the Yanmar Vegetable Transplanter (2ZQ-2) (Yanmar Co., Ltd., Osaka, Japan), and the AMEC Vegetable Transplanter (2ZS-2) [39] (AMEC, Changzhou, China), respectively. We used a visual platform equipped with a Realsense D455 camera (Intel Corporation, Santa Clara, CA, USA) to collect the video stream data. The RGB resolution was 1920 × 1080 pixels. Given the complex field environment and the varying scales of the detection targets during the robot’s operation, we ensured the diversity of the image data scales by varying the camera installation heights to 0.8 m, 1 m, and 1.2 m, with installation angles of 30° and 45° overlooking the ridge surface. This setup yielded six sample video segments. Using this method, we collected video data from different vegetable ridges on-site. Finally, we extracted the image dataset by frame extraction from the videos. As shown in Figure 3, the dataset images included continuous seedling rows, gaps in rows, and severe gaps.
To enhance the model’s generalization capability and address the issue of sample imbalance, we employed four data augmentation techniques using the ImgAug 3.2 software available at https://github.com/Fafa-DL/Image-Augmentation (URL accessed on 18 March 2024). These techniques included horizontal flipping, brightness adjustment, adding Gaussian noise, and motion blur, which were applied offline. As a result, we generated a dataset containing 5000 images. These augmentation methods better simulate field conditions, forcing the model to learn more robust features, thereby significantly improving its performance on unseen data. For annotation, we used the LabelMe 5.3.1 software, available at https://github.com/labelmeai/labelme (URL accessed on 20 March 2024). Because the augmentation transforms were also applied to the original labels, we only needed to annotate the original images and convert their labels, avoiding separate annotation of the augmented data. The annotation targets were divided into two categories, following the COCO dataset format requirements. The dataset was split into training, validation, and test sets in a ratio of 8:1:1. The training set was used for the model parameter training, the validation set for optimizing the hyperparameters during training, and the test set for evaluating the model’s generalization performance.
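As an illustration of the offline augmentation described above, the following sketch applies the four named transforms with the imgaug library; the parameter ranges and file names are assumptions, not the authors' settings.

```python
import imageio.v2 as imageio
import imgaug.augmenters as iaa

# One augmenter per technique named in the text; ranges are illustrative.
augmenters = [
    iaa.Fliplr(1.0),                           # horizontal flipping
    iaa.MultiplyBrightness((0.7, 1.3)),        # brightness adjustment
    iaa.AdditiveGaussianNoise(scale=(5, 20)),  # Gaussian noise
    iaa.MotionBlur(k=9),                       # motion blur
]

image = imageio.imread("ridge_seedlings.jpg")  # hypothetical source image
for idx, aug in enumerate(augmenters):
    imageio.imwrite(f"ridge_seedlings_aug{idx}.jpg", aug(image=image))
```

imgaug can also transform bounding boxes together with the images, which is one way annotations for augmented images can be derived from the original labels rather than re-annotated by hand.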

2.3. Improvement of YOLOv8n

YOLOv8n, an advancement in the YOLO series of object detection models, is engineered to deliver real-time performance while maintaining a high detection accuracy [23] and is available at https://github.com/ultralytics/ultralytics (URL accessed on 20 March 2024). This model is particularly suitable for applications requiring fast and efficient object detection with limited computational resources. In our research, we improved the YOLOv8n model to address the specific challenges of navigation line extraction in ridge-cultivated vegetable fields. We introduced the C2f_UIB module, which integrates the UIB (Universal Inverted Bottleneck) module with the NAS and weight transfer techniques to reduce the model parameters and redundant computations, thereby improving the model efficiency [37,40,41]. Additionally, we incorporated the SimAM attention mechanism into the neck section of the network, significantly enhancing the model’s ability to detect the seedlings and missed hills [42]. These adaptations make the YOLOv8n-based model an effective tool for accurate and reliable autonomous navigation in agricultural robotics. As shown in Figure 4, we replaced the original C2f module with the improved C2f_UIB module. Additionally, we integrated the parameter-free SimAM attention mechanism into the neck section.

2.4. Efficient C2f_UIB Block

In this section, we present the Efficient C2f_UIB block, a novel architectural component designed to enhance the performance of our navigation line extraction model while significantly reducing the computational complexity and model parameters. The UIB module is an innovative building block that enhances the efficiency and capacity of neural networks [40]. It extends the traditional Inverted Bottleneck by integrating additional depthwise convolutions and a more flexible structure that can be optimized through the NAS. As shown in Figure 5, the UIB module combines several key architectural elements: (1) an Extra Depthwise (ExtraDW) Variant, which adds additional depthwise convolutions to increase the network depth and receptive field without significantly increasing the computational load; (2) an Inverted Bottleneck (IB), which enhances feature representation through the spatial mixing of expanded feature activations; (3) a ConvNext Block, which efficiently mixes spatial features using larger kernel sizes; (4) a Feed Forward Network (FFN), which employs pointwise convolutions for effective channel mixing. In the context of the navigation line extraction task, the C2f_UIB block enables the model to capture fine-grained features of the crop rows by expanding the receptive field while maintaining the computational efficiency. The ExtraDW variant specifically helps in better delineating the subtle differences in the crop row structure, allowing for more precise detection of the navigation line, even in challenging conditions. The primary advantage of the C2f_UIB block lies in its ability to reduce the model parameters and floating-point operations (FLOPs) without compromising the performance. This is achieved through a combination of efficient architectural design and optimization techniques. The calculation formula of the model parameters is as follows (1):
P = k^2 C_{in} C_{out}
where k is the kernel size and Cin and Cout are the number of input and output channels, respectively. The calculation formula of the model’s FLOPs is as follows:
F = 2 P H W
where H and W are the spatial dimensions of the input tensor.
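To make Formulas (1) and (2) concrete, the short sketch below evaluates them for an example convolutional layer; the kernel size, channel counts, and feature-map size are arbitrary illustrative values.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Formula (1): parameters of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def conv_flops(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Formula (2): FLOPs = 2 * P * H * W for an H x W output feature map."""
    return 2 * conv_params(k, c_in, c_out) * h * w

# Example: 3x3 convolution, 64 -> 128 channels, 80 x 80 feature map.
print(conv_params(3, 64, 128))          # 73,728 parameters
print(conv_flops(3, 64, 128, 80, 80))   # ~943.7 MFLOPs
```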
The integration of the NAS allows the C2f_UIB block to find the optimal configurations for k, Cin, and Cout, minimizing P and F. The ExtraDW variant further reduces the computational costs by focusing on the depthwise operations, which are less computationally intensive compared to standard convolutions. The overall architecture, augmented by the Efficient C2f_UIB block, demonstrates a superior balance between computational efficiency and detection accuracy, making it a robust solution for practical deployment in resource-constrained environments.
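For illustration, a minimal PyTorch sketch of a UIB block in its ExtraDW configuration is given below, assuming the layout described for MobileNetV4 [40] (depthwise, pointwise expansion, depthwise, pointwise projection). It is not the authors' exact C2f_UIB implementation, and the expansion ratio and kernel size are assumed values.

```python
import torch
import torch.nn as nn

class UIBExtraDW(nn.Module):
    """Sketch of a UIB block, ExtraDW variant: two depthwise convolutions
    around a pointwise expansion, followed by a pointwise projection,
    with a residual connection when input and output shapes match."""
    def __init__(self, c_in: int, c_out: int, expand_ratio: int = 2, k: int = 3):
        super().__init__()
        c_mid = c_in * expand_ratio
        self.start_dw = nn.Sequential(                      # extra depthwise conv
            nn.Conv2d(c_in, c_in, k, padding=k // 2, groups=c_in, bias=False),
            nn.BatchNorm2d(c_in))
        self.expand = nn.Sequential(                        # pointwise expansion
            nn.Conv2d(c_in, c_mid, 1, bias=False),
            nn.BatchNorm2d(c_mid), nn.SiLU())
        self.mid_dw = nn.Sequential(                        # depthwise spatial mixing
            nn.Conv2d(c_mid, c_mid, k, padding=k // 2, groups=c_mid, bias=False),
            nn.BatchNorm2d(c_mid), nn.SiLU())
        self.project = nn.Sequential(                       # pointwise projection
            nn.Conv2d(c_mid, c_out, 1, bias=False),
            nn.BatchNorm2d(c_out))
        self.use_res = c_in == c_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.project(self.mid_dw(self.expand(self.start_dw(x))))
        return x + y if self.use_res else y
```

Because every spatial convolution here is depthwise and the channel mixing is done with 1 x 1 convolutions, the parameter and FLOP counts grow far more slowly than for standard convolutions of the same receptive field, which is the source of the savings claimed above.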

2.5. Integrating the SimAM Attention Mechanism

Attention mechanisms play an increasingly important role in CNNs by helping models focus on key regions of an image, thus enhancing performance [43]. However, existing attention mechanisms often require additional parameters, increasing the model’s complexity and computational cost. SimAM is a lightweight, parameter-free attention mechanism for CNNs that generates attention weights by computing the local self-similarity of feature maps [42]. SimAM does not introduce any additional parameters and can effectively improve the performance of CNNs.
Figure 6 illustrates the SimAM process for generating attention weights in a 3D feature space. The input feature map X with dimensions H × W × C undergoes three main steps: (1) Generation, where 3D attention weights are generated based on the local self-similarity of the feature map; (2) Expansion, where these 3D weights are expanded across the feature map; (3) Fusion, where the expanded weights are fused with the original feature map, enhancing the model’s ability to focus on critical information by prioritizing important features and suppressing irrelevant background noise.
This mechanism allows the network to dynamically adjust attention weights without adding extra parameters, thereby improving the performance and efficiency of the model.
The core idea of SimAM is based on the local self-similarity of images. In an image, adjacent pixels typically exhibit a strong similarity, while distant pixels show a weaker similarity. SimAM leverages this characteristic by calculating the similarity between each pixel and its neighboring pixels in the feature map to generate attention weights. The calculation formula for SimAM is as follows (3):
w_i = \frac{1}{k} \sum_{j \in N_i} s(f_i, f_j)
SimAM uses a simple and effective similarity measure, namely the Euclidean distance (4):
s(f_i, f_j) = -\| f_i - f_j \|_2^2
Formula (3) represents the attention weight w_i for each pixel i, calculated as the average similarity s(f_i, f_j) between pixel i and all neighboring pixels j within its neighborhood N_i. Here, k is the number of pixels in the neighborhood.
Formula (4) represents the similarity measure between two feature vectors f_i and f_j as the negative squared Euclidean distance: the smaller the distance, the higher the similarity and therefore the larger the resulting weight value.
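The PyTorch sketch below illustrates the weighting described by Formulas (3) and (4): each position's weight is the mean negative squared distance to its k x k neighbors, squashed to (0, 1) with a sigmoid before fusion with the feature map. It is a simplified, per-channel illustration of this idea, not the reference SimAM implementation; the window size and the sigmoid fusion step are assumptions.

```python
import torch
import torch.nn.functional as F

def local_self_similarity_attention(x: torch.Tensor, k: int = 3) -> torch.Tensor:
    """x: feature map of shape (B, C, H, W); returns the reweighted feature map."""
    B, C, H, W = x.shape
    pad = k // 2
    # Gather the k x k neighborhood of every spatial position.
    patches = F.unfold(x, kernel_size=k, padding=pad)      # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H, W)
    center = x.unsqueeze(2)                                  # (B, C, 1, H, W)
    # Formula (4): negative squared distance to each neighbor,
    # Formula (3): averaged over the neighborhood.
    weights = (-(patches - center) ** 2).mean(dim=2)         # (B, C, H, W)
    # Fusion: map weights to (0, 1) and reweight the features.
    return x * torch.sigmoid(weights)

# Example: apply the attention to a random 64-channel feature map.
feat = torch.randn(1, 64, 40, 40)
print(local_self_similarity_attention(feat).shape)  # torch.Size([1, 64, 40, 40])
```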
By incorporating SimAM into our YOLOv8n, we significantly enhanced its ability to distinguish between seedlings and missed hills, thereby providing more reliable and accurate navigation for agricultural robotics. This integration not only improved the model’s accuracy but also maintained computational efficiency.

2.6. Model Training and Evaluation

2.6.1. Hardware Platform and Hyperparameter Settings

In this research, the experimental hardware setup featured an Intel i7-13700KF CPU (Intel Corporation, Santa Clara, CA, USA), paired with an RTX 4080 GPU (NVIDIA Corporation, Santa Clara, CA, USA), and equipped with 32 GB of RAM (Samsung Electronics, Suwon, South Korea), all running on the Windows 10 operating system (Microsoft Corporation, Redmond, WA, USA). The software used for this study was developed in Python 3.8.15, utilizing the PyTorch 1.13.0 deep learning framework. To ensure optimal performance, hyperparameter tuning was performed, with the detailed hyperparameter configurations provided in Table 1.
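As an illustration, training with the hyperparameters listed in Table 1 could be launched through the Ultralytics API roughly as follows. The model configuration file, dataset YAML, and input size are hypothetical: a real run would load a configuration that registers the C2f_UIB and SimAM modules described above.

```python
from ultralytics import YOLO

# Hypothetical model config describing the SN-CNN architecture; starting
# weights are transferred from the official YOLOv8n checkpoint (Table 1).
model = YOLO("sn_cnn.yaml").load("yolov8n.pt")

model.train(
    data="broccoli_seedlings.yaml",  # hypothetical dataset description file
    epochs=200,
    batch=32,
    lr0=0.001667,
    momentum=0.9,
    weight_decay=0.0005,
    imgsz=640,                       # assumed input size; not stated in the paper
)
```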

2.6.2. Model Evaluation

To evaluate the quality of the navigation line extraction for broccoli seedlings, we used metrics that consider both the precision and recall of detecting seedlings and missed hills, which are critical for accurate line fitting.
Formula (5)’s precision P measures the accuracy of detecting seedlings and missed hills, calculated as the ratio of true positive detections (TP) to the sum of true positive and false positive detections (FP):
P = \frac{TP}{TP + FP}
Formula (6)’s recall R measures the model’s ability to detect all actual seedlings and missed hills, calculated as the ratio of true positive detections to the sum of true positive and false negative detections (FN):
R = \frac{TP}{TP + FN}
Formula (7)’s AP evaluates the precision of the model for a single class (either seedlings or missed hills) across different recall levels. Formula (8)’s mAP provides an overall performance metric by averaging the AP values across all classes:
AP = \int_{0}^{1} P(R) \, dR
mAP = \frac{1}{C} \sum_{i=1}^{C} AP_i
We used the fitted navigation lines’ accuracy to evaluate the overall performance. The navigation lines are derived by fitting a line through the midpoints of the bounding boxes of detected seedlings and missed hills. The fitting accuracy is assessed by calculating the root mean square error (RMSE) between the predicted navigation line and the truth line. As shown in Formula (9), this metric captures the deviation of the predicted line from the actual line.
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{true,i} - y_{fit,i} \right)^2}
where y_{true,i} and y_{fit,i} represent the values of the true navigation line and the fitted navigation line at the i-th point, respectively, and n is the total number of points.
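A minimal sketch of Formula (9) is given below, assuming both lines are parameterized as x = a*y + b (as in the fitting sketch earlier) and sampled at the same image rows; the line coefficients and sampling rows are illustrative values.

```python
import numpy as np

def line_rmse(a_true, b_true, a_fit, b_fit, rows):
    """RMSE (pixels) between two lines x = a*y + b sampled at the given image rows."""
    x_true = a_true * rows + b_true
    x_fit = a_fit * rows + b_fit
    return float(np.sqrt(np.mean((x_true - x_fit) ** 2)))

# Example: compare a fitted line against a hand-labeled ground-truth line
# at 20 evenly spaced rows of a 1080-pixel-high image.
rows = np.linspace(0, 1079, 20)
print(line_rmse(0.02, 310.0, 0.018, 313.5, rows))
```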
The mAP@0.5 signifies that a detection is deemed positive when the Intersection over Union (IoU) between the detection frame and the ground truth exceeds 0.5. This metric serves as a comprehensive evaluation of the model precision. We evaluated the size and computational cost of the model by its number of parameters and FLOPs. These metrics are crucial for deployment on edge computing platforms. By considering all these indicators, we can understand the model’s detection performance in a single category, as well as its overall performance.

3. Results

3.1. Comparison of SN-CNN and the Baseline’s Training Loss Functions

We trained the model under the same experimental conditions. Figure 7 displays the loss function graphs for the SN-CNN and the baseline (yolov8n) model over 200 epochs. The SN-CNN model exhibits a more rapid decrease in training loss compared to the baseline model during the initial epochs. Both models show a similar overall convergence pattern, with the SN-CNN model maintaining a consistently lower training loss throughout the training process. The Val Loss graph presents the validation loss comparison. During the first 150 epochs, the baseline model’s validation loss decreases slightly faster than that of the SN-CNN model. However, after 150 epochs, the SN-CNN model demonstrates a more significant reduction in the validation loss. Both models’ loss values stabilize after 200 epochs, indicating that they have effectively learned the image features and converged to optimal solutions. Overall, the SN-CNN model shows an improved performance in terms of reducing the training and validation losses, especially in the later stages of training. This suggests that the SN-CNN model is more efficient in learning and generalizing from the data.

3.2. Comparison of SN-CNN Performance with Existing Algorithms

Figure 8 shows the precision–recall (PR) curves for four different models—(a) YOLOv5s, (b) YOLOv7-tiny, (c) YOLOv8n, and (d) SN-CNN—evaluated on the task of detecting seedlings (s) and missed hills (m). Across models YOLOv5s, YOLOv7-tiny, and YOLOv8n, the detection AP values for seedlings are consistently high (around 97%), whereas the AP values for missed hills are noticeably lower (ranging from 88% to 89.9%). This discrepancy indicates a common challenge in accurately detecting missed hills compared to seedlings. The improved SN-CNN model, however, shows significant enhancements, achieving the highest overall mAP@0.5 of 94.6%, with an AP of 96.9% for seedlings and 92.3% for missed hills. These results demonstrate that the SN-CNN model effectively addresses the detection challenges for missed hills.
Table 2 provides a comprehensive comparison of various models, including YOLOv5s, YOLOv7-tiny, YOLOv8n, YOLOv8s, and the improved SN-CNN. The metrics include the number of parameters, FLOPs, P, R, and mAP@0.5. This analysis is based on the combined performance across all categories, while the previous PR curve analysis focused on individual categories. In the analysis of the four baseline models, YOLOv8s demonstrates the highest detection accuracy with an mAP@0.5 of 93.5%. However, it also has the largest number of parameters (11.4 M) and FLOPs (29.4 G), which is not favorable for lightweight deployment. Conversely, YOLOv8n achieves a better balance between accuracy and computational efficiency, with fewer parameters (3.31 M) and lower FLOPs (8.7 G), while maintaining a respectable mAP@0.5 of 92.6%. Given this balance, we chose YOLOv8n as the foundation for our improvements. The resulting SN-CNN model further optimizes this balance, reducing the number of parameters to 2.37 M and FLOPs to 6.7 G, while achieving the highest mAP@0.5 of 94.6%. This indicates significant improvements in both efficiency and performance.
Figure 9 presents a comparative analysis of the models’ parameters and mAP@0.5. The bar chart illustrates the number of parameters, while the line graph shows the corresponding mAP@0.5 values for each model. Figure 9 effectively illustrates that the SN-CNN model achieves the best balance of a high detection accuracy and low computational requirements. With the smallest parameter count and FLOPs among all models, SN-CNN’s high mAP@0.5 of 94.6% highlights its efficiency and suitability for real-time deployment in resource-constrained environments. This underscores the significant advancements made through our improvements on the YOLOv8n architecture.

3.3. Ablation Experiment

Table 3 presents the results of the ablation experiments comparing different models (M1 to M4) with various combinations of the C2f_UIB and SimAM modules. The performance metrics include the parameters, FLOPs, P, R, and mAP@0.5. This analysis highlights the impact of these modules on the model efficiency and detection performance.
M1 (Baseline YOLOv8n): Without the C2f_UIB and SimAM modules, the model has 3.31 M parameters and 8.7 G FLOPs, achieving a precision of 88.4%, a recall of 89.3%, and an mAP@0.5 of 92.6%.
M2 (C2f_UIB Only): Integrating the C2f_UIB module reduces the parameters to 2.37 M and FLOPs to 6.7 G. The precision improves to 90.2%, but the recall slightly decreases to 88.3%, resulting in an mAP@0.5 of 92.8%.
M3 (SimAM Only): Adding the SimAM module to the baseline YOLOv8n maintains the parameters and FLOPs at 3.31 M and 8.7 G, respectively. However, there is a significant improvement in precision (92.6%) and recall (90.5%), with the mAP@0.5 increasing to 93.7%.
M4 (C2f_UIB and SimAM): Combining both C2f_UIB and SimAM modules yields the best results. The model achieves the lowest parameters (2.37 M) and FLOPs (6.7 G), with the highest precision (93.9%), recall (90.2%), and mAP@0.5 (94.6%).
The ablation experiments clearly demonstrate that the inclusion of both the C2f_UIB and SimAM modules significantly enhances the model’s performance. M4, which incorporates both modules, shows the best balance between efficiency and accuracy, highlighting the effectiveness of the improvements. The reduction in parameters and FLOPs, combined with the highest precision and recall, confirms that these modules contribute substantially to the performance of the SN-CNN model compared to the baseline.

3.4. Performance Evaluation of Navigation Line Fitting on Jetson AGX Xavier

The visual navigation chassis and its structural components are shown in Figure 10. We converted the SN-CNN algorithm into a TensorRT engine file and deployed it on the Jetson AGX Xavier, an embedded computing platform developed by NVIDIA that is widely used for high-performance applications. To evaluate the navigation line fitting performance, Table 4 compares two methods: LS and RANSAC. The metrics used for this evaluation are the number of images, the RMSE, and the processing time in milliseconds per image (ms/image). Both methods are evaluated on the same number of images (100 total: 50 validation images and 50 field-collected images) to ensure a fair comparison. This consistency allows us to directly compare the RMSE and processing time metrics without concern for sample size bias. As shown in Formula (9), RMSE is a standard measure of the differences between the values predicted by a model and the actual values observed. It provides a quantitative measure of the accuracy of the navigation line fitting. Lower RMSE values indicate a better fitting performance, meaning that the predicted navigation line closely matches the actual navigation line. The processing time in milliseconds per image is crucial for real-time applications. A lower processing time indicates a more efficient algorithm, which is essential for real-time navigation in agricultural robotics. The RANSAC method shows a lower RMSE of 5.7 pixels compared to the LS method’s RMSE of 6.8 pixels. This indicates that RANSAC provides a more accurate fit for the navigation line, better matching the actual navigation path. The LS method has a lower processing time of 22 milliseconds per image compared to RANSAC’s 25 milliseconds. While LS is faster, the difference in processing time is relatively small. Given that real-time applications often prioritize accuracy over slight differences in processing time, the trade-off for a lower RMSE with RANSAC is acceptable.
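For reference, exporting the trained detector to a TensorRT engine and timing per-image inference on the Jetson could look roughly like the sketch below; the weight filename and frame list are hypothetical, and the export options (half precision, image size) are assumptions rather than the authors' deployment settings.

```python
import time
from ultralytics import YOLO

# Export the trained model (hypothetical weights file) to a TensorRT engine.
YOLO("sn_cnn_best.pt").export(format="engine", half=True, imgsz=640)

# Load the engine and measure average per-image latency on sample frames.
engine = YOLO("sn_cnn_best.engine")
frames = [f"frame_{i:03d}.jpg" for i in range(50)]   # hypothetical test images
start = time.perf_counter()
for frame in frames:
    engine.predict(frame, verbose=False)
elapsed_ms = (time.perf_counter() - start) * 1000 / len(frames)
print(f"average latency: {elapsed_ms:.1f} ms/image")
```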
As shown in Figure 11, the detection results of the two methods are illustrated. Figure 11a–c display the performance on continuous rows of seedlings. Specifically, Figure 11a shows the LS method applied to a continuous row. Figure 11b shows the RANSAC method applied to the same row. Figure 11c provides a direct comparison of the LS and RANSAC methods with the actual navigation line in the same field of view. Similarly, Figure 11d–f demonstrate the results for rows with missing seedlings: Figure 11d shows the LS method applied to a row with missing seedlings; Figure 11e shows the RANSAC method applied to the same row; and Figure 11f compares both methods with the actual navigation line in the same field of view. The RANSAC method demonstrated a higher accuracy in both continuous and missing seedling scenarios, with a lower RMSE of 5.7 pixels compared to the LS method’s RMSE of 6.8 pixels.

4. Discussion

Existing algorithms typically perform well on complex public datasets, but specific tasks often require addressing unique challenges. In our navigation line detection task, there are only two categories, with the “missed hill” category being difficult to recognize due to its similarity to the background. Additionally, agricultural robots have limited computational resources, making smaller models more suitable for edge deployment. The challenge of achieving a high accuracy with a low computational load increases the difficulty of model development.
To address these challenges, we developed the SN-CNN deep learning model. The proposed C2f_UIB module reduces redundant computations without sacrificing accuracy, improving model efficiency. Additionally, integrating the SimAM attention mechanism in the neck section enhances the model’s focus on hard-to-recognize samples without increasing the computational load. As shown in Figure 8, the accuracy of detecting “missed hills” significantly improves. Figure 9 demonstrates that the SN-CNN model uses only 2.37 M parameters and achieves an mAP@0.5 of 94.6%. This is fewer than the 19.38 M parameters reported in [25] and outperforms existing models like YOLOv5s, YOLOv7-tiny, YOLOv8n, and YOLOv8s. Compared to the baseline model, the parameter count is reduced by 28.4%, and the mAP@0.5 is increased by 2%.
The model was deployed on the Jetson AGX Xavier, and two fitting methods were compared. The RANSAC achieves a low RMSE of 5.7 pixels and a processing speed of 25 ms per image. Compared to the E2CropDet model [22], which achieves a lateral deviation of 5.945 pixels for the centerline extraction, our lightweight SN-CNN model achieves an even lower error of 5.7 pixels. Additionally, our model outperforms the semantic segmentation-based (7.153 pixels) and Hough transform-based (17.834 pixels) approaches reported in their study. Compared to the study in [44], where the best-performing model achieved a navigation line error of 9.59 pixels, our SN-CNN model significantly outperforms with a lower error of 5.7 pixels, while also being lightweight and efficient for real-time agricultural robot navigation. While the algorithm in [24] processes images at 25 ms per frame with a frame rate exceeding 40 FPS, our SN-CNN model offers similar real-time performance. The RANSAC’s higher accuracy is attributed to its robustness to outliers, as it iteratively selects random subsets of data to fit the model and evaluates the consensus set of inliers that fit the model well. This is crucial for ridge-planted vegetable navigation line fitting, as seedling lodging or poor uprightness can cause significant deviations in the bounding box center points, as illustrated in Figure 11a,b. Ultimately, the SN-CNN model achieved a good balance between detection accuracy, speed, and model parameter count.
Applicability: The SN-CNN model is particularly well suited for navigation tasks in post-transplant scenarios, such as guiding replanting robots, field inspection, and precision spraying in ridge-planted vegetables. Specifically, our model is designed for replanting tasks in ridge-planted broccoli seedlings 20 to 40 days after transplanting. It can guide replanting robots along the ridges accurately and in real time, ensuring precise navigation without damaging the ridges or crops.
Limitations: This model was developed and tested on a dataset collected from a specific region, focusing on broccoli seedlings. Its adaptability to different regions, environments, and vegetable seedling varieties may be limited. Future research should aim to gather datasets from various regions and cover a wider range of vegetable types to enhance the model’s generalizability. Additionally, while our dataset is relatively large, collecting high-quality data requires significant preparation. The labeling process is also labor-intensive, which could be a limiting factor in further research. To address data scarcity, future work could explore techniques such as transfer learning, few-shot learning, semi-supervised learning, and unsupervised learning, which could reduce the dependency on large labeled datasets and improve the model’s robustness and adaptability [45,46].

5. Conclusions

In this study, we successfully developed the lightweight SN-CNN model to address the critical challenges of navigation line detection in ridge-planted vegetables, including low accuracy that can lead to damage to ridges and crops, as well as the issue of missing seedling rows. Given the limited computational resources of field agricultural robots, our model was designed to be both efficient and precise. By incorporating the C2f_UIB module, which utilizes the NAS and weight transfer techniques, we significantly reduced computational redundancy and improved model efficiency without sacrificing accuracy. The inclusion of the SimAM attention mechanism further enhanced the model’s ability to focus on difficult-to-recognize samples, resulting in superior performance metrics.
Our experimental results demonstrate that the SN-CNN model achieves an [email protected] of 94.6% with only 2.37 M parameters, outperforming state-of-the-art models such as YOLOv5s, YOLOv7-tiny, YOLOv8n, and YOLOv8s, as well as the other relevant methods reported in the literature. Deployed on the Jetson AGX Xavier platform and combined with the RANSAC method, the model achieves a low RMSE of 5.7 pixels and a processing time of 25 milliseconds per image. This makes it a highly effective solution for autonomous navigation in field cross-ridge walking robots.
The SN-CNN model is particularly well suited for post-transplant navigation tasks in ridge-planted vegetables, such as guiding replanting robots, field inspections, and precision spraying. However, its current adaptation is primarily based on a dataset from a specific region, focusing on broccoli seedlings. Future work should expand the model’s applicability to a broader range of environments and crop types, utilizing advanced techniques such as transfer learning and semi-supervised learning to overcome the challenges of data scarcity and the burdens of large-scale annotation.
In conclusion, the SN-CNN model offers a balanced approach to navigation line detection, combining accuracy, efficiency, and real-time performance, thus advancing precision agriculture technologies.

Author Contributions

Conceptualization, T.Z. and J.H.; methodology, T.Z., J.H. and J.Z.; software, T.Z. and W.L.; validation, T.Z. and J.H.; formal analysis, T.Z., J.H. and C.Z.; investigation, R.Y. and J.Z.; resources, J.H.; data curation, C.Z. and J.S.; writing—original draft preparation, T.Z.; and writing—review and editing, T.Z., J.H. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Research and Development of Key Intelligent Technologies for Fully Automated Lettuce Transplanting Equipment (2023-02-08-00-12-F04592); the Shanghai Green Leafy Vegetable Industry Technology System Construction—Development and Application of Key Technologies for High-Density Green Leafy Vegetable Transplanting [Shanghai Nongke (2023) no. 2]; the Jiangsu Province Agricultural Science and Technology Independent Innovation Fund Project (CX (22)2022); the Jiangsu Province Key Research and Development Plan—Modern Agriculture Project (BE2021342); and the Priority Academic Program Development of Jiangsu Higher Education Institutions (no. PAPD-2023-87).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Karunathilake, E.M.B.M.; Le, A.T.; Heo, S.; Chung, Y.S.; Mansoor, S. The Path to Smart Farming: Innovations and Opportunities in Precision Agriculture. Agriculture 2023, 13, 1593. [Google Scholar] [CrossRef]
  2. Liu, X.; Wang, Y.; Yan, X.; Hou, H.; Liu, P.; Cai, T.; Zhang, P.; Jia, Z.; Ren, X.; Chen, X. Appropriate ridge-furrow ratio can enhance crop production and resource use efficiency by improving soil moisture and thermal condition in a semi-arid region. Agric. Water Manag. 2020, 240, 106289. [Google Scholar] [CrossRef]
  3. Jin, Y.; Liu, J.; Xu, Z.; Yuan, S.; Li, P.; Wang, J. Development status and trend of agricultural robot technology. Int. J. Agric. Biol. Eng. 2021, 14, 1–19. [Google Scholar] [CrossRef]
  4. Ma, G.; Mao, H.; Han, L.; Liu, Y.; Gao, F. Reciprocating mechanism for whole row automatic seedling picking and dropping on a transplanter. Appl. Eng. Agric. 2020, 36, 751–766. [Google Scholar] [CrossRef]
  5. Zhao, S.; Liu, J.; Jin, Y.; Bai, Z.; Liu, J.; Zhou, X. Design and Testing of an Intelligent Multi-Functional Seedling Transplanting System. Agronomy 2022, 12, 2683. [Google Scholar] [CrossRef]
  6. Han, L.; Mo, M.; Gao, Y.; Ma, H.; Xiang, D.; Ma, G.; Mao, H. Effects of new compounds into substrates on seedling qualities for efficient transplanting. Agronomy 2022, 12, 983. [Google Scholar] [CrossRef]
  7. Zhang, T.; Zhou, J.; Liu, W.; Yue, R.; Yao, M.; Shi, J.; Hu, J. Seedling-YOLO: High-Efficiency Target Detection Algorithm for Field Broccoli Seedling Transplanting Quality Based on YOLOv7-Tiny. Agronomy 2024, 14, 931. [Google Scholar] [CrossRef]
  8. Wu, T.; Zhang, Q.; Wu, J.; Liu, Q.; Su, J.; Li, H. An improved YOLOv5s model for effectively predict sugarcane seed replenishment positions verified by a field re-seeding robot. Comput. Electron. Agric. 2023, 214, 108280. [Google Scholar] [CrossRef]
  9. Jin, X.; Zhu, X.; Xiao, L.; Li, M.; Li, S.; Zhao, B.; Ji, J. YOLO-RDS: An efficient algorithm for monitoring the uprightness of seedling transplantation. Comput. Electron. Agric. 2024, 218, 108654. [Google Scholar]
  10. Sun, X.; Miao, Y.; Wu, X.; Wang, Y.; Li, Q.; Zhu, H.; Wu, H. Cabbage Transplantation State Recognition Model Based on Modified YOLOv5-GFD. Agronomy 2024, 14, 760. [Google Scholar] [CrossRef]
  11. Holzinger, A.; Fister, I., Jr.; Fister, I.; Kaul, H.-P.; Asseng, S. Human-Centered AI in smart farming: Towards Agriculture 5.0. IEEE Access 2024, 12, 62199–62214. [Google Scholar] [CrossRef]
  12. Radočaj, D.; Plaščak, I.; Jurišić, M. Global Navigation Satellite Systems as State-of-the-Art Solutions in Precision Agriculture: A Review of Studies Indexed in the Web of Science. Agriculture 2023, 13, 1417. [Google Scholar] [CrossRef]
  13. Wang, T.; Chen, B.; Zhang, Z.; Li, H.; Zhang, M. Applications of machine vision in agricultural robot navigation: A review. Comput. Electron. Agric. 2022, 198, 107085. [Google Scholar] [CrossRef]
  14. Ruangurai, P.; Dailey, M.N.; Ekpanyapong, M.; Soni, P. Optimal vision-based guidance row locating for autonomous agricultural machines. Precis. Agric. 2022, 23, 1205–1225. [Google Scholar] [CrossRef]
  15. Kanagasingham, S.; Ekpanyapong, M.; Chaihan, R. Integrating machine vision-based row guidance with GPS and compass-based routing to achieve autonomous navigation for a rice field weeding robot. Precis. Agric. 2020, 21, 831–855. [Google Scholar] [CrossRef]
  16. Liu, W.; Hu, J.; Liu, J.; Yue, R.; Zhang, T.; Yao, M.; Li, J. Method for the navigation line recognition of the ridge without crops via machine vision. Int. J. Agric. Biol. Eng. 2024, 17, 230–239. [Google Scholar]
  17. Shi, J.; Bai, Y.; Diao, Z.; Zhou, J.; Yao, X.; Zhang, B. Row Detection BASED Navigation and Guidance for Agricultural Robots and Autonomous Vehicles in Row-Crop Fields: Methods and Applications. Agronomy 2023, 13, 1780. [Google Scholar] [CrossRef]
  18. Chen, J.; Qiang, H.; Wu, J.; Xu, G.; Wang, Z. Navigation path extraction for greenhouse cucumber-picking robots using the prediction-point Hough transform. Comput. Electron. Agric. 2021, 180, 105911. [Google Scholar] [CrossRef]
  19. Ospina, R.; Noguchi, N. Simultaneous mapping and crop row detection by fusing data from wide angle and telephoto images. Comput. Electron. Agric. 2019, 162, 602–612. [Google Scholar] [CrossRef]
  20. Rabab, S.; Badenhorst, P.; Chen, Y.-P.P.; Daetwyler, H.D. A template-free machine vision-based crop row detection algorithm. Precis. Agric. 2021, 22, 124–153. [Google Scholar] [CrossRef]
  21. Hamuda, E.; Mc Ginley, B.; Glavin, M.; Jones, E. Automatic crop detection under field conditions using the HSV colour space and morphological operations. Comput. Electron. Agric. 2017, 133, 97–107. [Google Scholar] [CrossRef]
  22. Li, D.; Li, B.; Kang, S.; Feng, H.; Long, S.; Wang, J. E2CropDet: An efficient end-to-end solution to crop row detection. Expert Syst. Appl. 2023, 227, 120345. [Google Scholar] [CrossRef]
  23. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  24. Yang, Y.; Zhou, Y.; Yue, X.; Zhang, G.; Wen, X.; Ma, B.; Xu, L.; Chen, L. Real-time detection of crop rows in maize fields based on autonomous extraction of ROI. Expert Syst. Appl. 2023, 213, 118826. [Google Scholar] [CrossRef]
  25. Diao, Z.; Guo, P.; Zhang, B.; Zhang, D.; Yan, J.; He, Z.; Zhao, S.; Zhao, C.; Zhang, J. Navigation line extraction algorithm for corn spraying robot based on improved YOLOv8s network. Comput. Electron. Agric. 2023, 212, 108049. [Google Scholar] [CrossRef]
  26. Liu, T.-H.; Zheng, Y.; Lai, J.-S.; Cheng, Y.-F.; Chen, S.-Y.; Mai, B.-F.; Liu, Y.; Li, J.-Y.; Xue, Z. Extracting visual navigation line between pineapple field rows based on an enhanced YOLOv5. Comput. Electron. Agric. 2024, 217, 108574. [Google Scholar] [CrossRef]
  27. Diao, Z.; Ma, S.; Zhang, D.; Zhang, J.; Guo, P.; He, Z.; Zhao, S.; Zhang, B. Algorithm for Corn Crop Row Recognition during Different Growth Stages Based on ST-YOLOv8s Network. Agronomy 2024, 14, 1466. [Google Scholar] [CrossRef]
  28. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
  29. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  30. Lin, Y.; Chen, T.; Liu, S.; Cai, Y.; Shi, H.; Zheng, D.; Lan, Y.; Yue, X.; Zhang, L. Quick and accurate monitoring peanut seedlings emergence rate through UAV video and deep learning. Comput. Electron. Agric. 2022, 197, 106938. [Google Scholar] [CrossRef]
  31. Cui, J.; Zheng, H.; Zeng, Z.; Yang, Y.; Ma, R.; Tian, Y.; Tan, J.; Feng, X.; Qi, L. Real-time missing seedling counting in paddy fields based on lightweight network and tracking-by-detection algorithm. Comput. Electron. Agric. 2023, 212, 108045. [Google Scholar] [CrossRef]
  32. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. Ghostnet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 1580–1589. [Google Scholar]
  33. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856. [Google Scholar]
  34. Ji, W.; Pan, Y.; Xu, B.; Wang, J. A Real-Time Apple Targets Detection Method for Picking Robot Based on ShufflenetV2-YOLOX. Remote Sens. 2022, 12, 856. [Google Scholar] [CrossRef]
  35. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
  36. Gong, H.; Wang, X.; Zhuang, W.J.A. Research on Real-Time Detection of Maize Seedling Navigation Line Based on Improved YOLOv5s Lightweighting Technology. Agriculture 2024, 14, 124. [Google Scholar] [CrossRef]
  37. Andújar, D.; Rueda-Ayala, V.; Moreno, H.; Rosell-Polo, J.R.; Escolá, A.; Valero, C.; Gerhards, R.; Fernández-Quintanilla, C.; Dorado, J.; Griepentrog, H.-W. Discriminating Crop, Weeds and Soil Surface with a Terrestrial LIDAR Sensor. Sensors 2013, 13, 14662–14675. [Google Scholar] [CrossRef]
  38. Yao, M.; Hu, J.; Liu, W.; Yue, R.; Zhu, W.; Zhang, Z. Positioning control method for the seedling tray of automatic transplanters based on interval analysis. Trans. Chin. Soc. Agric. Eng. 2023, 39, 27–36. [Google Scholar]
  39. Yu, G.; Lei, W.; Liang, S.; Xiong, Z.; Ye, B. Advancement of mechanized transplanting technology and equipments for field crops. Trans. Chin. Soc. Agric. Mach. 2022, 53, 1–20. [Google Scholar]
  40. Qin, D.; Leichner, C.; Delakis, M.; Fornoni, M.; Luo, S.; Yang, F.; Wang, W.; Banbury, C.; Ye, C.; Akin, B. MobileNetV4-Universal Models for the Mobile Ecosystem. arXiv 2024, arXiv:2404.10518. [Google Scholar]
  41. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  42. Yang, L.; Zhang, R.-Y.; Li, L.; Xie, X. Simam: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 11863–11874. [Google Scholar]
  43. Zhang, W.; Zhao, W.; Li, J.; Zhuang, P.; Sun, H.; Xu, Y.; Li, C. CVANet: Cascaded visual attention network for single image super-resolution. Neural Netw. 2024, 170, 622–634. [Google Scholar] [CrossRef]
  44. Yu, J.; Zhang, J.; Shu, A.; Chen, Y.; Chen, J.; Yang, Y.; Tang, W.; Zhang, Y.J. Study of convolutional neural network-based semantic segmentation methods on edge intelligence devices for field agricultural robot navigation line extraction. Comput. Electron. Agric. 2023, 209, 107811. [Google Scholar] [CrossRef]
  45. Tian, Y.; Zhang, W.; Su, P.; Xu, Y.; Zhuang, P.; Xie, X.; Zhao, W. S4: Self-Supervised learning with Sparse-dense Sampling. Knowl.-Based Syst. 2024, 299, 112040. [Google Scholar] [CrossRef]
  46. Zhang, W.; Li, Z.; Li, G.; Zhuang, P.; Hou, G.; Zhang, Q.; Li, C. Gacnet: Generate adversarial-driven cross-aware network for hyperspectral wheat variety identification. IEEE Trans. Geosci. Remote Sens. 2023, 62, 5503314. [Google Scholar] [CrossRef]
Figure 1. Classification of seedling and missed hill in ridge-cultivated vegetables.
Figure 2. Dataset collection sites and seedling transplantation methods. Figure (A) represents the dataset collection site; (B) is the seedling factory; and (C–E) show the broccoli seedlings transplanted by the three different transplanters.
Figure 3. Some example images from the dataset: (a) continuous seedling row; (b) row with gaps; and (c) row with severe gaps.
Figure 4. Network structure of the SN-CNN.
Figure 5. The structure of the UIB block.
Figure 6. Generation of attention weights by SimAM in a 3D feature space.
Figure 7. Loss function curve comparison between improved model and baseline (YOLOv8n).
Figure 8. Comparison of P-R curves of four different models: (a) YOLOv5s; (b) YOLOv7-tiny; (c) YOLOv8n; and (d) SN-CNN.
Figure 9. Comparison of model parameters and mAP@0.5 across different YOLO variants and SN-CNN.
Figure 10. Machine vision navigation chassis.
Figure 11. Comparison of navigation line fitting methods: (a) LS method; (b) RANSAC method; (c) comparison of LS and RANSAC methods with the actual navigation line in a continuous row; (d) LS method; (e) RANSAC method; and (f) comparison of LS and RANSAC methods with the actual navigation line in a row with missing seedlings. Note: In the picture, s stands for seedling and m stands for missed hill; (a–c) demonstrate the comparison of different methods within the same field of view, and, similarly, (d–f) illustrate the same comparison.
Table 1. Hyperparameter settings for SN-CNN training.
Set | Values
Lr0 | 0.001667
Momentum | 0.9
Weight decay | 0.0005
Batch size | 32
Epochs | 200
Pre-trained weight | YOLOv8n.pt
Table 2. Performance comparison of different algorithms.
Networks | Parameters | FLOPs | P | R | mAP@0.5
YOLOv5s | 7.3 M | 17.5 G | 91% | 88.4% | 93.1%
YOLOv7-tiny | 6.35 M | 14.2 G | 91.2% | 89.1% | 93.4%
YOLOv8n | 3.31 M | 8.7 G | 88.4% | 89.3% | 92.6%
YOLOv8s | 11.4 M | 29.4 G | 91.5% | 89.2% | 93.5%
SN-CNN (ours) | 2.37 M | 6.7 G | 93.9% | 90.2% | 94.6%
Table 3. Comparisons of ablation experiments.
Model | C2f_UIB | SimAM | Parameters | FLOPs | P | R | mAP@0.5
M1 | × | × | 3.31 M | 8.7 G | 88.4% | 89.3% | 92.6%
M2 | √ | × | 2.37 M | 6.7 G | 90.2% | 88.3% | 92.8%
M3 | × | √ | 3.31 M | 8.7 G | 92.6% | 90.5% | 93.7%
M4 | √ | √ | 2.37 M | 6.7 G | 93.9% | 90.2% | 94.6%
× indicates that the module is not used, while √ indicates that the module is used in the model.
Table 4. Comparison of navigation line fitting methods.
Methods | Number of Images | RMSE (Pixels) | ms/Image
SN-CNN + LS | 100 | 6.8 | 22
SN-CNN + RANSAC | 100 | 5.7 | 25