1. Introduction
Pests, including caterpillars, threaten crop yields, increase production costs, and damage the environment, ultimately impacting food security and agricultural sustainability. Understanding the economic implications and implementing effective pest control strategies are crucial for mitigating these challenges. Studies such as those by Savary et al. [1] highlight the global economic burden of pests and underscore the urgent need for effective management. Modern pest control has evolved to incorporate various innovative strategies, with integrated pest management (IPM) gaining prominence, as discussed by Auclair et al. [2]. Additionally, McCullough [3] illustrates the importance of targeted approaches, such as those employed against the emerald ash borer, to effectively address specific pest threats.
In recent years, advances in deep learning and computer vision have driven transformative changes across many sectors, particularly precision agriculture, in areas such as automated disease detection and pest identification and control. Andrew et al. [4] and Mohanty et al. [5] used publicly available datasets such as PlantVillage to study the feasibility of deep learning methods for plant disease identification, achieving high accuracy and precision in identifying leaf diseases. Ramcharan et al. [6] utilized a dataset from Tanzania and employed deep learning methods to identify five leaf diseases of the cassava plant with a correct detection rate of at least 95%. In another example, Caldeira et al. [7] employed GoogleNet and ResNet50 models to identify cotton leaf lesions, achieving a detection accuracy above 85%. Another critical challenge in orchard management is the effective detection of pests, especially caterpillars, which can cause significant crop damage. Deep learning has emerged as a powerful tool for pest identification, with researchers developing models that can accurately detect and classify pests across various crops. For instance, Selvaraj et al. [8] employed deep learning techniques to recognize banana diseases and pests, achieving high detection accuracy. Similarly, Yang et al. [9] and Hassan et al. [10] utilized object detection algorithms, such as Faster R-CNN, to identify pests in maize and rice crops, respectively. The application of deep learning for pest detection extends beyond traditional crop fields. Kasinathan et al. [11] successfully applied deep learning to detect and recognize pest insects in open-field crops, while Mamdouh and Khattab [12] developed a real-time pest recognition system specifically for olive trees. Additionally, Tetila et al. [13] analyzed the YOLO (You Only Look Once) detector for the real-time detection of soybean pests. Li et al. [14] employed an improved version of the YOLOv5 detector for the accurate detection and counting of aphids in pepper plants, while Wang et al. [15] utilized a revamped Faster R-CNN based on the ResNet101 feature extractor to identify common apple pests in orchards. These examples underscore the versatility and effectiveness of deep learning in pest detection across diverse agricultural environments.
This study focuses on the development of a deep learning-based system designed for the detection and tracking of caterpillars in orchard environments, with the potential for subsequent interventions, such as laser-based elimination. The approach integrates object detection algorithms with fast tracking techniques to create an efficient and adaptable tool for real-time pest detection and tracking. The primary objective is to achieve precise identification and continuous tracking of caterpillars, even under challenging conditions, such as occlusions, wind interference, and the variable lighting typical of natural orchard settings. Despite significant advancements in object detection and tracking technologies, several critical research gaps persist in the domain of agricultural pest monitoring. First, specialized datasets for caterpillar detection in real orchard environments that adequately represent varying illumination conditions, occlusion scenarios, and weather variations remain scarce. Second, existing systems have paid insufficient attention to tracking persistence through occlusion events, which is crucial for continuous monitoring in complex agricultural settings where foliage frequently obscures target pests. Third, there remains a lack of integrated approaches that effectively combine high-accuracy detection with real-time tracking capabilities specifically optimized for precision pest management applications. Our work addresses these gaps through several novel contributions. First, we have developed a robust detection and tracking system optimized for small, camouflaged targets in complex natural environments, overcoming the challenges of detecting cryptic pests against visually similar backgrounds. Second, we introduce corner-tracking functionality specifically designed for precision targeting applications, providing the accurate positional data needed by potential automated intervention systems. Finally, our comprehensive evaluation under real-world agricultural conditions provides valuable insights into system performance across diverse environmental variables, informing future deployments in practical pest management scenarios.
A key aspect of our research is the evaluation of recent YOLO models, including YOLOv9 [16], YOLO-NAS [17], YOLOv11 [18], and YOLOv12 [19]. YOLO models are renowned for their exceptional speed and accuracy in object detection, and they continue to evolve, improving detection accuracy, processing speed, and the ability to handle small objects, as succinctly reviewed by Terven et al. [20] in their recent work. YOLO-NAS introduces advanced features, such as neural architecture search and quantization modules, which enhance detection speed and accuracy. YOLOv9, on the other hand, incorporates programmable gradient information, which uses reversible functions to address the information bottlenecks that arise as data traverse deep networks. YOLOv9 also introduces the generalized efficient layer aggregation network architecture, which utilizes conventional convolution operators to achieve better parameter utilization than methods based on depth-wise convolution, making it highly efficient for small object detection tasks. Further evolution in the YOLO family includes YOLOv11, which features enhanced backbone and neck architectures for more precise feature extraction while optimizing for efficiency and speed. Notably, YOLOv11m achieves higher mean average precision (mAP) on the COCO dataset with 22% fewer parameters than YOLOv8m, demonstrating significant improvements in computational efficiency without compromising accuracy. With its adaptability across various environments and support for diverse computer vision tasks, including object detection, instance segmentation, and pose estimation, YOLOv11 presents a compelling option for our caterpillar detection system. The latest iteration, YOLOv12, takes a different approach by incorporating attention mechanisms while maintaining real-time performance. It introduces an innovative “area attention” module that strategically partitions feature maps to reduce the computational complexity typically associated with self-attention operations. YOLOv12 also adopts residual efficient layer aggregation networks to enhance feature aggregation and training stability, which is particularly beneficial for larger models. These advancements allow YOLOv12 to better model global context without sacrificing the speed critical for real-time pest monitoring applications.
In parallel with YOLO developments, transformer-based detection systems have emerged as powerful alternatives for object detection tasks. DETR (DEtection TRansformer) by Carion et al. [21] pioneered the application of transformers to object detection, eliminating the need for many hand-designed components, such as anchor boxes and non-maximum suppression, through its direct set prediction approach. Deformable DETR [22] addressed DETR’s slow convergence by introducing deformable attention mechanisms that attend only to a small set of key sampling points around a reference point, significantly improving training efficiency while maintaining accuracy. Real-time DETR [23] further optimized transformer-based detection for real-time applications by incorporating an IoU-aware query selection mechanism and a hybrid encoder design, achieving state-of-the-art performance–speed trade-offs particularly beneficial for edge device deployment. Efficient DETR [24] reduces computational complexity through progressive decoder designs and adaptive feature selection, making transformer-based detection more viable for the resource-constrained systems common in agricultural settings. Given the small size of caterpillars, which makes them difficult to detect with less refined methods, our study places significant emphasis on the small object detection capabilities of these models. This focus is critical for ensuring that the system can accurately identify and track caterpillars in real orchard environments. The advances in both YOLO architectures and transformer-based detection systems provide promising approaches for addressing the challenges associated with detecting and tracking these small, often camouflaged pests in complex natural settings.
Effective tracking mechanisms are crucial to our system’s functionality, extending beyond mere object detection. Trackers are essential for maintaining the identity and trajectory of detected objects across successive frames, which is vital for continuous monitoring and timely intervention. For fast, reliable tracking in our application, we selected the SORT (Simple Online and Realtime Tracking) algorithm [25]. While more advanced trackers, such as DeepSORT, BYTETrack, OC-SORT, BoT-SORT, and StrongSORT [26,27,28,29,30], offer various enhancements over the original SORT algorithm, we selected SORT for its effective balance of computational efficiency and tracking accuracy in real-time applications [27,30], making it particularly suited to our needs. The integration of YOLO-NAS with the SORT tracker results in a powerful combination: YOLO-NAS excels at detecting small objects with high precision, while SORT provides efficient, low-overhead tracking, which is essential when deploying in real-time on embedded devices mounted on agricultural vehicles. This synergy enables precise real-time detection and tracking, ensuring seamless deployment in real-world orchard environments. Additionally, our research enhances the system’s functionality with features such as selective corner tracking, which enables more advanced applications such as laser-based elimination. By precisely identifying the head and tail of caterpillars, this approach supports targeted pest control measures, further improving the effectiveness of our system.
2. Materials and Methods
2.1. Caterpillar Rearing and Dataset Collection
In this study, a series of experiments was conducted using live caterpillars reared in a laboratory setting to simulate the conditions found in orchards. The caterpillars, nourished with fresh leaves sourced from an orchard (22.92826° N, 120.29395° E) in Tainan City, Taiwan, were selected for imaging and training during their 3rd and 4th instar growth stages, ranging in length from 2 cm to 4.5 cm. To replicate natural orchard conditions, the caterpillars were placed on leaves and branches in an orchard environment. The species Orgyia postica and Porthesia taiwana, both common pests in jujube orchards in Taiwan, were used in these experiments.
To capture images of the caterpillars, an Intel RealSense D405 camera (Intel Corporation, Santa Clara, CA, USA) was employed. This stereo camera, known for its high depth accuracy of 0.1 mm, was set to capture images at 30 frames per second (fps) within its recommended working range of approximately 7 cm to 50 cm. Images were captured at 30 min intervals twice weekly from 6:00 AM to 6:00 PM over six months (April to October 2024), ensuring that the dataset represents the full spectrum of daily light conditions. Our data collection protocol deliberately included various lighting scenarios: direct sunlight, partial cloud cover, heavy overcast (1000–5000 lux), and dawn/dusk transitions. To ensure proper exposure control, the region of interest (ROI) feature was utilized, focusing on the leaves rather than the sky.
During our experimental period, we recorded a temperature range of 16–28 °C (60.8–82.4 °F), with daily fluctuations of approximately 5–8 °C, and relative humidity of 65–85%, with higher humidity during the early morning hours. According to the entomological literature, both species observed in our study (Orgyia postica and Porthesia taiwana) exhibit optimal feeding activity at temperatures of 20–25 °C and relative humidity of 70–80%. These environmental parameters largely fell within the optimal range for active feeding behavior, which enhanced detection opportunities [31,32]. Weather condition data were obtained from the Central Weather Bureau website, specifically for the Guiren District in Tainan, and matched to the times of data collection.
A total of 1130 images were captured at distances from the camera ranging from 10 cm to 55 cm, under different lighting conditions at various times of the day, with a resolution of 1280 × 720 pixels. This resolution was chosen to balance image quality with the capture frame rate. Data augmentation was applied to introduce variety into the dataset, with images rotated at angles of 45, 90, 135, and 180 degrees, enhancing the representation of the caterpillars in different orientations. Our approach to data collection and augmentation was intentionally rigorous and context-specific, prioritizing real-world diversity over artificial enhancements. Images were systematically captured under varying environmental conditions, spanning different times of the day over a six-month period. This extended collection period ensured comprehensive coverage of local orchard environments, capturing a wide range of naturally occurring variations. Rather than relying on synthetic augmentation techniques such as Gaussian noise, blurring, or contrast adjustments, we focused on maintaining the authenticity of our dataset; artificial modifications often fail to accurately replicate real-world variations in lighting, occlusion, and the motion blur caused by wind and foliage movement. By emphasizing naturalistic data collection, we ensured that the dataset truly represents the complexities encountered in orchard settings, improving the robustness of our model for practical deployment. This methodological choice also enhances the long-term value of the dataset, as it remains representative of real-world conditions without relying on artificially induced distortions.
This process resulted in a total of 5650 images, which were then cropped to 640 × 640 pixels for training and testing. The caterpillars were labeled using LabelImg software (v. 1.8.6) to create the dataset required for training the network. The dataset was partitioned into training, validation, and test sets in a ratio of approximately 0.89:0.05:0.06, consisting of 5040, 200, and 250 images, respectively. The resulting dataset is notably diverse, capturing a range of environmental conditions, such as low light and occlusions caused by leaf movement due to wind. This aspect of our study is particularly significant, as previous deep learning research on pest detection has often relied on datasets that may not adequately represent real-world orchard conditions.
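To make the augmentation and partitioning steps concrete, the following Python sketch (using OpenCV; the folder layout and rotate_image helper are hypothetical, and bounding-box labels would need to be recomputed for rotated images) illustrates the four-angle rotation and the train/validation/test split described above:

```python
import cv2
import glob
import os
import random

ANGLES = [45, 90, 135, 180]  # rotation angles used for augmentation

def rotate_image(img, angle):
    # Rotate about the image center, keeping the original frame size.
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

os.makedirs("dataset/augmented", exist_ok=True)
for path in sorted(glob.glob("dataset/raw/*.png")):  # 1130 source images
    img = cv2.imread(path)
    stem = os.path.splitext(os.path.basename(path))[0]
    cv2.imwrite(f"dataset/augmented/{stem}_0.png", img)  # keep the original
    for angle in ANGLES:
        cv2.imwrite(f"dataset/augmented/{stem}_{angle}.png",
                    rotate_image(img, angle))

# Shuffle once and split into train/val/test (5040/200/250, ~0.89:0.05:0.06).
pool = sorted(glob.glob("dataset/augmented/*.png"))
random.seed(42)
random.shuffle(pool)
train, val, test = pool[:5040], pool[5040:5240], pool[5240:5490]
```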
2.2. Manual Camera Parameter Selection
In real-world applications, variable weather conditions, such as alternating cloudy and sunny periods, can significantly affect image quality. To ensure the optimal performance of our detection and tracking system under such conditions, precise calibration of the camera parameters was essential. The Intel RealSense D405 camera was employed in this study, and specific settings were adjusted to achieve the best results across varying environmental conditions. These settings included white balance, brightness, contrast, sharpness, and exposure time.
The white balance was fixed at its maximum value of 6500 K to account for different lighting conditions, ensuring color consistency in the captured images. Brightness and contrast were left on automatic adjustment to enhance image clarity and detail, making it easier for the detection algorithm to identify caterpillars. The sharpness parameter was set to its maximum value of 100 to keep edges well defined. The exposure time (shutter speed) was set manually to 1 millisecond to prevent overexposure in bright conditions and underexposure in low light, ensuring consistent image quality across different times of the day. This careful adjustment of the exposure time was crucial, as automatic settings often produced images that were either too dark or too bright, which could hinder the detection process.
Additionally, the ROI feature available on the RealSense camera was utilized to focus on the leaves where the caterpillars were likely to be found, rather than on the sky or background elements that could skew the exposure and white balance settings. This focus ensured that the captured images were of high quality and relevant to the detection task. By optimizing these parameters, we ensured that the camera could capture high-quality images under varying weather conditions, which is critical for accurate caterpillar detection and tracking. This parameter selection process is an important step in developing a reliable pest management system that can operate effectively in real orchard environments. The resulting images from these varying conditions are shown in Figure 1a–e.
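A minimal Python sketch of this configuration, using the pyrealsense2 SDK, is shown below. Option names follow the librealsense API, but value units are sensor-specific (the D400-series color sensor takes exposure in microseconds, so 1 ms corresponds to 1000), and the ROI coordinates are hypothetical placeholders that depend on camera placement:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)  # 30 fps RGB
profile = pipeline.start(config)
color_sensor = profile.get_device().first_color_sensor()

# White balance fixed at its maximum of 6500 K for color consistency.
color_sensor.set_option(rs.option.enable_auto_white_balance, 0)
color_sensor.set_option(rs.option.white_balance, 6500)

# Maximum sharpness keeps caterpillar edges well defined.
color_sensor.set_option(rs.option.sharpness, 100)

# Manual 1 ms exposure to avoid over/underexposure across the day;
# brightness and contrast are left at their automatic defaults.
color_sensor.set_option(rs.option.enable_auto_exposure, 0)
color_sensor.set_option(rs.option.exposure, 1000)

# Restrict the automatic control loops' ROI to the foliage region so the
# sky does not skew exposure/white balance (coordinates are placeholders).
roi_sensor = color_sensor.as_roi_sensor()
roi = rs.region_of_interest()
roi.min_x, roi.min_y, roi.max_x, roi.max_y = 200, 300, 1080, 700
roi_sensor.set_region_of_interest(roi)
```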
2.3. YOLO-NAS and Recent YOLOs for Small Object Detection
The YOLO object detection algorithm has become widely popular in computer vision due to its impressive speed and accuracy. YOLO-NAS, introduced by Deci AI in 2023, has emerged as one of the top-performing models in the YOLO family. Given our focus on effective small object detection, we compared the performance of YOLO-NAS, YOLOv9, YOLOv11, and YOLOv12 specifically for this objective. As described by Tong et al. [33], any object of 32 × 32 pixels or smaller is considered a small object in object detection and classification problems. For a fair comparison, we selected the YOLO-NAS-L, YOLOv9-E, YOLOv11x, and YOLOv12x models for our study. The models were evaluated on a Windows 10 system equipped with a 12th Gen Intel Core i5-12400 processor (2.50 GHz), 32 GB of RAM, and an Nvidia GeForce RTX 3060 GPU with 12 GB of VRAM, using CUDA 11.7, PyTorch 1.13.1, and Python 3.8.10. Training was conducted with a batch size of 4 over 30 epochs.
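As a reference for reproducing the YOLO-NAS-L run, the sketch below uses Deci’s super-gradients library (v3.x); the dataset paths are hypothetical, and the loss, optimizer, and metric settings follow the library’s standard YOLO-NAS fine-tuning recipe rather than values reported here:

```python
from super_gradients.training import Trainer, models
from super_gradients.training.dataloaders.dataloaders import (
    coco_detection_yolo_format_train, coco_detection_yolo_format_val)
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import (
    PPYoloEPostPredictionCallback)

trainer = Trainer(experiment_name="caterpillar_yolo_nas_l",
                  ckpt_root_dir="checkpoints")

base_params = {"data_dir": "dataset", "classes": ["caterpillar"]}
train_loader = coco_detection_yolo_format_train(
    dataset_params={**base_params, "images_dir": "images/train",
                    "labels_dir": "labels/train"},
    dataloader_params={"batch_size": 4})
val_loader = coco_detection_yolo_format_val(
    dataset_params={**base_params, "images_dir": "images/val",
                    "labels_dir": "labels/val"},
    dataloader_params={"batch_size": 4})

# Single-class model, initialized from COCO-pretrained weights.
model = models.get("yolo_nas_l", num_classes=1, pretrained_weights="coco")

trainer.train(model=model,
              training_params={
                  "max_epochs": 30,
                  "loss": PPYoloELoss(use_static_assigner=False,
                                      num_classes=1, reg_max=16),
                  "optimizer": "AdamW",
                  "initial_lr": 5e-4,
                  "metric_to_watch": "mAP@0.50",
                  "valid_metrics_list": [DetectionMetrics_050(
                      score_thres=0.1, top_k_predictions=300, num_cls=1,
                      normalize_targets=True,
                      post_prediction_callback=PPYoloEPostPredictionCallback(
                          score_threshold=0.01, nms_top_k=1000,
                          max_predictions=300, nms_threshold=0.7))]},
              train_loader=train_loader, valid_loader=val_loader)
```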
In our study, we identified the optimal model based on three key criteria: small object detection, recall, and mAP at IoU thresholds of 50% and 50–95%. Recall measures the proportion of correctly identified positive cases (true positives) out of the total number of actual positive cases (true positives + false negatives). We prioritized recall over precision, which calculates the proportion of true positives among all positive predictions (true positives + false positives), because our primary goal was to ensure the successful detection of caterpillars, even at the cost of some false positives. In addition, to test the robustness of the YOLO versions for general caterpillar detection, the detections across 30 frames of test videos taken at distances of 20–25 cm and 30–35 cm were compared in terms of the numbers of true positive and false positive detections and tabulated.
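In standard notation, with TP, FP, and FN denoting true positives, false positives, and false negatives, these metrics are

Recall = TP / (TP + FN),  Precision = TP / (TP + FP),

and mAP@50–95 averages the mean average precision over IoU thresholds from 0.50 to 0.95 in steps of 0.05.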
2.4. YOLO-NAS Plus SORT for Selective Corner Tracking for Head Detection
Our study also highlighted that all detectors are prone to missed detections due to occlusions or changes in lighting caused by external factors such as cloudy weather and wind. Nguyen et al. [34] discussed in their study how adding a Kalman filter to a deep learning-based CamShift human tracking system improved overall accuracy, precisely because it can cope with the occlusions and missed detections mentioned above. To address these challenges and enhance the overall detection rate, we integrated the detectors with the SORT tracking algorithm [25], which provides an optimal balance between computational speed and tracking accuracy. While more advanced trackers are available, we selected SORT for its simplicity and efficiency. SORT is particularly known for its speed, leveraging the Kalman filter for motion prediction and the Hungarian algorithm for data association. Although it is less robust to occlusions and complex motion patterns, its efficiency makes SORT especially well suited to real-time applications where minimizing computational overhead is crucial. In addition, standalone YOLO-NAS and YOLO-NAS plus SORT were compared in terms of true positive and false positive detections of caterpillars in test videos taken at distances of 20–25 cm and 30–35 cm. Figure 2 presents a schematic diagram of YOLO-NAS combined with the SORT algorithm on the UGV.
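A condensed sketch of this detector–tracker loop is given below, assuming the reference SORT implementation (github.com/abewley/sort) and a trained YOLO-NAS model loaded through super-gradients; the confidence threshold, tracker parameters, and video path are illustrative assumptions:

```python
import cv2
import numpy as np
from sort import Sort  # reference SORT implementation (abewley/sort)
from super_gradients.training import models

model = models.get("yolo_nas_l", num_classes=1,
                   checkpoint_path="checkpoints/caterpillar_best.pth")
# SORT wraps a Kalman filter (motion prediction) and the Hungarian
# algorithm (detection-to-track association).
tracker = Sort(max_age=5, min_hits=2, iou_threshold=0.3)

cap = cv2.VideoCapture("orchard_test.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Depending on the super-gradients version, predict() returns either a
    # single prediction or an iterable of per-image predictions.
    pred = next(iter(model.predict(frame, conf=0.35)))
    boxes = pred.prediction.bboxes_xyxy          # N x 4, pixel coordinates
    scores = pred.prediction.confidence.reshape(-1, 1)
    dets = np.hstack([boxes, scores]) if len(boxes) else np.empty((0, 5))
    tracks = tracker.update(dets)                # rows: [x1, y1, x2, y2, id]
    for x1, y1, x2, y2, tid in tracks:
        cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)),
                      (0, 255, 0), 2)
        cv2.putText(frame, f"id {int(tid)}", (int(x1), int(y1) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
cap.release()
```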
To enable precise laser-based interventions targeting the heads of caterpillars, selective corner tracking was incorporated. This method identifies the head and tail of a caterpillar when it is positioned diagonally within the bounding box: two corners of the caterpillar align with two corners of the bounding box, while the other two corners fall on the leaf or background. Owing to the significant color contrast between the caterpillars and the background, the head and tail corners can be tracked by specifying the caterpillars’ combined hue, saturation, and value (HSV) color range, and a component was added to selectively identify these corners based on their color. The system was developed with the potential application of lasers on unmanned agricultural ground vehicles for caterpillar elimination. Targeting the caterpillar’s head with a precise laser beam requires significantly less exposure time than targeting the body; as Elgar et al. [35] describe, the head carries the antennae and other critical structures essential for environmental perception and sensory processing. By obtaining the x, y, and z coordinates of the two corners of the bounding box representing the head and tail, a laser strike at these two points ensures at least one hit on the head, effectively neutralizing the caterpillar. The overall software architecture of YOLO-NAS plus SORT with selective tracking is illustrated in Figure 3.
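The corner-selection logic can be sketched as follows: small patches around the four corners of each tracked box are converted to HSV and scored against the caterpillar’s color range, and the two best-matching (diagonally opposite) corners are retained as head/tail candidates. The HSV bounds and patch size below are hypothetical placeholders that would be measured from labeled images of the target species:

```python
import cv2
import numpy as np

# Placeholder HSV range for the caterpillar body color (species-specific).
HSV_LO = np.array([10, 80, 60])
HSV_HI = np.array([35, 255, 255])
PATCH = 8  # half-size (pixels) of the square patch inspected at each corner

def head_tail_corners(frame_bgr, box):
    # Score each bounding-box corner by the fraction of nearby pixels whose
    # HSV values fall inside the caterpillar's color range, then keep the
    # two strongest matches (the corners lying on the body).
    x1, y1, x2, y2 = map(int, box)
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    scored = []
    for cx, cy in [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]:
        patch = hsv[max(cy - PATCH, 0):cy + PATCH,
                    max(cx - PATCH, 0):cx + PATCH]
        if patch.size == 0:
            continue
        mask = cv2.inRange(patch, HSV_LO, HSV_HI)
        scored.append(((cx, cy), float(mask.mean())))
    scored.sort(key=lambda s: s[1], reverse=True)
    return [s[0] for s in scored[:2]]  # head and tail candidates
```

The depth stream of the D405 then supplies the z value at these two pixels, completing the 3D aim points for the laser.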
4. Practical Applications and Perspectives
4.1. Performance and Applications of YOLO-NAS and SORT for Pest Management
Our integrated approach using YOLO-NAS and SORT with selective corner tracking demonstrated superior performance for the real-time detection and tracking of caterpillars in orchards. YOLO-NAS achieved a high recall value (0.99) that outperformed the other YOLO versions, with a particularly impressive detection rate of 91.7% for small caterpillars (2–2.5 cm). This early detection capability is crucial for timely intervention before crop damage occurs. The integration of SORT tracking enhanced the system by reducing false positives by up to 8.4% at 30–35 cm distances. Our camera parameter optimization methodology also allowed robust performance across variable environmental conditions, addressing a common limitation of previous studies that operated under controlled lighting.
The system provides significant advantages over traditional pest monitoring methods, which are typically labor-intensive, time-consuming, and often provide delayed detection. In contrast, our automated approach enables continuous monitoring with real-time detection capabilities across multiple orchard locations, providing comprehensive spatial coverage rather than limited sampling. The selective corner tracking feature adds precision by identifying specific body parts of caterpillars, which is valuable for targeted interventions. This automated system provides objective, consistent detection and counting, potentially improving threshold-based decision making in pest management strategies.
4.2. Economic and Practical Benefits
Implementation costs for our detection system include approximately USD 3500 for hardware (computing unit, camera, mounting hardware, protective enclosures) plus optional unmanned ground vehicle platforms (USD 4000–6000) for mobile applications. Operational requirements are modest: 150–250 W power consumption (compatible with solar integration), 5–10% annual maintenance costs, and significantly reduced labor requirements (1–2 person-hours per hectare weekly versus 8–12 for conventional methods). The economic benefits stem primarily from 15–25% yield improvement through early detection and targeted interventions (USD 4500–20,000 per hectare for high-value jujube orchards). This results in an estimated ROI period of 1.2–1.8 growing seasons for stationary systems and 2–3 seasons for mobile platforms.
4.3. Laser-Based Pest Control Integration
The integration of our detection and tracking system with laser-based pest control offers a promising alternative to chemical pesticides. Targeting specific body parts of caterpillars with lasers, made possible by our selective corner tracking feature, can reduce energy requirements by 30–40% compared to untargeted applications. For a typical 2–2.5 cm caterpillar, approximately 0.2–0.3 J per pest is sufficient when targeting vulnerable regions. A 5 W continuous wave laser with 150 millisecond pulses would provide sufficient energy while maintaining reasonable power consumption compatible with solar-powered field installations or mobile battery platforms.
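As a quick sanity check on these figures, a single pulse delivers

E = P × t = 5 W × 0.15 s = 0.75 J,

which is two to three times the estimated 0.2–0.3 J required per pest, leaving headroom for optical losses and imperfect targeting.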
4.4. Environmental and Safety Considerations
Laser-based systems offer significant environmental advantages by leaving no chemical residues while achieving high mortality rates in target pests. Unlike pesticides that affect both target and non-target organisms, laser systems can specifically target identified pests at the individual level with a 96–98% reduction in non-target impacts. Safety features including optical isolation, proximity sensors, physical shielding, and emergency shutdown systems would prevent accidental human exposure, while species-specific detection algorithms would protect beneficial insects. Regulatory pathways for laser pest control differ from chemical pesticides, avoiding extensive toxicological testing but requiring a demonstration of safety and efficacy through field trials. Despite promising potential and compelling economic advantages over conventional methods (25–40% reduction in total pest management costs over a 10-year lifespan), challenges remain including all-weather reliability, targeting precision for early instar larvae, throughput limitations, and dense canopy penetration that will require continued engineering refinement and comprehensive field validation.
4.5. Comparative Analysis of Precision Agriculture Technologies
Our approach offers distinct advantages over existing AI-driven pest management technologies. Unlike the autonomous spraying platforms developed by Qin et al. [39] and Zhang et al. [40], which primarily focus on chemical pesticide application, our deep learning-based pest identification method, combined with a future laser-based system, provides a non-chemical intervention that minimizes environmental impact. While systems such as those by Selvaraj et al. [8] and Mohanty et al. [5] have demonstrated high accuracy in pest detection using deep learning, our methodology extends beyond mere identification to offer a targeted, energy-efficient intervention strategy.
Compared to remote sensing-based pest detection approaches, such as that proposed by Hassan et al. [10], our system delivers higher spatial resolution and real-time tracking capabilities. The integration of YOLO-NAS with selective corner tracking enhances pest identification and localization accuracy, addressing a key limitation found in many existing AI-driven agricultural technologies. By focusing on precise tracking at the individual level, our system ensures that interventions are more effective and adaptable to real-world orchard conditions.
Furthermore, our approach differentiates itself from traditional CNN-informed precision application tools by providing a non-chemical pest control method, enabling individual pest targeting, reducing the impact on non-target organisms, and offering potentially lower long-term operational costs. While deep learning-based systems such as those by Ramcharan et al. [6] and Caldeira et al. [7] have achieved impressive detection accuracies, our work takes a more holistic approach by integrating advanced detection (YOLO-NAS), robust tracking (SORT), and a targeted intervention mechanism (laser targeting). This comprehensive framework positions our system as a highly precise and environmentally sustainable alternative to conventional pest management technologies.
5. Conclusions
The integration of YOLO-NAS with the SORT algorithm has significantly improved caterpillar detection and tracking in orchard environments, particularly under challenging conditions such as partial occlusion by leaves and wind interference. This approach, combined with selective corner tracking, enables precise head and tail identification, facilitating accurate laser targeting for efficient pest control and optimized energy use. The SORT algorithm has proven effective in maintaining reliable caterpillar tracking, overcoming environmental challenges to ensure consistent identification—an essential factor for precision pest management. Additionally, the system’s processing speed remains practical for real-world deployment in orchards. Compared to contemporary YOLO versions, YOLO-NAS demonstrates superior performance, detecting the smallest caterpillars (21 × 6 and 25 × 6 pixels, or ~2.5 cm) in 55 out of 60 instances, while YOLO-NAS + SORT achieves detection in 38 instances. The high recall and efficient tracking performance of YOLO-NAS + SORT highlight its robustness and adaptability, making it a promising tool for precision agriculture and sustainable pest management.
Future work will focus on optimizing YOLO-NAS + SORT with TensorRT for deployment on embedded devices. Expanding the dataset to include additional economically significant caterpillar species, such as Spodoptera litura, Helicoverpa armigera, and Plutella xylostella, and incorporating geographical, environmental, and seasonal variations will also improve the model’s generalizability. A key challenge observed was frequent detection switches between the two caterpillar species; however, since the approach prioritizes maximum detection over species differentiation, the impact was minimal. Another limitation was the occasional misidentification of foliage as caterpillar corners under specific lighting conditions, which could be addressed with a more precise tracking algorithm such as BYTETrack or StrongSORT. Collaborating with agricultural experts through field trials will further validate and refine the system. With continued development, the YOLO-NAS + SORT integration, coupled with selective corner tracking and performance enhancements, holds significant potential for revolutionizing pest management by providing an efficient, reliable, and environmentally sustainable solution for agriculture.