1. Introduction
Chicken meat is considered one of the most environmentally friendly and economical sources of high-quality protein [1], and the growing demand for it has led to an increasing prevalence of industrial, intensive farming. However, as public requirements for animal welfare and food quality become increasingly stringent, non-caged farming models are gradually becoming mainstream in commercial farming [2]. Compared with intensive farming models, non-caged models support the normal behavioral expression of broiler chickens [3]. Natural, non-stress behaviors such as eating, walking, and standing can reflect the welfare status of broiler chickens, and chickens in good physical condition exhibit regular natural-behavior expression. Monitoring natural behaviors to infer the physiological state of animals has already become a common approach [4,5,6,7]. Kilpinen et al. [8] utilized the correlation between preening behavior and skin condition to determine whether chickens were infested by red mites. Mendoza et al. [9] likewise assessed the degree to which ultraviolet light stimulated hens by observing behaviors such as wing-flapping and standing. Therefore, rapid and accurate behavior recognition of free-range broiler chickens is of great significance for broiler welfare management and for optimizing the farming process in commercial production.
In recent years, computer-vision (CV)-based behavior recognition technology has been developing steadily. As a key aspect of CV, behavior detection, being both accurate and non-contact, has also begun to be applied to animal behavior recognition. For instance, Sozzi et al. [10] developed a visual deep learning model for detecting the comfort behaviors of free-range hens at different ages. Li et al. [11] used Faster R-CNN to detect the stretching behavior of broiler chickens in images at two ages, achieving an accuracy of over 92%. To detect the behaviors of breeding hens in cages, Wang et al. [12] utilized the YOLOv3 model and obtained high behavior detection accuracies: mating (94.72%), standing (94.57%), feeding (93.10%), spreading (92.02%), fighting (88.67%), and drinking (86.88%). Yang et al. [13] also classified six different behaviors of laying hens (standing, sitting, sleeping, preening, scratching, and pecking) using object detection, with an identification accuracy of 95.3%.
Most studies have employed behavior detection techniques to achieve high-precision recognition of broiler chicken behaviors. However, recognizing behavior in instantaneous images alone cannot fully reflect the welfare status of broiler chickens. In commercial farming, breeders are more concerned with the continuous behavioral changes of a particular chicken over an extended period. Continuously recognizing and recording each chicken’s behavior to create a unique “behavioral profile” is crucial for ensuring animal welfare, and this process relies on individual tracking.
In the context of automated farming systems, the lack of individual tracking of broilers can lead to difficulties in early disease detection [14], inefficient feeding [15], and inadequate animal welfare monitoring [16]. Wearable devices are a viable option for tracking broiler chickens. For instance, Yang et al. [17] used accelerometers to record the movements of nine seven-week-old broiler chickens, allowing for the classification of specific behaviors. Various wearable devices have been employed for broiler chicken behavior tracking, such as RFID [18], IMU [19], and UWB [20]. Visual-based tracking methods have seen rapid development recently. Li et al. [21] and Siriani et al. [22] used Kalman filtering to track the movement of chickens in videos. Similarly, Tan et al. [23] proposed SY-Track, a tool for high-precision tracking of broiler chickens in videos and analysis of their restlessness index. Sensor-based and visual-based tracking methods may perform similarly for the same task. For example, when detecting lameness in broilers, de Alencar Nääs et al. [24] used pressure-sensitive pads combined with machine learning to predict broiler lameness, achieving 91% accuracy. Nasiri et al. [25] developed a posture estimation model that automatically identifies broiler lameness by analyzing videos, achieving 95% accuracy. While both methods are accurate, visual-based tracking is more practical for real-world breeding environments, as it enables non-invasive, continuous monitoring of the entire flock. Sensor-based tracking is better suited for detailed studies of individual behavior, but its large-scale application is costly, and installation is complex.
However, most current methods either detect behaviors or track individuals without integrating both aspects, which hinders the monitoring of continuous behavioral patterns. This is because most research in the multi-object tracking (MOT) field focuses solely on objects of a single category, whereas behavior recognition yields multiple behavior categories. In an image, changes between behaviors not only affect behavior recognition but can also alter the behavioral characteristics and appearance models of broiler chickens, potentially leading to tracking failure [26]. To prevent the detector’s multi-behavior-category output from degrading broiler chicken tracking in the MOT task, some current studies employ multi-step approaches to associate specific behaviors with broiler chicken identities. For instance, Nasiri et al. [27] performed two-step detection, concurrently detecting broiler chickens and their drinking behavior. They used the single-category results from broiler chicken detection for tracking, and finally matched the tracking results with the drinking behavior detections to estimate the broiler chickens’ drinking time. However, excessive detection and matching procedures demand substantial computational resources, making such approaches difficult to implement in commercial farming. The cumbersome procedure may also allow errors to be propagated and amplified, resulting in poor stability.
Therefore, this study focused on recognizing and continuously tracking the behavior of individual free-range broilers, verifying that the proposed method exhibits good adaptability. The purposes of this study are to (1) train a YOLOv8-BeCS model to accurately identify the natural behavior of broilers from an overhead perspective, (2) use visual tracking technology to track individual broilers, (3) design a connector structure to integrate behavioral information into the tracking process of individual broilers, and (4) conduct fine-tuning experiments in unfamiliar scenarios to verify the adaptability of the proposed method.
3. Results
3.1. Results of Different Improvement Strategies
3.1.1. Comparison of SimAM Modules
The C2f module in YOLOv8 has limited efficacy in extracting the salient features needed to distinguish broiler chicken behaviors. These relatively small-scale key features directly impact the model’s recognition performance.
Figure 10 shows that the C2f module overemphasizes feature extraction from the image’s background and irrelevant pixels. This hinders the model from focusing on the target’s detailed features, consequently affecting the model’s overall accuracy. With the introduction of the SimAM attention module, the key features in the image details are accentuated, and the model’s overall classification accuracy improves significantly, with P increasing by 3.5% (
Table 2). Compared with other common attention mechanisms, such as CA, ECA, SE, and CBAM, SimAM can better focus on key features.
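For reference, the parameter-free SimAM weighting can be sketched in a few lines of PyTorch. The energy formulation below follows the published SimAM module; its exact insertion point within the C2f blocks of YOLOv8-BeCS is an assumption, not a detail taken from this study.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention (energy-based weighting), as commonly
    implemented; where it sits inside the YOLOv8-BeCS backbone is assumed."""

    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        n = h * w - 1
        # Squared deviation of every activation from its channel mean.
        d = (x - x.mean(dim=[2, 3], keepdim=True)).pow(2)
        # Channel-wise variance estimate over spatial positions.
        v = d.sum(dim=[2, 3], keepdim=True) / n
        # Inverse energy: activations that stand out from their channel
        # receive larger weights, emphasizing small-scale key features.
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * torch.sigmoid(e_inv)
```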
3.1.2. IOU Function
The current mainstream IOU functions were tested, and the test results are presented in
Table 3. Compared with CIOU, DIOU, and GIOU, all three versions of WIOU performed well across the comparisons. Because WIOU does not need to calculate the aspect ratio, its lower computational complexity increases the overall inference speed of the model by 6.25% compared to the basic model.
In terms of AP95, the WIOU v3 used in this study exhibited a 2.7% improvement over the original CIOU, and precision (P) rose by 5.01%. Although the size of the model weights remained unchanged, the reduction in Boxloss indicates that the model converged more thoroughly. The effectiveness of the focusing mechanism manifested in the middle and late training stages, i.e., as the model neared convergence; during this period, WIOU assigned small gradient gains to low-quality anchor boxes, reducing harmful gradients and further lowering Boxloss. The effectiveness of this strategy was also confirmed in the study by Zhao et al. [
44].
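As a reference for the focusing mechanism discussed above, a hedged sketch of a WIOU v3-style loss is given below. It follows the published Wise-IoU formulation (an enclosing-box distance penalty without an aspect-ratio term, plus a non-monotonic gradient gain driven by a running mean of the IoU loss); the variable names, hyper-parameters, and running-mean update are illustrative assumptions rather than settings confirmed by this study.

```python
import torch

def wiou_v3_loss(pred, target, iou_mean, alpha=1.9, delta=3.0, momentum=0.01):
    """Sketch of a Wise-IoU v3 loss for boxes in (x1, y1, x2, y2) format.
    iou_mean is a running mean of the IoU loss carried across batches;
    alpha/delta/momentum are illustrative defaults, not values from this study."""
    # IoU between predicted and target boxes.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter).clamp(min=1e-7)
    loss_iou = 1.0 - iou

    # Distance penalty over the smallest enclosing box (no aspect-ratio term).
    c_lt = torch.min(pred[:, :2], target[:, :2])
    c_rb = torch.max(pred[:, 2:], target[:, 2:])
    cw, ch = (c_rb - c_lt)[:, 0], (c_rb - c_lt)[:, 1]
    px = (pred[:, 0] + pred[:, 2]) / 2 - (target[:, 0] + target[:, 2]) / 2
    py = (pred[:, 1] + pred[:, 3]) / 2 - (target[:, 1] + target[:, 3]) / 2
    r_wiou = torch.exp((px**2 + py**2) / (cw**2 + ch**2).detach().clamp(min=1e-7))

    # Non-monotonic focusing: low-quality (outlier) anchors get small gains,
    # which suppresses harmful gradients late in training.
    beta = loss_iou.detach() / iou_mean
    gain = beta / (delta * alpha ** (beta - delta))
    new_mean = (1 - momentum) * iou_mean + momentum * loss_iou.mean().item()
    return (gain * r_wiou * loss_iou).mean(), new_mean
```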
3.1.3. Performance of DIOU-NMS
As can be seen from
Table 2, after using DIOU-NMS in the post-processing stage, the problem of inaccurate recognition caused by dense occlusion was alleviated to a certain extent, with AP50 increasing by 2.3%. This is because, when two detection boxes are adjacent or overlapping, DIOU-NMS considers the distance between their centers in addition to their overlap before deciding which box to suppress, so the better detection box is retained without wrongly discarding a neighboring target. To verify the effectiveness of NMS, images with clustered broiler chickens were randomly selected for comparison.
Figure 11 presents the comparison results in the same scene, demonstrating that NMS addition reduces the model’s missed detection probability when detected targets are dense. To further assess the performance of the NMS strategy in handling occlusion, a manual evaluation experiment was conducted. First, 30 images with mild occlusion and 30 images with severe occlusion were selected. Mild occlusion was defined as no broiler in the image being obscured by more than 50% of its area, while severe occlusion was defined as one or more broilers being obscured by more than 50% of their area. The evaluation metrics included the missed detection rate (the proportion of broilers that should have been detected but were not) and the false detection rate (the proportion of broilers incorrectly identified as other broilers). The experimental results are presented in
Table 4. The results indicate that after incorporating NMS, the missed detection rate of the YOLOv8-BeCS model in severe occlusion scenarios decreased by approximately 13.4%. Additionally, in mild occlusion scenarios, the improved method reduced both the missed detection and false detection rates by about 4%. These findings demonstrate that the DIOU-NMS effectively addresses the occlusion problem. However, due to the need to calculate the IOU multiple times as the basis for the scores, the overall inference speed decreased by 0.3 ms.
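The suppression rule described above can be summarized in a short sketch: a candidate is suppressed only when its IoU with a higher-scoring box, minus a normalized center-distance penalty, exceeds the threshold, so overlapping boxes of adjacent broilers with clearly separated centers are kept. The threshold value below is illustrative.

```python
import torch

def diou_nms(boxes, scores, iou_thres=0.5):
    """Minimal DIoU-NMS sketch. boxes: (N, 4) in x1y1x2y2; scores: (N,).
    Returns the indices of the kept boxes."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = order[1:]
        # Plain IoU between the best box and the remaining candidates.
        lt = torch.max(boxes[i, :2], boxes[rest, :2])
        rb = torch.min(boxes[i, 2:], boxes[rest, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter).clamp(min=1e-7)
        # Normalized squared center distance over the enclosing-box diagonal.
        c_lt = torch.min(boxes[i, :2], boxes[rest, :2])
        c_rb = torch.max(boxes[i, 2:], boxes[rest, 2:])
        diag2 = ((c_rb - c_lt) ** 2).sum(dim=1)
        ci = (boxes[i, :2] + boxes[i, 2:]) / 2
        cr = (boxes[rest, :2] + boxes[rest, 2:]) / 2
        dist2 = ((ci - cr) ** 2).sum(dim=1)
        diou = iou - dist2 / diag2.clamp(min=1e-7)
        # Keep candidates whose DIoU with the best box stays below the threshold.
        order = rest[diou <= iou_thres]
    return keep
```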
When DIOU-NMS and the SimAM module work in tandem, they largely satisfy the model requirements set out earlier, which arise from the particular characteristics of the dataset. SimAM lays a solid precision foundation for the candidate-box construction stage. The combination of NMS and SimAM led to increases of 3% and 5.7% in the AP and P metrics, respectively, compared to the basic model, meaning that the probability of a behavior being detected incorrectly was reduced by about 5.7%. However, it also results in a loss of approximately 6% in inference speed.
3.1.4. YOLOv8-BeCS
The newly designed YOLOv8-BeCS model achieves an AP50 of 84.9%, marking a 3.1% increase over the unadjusted YOLOv8m. Additionally, P and AP95 increased by 6.3% and 3.4%, respectively.
Table 5’s ablation experiment results show that the effective coupling of multiple modules accounts for the excellent performance of YOLOv8-BeCS. The new strategy combination proposed in this study takes into account both detection accuracy and detection speed. Although the inference time increased slightly, an AP accuracy of over 70% is sufficient to meet the requirements of practical applications.
Table 6 presents the results of YOLOv8-BeCS in recognizing the different behaviors within the dataset. The proposed algorithm performs very well in four of the five behavior categories, with an average P value greater than 88.1%; only the lying behavior shows relatively low precision. Nevertheless, its AP50 of over 75% still reaches an applicable level in practical scenarios.
To further validate the effectiveness of the algorithm, several representative object detection algorithms, including YOLOv7 [45], YOLOv6 [46], YOLOv5 [47], SSD [48], Faster R-CNN [49], PP-YOLO [50], and DETR [51], were included in the comparative experiments.
Table 7 shows the performance of each model. Compared with the two-stage algorithm Faster R-CNN, which has a complex procedure, YOLOv8-BeCS has an extremely short detection time, taking only 17.82% of the time required by Faster R-CNN, while its AP50 is 15.1% higher. Compared with other one-stage detection algorithms, such as SSD, YOLOv8-BeCS has a significant advantage in accuracy owing to the new bounding-box regression method introduced by WIOU. As state-of-the-art (SOTA) models in object detection, YOLOv5, YOLOv6, YOLOv7, and PP-YOLO show clear accuracy advantages over the other models on the broiler chicken behavior dataset, but they remain slightly inferior to YOLOv8-BeCS.
Another advantage of YOLOv8-BeCS lies in its model size. As seen in Table 7, even after incorporating multiple modules, YOLOv8-BeCS maintains the same model weight size as the basic YOLOv8m model.
Several existing broiler behavior detection models designed for agricultural settings were compared in this study. The results are presented in
Table 8. The YOLOv8-BeCS model outperformed the others in a real free-range environment, likely due to its ability to address the complex background and severe occlusion prevalent in such settings. By enhancing feature extraction and incorporating NMS, YOLOv8-BeCS effectively overcomes these challenges, improving its adaptability to diverse free-range broiler rearing environments. Its suitability is further validated in the subsequent section.
Table 9 shows that the accuracy remains largely consistent across different computing platforms, whereas speed is significantly constrained by computational power; the CPU platform lacks hardware acceleration for inference, resulting in slower processing. This experiment further demonstrates the strong hardware adaptability of the proposed method.
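As a usage note, comparing platforms only requires changing the inference device; a minimal sketch with the standard Ultralytics prediction call is shown below. The weight and video file names are placeholders, and loading the modified YOLOv8-BeCS through this interface assumes its custom modules are registered with the framework.

```python
from ultralytics import YOLO

# Hypothetical weight file for the trained behavior detector.
model = YOLO("yolov8_becs.pt")

# Same weights, different compute platforms: only the device argument changes.
results_gpu = model.predict(source="broiler_video.mp4", device=0)      # CUDA GPU
results_cpu = model.predict(source="broiler_video.mp4", device="cpu")  # CPU only
```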
3.2. Performance of Connector and Trackers
Experiments were conducted to investigate the impact of the presence or absence of the Connector on tracking performance. The comparative demonstration video can be found in the
Supplementary Materials.
Figure 12 shows the different tracking results for the same original video segment. The Tracker without the Connector loses tracking continuity after several frames, with multiple target ID switches and tracking losses occurring. This indicates that, in the absence of an effective Connector, the Tracker has difficulty maintaining stable associations of the target objects across consecutive frames.
In contrast, the ODBO Tracker using the complete strategy significantly outperforms the version without the Connector in terms of tracking performance. This study has successfully reduced the interference of changes in behavior classes and target shapes on the tracking process, significantly decreasing the occurrences of ID switches and tracking losses. The data presented in
Table 10 are consistent with the observed results. For the Tracker equipped with the Connector, the number of ID switches decreased by 66.7%, while HOTA, MOTA, and IDF1 increased by 30.17%, 44.22%, and 30.59%, respectively. Despite the substantial improvement in tracking accuracy, the additional secondary computation of the detection head in the Connector and the associated data processing reduced the tracking frame rate of ODBO from 31 to 23 frames per second.
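Since the internal design of the Connector is not reproduced here, the sketch below only illustrates the role it plays as described above: behavior-class information is withheld from the identity-association step, so that changes in behavior class or appearance cannot break matching, and the per-frame behavior label is then re-attached to the persistent track ID. This is an assumption-laden illustration, not the actual ODBO implementation, and it omits the reuse of detection-head computations mentioned in the text.

```python
import numpy as np

def connector(detections, tracker):
    """Hedged sketch of the Connector's role for one frame.

    detections: list of (x1, y1, x2, y2, score, behavior_label) tuples.
    tracker:    a Sort-family tracker exposing update(dets) -> rows of
                (x1, y1, x2, y2, track_id), as in common SORT implementations.
    Returns a list of (track_id, box, behavior_label) records.
    """
    if not detections:
        return []

    dets = np.array([d[:5] for d in detections], dtype=float)  # class-agnostic boxes
    tracks = tracker.update(dets)

    profile = []
    for x1, y1, x2, y2, track_id in tracks:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        # Re-associate each track with the behavior label of the detection
        # whose center lies closest to the track's center in this frame.
        dists = [((d[0] + d[2]) / 2 - cx) ** 2 + ((d[1] + d[3]) / 2 - cy) ** 2
                 for d in detections]
        behavior = detections[int(np.argmin(dists))][5]
        profile.append((int(track_id), (x1, y1, x2, y2), behavior))
    return profile
```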
To comprehensively demonstrate the performance of ODBO, mainstream Sort-series algorithms (Sort, DeepSort, OCSort, StrongSort, and BoTSort) were integrated into ODBO. All trackers used the same YOLOv8-BeCS weights and the same Connector, and the results are presented in
Table 11.
At the cost of a partial reduction in the detection frame rate, all the Trackers demonstrated improvements in tracking accuracy and stability. The average increases in HOTA, MOTA, and IDF1 were 27.66%, 28%, and 27.96%, respectively.
Sort significantly outperformed the other algorithms in terms of FPS, owing to its concise matching process. For DeepSort, the HOTA, MOTA, IDF1, and IDS were 71.56%, 76.54%, 81.36%, and 7, respectively. Despite its slightly lower accuracy than some other methods, DeepSort ran at a much higher frame rate than OCSort, StrongSort, and BoTSort. In the comparison of the IDF1 metric, OCSort achieved the best performance at 83.72%, which is attributed to its effective remedy for the limitations of the Kalman filter in the Sort algorithm. StrongSort led in HOTA by virtue of its stronger detection, motion, and appearance embedding models. Regarding IDS, except for Sort, which performed significantly worse, the other four algorithms showed similar performance.
Across all metrics except FPS, BoTSort had the best overall performance, owing to its improved Kalman filter and its use of camera motion compensation. However, its tracking speed of only seven frames per second may not meet the requirements of practical applications, and OCSort and StrongSort also had poor speed performance. Therefore, considering accuracy, stability, and tracking speed together, DeepSort is the most practical choice of tracker for ODBO.
3.3. Fine-Tuned YOLOv8-BeCS in Multi-Broiler Scenarios
In commercial farming settings, detectors designed for a single data scenario obviously lack generalization to other scenarios; thus, fine-tuning the model using data from the application scenario is essential [
54]. This study performed fine-tuning experiments on the proposed algorithm with the Scenario 2 dataset. Scenario 2 is an overhead shot of a commercial farming scenario. Notably, the number of broiler chickens in Scenario 2 substantially exceeds that in Scenario 1, yet the amount of training data falls far short of the conventional training level, posing greater challenges.
First, the optimal weights of YOLOv8-BeCS trained on the data from Scenario 1 were used to perform behavior recognition on the dataset of Scenario 2 (the results are shown in
Figure 13). Even though it was trained only on Scenario 1 data, YOLOv8-BeCS still achieved a precision of 34.7% and a recall of 47.2% on Scenario 2, demonstrating good adaptability to unfamiliar scenarios. Its precision and recall were 20.3% and 17.2% higher, respectively, than those of the original model, and its advantages in AP50 and AP95 were 21.7% and 13.9%, respectively.
The optimal weights of YOLOv8-BeCS trained on Scenario 1 were then used as the starting point for fine-tuning, and the experimental results are presented in
Table 12. The fine-tuned YOLOv8-BeCS significantly outperformed YOLOv8m when the number of training rounds was relatively small (30 and 50 rounds). Specifically, after 30 rounds of training, YOLOv8-BeCS led by 11.8%, 13.6%, and 11.5% in the P, AP50, and AP95 metrics, respectively. In the Auto Stop comparison, YOLOv8-BeCS showed better precision performance in the complex Scenario 2, with a 3.9% lead in the P metric.
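For context, transferring the Scenario 1 weights to Scenario 2 amounts to resuming training from the best checkpoint on the new data. A minimal sketch using the standard Ultralytics training interface is shown below; the file names are placeholders, the 30-epoch schedule simply mirrors the shortest setting reported in Table 12, and loading the modified YOLOv8-BeCS this way assumes its custom modules are registered with the framework.

```python
from ultralytics import YOLO

# Hypothetical checkpoint from Scenario 1 training.
model = YOLO("runs/scenario1/best.pt")

# Fine-tune on the small Scenario 2 dataset for a short schedule.
model.train(data="scenario2.yaml", epochs=30, imgsz=640)

# Evaluate the fine-tuned weights on the Scenario 2 validation split.
metrics = model.val()
print(metrics.box.map50)  # AP50
```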
Figure 14 shows the precision and loss changes of the fine-tuned YOLOv8m and YOLOv8-BeCS models during 500 training rounds. Initially, YOLOv8-BeCS quickly enhanced its precision thanks to prior information, while YOLOv8m lagged in progress due to weight matching difficulties. After 100 training rounds, the precision gap between the two models narrowed. At the 300-round mark, the precision of both models peaked but subsequently declined, possibly due to overfitting impacting the testing performance.
Fine-tuning experiments were also performed on the Scenario 2 data using the full ODBO pipeline, and the detailed performance is shown in
Figure 15. Due to the limitation of the camera coverage, cross-camera tracking could not be carried out. However, the tracking performance for broiler chickens that move within the camera range for an extended period is acceptable (such as ID-4, ID-6, ID-12, and ID-14 in
Figure 15).
4. Discussion
A key consideration is the extensibility of the ODBO system in agricultural settings, particularly regarding camera systems, all-day behavior tracking, and applicability across diverse farms. First, the limited field of view of a single camera system restricts its ability to cover the entire broiler activity area, regardless of the angle used, thereby hindering the large-scale deployment of ODBO. Two solutions address this limitation: a multi-scale detection model and a cross-camera system. The former adjusts the camera angle to capture a broader scene and employs a multi-scale detection model to identify smaller, distant targets [
54]. The latter establishes a cross-camera system to enable continuous tracking of the same target across multiple cameras. This approach has been explored in agricultural contexts; for instance, Han et al. [
55] implemented multi-camera tracking of cattle. However, such technology demands extensive scene-specific datasets and cannot be readily deployed with simple adjustments. Consequently, the selection of an appropriate camera system will significantly influence the future scalability of behavior tracking applications.
As object tracking technology advances, the tracking of different animals’ behaviors has steadily matured. Tu et al. [
56] designed a pig behavior tracking method based on YOLOv5-Byte and showed that tracking performance is better when behavior classes are excluded from the association step than when they are included, which supports this study’s choice to downplay behavior categories during tracking. Currently, research on broiler chicken behavior tracking remains scarce. Nasiri et al. [
27] used multiple detections to combine broiler chicken behavior recognition with tracking. This dual-detection-branch structure requires repeated detection passes and substantial computing power, and its accuracy is limited. By contrast, the ODBO process proposed in this paper achieves a balance between accuracy and speed. In selecting a behavior tracking approach, this study adopted the tracking-by-detection (TBD) paradigm. Incorporating spatio-temporal information for time-series modeling is another widely used behavior tracking technique, but the two approaches differ fundamentally in their technical foundations. TBD excels at recognizing static behaviors (e.g., “standing” or “lying down”), whereas time-series modeling offers greater accuracy for continuous actions (e.g., “walking” or “eating”) and represents a key direction for future behavior tracking. Although the ODBO system is computationally efficient in detecting and tracking static behaviors, it would require the integration of spatio-temporal information to accurately capture continuous behaviors; consequently, ODBO has notable limitations in long-term continuous behavior tracking. Recent advances, such as spatio-temporal dual-stream networks and transformer-based temporal attention models, have been developed for continuous behavior detection. Moving forward, this study could use the current single-frame model to extract high-confidence behavior data and integrate lightweight temporal networks (e.g., temporal convolutional networks, TCNs) to model continuous behaviors, as sketched below.
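As an illustration of this direction, a lightweight temporal head could consume the per-frame behavior probabilities that ODBO already produces for each tracked broiler. The sketch below is a generic TCN-style classifier with placeholder layer sizes; it is not part of the ODBO system.

```python
import torch
import torch.nn as nn

class TemporalBehaviorHead(nn.Module):
    """Illustrative TCN-style head: dilated 1-D convolutions over a window of
    per-frame behavior probabilities for one tracked broiler, producing a
    single behavior label for the whole window. Layer sizes are placeholders."""

    def __init__(self, num_behaviors: int = 5, hidden: int = 32):
        super().__init__()
        self.tcn = nn.Sequential(
            nn.Conv1d(num_behaviors, hidden, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(),
        )
        self.classify = nn.Linear(hidden, num_behaviors)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_behaviors, window_length) per-frame probabilities.
        features = self.tcn(x).mean(dim=2)   # average over the time dimension
        return self.classify(features)       # window-level behavior logits
```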
At the application level, across diverse farms, fine-tuning experiments have confirmed the method’s robust performance under varying flock densities, ground backgrounds, and lighting conditions. However, this study did not evaluate the method’s performance under extreme environmental conditions (e.g., temperature or lighting extremes), which impose significant demands on the system’s perception and stability. Current behavior tracking systems in related research predominantly rely on either pure visual or pure sensor-based methods. Visual systems may lose stability in extreme weather, while sensor-based systems face challenges related to cost and energy consumption. Future research could explore an integrated approach combining vision and sensor data to enhance adaptability across diverse scenarios and extreme environments.
The YOLOv8-BeCS designed in this study has an AP50 improvement of 3.1% compared to the original model, with P and AP95 increasing by 6.3% and 3.4%, respectively. This is attributed to taking into account the particularities of commercial farming scenarios. According to previous research, integrating SimAM with the backbone network can enhance the image feature extraction ability [
57]. The studies by Tan et al. [
58] and Liang et al. [
33] also demonstrated that NMS and changing the IOU can improve the detection performance. The effective coupling of the three strategies has led to an improvement in detection accuracy. The designed Connector has enhanced the tracking performance. Similarly, the research by Zheng and Qin [
59] also mentioned that the introduction of behavior recognition contributes to improving the tracking performance. In their cow behavior tracking experiment, they achieved the highest HOTA (72.4%), MOTP (86.1%), and IDF1 (80.3%). Both the buffer proposed by them and the tracker designed in this study reuse the data in the tracking process to achieve animal behavior recognition and tracking.
5. Conclusions
Only Detect Broilers Once (ODBO) is a visual-based method for correlating broiler behavior with individual identity information. ODBO consists of a high-precision broiler behavior detector (YOLOv8-BeCS), a Tracker, and a Connector between them. YOLOv8-BeCS is based on YOLOv8m; the integration of the SimAM attention module, WIOU, and DIOU-NMS enhanced the detection accuracy of five frequent natural behaviors of broilers: eating, standing, lying, preening, and stretching. The comparative findings reveal that the average detection precision of the model for each behavior increases from 77.8% to 84.1%, and the AP50 reaches 84.9%, which is superior to similar models. YOLOv8-BeCS is connected to the Tracker through a purpose-designed Connector, and ODBO performed very well in the tracking stage. The average accuracy of the Sort-series trackers with the Connector was 71.31% (HOTA), 73.52% (MOTA), and 81.47% (IDF1); compared to tracking without the Connector, performance improved by 27.66%, 28%, and 27.96%, respectively. The results demonstrate that ODBO offers good video processing speed and tracking stability. Additionally, the fine-tuning studies demonstrate the object detection model’s capacity to generalize across multiple commercial environments. ODBO’s detect-and-track technique employs only one detection process to gather broiler behavioral data and integrate it with each broiler’s identity. This method is valuable for accurately managing livestock, safeguarding animal welfare, and promoting smart agriculture.