Article

A Deep-Learning-Based System for Pig Posture Classification: Enhancing Sustainable Smart Pigsty Management

1 Department of IT Distribution and Logistics, Soongsil University, Seoul 06978, Republic of Korea
2 Department of Industrial and Information Systems Engineering, Soongsil University, Seoul 06978, Republic of Korea
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(7), 2888; https://doi.org/10.3390/su16072888
Submission received: 15 February 2024 / Revised: 23 March 2024 / Accepted: 25 March 2024 / Published: 29 March 2024
(This article belongs to the Special Issue Sustainable Technology in Agricultural Engineering)

Abstract

This paper presents a deep-learning-based system for classifying pig postures, aiming to improve the management of sustainable smart pigsties. Pig posture classification is a crucial concern both for researchers investigating pigsty environments and for on-site pigsty managers. To address this issue, we developed a comprehensive system framework for pig posture classification within a pigsty. We collected image datasets from an open data sharing site operated by a public organization and systematically carried out object detection, data labeling, image preprocessing, model development, and training on the acquired data to ensure comprehensive and effective training of the classification system. We then analyzed and discussed the classification results using techniques such as Grad-CAM. The visual analysis with Grad-CAM identifies the image features associated with correctly classified and misclassified postures, indicating which features of pig postures should be emphasized to further improve classification accuracy. In practical applications, the proposed pig posture classification system holds the potential to detect abnormal situations in pigsties promptly, enabling rapid responses. Ultimately, this can contribute substantially to increased productivity and efficiency in pigsty management.

1. Introduction

Recently, consumer interest in products that consider factors such as ‘food safety’, ‘well-being’, and ‘environmental friendliness’ has grown considerably. Consumers want agricultural and livestock products produced in high-welfare environments and tend to purchase such products even at a higher price [1]. This is because meat produced in high-welfare environments is perceived as safer and of higher quality. According to research, pigs raised in high-welfare environments exhibit more active behaviors, resulting in the production of higher-quality meat [2]. In contrast, pigs raised in low-welfare environments are more likely to engage in abnormal behaviors, particularly aggressive actions such as tail biting, leading to a decline in productivity [3]. Tail biting, in particular, can cause bleeding, which not only affects the health, hygiene, and infection susceptibility of pigs but also has long-term implications for the swine production environment [4]. To address such problems, tail docking, the partial removal of a pig’s tail to prevent tail biting, is sometimes performed. However, while tail docking may prevent tail biting, it is widely regarded as ethically questionable from a welfare perspective because of the pain it causes. In Europe, where awareness of animal welfare is high, the European Union established regulations in 2008 prohibiting routine tail docking. It is therefore necessary to monitor the behavior of pigs in order to manage the pigsty environment, and research on behavioral monitoring is underway to meet this need [5]. By monitoring the behavior of pigs within the pigsty, it becomes possible to improve the environment and enhance the productivity of the pigsty.
With this background, it can be inferred that animal welfare is a crucial factor in enhancing the productivity of pigsties. The interest in animal welfare, which first emerged in European countries, is also growing in Asian nations [6], and it is increasingly recognized as significant from the perspective of sustainable food production. Researchers of farm environments and animal behavior argue that improving the welfare of farmed animals is a sustainable way to enhance farm productivity [7,8]. Research on whether the farm environment influences animal behavior has been ongoing and has revealed a connection between the two. To achieve sustainable food production, attempts are being made to apply deep learning on farms. For instance, Kim and Kim presented a framework for classifying pests by examining strawberry leaves for sustainable smart farming [9]. Mahfuz et al. reviewed the urgency of technological innovation to modernize the livestock industry and ensure sustainable pig production [10]; in particular, they emphasized the need to address pig health, stress, and behavioral welfare issues through monitoring in pig farming. Understanding animal behavior is crucial both for researchers studying the correlation between farm environment and animal behavior and for farm managers. However, animal behavior is currently observed mostly by hand, a time-consuming and labor-intensive task, and simultaneously monitoring the postures of a large number of continuously moving animals is generally challenging.
The objective of this study is to construct a pig posture classification system that enables researchers and managers to understand the pigsty environment and its situations. To achieve this more effectively and accurately, the performance of several models was compared, and the classification results were analyzed and discussed. Researchers studying farm environments and animal behaviors, as well as farm managers, need to understand animal behavior. If the current posture of an animal is known, the behavior it is engaged in can be inferred, which in turn makes it possible to deduce or suspect the stress the animal is experiencing, the feeding and watering conditions on the farm, abnormal situations, and more. For these reasons, research using computer vision to classify animal postures has been conducted continuously, including in pigsties. However, because pigsty environments are diverse, earlier work showed clear limitations in posture classification accuracy [11], largely because image classification was attempted with single-layer networks rather than CNNs (convolutional neural networks). To overcome these limitations, this study constructs a system framework that uses deep learning to classify pig postures.
The following models and techniques are commonly employed for image recognition problems and are therefore also considered in research on recognizing pig postures; we briefly explain how they have been used in previous studies. First, studies proposing CNN models and improving image classification accuracy are as follows. Ref. [12] introduced ResNet, a model incorporating residual connections, shortcut paths that skip convolution layers and add their input directly to the output, improving image classification accuracy. Ref. [13] proposed EfficientNet, a model that scales the width, depth, and resolution of the network simultaneously. Ref. [14] improved on EfficientNet with EfficientNetV2, which adjusts the intensity of regularization according to the input images during training to improve accuracy. Ref. [15] suggested CBAM (convolutional block attention module), a lightweight and general attention module added to CNN layers that enhances the accuracy of deep learning models. In this study, we compared the classification performance of ResNet, ResNet (+CBAM), EfficientNet, and EfficientNetV2.
Ref. [16] proposed YOLOv7 (You Only Look Once version 7), an improved version of the YOLO object detection method that applies trainable bag-of-freebies. YOLOv7 enhances the performance of the original YOLO, a one-stage detector that performs region proposal and classification simultaneously, giving it a speed advantage over two-stage detectors such as R-CNNs (region-based convolutional neural networks). In this study, YOLOv7 was used as the object detection technique to extract single-pig images from the raw data. Ref. [17] introduced cutout, a method that randomly masks regions of input images during training, ultimately improving the image classification accuracy of deep learning models. The technique makes the learning problem harder during training so that the model generalizes better at test time; it was employed in the image preprocessing stage of this study.
Ref. [18] proposed the CAM (class activation map) to provide visual explanations for the inference results of CNN models, addressing the difficulty of interpreting conventional CNN outputs. Various improved versions of CAM have since been suggested. Ref. [19] introduced Grad-CAM (gradient-weighted class activation mapping), which visualizes feature maps using weights obtained from gradients rather than from GAP (global average pooling) as in the original CAM. Grad-CAM removes the drawbacks of the original CAM, which required replacing the FC (fully connected) layer with GAP and retraining the model. In this study, we use Grad-CAM to visually analyze and discuss the results of pig posture classification. The following are the distinctive features of this study compared with previous research on pig posture classification:
  • It provides a more detailed explanation of the need for a pig posture classification system in a pigsty and its potential scenarios;
  • It conducts a thorough visual analysis of classification outcomes to support practical implementation;
  • It enhances classification methodologies by employing techniques such as weighted random sampling and diverse CNN architectures.

2. Materials and Methods

2.1. Overall System Framework

The overall framework of this study is illustrated in Figure 1. Raw data were first obtained from AI-Hub, an artificial-intelligence information platform operated by a public organization that provides datasets for the development of AI technologies. The dataset was then constructed via object detection to capture images containing the poses of individual pigs. Since the dataset lacked labels for pig pose classes, a data labeling process was undertaken. To address the imbalance in the number of instances per class, we applied data augmentation and transformation during the image preprocessing stage, generating diverse image patterns that were used to train various CNN models. The classification results of these models were compared to assess their accuracy, and the results of EfficientNetV2-S, which exhibited the highest classification accuracy, were visually analyzed using Grad-CAM.
As shown in Figure 1, we configured and conducted experiments on the system framework of this study. It is important to note that the research environment can influence the model’s training speed and the reproducibility of the system framework.

2.2. Dataset

To conduct this study, images and videos of pigs within a pigsty were required. The dataset was constructed through data selection, object detection, and data labeling. The data selection phase describes the acquisition of raw data; the object detection process detects individual pigs to obtain images containing a single pig’s posture; and the data labeling process explains how classes were assigned to those single-pig images. The dataset structured in this way is identical to the dataset used in [20]. In this study, we distinguish between the dataset obtained from AI-Hub and the dataset processed through this series of steps; the data in their initial state from AI-Hub are referred to as raw data.

2.2.1. Data Selection

We obtained raw data by acquiring pig images from the Big Data platform AI-Hub. The collected raw data consist of images captured in an actual pigsty in 2020. Figure 2 provides examples of the acquired raw data images. In this study, as it is essential to classify the posture of each pig, there was a need to obtain images containing the appearance of a single pig through object detection from the raw data.

2.2.2. Object Detection

In this study, we performed object detection on the raw data to acquire image data containing the posture of individual pigs and thereby construct the dataset. For this process, we chose YOLOv7 as the object detection tool. However, since the standard pretrained model provided with YOLOv7 does not cover pig postures, it was necessary to train the model to detect them. Training required the coordinates of the objects to be detected in the images, which we obtained by manually labeling pig postures using Roboflow, a platform that provides various tools for computer vision tasks. We trained YOLOv7 with the acquired coordinates, and it demonstrated excellent performance in detecting the postures of pigs in actual pigsty images.
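For illustration, the following minimal sketch shows how detected bounding boxes could be cropped into single-pig images; it assumes that a fine-tuned YOLOv7 model has already produced per-image boxes as (x1, y1, x2, y2, confidence) tuples, and the file paths and confidence threshold are hypothetical.

```python
import cv2

def crop_detections(image_path, boxes, out_prefix, min_conf=0.5):
    """Crop every sufficiently confident pig detection into its own image file."""
    image = cv2.imread(image_path)
    crop_paths = []
    for i, (x1, y1, x2, y2, conf) in enumerate(boxes):
        if conf < min_conf:  # skip low-confidence detections (e.g., heavily occluded pigs)
            continue
        crop = image[int(y1):int(y2), int(x1):int(x2)]
        out_path = f"{out_prefix}_{i}.jpg"
        cv2.imwrite(out_path, crop)
        crop_paths.append(out_path)
    return crop_paths
```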
Figure 3 provides examples of object detection in the pigsty. Pigs whose entire body is visible were detected with high probability. However, in cases where pigs were positioned in corners and a significant portion of the body was obscured, detection was often unsuccessful. For undetected pigs, the posture was frequently truncated, making it challenging to accurately determine the posture. Even if detection occurred, there were concerns about the model learning inaccuracies if such images were added to the dataset.
Since this could lower the overall classification accuracy of the model, images in which the pig’s posture could not be clearly identified were excluded from the dataset. Figure 4 illustrates images that were included in and excluded from the dataset despite partial occlusion of the pig’s body. Images in which the posture could be identified, even if part of the body was not visible, were incorporated into the dataset. Conversely, images in which the posture was deemed difficult to discern due to significant occlusion of the body, for example, when the front or hind legs of the pig cannot be identified, were excluded. As shown in Figure 4, even a human observer finds it difficult to identify the pig’s posture in such cases, and labeling these images could degrade the quality of the dataset; that is, incorrectly labeled data could introduce misclassified classes into the dataset. Nevertheless, considering the real-world application of the system framework to actual pigsties, instances where the posture could be identified to some extent were included in the dataset for more diverse learning. This systematic process resulted in the acquisition of individual pig images, forming the dataset.

2.2.3. Data Labeling

The constructed dataset initially lacked class labels, necessitating data labeling. To address this, VoTT (Visual Object Tagging Tool), developed by Microsoft, was used to label the pig posture classes.
Figure 5 demonstrates the usage of the image labeling tool (VoTT). Although labeling followed standard criteria, in cases where the pig’s posture was not clear in the image, some subjectivity from the researcher was involved. The classes were categorized as ‘Standing’, ‘Lying on belly’, ‘Lying on side’, and ‘Sitting’.
Table 1 shows the distribution of training and test data per class. In total, 7604 images were labeled. After dividing them into training and test data at an 8:2 ratio, the dataset comprises 6085 images for training and 1519 for testing. Examination of the per-class counts reveals imbalances, which can be attributed to the characteristic behavior of pigs: they spend over 80% of their time lying down [21]. Consequently, the ‘Standing’ and ‘Sitting’ classes are relatively underrepresented compared with the ‘Lying on belly’ and ‘Lying on side’ classes.
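As a sketch of this 8:2 division, the split could be performed as follows; the paper does not state which tool was used, so scikit-learn’s stratified split and the variable names are assumptions.

```python
from sklearn.model_selection import train_test_split

# image_paths: list of single-pig image files; labels: their posture classes.
train_paths, test_paths, train_labels, test_labels = train_test_split(
    image_paths,
    labels,
    test_size=0.2,      # 8:2 split, as in Table 1
    stratify=labels,    # keep per-class proportions similar in both sets
    random_state=42,
)
```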

2.3. Image Preprocessing

In the image preprocessing stage, images from the training data are processed to enable the deep learning model to learn various patterns. This occurs before training the deep learning model. In this study, there was an imbalance in the number of data samples across classes. In such cases, the model may fail to learn various patterns, leading to overfitting. To prevent this issue, we employed two main techniques: weighted random sampling and data augmentation.
Figure 6 illustrates the image preprocessing process in the dataset. For each training iteration, weighted random sampling is utilized to balance the number of data samples per class, and the images are preprocessed to create new images temporarily. These generated images are then used to train the deep learning model. Weighted random sampling involves balancing the number of data samples for each class according to the batch size. This ensures that, during training, all classes are equally represented even if the number of data samples per class is imbalanced. Therefore, in this study, with a batch size of 32 and 4 classes, there are 8 data samples for each class for every training iteration.
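A minimal sketch of this balancing step, assuming a PyTorch implementation with the hypothetical variables train_labels and train_dataset; the paper does not publish its code, so the details are illustrative.

```python
from collections import Counter

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# train_labels holds one class index (0-3) per training image.
class_counts = Counter(train_labels)  # e.g., {0: 462, 1: 1752, 2: 3369, 3: 502}
sample_weights = [1.0 / class_counts[y] for y in train_labels]

# Drawing samples with these weights makes each class equally likely, so a
# batch of 32 contains roughly 8 images per class.
sampler = WeightedRandomSampler(
    weights=torch.DoubleTensor(sample_weights),
    num_samples=len(sample_weights),
    replacement=True,
)
train_loader = DataLoader(train_dataset, batch_size=32, sampler=sampler)
```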
The fetched images are then preprocessed as follows, with the preprocessing probabilities and values determined by the authors through random search. Image preprocessing and data augmentation generated diverse images by applying transformations, including horizontal flipping, vertical flipping, rotation, and translation, each with a 50% probability per training image. Additionally, cutout, a technique that masks parts of the image, was applied. To prevent a decline in classification performance, square masks (48 × 48 in size) were used and applied with a 30% probability per training image. The optimal cutout probability and mask size cannot be determined without experimentation, so these values were varied, and the final settings were a 30% application probability and a 48 × 48 mask. Furthermore, because the test data are not exclusively composed of sharp, high-resolution images, Gaussian noise was intentionally applied to the training images with a probability of 30%; this value was chosen because approximately 20% of the images in the dataset were deemed to be of low quality. While the criteria for judging low image quality were subjective, the authors endeavored to assess it against common standards. The newly generated images are used to train the deep learning model and are discarded after each training step, and the process is repeated for every training iteration. This effectively trains the model on a variety of image patterns without increasing the size of the dataset. Ultimately, our goal is to create a robust and versatile deep learning model that is resilient to noise.
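The augmentation pipeline could look like the following torchvision sketch; the input resolution, noise standard deviation, and the use of RandomErasing as a stand-in for cutout are assumptions, since only the probabilities and the 48 × 48 mask size are reported.

```python
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Add zero-mean Gaussian noise to a tensor image with probability p."""
    def __init__(self, std=0.05, p=0.3):
        self.std, self.p = std, p

    def __call__(self, img):
        if torch.rand(1).item() < self.p:
            img = (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)
        return img

IMG_SIZE = 224  # assumed input resolution

train_transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomApply([transforms.RandomRotation(degrees=30)], p=0.5),
    transforms.RandomApply(
        [transforms.RandomAffine(degrees=0, translate=(0.1, 0.1))], p=0.5),
    transforms.ToTensor(),
    # Approximates cutout: a 48 x 48 square mask covers about 4.6% of a
    # 224 x 224 image, applied with 30% probability.
    transforms.RandomErasing(p=0.3, scale=(0.04, 0.05), ratio=(1.0, 1.0), value=0),
    AddGaussianNoise(std=0.05, p=0.3),
])
```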

2.4. Modeling and Learning

Based on the training data obtained through the previous steps of this study, we compared the classification accuracy of different models in the same environment. The compared CNN models include EfficientNet, EfficientNetV2, ResNet, and ResNet (+CBAM). These models are among the most widely utilized CNN models in image recognition and classification problems.
The classification accuracy represents the results of classifying the test data based on the epoch with the highest accuracy after training each model for 80 epochs with the following hyperparameters. In the hyperparameter tuning process, we applied random search to sequentially adjust the learning rate, batch size, and max epochs. Random search involves setting hyperparameter values randomly and testing them during training. This method allows for efficiently finding optimal hyperparameter values within a limited time frame [22]. The final selected hyperparameters are Adam optimizer, cross-entropy loss function, 80 max epochs, 32 batch size, and a learning rate of 0.0001.
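A condensed training sketch with these hyperparameters, assuming torchvision’s EfficientNetV2-S implementation and the train_loader defined above; whether the authors initialized from pretrained weights is not stated, so that choice is an assumption.

```python
import torch
import torch.nn as nn
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# EfficientNetV2-S with its classification head replaced for the four posture classes.
model = models.efficientnet_v2_s(weights=models.EfficientNet_V2_S_Weights.DEFAULT)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 4)
model = model.to(device)

criterion = nn.CrossEntropyLoss()                            # cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)  # Adam, learning rate 0.0001

for epoch in range(80):                                      # 80 max epochs
    model.train()
    for images, targets in train_loader:                     # batch size 32
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```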
Table 2 provides a comprehensive overview of the performance of various models employed in the system framework of this study. Under the same experimental conditions, EfficientNetV2-S exhibited the highest classification accuracy and relatively faster training times, making it the chosen classification model for this study.

3. Results

3.1. Classification Result

In this section, we examine the classification results of EfficientNetV2-S, which exhibited the highest accuracy in pig posture classification. First, we visually inspect the learning curve to observe the training progress. In this study, the training and test data are divided in an 8:2 ratio, with no separate validation data. Therefore, the learning curve focuses on the training progress of the training data.
Figure 7 illustrates the learning curve of EfficientNetV2-S, enabling examination of epoch-wise F1 Score and Loss values. As training progresses, the F1 Score demonstrates an upward trend, while the Loss exhibits a downward trend, confirming the satisfactory progress of learning.
Table 3 presents the classification results using the EfficientNetV2-S model in the system framework of this study. When examining the results, the accuracy of ‘Standing’ and ‘Lying on side’ is notably high, while ‘Lying on belly’ and ‘Sitting’ exhibit relatively lower accuracy. Despite having fewer data samples, ‘Standing’ shows high classification accuracy, possibly because weighted random sampling during image preprocessing ensured that the class imbalance did not significantly affect the results. ‘Lying on side’, on the other hand, benefits from a larger number of data samples, allowing the model to learn various patterns and angles effectively and yielding higher classification accuracy. In contrast, ‘Lying on belly’ poses challenges due to its similarity to other postures, leading to misclassification depending on the angle at which the pig is captured in the image. ‘Sitting’ in particular exhibits a high rate of misclassification as ‘Standing’, possibly because a sitting pig may resemble other postures depending on the angle of view in the image. These classification patterns are discussed further in Section 3.2.
Figure 8 shows the confusion matrix of the classification results. To improve its readability, each element is assigned a different color based on its value. ‘Standing’ is correctly classified with very high probability, although it can be misclassified as ‘Lying on belly’ or ‘Sitting’. Prior to applying weighted random sampling in the image preprocessing stage, ‘Lying on belly’ and ‘Sitting’ were frequently misclassified. ‘Lying on belly’ is particularly prone to being misclassified as ‘Lying on side’, and ‘Lying on side’ is occasionally misclassified as ‘Lying on belly’; that is, the two lying postures are sometimes confused with each other. ‘Sitting’ is primarily misclassified as ‘Standing’. This study adopted the F1 Score to verify classification accuracy, considering the imbalance in class-specific data. The F1 Score evaluates the model’s performance using the harmonic mean of the precision and recall derived from the confusion matrix; therefore, if either precision or recall is relatively low, the F1 Score decreases accordingly.
Table 4 presents the precision and recall metrics for each class. Additionally, we calculated the F1 Score using the following formula:
F1 Score = (2 × Precision × Recall) / (Precision + Recall)
Consequently, despite the overall classification accuracy being 96.64% in this study, the F1 Score is 0.948 due to the relatively low classification accuracy of ‘Lying on belly’ and ‘Sitting’.
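For illustration, the per-class and macro-averaged F1 Scores can be recomputed directly from the precision and recall values reported in Table 4; small differences from the reported 0.948 arise only from rounding of the tabulated values.

```python
# Precision and recall per class, copied from Table 4.
per_class = {
    "Standing":       (0.8769, 0.99130),
    "Lying on belly": (0.9784, 0.93363),
    "Lying on side":  (0.9754, 0.99049),
    "Sitting":        (0.9572, 0.89600),
}

f1_scores = {name: 2 * p * r / (p + r) for name, (p, r) in per_class.items()}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

print(f1_scores)           # per-class F1, e.g., about 0.93 for 'Standing'
print(round(macro_f1, 3))  # approximately 0.949, close to the reported 0.948
```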

3.2. Discussion

The classification results were visually confirmed using Grad-CAM. Through this process, we examined the features of correctly classified images for each class. It is anticipated that this exploration can lead to improvements in classification accuracy by identifying and understanding the distinctive characteristics of images that are accurately classified for each class.
Figure 9 displays the Grad-CAM results for correctly classified images of each class. As the images show, the likelihood of accurate classification is high when the distinctive features of each class are highlighted. Regions that the model relies on for each class appear in red and yellow, while regions that contribute little appear in blue. The points of interest for each class are as follows: for ‘Standing’, the model focuses on the torso; for ‘Lying on belly’, on the torso and legs; for ‘Lying on side’, on both legs; and for ‘Sitting’, predominantly on one leg or irregular patterns. Examining the features associated with accurate classification is essential, and understanding the features associated with misclassification is equally important.
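A brief sketch of how such heatmaps can be produced, assuming the community pytorch-grad-cam package and the trained EfficientNetV2-S model from Section 2.4; the file names and target class index are hypothetical.

```python
import numpy as np
from PIL import Image
from torchvision import transforms
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

# Use the last convolutional block of EfficientNetV2-S as the target layer.
cam = GradCAM(model=model, target_layers=[model.features[-1]])

preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
image = Image.open("pig_crop.jpg").convert("RGB")          # a single-pig crop
input_tensor = preprocess(image).unsqueeze(0).to(next(model.parameters()).device)
rgb = np.array(image.resize((224, 224)), dtype=np.float32) / 255.0

# Heatmap for a chosen posture class (here index 0, e.g., 'Standing').
heatmap = cam(input_tensor=input_tensor, targets=[ClassifierOutputTarget(0)])[0]
overlay = show_cam_on_image(rgb, heatmap, use_rgb=True)    # red/yellow = influential regions
Image.fromarray(overlay).save("gradcam_standing.jpg")
```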
Figure 10 presents the features of misclassified images for each class and the classes they were misclassified as. The label displayed at the top of each pig image indicates the actual class of the misclassified instance. From this, it is evident that misclassified images share certain common features across classes. The key misclassification features are as follows. ‘Standing’ was classified correctly with a probability of 99.13% in this study, so three misclassified examples could not be obtained for this class; however, the pilot study conducted for this research revealed that ‘Standing’ was frequently misclassified as ‘Lying on belly’ and ‘Sitting’, which occurred when the standing pig resembled those postures from certain viewing angles. ‘Lying on belly’ is commonly misclassified as ‘Lying on side’, often when the pig extends one leg. ‘Lying on side’ is often misclassified as ‘Lying on belly’, particularly when other pigs are present in the image. ‘Sitting’ is frequently misclassified as ‘Lying on belly’, typically when the appearance of a fully seated pig from the back resembles ‘Lying on belly’. These findings underscore the importance of training the model with diverse perspectives of pig postures to enhance classification accuracy.
This study addresses aspects that were either not covered or insufficiently explored in previous studies, including our own pilot study [20], and improves on that pilot study in three respects. First, it offers a more comprehensive rationale for the necessity of a pig posture classification system within pigsties and elucidates its potential applications across diverse scenarios. Second, to facilitate the practical implementation of the system in real-world pigsties, it delves extensively into the visual analysis of classification outcomes, examining not only accurately classified images but also misclassified ones, thereby providing a deeper understanding of system performance. Third, it employs a range of techniques to improve on previous classification methodologies, notably weighted random sampling during the image preprocessing phase and various CNN architectures to enhance classification accuracy. Class imbalance in pig datasets can also be mitigated using Borderline-SMOTE (Synthetic Minority Over-sampling Technique) [23]; however, given the high classification performance achieved with weighted random sampling in this study, we chose the latter. Through these improvements, this study reinforced the necessity and rationale of the research, added model comparisons for the objectivity of the classification results, and achieved better classification accuracy.
Additionally, we also reviewed other studies on pig posture classification. Each study on pig posture classification has its own strengths and weaknesses due to differences in techniques and experimental setups, making direct comparisons impractical. However, we review the previous studies on pigs’ posture classification and describe the distinctiveness of this study. Table 5 illustrates the pig posture classification studies that preceded our research.
Through Table 5, it can be observed that the various studies classify postures into different sets of classes; broadly speaking, however, they all classify pig postures into standing, lying, and sitting positions. Hongmin Shao et al. used semantic segmentation techniques to classify pig postures into four categories and compared the classification accuracy of different models (ResNet, MobileNet, and XceptionNet), finding that ResNet achieved the highest accuracy at 92.26% [24]. Jan-Hendrik Witte et al. combined YOLOv5 with EfficientNet-B0 to classify ‘Lying’ and ‘Not lying’ in real time [25]. Abozar Nasirahmadi et al. performed posture classification simultaneously with object detection; they compared the classification accuracy obtained with different object detection models (Faster R-CNN, R-FCN, and SSD) and found that R-FCN with ResNet101 achieved the highest accuracy [26]. Jinyang Xu et al. constructed a classification model using CNN-SVM and achieved high classification accuracy when classifying five postures [27]. However, none of these previous studies analyzed which image features contributed to the successful classification results.
This study distinguishes itself from previous studies in three main aspects. The first is the presentation of scenarios and applications for constructing a smart pigsty: Section 3.3 describes how the pig posture classification system can be used to infer abnormal situations in the pigsty and how it can be applied in practice. The second is the analysis of the deep learning model’s classification results using Grad-CAM and visual inspection, which allowed us to identify the features of correctly classified and misclassified images for each class; positioning cameras specifically to capture the posture classes that are hardest to distinguish could potentially improve classification accuracy, as discussed in Section 3.2. The third is the enhancement of classification methodologies through techniques such as weighted random sampling and diverse CNN architectures, including a comparison of models such as EfficientNetV2 and ResNet (+CBAM), detailed in Section 2.4. EfficientNetV2 is a relatively recent model that had not been utilized in previous studies, and CBAM, although applicable to CNNs, had likewise not been used in previous work on pig posture classification. The comparison confirmed that larger models do not necessarily outperform smaller ones. Furthermore, we addressed the unequal data counts across classes by employing weighted random sampling; the results of this methodology can be seen in Table 4. These three distinctions reflect the primary objective of this study, which is to contribute to future research on pig posture recognition. The findings can serve as references for selecting pig posture classification models, organizing datasets, deciding camera placements, and exploring optimal strategies for utilizing classification models.

3.3. Utilization Scenarios

The utilization scenarios of the proposed system framework in this study are categorized into four main representatives. These utilization scenarios indicate that the application of deep learning technology to pigsty management can contribute to the establishment of a sustainable pigsty. This is achieved by enabling efficient pig farming operations through an understanding of the environment without the need for extensive resources. Figure 11 illustrates the system’s ability to address various issues within the pigsty based on the current status of pig postures obtained through the pigsty management system.
Figure 11 presents various utilization scenarios of the pigsty management system. If the status of pig posture classification could be monitored through a smartphone application, more efficient pigsty management would be achievable, and considering the existing use of ICT in smart pigsties, incorporating this system is anticipated to be feasible. The information expected to be discerned from the pig posture classification status includes the following. First, the system can help gauge the stress levels of pigs: although pigs spend the majority of their time lying down, if a significant number of pigs simultaneously adopt abnormal postures, it raises the suspicion that their stress levels are elevated due to issues in the pigsty environment. Second, it enables monitoring of the pigs’ health status: pigs afflicted by disease typically exhibit reduced activity, and if this can be discerned from their postures, it allows early suspicion of viral infections that could lead to group infections. Third, insights into the feeding and watering conditions within the pigsty are attainable: if pig behavior is sluggish during feeding and watering times, it may suggest insufficient feeding or watering amounts. Lastly, it enables the identification of abnormal situations within the pigsty: if a considerable number of pigs assume abnormal postures, abnormal situations such as the intrusion of wild animals can be suspected. Through early detection and handling of such situations via the classification of pig postures, it is anticipated that productivity and animal welfare in the pigsty can be enhanced, which in turn is expected to optimize the efficiency of pigsty management.

3.4. Limitations and Future Work

This study demonstrated the possibility of sustainable, deep-learning-based smart pigsty management, but the following limitations exist, and addressing them in future research is expected to yield better results. The first limitation is the imbalance in data across classes: owing to the behavioral characteristics of pigs, data for ‘Standing’ and ‘Sitting’ are relatively scarce, and obtaining more data covering diverse postures (patterns) could enhance the model’s accuracy. The second limitation is the subjectivity introduced by the authors’ manual labeling of the data, which was necessary because the pig images obtained from the open data platform were not labeled into the four required classes; constructing a dataset with more objective and professional labeling could enable more detailed posture classification. The third limitation is the absence of images with varied patterns: the dataset consists solely of images captured from an overhead perspective, so patterns from other angles could not be learned, which reduces classification accuracy.
A variety of additional studies are needed to address these limitations. In particular, as a direction for future research, we suggest adding cameras that capture the posture classes with relatively low classification accuracy from multiple angles, which requires the ability to capture images from various viewpoints within the pigsty. Through this approach, posture classification within pigsties is expected to become more accurate and effective.

4. Conclusions

In this study, deep learning was utilized to understand pig behavior in pigsties. Images of a pigsty were processed to detect the appearance of pigs and to acquire images containing the posture of individual pigs. The obtained images were labeled by the authors to create a dataset. To enable the deep learning model to learn from diverse image patterns, various preprocessing techniques, including image transformations, cutout, Gaussian noise, and weighted random sampling, were applied before training. Comparing the performance of different models revealed that EfficientNetV2-S exhibited the highest classification accuracy of 96.64% and an F1 Score of 0.948, confirming that larger models do not necessarily guarantee higher classification accuracy than smaller ones. Subsequently, Grad-CAM analysis was employed to identify the features that should be emphasized for each class to improve the accuracy of pig posture classification. The results indicate that using computer vision to classify pig postures is efficient and yields high performance. In particular, images misclassified because of the pig’s orientation could potentially be corrected by adjusting the shooting angle. The insights gained from this analysis of the classification results may contribute to enhancing the performance of pig posture classification in future research.

Author Contributions

Conceptualization, C.J. and D.K.; methodology, C.J. and D.K.; software, C.J. and H.K.; validation, C.J. and D.K.; formal analysis, D.K.; resources, C.J.; data curation, C.J. and H.K.; writing—original draft, C.J.; writing—review and editing, D.K.; visualization, C.J.; supervision, D.K.; funding acquisition, D.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean Government (MOTIE) (P0017123, The Competency Development Program for Industry Specialist).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Studnitz, M.; Jensen, M.B.; Pedersen, L.J. Why do pigs root and in what will they root?: A review on the exploratory behaviour of pigs in relation to environmental enrichment. Appl. Anim. Behav. Sci. 2007, 107, 183–197.
  2. Fernandes, J.N.; Hemsworth, P.H.; Coleman, G.J.; Tilbrook, A.J. Costs and Benefits of Improving Farm Animal Welfare. Agriculture 2021, 11, 104.
  3. Irene, C.; Winanda, W.U. Tail postures and tail motion in pigs: A review. Appl. Anim. Behav. Sci. 2020, 230, 105079.
  4. Schrøder-Petersen, D.L.; Simonsen, H.B. Tail Biting in Pigs. Vet. J. 2001, 162, 196–210.
  5. Pandey, S.; Kalwa, U.; Kong, T.; Guo, B.; Gauger, P.; Peters, D.; Yoon, K. Behavioral Monitoring Tool for Pig Farmers: Ear Tag Sensors, Machine Intelligence, and Technology Adoption Roadmap. Animals 2021, 11, 2665.
  6. Johansson-Stenman, O. Animal Welfare and Social Decisions: Is It Time to Take Bentham Seriously? Ecol. Econ. 2018, 145, 90–103.
  7. Alonso, M.E.; González-Montaña, J.R.; Lomillos, J.M. Consumers’ Concerns and Perceptions of Farm Animal Welfare. Animals 2020, 10, 385.
  8. Miranda-de la Lama, G.C.; Estévez-Moreno, L.X.; Sepúlveda, W.S.; Estrada-Chavero, M.C.; Rayas-Amor, A.A.; Villarroel, M.; María, G.A. Mexican consumers’ perceptions and attitudes towards farm animal welfare and willingness to pay for welfare friendly meat products. Meat Sci. 2017, 125, 106–113.
  9. Kim, H.; Kim, D. Deep-Learning-Based Strawberry Leaf Pest Classification for Sustainable Smart Farms. Sustainability 2023, 15, 7931.
  10. Mahfuz, S.; Mun, H.-S.; Dilawar, M.A.; Yang, C.-J. Applications of Smart Technology as a Sustainable Strategy in Modern Swine Farming. Sustainability 2022, 14, 2607.
  11. Alameer, A.; Kyriazakis, I.; Bacardit, J. Automated recognition of postures and drinking behaviour for the detection of compromised health in pigs. Sci. Rep. 2020, 10, 13665.
  12. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  13. Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 6105–6114.
  14. Tan, M.; Le, Q. EfficientNetV2: Smaller Models and Faster Training. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; Volume 139, pp. 10096–10106.
  15. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
  16. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7575.
  17. DeVries, T.; Taylor, G.W. Improved Regularization of Convolutional Neural Networks with Cutout. arXiv 2017, arXiv:1708.04552v2.
  18. Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning Deep Features for Discriminative Localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
  19. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626.
  20. Jeon, C.; Kim, H.; Kim, D. Classifying Pig Poses for Smart Pigsties Using Deep Learning. ICIC Express Lett. Part B Appl. 2024, 15, 187–193.
  21. Ekkel, E.D.; Spoolder, H.A.; Hulsegge, I.; Hopster, H. Lying characteristics as determinants for space requirements in pigs. Appl. Anim. Behav. Sci. 2003, 80, 19–30.
  22. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305.
  23. Jin, M.; Wang, C.; Jensen, D.B. Effect of De-noising by Wavelet Filtering and Data Augmentation by Borderline SMOTE on the Classification of Imbalanced Datasets of Pig Behavior. Front. Anim. Sci. 2021, 2, 666855.
  24. Shao, H.; Pu, J.; Mu, J. Pig-Posture Recognition Based on Computer Vision: Dataset and Exploration. Animals 2021, 11, 1295.
  25. Witte, J.-H.; Gómez, J.M. Introducing a New Workflow for Pig Posture Classification Based on a Combination of YOLO and EfficientNet. In Proceedings of the 55th Hawaii International Conference on System Sciences, Maui, HI, USA, 4–7 January 2022.
  26. Nasirahmadi, A.; Sturm, B.; Edwards, S.; Jeppsson, K.-H.; Olsson, A.-C.; Müller, S.; Hensel, O. Deep Learning and Machine Vision Approaches for Posture Detection of Individual Pigs. Sensors 2019, 19, 3738.
  27. Xu, J.; Zhou, S.; Xu, A.; Ye, J.; Zhao, A. Automatic scoring of postures in grouped pigs using depth image and CNN-SVM. Comput. Electron. Agric. 2022, 194, 106746.
Figure 1. Overall system framework.
Figure 2. Acquired raw data image of the pigsty.
Figure 3. Object detected smart pigsty.
Figure 4. Dataset configuration criteria.
Figure 5. Class labeling.
Figure 6. Image preprocessing process.
Figure 7. Learning curve of EfficientNetV2-S.
Figure 8. Confusion matrix of classification results.
Figure 9. Grad-CAM result.
Figure 10. Misclassified image by class.
Figure 11. Example utilization scenarios.
Table 1. Number of data per class.

Classes           Training Data    Test Data
Standing          462              115
Lying on belly    1752             437
Lying on side     3369             842
Sitting           502              125
Total             6085             1519
Table 2. Models’ performance comparison.

Model                  Accuracy    F1 Score
EfficientNet-B0        95.26%      0.918
EfficientNet-B1        95.52%      0.919
EfficientNet-B2        94.93%      0.912
EfficientNet-B3        95.39%      0.926
EfficientNet-B4        95.91%      0.927
EfficientNet-B5        95.39%      0.919
EfficientNetV2-S       96.64%      0.948
EfficientNetV2-M       94.99%      0.918
ResNet-18              93.74%      0.896
ResNet-34              94.60%      0.896
ResNet-50              95.39%      0.917
ResNet-101             94.00%      0.900
ResNet-18 (+CBAM)      91.57%      0.859
ResNet-34 (+CBAM)      94.33%      0.913
ResNet-50 (+CBAM)      93.81%      0.888
ResNet-101 (+CBAM)     94.86%      0.922
Table 3. Pigs’ posture classification results.

Classes           Correct    Wrong    Accuracy
Standing          114        1        99.13%
Lying on belly    408        29       93.36%
Lying on side     834        8        99.04%
Sitting           112        13       89.60%
Total             1468       51       96.64%
Table 4. Precision and recall by class.

Classes           Precision    Recall
Standing          0.8769       0.99130
Lying on belly    0.9784       0.93363
Lying on side     0.9754       0.99049
Sitting           0.9572       0.89600
Average           0.947        0.952
Table 5. Previous studies on pigs’ posture classification.

Author                            Classes                                                                               Accuracy
Hongmin Shao et al. [24]          Standing, Lying, Lying on their sides, and Exploring                                  92.45%
Jan-Hendrik Witte et al. [25]     Lying and Not lying                                                                   91.00%
Abozar Nasirahmadi et al. [26]    Standing, Lying on Side, and Lying on Belly                                           94.85%
Jinyang Xu et al. [27]            Standing, Sternal recumbency, Ventral recumbency, Lateral recumbency, and Sitting     94.63%