1. Introduction
The agricultural supply chain has been undergoing a digital transformation in recent years by absorbing elements from Industry 4.0 [1]. Innovations in sensor technology, telecommunications, robotics, the Internet of Things (IoT), and artificial intelligence (AI) are being applied to management automation and real-time, data-driven intervention. These advances toward enhancing the production of food, fibers, and energy are also expected to provide solutions for the containment of waste and the mitigation of environmental pollutants [2]. Novel practices for agri-food industries that combine computational methods with smart devices are opening new alternatives for assessing the quality and traceability of farming products. Hence, with the recent release of 5th generation telecommunications networks (5G), rural areas can, for the very first time, obtain access to fast internet connections, which may support farmers and decision-makers in adopting best practices and intervening just in time.
In the field of computer science, AI is a cutting-edge technology of recent decades with the potential to disrupt society in the coming years. Deep neural networks (DNNs) are the workhorse of AI and have been leading solutions to nonlinear and multidimensional problems, such as image processing, natural language processing, and speech recognition. In addition, the combination of big data, faster algorithms, and powerful processing units is considered mainly responsible for bringing the deep learning (DL) approach into the spotlight, and DL is currently considered the state of the art in human-centered AI systems.
In the meantime, the greatest efforts have been focused on computer vision to address image classification, object detection, segmentation, and localization. Algorithms have been improved continuously to reach high confidence and speed by employing convolutional neural networks (CNNs) as their backbones. Currently, the state-of-the-art object detection algorithms are based on You Only Look Once (YOLO) [3]. A series of improvements on this deep learning algorithm have achieved fast inference on edge devices, including smartphones, low-end computers, and cloud processing platforms [4]. YOLO is classified as a one-stage detector algorithm [5], in the same class as RetinaNet [6] and SSD [7]; however, its architecture is anchor-based and inherited from two-stage detectors, such as the R-CNN family [8], Fast R-CNN [9], and Faster R-CNN [10].
YOLO was introduced in 2015 [3]; the breakthrough of this object detection algorithm was its ability to predict classes and localize coordinates in images (bounding boxes) using a single CNN, which makes inference faster and real-time applications possible. The basic idea behind the YOLO architecture is to look at the entire image (or frame) all at once and divide it into an S × S grid; the localization of an object is then treated as a regression problem instead of a traditional classification problem. When the center of an object falls into a specific grid cell, that cell becomes responsible for detecting the object [11].
Since its release, YOLO has evolved into a series of versions. The first 3 versions of YOLO were released by Redmon et al. [3,12,13], and the 4th version (YOLOv4) by Bochkovskiy et al. (2020) [5]; this last one was reported to outperform the previous version (YOLOv3) by 12% in speed and 10% in accuracy, and it became one of the most used real-time object detection algorithms.
Over time, the upgraded versions have improved the tradeoff between detection speed and accuracy; in addition, from version 5 (referred to as YOLOv5) onward, the YOLO algorithms have incorporated user-friendly characteristics, including a less complex framework, training efficiency, and portability of models among diverse inference platforms [14]. Computer vision systems based on the YOLO series have been reported for detecting fruits in orchards using YOLOv3 [15], YOLOv4 [16], and YOLOv5 [17], for detecting leaf diseases [18], for assessing defects on fruits [19], and for pest detection [20].
Furthermore, for other agricultural purposes, computer vision systems based on deep learning object detection algorithms show enormous potential, including self-driving vehicles [21,22], robotics [23], and object tracking [24,25]. However, most vision systems have been applied to in-farm operations aiming to offset labor shortages and improve the quality of operations; few implementations target waste reduction at the shelf level as a priority.
Egg quality is complex to assess because it involves diverse parameters, such as size, weight, color, eggshell defects, spoilage, bacterial infection, and freshness (which can be considered the major criterion for quality [26]).
Freshness is an objective attribute representing biochemical and physical variables that can be perceived by the senses [27]. The loss of fresh characteristics occurs principally due to aging effects that start to compromise egg quality immediately after oviposition. The loss of water and CO2 through shell pores reduces the weight of eggs [28] and increases the pH [29], respectively, liquefying the albumen and yolk and thereby facilitating osmotic exchange between them [30]. These effects can be intensified by environmental conditions during the storage period, especially temperature and humidity [31].
Several methods to evaluate the freshness of eggs have been reported, including destructive analyses, such as the Haugh unit (HU) [32], the pH of the albumen [33], the yolk index (YI) [34], and the air cell size [35]. Nondestructive techniques mainly involve NIR/Raman spectroscopy [30,36,37], odor sensors [38,39,40], electrical conductivity [41], ultrasound [42], and candling [43]. However, most of these techniques are not robust enough for real-time applications, and the nondestructive techniques are often unsuitable due to equipment cost, intraclass variations related to shell color (specific calibration is required) and thickness, and environmental parameters, such as illumination, temperature, and humidity [44]. A recent study used a pulse phase thermography approach with neural networks to estimate the aging of hen eggs from the approximate air cell size and obtained a high correlation (R2 > 0.95) [45], which demonstrates the potential of high-throughput application of thermal imaging and heuristic algorithms.
All over the world, different countries have adopted different metrics as freshness standards. The European Union (EU), for instance, adopted the air cell size as the parameter to evaluate the freshness of avian eggs [46], while in Brazil, the standard is the HU [47]. Therefore, the aim of this work was to develop a fast and accurate computer vision model based on deep learning algorithms for air cell detection as a quick, nondestructive method to classify nonfresh quail eggs using a thermal microcamera. Our hypothesis was that the loss in egg weight, reflected in the air cell increase, can be detected by thermal cameras due to gas transmission through the eggshell; thus, a machine vision system could detect this feature in a quick and nondestructive way. In this study, we used deep learning object detection algorithms to assess the shelf-life quality (freshness) of quail eggs by classifying radiometric images from a thermal camera according to the newly proposed methodology.
2. Materials and Methods
2.1. Experimental Environment
For this study, we used Japanese quail eggs (Coturnix japonica) collected from local grocery stores. The label expiration date was considered the reference for the end point of freshness. The experiments were conducted in the Bioproduction and Machinery laboratory, University of Tsukuba, Japan, during the middle of the summer season, when the average daily temperature ranged from 24 to 32 °C. The methodology was developed in two phases: first, air cell assessment by pixel measurement; second, deep learning object detection for automatic classification of nonfresh eggs using thermal images.
2.2. Thermal Imaging
A FLIR® (Teledyne FLIR LLC, Wilsonville, OR, USA) Model VUE™ 336 thermal camera, with a 6.8 mm lens, a sensor resolution of 336 × 256 pixels, a spectral band range of 7.5–13.5 μm, and a size of 2.26” (5.74 cm) × 1.75” (4.44 cm), was used to collect radiometric images of the quail eggs. Thermal cameras produce images by interpreting the intensity of infrared (IR) radiance emitted from the target when interacting with the environment. Therefore, the images result from atmospheric transmission, IR reflection, and the emission wavelength of the target [48]. In this regard, three variables are decisive for acquiring information about objects: the size of the target, its distance, and the angle from the camera.
The main advantage of thermal cameras is their independence from visible light compared to optical cameras. Thermal cameras are typically used for night vision problems and body temperature measurements. However, they are limited by low resolution and high cost compared to conventional cameras. In addition, thermal cameras provide relative temperatures; absolute measurements can be obtained after data processing or calibration procedures for specific purposes.
In this study, we used a thermal microcamera to collect the egg images. During exploratory trials with the thermal camera, we found that radiometric images collected from cold eggs could highlight a “chamber” at the large base of the eggs, the same position where the air cell is known to be located. We investigated the possibility of assessing freshness according to the size of this feature, which could serve as a proxy for the air cell in proportion to the aging effects on the eggs (Figure 1). When the eggs are colder than room temperature, the chamber is highlighted (Figure 1c).
We collected the images inside an automatic incubator machine (no brand) to avoid direct atmospheric interference on the eggs. The incubator was warmed to 38 °C to enhance the temperature contrast between the environment and the cold eggs and was rewarmed after every 20 egg images. The room temperature was constant at 27 °C during the data collection period. The eggs were handled by their equatorial region and kept inside the incubator. The thermal camera was placed above the target in a top view, approximately 10 cm from the eggs (Figure 2). The thermal camera was controlled with a SHARP® smartphone (Sharp Corporation, Sakai, Osaka, Japan), AQUOS™ sense4 basic Model A003SH running the ANDROID™ version 11 operating system, connected to the camera via Bluetooth. The software used was FLIR® UAS™ 2 version 2.2.4.
2.3. Dataset Collection
We collected 390 quail eggs from local grocery stores at random. However, 8 eggs were found to be cracked; therefore, the remaining 382 eggs were used for the experiments. The eggs were divided into three groups:
The 1st group was composed of 174 eggs stored continuously for 60 days inside a conventional refrigerator at a minimum temperature of 17 °C and a relative humidity of 45%. On the 30th, 50th, and 60th days, the eggs were removed from the refrigerator for image sampling (thermal pictures acquired). We assumed that after the 30th day of storage no eggs would still be fresh; to confirm this, data were also collected on the 50th and 60th days. In addition, by collecting samples at 3 time points, we could obtain data that are more representative of long storage periods. This dataset was used to train the vision-based object detection algorithms.
The 2nd group comprised 148 eggs and was used to assess the air cell size. Air cells enlarge due to aging effects, which also cause a loss in egg weight; therefore, these measures could be correlated.
The 3rd group had 60 eggs that were used for testing the prediction models. This group was stored under the same conditions as group 1 and evaluated 15 days after the expiration date. To assess the air cells in this group, we boiled the eggs and visually confirmed the air cell size by cutting the eggs longitudinally.
2.4. Air Cell Assessment Methodology
To correlate air cell enlargement with the loss in egg weight, an experiment was performed with the eggs from group 2. We weighed fresh eggs immediately after purchase (1st day) and again after a 10-day storage period under accelerated aging conditions: the eggs were kept at room temperature in summer conditions (average room temperature of 27 °C and humidity of 60%), which speed up the dehydration of eggs.
The 148 eggs from group 2 were all numbered and weighed on a digital scale (0.001 g precision, no brand). In addition, the long-axis (Y) size was measured with a digital caliper (0.01 mm precision).
Figure 3 shows the procedures.
The measurements of the representative air cell in the pictures were performed manually by contouring the feature highlighted on the large base of the eggs. We used the open-source software ImageJ (64 bits, version 1.8.0), developed by Wayne Rasband and contributors at the National Institutes of Health in the United States [49]. The software converts pixel lengths into real measurements according to a known reference distance.
Figure 4 shows the workflow procedure. First, the real distance was converted into a pixel length. Next, according to the scale (pixel/mm), the contoured area was calculated by the software.
We determined the pixel distance between two points as the mean of 3 traced lines (points A–B) to reduce the error and the subjectivity of the pixel length conversion.
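As an illustration of this calibration step, the sketch below converts traced pixel lines into a pixels-per-millimetre scale (averaged over three tracings, as described above) and then converts a contoured area from pixels to mm². The function names are our own and are not part of ImageJ, which performs the equivalent conversion internally:

```python
import math

def pixels_per_mm(p1, p2, real_mm):
    """Scale (pixels per mm) from one traced line whose true length is known."""
    return math.dist(p1, p2) / real_mm

def mean_scale(lines, real_mm):
    """Average the scale over repeated tracings (here, three A-B lines)
    to reduce the error of manual point marking."""
    return sum(pixels_per_mm(a, b, real_mm) for a, b in lines) / len(lines)

def area_in_mm2(area_px, scale):
    """Convert a contoured area from pixels to mm^2 (the scale enters squared)."""
    return area_px / scale ** 2
```

For example, three tracings of 100 px each over a known 25 mm distance give a scale of 4 px/mm, so a 1600 px contoured area corresponds to 100 mm².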
The Pearson correlation (Equation (1)) was adopted to calculate the relationship between weight and air cell size variation during the accelerated aging period.

r = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / √( Σᵢ (xᵢ − x̄)² · Σᵢ (yᵢ − ȳ)² )        (1)

where r is the coefficient of correlation, xᵢ and yᵢ are the x and y variable samples (area and weight), respectively, and x̄ and ȳ are the mean values of the x-y sample variables.
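A minimal sketch of Equation (1), computing r directly from the deviations around the sample means:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples,
    following Equation (1): covariance over the product of deviations' norms."""
    n = len(x)
    mx = sum(x) / n  # mean of x (e.g., air cell area)
    my = sum(y) / n  # mean of y (e.g., egg weight)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)
```

In practice this is equivalent to `scipy.stats.pearsonr` or `numpy.corrcoef`; the explicit form is shown only to match the equation term by term.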
2.5. Deep Learning-Based Object Detection Algorithms
In the field of machine learning (ML), deep learning (DL) uses deep neural networks to deal with nonlinear problems involving big data to create predictive models. In recent years, compared to traditional ML methods, such as logistic regression and support vector machines, DL has been faster and more accurate on multidimensional data [50], for instance, in image classification, segmentation, and localization.
The complexity of DL algorithms makes them resemble a combination of “black boxes” in which the entire process is difficult to visualize in a simple way. However, YOLO is a DL object detection algorithm that uses a single convolutional neural network (CNN) to localize the object of interest inside the image, treating detection as a regression problem.
In our problem, the training dataset was fed into YOLO (v4 and v5). The algorithm looks at each image all at once and divides it into a 13 × 13 grid. As our input size was 416 × 416, each cell of the grid covered 32 × 32 pixels. Thus, when a high probability of the center point of an egg with a large air cell was located in a cell, that cell was assigned to predict the “not-fresh” class. The simplified architecture workflow of the YOLO algorithm can be seen in Figure 2.
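The grid-assignment rule described above (a 416 × 416 input divided into a 13 × 13 grid of 32 × 32-pixel cells, with the cell containing the object's center made responsible for the prediction) can be sketched as follows; the function name is illustrative, not part of any YOLO implementation:

```python
def responsible_cell(cx, cy, img_size=416, grid=13):
    """Return the (row, col) of the S x S grid cell that contains an
    object's center point (cx, cy), as in the YOLO assignment rule."""
    cell = img_size // grid  # 416 // 13 = 32 pixels per cell
    return cy // cell, cx // cell
```

For an egg centered at pixel (208, 16), the responsible cell is row 0, column 6.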
Since its release, YOLO has grown into many versions, including YOLOv5 [51] and YOLOv6 [52], which were released by companies and for which peer-reviewed research articles have not yet been published. Regardless, the developer community and industry are adopting these tools in their solutions. YOLOv5 was used in this study due to its stability and portability (deployment capability), considering the immediate potential of our methodology to be extended to mobile applications through the most common deployment formats, such as TensorFlow™ Lite and Edge TPU. The first 4 versions of YOLO [5] were based on the Darknet framework, while the 5th version uses the PyTorch framework, which is based on the Python ecosystem, one of the most used programming languages worldwide [53], especially in the data science field.
A few elements were modified from YOLOv4 to YOLOv5, including the modified bottom-up and top-down layers in the new feature pyramid network (FPN) [54] inside the path aggregation network (PANet) [55] on the neck of the algorithm. Another modification was the loss function; the 5th version uses binary cross entropy with the logits loss function [56].
In this work, we trained YOLOv4, YOLOv5, and EfficientDet object detection architectures to predict nonfresh eggs after the expiration date and to revalidate the label date of the remaining eggs. The overall architecture of the deep learning algorithms is shown in
Figure 5, where we compared the different structures of the object detection algorithms used in this work.
The backbone of the algorithm is the CNN responsible for feature extraction (edges, shapes, color differences) and for creating the feature map through convolutional operations. The neck is a feature aggregation network; it collects features from the backbone and combines them bottom-up and top-down for the head, which performs the final prediction of the nonfresh egg position in the image or frame. This last part is responsible for plotting the bounding box around the detected class and labeling the image with its name.
Table 1 shows a comparison between the object detection models and their basic architectures employed in this study.
EfficientDet was released by Google Research, LLC [57]. The main points of this algorithm are its lightweight model, high accuracy, and multiscalability, focusing on efficiency when detecting small objects and on detection speed for low-end devices. EfficientDet uses the EfficientNet convolutional neural network as the backbone to extract the features related to egg shape, color, and air cell borders with maximum efficiency in terms of computational cost. The bidirectional feature pyramid network (BiFPN) on the neck is a feature aggregator similar to PANet, except for some skip connections between the pyramid levels from the backbone, which also contributes to the detection efficiency for our thermal egg features. Finally, the box prediction network on the head is responsible for labeling the predicted class.
2.5.1. Data Labeling
The first group (174 eggs) was used to train the YOLO and EfficientDet algorithms, yielding a total of 522 images (from the 30th, 50th, and 60th days). We considered that after 30 days of storage, none of the eggs would still be fresh; therefore, the nonfresh egg model could be well represented by this dataset. The 522 images were enlarged by augmentation techniques to improve generalization and feature extraction during training. We adopted spatial, pixel, and cutmix augmentation techniques.
Spatial augmentation was performed by applying free rotation to the eggs. Pixel augmentation was performed through monochrome transformation (black and white), and the cutmix was made manually by mounting 100 images into compositions (Figure 6) with the other two thermal conditions described in Figure 1a,b. Note that YOLOv4 already includes the mosaic and cutmix augmentation techniques in the backbone and detector parts of the algorithm as a “bag of freebies”. Nevertheless, considering that we had only one object per image, by mounting similar objects that did not belong to our class of interest, we could obtain a more robust model able to detect difficult objects with greater confidence.
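The spatial and pixel augmentations above can be sketched with NumPy; this is a simplification (90-degree rotations standing in for the free rotation used in the paper, and channel averaging for the monochrome transformation), and the function names are illustrative:

```python
import numpy as np

def rot90_augment(img):
    """Spatial augmentation: the three 90-degree rotations of an image
    (a simplification of the free rotation used in this study)."""
    return [np.rot90(img, k) for k in (1, 2, 3)]

def to_monochrome(img):
    """Pixel augmentation: collapse an H x W x 3 image to one channel by
    averaging, then broadcast back to three channels for the detector."""
    gray = img.mean(axis=2, keepdims=True)
    return np.repeat(gray, 3, axis=2)
```

A dedicated library such as Albumentations would normally provide these transforms (plus the corresponding bounding box updates); the sketch only shows the idea.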
2.5.2. Training Parameters
The total training dataset, composed of 3610 images, was split into two groups in a proportion of 70:30; thus, 2527 images were used for training, and 1083 were used for validation.
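A reproducible 70:30 split of the 3610 image paths can be sketched as follows (the helper name and fixed seed are our own choices, not from the original training pipeline):

```python
import random

def split_dataset(paths, train_frac=0.7, seed=0):
    """Shuffle image paths reproducibly and split them train:val."""
    rng = random.Random(seed)        # fixed seed keeps the split repeatable
    shuffled = paths[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]
```

With 3610 paths this yields 2527 training images and 1083 validation images, matching the proportions reported above.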
To train the YOLO object detectors, the data were labeled in YOLO format using a self-designed program that outputs the bounding box label coordinates x, y, height, and width (Figure 7). To train EfficientDet, the images were labeled using the open-source software LabelImg, which gives bounding box coordinates in PASCAL VOC XML format.
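The difference between the two annotation formats can be illustrated by the standard conversion from PASCAL VOC corner coordinates (absolute pixels) to YOLO's normalized center format:

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a PASCAL VOC box (corner pixels) to YOLO format:
    (x_center, y_center, width, height), all normalized to [0, 1]."""
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h
```

For example, a box from (104, 104) to (312, 312) in a 416 × 416 image becomes (0.5, 0.5, 0.5, 0.5) in YOLO format.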
To train the models, we used different frameworks. As mentioned earlier, YOLOv4 is embedded in the Darknet framework, while YOLOv5 is based on PyTorch, and EfficientDet is built on TensorFlow.
YOLOv4 and EfficientDet were trained on a personal computer running Windows® 10™ 64 bits, with an Intel® Xeon™ E5-1607 processor, 32 GB of RAM, a NVIDIA® GTX 1650™ 4 GB GPU, Python version 3.8.5, CUDA 10.1, cuDNN 7.6.5, OpenCV 4.4.0., TensorFlow 2.3.1, and TensorFlow-GPU 2.3.1. We trained YOLOv5 in the Google, LLC, Collab cloud environment with PyTorch 1.11.0 + cu102 and a 16 GB GPU Tesla T4.
Some hyperparameter values differed, such as the batch size and number of iterations (Table 2), as a consequence of the different frameworks. However, as our intention was to evaluate only detection accuracy, training performance was not considered in this study.
2.5.3. Evaluation Metrics
To validate the models and compare the results, we adopted common metrics accepted and recognized by deep learning developers and academia: precision (P) is the proportion of true positive (TP) detections in relation to false positive (FP) detections (Equation (2)); recall (R) is TP in relation to false negative (FN) detections (Equation (3)); and the F1 score (Equation (4)) indicates the balance between precision and recall and is a good metric to compare the efficacy of models. The average precision (AP, Equation (5)) and mAP@0.5 (Equation (6)) are metrics adopted to evaluate the trained models, as adopted by the PASCAL VOC challenge [58].
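The first three metrics (Equations (2)–(4)) follow the standard definitions and can be computed directly from detection counts:

```python
def precision(tp, fp):
    """Equation (2): proportion of correct detections among all detections."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Equation (3): proportion of ground-truth objects that were detected."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Equation (4): harmonic mean balancing precision and recall."""
    return 2 * p * r / (p + r)
```

For example, 8 TP, 2 FP, and 2 FN give P = 0.8, R = 0.8, and F1 = 0.8. AP and mAP@0.5 additionally require integrating precision over the recall curve across confidence thresholds, which detection toolkits compute automatically.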
It is important to note that the concepts of true and false detections are determined according to the predicted bounding box (bbox) in relation to the reference label bbox, called the ground truth. Trueness is determined by the intersection over union (IoU), which measures the overlap of a prediction in relation to the reference. Usually, the IoU threshold is set to be greater than 50% in the training setup (Figure 8).
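The IoU criterion is the standard ratio of the overlapping area to the combined area of the two boxes, and can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Corners of the intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)
```

A prediction counts as a true positive when this value exceeds the chosen threshold (0.5 here); identical boxes give IoU = 1, and disjoint boxes give IoU = 0.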
4. Discussion
In this work, we used a thermal camera and proposed a new methodology to detect the freshness of eggs according to the air cell size. Thermal cameras have the ability to interpret the intensity of infrared wavelengths transmitted through the atmosphere. Hence, considering that CO2 is heavier than atmospheric air (at the same temperature and pressure), when contrasting the distinct temperatures of cold eggs and a warm room, the CO2 and the other gases composing the air cell produce a spectral signature transmitted from the eggs that can be detected by the radiometric sensors of thermal cameras. We called this method the “thermal imaging contrast technique”. With this method, identifying the air cell was easy, fast, and practical for singling out stale, nonfresh eggs. Therefore, this method can be used in real time for high-throughput applications at the industrial level, especially when combined with deep learning object detection algorithms in automated systems, as demonstrated in this work.
During this study, the pH of the albumen and the yolk was not measured during the storage period. However, the literature shows that for hen eggs, the pH of the albumen may increase slightly more than the pH of the yolk [59], and this change can be reduced under controlled atmosphere systems by injecting CO2 into the storage room [60]. Additional factors can also contribute to potentiating the chemical transformations of the albumen and yolk, such as genotype, quail feed composition, diseases, age of the parental flock, and environmental conditions.
Methods able to perform real-time, nondestructive analysis can contribute to the development of quail farming and postharvest processing, maintaining quality and safety for consumers. In this study, it was noticed that some eggs from the same groups were less fresh than others. From this observation, we presume that eggs collected from the same quail may vary in eggshell composition, such as thickness and hardness.
In the correlation test, we observed that the standard deviations of the real measurements and the pixel measurements differed greatly in magnitude because the pixel measurements were relative to manual line tracing on marked points; that is, the line traced between two points may change its pixel length depending on how it connects to the top and bottom points of the eggs. Nevertheless, we traced each line three times and used the mean to define the pixel length, which reduced the measurement error.
For the second part of our study, using deep learning object detection, the prediction of nonfresh quail eggs with high accuracy was possible. Comparing the three models, the best results were obtained with YOLOv5, followed by EfficientDet and YOLOv4. Deep learning-based models for image recognition and localization are being upgraded continuously as the demand for applications of this tool increases in many sectors of our daily lives.
The comparison between deep learning algorithms demonstrated that improvements in object detection algorithms can reduce the error of reclassification of nonfresh eggs. However, as thermal cameras have low resolution, further improvements in thermal sensing could also enhance the efficiency and accuracy of deep learning-based computer vision systems; therefore, the tradeoff between equipment and algorithms should be considered.
While training the deep learning models, some parameters can influence the speed and accuracy of the model, the most important being the network size. The network size of YOLOv4 and YOLOv5 was set to 416 × 416 pixels because these models use the same architecture; however, the batch size differed due to hardware resources, especially GPU processing power. For EfficientDet, the minimum network size was 512 × 512 due to its scalable feature architecture, and the batch size was set according to what TensorFlow™ could process. As our hardware did not allow training YOLOv4 with a network size larger than 416 × 416, only the accuracy of the models was compared; thus, deployment speed was not taken into consideration in this study.
5. Conclusions
The air cell is one of the most important parameters for qualifying the freshness of eggs. The thermal camera was able to interpret the infrared wavelength intensities transmitted through the atmosphere from the eggshell pores and then, by the contrast technique, highlight the region where gases accumulated at the large base of the eggs as a consequence of the storage period. As the aging process affects the size of the air cell, the lack of freshness was correlated with the air cell size (R2 = 0.676). The combination of thermal camera imagery and deep learning object detection algorithms could identify nonfresh quail eggs with high accuracy; moreover, our investigation demonstrated their potential in automatic systems for freshness assessment at industrial and civil levels. We tested our models on eggs after the expiration date: the YOLOv4, YOLOv5, and EfficientDet models detected nonfresh eggs with F1 scores of 0.69, 0.89, and 0.86, respectively. The best model (YOLOv5) demonstrated that 48.33% of the eggs could have their labeled date extended by at least 15 days, with an error rate of 11.67%. EfficientDet and YOLOv4 showed reclassification proportions of 51.67% and 66.67% and error rates of 15% and 30%, respectively.
The developed methodology offers a fast and nondestructive way to assess the freshness of quail eggs through detection of the air cell size; the methodology can be extended to industrial applications, supermarkets, and restaurants to relabel eggs for extended consumption periods and minimize postharvest poultry production losses.
The main limitation of the developed methodology concerns how long the expiry date can be extended. In this regard, further studies should address the assessment of eggshell thickness for predicting the ideal shelf life of quail eggs and other avian eggs as well.