1. Introduction
Hen and quail eggs are the main eggs used for human consumption, with an annual production of about 87 million tons [1]. Bird eggs are a valuable dietary product with high nutritional value [2]. In addition to being a food product, eggs are also used for the reproduction of farm birds, for laboratory purposes, in human and veterinary medicine, cosmetics, art, and so on [3,4,5]. The production of commercial and hatching eggs is of significant economic importance, with the global egg market estimated at about 143 billion USD in 2023 and expected to continue growing [6]. Chicken eggs account for the main share of egg production [7]. Quail eggs are less common; they serve a market niche and are often regarded as a delicacy with specific nutritional and dietary qualities [8,9].
The quality of eggs is assessed using a set of methods laid down in regulations at the European (Delegated Regulation (EU) 2023/2465) and national [10] levels. Notably, the regulatory framework does not contain specific requirements for the marketing of quail eggs for in-shell consumption. The main characteristics used for non-destructive grading of eggs for in-shell consumption are freshness, mass, size, shape, integrity and appearance of the shell, and internal quality assessed by ovoscopy. An important element in laboratory (destructive) grading of eggs is the assessment of their components: albumen, yolk, and shell [11]. Some quality characteristics that are not directly related to the suitability of eggs as a food product, for example, yolk color [12] or shell color [13], are often essential for consumer choice.
The determination of internal quality features of eggs related to spectral characteristics, yolk, and albumen quality is important from the point of view of food quality control and safety, nutritional value and classification of eggs, breeding and feeding of poultry for eggs, etc. [2].
The primary and subsequent grading of eggs is related to whether the product will provide satisfactory quality and guarantee consumer safety. Regarding the quality of the egg as a food product, special attention is paid to the yolk and albumen, which make up about 90% of its total mass [14]. The most valuable information about the characteristics of the egg content is obtained during its detailed analysis, after breaking a representative sample from each batch of eggs for consumption [11]. On the other hand, the quality of the shell is related to maintaining the integrity and safety of the eggs during their production, grading, packaging, transportation, and storage [15]. The quality of the shell is a major factor in reducing the rejection of whole eggs, and hence for the economics of raising laying hens [16].
By observing the internal characteristics of eggs, deviations in quality indicators and even signs that the product is unfit for consumption are detected. Such an analysis ensures that consumers receive high-quality and safe eggs [17]. The internal characteristics of greatest importance for assessing egg quality are the quality of the dense part of the albumen and the strength and elasticity of the perivitelline membrane. They are measured by determining the albumen index, Haugh units, and the yolk index [11]. The color and spectral characteristics of the yolk and albumen are directly related to the diet of the birds, but can also be influenced by drug treatment, diseases, improper or prolonged storage, etc. [18,19,20,21]. The color of the yolk and the qualitative characteristics of the albumen indicate the freshness and nutritional value of the egg [22,23]. Haugh units are a general indicator of egg quality [24]. The proportion and composition of the yolk and albumen give an idea of the nutritional content of the egg [25].
Internal and external egg characteristics are among the main selection criteria in egg-laying poultry [26]. They provide guidance for optimizing feeding practices with a view to the sustainable production of high-quality and safe eggs [27].
To improve and facilitate the process of destructive grading of eggs for consumption, it is necessary to study the possibilities for implementing automated systems for qualitative analysis of the components of the egg using visual approaches and spectral analysis.
Eggs kept in storage should remain of excellent quality and safe for consumption and further industrial use. The freshness and internal characteristics of the eggs reflect their nutritional value, marketability, and suitability for various food applications. However, most of the existing methods of monitoring these changes involve inefficient, complicated, and time-consuming analyses. Therefore, developing quick, accurate, and simple classification techniques for tracing egg quality is important for better management of storage, reduction of food waste, and conformity with regulatory requirements.
By leveraging advanced algorithms, this research will provide a computationally efficient method that balances the simplicity of classification with high regression accuracy to satisfy technological requirements for assessing food quality.
The aim of this study is to investigate the possibility of applying an innovative approach to determining the quality of chicken and quail eggs for consumption and to determine a basic set of features related to spectral quality analysis.
2. Materials and Methods
We used eggs from three producers (GPS Coordinates, according to WGS84: Producer 1 (M1) Latitude 41.5905388° N, Longitude 25.3924812° E; Producer 2 (M2) Latitude 42.577000° N, Longitude 26.315500° E; Producer 3 (M3) Latitude 42.81667278° N, Longitude 23.21667278° E).
The three producers were selected based on differences in farming practices, production scale, and geographical location to ensure variability in egg characteristics and assess the robustness of the proposed methods across diverse conditions.
The production date varies by 1–2 days, as indicated on the packaging label. The eggs have a 28-day shelf life, also stated on the label.
The eggs were purchased from the commercial network in the city of Yambol, Bulgaria (GPS Coordinates, according to WGS84: Latitude 42.4833° N, Longitude 26.5000° E).
Table 1 gives the number of hen and quail eggs measured at six time points (days 0, 5, 10, 15, 20, and 25) for the three producers (M1, M2, and M3). Each producer supplied 25 eggs per measurement point, i.e., 150 eggs per producer per type of egg. In total, 450 hen eggs and 450 quail eggs were measured (900 eggs). The measurements were taken in a controlled experiment in which the eggs were measured over time under consistent conditions.
The eggs were stored in a chamber controlled at 10 ± 2 °C and 70 ± 3% RH. Automatic regulation of air temperature and humidity was installed in the chamber to minimize variations, and both quantities were recorded continuously by stable, calibrated sensors. The samples were placed in standard egg trays with adequate air space around each egg to prevent localized temperature differences. They were set in a single layer, not stacked, to prevent pressure damage.
The main characteristics such as egg dimensions, egg white, and yolk height and diameter were determined using a digital caliper with an accuracy class of 0.05 mm and a maximum measuring length of 150 mm, SEB-DC-023 (Shanghai Shangerbo import & export Co., Ltd., Shanghai, China).
The mass of the egg and shell was determined with a digital scale Pocket Scale MH-200 (ZheZhong Weighing Apparatus Factory, Yongkang City, Zhejiang Province, China), maximum determined mass 200 g, with a resolution of 0.02 g.
The following egg characteristics were determined: shell surface area, shell density, albumen index, Haugh units, yolk index, and egg quality index.
Shell surface area $S_S$ is calculated from the mass of the egg $W$ [28].
Shell density $D_s$ is calculated from the mass of the egg $W$ in g and the surface area of the egg $S_S$ [28]. This density provides information about the thickness and compactness of the eggshell. Shell density is an important factor in evaluating eggshell strength and durability.
The albumen index AI is calculated [29] from $h_a$, the height of the dense egg white in mm; $D_a$, the major diameter of the dense egg white in mm; and $d_a$, the minor diameter of the dense egg white in mm. It is a measure of the quality of the egg white, and a higher index is usually desirable for applications where the physical properties of the egg white are important.
The Haugh unit HU is determined using the formula [30]:
$$HU = 100 \log_{10}\left(h_a - 1.7\,W^{0.37} + 7.6\right)$$
where $h_a$ is the measured height of the dense albumen in mm; $W$ is the mass of the egg in g. The Haugh unit is recognized as a benchmark for objective quantification of the internal quality and freshness of eggs. As a standard, it provides an objective measure of “freshness” that is consistent with consumer preferences.
The yolk index YI is calculated from $h_y$, the height of the yolk in mm, and $D_y$, the diameter of the yolk in mm. It provides information about the shape and quality of the yolk. A higher yolk index means a more rounded and desirable shape of the yolk, which corresponds to preserved quality characteristics of the perivitelline membrane.
The egg quality index EQI [25] is calculated from $W$, the egg mass; $h_a$, the height of the dense albumen; and $h_y$, the height of the yolk. It is a composite index that takes into account multiple factors to provide a comprehensive evaluation of egg quality. It includes a combination of the above characteristics and possibly additional parameters.
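To make the index definitions above concrete, the following minimal sketch computes three of them for a single egg. It assumes the conventional textbook forms: the albumen index as albumen height over the mean of the two albumen diameters, the yolk index as yolk height over yolk diameter, and the standard Haugh formula. Whether references [29,30] scale AI and YI by 100 is not stated here, so the plain ratios are an assumption, and all function and argument names are illustrative.

```python
import math


def albumen_index(h_a, d_major, d_minor):
    """Assumed form: dense-albumen height over the mean of its two diameters (mm)."""
    return h_a / ((d_major + d_minor) / 2.0)


def haugh_unit(h_a, egg_mass):
    """Standard Haugh formula; h_a in mm, egg_mass in g."""
    return 100.0 * math.log10(h_a - 1.7 * egg_mass ** 0.37 + 7.6)


def yolk_index(h_y, d_y):
    """Assumed form: yolk height over yolk diameter (both in mm)."""
    return h_y / d_y


if __name__ == "__main__":
    # Hypothetical measurements for one hen egg (not taken from the tables).
    print(round(albumen_index(6.0, 80.0, 60.0), 4))
    print(round(haugh_unit(6.0, 60.0), 2))
    print(round(yolk_index(18.0, 40.0), 3))
```

The same functions can be applied row by row to caliper and scale readings; multiplying AI and YI by 100 gives the percentage form used in some sources.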
An LG L70 mobile phone video sensor (LG Electronics, Inc., Seoul, Korea) was used to obtain color digital images of the internal characteristics of eggs. The video sensor is a VB6955CM (STMicroelectronics International N.V., Geneva, Switzerland) with a resolution of 2600 × 1952 pixels and a pixel size of 1.4 × 1.4 μm.
Color digital images were obtained in the RGB color model and converted to the Lab color model according to CIE 1976. Conversion functions for the 2° observer and illuminant D65 were used.
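The conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the conversion code used in the study: it assumes the camera output is sRGB-encoded (the text only says RGB), uses the usual sRGB-to-XYZ matrix, and takes the D65 white point for the 2° observer. The chroma and hue helper uses the standard LCH definitions.

```python
import math

# D65 reference white, 2-degree observer, scaled so that Y = 100.
WHITE_D65 = (95.047, 100.0, 108.883)


def _srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel (sRGB is an assumption)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_xyz(r, g, b):
    rl, gl, bl = (_srgb_to_linear(v) for v in (r, g, b))
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    return 100.0 * x, 100.0 * y, 100.0 * z


def _f(t):
    """CIE 1976 companding function with its linear segment near zero."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0


def xyz_to_lab(x, y, z):
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def lab_to_lch(L, a, b):
    """Chroma and hue angle (degrees in [0, 360)) from the a and b components."""
    return L, math.hypot(a, b), math.degrees(math.atan2(b, a)) % 360.0


if __name__ == "__main__":
    # Hypothetical yolk-like pixel.
    L, a, b = xyz_to_lab(*rgb_to_xyz(242, 160, 50))
    print(lab_to_lch(L, a, b))
```

In practice the conversion would be applied to the mean RGB values of the segmented yolk or albumen region rather than to single pixels.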
The egg albumen is a semi-transparent object, so its images also contain the background. The albumen was separated from the background according to the methodology of Yu et al. [31].
Color indices calculated from the values of the components of the RGB model were used. These indices are presented in [32,33,34].
R1 indicates the relative intensity of the red color.
R2 reflects greenness.
R3 represents the contribution of the blue color in the object.
R4 emphasizes the red color.
R5 emphasizes the green color.
R6 emphasizes the blue color.
R7 measures the contrast between the green and red colors.
R8 focuses on the balance between the blue and red colors.
R9 evaluates the relationship between the green and blue colors.
R10 is an index obtained from the combination of the red, green, and blue bands in images.
R11 focuses specifically on the green color.
R12 is calculated using the red and blue bands of the electromagnetic spectrum.
R13 quantifies the balance between the red and green colors.
R14 measures the relationship between the red and blue colors.
R15 evaluates the ratio between the green and blue colors.
Color indices calculated from the values of the components in the Lab model were used. These indices are presented in [35,36].
The color indices are applied according to the color changes they capture, regardless of the objects for which they were originally developed.
C1 measures the degree of yellowness. C2 quantifies how close the color of the object is to white. C3 measures the browning of the object. C4 represents the saturation or intensity of the color. C5 indicates changes in the green color. C6 reflects the luminance or brightness level. C7 quantifies the chromaticity of the object, in particular the ratio of green-red (a) to blue-yellow (b). C8 evaluates the balance between chromaticity and brightness. C9 reflects the degree of fading or loss of color. C10 evaluates the level of whiteness. C11 quantifies changes in the green color of the object.
The C and H components of the LCH color model were calculated using the formulas
$$C = \sqrt{a^{2} + b^{2}}, \qquad H = \arctan\left(\frac{b}{a}\right)$$
The color indices are calculated according to the following formulas:
The spectral characteristics were obtained after converting the values from the XYZ and LMS models into reflectance spectra in the VIS region, in the range 390–730 nm. These calculations are for the 2° observer and illuminant D65.
Spectral indices were used [37,38,39]. The wavelengths indicated are in the visible range of the spectrum.
The spectral indices are applied according to the spectral changes they capture in the studied products, regardless of the objects for which they were originally developed.
S1 measures the relative amount of red color in the object.
S2 shows changes in the color of the object.
S3 assesses the presence of yellow-orange color in the object.
S4 combines the reflection at different spectral wavelengths.
S5 reflects the intensity of the green color.
S6 enhances the information about the green color.
S7 calculates the difference between the green and red areas in the reflection spectra.
S8 shows the changes in the RGB color channels.
S9 quantifies the intensity of green.
S10 corrects the effects of noise in visible light.
S11 amplifies the green color signal.
In these indices, $R_x$ is the reflectance value at the specified spectral wavelength.
Feature selection methods suitable for both classification and regression analysis were used (FSRNCA, SFCPP, RReliefF):
- FSRNCA (feature selection for classification and regression by neighborhood component analysis): This method identifies the most appropriate features through their weights, obtained in such a way as to minimize the classification and prediction error [40].
- SFCPP (feature selection with comparable predictive ability) is a feature selection method that focuses on identifying features with similar predictive and classification capabilities while reducing data redundancy [41].
- RReliefF is an improved version of the ReliefF algorithm which is used for feature selection in classification and regression tasks. It evaluates the importance of features based on their ability to distinguish instances that are close to each other [42].
Features with weight coefficients above 0.6 are considered informative [43], and the feature vector is formed from them.
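The thresholding step can be sketched in a few lines. The example assumes that the weights returned by a selection method have already been scaled to a common range so that the 0.6 cut-off is meaningful; the feature names and weight values below are hypothetical.

```python
def select_informative(weights, threshold=0.6):
    """Return the names of features whose weight exceeds the threshold, in input order."""
    return [name for name, w in weights.items() if w > threshold]


if __name__ == "__main__":
    # Hypothetical, already scaled weights from one selection method.
    weights = {"R1": 0.82, "C4": 0.61, "S7": 0.35, "HU": 0.60}
    print(select_informative(weights))
```

A strict inequality is used here because the text speaks of weights above 0.6; a feature with a weight of exactly 0.6 is therefore excluded.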
2.1. Test for Normal Distribution of the Selected Characteristics
The Anderson–Darling, Lilliefors, and Jarque–Bera methods were used to check for normal distribution of the selected features.
The methods were chosen from the point of view that the results obtained from each of them are comparable.
The Anderson–Darling test [44] is used to check whether a data sample has a normal distribution. It can also be used to test for another hypothesized distribution, even if the parameters of the distribution are not fully specified in advance; in that case, the test estimates the unknown parameters from the data sample. The test statistic belongs to the family of quadratic empirical distribution function statistics, which measure the distance between the hypothesized distribution, $F(x)$, and the empirical cdf, $F_n(x)$, as
$$n \int_{-\infty}^{\infty} \left(F_n(x) - F(x)\right)^{2} w(x)\, dF(x)$$
where $x_1 < x_2 < \dots < x_n$ are the ordered sample values; $w(x)$ is a weighting function; $n$ is the number of data points.
The weight function in the Anderson–Darling test is
$$w(x) = \left[F(x)\left(1 - F(x)\right)\right]^{-1}$$
which gives more weight to observations in the tails of the distribution, thus making the test sufficiently sensitive to outliers and suitable for detecting deviation from normality in the tails of the distribution.
The test statistic has the form
$$A_n^{2} = -n - \sum_{i=1}^{n} \frac{2i - 1}{n}\left[\ln F(X_i) + \ln\left(1 - F(X_{n+1-i})\right)\right]$$
where $\{X_1 < \dots < X_n\}$ are the ordered data points of the sample; $n$ is the number of data points in the sample.
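As an illustration of the Anderson–Darling statistic for the normality case, the sketch below evaluates it with the mean and standard deviation estimated from the sample. It returns only the statistic; turning it into a p-value requires a small-sample correction and tabulated or approximated critical values, which are left out here. The function name is illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_statistic(sample):
    """A^2 for a normal null hypothesis with mean and std estimated from the sample."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    # i runs from 1 to n; X_(n+1-i) is element n - i in zero-based indexing.
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


if __name__ == "__main__":
    # Hypothetical samples: one close to normal, one strongly right-skewed.
    normal_like = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
    skewed = [2.0 ** i for i in range(20)]
    print(anderson_darling_statistic(normal_like))
    print(anderson_darling_statistic(skewed))
```

Because the logarithm of the fitted cdf is taken directly, extreme outliers whose cdf rounds to exactly 0 or 1 would need clipping before use.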
In this test, the decision to reject or not reject the null hypothesis is based on comparing the p-value for the hypothesis test with the specified significance level, rather than comparing the test statistic with the critical value.
The Lilliefors test [45] is a two-tailed goodness-of-fit test, suitable when the parameters of the null distribution are unknown and need to be estimated. This contrasts with the one-sample Kolmogorov–Smirnov test, which requires the null distribution to be fully specified.
The test statistic has the following form:
$$D^{*} = \max_{x}\left|\hat{F}(x) - G(x)\right|$$
where $\hat{F}(x)$ is the empirical cumulative distribution function of the sample data; $G(x)$ is the cumulative distribution function of the hypothesized distribution with estimated parameters that are equal to those of the sample.
The Lilliefors test can be used to test whether a data vector x has a lognormal or Weibull distribution by applying a transformation to the data vector and performing the appropriate Lilliefors test:
- To test x for a lognormal distribution, check whether log(x) has a normal distribution.
- To test x for a Weibull distribution, check whether log(x) has an extreme value distribution.
The Lilliefors test cannot be used when the null hypothesis is not a location-scale family of distributions.
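A minimal sketch of the Lilliefors statistic for the normal case is given below: the largest gap between the empirical step function and a normal cdf fitted with the sample mean and standard deviation. Only the statistic is computed; the p-value relies on Lilliefors' tables or simulation and is not reproduced. The function name is illustrative.

```python
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(sample):
    """Maximum distance between the empirical cdf and the fitted normal cdf."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    # The empirical cdf jumps at each data point, so both sides of the step are checked.
    d_plus = max((i + 1) / n - fitted.cdf(v) for i, v in enumerate(x))
    d_minus = max(fitted.cdf(v) - i / n for i, v in enumerate(x))
    return max(d_plus, d_minus)


if __name__ == "__main__":
    # Hypothetical samples: one close to normal, one strongly right-skewed.
    normal_like = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
    skewed = [2.0 ** i for i in range(20)]
    print(lilliefors_statistic(normal_like), lilliefors_statistic(skewed))
```

To apply the lognormal variant mentioned above, the same function would be called on the logarithms of the data.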
The Jarque–Bera test [46] is a two-tailed goodness-of-fit test, suitable when a fully specified null distribution is unknown and its parameters need to be estimated.
The test is specifically designed for alternatives in the Pearson distribution system.
The test statistic is
$$JB = \frac{n}{6}\left(s^{2} + \frac{(k - 3)^{2}}{4}\right)$$
where $n$ is the sample size; $s$ is the sample skewness; $k$ is the sample kurtosis. For large samples, the test statistic has a $\chi^{2}$ distribution with two degrees of freedom.
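The Jarque–Bera statistic is simple enough to compute directly from the sample moments, as the sketch below shows. The p-value uses the large-sample chi-square approximation with two degrees of freedom, whose survival function is exp(-x/2); for small samples this approximation is rough and statistical packages typically use tabulated or simulated critical values instead. The function name is illustrative.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) using biased central moments."""
    n = len(sample)
    m = sum(sample) / n
    m2 = sum((v - m) ** 2 for v in sample) / n
    m3 = sum((v - m) ** 3 for v in sample) / n
    m4 = sum((v - m) ** 4 for v in sample) / n
    s = m3 / m2 ** 1.5   # sample skewness
    k = m4 / m2 ** 2     # sample kurtosis (3 for a normal distribution)
    jb = n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function, 2 dof
    return jb, p_value


if __name__ == "__main__":
    # Hypothetical symmetric sample.
    print(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]))
```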
Two hypotheses are considered: H0, the data have a normal distribution; H1, the data do not have a normal distribution.
Four criteria are used to assess the distribution of data in the vectors of selected features.
- Criterion H: if H = 0, the null hypothesis is not rejected; if H = 1, the null hypothesis is rejected.
- p-value: if p is greater than the accepted significance level α, the null hypothesis is not rejected; if p < α, the null hypothesis is rejected.
- Test statistic of the method used (stat): calculated as described above.
- Critical value (CV): if the value of stat is greater than CV, the null hypothesis is rejected at the accepted significance level α.
2.2. Test for Informativeness of the Selected Features by Cross-Validation
To check the informativeness of the selected features, the k-Fold, Hold-Out, and Leave-One-Out methods were used. Evaluating the features on different subsets of the data shows which of them are truly informative, that is, which increase the predictive power of the models and the efficiency of the classifiers.
k-Fold [47] is a technique mainly used to evaluate the performance of machine learning models. The data set is divided into k groups of equal size. The model is trained on (k − 1) groups and tested on the remaining group. The process is repeated k times, with each group being used exactly once as a test sample. The final performance indicator is the average value of the indicators from each group.
In Hold-Out cross-validation [48], the data set is divided into two groups, one for training and one for testing. The model is trained on the training set and then evaluated on the test set. In this work, the data are split by default in MATLAB: 50% of the data are used for training and 50% for testing.
Leave-One-Out [49] is a special case of k-Fold cross-validation, where k is equal to the number of values n in the data set. Each value of the data is used once for testing, while the remaining (n − 1) values are used for training. This process is repeated for each value.
Figure 1 illustrates the principles of operation of the three cross-validation methods used. The example for k-Fold includes three groups. For Hold-Out, the two groups are shown with a 60/40% split between training and test samples. For Leave-One-Out, n iterations with testing on each individual value of the data are presented.
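The three partitioning schemes can be sketched as index generators. This is an illustrative reimplementation, not the MATLAB routines used in the study; the shuffling seed, the function names, and the 50% hold-out fraction passed as a default argument are choices made for the example. When n is not divisible by k, the folds differ in size by at most one element.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Return k (train, test) index pairs; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(fold)), sorted(fold)) for fold in folds]


def hold_out_split(n, test_fraction=0.5, seed=0):
    """Return one (train, test) index pair with the requested test share."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def leave_one_out_splits(n):
    """Leave-One-Out is k-Fold with k equal to the number of observations."""
    return k_fold_splits(n, n)


if __name__ == "__main__":
    for train, test in k_fold_splits(9, 3):
        print("train", train, "test", test)
    print(hold_out_split(10))
    print(len(leave_one_out_splits(5)))
```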
Cross-validation was performed for both classification and regression.
For classification, a method is used with a discriminant function $\delta_k(x)$ of the form
$$\delta_k(x) = x^{T}\Sigma^{-1}\mu_k - \frac{1}{2}\mu_k^{T}\Sigma^{-1}\mu_k + \ln \pi_k$$
where $x$ is the input feature vector; $\Sigma$ is the covariance matrix, which is assumed to be the same for all classes; $\mu_k$ is the mean vector for the corresponding class; $\pi_k$ is the prior probability of class $k$.
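A discriminant function with a covariance shared by all classes, a class mean, and a class prior is the linear discriminant rule. The sketch below shows it for a single feature, where the shared covariance matrix reduces to one pooled variance; the class labels, means, variance, and priors are hypothetical and chosen only to show how an observation is assigned to the class with the largest score.

```python
import math


def lda_scores(x, class_means, pooled_variance, priors):
    """One-feature linear discriminant score for every class."""
    return {
        label: x * mu / pooled_variance
        - mu ** 2 / (2.0 * pooled_variance)
        + math.log(priors[label])
        for label, mu in class_means.items()
    }


def classify(x, class_means, pooled_variance, priors):
    """Assign x to the class with the largest discriminant score."""
    scores = lda_scores(x, class_means, pooled_variance, priors)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical classes described by one feature.
    means = {"day0": 80.0, "day25": 70.0}
    priors = {"day0": 0.5, "day25": 0.5}
    print(classify(78.0, means, 9.0, priors))
    print(classify(72.0, means, 9.0, priors))
```

With equal priors the decision boundary lies midway between the class means; unequal priors shift it towards the less probable class.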
The classification error $e_i$ is defined as the ratio of the number of misclassified data to the total number of data in the feature vector.
The average classification error $\bar{e}$ is determined by averaging the classification errors $e_i$ obtained for each cross-validation method.
In cross-validation by regression, calculations were made for the training samples $X_{train}$ and $y_{train}$ and the test samples $X_{test}$ and $y_{test}$.
A linear regression model of the following type was trained:
$$y_{train} = X_{train}\, b$$
where $y_{train}$ is the dependent variable; $b$ are the model coefficients; $X_{train}$ is the training sample.
With the coefficients of the trained regression model determined, the test data are entered, and the regression equation takes the form
$$\hat{y}_{test} = X_{test}\, b$$
The accuracy of the obtained regression models was checked by means of root mean square error (RMSE), sum of squares (SS), total sum of squares (TSS), and coefficient of determination ($R^2$).
Root mean square error $RMSE_i$ is determined using the formula
$$RMSE_i = \sqrt{\frac{1}{n_i}\sum_{j=1}^{n_i}\left(\hat{y}_{ij} - y_{ij}\right)^{2}}$$
where $n_i$ is the number of data in the $i$-th test sample; $\hat{y}_{ij}$ are the predicted data for the dependent variable; $y_{ij}$ are the actual data for the dependent variable; $j$ is each of the data in the test sample ($1 \le j \le n_i$); $i$ is the number of the test sample ($1 \le i \le k$).
The sum of squares $SS_i$ is calculated using the formula
$$SS_i = \sum_{j=1}^{n_i}\left(\hat{y}_{ij} - y_{ij}\right)^{2}$$
where $n_i$ is the number of data in the test sample; $\hat{y}_{ij}$ are the predicted data for the dependent variable in the test sample; $y_{ij}$ are the actual data for the dependent variable in the test sample; $j$ is each of the data in the test sample ($1 \le j \le n_i$); $i$ is the test sample number ($1 \le i \le k$).
The total sum of squares $TSS_i$ is calculated using the formula
$$TSS_i = \sum_{j=1}^{n_i}\left(y_{ij} - \bar{y}_i\right)^{2}$$
where $n_i$ is the number of data in the test sample; $\bar{y}_i$ is the mean of the predicted data for the dependent variable; $y_{ij}$ is the actual data for the dependent variable; $j$ is each of the data in the test sample ($1 \le j \le n_i$); $i$ is the number of the test sample ($1 \le i \le k$).
The coefficient of determination $R_i^2$ is determined using the formula
$$R_i^{2} = 1 - \frac{SS_i}{TSS_i}$$
where $SS_i$ is the sum of squares of the error for the $i$-th iteration of the cross-validation algorithm; $TSS_i$ is the total sum of squares for the $i$-th iteration of the cross-validation algorithm.
The average values of the criteria for evaluating the regression models during cross-validation were determined: root mean square error, sum of squares of the errors, and coefficient of determination ($\overline{RMSE}$, $\overline{SS}$, and $\overline{R^2}$), which are presented as a result of this analysis:
$$\overline{RMSE} = \frac{1}{k}\sum_{i=1}^{k} RMSE_i, \qquad \overline{SS} = \frac{1}{k}\sum_{i=1}^{k} SS_i, \qquad \overline{R^2} = \frac{1}{k}\sum_{i=1}^{k} R_i^{2}$$
where $k$ is the number of iterations in the cross-validation algorithm; $i$ is the test sample number ($1 \le i \le k$).
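The per-fold criteria and their averages can be sketched as follows. One assumption is made explicit: the total sum of squares is taken about the mean of the observed values, which is the usual convention, whereas the definition in the text refers to the mean of the predicted data. The function names and the sample values are illustrative.

```python
import math


def fold_metrics(y_true, y_pred):
    """RMSE, SS, TSS and R2 for one test sample of a cross-validation run."""
    n = len(y_true)
    ss = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    mean_true = sum(y_true) / n  # assumption: mean of the observed values
    tss = sum((t - mean_true) ** 2 for t in y_true)
    return {"RMSE": math.sqrt(ss / n), "SS": ss, "TSS": tss, "R2": 1.0 - ss / tss}


def average_metrics(per_fold):
    """Average RMSE, SS and R2 over the k iterations."""
    k = len(per_fold)
    return {key: sum(m[key] for m in per_fold) / k for key in ("RMSE", "SS", "R2")}


if __name__ == "__main__":
    # Hypothetical observed and predicted values for two test samples.
    folds = [
        fold_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]),
        fold_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ]
    print(average_metrics(folds))
```

A test sample whose observed values are all identical has a total sum of squares of zero, so such a fold would need to be handled separately before computing the coefficient of determination.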
All data were processed at a significance level of α = 0.05.
3. Results
3.1. Main Characteristics of Hen and Quail Eggs
Table 2 shows the characteristics of hen eggs from producer M1, which were stored for a period of 25 days. During the storage period, the mass of the eggs did not change to a large extent, but some change in the dimensions of the yolk was observed.
The variations in the height and diameter of the yolk were significant and could be due to the decomposition of the internal contents of the egg over time. Egg quality parameters such as AI, HU, and YI show that the albumen and yolk were of acceptable quality during the first days of storage, peaking at approximately day 5 and then gradually deteriorating. Although this affected the EQI, the index remained relatively stable despite these fluctuations, which suggests that reasonable egg quality is maintained for up to 25 days of storage, although signs of structural degradation become apparent towards the end of this period.
The results for hen eggs from producer M2 are presented in Table 3. The egg mass decreased from 60 g on Day 0 to 56.32 g on Day 5 and 51.37 g on Day 25. The De and de values varied considerably, with the lowest values obtained on Day 5. The shell weight was also unusually high on Day 5, which may reflect changes in the shell composition during early storage. The yolk diameter Dy, as well as Dtka and Dtha, showed differences, demonstrating significant changes during the first days of storage. The key egg quality indices, AI, HU, and YI, improved at the beginning of storage. They reached their maximum values by Day 5 and then remained at this level or decreased slightly towards Day 25 of storage. The egg quality index EQI increased linearly during the storage period and reached a maximum value of 103.44 on Day 20, indicating relatively good quality during the storage period, despite some fluctuations in shell and yolk characteristics.
The characteristics of hen eggs from producer M3 for a storage period of 25 days are presented in Table 4. The mass of the eggs decreased from We = 60.33 g on day 0 to We = 54.96 g on day 5, before stabilizing around 58 g after day 15. The values of De and de decreased by day 5 and then stabilized. In the case of Ws, the increase on day 5 to 38.96 g indicates that the changes in the composition of the shell were uniform and smooth. The height and diameter Dy of the yolk also varied during the storage period, in the direction of flattening. The egg quality indicators, albumen index, yolk index, and Haugh unit, indicated that egg quality was satisfactory up to Day 5, with high AI and HU values followed by a gradual downward trend during storage. EQI increased significantly on Day 15 to 96.45, indicating an overall improvement in quality up to that point, followed by maintenance of egg quality. The results showed that despite the early changes, M3 eggs maintained relatively good quality during the storage period of 25 days.
Table 5 presents data on the main characteristics of quail eggs from producer M1, stored for a period of 25 days. The mass of the eggs varied with the day of analysis: it was 12 ± 1.67 g on day 0 and reached a maximum of 13.08 ± 0.44 g on day 5. From that point on, it gradually decreased until day 15, when it reached 10.07 ± 0.39 g. The minimum values were obtained on days 10 and 15 of the storage period. Among the internal characteristics of the eggs, the long axis of the yolk steadily decreased from 27.05 ± 0.87 mm on day 0 to 23.45 ± 1.91 mm on day 15, indicating shrinkage over time. The diameters of the dense and sparse albumen fluctuated: Dtka decreased sharply on day 15, whereas Dtha increased until day 5 and then fluctuated. The quality indicators analyzed were the Haugh unit, yolk index, and egg quality index. All of these characteristics changed during storage, with HU decreasing from 75.51 ± 1.69 on day 0 to 70.58 ± 1.54 on day 25.
Table 6 shows the characteristics of quail eggs from producer M2 over a storage period of 25 days. The egg mass We remained constant, around 12.6 g from day 0 to day 5, decreased to 10.05 ± 0.35 g on day 15 and remained almost constant with some fluctuations. Regarding internal quality, the characteristics related to the dimensions of the yolk and albumen varied with time. The longitudinal axis of the yolk Dy followed a gradual decrease from 25.3 ± 1.19 mm on day 0 to 23.48 ± 1.78 mm on day 15, after which it remained relatively constant. Regarding the dimensions of the dense and sparse albumen, Dtka and Dtha varied significantly; for example, the dense albumen decreased in size by day 15, while the size of the sparse albumen experienced a significant decrease from day 5 to day 10. Other quality parameters, such as HU, remained relatively stable in this range, varying between 70 and 72 HU throughout the storage period. A similar trend was shown by the egg quality index, which also remained relatively constant.
The data in Table 7 show the changes in the main characteristics of quail eggs from producer M3 during storage for a period of 25 days. The mass of the eggs changed from 11.34 ± 0.43 g on Day 0 to 11.85 ± 0.46 g on Day 5, decreasing to 10.06 ± 0.35 g on Day 15. The internal characteristics of the eggs also underwent changes. The long axis of the yolk (Dy) varied slightly, decreasing from 24.37 ± 1.45 mm on Day 0 to 23.46 ± 1.75 mm on Day 15. The sizes of the dense and sparse albumen behaved similarly.
The dense albumen decreased from 40.81 ± 4.62 mm on day 10 to 37.23 ± 0.91 mm on day 15. The sparse albumen, Dtha, decreased sharply until day 10 and then leveled off. Despite these changes, HU and EQI remained fairly stable during the storage period.
3.2. Color and Spectral Characteristics of Hen and Quail Eggs Yolk and Albumen
The RGB color characteristics of the yolk and the dense and thin albumen of hen eggs for the three producers are presented in Table 8. In the case of the yolk, M1 gives the highest value for R in most cases, reaching a peak of 242.32 on Day 25, followed by M2 and M3. M2 shows the most variation in the G and B components during this period, reaching a value of 63.72 on Day 5 for the blue component.
The changes in the yolk of M3 are similar to those of the other two producers. For the dense and thin albumen, the variation in RGB values is broadly similar for all three producers. For the dense albumen, the change over time in the R, G, and B components is relatively small. This is especially true for M1, whose values change only slightly. M2 and M3 show the most variation, especially in the green and blue components. The RGB values for the thin albumen show significant changes towards the end of the storage period, with M1, M2, and M3 all reaching their peak value for the B component by day 25, indicating that the color change of the albumen is significant.
The RGB color characteristics of the internal parts of quail eggs for the three producers M1, M2, and M3 (Table 9) vary over time in yolk (Y), dense albumen (TkA), and sparse albumen (TnA), with a decrease in the red (R) component in the yolk observed in all producers from Day 0 to Day 25.
The largest decrease was found for M2, going from 227.48 ± 4.21 on average to 229.98 ± 7.04. Similarly, it can be seen that the G and B components also show fluctuations: M2 with the most pronounced decreases, especially in component B, which decreased from 48.47 ± 22.67 to 37.77 ± 8.39. The other producers, M1 and M3, have more consistent RGB values for the yolk, although M3 has slight variations in the green component, which gradually decreases from 163.92 ± 5.08 to 152.77 ± 6.94 by day 25. These changes indicate a color variation in the yolk, which is related to storage conditions and the degradation of pigments over time. The RGB of the dense and sparse albumen also show differences in the B (blue) component. In the dense albumen of M1 and M3, the values of the red and green components maintained relatively stable values, while that of M2 showed significant fluctuation in its blue values, especially on day 25, from 246.45 ± 17.49 to 225.09 ± 27.61. Regarding the sparse albumen, the same trend was observed, i.e., M2 and M3 showed greater variation in their blue and green components, respectively, while in M1, they remained relatively constant. The color stability in each of the egg white components over time indicates that while the yolk may show more noticeable visual changes over time, the egg white retains its color profile, although there are specific differences between individual manufacturers.
Figure 2 shows the averaged spectral characteristics of elements from hen eggs for producer M1. The spectral characteristics of the yolk overlapped in the first days of storage, and a tendency towards their differentiation was first observed after day 10. Relatively good separation was observed at wavelengths around 600 nm. In the dense albumen, the spectral characteristics were separated in the first days of storage, while after day 15 they overlapped. Sufficient separation was visible in the range from 450 to 650 nm. The results are similar for the sparse albumen. In the first days of storage, the spectral characteristics overlapped, while after day 15 they could be distinguished from each other. As in the dense albumen, relatively good separation of the spectral characteristics for individual days of storage was observed in the sparse albumen in the range from 450 to 650 nm.
Figure 3 shows the averaged spectral characteristics of elements from hen eggs for producer M2. In the yolk, the spectral characteristics were distinct in the range from 380 to 450 nm. In the other spectral ranges, the spectral characteristics overlapped in the first days of storage and became distinct again after day 15. In the dense and sparse albumen, the same changes in the spectral characteristics were observed. Separability was observed at the beginning of the spectral range, from 380 to 450 nm. In the first days of storage, the spectral characteristics of the two types of albumen overlapped, and after day 15 they became distinct again.
Figure 4 shows the averaged spectral characteristics of elements from hen eggs for producer M3. During the first days of storage, the spectral characteristics of the yolk overlapped, and a tendency towards their differentiation was first observed after day 10. In this case, relatively good separation was observed at wavelengths around 600 nm. In the dense albumen, the spectral characteristics were separated in the first days of storage, but from day 15 onwards they overlapped. Sufficient separation was observed between 450 and 650 nm. For the sparse albumen, the results were the same. In the first days of storage, the spectral characteristics overlapped, whereas after day 15 a difference between them could be detected. As in the dense albumen, relatively good separation of the spectral characteristics for individual days of storage was observed in the sparse albumen in the range from 450 to 650 nm.
Figure 5 shows the averaged spectral characteristics of quail egg elements for producer M1. The characteristics of the yolk overlapped from day 0 to day 10 and could be clearly distinguished after day 15. In the dense albumen, a separation of the spectral characteristics was observed over the entire spectral range (from 380 to 780 nm). Partial overlap was observed in the first days of storage (from day 0 to day 10), after which the individual days of egg storage were clearly distinguished. In the sparse albumen, there was a strong overlap of the spectral characteristics in the first days of storage. After day 15, the spectral characteristics could be visibly distinguished in the spectral range above 600 nm.
Figure 6 shows the averaged spectral characteristics of quail egg elements for producer M2. The spectral characteristics of the yolk overlapped in the first days of the storage period (up to day 10). Then, from day 15 onwards, they could be clearly distinguished. Distinction was also observed in the range of 400–550 nm, while after 600 nm, partial overlap of the spectral characteristics was observed again. Strong overlap of the spectral characteristics was seen in both the dense albumen and the sparse albumen. Distinction was seen in the range of 400–600 nm for the sparse albumen. In the remaining spectral ranges, the characteristics overlapped.
Figure 7 shows the averaged spectral characteristics of quail egg elements for the manufacturer M3. The spectral characteristics of the yolk overlapped in the first days of the storage period (up to day 10). Then, from day 15 onwards, they could be clearly distinguished. Distinction was also observed in the range of 400–550 nm, while after 600 nm, partial overlap of the spectral characteristics was observed again. In the dense and sparse albumen, distinction was seen in the range of 400–600 nm. In the remaining spectral ranges, the characteristics overlapped.
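One way to make the visual notion of separation between storage days more concrete is to compute, at each wavelength, the smallest gap between the averaged spectra of any two days, and to keep the wavelengths where that gap exceeds a chosen margin. The following is a minimal sketch under stated assumptions: the spectra, the wavelength grid, the `min_gap` margin, and the function names are all hypothetical and do not come from the study, which describes separation qualitatively from the figures.

```python
from itertools import combinations


def min_pairwise_gap(spectra):
    """For each wavelength index, return the smallest absolute difference
    between the averaged spectra of any two storage days."""
    n_points = len(next(iter(spectra.values())))
    gaps = []
    for i in range(n_points):
        values = [spectrum[i] for spectrum in spectra.values()]
        gaps.append(min(abs(a - b) for a, b in combinations(values, 2)))
    return gaps


def separable_wavelengths(wavelengths, spectra, min_gap):
    """Return the wavelengths at which every pair of days differs by at least min_gap."""
    gaps = min_pairwise_gap(spectra)
    return [w for w, gap in zip(wavelengths, gaps) if gap >= min_gap]


# Hypothetical averaged spectra (percent), keyed by storage day.
wavelengths = [400, 500, 600, 700]  # nm, coarse illustrative grid
spectra = {
    0: [10.0, 30.0, 50.0, 60.0],
    15: [11.0, 35.0, 58.0, 61.0],
    25: [12.0, 41.0, 66.0, 62.0],
}

separable = separable_wavelengths(wavelengths, spectra, min_gap=4.0)
print(separable)
```

With these illustrative numbers only the middle of the range is reported as separable. In practice the margin would have to be chosen relative to the spread of the individual measurements, which this sketch ignores.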
3.3. Feature Selection for Yolk and Albumen from Hen and Quail Eggs
For the yolk (Y) and albumen (A) of hen eggs (H) from producer M1, the following feature vectors (FV) were selected (Table 10).
For the yolk, color-related features such as C2 (RReliefF = 0.71, SFCPP = 0.65, mean = 0.67) and R4 (all criteria = 1.00) stand out due to their higher values. The 0.6 line serves as a threshold: features with weights above it are considered more informative. Features such as C5 (RReliefF = 1.18, mean = 0.54) and R5 (RReliefF = 0.63, SFCPP = 0.94, mean = 0.59) also have high weighting coefficients under individual methods, although their mean values fall below the threshold. Similar trends are observed for the albumen. Features such as C5 (RReliefF = 0.65, SFCPP = 0.07, mean = 0.24) and S10 (RReliefF = 0.41, SFCPP = 0.94, mean = 0.65) exceed the 0.6 line under at least one method, which indicates that they are sufficiently informative.
The features selected by the three methods for producer M1 are summarized in Table 11.
For the yolk (Y) and albumen (A) of hen eggs (H) from producer M2, the following feature vectors (FV) were selected (Table 12).
In the yolk, color features such as C1 (RReliefF = 0.72), C8 (RReliefF = 0.72, FSRNCA = 0.69), and R9 (RReliefF = 0.63, SFCPP = 0.74) stand out with weight coefficients above the threshold of 0.6, which indicates that these features are sufficiently informative. Features with values below 0.6 include C7 (RReliefF = 0.55) and R5 (RReliefF = 0.42, FSRNCA = 0.29). Albumen features such as S9 (SFCPP = 0.95, mean = 0.49), We (SFCPP = 0.83), and Dtha (RReliefF = 1.40) have weight coefficients that exceed the threshold of 0.6 under the stated method and are important for the subsequent analyses. Color indices such as C7 (RReliefF = 0.68) and C9 (SFCPP = 0.93) also show informative weighting coefficients, while C1 (RReliefF = 0.43) and S1 (RReliefF = 0.62, only marginally above the threshold) are less significant in these cases.
The features selected by the three methods for producer M2 are summarized in Table 13.
For the yolk (Y) and albumen (A) of hen eggs (H) from producer M3, the following feature vectors (FV) were selected (Table 14).
In the yolk, the color characteristics C5 (RReliefF = 0.98), C6 (SFCPP = 0.99), R4 (RReliefF = 1.40), and R9 (SFCPP = 1.00) are informative. Of the color indices, C8 (RReliefF = 0.67) and R7 (RReliefF = 0.65) also show sufficient informativeness. In the albumen, the color indices C6 (SFCPP = 0.99), R9 (SFCPP = 1.00), and R7 (RReliefF = 0.61) stand out as informative. Of the technological characteristics, We (RReliefF = 0.71) and Dtha (RReliefF = 1) have the highest weighting coefficients. For this producer, the spectral indices are less informative than the other characteristics.
The features selected using the three methods for producer M3 are summarized in Table 15.
For the yolk (Y) and albumen (A) of quail eggs (Q) from producer M1, the following feature vectors (FV) were selected (Table 16).
For the yolk, feature C2 stands out with high weighting coefficients, with both FSRNCA (0.76) and SFCPP (0.64) above the threshold of 0.6, making it sufficiently informative. In addition, C3 has an informative RReliefF value of 0.67, while the SFCPP weight of C4 reaches 0.87. In the albumen data, several features with relatively high weighting coefficients are observed. For example, feature R5 has an RReliefF value of 0.62, although its SFCPP value is only 0.38, while the spectral index S7 has an informative RReliefF value of 0.64. The FSRNCA weight is particularly high for feature R12, where it reaches 0.85.
The features selected using the three methods for producer M1 are summarized in Table 17.
For the yolk (Y) and albumen (A) of quail eggs (Q) from producer M2, the following feature vectors (FV) were selected (Table 18).
In the yolk, several features have weight coefficients exceeding the threshold of 0.6, making them informative for classification or prediction. For example, color feature C1 has the highest weight coefficient under the SFCPP method, with a score of 1.00, well above the threshold, although its values under the other selection methods are lower. Feature C5 is also informative, with an RReliefF score of 0.64. Feature R4 combines an RReliefF value above the threshold (0.61) with a moderate FSRNCA value (0.50). Feature R3 stands out under SFCPP, with a weight coefficient of 0.79, and feature R11 is informative under two methods, with scores of 0.65 for FSRNCA and 0.64 for SFCPP. The albumen data also contain several features with weight coefficients exceeding 0.6. Feature S2 is significant under SFCPP selection, with a score of 0.76, indicating that it is important for further analysis. Feature R12 has an SFCPP score of 1.00 but an FSRNCA value of only 0.26. Similarly, feature S4 has a high weight under SFCPP, with a score of 1.00. Feature R5 has high weight coefficients under two of the three selection methods, FSRNCA (0.63) and SFCPP (0.74), while its RReliefF value is lower (0.44).
The features selected using the three methods for producer M2 are summarized in Table 19.
For the yolk (Y) and albumen (A) of quail eggs (Q) from producer M3, the following feature vectors (FV) were selected (Table 20).
In the yolk, the highest weight coefficient is observed for “Dtka”, with a value of 1 under RReliefF and 0.66 under FSRNCA. “Ws” and “de” are also sufficiently informative, both showing high weight coefficients under RReliefF and SFCPP. The features “Htha” and “Dtha” also have balanced weight coefficients. For the albumen, “Dtka” again stands out with the highest weight coefficient under RReliefF, with a value of 1. The features “We”, “C9”, and “de” show high weight coefficients under SFCPP. Features with RReliefF values above 0.6, such as “We” and “R7”, are likewise informative.
The features selected using the three methods for producer M3 are summarized in Table 21.
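The thresholding logic described in this section can be sketched in a few lines. This is only an illustration under assumptions: the weights below are hypothetical and not taken from the tables, the feature names are placeholders, and the `min_methods` parameter is introduced here because the text does not state whether a feature must exceed the 0.6 line under one method, several methods, or on average.

```python
THRESHOLD = 0.6  # the "0.6 line" referred to in the text

# Hypothetical weights for illustration only (not values from the tables).
weights = {
    "feat_a": {"RReliefF": 0.71, "FSRNCA": 0.66, "SFCPP": 0.65},
    "feat_b": {"RReliefF": 0.63, "FSRNCA": 0.20, "SFCPP": 0.94},
    "feat_c": {"RReliefF": 0.41, "FSRNCA": 0.30, "SFCPP": 0.55},
    "feat_d": {"RReliefF": 0.65, "FSRNCA": 0.05, "SFCPP": 0.07},
}


def methods_above(feature_weights, threshold=THRESHOLD):
    """Return the names of the methods whose weight exceeds the threshold."""
    return sorted(m for m, w in feature_weights.items() if w > threshold)


def mean_weight(feature_weights):
    """Return the mean weight of one feature over all methods."""
    return sum(feature_weights.values()) / len(feature_weights)


def select_features(all_weights, threshold=THRESHOLD, min_methods=1):
    """Keep features that exceed the threshold under at least min_methods methods."""
    return [
        name
        for name, fw in all_weights.items()
        if len(methods_above(fw, threshold)) >= min_methods
    ]


selected_any = select_features(weights, min_methods=1)
selected_two = select_features(weights, min_methods=2)
for name in selected_any:
    fw = weights[name]
    print(name, methods_above(fw), round(mean_weight(fw), 2))
```

The two selections differ for a feature such as the hypothetical `feat_d`, which is above the line under a single method only. This mirrors the cases in the text where one weight is high while the mean over the methods is below 0.6.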
3.4. Normal Distribution of Yolk and Albumen Features from Hen and Quail Eggs
For hen egg producer M1, most of the characteristics were normally distributed on all days from 0 to 25, suggesting stability in both the yolk and albumen. In the yolk, several characteristics, such as “C5”, “R5”, “S9”, and “AI”, deviated from normality on days 5, 15, and 25. Similarly, in the albumen most characteristics were normally distributed, although some, such as “C5”, “C6”, “R1”, “R14”, “S3”, and “AI”, were not on days 10, 15, or 25, indicating minor instability in the data over time. These were isolated deviations, and most characteristics remained stable during the storage period.
For hen egg producer M2, deviations from the normal distribution were evident in the yolk for characteristics such as “C5”, “C6”, “C9”, and “R1” on days 10, 15, or 25, indicating that there may be some instability in these data. In the albumen, the stability was generally similar, but features such as “C5”, “C7”, “C8”, and “S4” showed deviations, mainly around days 10 and 20. Such deviations were more frequent on the later days of storage, days 20 and 25, which suggests a gradual loss of normality with increasing storage time in both the yolk and albumen, although the majority of the characteristics studied remained stable.
For hen egg producer M3, most of the characteristics were normally distributed on each day. In the yolk, some characteristics deviated from the normal distribution on different days; for example, “C3”, “C8”, “C10”, and “Hy” deviated on days 10, 15, and 25. In the albumen, the characteristics were mostly stable, except for some deviations in “S2”, “S10”, “Dtha”, and “Htka” on days 10 and 25. Thus, while the majority of the characteristics maintained a normal distribution throughout the storage period, both yolk and albumen showed deviations, especially in the later days of storage.
For quail eggs from producer M1, many of the yolk characteristics deviated from the normal distribution during the storage period, especially on days 10 and 25, in the features “C2”, “C3”, “R3”, “R6”, “R14”, and “De”. In the albumen, the majority of the features maintained their normal distribution, although some, such as “C3”, “C4”, “C5”, “C6”, “C11”, and “Dtha”, showed minimal deviations, especially on days 10 and 25. In both the yolk and albumen, deviations from the normal distribution were more frequent around days 10 and 25, indicating that some features became less stable as storage progressed. Nevertheless, a fairly large number of features remained stable during the storage period.
For quail egg producer M2, most of the yolk features maintained their normal distribution throughout the measurement period. On days 20 and 25, there were some deviations from the normal distribution, especially in the features “C1”, “C5”, “R3”, “S10”, “Ws”, “de”, and “Dtka”. For the albumen, most of the characteristics were stable, but characteristics such as “C2”, “C3”, “R5”, “R7”, “S1”, “S5”, and “We” deviated from the normal distribution, especially on days 10, 15, and 25. The characteristics of both the yolk and the albumen were less stable around day 25, which may indicate that the normal distribution gradually breaks down for certain characteristics as storage time increases. Despite these occasional deviations, the characteristics were largely stable throughout the storage period.
For quail egg producer M3, some yolk characteristics deviated from the normal distribution, especially on days 0 and 25. The characteristics “C2”, “R4”, “R14”, “S5”, “S10”, and “De” were unstable, especially on days 0, 15, and 25, which means that their distribution deviated from normality at the beginning and end of the storage period. In the albumen, deviations from the normal distribution were more frequent on days 10 and 25, for the characteristics “C2”, “C5”, “C7”, “C8”, “R5”, and “R7”. Similar to the yolk, the albumen features showed increased deviation towards the later days of storage, around day 25. Despite these deviations, the features maintained a normal distribution for most of the storage period in both the yolk and albumen, with deviations becoming more evident with prolonged egg storage.
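A per-day normality check of the kind described above can be sketched with the Jarque–Bera test, one of the three tests named in the summary of this work. The sketch is an assumption-laden illustration: the sample values are hypothetical, the function names and the significance level `alpha = 0.05` are chosen here, and the p-value is the asymptotic chi-square approximation, which is known to be rough for small samples. The Anderson–Darling and Lilliefors tests are not reproduced.

```python
import math


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("constant sample: skewness and kurtosis are undefined")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-square distributed
    # with 2 degrees of freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


def non_normal_days(samples_by_day, alpha=0.05):
    """Return the storage days on which normality is rejected at level alpha."""
    return [
        day
        for day, values in sorted(samples_by_day.items())
        if jarque_bera(values)[1] < alpha
    ]


# Hypothetical measurements of one feature on two storage days.
samples_by_day = {
    0: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    25: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 50.0],
}

flagged = non_normal_days(samples_by_day)
print(flagged)
```

Running such a check for every selected feature and every storage day yields the kind of per-day picture summarized in this section, where a feature counts as stable only if normality is not rejected on any day.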
3.5. Cross-Validation of Yolk and Albumen Features from Hen and Quail Eggs
The analysis methods are presented with their abbreviations: k-Fold (k-F); Hold-Out (H-O); Leave-One-Out (L-O-O).
For hen egg producer M1, about 27 out of 30 yolk characteristics performed well under all three methods (k-Fold, Hold-Out, Leave-One-Out). This means that they are sufficiently informative and can be used for classification and prediction with lower error values. More variability was observed for the albumen. While some characteristics, such as C6, C7, R7, S11, Dtka, Dtha, and EQI, performed consistently well under all methods, others, such as C3, C5, and R1, showed discrepancies, mainly for k-Fold and Hold-Out. The “We” feature performed poorly for the albumen under all methods, as shown by its higher error values. Overall, yolk features performed more consistently across the cross-validation methods than albumen features.
The analysis of yolk and albumen features from hen egg producer M2 gave different results for the different methods. For the yolk, the results were mixed: among the 33 yolk features, 18 had low error values under all three methods (k-F, H-O, L-O-O), indicating reliable performance, while some features, such as C1, C2, and R3, performed poorly under two methods but well under Leave-One-Out. For the albumen, most features performed well under all methods, especially C5, C6, C7, C8, R7, and R12. Features such as C9, C10, and R9 had higher error values under some of the methods, especially k-Fold and L-O-O. The cross-validation results were better for the albumen data than for the yolk.
In the cross-validation of the yolk and albumen characteristics for hen egg producer M3, the yolk characteristics showed strong and consistent results across methods. Of the 27 yolk characteristics, 24 showed relatively low error values under all three methods. Only a few characteristics, such as R15 and S4, gave mixed results. For the albumen, in contrast, the results varied. Several characteristics, such as C6, R7, R10, and Dtka, showed consistent values under all methods, while others, such as C3, We, or De, had higher error values under at least some methods, such as k-F and H-O. Yolk characteristics are therefore more reliable for classification and prediction than albumen characteristics, although several albumen features also show reasonably good results.
For quail egg producer M1, 18 out of 30 yolk features had low error values under all methods (k-Fold, Hold-Out, and Leave-One-Out). Some characteristics, such as R3, R6, R11, and S9, were inconsistent in at least two of the methods, whereas others, such as C2, C3, C4, and We, demonstrated consistently low error values under all methods used. Albumen characteristics showed similar variations: while C1, C3, C4, C6, and We consistently had sufficient accuracy, others, such as C7, R5, and R9, showed higher error values in k-F and H-O. In particular, EQI showed higher error values under all methods. Overall, both yolk and albumen characteristics performed well in cross-validation.
In the cross-validation of yolk and albumen features of quail egg producer M2, better results were obtained for yolk features, with low error values in all three methods. Features C5, R4, R11, R12, S3, and S10 had good performance in all three methods. Some features such as C1, R3, and C11 showed variation for the k-F and H-O methods. For albumen, features such as C2, C6, R4, R7, and S1 demonstrated mixed performance with higher error rates in one or more methods. Other features such as C3, R5, S4, S8, Dtka, and We had consistently good results. Some features had clearly higher error values in certain methods, such as Dtha, Hy and EQI. Yolk features are more reliable compared to albumen, although a number of albumen features performed well in all cross-validation methods used.
The yolk features of quail egg producer M3 performed relatively well: 16 out of 26 showed consistently low error values. Among the yolk features, C2, C11, R4, S5, S9, S10, and We had low error values under all methods, while R14, S4, and de were inconsistent in k-F and H-O. In the albumen, the features C8, Dy, Dtka, and We consistently performed well, but a number of others showed higher error values under all methods. For example, the features HU, YI, and EQI performed poorly under most methods. The yolk characteristics were more consistent and reliable than those of the egg white.
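The three validation schemes compared in this section can be illustrated with a small self-contained sketch. Everything specific in it is an assumption made for illustration: a one-feature least-squares line predicting the storage day stands in for the model, root-mean-square error stands in for the error value, the fold assignment and the hold-out fraction are arbitrary, and the data are synthetic. The text does not specify these details.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # constant feature: fall back to predicting the mean
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def squared_errors(xs, ys, test_idx):
    """Fit on the samples outside test_idx; return squared errors on test_idx."""
    held_out = set(test_idx)
    train = [i for i in range(len(xs)) if i not in held_out]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]


def rmse(errors):
    return math.sqrt(sum(errors) / len(errors))


def k_fold_rmse(xs, ys, k=3):
    """k-Fold: every sample is held out exactly once (interleaved folds)."""
    errors = []
    for fold in range(k):
        errors += squared_errors(xs, ys, list(range(fold, len(xs), k)))
    return rmse(errors)


def hold_out_rmse(xs, ys, test_fraction=0.3, seed=0):
    """Hold-Out: a single random split into training and test samples."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(xs) * test_fraction))
    return rmse(squared_errors(xs, ys, idx[:n_test]))


def leave_one_out_rmse(xs, ys):
    """Leave-One-Out: k-Fold with as many folds as samples."""
    return k_fold_rmse(xs, ys, k=len(xs))


# Synthetic data: storage day as target, one clean and one noisy feature.
days = [0, 0, 5, 5, 10, 10, 15, 15, 20, 20, 25, 25]
clean = [100.0 - 2.0 * d for d in days]
noisy = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]

for name, feature in (("clean", clean), ("noisy", noisy)):
    print(name, k_fold_rmse(feature, days), hold_out_rmse(feature, days),
          leave_one_out_rmse(feature, days))
```

A feature that tracks storage time closely gives a low error under all three schemes, while a weakly related one does not. Because Hold-Out relies on a single split, its result depends on the seed, which is one plausible source of the discrepancies between methods noted above.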
3.6. Summary Analysis of the Results
In this work, 53 features for yolk and albumen of chicken and quail eggs were analyzed. The obtained results are presented in tabular and graphical form, which allows for additional checks. The most informative features were selected using three feature selection methods: FSRNCA, SFCPP, and RReliefF. The statistical properties of the selected features were analyzed to check whether they follow a normal distribution, using the Anderson–Darling, Lilliefors, and Jarque–Bera tests. In addition, the informativeness of the selected features was checked by cross-validation with the k-Fold, Hold-Out, and Leave-One-Out methods. The results are presented in summarized form for both quail and chicken eggs.
Table 22 presents summarized data on the obtained results. For hen egg yolks, good cross-validation performance was observed, especially for M1 and M3, while the characteristics of the whites were less consistent: they generally had lower cross-validation results, and even M1 gave poor values. For quail eggs, mixed results were observed: egg whites from producer M1 showed a normal distribution and positive cross-validation, whereas the characteristics of egg whites from M3 showed poor validation despite the large number of selected features. Overall, the data for hen egg yolks performed best, compared with hen egg whites and with quail egg yolks and whites.
Figure 8 shows, for hen eggs, the percentage of selected features that had a normal distribution and the percentage that had positive cross-validation. For all three producers, M1, M2, and M3, the yolk features performed better than the egg white features in cross-validation: 92.59% of the yolk features from producer M1 and 88.46% from M3 showed positive results, although only 55–61% of them had a normal distribution on all days of the storage period. In cross-validation, 47.62% of the selected egg white features from producer M1 had positive results, while for M2 and M3 this percentage was higher, at 59.26% and 57.14%, respectively.
Figure 9 shows, for quail eggs, the percentage of selected features that had a normal distribution and the percentage that had positive cross-validation. With the exception of M3, the albumen features performed better in cross-validation, and the majority of them had a normal distribution over the entire storage period. For producer M1, only 3.57% of the yolk features showed a normal distribution over the entire storage period, while about half of them (53.57%) had positive cross-validation results. For the albumen data of this producer, slightly more than half of the selected features (about 63%) performed well in both the normal distribution check and the cross-validation. For producer M2, 60% of the selected yolk features had positive results in both the normal distribution check and the cross-validation, and about half of the egg white features performed well on both criteria. For producer M3, only a very small proportion of the selected albumen features had positive results in cross-validation (16.67%), whereas about half of the selected yolk features showed a normal distribution over the storage period as well as positive results in cross-validation.
4. Discussion
The present work extends and supplements studies found in the available literature. Narzassi et al. [50] measured only yolk color to determine that the crossbreeding of F2 Mahkota Arabian chickens improves egg quality, including shape, yolk, albumen, Haugh unit, and shell color, compared to other chicken groups. The color and spectral characteristics of yolk and albumen used in the present work allow a more detailed analysis and identification of changes in egg properties.
The spectral characteristics of the egg components show a strong peak at approximately 520 nm, indicating a high content of pigments, especially carotenoids and xanthophylls. These naturally occurring bioactive substances are responsible for the yellow color of the yolk.
The proposed informative features for yolk and albumen can be used to extend the work of Kurşun et al. [51] on chicken and quail eggs in university research centers, especially regarding the relationship of these indices with egg composition.
Color and spectral characteristics can be successfully used as an alternative to the indirect and subjective determination of yolk color with a color fan, as was done in the work of Hawari et al. [52] and Bovšková et al. [53]. Spectral indices have a significantly greater predictive potential for egg properties [54]. Their use would allow relatively high prediction accuracy with classical regression methods, instead of significantly more complex machine learning methods and color features [55].
According to Sunwoo et al. [56], carotenoids are among the pigments responsible for light absorption in the blue-green region of the spectrum, at approximately 520 nm. They are essential to human health because of their antioxidant properties and because they protect the eyes against impairments.
Although this work provides useful data for tracking internal egg characteristics during storage, we acknowledge that further validation is needed. Further research with a larger sample size is recommended to improve statistical robustness and ensure wider applicability. Testing under varied storage conditions, such as different temperatures, humidity levels, and other environmental factors, would make it possible to assess the adaptability of the proposed methods. Extending the study to include different egg production systems, seasonal changes, and geographic diversity would further strengthen the reliability of the findings.
The analysis shows that the informativeness of the examined features varies across producers, so a universal generalization of the results across all producers is difficult. Future stages of the study, focusing on predicting storage duration and egg composition, will include additional analyses.