1. Introduction
The global protein market is witnessing a significant surge in demand and is expected to reach $17.4 billion by 2027. This escalating demand is primarily driven by recognition of the health benefits associated with protein consumption. Plant protein has gained considerable attention due to its affordability and diverse sources, which include beans, cereals, and oilseeds [1, 2, 3]. These proteins are not only cost-effective but also have lipid-lowering properties and contribute to the prevention of cardiovascular and cerebrovascular diseases [4, 5, 6]. For many years, soya has been the dominant player in the plant protein market. However, concerns regarding genetic modification have shifted consumer focus towards other plant sources. Peanuts have emerged as a healthy and popular alternative source of plant protein [7, 8, 9]. In 2022, global peanut production reached 50.27 million tons, with 40% of the yield used for oil extraction. The combined production of high-temperature and low-temperature peanut meal has reached 7.98 million tons. High-temperature peanut meal is primarily used for feed, while low-temperature peanut meal is used to prepare peanut protein powder, which has a protein content of 50% [10]. Peanut protein powder is highly valued for its nutritional efficiency (58), ease of absorption by the human body (with a digestibility close to 90%), and unique peanut aroma [11]. It is widely used in a variety of food products, including beverages, meat products, and flour products [12, 13]. Despite its popularity, the quality of peanut protein powder available in the market varies significantly. There is a pressing need for rapid and standardized quality evaluation methods to ensure consumers receive high-quality products.
The evaluation of protein powder quality employs a variety of methods, including chemical techniques such as the Kjeldahl method, Soxhlet extraction, and direct drying, as well as near-infrared (NIR) spectroscopy and high-performance liquid chromatography. Among these, NIR spectroscopy has gained considerable interest from researchers because it is non-destructive and delivers fast and straightforward results [14, 15, 16, 17]. The main indicators for evaluating the quality of protein powder include protein, fat, moisture, and trace substances such as crude fiber, ash, vitamins, amino acids, trace elements, and carbohydrates [10, 18]. Furthermore, both enterprises and consumers are mainly concerned about the protein content during procurement and purchasing. In addition, they pay significant attention to controlling the fat and moisture content during production, storage, and sales to maintain the product’s quality and shelf life [19]. The Kjeldahl, Soxhlet extraction, and direct drying methods were used in one study to evaluate the protein, fat, and moisture contents of peanut protein powder [20]. That work revealed several drawbacks of these methods: they were time-consuming, required complex sample pretreatment, and relied on organic reagents. Recently, near-infrared spectroscopy in conjunction with principal component regression (PCR), partial least squares regression (PLS), support vector machine regression (SVR) [21], and generalized regression neural network (GRNN) methods has been employed to monitor the quality of powdered samples, including insect powder [22], wheat powder [23], and cottonseed powder [24]. So far, PLS methods have not been applied to the quality detection of peanut protein powder [25]. Therefore, there is an urgent need to establish a rapid evaluation method for assessing the quality of peanut protein powder. This would not only ensure the delivery of high-quality products to consumers but also streamline the production and sales process for enterprises. The development and implementation of such a method would contribute significantly to the protein powder industry.
In this study, a comprehensive evaluation method was used to assess the quality of peanut protein powder. A total of 51 peanut varieties were selected from China’s main planting areas, including 31 high-oleic varieties. The peanut protein powder was prepared using a low-temperature physical pressing process to retain the nutrients in the peanuts. At the sample selection stage, various processes and different production line models were used to prepare the powder. To the best of our knowledge, this is the first study in the literature to cover peanut protein powder from multiple varieties, processes, and commercial products. This multifaceted approach was crucial in ensuring a thorough and accurate assessment of the product quality. The determination model for protein, fat, and moisture content in peanut protein powder was established by combining PLS and GRNN, presenting a novel method for evaluating the quality of peanut protein powder. Importantly, the PLS and GRNN models were applied for the first time to the detection of peanut protein powders, and the GRNN model outperformed the PLS model in predicting the primary components of peanut protein powder. Through the integration of these analysis technologies, the accuracy and reliability of quality assessment were significantly enhanced.
3. Results
3.1. Chemical Indexes Analysis of Peanut Protein Powder
Currently, the market offers peanut protein powder with a wide range of fat content, which is greatly influenced by production conditions such as the number of pressings, the type of oil press used, and the degreasing process. Degreasing treatment can effectively extend the shelf life of peanut protein powder. The protein content of peanut protein powder available on the market typically ranges from 35% to 60%, with a moisture content of less than or equal to 10% [10]. To ensure the model’s adaptability to diverse production environments and processes, this study employed a three-step pressing process. Once-pressed peanut protein powders had residual oil content ranging from 12.97% to 29.17% and protein content ranging from 36.19% to 52.28%. Secondary-pressed peanut protein powders had residual oil content between 7.95% and 12.81% and protein content ranging from 39.00% to 58.20%. Defatted peanut protein powders had residual oil content ranging from 0.83% to 7.74% and protein content ranging from 42.00% to 58.78%. This approach produced a broader distribution of fat and protein content, enhancing the model’s predictive capabilities. As shown in Figure 3, the fat content of peanut protein powder was significantly reduced following the secondary pressing and degreasing processes, while the protein content increased. The moisture content appeared to be relatively unaffected by the pressing conditions.
In this study, 126 samples of peanut protein powder were prepared, and the average values of fat, protein, and moisture contents of the samples are reported in Table 3. The fat content varied from 0.83% to 29.17%, the protein content ranged from 36.19% to 58.78%, and the moisture content ranged from 5.34% to 11.60%. The fat and protein contents of the samples fully covered those of commercially available protein powders, including products with high, medium, and low contents. Samples with a moisture content greater than 10% were also included in the training set, which is beneficial for monitoring the production and processing of peanut protein powder because it enables the timely detection and handling of samples with abnormal moisture. Moisture content is an important index of peanut quality and storage stability, and it should be measured, monitored, and controlled during harvesting, drying, processing, marketing, and storage.
The samples were found to be well representative. Liu et al. [11] found that the protein content of peanut meal, a by-product of oil extraction, is about 55%, which lies within the protein content range determined in this study. The protein, lipid, and water contents of the peanut protein powder determined in another study by Zhang et al. [42] were 55.3% (dry basis), 8.8% (dry basis), and 6.1%, respectively. These studies provided valuable insights into the composition of peanut protein powder. The quality indicators of peanut protein powder depend on the specific conditions and parameters used in the production process. The quality indexes of peanut protein powder in this study spanned a wide range that encompasses most peanut protein powder products.
3.2. Near-Infrared (NIR) Spectral Data
The spectral data of 126 peanut protein powder samples are shown in Figure 4, which presents the spectra of different peanut varieties under different preparation processes. The near-infrared spectra of the various samples show similar trends, but their absorption peak intensities differ, indicating differences in composition that can be used to construct a quantitative near-infrared spectroscopy model.
The spectral curve of peanut protein powders followed a trend similar to that of other cereal grain powder samples [43]. The original spectra exhibited peaks at 1200 and 1480 nm, while valleys were observed at 950, 1120, and 1300 nm. The bands observed in the near-infrared region indicated absorption by the hydrogen-containing groups (C–H, N–H, and O–H) of the major components found in the peanut protein powders, such as water, lipids, and proteins.
From Figure 4, we can highlight the following spectral bands [26]:
950 nm: N–H stretch second overtone, proteins; O–H stretch second overtone, water;
1120 nm: C–H stretch second overtone, lipids;
1200 nm: O–H stretch + O–H deformation, water;
1300 nm: C–H combination, lipids;
1480 nm: N–H stretch first overtone, proteins; O–H stretch first overtone, water; C–H combination, lipids.
These groups provided rich structural and compositional information that reflected the unique characteristics of peanut protein powder. These characteristics were advantageous for subsequent model prediction. The spectral data provided a comprehensive overview of the inherent properties of peanut protein powder, offering valuable insights into its composition and structure. This information was instrumental in enhancing the predictive capabilities of the model, thereby facilitating a more accurate analysis of the peanut protein powder.
Near-infrared spectroscopy delivers valuable data on the anharmonic nature of molecular vibrations and peculiarities of intermolecular interactions [44]. The spectral range chosen to develop a near-infrared spectroscopy model in this study closely resembled that used in detection models for substances like nuts and oilseeds. For instance, Zhao et al. [45] utilized hyperspectral images ranging from 950 to 1700 nm to establish a PLS model for the rapid quality control of peanut and walnut powder in flour. Similarly, Faqeerzada et al. [46] employed a line scan hyperspectral imaging system covering a spectral range of 900 to 2494 nm to swiftly and non-destructively screen almond powder samples.
3.3. Principal Component Analysis (PCA)
The essence of principal component analysis (PCA) is feature extraction: it preserves as much of the main classification information of the original space as possible in a feature space whose dimension is far lower than that of the original space. In this way, the original dataset is converted into fewer dimensions without reducing the “effective” information. PCA was employed to analyze the spectral values of the 126 peanut protein powder samples. The variance contribution rates of PC1, PC2, and PC3 were 95.27%, 4.29%, and 0.28%, respectively, and the cumulative contribution rate was 99.85%, as shown in Figure 5. This high cumulative contribution rate showed that the first three principal components captured nearly all the variability in the data, thereby providing a comprehensive representation of the sample information.
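The variance contribution rate of a principal component is its eigenvalue divided by the sum of all eigenvalues of the covariance matrix of the mean-centred data. The following is a minimal pure-Python sketch of that calculation, using power iteration with deflation. The function names and the synthetic four-variable data are assumptions made for illustration; they are not the software or the spectra used in this study.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (samples x variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
            for i in range(p)]


def top_eigenvalues(cov, k, iters=500):
    """Largest k eigenvalues of a symmetric matrix (power iteration + deflation)."""
    p = len(cov)
    a = [row[:] for row in cov]          # work on a copy
    rng = random.Random(0)               # fixed seed keeps the sketch reproducible
    values = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(p)]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:              # nothing left to extract
                break
            v = [x / norm for x in w]
        av = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = sum(v[i] * av[i] for i in range(p))   # Rayleigh quotient
        values.append(lam)
        for i in range(p):               # deflate: remove the component just found
            for j in range(p):
                a[i][j] -= lam * v[i] * v[j]
    return values


def variance_contributions(rows, k):
    """Fraction of total variance explained by each of the first k components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))  # trace = sum of eigenvalues
    return [lam / total for lam in top_eigenvalues(cov, k)]


# Synthetic data (hypothetical): one dominant factor t and a weaker factor u.
rng = random.Random(1)
toy = []
for _ in range(30):
    t, u = rng.gauss(0, 1), rng.gauss(0, 0.2)
    toy.append([t + u, 2 * t - u, 3 * t + 0.5 * u, t])

ratios = variance_contributions(toy, 3)
cumulative = sum(ratios)
print([round(r, 4) for r in ratios], round(cumulative, 4))
```

A cumulative value close to 1 for the first few components is what justifies replacing the full set of wavelengths with a handful of principal component scores.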
PCA is commonly used for extracting feature information and feature wavelengths, thereby improving the effectiveness of the training samples and the recognition accuracy. Yang et al. [47] presented a method for identifying adulterated cocoa powder in which the relative areas of 12 common characteristic peaks in the fingerprints were processed using PCA. Zhang et al. [48] used PCA to study 33 representative traits associated with flavor and found that total sugar, sucrose, and total tocopherol carried the most information related to peanut flavor.
The PCA step can reduce the data matrix dimension and compress the data points into interpretable variables. Bilal et al. [49] used principal component scores as input variables and applied PCA and linear discriminant analysis (LDA) models to quantify peanuts.
3.4. PLS Model
The chemical values of fat, protein, and moisture content of peanut protein powder were correlated with the near-infrared spectral values using the Unscrambler X 10.4 software, which was used to screen the optimal spectral preprocessing method, select the best number of principal components, and construct the PLS model. All data were divided in a 3:1 ratio into a calibration set (95 samples) and a prediction set (31 samples). The best model was identified based on the correlation coefficient in prediction (Rcp) and the standard error of prediction (SEP), with a high correlation coefficient and a low error indicating high accuracy and stability. As shown in Table 4, not every preprocessing method improved the model performance. As shown in Figure 6, the optimal spectral pretreatment methods for the PLS models of fat, protein, and moisture content of peanut protein powder were first derivative (FD), second derivative (SD), and SD, respectively. Cross-validation was repeated to eliminate outliers. The improvement was attributed to the enhancement of the absorption peaks and the reduction of the baseline offset following derivative processing. These adjustments made the model more sensitive to changes in chemical composition and enhanced its performance. The PLS method has been widely used in the analysis of peanuts and peanut products. Song et al. [50] employed partial least squares discriminant analysis (PLS-DA) to create models that could distinguish between uncontaminated and aflatoxin-contaminated peanut oil. Another study [51] focused on the nutritional significance of peanuts, specifically examining the free amino acid (FAA) and crude protein (CP) content in raw peanut seeds.
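The effect of derivative pretreatment on the baseline can be illustrated with a short sketch. It assumes a plain finite-difference derivative over equally spaced wavelengths; the text does not state which derivative algorithm was configured in the software (Savitzky-Golay derivatives are a common alternative), and the absorbance values below are invented for illustration.

```python
def first_derivative(spectrum, step=1.0):
    """Forward finite difference; cancels any constant baseline offset."""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]


def second_derivative(spectrum, step=1.0):
    """Difference of differences; also cancels a linear baseline slope."""
    return first_derivative(first_derivative(spectrum, step), step)


# Hypothetical absorbance values at equally spaced wavelengths (not measured data).
raw = [0.10, 0.12, 0.18, 0.30, 0.22, 0.15, 0.13]
shifted = [a + 0.05 for a in raw]                           # constant offset
tilted = [a + 0.05 + 0.01 * i for i, a in enumerate(raw)]   # offset plus slope

fd_raw, fd_shifted = first_derivative(raw), first_derivative(shifted)
sd_raw, sd_tilted = second_derivative(raw), second_derivative(tilted)

# Largest difference between derivatives of the clean and the distorted spectra.
print(max(abs(x - y) for x, y in zip(fd_raw, fd_shifted)))
print(max(abs(x - y) for x, y in zip(sd_raw, sd_tilted)))
```

Both printed differences are at the level of floating-point rounding: the first derivative is unchanged by a constant offset, and the second derivative is unchanged by an offset plus a linear slope. The price is that differencing amplifies noise, which is one plausible reason why not every pretreatment improves a model.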
Models were established to predict the fat, protein, and moisture content of peanut protein powder. As can be seen in Table 5, the model for fat content achieved a calibration correlation coefficient (Rcal) of 0.9750 and a standard error of prediction (SEP) of 0.0129. The protein model had an Rcal of 0.9771 and an SEP of 0.0148, while the moisture model had an Rcal of 0.9428 and an SEP of 0.0038. The accuracy of these models was evaluated using the residual prediction deviation (RPD). An RPD greater than 1.4 indicates that a model can provide reasonable prediction results, while an RPD greater than 2 suggests a good prediction effect. The RPD values of the three PLS models of peanut protein powder were all greater than 2, indicating that the constructed models were highly reliable. Compared to the chemical methods, the error in the three indicators was smaller, the prediction was faster, and the model was more robust, avoiding random errors caused by large data fluctuations. Ingle et al. [52] established a near-infrared protein powder detection model with a correlation coefficient of 0.986. Its training set covered protein contents of 20% to 90% but consisted of only 17 samples, and the model was validated only on protein powders with 85% to 88% protein content. That sample size was far smaller than the more than 100 samples used in this study, and the error reached 2%. In contrast, the error of the protein detection model in this study was 26% lower. These findings revealed the potential of using near-infrared spectroscopy in combination with chemometric algorithms for the rapid and accurate prediction of the chemical composition of peanut protein powder.
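The evaluation metrics named above can be sketched in a few lines. This sketch assumes the common chemometric definitions (SEP as the bias-corrected standard deviation of the prediction residuals, RPD as the standard deviation of the reference values divided by SEP); the section does not spell out its formulas, so these are assumptions, and the reference and predicted values are invented. The last function reproduces the 26% figure under the further assumption that it is the relative reduction from the 2% error of the earlier model to the PLS protein SEP of 0.0148.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _sample_sd(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(reference, predicted):
    """Correlation coefficient between reference and predicted values."""
    mr, mp = _mean(reference), _mean(predicted)
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    return cov / math.sqrt(sum((r - mr) ** 2 for r in reference)
                           * sum((p - mp) ** 2 for p in predicted))


def sep(reference, predicted):
    """Bias-corrected standard error of prediction (assumed definition)."""
    return _sample_sd([p - r for r, p in zip(reference, predicted)])


def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values over SEP."""
    return _sample_sd(reference) / sep(reference, predicted)


def relative_reduction(old_error, new_error):
    """Fractional decrease from old_error to new_error."""
    return (old_error - new_error) / old_error


# Hypothetical protein fractions for six prediction samples (not study data).
reference = [0.42, 0.45, 0.50, 0.55, 0.58, 0.39]
predicted = [0.43, 0.44, 0.51, 0.54, 0.59, 0.40]

print(pearson_r(reference, predicted), sep(reference, predicted),
      rpd(reference, predicted))
print(relative_reduction(0.02, 0.0148))  # 2% error versus an SEP of 0.0148
```

Because RPD scales the prediction error by the spread of the reference values, a wide sample range such as the one collected here raises RPD for a given SEP, so RPD values are best compared between models evaluated on similar sample sets.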
3.5. Generalized Regression Neural Network (GRNN) Model
The chemical values of fat, protein, and moisture content in peanut protein powder were correlated with the near-infrared spectral values, and the model was established using the MATLAB R2021b software. Each spectrum of peanut protein powder consisted of 125 wavelengths. However, the spectral information and the sensitivity of the near-infrared spectrometer were affected by the operating environment of the equipment and by differences between operators. The spectra also contained redundant information, which led to low computational accuracy and poor model stability. Therefore, feature extraction from the original near-infrared spectra was extremely important, both to reduce the model calculation time and to improve the stability of the model.
Prior to the neural network analysis, PCA was applied to the original data and the first three principal components were extracted. Their cumulative variance contribution rate was 99.85%, meaning that these three principal components reflected 99.85% of the spectral information contained in the original spectra. Of the 126 samples of peanut protein powder, 26 were randomly selected as the test set and the remaining 100 formed the training set. The principal component scores after feature extraction were used as the input variables of the model. This approach significantly reduced the training time and the size of the network. In the GRNN model, the smoothing factor determines both the error of the training set and the shape of the hidden layer basis function, which directly affects the accuracy of the model prediction. The smoothing factor was determined by repeatedly comparing the output results obtained with different candidate values. The fat (%), protein (%), and moisture content (%) of peanut protein powder were each used as the network output to construct separate GRNN models. The optimal smoothing parameters were found to be 0.02, 0.015, and 0.02, respectively.
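The role of the smoothing factor can be made concrete with a minimal sketch of a GRNN in its standard form, where the prediction is a Gaussian-weighted average of the training targets. The function names, the hold-out selection routine, and the toy principal component scores below are assumptions for illustration; this is not the MATLAB implementation used in the study.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Gaussian-kernel weighted average of training targets (standard GRNN form)."""
    exponents = [-sum((a - b) ** 2 for a, b in zip(x, query)) / (2.0 * sigma ** 2)
                 for x in train_x]
    top = max(exponents)                      # shift to avoid all weights underflowing
    weights = [math.exp(e - top) for e in exponents]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


def choose_sigma(train_x, train_y, val_x, val_y, candidates):
    """Return the candidate smoothing factor with the lowest hold-out RMSE."""
    def rmse(sigma):
        errors = [(grnn_predict(train_x, train_y, x, sigma) - y) ** 2
                  for x, y in zip(val_x, val_y)]
        return math.sqrt(sum(errors) / len(errors))
    return min(candidates, key=rmse)


# Toy inputs standing in for three principal component scores (hypothetical).
train_x = [[i / 10.0, 0.0, 0.0] for i in range(11)]
train_y = [2.0 * x[0] for x in train_x]       # made-up target values
val_x = [[0.25, 0.0, 0.0], [0.65, 0.0, 0.0]]
val_y = [0.5, 1.3]

best = choose_sigma(train_x, train_y, val_x, val_y, [0.015, 0.02, 0.05, 0.5])
print(best, grnn_predict(train_x, train_y, [0.25, 0.0, 0.0], best))
```

A very small smoothing factor makes the network reproduce the nearest training sample, while a very large one pulls every prediction towards the mean of the training targets. Comparing candidate values on held-out samples, as sketched here, is one way to find a value between those extremes.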
The GRNN models of fat, protein, and moisture content predicted the chemical values well, as shown in Figure 7. The correlation coefficients of the training set were 0.9952, 0.9904, and 0.9896 for fat, protein, and moisture, respectively. The corresponding errors were 0.0022%, 0.0219%, and 0.0221%, and the RPD values were 10.82, 10.03, and 8.41, respectively (Table 6). A model with an RPD greater than 10 can be applied to real-time process control and optimization. The prediction results of the fat, protein, and moisture models established by GRNN were better than those of the PLS models: the correlation coefficients improved and the errors were substantially reduced. The correlation coefficients of the prediction exceeded 0.98, and the fat model had the lowest error of 0.0022%, which may be the result of the lower fat content of the samples in the validation set. Compared with the chemical measurement methods, both the PLS and GRNN models showed smaller errors for the three indicators, were more robust, and were less affected by large data fluctuations due to random error. The errors in a GRNN detection model for soybean meal powder were 0.3% for protein and 0.2% for moisture content [53]. These results show that the GRNN model provides a good foundation for predicting the content of each component of the powder, which supports its application to the quality detection of peanut protein powder. This study demonstrated the potential of using near-infrared spectroscopy in combination with neural network models for the rapid and accurate prediction of the chemical composition of peanut protein powder.
4. Conclusions
The integration of near-infrared spectroscopy with PLS and GRNN provided a rapid and efficient method for detecting the fat, protein, and moisture content of peanut protein powder. This approach enables real-time process monitoring of peanut protein powder and offers a fast detection method that helps ensure high-quality peanut protein powder raw materials for the food industry. Near-infrared spectroscopy improved the speed and accuracy of detection and can contribute to the optimization of the production process. In the future, the near-infrared detection system can be extended to peanut protein powder production lines and processing systems. This would enable continuous monitoring and quality control during production, thereby contributing to the advancement of the peanut industry. By ensuring the consistent quality of peanut protein powder, this method can help manufacturers meet food safety standards and consumer expectations. Furthermore, the real-time data provided by such a system can inform decision-making, enabling timely adjustments to the production process as needed. It can improve efficiency, reduce waste, and enhance product quality, all of which are critical for the sustainable growth of the peanut industry.
Future work can proceed in two directions. First, the sample set can be expanded to include almond powder, soybean protein powder, and other protein powders commonly used in daily production and life. This expansion will enable the exploration of differences among various protein powders, leading to a more universal protein powder monitoring model that can be applied to a wider array of research topics. Second, it is essential to investigate detection indicators beyond the traditional evaluation based on fat, protein, and moisture content. In recent years, ash content, digestibility, solubility, and other related indicators have gained increasing importance. By incorporating these additional indicators into the analysis, researchers can obtain a more comprehensive understanding of protein powder quality. This multifaceted approach will enhance both the depth and the scope of research results.