Article

Non-Destructive Detection of Tea Polyphenols in Fu Brick Tea Based on Hyperspectral Imaging and Improved PKO-SVR Method

College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
* Author to whom correspondence should be addressed.
Agriculture 2024, 14(10), 1701; https://doi.org/10.3390/agriculture14101701
Submission received: 25 August 2024 / Revised: 16 September 2024 / Accepted: 26 September 2024 / Published: 28 September 2024
(This article belongs to the Section Digital Agriculture)

Abstract

Tea polyphenols (TPs) are a critical indicator for evaluating the quality of tea leaves and are esteemed for their beneficial effects. The non-destructive detection of this component is essential for enhancing precise control in tea production and improving product quality. This study developed an enhanced PKO-SVR (support vector regression based on the Pied Kingfisher Optimization Algorithm) model for rapidly and accurately detecting tea polyphenol content in Fu brick tea using hyperspectral reflectance data. During this experiment, chemical analysis determined the tea polyphenol content, while hyperspectral imaging captured the spectral data. Data preprocessing techniques were applied to reduce noise interference and improve the prediction model. Additionally, several other models, including K-nearest neighbor (KNN) regression, neural network regression (BP), support vector regression based on the sparrow search algorithm (SSA-SVR), and support vector regression based on particle swarm optimization (PSO-SVR), were established for comparison. The experimental results demonstrated that the improved PKO-SVR model excelled in predicting the polyphenol content of Fu brick tea (R2 = 0.9152, RMSE = 0.5876, RPD = 3.4345 for the test set) and also exhibited a faster convergence rate. Therefore, the hyperspectral data combined with the PKO-SVR algorithm presented in this study proved effective for evaluating Fu brick tea’s polyphenol content.

1. Introduction

Tea has been one of the world’s most popular beverages for almost five thousand years [1]. It plays an essential role in nutrition and medicine in China and is also widely consumed worldwide [2]. According to different processing techniques, Chinese tea can be divided into six categories: green tea, white tea, yellow tea, oolong tea, black tea, and dark tea [3]. Among these, dark tea is a post-fermented tea with distinctive organoleptic characteristics. Its production involves a unique process of fermentation in piles, which requires the participation of microorganisms [4]. Fu brick tea is a representative of dark tea known for its distinctive ‘mushroom flower’ characteristic [5] and is mainly produced in Hunan, Shaanxi, Zhejiang, Sichuan, and Guizhou provinces [6]. This characteristic results from the formation of Eurotium cristatum during fermentation, which gives Fu brick tea its unique flavor and significantly affects its active ingredients [7]. Fu brick tea is rich in polysaccharides, tea polyphenols (TPs), catechins, amino acids (AAs), and alkaloids, which have various health benefits [8]. TPs, as the main active constituents, show significant antioxidant, anti-inflammatory [9], and antimicrobial [10] effects, as well as anticancer [11], anti-obesity [12], and neuroprotective [13] effects and potentially COVID-19 preventive capacity [14]. These findings highlight the critical value of TPs in health promotion and disease prevention. Therefore, consumers may prefer commercial tea beverages rich in TPs. Consequently, it is essential to explore assay methods for the quality composition of Fu brick tea to understand its unique fermentation process, strictly control the production quality, and develop applications in health beverages, functional foods, and medicinal uses.
In previous studies, the measurement of TPs was mainly based on traditional chemical analysis methods such as high-performance liquid chromatography (HPLC) [15], gas chromatography–mass spectrometry (GC-MS) [16], and liquid mass spectrometry (LMS) [17]. Although these methods can accurately detect tea components, they are complex, time-consuming, and require destructive sampling, making it challenging to meet modern production requirements. For this reason, researchers have developed e-nose, e-tongue, and e-eye techniques for rapid tea quality assessment and chemical analysis [18]. However, these techniques have limited applications due to their complex design and environmental sensitivity [19]. In recent years, spectral analysis techniques have been widely used in tea quality assessment [20]. Wang Y accurately detected TPs and AAs using a miniature near-infrared spectrometer [21]. At the same time, Ye S accurately predicted tea polyphenol and Epigallocatechin Gallate (EGCG) content by combining Fourier transform near-infrared spectroscopy with machine learning [22]. However, due to the low resolution and lack of spatial information in near-infrared spectroscopy (NIR), it is not easy to obtain comprehensive and accurate data when measuring samples with inhomogeneous appearance [23,24].
Hyperspectral imaging is a new non-destructive testing technique that combines traditional imaging with spectroscopy. It simultaneously acquires spatial and spectral information from samples, providing high resolution, a wide spectral range, and continuous wavelength bands. This technique has been widely used in tea processing [25], classification, and component content detection [26]. Luo X successfully combined hyperspectral imaging with chemometric methods and applied them to the non-destructive detection of tea polyphenol content in Tibetan tea, which verified its effectiveness in predicting tea composition [27]. Mao Y developed a model based on hyperspectral imaging to monitor the biochemical composition of black tea, which could quantify TPs, free AAs, and caffeine and determine the degree of withering and fermentation [28]. In addition, Tang Y successfully captured the spectral characteristics of green tea using hyperspectral imaging to predict its composition and quality grade [29]. Selecting the most informative bands is critical to improving model prediction performance when applying hyperspectral techniques [30]. Commonly used feature selection methods include the successive projection algorithm (SPA) [31] and competitive adaptive reweighted sampling (CARS) [32], which improve model simplicity and computational efficiency by reducing the number of input variables. However, these methods are usually based on analyzing the average spectrum of all pixels in a hyperspectral image, ignoring crucial spatial information. In addition, the limitations of linear feature selection make it challenging to capture complex nonlinear relationships in data [33]. Deep learning models have recently shown great potential in hyperspectral feature extraction and analysis due to their powerful data representation and nonlinear mapping capabilities. Models such as convolutional neural networks (CNNs) and autoencoders can effectively extract spectral and spatial features from hyperspectral images to achieve deep data analysis [34]. Luo N extracted spectral–spatial depth features using CNNs, which significantly improved the accuracy of predicting the polyphenol content of green tea [35]. Xu M extracted deep spectral features from a hyperspectral image of grapes using a stacked autoencoder (SAE) to achieve soluble solid content prediction [36]. Some researchers also extracted features for the fast and non-destructive prediction of total soluble solids and titratable acidity [37]. These methods capture nonlinear relationships in data and retain more helpful information, improving the accuracy of subsequent classification, regression, and other tasks. Therefore, subsequent researchers are exploring the application of deep learning to hyperspectral data characterization to advance the field.
Support vector regression (SVR), known for its powerful nonlinear regression capabilities, is widely used in quantitative hyperspectral data analysis. By constructing an optimal hyperplane in high-dimensional space, SVR exhibits excellent generalization ability, which makes it particularly suitable for nonlinear regression problems in agriculture and food testing. However, the performance of the SVR model is highly dependent on parameter selection, and parameter optimization can effectively control overfitting and improve the model’s generalization ability. Traditional parameter optimization methods, such as grid search and random search, can be inefficient and computationally expensive [38]. In recent years, metaheuristic algorithms, especially population-based intelligent optimization algorithms, have gained attention for their outstanding performance in solving complex optimization problems. At present, particle swarm optimization (PSO) and the sparrow search algorithm (SSA) have been successfully applied to the parameter optimization of SVR models, achieving significant results in the prediction of components of tea leaves and soybeans [39,40]. Although PSO and the SSA perform well in optimizing SVR parameters, they have some limitations. For example, particle swarm optimization–support vector regression (PSO-SVR) and sparrow search algorithm–support vector regression (SSA-SVR) often exhibit slow convergence when dealing with high-dimensional datasets. They are prone to becoming stuck in local optima for some complex issues. In addition, although these algorithms have solid global search capabilities, they may require longer computation times in practical applications [41]. According to the “No Free Lunch Theorem” (NFL), no single algorithm can be universally effective for all problems [42]. In 2024, a new efficient population-based intelligent optimization algorithm, the Pied Kingfisher Optimizer (PKO), was proposed by Bouaouda A., demonstrating superior convergence speed and accuracy in addressing complex optimization problems [43]. However, although the PKO improves convergence speed, it still faces challenges in escaping local optima when dealing with complex nonlinear regression problems. Therefore, this paper proposes an improved Pied Kingfisher Optimizer–support vector regression (PKO-SVR) that combines the Elite Evolution Strategy (EES) [44] to further improve the global search capability and convergence speed of the PKO. Compared with PSO-SVR and SSA-SVR, the enhanced PKO-SVR not only converges in fewer iterations but also avoids local optima more effectively, which significantly improves the prediction accuracy and optimization efficiency of the SVR model. The advantage of the enhanced PKO-SVR lies in the incorporation of the EES, which allows the algorithm to reduce optimization time and improve convergence speed without sacrificing global search ability.
There are few studies on Fu brick tea, especially in composition detection. Therefore, this work aims to develop an efficient and accurate composition detection model for Fu brick tea by combining hyperspectral techniques and optimization algorithms. Specifically, this study uses an SAE to extract the spectral features of Fu brick tea and constructs an Elite Evolutionary Strategy–Pied Kingfisher Optimizer–support vector regression (EESPKO-SVR) model to accurately predict the tea polyphenol content in Fu brick tea. The specific research objectives are as follows:
1. Propose a variable sorting normalization (VSN)–SAE algorithm to effectively extract key spectral features in Fu brick tea samples, improving the data processing accuracy and efficiency of the non-destructive prediction of TPs.
2. Use the PKO algorithm to improve the accuracy of the SVR model in predicting TPs, and further enhance model performance by integrating the EES. An independent AA dataset is used to evaluate its performance across different chemical compositions, such as AAs and TPs, to validate the model’s robustness and generalization ability.
3. Evaluate the model’s potential and value in practical applications, exploring its feasibility and prospects in non-destructive testing.

2. Materials and Methods

2.1. Sample Collection

A total of 337 Fu brick tea samples were collected from representative tea producers in different regions of China, including Zhejiang Province, Jingyang County in Shaanxi Province, and Anhua County in Hunan Province, with the sampling areas shown in Figure 1, to ensure the breadth and representativeness of the study results. These samples covered production years from 2018 to 2022, capturing the quality variation in Fu brick tea over different periods. Each sample was evenly divided into five portions, resulting in 1685 subsamples, which helped increase the reliability of the experimental results. After rigorous screening and processing, 1548 subsamples were retained for further analysis. All samples were sealed and packaged immediately after collection to minimize external influences and stored in a cool, dry environment to prevent moisture and oxidation, ensuring the stability and reliability of the samples in subsequent studies.
TPs were extracted using the GB/T 8313-2018 standard method [44]. First, the Fu brick tea samples were ground to a fine powder, then placed in a 70% aqueous methanol solution and extracted in a water bath at 70 °C to facilitate the effective dissolution of polyphenols. The extract was then centrifuged at 3500 rpm to obtain a clarified supernatant. The clarified supernatant was then mixed with Folin–Ciocalteu reagent, which, through a color reaction, produced a blue compound that allowed for the quantitative determination of TPs. Finally, the absorbance of the reaction solution was measured at a wavelength of 765 nm using a spectrophotometer, and the tea polyphenol content was calculated by comparing the absorbance with that of a standard gallic acid solution. To ensure the reliability of the results, the relative error between two measurements of the same sample should not exceed 5%. If the error is within this range, the final result is the mean of the two measurements, rounded to one decimal place.
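For clarity, the duplicate-measurement acceptance rule can be expressed as a short script. The following Python sketch is illustrative only; the helper name and the use of the mean of the two replicates as the denominator of the relative error are our assumptions, while the 5% threshold and the one-decimal rounding follow the procedure described above.

```python
def accept_tp_measurement(tp1, tp2):
    """Apply the duplicate-measurement acceptance rule for TP content.

    Hypothetical helper: the relative error is taken against the mean of the
    two replicates; the 5% threshold and one-decimal rounding follow the text.
    """
    mean = (tp1 + tp2) / 2
    if abs(tp1 - tp2) / mean > 0.05:
        raise ValueError("Relative error exceeds 5%; the sample should be remeasured.")
    return round(mean, 1)
```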

2.2. Hyperspectral Data Acquisition and Calibration

Hyperspectral data acquisition was performed using the SOC710VP portable hyperspectral imager (SOC, Bellevue, WA, USA). This system consists of the SOC710VP hyperspectral camera, two 150 W halogen lamps, a laptop computer, and a sample stage, as shown in Figure 2. The camera acquires spectral data in the 376.9–1050.16 nm range, covering the visible and near-infrared regions. According to the band range of the hyperspectral camera, useless bands at both ends were screened and removed. Finally, the spectral data in the 400–1000 nm range were selected, and 123 valid bands were collected. This system has a spectral resolution of 4.09 nm, a dynamic range of 12 bits, an image resolution of 520 × 696 pixels, a scanning speed of 30 lines per second, and an acquisition time of 23.2 s for a complete data cube. The light sources were located on either side of the sample stage at a height of 50 cm. Before measurement, the instrument was warmed up for 20 min to stabilize the light sources and calibrated using a standard whiteboard to ensure the accuracy of the data. The scanning software adjusted the focal length, and the exposure distance was 370 mm to avoid image distortion or oversaturation.
Fu brick tea samples were evenly distributed in test discs and arranged on the sample stage, with each group containing 12 samples. Each sample was measured three times independently, and the average value was taken to minimize errors. After spectral data acquisition, the data were immediately stored in a designated computer folder and calibrated using SRAnal710 software to ensure accuracy and reliability for subsequent analyses. As shown in Figure 3, the acquired hyperspectral images underwent lens calibration and reflectance correction using SRAnal710 software. The calibrated images were then imported into ENVI 5.3 software to extract spectral data. For each set of samples, a central rectangular area of 2500 pixels (50 × 50) with uniform lighting was selected as the region of interest (ROI), and its average value was calculated as the raw data for further analysis.
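As an illustration of the ROI averaging step, the following Python sketch computes the mean spectrum of a centred 50 × 50 pixel window from a calibrated hypercube. The function name and the (rows, columns, bands) array layout are assumptions; the actual extraction in this study was performed in ENVI 5.3.

```python
import numpy as np

def mean_roi_spectrum(cube, roi_size=50):
    """Average the reflectance over a centred roi_size x roi_size window.

    cube: calibrated hypercube as a NumPy array of shape (rows, cols, bands),
    e.g. exported from ENVI. The 50 x 50 window matches the 2500-pixel ROI
    described above; the array layout is an assumption.
    """
    rows, cols, _ = cube.shape
    r0 = rows // 2 - roi_size // 2
    c0 = cols // 2 - roi_size // 2
    roi = cube[r0:r0 + roi_size, c0:c0 + roi_size, :]
    return roi.reshape(-1, cube.shape[2]).mean(axis=0)  # one mean spectrum per sample
```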

2.3. Data Preprocessing

Instrumental and environmental factors often affect spectral data acquisition, leading to scatter effects and noise. These can attenuate biochemical spectral signals and affect the construction of regression models [45]. Four preprocessing methods, multiplicative scatter correction (MSC), standard normal variate (SNV), a combination of Savitzky–Golay smoothing and multiplicative scatter correction (SG + MSC), and variable sorting normalization (VSN), were used in this study to optimize the quality of the spectral data. These methods help to reduce scatter and noise, eliminate spectral shifts, and improve data reliability. In addition, by optimizing the weighting of spectral variables, they further enhance the predictive performance and robustness of the models [46,47,48,49]. After noise reduction, extracting convincing features from the raw spectral data is crucial for simplifying the model and improving accuracy [50]. This study uses CARS, the SAE, and Locally Linear Embedding (LLE) to reduce input variables, improve model efficiency, and increase predictive power. CARS uses Monte Carlo sampling and Partial Least Squares Regression (PLSR) to select the most relevant wavelengths, reducing redundancy and optimizing model performance [51]. The SAE uses multiple autoencoders to learn low-dimensional representations, mapping input data to hidden layer features and then reconstructing the original data [52]. LLE represents each data point as a linear combination of its neighbors, preserving local geometry and embedding high-dimensional data in a lower-dimensional space [53].
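For reference, two of the four scatter-correction methods (SNV and MSC) can be summarized by the following Python sketch. The implementation details, such as correcting MSC against the mean spectrum, are standard choices and are not taken from the original processing pipeline.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against the mean spectrum."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # fit s = slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected
```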

2.4. Model Construction and Evaluation

The model’s performance is crucial for predicting polyphenol content in tea. In this study, three machine learning methods, namely SVR, Back-Propagation (BP), and K-nearest neighbors (KNN), were used to develop regression models relating the spectral data of Fu brick tea samples to their polyphenol content. In addition, the PSO, SSA, PKO, and improved EESPKO algorithms were used to optimize the parameters of the SVR model for improved performance.
In the model evaluation process, the coefficient of determination (R²), root mean square error (RMSE), relative analysis error (RPD), and mean absolute error (MAE) are the key indicators. These assessment metrics are calculated with the following formulas:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$RPD = \frac{SD}{RMSE}$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

where y_i and ŷ_i refer to the actual measured value and the predicted value of the i-th sample in the calibration or prediction set, respectively; ȳ represents the average of the reference values in the calibration or prediction set; SD indicates the standard deviation of the reference values in the prediction set; and n denotes the number of samples in the calibration or prediction set.
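These metrics can be computed directly from the predictions, as in the following Python sketch (an illustrative helper, not part of the original analysis code).

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE and RPD as defined above."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    rpd = np.std(y_true, ddof=1) / rmse   # SD of the reference values over RMSE
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "RPD": rpd}
```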

2.5. Pied Kingfisher Optimizer and Its Improved Strategy

2.5.1. PKO

The Pied Kingfisher Optimizer is a swarm intelligence optimization algorithm inspired by pied kingfishers’ foraging and hunting behavior. This algorithm translates these natural behaviors into mathematical models to perform a global search and local optimization within the search space of complex problems. The PKO algorithm is divided into three stages: perching or hovering to locate prey, diving to capture prey, and establishing symbiotic relationships. The perching and hovering behaviors ensure diversity during extensive area searches; the diving behavior allows for the precise capture of optimal solutions; and the symbiotic behavior further enhances cooperation and competition between individuals. These stages enable the algorithm to maintain high search efficiency in complex environments.
Like many other population-based methods, the PKO starts the search process by generating a random set of initial solutions from the search space as the first attempt. The initial population is generated by the following formula:

$$X_{i,j} = LB + (UB - LB) \cdot rand, \quad i = 1, 2, \ldots, N \ \text{and} \ j = 1, 2, \ldots, Dim$$

where i represents the individual index, j represents the dimension, and rand is a random number in the range [0, 1] used to generate the initial population. X_{i,j} denotes the position of the i-th individual in the j-th dimension, while UB and LB represent the upper and lower bounds of the search space, respectively.
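This initialization corresponds to a few lines of code; the sketch below is a minimal NumPy illustration (the variable names are ours).

```python
import numpy as np

def initialize_population(n_agents, dim, lb, ub, rng=None):
    """Random initial population X[i, j] = LB + (UB - LB) * rand, as above."""
    rng = np.random.default_rng() if rng is None else rng
    return lb + (ub - lb) * rng.random((n_agents, dim))
```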
After generating the initial population, a fitness function evaluates each individual’s fitness to measure their ability to solve the problem. The individual with the highest fitness value is then selected to create the next generation.
The PKO algorithm’s exploration phase mimics pied kingfishers’ perching and hovering behavior. In nature, kingfishers adjust their positions based on environmental factors to optimize hunting efficiency. Similarly, in the PKO algorithm, the search agents update their positions by simulating this behavior, ensuring effective exploration of the solution space during the exploration phase. The positions of the search agents are updated according to the following formulas:

$$X_i(t+1) = X_i(t) + \alpha \cdot T \cdot \left(X_j(t) - X_i(t)\right), \quad i, j = 1, 2, \ldots, N \ \text{and} \ j \neq i$$

$$\alpha = 2 \cdot randn(1, Dim) - 1$$

where X_i(t+1) represents the solution of the next generation, while X_i(t) represents the solution of the current generation. The parameter α is calculated from random numbers drawn from a normal distribution, which introduces a certain degree of randomness and diversity.
The dimension Dim represents the problem’s dimensionality, indicating the complexity of the solution space. The parameter T plays a crucial role in the PKO algorithm, and its value is dynamically adjusted based on the current strategy (either ‘perching’ or ‘hovering’); each strategy has a different method for calculating T to ensure optimal performance under different operational modes. Pied kingfishers typically perch on trees, rocks, or artificial structures, scanning the water surface widely for prey, as shown in Figure 4a. This behavior aids in identifying potential prey. The mathematical model is as follows:
$$T = \left(\exp(1) - \exp\left(\left(\frac{t-1}{Max\_Iter}\right)^{\frac{1}{BF}}\right)\right)\cos\left(Crest\_angles\right)$$

$$Crest\_angles = 2\pi \cdot rand$$

where Max_Iter specifies the maximum number of iterations, BF (the beating factor) is a constant value set to 8, and rand is a random value between 0 and 1.
Pied kingfishers’ hovering behavior is more agile than perching. By rapidly flapping their wings, they can hover in the air and scan a larger area to locate potential prey, as shown in Figure 4b.
The mathematical model for its hovering strategy is as follows:
$$T = beating\_rate \cdot \frac{t^{\frac{1}{BF}}}{Max\_Iter^{\frac{1}{BF}}}$$

$$beating\_rate = rand \cdot \frac{PKO\_Fitness(j)}{PKO\_Fitness(i)}$$

where beating_rate represents the beating frequency, BF denotes the beating factor, typically set to a constant value of 8, and PKO_Fitness(i) and PKO_Fitness(j) represent the fitness values of the i-th and j-th individuals, respectively.
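The exploration phase described above can be sketched as follows. This is a simplified NumPy illustration based on the reconstructed formulas, with a small constant added to the denominator to avoid division by zero; it is not the authors' implementation.

```python
import numpy as np

def exploration_step(X, fitness, t, max_iter, strategy, bf=8.0, rng=None):
    """One PKO exploration update for all agents (sketch of the formulas above).

    X: (N, Dim) positions; fitness: (N,) fitness values.
    strategy: 'perching' or 'hovering', selecting how T is computed.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, dim = X.shape
    X_new = X.copy()
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])      # random partner j != i
        alpha = 2.0 * rng.standard_normal(dim) - 1.0          # alpha = 2 * randn - 1
        if strategy == "perching":
            crest_angle = 2.0 * np.pi * rng.random()
            T = (np.e - np.exp(((t - 1) / max_iter) ** (1.0 / bf))) * np.cos(crest_angle)
        else:  # hovering
            beating_rate = rng.random() * fitness[j] / (fitness[i] + 1e-12)
            T = beating_rate * t ** (1.0 / bf) / max_iter ** (1.0 / bf)
        X_new[i] = X[i] + alpha * T * (X[j] - X[i])
    return X_new
```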
During the exploitation phase, pied kingfishers dive quickly from a high perch, such as a branch or rock, to catch prey. This process requires them to judge the position of the prey and act quickly and accurately. The PKO algorithm mimics this diving behavior in the exploitation phase, allowing individuals to approach the optimal solution, thereby improving optimization performance promptly. The mathematical model for this behavior is as follows:
$$X_i(t+1) = X_i(t) + HA \cdot o \cdot \left(\alpha \cdot b - X_{best}(t)\right)$$

$$HA = rand \cdot \frac{PKO\_Fitness(i)}{Best\_Fitness}$$

$$o = \exp\left(-\left(\frac{t}{Max\_Iter}\right)^2\right)$$

$$b = X_i(t) + o^2 \cdot randn \cdot X_{best}(t)$$

where HA denotes the individual’s hunting ability, and Best_Fitness represents the best fitness value obtained across all current iterations. The parameter o is a scaling factor that controls the search range of the individual, gradually shrinking as the number of iterations t increases. The parameter b is an intermediate variable that balances the search process, and X_best(t) denotes the current location of the optimal solution.
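A corresponding sketch of the diving (exploitation) update is given below, again following the reconstructed formulas and intended only as an illustration.

```python
import numpy as np

def dive_step(X, fitness, x_best, best_fitness, t, max_iter, rng=None):
    """One PKO diving (exploitation) update, following the formulas above."""
    rng = np.random.default_rng() if rng is None else rng
    n, dim = X.shape
    o = np.exp(-(t / max_iter) ** 2)                          # shrinking scale factor
    X_new = X.copy()
    for i in range(n):
        ha = rng.random() * fitness[i] / (best_fitness + 1e-12)   # hunting ability
        alpha = 2.0 * rng.standard_normal(dim) - 1.0
        b = X[i] + o ** 2 * rng.standard_normal(dim) * x_best
        X_new[i] = X[i] + ha * o * (alpha * b - x_best)
    return X_new
```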
In the symbiotic stage, the hunting efficiency of pied kingfishers can be affected by various factors, such as individual hunting efficiency and environmental conditions. Typically, pied kingfishers have a symbiotic relationship with otters. This symbiotic relationship means that both species benefit from each other without causing harm or loss to each other. This behavior can be mathematically represented as follows:
$$X_i(t+1) = \begin{cases} X_m(t) + o \cdot \alpha \cdot \left|X_i(t) - X_n(t)\right| & \text{if } rand > (1 - PE) \\ X_i(t) & \text{otherwise} \end{cases}$$

$$PE = PE_{max} - \left(PE_{max} - PE_{min}\right)\frac{t}{Max\_Iter}$$

where X_m(t) and X_n(t) represent two individuals randomly selected from the population, and PE denotes the hunting efficiency of the pied kingfisher. The constant values PE_max and PE_min are set to 0.5 and 0, respectively.
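The symbiotic-stage update can likewise be sketched as follows; the choice of the two random partners and the reuse of the scaling factor o are illustrative assumptions.

```python
import numpy as np

def symbiotic_step(X, t, max_iter, pe_max=0.5, pe_min=0.0, rng=None):
    """PKO symbiotic-stage update: each agent is perturbed with probability PE."""
    rng = np.random.default_rng() if rng is None else rng
    n, dim = X.shape
    pe = pe_max - (pe_max - pe_min) * t / max_iter            # hunting efficiency schedule
    o = np.exp(-(t / max_iter) ** 2)
    X_new = X.copy()
    for i in range(n):
        if rng.random() > (1.0 - pe):
            m, k = rng.choice(n, size=2, replace=False)       # two random partners
            alpha = 2.0 * rng.standard_normal(dim) - 1.0
            X_new[i] = X[m] + o * alpha * np.abs(X[i] - X[k])
    return X_new
```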
The PKO algorithm was employed to optimize the parameters of an SVR model, explicitly targeting the complex regression task of hyperspectral tea composition prediction. The optimization process begins with generating a randomly created population of potential solutions. Throughout several iterations, the PKO algorithm simulates the predatory behavior and symbiotic relationships of the pied kingfisher, gradually adjusting the positions of individuals within the search space to fine-tune key SVR parameters, including the penalty factor C, kernel function parameter g, and tolerance ε. This iterative optimization strategy not only effectively enhances the prediction accuracy and generalization capability of the SVR model but also prevents the model from becoming trapped in local optima, guiding it towards a global optimum. The PKO algorithm demonstrates solid potential and broad applicability in solving complex optimization problems, particularly in fields requiring precise regression modeling, such as hyperspectral data analysis.
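In this setting, the fitness of each candidate solution is the prediction error of an SVR trained with the encoded parameters. A minimal scikit-learn sketch is shown below; the use of five-fold cross-validation is our assumption, as the paper only specifies the MSE as the objective.

```python
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_predict

def svr_fitness(params, X_train, y_train):
    """Fitness of one candidate (C, g, epsilon): cross-validated MSE (lower is better)."""
    c, g, eps = params
    model = SVR(kernel="rbf", C=c, gamma=g, epsilon=eps)
    y_hat = cross_val_predict(model, X_train, y_train, cv=5)
    return mean_squared_error(y_train, y_hat)
```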

2.5.2. EES

The Elite Evolutionary Strategy (EES), developed by Chen Yang Li in 2021, is an optimization algorithm that exploits the critical strengths of the Genetic Algorithm (GA) and Differential Evolution (DE). The EES enhances global search and local exploitation capabilities by integrating elite selection, gene crossover, and regional variation. The algorithm uses two strategies: elite natural evolution and elite random mutation.
The elite natural evolution strategy focuses on enhancing local search capability by combining superior genes from multiple elite solutions, thereby accelerating the convergence speed of the algorithm. As shown in Figure 5, this strategy first selects and retains the three optimal solutions (E1, E2, and E3). In the gene crossover stage, 50% of the genes from E2 and E3 are randomly selected to create a new chromosome, N1. Then, 100(1 − sp)% of the genes from E1 are randomly selected and crossed with 100·sp% of the genes from N1 to produce a new chromosome, N2. Finally, a Gaussian local mutation is applied to N2, producing the final updated chromosome X, to improve the evolutionary efficiency of the algorithm. The mathematical model of this process is described below:
$$N_1 = (50\%\,E_2) \oplus (50\%\,E_3)$$

$$N_2 = \left(100(1 - sp)\%\,E_1\right) \oplus \left((100 \cdot sp)\%\,N_1\right)$$

$$X = N_2 + GS \cdot (N_2 - X)$$

where the symbol ⊕ denotes the crossover recombination operation of chromosomal genes, the percentage before each chromosome indicates how many of its genes are randomly selected for crossover recombination, and sp is a variable that controls the proportion of genes taken from chromosome E1.
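A compact illustration of this strategy is given below; the random gene-selection masks and the Gaussian mutation scale sigma are illustrative choices, since the text only fixes the 50% and sp-controlled proportions.

```python
import numpy as np

def elite_natural_evolution(e1, e2, e3, sp, sigma=0.1, rng=None):
    """Elite natural evolution: crossover of E1-E3 plus a Gaussian local mutation."""
    rng = np.random.default_rng() if rng is None else rng
    dim = e1.size
    # N1: half of the genes from E2, the rest from E3
    mask_half = rng.random(dim) < 0.5
    n1 = np.where(mask_half, e2, e3)
    # N2: 100*(1 - sp)% of the genes from E1, 100*sp% from N1
    mask_e1 = rng.random(dim) < (1.0 - sp)
    n2 = np.where(mask_e1, e1, n1)
    # Gaussian local mutation around N2 produces the updated chromosome X
    return n2 + sigma * rng.standard_normal(dim) * n2
```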
The elite random mutation strategy aims to enhance the algorithm’s exploration ability in the later stages by randomly mutating some genes of the elite chromosomes, thus improving the effect of jumping out of the local optimal solution. As shown in Figure 6, this strategy involves generating a randomly mutated chromosome R 1 and performing gene crossover.
The position of R1 is determined by the current optimal elite chromosome E1, the center of the search space CL, and a random number GS_num generated from a Gaussian distribution. This allows R1 to fluctuate close to E1 but over a broader range. Next, a dynamic selection of genes from E1, based on the parameter sp, is crossed with the remaining genes of R1 to create a new chromosome, X. This process preserves the elite genes from E1 while introducing the mutational characteristics of R1. Its mathematical description is shown below:

$$R_1 = CL + GS_{num} \cdot (CL - E_1)$$

$$X = \left(100(1 - sp)\right)\%\,E_1 \oplus (100 \cdot sp)\%\,R_1$$

$$sp = rand(1, 1) \times \left(1 - \frac{t}{T}\right)$$

where CL denotes the center position vector of the search space, and GS_num denotes a number generated from a Gaussian probability distribution (μ = 0, σ = 1).
The EES has a critical control parameter, sp, which controls the proportion of optimal parental genes in the new chromosome and governs the transition from exploration to exploitation throughout the EES. When sp is large, the resulting new chromosome tends to contain more mutated genes; conversely, when sp is small, it contains more genes from the best parent.
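The elite random mutation strategy can be sketched in the same way. Here T in the expression for sp is interpreted as the maximum number of iterations, and the centre of the search space is taken as the midpoint of the bounds; both are assumptions.

```python
import numpy as np

def elite_random_mutation(e1, lb, ub, t, max_iter, rng=None):
    """Elite random mutation: build R1 around the search-space centre and cross it with E1."""
    rng = np.random.default_rng() if rng is None else rng
    dim = e1.size
    cl = (lb + ub) / 2.0                                  # centre of the search space (assumed)
    gs_num = rng.standard_normal(dim)                     # Gaussian random numbers
    r1 = cl + gs_num * (cl - e1)                          # randomly mutated chromosome R1
    sp = rng.random() * (1.0 - t / max_iter)              # exploration-to-exploitation control
    mask_e1 = rng.random(dim) < (1.0 - sp)                # keep 100*(1 - sp)% of genes from E1
    return np.where(mask_e1, e1, r1)
```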

2.5.3. Improved Strategy

The PKO algorithm, as an advanced swarm intelligence optimization method, shows excellent performance in multimodal and composite function optimization. However, in high-dimensional discrete optimization problems and specific engineering applications, the PKO tends to degrade in efficiency and become stuck in local optima, limiting its broader applicability. To address these issues, this paper proposes an improved PKO algorithm that dynamically incorporates the EES to enhance local search capabilities and the diversity of global exploration. This improvement increases the adaptability and efficiency of the algorithm in complex optimization tasks, particularly in agricultural engineering optimization problems and hyperspectral data analysis.
The enhanced PKO algorithm optimizes the exploration and exploitation phases by integrating two core mechanisms of the EES: the elite stochastic mutation mechanism and the elite natural evolution strategy. During the exploration phase, the elite stochastic mutation mechanism extensively searches the solution space, improving global search capabilities and increasing the probability of escaping local optima. In the exploitation phase, the elite natural evolution strategy performs fine-tuned searches around potential optimal solutions, improving local search precision and accelerating the algorithm’s convergence. Combining these two strategies allows the PKO algorithm to effectively balance global exploration and local optimization, significantly improving overall optimization performance. Ultimately, the enhanced EESPKO algorithm can converge to the global optimum faster during optimization, demonstrating greater adaptability and stability.
Figure 7 illustrates the parameter optimization process of the EESPKO-SVR algorithm, which combines the advantages of the EES and the Pied Kingfisher Optimization (PKO) algorithm to efficiently optimize the critical parameters of the SVR model ( C , g , and ε ).
First, the algorithm initializes the position of each individual in the population. It sets critical parameters such as the population size (N), dimension (Dim), maximum number of iterations (Max_Iter), and fitness function (such as the MSE). During optimization, the algorithm calculates the exploration probability (exploration_prob), which determines whether an individual enters the global exploration phase or the local exploitation phase. In the global exploration phase, individuals traverse the parameter space in search of potential global optima, avoiding being trapped in local optima. In the local exploitation phase, individuals perform more refined searches around the best solution to improve the optimization result. During each iteration, the algorithm enters the symbiotic phase. In this phase, the algorithm introduces additional random perturbations to encourage individuals to escape local optima and explore potential global optima. The main goal of the symbiotic phase is to enhance the algorithm’s global search capability through local escape strategies, thereby reducing the likelihood of being trapped in local optima. Even when the escape strategy is not triggered, the algorithm evaluates new fitness values and updates the current best parameter combination as the fitness value improves. The optimization process continues to iterate until the preset maximum number of iterations (Max_Iter) is reached, ultimately outputting the optimal parameters for building the SVR model, ensuring that the model has excellent generalization ability and accuracy in prediction tasks.
Algorithm 1 provides the pseudocode for the Elite Evolutionary Strategy–Pied Kingfisher Optimizer (EESPKO) algorithm and serves as a reference for further research and implementation.
Algorithm 1: Pseudocode of the EESPKO algorithm
Inputs: the maximum number of iterations (Max_Iter) and the population size (N)
Outputs: the location of the EESPKO and its corresponding fitness value
% Initialization
Initialize the population X_i (i = 1, 2, ..., N)
Calculate the initial fitness values of the EESPKO
% Main loop
while (t < Max_Iter + 1) do
        % Calculate the probability of exploration
        exploration_prob = 0.8 * (1 − t/Max_Iter)
        % Determine whether to execute the exploration or exploitation strategy
        if (rand() < exploration_prob) then
                Apply elite random mutation to update the position of the EESPKO
        else
                Apply the elite natural evolution strategy to update the position of the EESPKO
        end
        Adjust the position if it moves out of the search bounds
        % Calculate and update fitness values
        Calculate the fitness values of the updated EESPKO
        if (new solutions are superior) then
                Replace the old solutions with the new ones
                Update Best_Position and Best_Fitness
        end
        % Local escape strategy
        if (rand() > (1 − PE)) then
                Update the position of the pied kingfishers using Formula (12)
        else
                Update the position of the pied kingfishers using Formula (12)
        end
        Calculate the fitness values of the pied kingfishers after the local escape
        if (new solutions are superior) then
                Replace the old solutions with the new ones
                Update Best_Position as the location of Best_Fitness
        end
        t = t + 1
end
Return Best_Position, Best_Fitness

3. Results

3.1. Division of Modeling Sample Set

The dataset was randomly divided into training and test sets at a ratio of 4:1. Table 1 shows significant differences in the dataset’s statistical indicators, indicating that the quality components of the samples exhibit diversity.
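For reference, this split corresponds to a single call in scikit-learn (assuming X holds the preprocessed spectra and y the measured TP values; the fixed random seed is an illustrative choice).

```python
from sklearn.model_selection import train_test_split

# X: (n_samples, n_bands) preprocessed spectra; y: (n_samples,) measured TP values.
# test_size=0.2 gives the 4:1 split described above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```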

3.2. Preprocessing of Hyperspectral Data

In order to reduce noise interference and improve the correlation between the Fu brick tea spectral data and its quality components, we performed preprocessing on the corrected spectral data in this experiment. The effectiveness of different preprocessing methods was evaluated using an SVR model, with evaluation metrics including Rp2, RMSEP, Rc2, RMSEC, MAE, and RPD. As shown in Table 2, the performance of the tea polyphenol content prediction model improved significantly after applying four preprocessing methods: MSC, SG + MSC, SNV, and VSN. In particular, the VSN method outperformed the others in all indicators, achieving Rc2 values 0.700787%, 11.85%, 8.587%, and 15.33% higher than the other three methods, respectively. However, noise reduction alone is insufficient for a comprehensive evaluation, so we also included the results after feature extraction to more accurately determine the optimal preprocessing method. Figure 8 visualizes the spectral curves after these four preprocessing methods, supporting further analysis and comparison.

3.3. Analysis of Feature Selection Scheme

This study used three feature selection methods, CARS, LLE, and the SAE, to extract key features from 123 variables. CARS focuses on removing irrelevant data and retaining the wavelengths significantly affecting tea polyphenol content. The SAE and LLE improve model performance by reducing dimensionality and solving the problems of band covariance and redundancy. By combining these methods, we accurately identified the most informative spectral bands, further improved the accuracy and stability of the model, and provided reliable support for the hyperspectral imaging detection of tea polyphenol content in Fu brick tea.

3.3.1. Feature Selection of CARS

We set the number of Monte Carlo samples to 50 and used tenfold cross-validation to evaluate model performance and the effect of feature wavelength selection. The results showed that the CARS method significantly improved model performance, with the VSN-CARS model performing the best. After 11 iterations, the root mean square error of cross-validation (RMSECV) reached its lowest point, and 53 critical wavelengths were successfully selected, with the coefficient of determination Rp2 for TPs reaching 0.8001. Figure 9 illustrates essential wavelengths related to the absorption characteristics of TPs selected through the CARS feature selection process. The significant absorption peak around 850 nm is attributed to the second overtone of the O-H stretching vibrations in TPs. The 950 and 1000 nm absorption peaks reflect the combination of C-H and O-H stretching vibrations, revealing polyphenols’ aliphatic and aromatic structures. These absorption characteristics are highly consistent with the molecular vibration modes of polyphenolic compounds, especially the C-H and O-H stretching vibrations, which are critical for the quantitative assessment of polyphenol content.

3.3.2. Feature Selection of LLE

LLE is a nonlinear manifold learning method that can effectively reveal the intrinsic nonlinear features of data [52]. In the experiments, after several optimizations and adjustments, the parameters of LLE were set to n_neighbours = 30 and n_components = 50 to preserve the key features and topological structure of the data to the maximum extent while significantly reducing the dimensionality of the data. The experimental results show that the application of different noise reduction techniques had a significant impact on the effectiveness of the LLE method, and the SNV-LLE model performed exceptionally well, with a coefficient of determination Rp2 as high as 0.8181, showing solid predictive and explanatory power.
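This configuration maps directly onto the scikit-learn implementation of LLE (where the parameter is spelled n_neighbors); the following sketch is illustrative, with X_pre denoting the preprocessed spectral matrix.

```python
from sklearn.manifold import LocallyLinearEmbedding

# LLE with the parameters reported above; X_pre is the (samples x bands) spectral matrix.
lle = LocallyLinearEmbedding(n_neighbors=30, n_components=50)
X_lle = lle.fit_transform(X_pre)
```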

3.3.3. Feature Selection of SAE

The SAE achieves nonlinear feature extraction from input data by stacking multiple layers of autoencoders. For this study, the network structure was set to 128-100-50-h-50-100-128, where h represents the number of neurons in the final coding layer, corresponding to the number of extracted feature variables. After several trials, the sigmoid activation function was selected, with the number of iterations set to 50, the batch size set to 20, the initial learning rate set to 0.001, and h set to 30. The experimental results showed that the VSN-SAE model performed the best, with a coefficient of determination Rp2 as high as 0.8965, indicating excellent predictive and explanatory power. The RMSEP of the model was as low as 0.67265.
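The SAE configuration described above can be sketched as follows. This PyTorch illustration trains the network end-to-end for brevity, whereas a stacked autoencoder is typically pre-trained layer by layer; the framework and training-loop details are assumptions, while the layer sizes, sigmoid activation, learning rate, batch size, and epoch count follow the text.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class StackedAutoencoder(nn.Module):
    """Autoencoder with the 128-100-50-h-50-100-128 layout described above (h = 30)."""
    def __init__(self, n_bands=128, h=30):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_bands, 100), nn.Sigmoid(),
            nn.Linear(100, 50), nn.Sigmoid(),
            nn.Linear(50, h), nn.Sigmoid(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(h, 50), nn.Sigmoid(),
            nn.Linear(50, 100), nn.Sigmoid(),
            nn.Linear(100, n_bands),
        )

    def forward(self, x):
        code = self.encoder(x)            # low-dimensional spectral features
        return self.decoder(code), code

def train_sae(spectra, n_epochs=50, batch_size=20, lr=1e-3):
    """Minimise the reconstruction MSE and return the encoded features."""
    x = torch.as_tensor(spectra, dtype=torch.float32)
    model = StackedAutoencoder(n_bands=x.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    loader = DataLoader(TensorDataset(x), batch_size=batch_size, shuffle=True)
    for _ in range(n_epochs):
        for (batch,) in loader:
            opt.zero_grad()
            recon, _ = model(batch)
            loss_fn(recon, batch).backward()
            opt.step()
    with torch.no_grad():
        _, features = model(x)
    return features.numpy()
```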
Figure 10 clearly illustrates the training and validation results of the SAE model. Figure 10a shows the trend of the reconstruction error during training and validation, where the error decreases rapidly and stabilizes, remaining at a low level in both the training and validation sets, indicating that the model successfully learned the data features without overfitting. Figure 10b compares the reconstructed spectral reflectance curves of the SAE model with the original spectral curves, which overlap significantly, further confirming that the dimensionality-reduced features effectively retain and reflect the primary information of the original data. Table 3 shows an overall comparison of the performance of the SAE, CARS, and LLE. The results show that the SAE method significantly outperforms CARS and LLE when combined with different preprocessing techniques. In particular, in the VSN-SAE combination, the coefficient of determination Rp2 reaches 0.8965, far exceeding the 0.8182 of SNV-LLE and 0.8001 of VSN-CARS. These results demonstrate deep learning techniques’ powerful feature extraction capability in handling complex data. Therefore, the subsequent model analysis was conducted using the dataset preprocessed with VSN-SAE.

3.4. Establishment and Analysis of Model

This experiment used three models, SVR, KNN, and BP, to predict the polyphenol content in Fu brick tea. For the KNN model, after parameter optimization, ‘n_neighbours’ was set to 4 and ‘weights’ to ‘distance’, resulting in a coefficient of determination Rp2 of 0.8935 on the test set. The optimal configuration for the BP model included a learning rate of 0.002, a batch size of 64, and 250 iterations. This model had an Rp2 of 0.8310 on the test set, although its predictive accuracy was lower than that of the other models. As shown in Table 4, SVR achieved an Rp2 of 0.8965 on the test set, demonstrating superior predictive accuracy and stability.
Despite its excellent predictive performance, SVR is highly sensitive to parameter settings, resulting in significant performance differences between configurations. To improve the model’s performance, we used the PSO, SSA, PKO, and EESPKO algorithms to optimize the key parameters C , g , and ε of the SVR model. The optimization strategy was based on minimizing the mean square error (MSE), with an initial population size of 20 and a maximum of 30 iterations. The search range for C and g was set in the range of [0.1, 100] and for ε in the range of [0.001, 0.1]. The final optimal combination of parameters was C = 11.1038, g = 3.6064, and ε = 0.001.
In this study, PKO-SVR demonstrated significant advantages across several evaluation metrics. Firstly, the RMSEP of PKO-SVR was 0.6247, and the MAE was 0.3412, representing reductions of 2.9% and 2.2% compared to SSA-SVR and reductions of 5.2% and 5.3% compared to PSO-SVR, respectively, indicating higher predictive accuracy. Additionally, the RPD value of PKO-SVR reached 3.2309, the best among the comparisons with SSA-SVR and PSO-SVR, highlighting its clear advantage in model stability and generalization capability. A higher RPD value indicates that this model can more effectively capture critical information in complex datasets, improving predictive accuracy and enhancing resistance to noise and errors. Meanwhile, the data table shows that PSO-SVR exhibits relatively weaker predictive accuracy. Its higher RMSEP and MAE values indicate that PSO struggles with global search capability when dealing with complex chemical components (such as TPs), making it prone to becoming stuck in local optima and failing to capture the nonlinear characteristics of the data fully. SSA-SVR faces similar limitations; although its performance improves, the search range remains constrained. This is primarily due to the heavy reliance of PSO and the SSA on the precise tuning of their internal parameters. For instance, during optimization, PSO requires fine adjustments of acceleration constants C1 and C2 to balance global exploration and local exploitation. At the same time, the SSA also demands the optimization of multiple weight parameters to regulate the search behavior. In contrast, the parameters of the PKO algorithm are pre-set during the design phase and do not require complex manual tuning. As a result, PKO excels in solving complex optimization problems and avoids performance fluctuations caused by improper parameter settings. By balancing global search and local exploitation more effectively, PKO significantly enhances the model’s convergence speed and predictive accuracy, demonstrating greater adaptability and stability when handling complex problems.
The deviation between actual and predicted values can visually assess the model’s fitting effectiveness. Figure 11 shows the deviation of the actual polyphenol values from the predicted values under different prediction models. The closer the regression line is to the 1:1 line, the better the model’s prediction performance.
As previously mentioned, although PKO performs well in optimization efficiency, it risks becoming trapped in local optima when handling complex, high-dimensional nonlinear problems. To address this issue, this paper introduces the EESPKO algorithm by combining the EES with PKO, further enhancing the model’s performance and enabling it to tackle local optima in complex optimization tasks more effectively. The experimental results show that EESPKO outperforms other algorithms regarding predictive accuracy and stability. Compared with the unoptimized SVR, EESPKO-SVR increased Rp2 from 0.8965 to 0.9152, reduced the RMSEP by 9.5% to 0.5876, and lowered the MAE by 9.7% to 0.3392. These results indicate that the improved EESPKO-SVR model significantly enhances predictive accuracy when handling complex data and demonstrates excellent stability and adaptability. Additionally, we conducted further validation and comparative experiments to assess the generalization capability of the EESPKO-SVR model, which will be discussed in detail in the next section.

3.5. Extended Model Validation

In this study, we validated the generalization ability of the constructed tea polyphenol content prediction model using an independent Fu brick tea AA dataset. Since the samples covered different years, regions, and batches, choosing the AA dataset as an independent validation sample was reasonable. There are significant differences between TPs and AAs in terms of chemical structure and function: AAs are low-content organic compounds that contribute to the sweetness and umami of tea, while TPs are high-content antioxidants that are more strongly influenced by environmental factors. As shown in Table 5, the maximum value of TPs (14.35) is far higher than that of AAs (0.2729), indicating a magnitude-level difference in component content. Additionally, the standard deviation of TPs is 2.01, significantly higher than that of AAs (0.0080), suggesting that tea polyphenol samples exhibit more significant variability. In contrast, AA samples are relatively more concentrated. These significant differences provide ideal conditions for testing the model’s generalization ability.
First, spectral correction was performed to remove external noise, and denoising methods such as SNV and MSC were applied to improve the signal-to-noise ratio. In the feature extraction stage, dimensionality reduction techniques such as the SAE and LLE were used to retain the most critical chemical composition information while reducing the interference of redundant features on the model. An SVR model was adopted for model selection, and various optimization algorithms (such as PSO, the SSA, PKO, and EESPKO) were used to adjust the parameters to improve the model’s predictive performance. Several evaluation metrics were used to comprehensively evaluate the model’s performance, including R2, RMSE, MAE, and RPD. Specifically, R2 measures the goodness of fit of the model, with values closer to 1 indicating stronger explanatory power; the RMSE reflects the root mean squared difference between predicted and actual values, with smaller values indicating higher overall prediction accuracy; and the MAE evaluates the average absolute difference between predicted and actual values, with a lower MAE indicating smaller individual sample errors. The RPD is used to assess the robustness of the model, with higher values indicating better prediction accuracy and stability across different samples.
After testing, the results showed that VSN-SAE performed outstandingly in improving the prediction accuracy of the AA dataset. Therefore, in the subsequent analysis, we continued to use VSN-SAE for data processing and evaluated various optimization algorithms combined with the SVR model to verify the model’s generalization ability and robustness. The results are presented in Table 6. Optimization with EESPKO resulted in the SVR model’s optimal parameters: C = 4.4038, gamma = 5.2699, and epsilon = 0.0030. The performance of the SVR model improved significantly with different optimization algorithms. The R2 value of the VSN-SAE-EESPKO-SVR model on the test set reached 0.9516, which is an improvement of 4.13% compared to the base VSN-SAE-SVR model of 0.9139, showing a better fit. Meanwhile, this model’s RMSEP and MAE values are 0.1704 and 0.0974, respectively, 25.01% and 38.52% lower than the 0.2272 and 0.1584 of the base model, reflecting higher prediction accuracy. Compared to other optimization algorithms, the EESPKO model shows significant advantages in various vital indicators. Compared to the PSO algorithm, the RMSEP and MAE of EESPKO are reduced by 24.26% and 24.57%, respectively; compared to the SSA algorithm, they are reduced by 18.63% and 16.18%, respectively; and compared to the PKO, these two indicators are reduced by 9.61% and 9.81%, respectively. In addition, the RPD value of the EESPKO model is approximately 10% higher than that of PKO and more than 24% and 19% higher than that of PSO and the SSA, respectively. This shows that EESPKO not only significantly improves the prediction accuracy of the SVR model but also significantly enhances its robustness and generalizability.
Figure 12 shows the convergence trends of optimization algorithms on the AA and TP datasets. The EESPKO-SVR model demonstrates fast convergence and low error across both datasets. It significantly outperforms algorithms like PKO-SVR, SSA-SVR, and PSO-SVR, showcasing its stability and superiority in predicting different chemical compositions. Figure 13 further illustrates the deviation between the predicted and actual values of the model on both datasets. In the AA dataset, the predicted values closely align with the actual values, indicating high prediction accuracy. In contrast, the TP dataset shows slightly more significant deviations, likely due to greater chemical composition variability between samples. Overall, the EESPKO-SVR model exhibits solid predictive performance and excellent generalization ability across different datasets.

4. Discussion

By introducing VSN preprocessing and SAE feature extraction, this study significantly improved the accuracy and stability of the prediction models for tea polyphenol and AA content in Fu brick tea. Compared to traditional methods, the SAE effectively utilized the deep network structure to extract key features and reduce redundant information. We evaluated the predictive performance of the SVR, KNN, and BP models and verified their effectiveness. Additionally, we employed the PKO heuristic optimization algorithm combined with the EES, which significantly enhanced the model’s optimization efficiency and prediction accuracy, particularly in predicting tea polyphenol and AA content. To ensure the robustness of the results, all experiments were repeated multiple times, and the analysis was based on the average results. The experiment was set with 30 iterations (epochs), and the MSE was used as the objective function for the optimization algorithms. The results indicated that EESPKO showed significant advantages during the optimization process. In the prediction of tea polyphenol content, EESPKO minimized the MSE within five iterations, while PKO required ten iterations, improving the convergence speed by 50%. Regarding computational efficiency, the total running time of EESPKO was approximately 240 s, 30 s shorter than PKO, representing a 10% improvement. Moreover, the final MSE of EESPKO was 0.345, lower than PKO’s 0.37, a reduction of 6.8%, further highlighting its significant improvement in prediction accuracy and optimization stability. In the prediction of AA content, EESPKO also demonstrated excellent optimization performance. EESPKO-SVR converged within approximately seven iterations and remained stable, while PKO-SVR required more iterations to stabilize, further validating the critical role of the EES in accelerating convergence. Compared to PKO, EESPKO reduced the MSE on the AA dataset from 0.0335 to 0.0290, a reduction of approximately 13.4%. This indicates that EESPKO not only performed well in the prediction of tea polyphenol content but also exhibited excellent generalization capability and stability in predicting AA content.
To further verify EESPKO’s advantages, we also evaluated the performance of PSO-SVR and SSA-SVR under the same experimental conditions. The results showed that the four methods ranked EESPKO > PKO > SSA > PSO regarding convergence speed and prediction accuracy. By incorporating the Elite Evolution Strategy, EESPKO enhanced the model’s global search capability and optimization efficiency, establishing itself as the preferred solution for addressing complex nonlinear regression problems. It demonstrated remarkable adaptability and robustness in predicting tea polyphenol and AA content.
Although EESPKO demonstrates excellent optimization efficiency and prediction accuracy, its computational resource requirements are relatively high, especially when handling large datasets. Despite the improved convergence speed, the overall computational load remains significant. In addition, the algorithm’s structure is relatively complex, which may increase the difficulty of application in scenarios requiring rapid development or deployment. While this method has shown excellent performance in hyperspectral data analysis, its applicability to other data types or domains has yet to be fully validated.
The enhanced PKO-SVR, combined with EES and hyperspectral data analysis technology, demonstrates significant optimization efficiency and advantages in prediction accuracy. By accelerating convergence speed, this method effectively improves data analysis efficiency and reduces the need for repetitive experiments, thereby better preserving sample integrity and reducing wasted experimental resources. The enhanced PKO-SVR is expected to be widely used in various fields, especially in scenarios requiring high-precision analysis and sample protection, such as food quality control, agricultural production monitoring, and environmental monitoring. As the scale of data continues to grow and the need for optimization increases, the advantages of this method in handling large datasets will become even more apparent. In addition, future research could focus on further reducing the computational resource requirements, simplifying the algorithm’s application process, and extending its applicability to a broader range of practical scenarios.

5. Conclusions

In this study, the polyphenol content of Fu brick tea was successfully predicted by combining hyperspectral technology with the improved EESPKO-SVR model. The results show that VSN has excellent noise reduction performance, while the SAE effectively extracts nonlinear features through its deep network structure, improving the model’s accuracy and generalization ability. Optimized by the EESPKO algorithm, the SVR model achieved an R2 value of 0.9152 on the test set, demonstrating good robustness and proving its suitability for the online monitoring and intelligent assessment of tea polyphenol content. Overall, hyperspectral technology and the EESPKO-SVR model offer significant advantages over traditional chemical analysis methods, such as ease of use, rapid response, and support for non-destructive testing. This method improves experimental efficiency and effectively preserves sample integrity, making it suitable for precise control and quality assessment in tea processing. Although this technique relies on hyperspectral equipment and complex data processing, which can increase costs, its application prospects will continue to expand as technology advances and algorithms are optimized.
Future research can expand the application potential and practical value of the improved PKO-SVR model in several ways. The model can predict critical components such as sugar and protein in the agricultural and food sectors, enabling more comprehensive quality assessments. This can help improve food quality control and provide scientific guidance for grading and pricing agricultural products. Combined with hyperspectral imaging technology, the enhanced PKO-SVR also shows broad application potential in soil composition analysis and crop quality assessment, providing critical data support for optimizing agricultural production and environmental monitoring. As deep learning technology advances, this model can be integrated with advanced feature selection algorithms to further enhance its adaptability and predictive accuracy, thus achieving more efficient data processing. In addition, the improved PKO-SVR has broad application prospects in medical diagnostics, environmental monitoring, and materials science. Future research can explore the cross-disciplinary applications of the model and fully exploit its advantages in various data processing scenarios.

Author Contributions

Conceptualization, Methodology, Software, Validation, and Writing—original draft: J.G.; Validation and Investigation: Y.D.; Methodology and Validation: C.L.; Resources, Supervision, and Funding acquisition: K.F. and G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Hunan Province Key R&D Plan Project, grant number 2023NK2011; the Hunan Provincial Social Science Achievement Evaluation Committee Project, grant number XSP24YBZ130; and the Science Research Excellent Youth Projects of the Hunan Provincial Department of Education, grant numbers 23B0920 and 23B0906.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Acknowledgments

The authors sincerely thank the editors and all anonymous reviewers for their constructive comments on this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pastoriza, S.; Mesías, M.; Cabrera, C.; Rufián-Henares, J.A. Healthy properties of green and white teas: An update. Food Funct. 2017, 8, 2650–2662. [Google Scholar] [CrossRef]
  2. Chen, G.; Yuan, Q.; Saeeduddin, M.; Ou, S.; Zeng, X.; Ye, H. Recent advances in tea polysaccharides: Extraction, purification, physicochemical characterization and bioactivities. Carbohydr. Polym. 2016, 153, 663–678. [Google Scholar] [CrossRef]
  3. Zheng, W.J.; Wan, X.C.; Bao, G.H. Brick dark tea: A review of the manufacture, chemical constituents and bioconversion of the major chemical components during fermentation. Phytochem. Rev. 2015, 14, 499–523. [Google Scholar] [CrossRef]
  4. Lin, F.-J.; Wei, X.-L.; Liu, H.-Y.; Li, H.; Xia, Y.; Wu, D.-T.; Zhang, P.-Z.; Gandhi, G.R.; Hua-Bin, L.; Gan, R.-Y. State-of-the-art review of dark tea: From chemistry to health benefits. Trends Food Sci. Technol. 2021, 109, 126–138. [Google Scholar] [CrossRef]
  5. Kang, D.; Su, M.; Duan, Y.; Huang, Y. Eurotium cristatum, a potential probiotic fungus from Fuzhuan brick tea, alleviated obesity in mice by modulating gut microbiota. Food Funct. 2019, 10, 5032–5045. [Google Scholar] [CrossRef]
  6. Li, M.Y.; Xiao, Y.; Zhong, K.; Bai, J.R.; Wu, Y.P.; Zhang, J.Q.; Gao, H. Characteristics and chemical compositions of Pingwu Fuzhuan brick-tea, a distinctive post-fermentation tea in Sichuan province of China. Int. J. Food Prop. 2019, 22, 878–889. [Google Scholar] [CrossRef]
  7. Li, Q.; Jin, Y.; Jiang, R.; Xu, Y.; Zhang, Y.; Luo, Y.; Liu, Z. Dynamic changes in the metabolite profile and taste characteristics of Fu brick tea during the manufacturing process. Food Chem. 2021, 344, 128576. [Google Scholar] [CrossRef] [PubMed]
  8. Zhu, M.Z.; Li, N.A.; Zhou, F.; Ouyang, J.; Lu, D.M.; Xu, W.; Wu, J.L. Microbial bioconversion of the chemical components in dark tea. Food Chem. 2020, 312, 126043. [Google Scholar] [CrossRef] [PubMed]
  9. Huang, Y.; Xing, K.; Qiu, L.; Wu, Q.; Wei, H. Therapeutic implications of functional tea ingredients for ameliorating inflammatory bowel disease: A focused review. Crit. Rev. Food Sci. Nutr. 2022, 62, 5307–5321. [Google Scholar] [CrossRef]
  10. Zhao, Y.Q.; Jia, W.B.; Liao, S.Y.; Xiang, L.; Chen, W.; Zou, Y.; Zhu, M.Z.; Xu, W. Dietary assessment of ochratoxin A in Chinese dark tea and inhibitory effects of tea polyphenols on ochratoxigenic Aspergillus niger. Front. Microbiol. 2022, 13, 1073950. [Google Scholar] [CrossRef]
  11. Wang, S.T.; Cui, W.Q.; Pan, D.; Jiang, M.; Chang, B.; Sang, L.X. Tea polyphenols and their chemopreventive and therapeutic effects on colorectal cancer. World J. Gastroenterol. 2020, 26, 562–597. [Google Scholar] [CrossRef] [PubMed]
  12. Zhao, Y.; Zhang, X. Interactions of tea polyphenols with intestinal microbiota and their implication for anti-obesity. J. Sci. Food Agric. 2020, 100, 897–903. [Google Scholar] [CrossRef] [PubMed]
  13. Fernandes, L.; Cardim-Pires, T.R.; Foguel, D.; Palhano, F.L. Green Tea Polyphenol Epigallocatechin-Gallate in Amyloid Aggregation and Neurodegenerative Diseases. Front. Neurosci. 2021, 15, 718188. [Google Scholar] [CrossRef] [PubMed]
  14. Mhatre, S.; Srivastava, T.; Naik, S.; Patravale, V. Antiviral activity of green tea and black tea polyphenols in prophylaxis and treatment of COVID-19: A review. Phytomedicine 2021, 85, 153286. [Google Scholar] [CrossRef]
  15. Wang, H.; Teng, J.; Huang, L.; Wei, B.; Xia, N. Determination of the variations in the metabolic profile and sensory quality of Liupao tea during fermentation through UHPLC–HR–MS metabolomics. Food Chem. 2023, 404, 134773. [Google Scholar] [CrossRef]
  16. Xie, J.; Wang, L.; Deng, Y.; Yuan, H.; Zhu, J.; Jiang, Y.; Yang, Y. Characterization of the key odorants in floral aroma green tea based on GC-E-Nose, GC-IMS, GC-MS and aroma recombination and investigation of the dynamic changes and aroma formation during processing. Food Chem. 2023, 427, 136641. [Google Scholar] [CrossRef]
  17. Tan, J.; Engelhardt, U.H.; Lin, Z.; Kaiser, N.; Maiwald, B. Flavonoids, phenolic acids, alkaloids and theanine in different types of authentic Chinese white tea samples. J. Food Compos. Anal. 2017, 57, 8–15. [Google Scholar] [CrossRef]
  18. Xu, M.; Wang, J.; Zhu, L. The qualitative and quantitative assessment of tea quality based on E-nose, E-tongue and E-eye combined with chemometrics. Food Chem. 2019, 289, 482–489. [Google Scholar] [CrossRef]
  19. Tozlu, B.H.; Okumuş, H.İ. A new approach to automation of black tea fermentation process with electronic nose. Automatika 2018, 59, 373–381. [Google Scholar] [CrossRef]
  20. Li, H.; Wang, Y.; Fan, K.; Mao, Y.; Shen, Y.; Ding, Z. Evaluation of important phenotypic parameters of tea plantations using multi-source remote sensing data. Front. Plant Sci. 2022, 13, 898962. [Google Scholar] [CrossRef]
  21. Wang, Y.; Cui, Q.; Jin, S.; Zhuo, C.; Luo, Y.; Yu, Y.; Zhang, Z. Tea Analyzer: A low-cost and portable tool for quality quantification of postharvest fresh tea leaves. LWT 2022, 159, 113248. [Google Scholar] [CrossRef]
  22. Ye, S.; Weng, H.; Xiang, L.; Jia, L.; Xu, J. Synchronously Predicting Tea Polyphenol and Epigallocatechin Gallate in Tea Leaves Using Fourier Transform–Near-Infrared Spectroscopy and Machine Learning. Molecules 2023, 28, 5379. [Google Scholar] [CrossRef] [PubMed]
  23. Yang, B.; Gao, Y.; Li, H.; Ye, S.; He, H.; Xie, S. Rapid Prediction of Yellow Tea Free Amino Acids with Hyperspectral Images. PLoS ONE 2019, 14, e0210084. [Google Scholar] [CrossRef] [PubMed]
  24. Wang, Y.-J.; Jin, G.; Li, L.-Q.; Liu, Y.; Kalkhajeh, Y.K.; Ning, J.-M.; Zhang, Z.-Z. NIR Hyperspectral Imaging Coupled with Chemometrics for Nondestructive Assessment of Phosphorus and Potassium Contents in Tea Leaves. Infrared Phys. Technol. 2020, 108, 103365. [Google Scholar] [CrossRef]
  25. Li, L.; Li, M.; Liu, Y.; Cui, Q.; Bi, K.; Jin, S.; Zhang, Z. High-sensitivity hyperspectral coupled self-assembled nanoporphyrin sensor for monitoring black tea fermentation. Sens. Actuators B Chem. 2021, 346, 130541. [Google Scholar] [CrossRef]
  26. Sun, J.; Zhou, X.; Hu, Y.; Wu, X.; Zhang, X.; Wang, P. Visualizing Distribution of Moisture Content in Tea Leaves Using Optimization Algorithms and NIR Hyperspectral Imaging. Comput. Electron. Agric. 2019, 160, 153–159. [Google Scholar] [CrossRef]
  27. Luo, X.; Xu, L.; Huang, P.; Wang, Y.; Liu, J.; Hu, Y.; Wang, P.; Kang, Z. Nondestructive Testing Model of Tea Polyphenols Based on Hyperspectral Technology Combined with Chemometric Methods. Agriculture 2021, 11, 673. [Google Scholar] [CrossRef]
  28. Mao, Y.; Li, H.; Wang, Y.; Fan, K.; Song, Y.; Han, X.; Zhang, J.; Ding, S.; Song, D.; Wang, H.; et al. Prediction of Tea Polyphenols, Free Amino Acids and Caffeine Content in Tea Leaves during Wilting and Fermentation Using Hyperspectral Imaging. Foods 2022, 11, 2537. [Google Scholar] [CrossRef]
  29. Tang, Y.; Wang, F.; Zhao, X.; Yang, G.; Xu, B.; Zhang, Y.; Xu, Z.; Yang, H.; Yan, L.; Li, L. A Nondestructive Method for Determination of Green Tea Quality by Hyperspectral Imaging. J. Food Compos. Anal. 2023, 123, 105621. [Google Scholar] [CrossRef]
  30. Khan, M.J.; Khan, H.S.; Yousaf, A.; Khurshid, K.; Abbas, A. Modern Trends in Hyperspectral Image Analysis: A Review. IEEE Access 2018, 6, 14118–14129. [Google Scholar] [CrossRef]
  31. Rahman, A.; Faqeerzada, M.A.; Cho, B.K. Hyperspectral imaging for predicting the allicin and soluble solid content of garlic with variable selection algorithms and chemometric models. J. Sci. Food Agric. 2018, 98, 4715–4725. [Google Scholar] [CrossRef] [PubMed]
  32. Feng, Z.H.; Wang, L.Y.; Yang, Z.Q.; Zhang, Y.Y.; Li, X.; Song, L.; He, L.; Duan, J.Z.; Feng, W. Hyperspectral Monitoring of Powdery Mildew Disease Severity in Wheat Based on Machine Learning. Front. Plant Sci. 2022, 13, 828454. [Google Scholar] [CrossRef] [PubMed]
  33. Li, X.; Wei, Z.; Peng, F.; Liu, J.; Han, G. Non-destructive prediction and visualization of anthocyanin content in mulberry fruits using hyperspectral imaging. Front. Plant Sci. 2023, 14, 1137198. [Google Scholar] [CrossRef]
  34. Jaiswal, G.; Rani, R.; Mangotra, H.; Sharma, A. Integration of hyperspectral imaging and autoencoders: Benefits, applications, hyperparameter tuning and challenges. Comput. Sci. Rev. 2023, 50, 100584. [Google Scholar] [CrossRef]
  35. Luo, N.; Li, Y.; Yang, B.; Liu, B.; Dai, Q. Prediction Model for Tea Polyphenol Content with Deep Features Extracted Using 1D and 2D Convolutional Neural Network. Agriculture 2022, 12, 1299. [Google Scholar] [CrossRef]
  36. Xu, M.; Sun, J.; Cheng, J.; Yao, K.; Wu, X.; Zhou, X. Non-destructive prediction of total soluble solids and titratable acidity in Kyoho grape using hyperspectral imaging and deep learning algorithm. Int. J. Food Sci. Technol. 2023, 58, 9–21. [Google Scholar] [CrossRef]
  37. Cao, W.; Li, G.; Song, H.; Quan, B.; Liu, Z. Research on Grain Moisture Model Based on Improved SSA-SVR Algorithm. Appl. Sci. 2024, 14, 3171. [Google Scholar] [CrossRef]
  38. Anggoro, D.A.; Mukti, S.S. Performance Comparison of Grid Search and Random Search Methods for Hyperparameter Tuning in Extreme Gradient Boosting Algorithm to Predict Chronic Kidney Failure. Int. J. Intell. Eng. Syst. 2021, 14, 198–207. [Google Scholar]
  39. Zhang, K.; Zuo, Z.; Zhou, C.; Chen, H.; Ding, Z. Research on Hyperspectral Timely Monitoring Model of Green Tea Processing Quality Based on PSO-LSSVR. J. Food Compos. Anal. 2024, 134, 106490. [Google Scholar] [CrossRef]
  40. Tan, K.; Liu, Q.; Chen, X.; Xia, H.; Yao, S. Estimation of Soybean Internal Quality Based on Improved Support Vector Regression Based on the Sparrow Search Algorithm Applying Hyperspectral Reflectance and Chemometric Calibrations. Agriculture 2024, 14, 410. [Google Scholar] [CrossRef]
  41. Gharehchopogh, F.S.; Namazi, M.; Ebrahimi, L.; Faraji, M.R.; Ghaffari, F. Advances in Sparrow Search Algorithm: A Comprehensive Survey. Arch. Comput. Methods Eng. 2023, 30, 427–455. [Google Scholar] [CrossRef] [PubMed]
  42. Wolpert, D.H.; Macready, W.G. No Free Lunch Theorems for Optimization. IEEE Trans. Evol. Comput. 1997, 1, 67–82. [Google Scholar] [CrossRef]
  43. Bouaouda, A.; Hashim, F.A.; Sayouti, Y.; Hussien, A.G. Pied kingfisher optimizer: A new bio-inspired algorithm for solving numerical optimization and industrial engineering problems. Neural Comput. Appl. 2024, 36, 15455–15513. [Google Scholar] [CrossRef]
  44. GB/T 8313-2018; Determination of Total Polyphenols and Catechins Content in Tea. State Market Regulatory Administration, Standardization Administration of the People’s Republic of China, China Standard Publishing Company: Beijing, China, 2019; pp. 1–9.
  45. Li, X.; Li, Z.; Yang, X.; He, Y. Boosting the generalization ability of Vis-NIR spectroscopy-based regression models through dimension reduction and transfer learning. Comput. Electron. Agric. 2021, 186, 106157. [Google Scholar] [CrossRef]
  46. Martens, H.; Jensen, S.A.; Geladi, P. Multivariate linearity transformation for near-infrared reflectance spectrometry. In Proceedings of the Nordic Symposium on Applied Statistics; Stokkand Forlag Publishers: Stavanger, Norway, 1983; pp. 205–234. [Google Scholar]
  47. Barnes, R.J.; Dhanoa, M.S.; Lister, S.J. Standard Normal Variate Transformation and De-Trending of Near-Infrared Diffuse Reflectance Spectra. Appl. Spectrosc. 1989, 43, 772–777. [Google Scholar] [CrossRef]
  48. Zhu, H.; Liu, F.; Ye, Y.; Chen, L.; Liu, J.; Gui, A.; Zhang, J.; Dong, C. Application of Machine Learning Algorithms in Quality Assurance of Fermentation Process of Black Tea--Based on Electrical Properties. J. Food Eng. 2019, 263, 165–172. [Google Scholar] [CrossRef]
  49. Rabatel, G.; Marini, F.; Walczak, B.; Roger, J.M. VSN: Variable sorting for normalization. J. Chemom. 2020, 34, e3164. [Google Scholar] [CrossRef]
  50. Huang, Z.; Sanaeifar, A.; Tian, Y.; Liu, L.; Zhang, D.; Wang, H.; Ye, D.; Li, X. Improved Generalization of Spectral Models Associated with Vis-NIR Spectroscopy for Determining the Moisture Content of Different Tea Leaves. J. Food Eng. 2021, 293, 110374. [Google Scholar] [CrossRef]
  51. Li, H.; Liang, Y.; Xu, Q.; Cao, D. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Anal. Chim. Acta 2009, 648, 77–84. [Google Scholar] [CrossRef]
  52. Xu, M.; Sun, J.; Yao, K.; Cai, Q.; Shen, J.; Tian, Y.; Zhou, X. Developing deep learning based regression approaches for prediction of firmness and pH in Kyoho grape using Vis/NIR hyperspectral imaging. Infrared Phys. Technol. 2022, 120, 104003. [Google Scholar] [CrossRef]
  53. Roweis, S.T.; Saul, L.K. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 2000, 290, 2323–2326. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Distribution map of Fu brick tea samples.
Figure 2. The hyperspectral imaging system.
Figure 3. Hyperspectral image processing. (a) The SRAnal software. (b) Region of interest selection.
Figure 4. Pied kingfisher in nature. (a) Perching. (b) Hovering.
Figure 5. A schematic diagram of the natural evolutionary strategy of the elite.
Figure 6. A schematic diagram of the elite randomized mutation strategy.
Figure 7. EESPKO-SVR parameter optimization process.
Figure 8. Pretreated spectral curves. (a) Spectra after MSC processing. (b) Spectra after SG and MSC processing. (c) Spectra after SNV processing. (d) Spectra after VSN processing.
Figure 9. Feature wavelength extraction of TPs. (a) TP CARS characterization selection. (b) The wavelength range selected by CARS, with red dots indicating the key wavelengths identified during feature selection.
Figure 10. Training results of the SAE. (a) Reconstruction error during training. (b) Original input spectra and reconstructed spectral curves.
Figure 11. A comparison of deviations in the predictive accuracy of different models. (a–f) respectively show the deviation analysis between the predicted and actual results for the BP, KNN, PSO-SVR, SSA-SVR, PKO-SVR, and EESPKO-SVR models.
Figure 12. Optimization algorithm fitness function curves. (a) Convergence curves of the optimization algorithms on the AA dataset. (b) Convergence curves of the optimization algorithms on the polyphenol dataset.
Figure 13. Comparison of predicted and actual values for EESPKO-SVR on the AA and polyphenol datasets. (a) EESPKO-SVR model performance on the AA dataset. (b) EESPKO-SVR model performance on the polyphenol dataset.
Table 1. Statistics of the tea polyphenol (TP) content of the samples.

Sample Set | Sample Size | Maximum | Minimum | Average | Standard Deviation
Training set | 1238 | 14.35 | 0.42 | 4.82 | 2.01
Testing set | 310 | 13.63 | 1.08 | 4.83 | 2.02
Table 2. Comparison of pretreatment models.

Object | Pretreatment | Rc2 | RMSEC | MAE (train) | Rp2 | RMSEP | MAE (test) | RPD
TPs | RAW | 0.4894 | 1.4341 | 0.7409 | 0.3352 | 1.6455 | 1.0060 | 1.2265
 | MSC | 0.6801 | 1.1351 | 0.6285 | 0.5822 | 1.3045 | 0.8760 | 1.5471
 | SG + MSC | 0.7153 | 1.0707 | 0.5752 | 0.6150 | 1.2521 | 0.8128 | 1.6118
 | SNV | 0.6453 | 1.1952 | 0.6519 | 0.5466 | 1.3588 | 0.8628 | 1.4852
 | VSN | 0.8009 | 0.8954 | 0.4448 | 0.7007 | 1.1039 | 0.7099 | 1.8281
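To make the pretreatment comparison in Table 2 concrete, the following is a minimal sketch of two of the standard transforms it covers, SNV and MSC, assuming `spectra` is an (n_samples, n_bands) array of raw reflectance values; the SG smoothing and VSN steps used in the study are omitted here, and the function names are illustrative rather than code from the paper.

```python
import numpy as np

def snv(spectra: np.ndarray) -> np.ndarray:
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra: np.ndarray) -> np.ndarray:
    """Multiplicative scatter correction against the mean spectrum."""
    ref = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, deg=1)  # fit s ≈ slope * ref + intercept
        corrected[i] = (s - intercept) / slope
    return corrected

if __name__ == "__main__":
    demo = np.abs(np.random.default_rng(1).normal(loc=0.5, scale=0.1, size=(5, 200)))
    print(snv(demo).shape, msc(demo).shape)  # both (5, 200)
```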
Table 3. Comparison of feature selection models.

Object | Pretreatment | Rc2 | RMSEC | MAE (train) | Rp2 | RMSEP | MAE (test) | RPD
TPs | RAW-CARS | 0.4802 | 1.4471 | 0.9539 | 0.4568 | 1.4875 | 1.0126 | 1.3568
 | MSC-CARS | 0.6763 | 1.1419 | 0.7376 | 0.6442 | 1.2038 | 0.8820 | 1.6764
 | SG + MSC-CARS | 0.7347 | 1.0337 | 0.6643 | 0.7028 | 1.1002 | 0.6778 | 1.8344
 | SNV-CARS | 0.6760 | 1.1424 | 0.7507 | 0.6431 | 1.2058 | 0.7864 | 1.6738
 | VSN-CARS | 0.8318 | 0.8231 | 0.5422 | 0.8001 | 0.9024 | 0.5962 | 2.2366
 | RAW-LLE | 0.4312 | 1.5137 | 0.8955 | 0.4548 | 1.4903 | 0.9221 | 1.3543
 | MSC-LLE | 0.7252 | 1.0522 | 0.5687 | 0.7050 | 1.0962 | 0.6696 | 1.8412
 | SG + MSC-LLE | 0.7981 | 0.9020 | 0.5228 | 0.7765 | 0.9541 | 0.5470 | 2.1154
 | SNV-LLE | 0.8339 | 0.8181 | 0.4819 | 0.8182 | 0.8606 | 0.5167 | 2.3452
 | VSN-LLE | 0.8095 | 0.8761 | 0.4988 | 0.7892 | 0.9266 | 0.5145 | 2.1782
 | RAW-SAE | 0.4479 | 1.4914 | 0.9184 | 0.4022 | 1.5605 | 0.9391 | 1.2934
 | MSC-SAE | 0.8634 | 0.7417 | 0.4108 | 0.8250 | 0.8443 | 0.5255 | 2.3904
 | SG + MSC-SAE | 0.8695 | 0.7251 | 0.4179 | 0.8640 | 0.7442 | 0.4449 | 2.7118
 | SNV-SAE | 0.8657 | 0.7357 | 0.4126 | 0.8817 | 0.6942 | 0.3987 | 2.9071
 | VSN-SAE | 0.8836 | 0.6849 | 0.3881 | 0.8965 | 0.6494 | 0.3756 | 3.1078
Table 4. Comparison of regression prediction models for TPs.

Object | Model | Rc2 | RMSEC | MAE (train) | Rp2 | RMSEP | MAE (test) | RPD
TPs | BP | 0.9168 | 0.5789 | 0.4547 | 0.8310 | 0.8298 | 0.5937 | 2.4323
 | KNN | 0.8617 | 0.7463 | 0.3971 | 0.8935 | 0.6585 | 0.3835 | 3.0648
 | SVR | 0.8988 | 0.6385 | 0.3673 | 0.8965 | 0.6494 | 0.3756 | 3.1078
 | PSO-SVR | 0.8971 | 0.6437 | 0.3480 | 0.8933 | 0.6592 | 0.3601 | 3.0615
 | SSA-SVR | 0.9014 | 0.6304 | 0.3383 | 0.8984 | 0.6432 | 0.3487 | 3.1379
 | PKO-SVR | 0.9044 | 0.6207 | 0.3356 | 0.9042 | 0.6247 | 0.3412 | 3.2309
 | EESPKO-SVR | 0.9016 | 0.6295 | 0.3485 | 0.9152 | 0.5876 | 0.3392 | 3.4345
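As a brief consistency check of the figures reported above, the snippet below computes R2, RMSE, and RPD under their usual definitions (RPD taken as the ratio of the test-set standard deviation to RMSEP). With the rounded values from Tables 1 and 4, 2.02 / 0.5876 ≈ 3.44, in line with the RPD of 3.4345 reported for EESPKO-SVR. The helper `metrics` is an illustrative assumption, not code from the study.

```python
import numpy as np

def metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """R2, RMSE, and RPD under their usual definitions."""
    residuals = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"R2": 1.0 - ss_res / ss_tot,
            "RMSE": rmse,
            "RPD": float(np.std(y_true, ddof=1)) / rmse}

# Rounded values from Tables 1 and 4 for the TP test set:
print(2.02 / 0.5876)  # ≈ 3.44, consistent with the reported RPD of 3.4345
```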
Table 5. Statistics of the amino acid (AA) content of the samples.

Sample Set | Sample Size | Maximum | Minimum | Average | Standard Deviation
Training set | 1277 | 0.2729 | 0.2214 | 0.2361 | 0.0080
Testing set | 319 | 0.2729 | 0.221 | 0.2356 | 0.0077
Table 6. Comparison of regression prediction models for AAs.

Object | Model | Rc2 | RMSEC | MAE (train) | Rp2 | RMSEP | MAE (test) | RPD
AAs | VSN-SAE-SVR | 0.9054 | 0.2469 | 0.1771 | 0.9139 | 0.2272 | 0.1584 | 3.4076
 | VSN-SAE-PSO-SVR | 0.9162 | 0.2325 | 0.1408 | 0.9155 | 0.2251 | 0.1295 | 3.4395
 | VSN-SAE-SSA-SVR | 0.9266 | 0.2175 | 0.1258 | 0.9268 | 0.2095 | 0.1162 | 3.6953
 | VSN-SAE-PKO-SVR | 0.9376 | 0.2006 | 0.1167 | 0.9413 | 0.1875 | 0.1080 | 4.1288
 | VSN-SAE-EESPKO-SVR | 0.9348 | 0.2050 | 0.1191 | 0.9516 | 0.1704 | 0.0974 | 4.5439
