Article

Hyperspectral Imaging Combined with Deep Transfer Learning to Evaluate Flavonoids Content in Ginkgo biloba Leaves

by Jinkai Lu 1, Yanbing Jiang 1, Biao Jin 1, Chengming Sun 2 and Li Wang 1,*
1 College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou 225009, China
2 Jiangsu Key Laboratory of Crop Genetics and Physiology/Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou 225009, China
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2024, 25(17), 9584; https://doi.org/10.3390/ijms25179584
Submission received: 26 June 2024 / Revised: 5 August 2024 / Accepted: 31 August 2024 / Published: 4 September 2024
(This article belongs to the Section Molecular Plant Sciences)

Abstract

Ginkgo biloba is a well-known economic tree species. Ginkgo leaves are used as raw materials for medicines and health products because they are rich in active ingredients, especially flavonoids. Because the routine measurement of total flavonoids is time-consuming and destructive, rapid, non-destructive detection of total flavonoids in ginkgo leaves is important to producers and consumers. Hyperspectral imaging is a rapid and non-destructive technique for determining the total flavonoid content. In this study, we compared six modeling methods and three spectral preprocessing methods. Bayesian Ridge (BR) and multiplicative scatter correction (MSC) were selected as the best model and the best pretreatment method, respectively. The spectral prediction results based on the BR + MSC treatment were accurate (R²_Test = 0.87; RMSE_Test = 1.03 mg/g), showing a high correlation with the analytical measurements. We also found that the more numerous and deeper the leaf clefts, the higher the flavonoid content, which helps to evaluate leaf quality more quickly and easily. In short, hyperspectral imaging is an effective technique for rapid and accurate determination of total flavonoids in ginkgo leaves and has great potential for developing an online quality detection system for ginkgo leaves.

1. Introduction

Ginkgo biloba L. is a very ancient species, known as a living fossil in the plant kingdom [1]. In addition, ginkgo is also an extremely important and famous economic species, and its leaves have been cultivated for pharmaceutical and industrial products, which are widely produced around the world [2,3]. The G. biloba leaf is rich in flavonoids, which can modulate the levels of reactive oxygen species (ROS), thereby helping to maintain redox homeostasis and prevent DNA damage [4]. For instance, under the stress of ultraviolet B (UV-B) radiation, the accumulation of flavonoids can serve as a “sunscreen” to protect cells from radiation-induced damage [5]. Due to the rich content of flavonoids in its leaves, it has become a raw material for various medicinal products [6]. The extract of G. biloba leaves (GBE) is commonly utilized for the treatment of cardiovascular diseases frequently encountered in the elderly population. The core constituents of GBE are flavonoid compounds, and the medicinal extract derived from dried G. biloba leaves is typically standardized to contain 24% flavonoids [7].
Quality attributes determine the commercial value of G. biloba. Generally, the higher the content of medicinal ingredients in ginkgo leaves, the higher the economic value. The production of flavonoid compounds in ginkgo is influenced by many factors, including the selected variety [8,9], harvest time [9], environmental conditions [4,5,10], and the extraction process [11,12]. In addition, the flavonoid content is also affected by the development status, and the older the plants, the lower the flavonoid content. Under normal circumstances, ginkgo leaves that are more than 5 years old do not have medicinal extraction value due to their low content of effective ingredients [13]. Therefore, the quantitative determination of flavonoid compound concentration in ginkgo leaves has important significance for quality sorting and consumption. However, conventional methods for determining the flavonoid content, such as spectrophotometry and high-performance liquid chromatography (HPLC), are destructive, with low efficiency, prolonged sampling time, strong dependence on reagents, and high cost. They are also only suitable for small batches of samples and require personnel with specialized experimental knowledge and skills, which under normal circumstances makes it difficult to meet the detection requirements. Therefore, there is an urgent need for online monitoring of the flavonoid content in G. biloba leaves to address the issues of rapid, accurate, and non-destructive measurement of the flavonoid content.
Machine vision technology employs optical devices to capture real-world imagery, subsequently processing and analyzing the image information using computers to perform preset operations on required information or control mechanical execution devices. This constitutes a non-contact measurement technique. Currently, this technology has become one of the most commonly used and efficient methods for non-destructive quality inspection of leaves [14,15]. It can combine leaf images with machine learning algorithms to classify leaves of different qualities by analyzing the differences in phenotypic characteristics such as size, shape, and texture. However, it is challenging to apply this method to leaves with similar phenotypes [16]. Fortunately, hyperspectral imaging (HSI), a high-throughput phenotyping technique, can overcome this issue by integrating the spectral information of the leaves. As a rapid, accurate, and non-destructive technique, HSI has been widely applied in the detection of various agricultural products to overcome phenotypic bottlenecks in large-scale production [17]. For instance, this technology has been utilized to differentiate the characteristics of fresh tea leaves to determine the optimal harvest time [15]. Additionally, the estimation of chlorophyll levels in wheat using HSI allows for the rapid screening of nitrogen responses in large wheat populations [17]. Furthermore, this method can effectively distinguish and classify target objects or predict crop traits by detecting subtle differences in the chemical composition and distribution within the leaves [18], such as the anthocyanin content in purple-fleshed sweet potatoes [19], the soluble solid content of apples [20], the total flavonoid content (TFC) in okra (Abelmoschus esculentus L.) pods [21], and the total polysaccharides and total flavonoids in Chrysanthemum morifolium [22].
Furthermore, HSI technology has been used for rapid, non-destructive testing and visualization of the quality of some medicinal plants. In previous work, HSI has been combined with regression analysis for estimating the concentration of tetrahydrocannabinolic acid [23] and cannabidiolic acid in hemp (Cannabis sativa L.) [24]. The above research shows the feasibility of using HSI technology to predict the quality of agro-products. However, to our knowledge, there has been limited research on the use of HSI technology to determine and visualize the total flavonoid content of ginkgo leaves.
However, the combination of hyperspectral technology with multivariate analysis still faces several challenges in practical applications. For instance, the accuracy and robustness of different algorithms in detecting the chemical composition of plant leaves using high-resolution images can vary between training and validation datasets. Additionally, the generalization capabilities of these algorithms in the resolution domain may also differ [21,23]. In particular, the acquired spectral data are influenced by a multitude of external factors, such as noise in the detection environment, variations in the chemical and physical properties of the samples, and measurement errors of the instruments [25]. The spectral preprocessing methods can effectively address these issues. Spectral preprocessing can reduce or eliminate the impact of various non-target factors, ensuring the universality and effectiveness of spectral data, and enhancing the predictive power and robustness of the models [26,27]. Specifically, different preprocessing methods can lead to varying performance of spectral models. Generally speaking, the optimal choice of spectral preprocessing is empirical and exploratory [28]. Therefore, it is crucial to select the appropriate preprocessing method to improve the accuracy of the model.
Deep transfer learning is an emerging subfield within the domain of machine learning, renowned for its capability to handle extremely large-scale datasets and its exceptional generalization performance on unseen data. However, due to the scarcity of relevant datasets, there have been few reports on the assessment of flavonoid content in Ginkgo biloba leaves using deep learning and hyperspectral imaging technologies. In this study, we explore the feasibility of utilizing hyperspectral imaging (HSI) technology for rapid, non-destructive prediction and visualization of total flavonoid content in ginkgo leaves. Initially, we acquire hyperspectral images of 380 sets of G. biloba leaf samples across the 400~1000 nm wavelength range, alongside reference values for total flavonoids, and extract the average spectral data from the region of interest (ROI). The spectral data are then subjected to multiplicative scatter correction (MSC), standard normal variate transformation (SNV), and Savitzky–Golay (SavGol) preprocessing to refine the spectral characteristics. Feature variables are extracted using the OTSU algorithm. Ultimately, the dataset was randomly partitioned into a 4:1 ratio, resulting in a training set comprising 304 samples and a testing set with 76 samples. Six predictive models were established and their accuracy was evaluated to determine the optimal model for predicting the total flavonoid content in leaves.

2. Results

2.1. Differences in Ginkgo Leaf Shapes and Total Flavonoid Content

We randomly selected a ginkgo tree and found that the leaves in different parts had different morphological characteristics (Figure 1a). We divided them into nine types according to the leaf cleft and edge angle. Further observation showed that the leaves on the upper part of the trunk usually had multiple clefts, the middle leaves had fewer clefts, and the lower leaves had the fewest. The total flavonoid content was highest in leaves from the upper part of the trunk, followed by the middle part, and lowest in the lower part (Figure 1b,c). These results indicated that the content of total flavonoids in ginkgo leaves was related to leaf phenotype and leaf location.

2.2. Flavonoid Composition Determination of Ginkgo Leaves and Sample Set for HSI

Since the TFC of ginkgo leaves was related to leaf phenotype, 407 groups of ginkgo leaves from 1 to 5 years old were randomly sampled, and their spectral information was acquired using a hyperspectral imaging system (Figure 2). The statistical results of TFC in the 407 groups of ginkgo leaf samples, measured by spectrophotometry, are presented in Figure 3. The TFC values range from 6.1969 mg·g−1 to 20.5526 mg·g−1, which is comparable to the data in previous studies and supports the use of this dataset for establishing a TFC prediction model [13,29]. Furthermore, the flavonoid content of most samples was concentrated at 12~15 mg·g−1 (Figure 3a), which is consistent with the typical flavonoid content found in leaves from 1 to 5 years of age [13]. In particular, the leaf lobe phenotype was correlated with the flavonoid content of the leaves. We found that the TFC was higher in lobed leaves than in unlobed leaves (Figure 3b). This could be particularly useful if the producer decides to harvest leaves with a higher flavonoid content. Leaf lobing is an important indicator of leaf age, and in general, younger leaves have more and deeper clefts. Previous studies have reported a greater accumulation of flavonoids in young ginkgo leaves compared to older leaves [29]. Therefore, we speculate that the association between leaf lobing and flavonoid content is mainly driven by plant age.

2.3. Spectra Characteristics of Ginkgo Leaves

The average reflectance spectra of all samples are depicted in Figure 4. Across the entire visible to near-infrared range from 381.8 nm to 1020.5 nm, the spectra of different samples intersect and overlap in places, yet all spectra exhibit similar trends. The variations observed in the visible region around 500~700 nm are attributed to chlorophyll, which is associated with the nitrogen content in the leaves of green plants. The rapid changes in reflectance within the 680~750 nm range are indicative of the well-known “red edge” of plants in the electromagnetic spectrum [30]. Furthermore, minor weak peaks observed in the range of 750 nm to 930 nm are predominantly attributed to the third overtone stretching of the O-H functional groups associated with water in the ginkgo leaf samples. An absorption region observed in the 930~1000 nm range corresponds to the third harmonic of the C-H functional groups and the second overtone of the O-H functional groups [31]. The variations in the spectral characteristics of ginkgo leaves suggest that hyperspectral imaging holds potential for predicting flavonoid content.

2.4. Analysis of Different Models

In this study, we employed deep learning for regression analysis. To train a deep learning model, a training set and a test set are required. Spectral outliers were identified using the Hotelling T2 test. Twenty samples were identified as spectral outliers and excluded from the sample set, resulting in a final set of 380 samples that were matched between the corrected hyperspectral data and the flavonoid data. The 380 samples were randomly divided into a training set and a test set in a 4:1 ratio. Subsequently, six different models for total flavonoid content (TFC) analysis (Support Vector Regression, SVR; Partial Least Squares Regression, PLSR; Bayesian Ridge Regression, BR; Linear Regression; Lasso Regression; and Ridge Regression) were established based on the previous data. These models were designed to capture the relationship between the ginkgo leaf spectral data and the TFC values by inputting the entire spectral matrix and the TFC values. To mitigate the likelihood of model overfitting, the GridSearch method was applied to the training set during the model construction process.
The relationship between the measured and the modeled leaf flavonoid content is shown in Figure 5. The regression results showed that the PLSR and BR models performed best among the six models. The cross-validated R²_Test of the PLSR model was 0.83 with an RMSE_Test of 1.2 mg·g−1. The BR model outperformed the PLSR model, with R²_Test and RMSE_Test of 0.86 and 1.10 mg·g−1, respectively. The results indicate that establishing a predictive model for the total flavonoid content (TFC) in ginkgo leaves based on visible and near-infrared spectroscopy is feasible. The superior performance of the linear models may be attributed to the predominance of linear patterns in the relationship between the spectra and the TFC.

2.5. Effects of Different Pretreatments on Model Transfer

To select the optimal preprocessing method, the preprocessed spectral data were used as input variables in the PLSR and BR models to establish the corresponding prediction model (Figure 6). Three pretreatment methods were employed: MSC, SNV, and SavGol. Compared with the results of raw spectral data, the spectral pretreatment methods of MSC and SNV can improve the prediction performance of PLSR and BR models, while SavGol reduces the predictive performance of PLSR and BR models. Furthermore, the PLSR model using the preprocessing method of MSC showed more accuracy than other models, with RTest2 increasing by 1.20% and RMSETest declining by 4.17% compared to models built with original spectra. Additionally, the BR model combined with MSC preprocessing showed better performance than the BR models combined with SNV, with RTest2 increasing by 1.15% and RMSETest declining by 6.36% based on the raw spectra modeling (Figure 6). Based on the aforementioned analysis, MSC was chosen as the optimal preprocessing method, demonstrating strong generalization capability.
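As a quick arithmetic check, the relative changes quoted above can be applied to the raw-spectra baselines reported in Section 2.4 (BR: 0.86 and 1.10 mg·g−1; PLSR: 0.83 and 1.2 mg·g−1). This is only a worked calculation; the helper name `apply_relative_change` is hypothetical, and the PLSR + MSC values it produces are implied by the percentages rather than stated in the text.

```python
def apply_relative_change(baseline, percent):
    """Return the baseline after a relative change given in percent."""
    return baseline * (1.0 + percent / 100.0)

# BR model: raw-spectra baseline plus the reported changes after MSC
br_r2 = apply_relative_change(0.86, +1.15)
br_rmse = apply_relative_change(1.10, -6.36)

# PLSR model: raw-spectra baseline plus the reported changes after MSC
plsr_r2 = apply_relative_change(0.83, +1.20)
plsr_rmse = apply_relative_change(1.2, -4.17)

print(f"BR + MSC:   R2 = {br_r2:.2f}, RMSE = {br_rmse:.2f} mg/g")
print(f"PLSR + MSC: R2 = {plsr_r2:.2f}, RMSE = {plsr_rmse:.2f} mg/g")
```

The BR + MSC figures obtained this way agree with the values given in the abstract (0.87 and 1.03 mg/g).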

3. Discussion

Flavonoids are the most important secondary metabolites in Ginkgo biloba, and the content of flavonoids determines the value of ginkgo leaves [32]. The traditional methods for the determination of flavonoids are ultraviolet spectrophotometry and liquid chromatography-mass spectrometry, which are time-consuming and difficult to perform in the field. Recently, hyperspectral technology and deep learning have been widely used as fast and efficient methods to predict the physiological parameters of plants [33]. In particular, the prediction of active ingredients in some plants has shown good performance, such as the prediction of tea composition [34]. Furthermore, in okra pods, Cui et al. [21] reported R2 over 0.93 for TFC. In black goji berries, Zhang et al. [35] reported R2 over 0.95, 0.91, and 0.93 for total flavonoids, total anthocyanins, and total phenolics, respectively. For cannabidiolic acid, Ooi [24] reported an average R2 value of 0.98 in Cannabis sativa L. The prediction performances for total anthocyanins, total flavonoids, and total phenolics vary across studies. In this study, we collected a large number of ginkgo leaf samples, measured their flavonoid contents, and estimated the flavonoid contents with deep learning.
However, the integration of hyperspectral technology with multivariate analysis encounters several issues in practical applications. The obtained spectra are influenced by a variety of factors, such as noise in the measurement environment, differences in the chemical and physical properties of the samples, and instrumental measurement errors [25]. These issues can make it difficult for models built on one set of samples to be used for the next set of samples. Previously established models may also be difficult to apply to other varieties or after changing measurement conditions [36]. Spectral preprocessing can eliminate background information and noise, mitigate the effects of various factors, and preserve the useful information of the samples to the greatest extent possible. Furthermore, preprocessing can also enhance the versatility of the spectrum, reduce the prediction error of the model, and improve its robustness. Therefore, spectral preprocessing is necessary to establish a reliable and stable model [37]. The common preprocessing methods include SavGol, MSC, SNV, variable sorting for normalization (VSN), and first-derivative (FD) methods [38]. Xiao et al. [25] discussed seven different spectral preprocessing methods and found that the model built by combining FD + SNV preprocessing with deep transfer learning was superior to the conventional model. In a study of the identification and classification of sea cucumbers by hyperspectral technology, VSN and SNV were found to be the best preprocessing methods for filtering out noise and scatter information in the raw spectra, which can enhance the model’s recognition performance [39]. In general, a good model has a higher R2 and a lower RMSE value. In this study, we found that after MSC and SNV pretreatment, the R2 increased and the RMSE decreased, indicating that the accuracy of the model was improved.
Similarly, in studies of capsaicin content determination, it has been found that MSC and SNV pretreatments eliminate or reduce the effects of spectral scattering, thereby improving the precision of the model [40]. In contrast, SavGol pretreatment technology did not show the same robustness as other preprocessing methods and even performed worse than unpreprocessed data. Similarly, a recent study assessing the adulteration of sesame oil also found that SavGol pretreatment methods reduce the accuracy of the model [41]. Therefore, the selection of an appropriate preprocessing method is crucial for enhancing the robustness of the model.
In addition, this study found that the TFC of ginkgo leaves was highly correlated with leaf clefts. The flavonoid content of leaves with clefts was higher than that of leaves without clefts. We speculate that this may be related to the age of the leaves. Previous research has identified age as a significant factor influencing the synthesis and accumulation of flavonoid compounds. For example, the flavonoid content in the young leaves of hawthorn and birch is markedly higher than that in mature leaves [42,43]. Similarly, young leaves of ginkgo trees possess high levels of flavonoids, whereas their content significantly decreases in mature trees [7]. Young leaves often have numerous deep clefts [29,44]. This result can help leaf harvesters evaluate the flavonoid content of leaves more intuitively and quickly.

4. Materials and Methods

4.1. Sample Preparation

An experiment was conducted in August 2021 at the leaf-use ginkgo nursery located in Pizhou City, Jiangsu Province, China (34° 36′ 27′′ N, 117° 58′ 47′′ E). A total of 407 groups of G. biloba (‘Fozhi’) leaf samples from 1 to 5 years old were collected and divided into two categories, including lobed leaves (LL, 207 samples) and unlobed leaves (UL, 200 samples). All freshly harvested G. biloba leaves were individually placed on a blackboard for the acquisition of hyperspectral imaging data. Following the data collection, the leaves were dried and ground into a fine powder for the determination of TFC.

4.2. Spectra Acquisition

Utilizing a near-infrared hyperspectral imaging system, hyperspectral imaging was conducted on fresh ginkgo leaves. The hyperspectral imaging system consists of a darkroom, a hyperspectral spectrometer, and a computer. The core component of the system was a stepping motor-driven moving platform designed for the linear scanning of samples, complemented by two symmetrically positioned halogen lamps that provide a stable light source. Prior to the acquisition of leaf spectral data, a white reference image was first obtained (using a standard whiteboard with a reflectance close to 100%) and a dark reference image was acquired (by turning off the light source and completely covering the camera lens with an opaque cap), to correct the raw intensity images to reflectance images. The corrected images are calculated using the original hyperspectral images, the white reference image, and the dark reference image. The grayscale correction formula is depicted as follows:
$$I = \frac{I_{raw} - I_b}{I_w - I_b}$$
In the formula, $I$ represents the corrected image data, $I_{raw}$ signifies the raw image data, $I_b$ denotes the dark background image data obtained when the device cover seals the camera, and $I_w$ refers to the whiteboard image data captured when a whiteboard is placed in a position corresponding to the object under test, filling the frame acquisition range of the hyperspectral camera.
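The correction formula can be sketched in a few lines of NumPy. This is a minimal illustration on a synthetic cube, not the authors' processing code; the function name `correct_reflectance` and the `eps` guard against a zero denominator are assumptions added here.

```python
import numpy as np

def correct_reflectance(raw, white, dark, eps=1e-12):
    """Apply I = (I_raw - I_b) / (I_w - I_b) element-wise."""
    raw = np.asarray(raw, dtype=float)
    white = np.asarray(white, dtype=float)
    dark = np.asarray(dark, dtype=float)
    denom = white - dark
    # Assumption: mark pixels/bands where white and dark coincide as NaN
    denom = np.where(np.abs(denom) < eps, np.nan, denom)
    return (raw - dark) / denom

# Synthetic 2 x 2 pixel cube with 3 bands (values are made up)
dark = np.full((2, 2, 3), 100.0)
white = np.full((2, 2, 3), 4100.0)
raw = np.full((2, 2, 3), 2100.0)

reflectance = correct_reflectance(raw, white, dark)
print(reflectance[0, 0])  # every band is halfway between dark and white
```

A raw value equal to the dark reference maps to 0 and a raw value equal to the white reference maps to 1, so the corrected image is expressed as relative reflectance.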

4.3. Spectral Extraction

The data processing of hyperspectral imaging is relatively more complex, with the specific technical procedure as follows:
  • Initially, a visible-light image was extracted from each hyperspectral image.
  • From the visible light image, a gray-scale image was extracted, and an EGI (Enhanced Greenness Index) image was obtained through the 2G-R-B operation. Otsu’s algorithm was employed to automatically determine the threshold for segmenting the gray-scale image, yielding the gray mask (i.e., the locational region of the black cloth within the image). Similarly, Otsu’s algorithm was used to ascertain the threshold for segmenting the EGI image, resulting in the EGI mask (the locational region within the yellow-green color spectrum of the image). The intersection of these two locational regions was taken to obtain the mask for the leaf’s position.
  • Based on the position information of the foreground mask, the mean spectrum over all leaf pixels was calculated from the hyperspectral image, thus obtaining the spectral information for each leaf. This spectral information was then paired with the corresponding flavonoid data to produce the final dataset used for modeling.
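The segmentation and averaging steps above can be sketched as follows. This is a hedged illustration on a synthetic image: Otsu's threshold is implemented directly in NumPy, the gray-scale image is taken as the mean of the R, G, and B channels, and pixels brighter than the gray threshold are treated as non-background. All three choices, and the function names, are assumptions rather than details given in the text.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist).astype(float)      # pixel count at or below each bin
    w1 = w0[-1] - w0                        # pixel count above each bin
    cum_sum = np.cumsum(hist * centers)
    m0 = cum_sum / np.maximum(w0, 1e-12)    # mean of the lower class
    m1 = (cum_sum[-1] - cum_sum) / np.maximum(w1, 1e-12)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between)]

def leaf_mean_spectrum(cube, rgb):
    """Segment the leaf from an RGB view and average the cube over leaf pixels."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = rgb.mean(axis=-1)                 # assumption: simple channel mean
    egi = 2.0 * g - r - b                    # the 2G-R-B operation
    gray_mask = gray > otsu_threshold(gray)  # assumption: leaf brighter than background
    egi_mask = egi > otsu_threshold(egi)     # yellow-green region
    mask = gray_mask & egi_mask              # intersection of the two regions
    return cube[mask].mean(axis=0), mask

# Synthetic scene: dark background with a 2 x 2 green "leaf" in the middle
rgb = np.full((4, 4, 3), 0.05)
rgb[1:3, 1:3] = [0.2, 0.6, 0.1]
cube = np.zeros((4, 4, 5))
cube[1:3, 1:3, :] = np.linspace(0.1, 0.5, 5)

spectrum, mask = leaf_mean_spectrum(cube, rgb)
print(mask.sum(), spectrum)
```

On real images the two masks will not coincide exactly, which is why taking their intersection helps reject bright non-leaf pixels and green-tinted background.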

4.4. Measurements of Flavonoids Content

After acquiring hyperspectral imagery, an immediate assessment of the TFC in the ginkgo leaves was conducted. The ginkgo leaf samples were dried to a constant weight in an oven, then ground into a powder and passed through an 80-mesh sieve. A precise amount of 0.02 g of the powdered sample was weighed into the centrifuge tube, and 2 mL of extraction solvent (60% ethanol) was added, achieving a material-to-solvent ratio of 1:100. The samples were extracted under vortex agitation conditions at 60 °C for 2 h. Subsequently, the mixture was centrifuged at 12,000 rpm at 25 °C for 10 min, and the supernatant was collected for determination. Using a pipette, 540 μL of the supernatant was transferred into a 2 mL centrifuge tube. Following the method outlined in the kit’s instruction manual (Suzhou Comin Biotechnology Co., Suzhou, China), reagents were sequentially added to the supernatant, the mixture was vortexed, and then allowed to stand in the dark for 15 min. In the end, the TFC was calculated by measuring the absorbance at 510 nm using a UV-Vis spectrophotometer.

4.5. Spectral Preprocessing

In the acquisition of spectral data, there is a frequent occurrence of disturbances due to extrinsic environmental factors and instrumental responses that are extraneous to the intrinsic properties of the samples under examination. To confer superior predictive capabilities upon the models developed, it becomes necessary to apply suitable methodologies for the preprocessing of spectral datasets. Within the purview of this research, a triad of preprocessing algorithms has been adopted, encompassing MSC, SNV, and the application of SavGol.
MSC is a commonly used algorithm in multi-band calibration modeling, which can eliminate spectral differences caused by different scattering conditions to a certain extent and enhance the correlation between features and predicted data. MSC first calculates the mean spectrum of all samples, then performs a univariate linear regression of each sample's spectrum against the mean spectrum to determine the multiplicative scale (slope) and additive offset (intercept) of each sample relative to the mean. Subtracting the offset from the original spectrum of each sample and dividing by the scale yields the corrected spectral data [45]. The specific calculation formulas are as follows:
  • Average spectral data:
    $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
  • Univariate linear regression:
    $$x_i = k_i \bar{x} + b_i$$
  • Corrected spectrum:
    $$x_i^{MSC} = \frac{x_i - b_i}{k_i}$$
In these formulas, $x_i$ represents the spectral data of an individual sample, $\bar{x}$ is the average spectrum of the $n$ samples, and $k_i$ and $b_i$ denote the slope and intercept, respectively, obtained after performing a simple linear regression between the spectral data of each sample and the average spectrum.
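A minimal NumPy sketch of MSC, assuming the standard form in which each spectrum is regressed on the mean spectrum and then corrected by subtracting the intercept and dividing by the slope. The function name `msc` and the synthetic spectra are illustrative only.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction for an (n_samples, n_bands) array."""
    spectra = np.asarray(spectra, dtype=float)
    # Use the mean spectrum as the reference unless one is supplied
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, x in enumerate(spectra):
        k, b = np.polyfit(ref, x, deg=1)  # least-squares fit x ~ k * ref + b
        corrected[i] = (x - b) / k
    return corrected, ref

# Three synthetic spectra that differ only by a scale and an offset
base = np.linspace(0.2, 0.8, 50)
spectra = np.vstack([1.0 * base, 1.5 * base + 0.1, 0.7 * base - 0.05])

corrected, ref = msc(spectra)
print(np.abs(corrected - ref).max())  # differences collapse to rounding error
```

Returning the reference spectrum alongside the corrected data makes it possible to reuse the training-set mean when correcting new samples, which keeps test spectra on the same scale as the calibration set.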
SNV is commonly utilized to mitigate adverse factors such as shape asymmetry and non-specific scattering on the target surface. This method is applied to process the characteristics of individual samples [45,46]. The specific calculation formula is as follows:
$$x_{i,SNV} = \frac{x_i - \bar{x}}{\sqrt{\dfrac{\sum_{i=1}^{k}\left(x_i - \bar{x}\right)^2}{k - 1}}}$$
In the formula, $x_i$ represents the original spectral reflectance of the individual sample at band $i$, $\bar{x}$ denotes the average spectral reflectance of that sample across its bands, and $k$ signifies the number of spectral bands.
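SNV operates on each spectrum independently, which a short NumPy sketch makes explicit. The function name `snv` and the synthetic data are assumptions; `ddof=1` matches the $k - 1$ denominator in the formula.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)  # k - 1 in the denominator
    return (spectra - mean) / std

# Two synthetic spectra with the same shape but different scale and offset
base = np.sin(np.linspace(0.0, 3.0, 40)) + 2.0
spectra = np.vstack([base, 1.8 * base + 0.3])

transformed = snv(spectra)
print(np.abs(transformed[0] - transformed[1]).max())  # near zero
```

Unlike MSC, SNV needs no reference spectrum, so it can be applied to a single new sample without access to the calibration set.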
The SavGol method is an algorithm for least squares convolution fitting that preserves distributional characteristics such as relative maxima, minima, and widths. Its application to the preprocessing of spectral data can enhance the smoothness of the data. The computational process is as follows:
  • Determine a window of fixed size (2m + 1), considering all data within this window as a collective set.
  • For each measurement point x = [−m, 1 − m, …, −1, 0, 1, …, m], the following formula is employed for fitting:
    $$p(x) = \sum_{k=0}^{n} a_k x^k$$
  • By calculating the least squares residuals between the fitted curve and the spectrum and setting them to the minimum as boundary conditions, the optimal coefficient matrix $B$ can be obtained through the computation $B = X(X^{T}X)^{-1}X^{T}$. The convolution of $B$ with the sample spectrum is then performed to achieve SavGol filtering.
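The steps above can be sketched with NumPy alone. Here $X$ is taken to be the Vandermonde matrix of the window positions, and the row of $B$ belonging to the window center supplies the smoothing weights; the edge padding, the default window half-width `m=5`, and the polynomial order are assumptions chosen for illustration.

```python
import numpy as np

def savgol_coefficients(m, order):
    """Smoothing weights for a window of 2m + 1 points and a given polynomial order."""
    x = np.arange(-m, m + 1, dtype=float)
    X = np.vander(x, order + 1, increasing=True)  # columns x^0 .. x^order
    B = X @ np.linalg.inv(X.T @ X) @ X.T          # B = X (X^T X)^-1 X^T
    return B[m]                                   # row for the window center (x = 0)

def savgol_smooth(y, m=5, order=2):
    """Smooth a 1-D spectrum; edges are handled by repeating the end values."""
    y = np.asarray(y, dtype=float)
    weights = savgol_coefficients(m, order)
    padded = np.pad(y, m, mode="edge")
    # Reverse the kernel so np.convolve computes a sliding dot product
    return np.convolve(padded, weights[::-1], mode="valid")

# A quadratic signal is reproduced exactly away from the padded edges
t = np.arange(30, dtype=float)
signal = 0.5 * t**2 - 2.0 * t + 1.0
smoothed = savgol_smooth(signal, m=5, order=2)
print(savgol_coefficients(2, 2) * 35)  # classic 5-point weights scaled by 35
```

Because the filter fits a local polynomial rather than averaging, peaks narrower than the window can still be flattened, which is one plausible reason a smoothing step may cost accuracy on some datasets.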

4.6. Model Establishment

The SVR model is an application of the Support Vector Machine (SVM) to regression problems, capable of effectively handling high-dimensional data and providing robust performance even when the feature space is substantially large. By incorporating the kernel trick, SVR can address nonlinear issues and exhibit high robustness against noise and outliers. Furthermore, SVR adheres to the principle of maximum margin characteristic of SVM, which contributes to enhancing the model’s generalization ability and preventing overfitting.
In this study, in addition to the SVR model, we also established and compared the predictive accuracy of five other models (including PLSR, BR, Linear, Lasso, and Ridge) to determine the optimal model for estimating the TFC in ginkgo leaves.
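To make the modeling workflow concrete, the sketch below uses ridge regression (one of the six models) with a closed-form solution, a random 4:1 split into 304 training and 76 test samples, and a grid search over the regularization strength by cross-validation on the training set. The data are synthetic, and the candidate `alphas`, the five folds, the seeds, and all function names are assumptions; this is not the authors' pipeline, and the BR model would need a Bayesian treatment that is not shown here.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def ridge_predict(X, coef, intercept):
    return X @ coef + intercept

def grid_search_alpha(X, y, alphas, n_folds=5, seed=0):
    """Pick the alpha with the lowest mean cross-validated RMSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for alpha in alphas:
        fold_rmse = []
        for k in range(n_folds):
            val = folds[k]
            trn = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            coef, b = ridge_fit(X[trn], y[trn], alpha)
            err = y[val] - ridge_predict(X[val], coef, b)
            fold_rmse.append(np.sqrt(np.mean(err ** 2)))
        scores.append(np.mean(fold_rmse))
    return alphas[int(np.argmin(scores))]

# Synthetic stand-in for 380 mean spectra with 50 bands each
rng = np.random.default_rng(42)
X = rng.normal(size=(380, 50))
y = X @ rng.normal(size=50) + rng.normal(scale=0.5, size=380)

# Random 4:1 split: 304 training samples, 76 test samples
idx = rng.permutation(380)
train_idx, test_idx = idx[:304], idx[304:]

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_alpha = grid_search_alpha(X[train_idx], y[train_idx], alphas)
coef, intercept = ridge_fit(X[train_idx], y[train_idx], best_alpha)
y_pred = ridge_predict(X[test_idx], coef, intercept)
test_rmse = np.sqrt(np.mean((y[test_idx] - y_pred) ** 2))
print(best_alpha, round(test_rmse, 3))
```

Tuning on the training folds only, and touching the 76 test samples once at the end, keeps the reported test error from being biased by the hyperparameter search.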

4.7. Model Evaluation

The performance of a model is assessed through the coefficient of determination (R2) and the root mean square error (RMSE). R2 represents the proportion of variance in the target variable that is explained by the model. In general, an R2 value that falls within the interval of 0.61 to 0.8 indicates a model that is suitable for predictive purposes. An R2 value between 0.81 and 0.9 denotes a model with good performance, whereas an R2 value above 0.9 signifies a model with very high predictive accuracy [47]. RMSE measures the deviation between the model’s predicted values and the actual measurements. The lower the RMSE value, the more accurate the predictive performance of the model. The calculation formulas for R2 and RMSE are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - y_m\right)^2}$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
In these formulas, $y_i$ and $\hat{y}_i$ denote the actual and predicted values of the measured indicator in the calibration and prediction sets, respectively, while $y_m$ represents the mean value of the measured indicator within the dataset.
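Both metrics follow directly from the formulas. The sketch below evaluates them on four made-up value pairs; the function names and numbers are illustrative and are not taken from the study.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Made-up measured and predicted TFC values in mg/g
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

# Residual sum of squares is 3, total sum of squares is 20
print(r2_score(measured, predicted))  # 1 - 3/20 = 0.85
print(rmse(measured, predicted))      # sqrt(3/4), about 0.866
```

RMSE carries the units of the measurement (mg·g−1 here), whereas R2 is unitless, so the two are usually reported together.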

5. Conclusions

In this study, we explored the possibility of combining hyperspectral techniques with deep learning algorithms for the detection of total flavonoid content in ginkgo leaves. Among the models compared, the BR model demonstrated the highest predictive accuracy for total flavonoids in ginkgo leaves. Furthermore, we compared the impact of different preprocessing techniques on the model’s precision. Different preprocessing methods influenced the accuracy of the BR model, and MSC outperformed the other preprocessing techniques. The model that combines spectral preprocessing with deep transfer learning achieved satisfactory results, thereby validating the effectiveness of the proposed approach. In future research, a broader range of G. biloba varieties and spectral variations will be considered to further enhance the robustness of the model.

Author Contributions

L.W. and J.L. carried out the design of the study. J.L. performed the experimental work and data analyses and drafted the manuscript. Y.J. participated in sample collection and data analysis. C.S. guided the experimental method. L.W. and B.J. reviewed and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the National Natural Science Foundation of China (Grant No. 32171838) and the Jiangsu Provincial Key Research and Development Program (modern agriculture) (Grant No. BE2021367).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author.

Acknowledgments

We thank Tianle Yang and Weijun Zhang for their guidance and assistance with this experiment.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Liu, H.; Wang, X.; Wang, G.; Cui, P.; Wu, S.; Ai, C.; Hu, N.; Li, A.; He, B.; Shao, X.; et al. The nearly complete genome of Ginkgo biloba illuminates gymnosperm evolution. Nat. Plants 2021, 7, 748–756.
  2. Wu, D.; Feng, J.; Lai, M.; Ouyang, J.; Liao, D.; Yu, W.; Wang, G.; Cao, F.; Jacobs, D.; Zeng, S. Combined application of bud and leaf growth fertilizer improves leaf flavonoids yield of Ginkgo biloba. Ind. Crops Prod. 2020, 150, 112379.
  3. Lu, J.; Xu, Y.; Meng, Z.; Cao, M.; Liu, S.; Kato-Noguchi, H.; Yu, W.; Jin, B.; Wang, L. Integration of morphological, physiological and multi-omics analysis reveals the optimal planting density improving leaf yield and active compound accumulation in Ginkgo biloba. Ind. Crops Prod. 2021, 172, 114055.
  4. Xu, N.; Liu, S.; Lu, Z.; Pang, S.; Wang, L.; Wang, L.; Li, W. Gene expression profiles and flavonoid accumulation during salt stress in Ginkgo biloba seedlings. Plants 2020, 9, 1162.
  5. Zhao, B.; Wang, L.; Pang, S.; Jia, Z.; Wang, L.; Li, W.; Jin, B. UV-B promotes flavonoid synthesis in Ginkgo biloba leaves. Ind. Crops Prod. 2020, 151, 112483.
  6. Lu, J.; Tong, P.; Xu, Y.; Liu, S.; Jin, B.; Cao, F.; Wang, L. SA-responsive transcription factor GbMYB36 promotes flavonol accumulation in Ginkgo biloba. For. Res. 2023, 3, 19.
  7. van Beek, T.A.; Montoro, P. Chemical analysis and quality control of Ginkgo biloba leaves, extracts, and phytopharmaceuticals. J. Chromatogr. A 2009, 1216, 2002–2032.
  8. Yao, X.; Shang, E.; Zhou, G.; Tang, Y.; Guo, S.; Su, S.; Jin, C.; Qiao, D.; Qin, Y.; Duan, J.A. Comparative characterization of total flavonol glycosides and terpene lactones at different ages, from different cultivation sources and genders of Ginkgo biloba leaves. Int. J. Mol. Sci. 2012, 13, 10305–10315.
  9. Gao, H.; Chen, X.; Li, Y.; Gao, X.; Wang, J.; Qian, M.; Tong, X.; Wang, S.; Wang, Y.; Feng, J.; et al. Quality evaluation of ginkgo biloba leaves based on non-targeted metabolomics and representative ingredient quantification. J. Chromatogr. B 2023, 1214, 123549.
  10. Yu, W.; Liu, H.; Luo, J.; Zhang, S.; Xiang, P.; Wang, W.; Cai, J.; Lu, Z.; Zhou, Z.; Hu, J.; et al. Partial root-zone simulated drought induces greater flavonoid accumulation than full root-zone simulated water deficiency in the leaves of Ginkgo biloba. Environ. Exp. Bot. 2022, 201, 104998.
  11. Miao, S.F.; Yu, J.P.; Du, Z.; Guan, Y.X.; Yao, S.J.; Zhu, Z.Q. Supercritical fluid extraction and micronization of ginkgo flavonoids from ginkgo biloba leaves. Ind. Eng. Chem. Res. 2010, 49, 5461–5466.
  12. Wang, J.; Cao, F.; Su, E.; Wu, C.; Zhao, L.; Ying, R. Improving flavonoid extraction from Ginkgo biloba leaves by prefermentation processing. J. Agr. Food. Chem. 2013, 61, 5783–5791.
  13. Wang, Q.; Jiang, Y.; Mao, X.; Yu, W.; Lu, J.; Wang, L. Integration of morphological, physiological, cytological, metabolome and transcriptome analyses reveal age inhibited accumulation of flavonoid biosynthesis in Ginkgo biloba leaves. Ind. Crops Prod. 2022, 187, 115405.
  14. Tong, J.H.; Li, J.B.; Jiang, H.Y. Machine vision techniques for the evaluation of seedling quality based on leaf area. Biosyst. Eng. 2013, 115, 369–379.
  15. Zhang, L.; Zhang, H.; Chen, Y.; Dai, S.; Li, X.; Kenji, I.; Lu, Z.; Li, M. Real-time monitoring of optimum timing for harvesting fresh tea leaves based on machine vision. Int. J. Agr. Biol. Eng. 2019, 12, 6–9.
  16. Taghavi Namin, S.; Esmaeilzadeh, M.; Najafi, M.; Brown, T.B.; Borevitz, J.O. Deep phenotyping: Deep learning for temporal phenotype/genotype classification. Plant Methods 2018, 14, 66.
  17. Banerjee, B.P.; Joshi, S.; Thoday-Kennedy, E.; Pasam, R.K.; Tibbits, J.; Hayden, M.; Spangenberg, G.; Kant, S. High-throughput phenotyping using digital and hyperspectral imaging-derived biomarkers for genotypic nitrogen response. J. Exp. Bot. 2020, 71, 4604–4615.
  18. Araus, J.L.; Kefauver, S.C.; Vergara-Díaz, O.; Gracia-Romero, A.; Rezzouk, F.Z.; Segarra, J.; Buchaillot, M.L.; Chang-Espino, M.; Vatter, T.; Sanchez-Bragado, R.; et al. Crop phenotyping in a context of global change: What to measure and how to do it. J. Integr. Plant Biol. 2022, 64, 592–618.
  19. Liu, Y.; Sun, Y.; Xie, A.; Yu, H.; Yin, Y.; Li, X.; Duan, X. Potential of hyperspectral imaging for rapid prediction of anthocyanin content of purple-fleshed sweet potato slices during drying process. Food Anal. Method. 2017, 10, 3836–3846.
  20. Bai, Y.; Xiong, Y.; Huang, J.; Zhou, J.; Zhang, B. Accurate prediction of soluble solid content of apples from multiple geographical regions by combining deep learning with spectral fingerprint features. Postharvest Biol. Technol. 2019, 156, 110943.
  21. Cui, Y.; Wu, J.; Chen, Y.; Ji, F.; Li, X.; Yang, J.; Hong, S.B.; Zhu, Z.; Zang, Y. Optimization of near-infrared reflectance models in determining flavonoid composition of okra (Abelmoschus esculentus L.) pods. Food Chem. 2023, 418, 135953.
  22. He, J.; Chen, L.; Chu, B.; Zhang, C. Determination of total polysaccharides and total flavonoids in Chrysanthemum morifolium using near-infrared hyperspectral imaging and multivariate analysis. Molecules 2018, 23, 2395.
  23. Abeysekera, S.K.; Robinson, A.; Ooi, M.P.L.; Kuang, Y.C.; Manley-Harris, M.; Holmes, W.; Hirst, E.; Nowak, J.; Caddie, M.; Steinhorn, G.; et al. Sparse reproducible machine learning for near infrared hyperspectral imaging: Estimating the tetrahydrocannabinolic acid concentration in Cannabis sativa L. Ind. Crops Prod. 2023, 192, 116137.
  24. Ooi, M.P.L.; Robinson, A.; Manley-Harris, M.; Hill, S.; Raymond, L.; Kuang, Y.C.; Steinhorn, G.; Caddie, M.; Nowak, J.; Holmes, W.; et al. Robust statistical analysis to predict and estimate the concentration of the cannabidiolic acid in Cannabis sativa L.: A comparative study. Ind. Crops Prod. 2022, 189, 115744.
  25. Xiao, Q.; Tang, W.; Zhang, C.; Zhou, L.; Feng, L.; Shen, J.; Yan, T.; Gao, P.; He, Y.; Wu, N. Spectral preprocessing combined with deep transfer learning to evaluate chlorophyll content in cotton leaves. Plant Phenomics 2022, 2022, 9813841.
  26. Xu, Y.; Zhang, J.; Wang, Y. Recent trends of multi-source and non-destructive information for quality authentication of herbs and spices. Food Chem. 2023, 398, 133939.
  27. Li, L.; Jang, X.; Li, B.; Liu, Y. Wavelength selection method for near-infrared spectroscopy based on standard-sample calibration transfer of mango and apple. Comput. Electron. Agric. 2021, 190, 106448.
  28. Phiri, D.; Morgenroth, J.; Xu, C.; Hermosilla, T. Effects of pre-processing methods on Landsat OLI-8 land cover classification using OBIA and random forests classifier. Int. J. Appl. Earth. Obs. 2018, 73, 170–178.
  29. Lu, Z.; Zhu, L.; Lu, J.; Shen, N.; Wang, L.; Liu, S.; Wang, Q.; Yu, W.; Kato-Noguchi, H.; Li, W. Rejuvenation increases leaf biomass and flavonoid accumulation in Ginkgo biloba. Hortic. Res. 2022, 9, uhab018.
  30. Corti, M.; Gallina, P.M.; Cavalli, D.; Cabassi, G. Hyperspectral imaging of spinach canopy under combined water and nitrogen stress to estimate biomass, water, and nitrogen content. Biosyst. Eng. 2017, 158, 38–50.
  31. Zhou, X.; Sun, J.; Tian, Y.; Lu, B.; Hang, Y.; Chen, Q. Hyperspectral technique combined with deep learning algorithm for detection of compound heavy metals in lettuce. Food Chem. 2020, 321, 126503.
  32. Guo, Y.; Wang, T.; Fu, F.F.; El-Kassaby, Y.A.; Wang, G. Temporospatial flavonoids metabolism variation in Ginkgo biloba leaves. Front. Genet. 2020, 11, 589326.
  33. Sanaeifar, A.; Yang, C.; de la Guardia, M.; Zhang, W.; Li, X.; He, Y. Proximal hyperspectral sensing of abiotic stresses in plants. Sci. Total Environ. 2023, 861, 160652.
  34. Li, X.; Jin, J.; Sun, C.; Ye, D.; Liu, Y. Simultaneous determination of six main types of lipid-soluble pigments in green tea by visible and near-infrared spectroscopy. Food Chem. 2018, 270, 236–242.
  35. Zhang, C.; Wu, W.; Zhou, L.; Cheng, H.; Ye, X.; He, Y. Developing deep learning based regression approaches for determination of chemical compositions in dry black goji berries (Lycium ruthenicum Murr.) using near-infrared hyperspectral imaging. Food Chem. 2020, 319, 126536.
  36. Feng, L.; Wu, B.; He, Y.; Zhang, C. Hyperspectral imaging combined with deep transfer learning for rice disease detection. Front. Plant Sci. 2021, 12, 693521.
  37. Conrad, A.O.; Li, W.; Lee, D.Y.; Wang, G.L.; Rodriguez-Saona, L.; Bonello, P. Machine learning-based presymptomatic detection of rice sheath blight using spectral profiles. Plant Phenomics 2020, 2020, 8954085.
  38. Vašát, R.; Kodešová, R.; Klement, A.; Borůvka, L. Simple but efficient signal pre-processing in soil organic carbon spectroscopic estimation. Geoderma 2017, 298, 46–53.
  39. Zhang, X.; Sun, J.; Li, P.; Zeng, F.; Wang, H. Hyperspectral detection of salted sea cucumber adulteration using different spectral preprocessing techniques and SVM method. LWT 2021, 152, 112295.
  40. Wu, S.; Wang, L.; Zhou, G.; Liu, C.; Ji, Z.; Li, Z.; Li, W. Strategies for the content determination of capsaicin and the identification of adulterated pepper powder using a hand-held near-infrared spectrometer. Food Res. Int. 2023, 163, 112192.
  41. Khodabakhshian, R.; Lavasani, H.S.; Weller, P. A methodological approach to preprocessing FTIR spectra of adulterated sesame oil. Food Chem. 2023, 419, 136055.
  42. Valkama, E.; Salminen, J.P.; Koricheva, J.; Pihlaja, K. Changes in leaf trichomes and epicuticular flavonoids during leaf development in three birch taxa. Ann. Bot. 2004, 94, 233–242.
  43. Valares Masa, C.; Sosa Díaz, T.; Alías Gallego, J.C.; Chaves Lobón, N. Quantitative variation of flavonoids and diterpenes in leaves and stems of Cistus ladanifer L. at different ages. Molecules 2016, 21, 275.
  44. Wang, J.W.; Park, M.Y.; Wang, L.J.; Koo, Y.; Chen, X.Y.; Weigel, D.; Poethig, R.S. miRNA control of vegetative phase change in trees. PLoS Genet. 2011, 7, e1002012.
  45. Xu, L.; Zhou, Y.P.; Tang, L.J.; Wu, H.L.; Jiang, J.H.; Shen, G.L.; Yu, R.Q. Ensemble preprocessing of near-infrared (NIR) spectra for multivariate calibration. Anal. Chim. Acta 2008, 616, 138–143.
  46. Dhanoa, M.S.; Lister, S.J.; Barnes, R.J. On the scales associated with near-infrared reflectance difference spectra. Appl. Spectrosc. 1995, 49, 765–772.
  47. Zornoza, R.; Guerrero, C.; Mataix-Solera, J.; Scow, K.M.; Arcenegui, V.; Mataix-Beneyto, J. Near infrared spectroscopy for determination of various physical, chemical and biochemical properties in Mediterranean soils. Soil Biol. Biochem. 2008, 40, 1923–1930.
Figure 1. Analysis of total flavonoid content (TFC) in ginkgo leaves from different morphological regions. (a) Morphologies of the upper, middle, and lower parts of the leaves on a 5-year-old ginkgo tree. Among them, vii and viii represent unlobed leaves (UL), and i, ii, iii, iv, v, vi, and ix represent lobed leaves (LL). Scale bar = 5 cm. (b) TFC in leaves from different regions. Data are presented as the mean ± standard deviation (SD) of three independent biological replicates. Different letters indicate significant differences determined by Tukey’s Honestly Significant Difference (HSD) test (p < 0.05). (c) Heatmap of the differences in total flavonoid content among various regions of the ginkgo tree. The deeper the color, the higher the TFC.
Figure 2. Flowchart of the hyperspectral data acquisition process for ginkgo leaves. (a) Sampling site for ginkgo leaf samples. (b) Morphological characteristics of some ginkgo leaf samples. Among them, i, iii, and v represent UL, and ii, iv, vi, and vii represent LL. Scale bar = 5 cm. (c) Hyperspectral imaging system.
Figure 3. Statistical results of the TFC for all samples. (a) Distribution statistics of the TFC. (b) Statistics of the TFC in leaves with and without lobes.
Figure 4. The reflectance spectrum of ginkgo leaf samples.
Figure 5. The training and testing results for estimating the TFC in ginkgo leaves using six different algorithms. The training set is represented by solid circles, and the test set is represented by hollow circles. RTest2 (coefficient of determination for test); RMSETest (root mean square error for test).
Figure 6. The results of the Bayesian Ridge (a) model and the PLSR model (b) established based on the multiplicative scatter correction (MSC), standard normal variate transformation (SNV), and Savitzky–Golay (SavGol) preprocessed spectra. The training set is represented by solid circles, and the test set is represented by hollow circles.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
