Article

Advancing Loquat Total Soluble Solids Content Determination by Near-Infrared Spectroscopy and Explainable AI

1
Institute of Facility Agriculture, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
2
College of Electronic Engineering, South China Agricultural University, Guangzhou 510642, China
3
School of Computer and Information Engineering, Fuyang Normal University, Fuyang 236037, China
*
Author to whom correspondence should be addressed.
Agriculture 2025, 15(3), 281; https://doi.org/10.3390/agriculture15030281
Submission received: 11 November 2024 / Revised: 14 January 2025 / Accepted: 26 January 2025 / Published: 28 January 2025

Abstract

Total soluble solids content (TSSC) is one of the most important factors affecting loquat flavor, consumer satisfaction, and market competitiveness. To improve the assessment of loquat TSSC, a method combining near-infrared spectroscopy and explainable artificial intelligence is proposed. Near-infrared spectra (900–1700 nm) of 156 fresh loquat samples were collected and preprocessed with seven preprocessing techniques, and significant wavelengths were extracted with six feature selection methods to eliminate data redundancy. Linear and nonlinear models were used to relate the feature spectra to TSSC, and their prediction performance was compared. The combination of 26 spectral bands selected by the successive projection algorithm (SPA) and the partial least squares regression (PLSR) model yielded the best prediction (R = 0.9031, RMSEP = 0.6171, RPD = 2.2803). The contribution of key wavelengths was obtained with Shapley additive explanations (SHAP), which explains differences in model prediction accuracy and provides a reference for loquat TSSC determination.


1. Introduction

Loquat originated in China and is now widely planted around the world [1]. It is rich in vitamins, phenolics, and other nutrients and has a unique flavor, which makes it attractive to consumers. Quality changes in post-harvest loquats lead to differences in grading and price. Total soluble solids content (TSSC) is the main factor that determines quality and is not necessarily related to the appearance of the fruit. Consequently, quality grading based only on external characteristics such as weight, size, and color is not sufficiently convincing [2]. Conventional TSSC measurement is labor-intensive and destructive, so part of the fruit is lost. It cannot meet the production requirements of high throughput, real-time detection, and low losses. Therefore, measuring loquat TSSC quickly and non-destructively, to provide a more accurate basis for quality grading, remains a problem to be solved.
Numerous studies have underscored the efficacy of near-infrared spectroscopy as a non-destructive testing method across various domains, including tea quality grading [3,4,5], quantitative analysis of traditional Chinese medicine components [6], and the determination of intrinsic substances in fruits. The near-infrared region lies between visible and mid-infrared light in the electromagnetic spectrum and spans a wavelength range of 780–2526 nm. Its absorption arises from the overtones and combination bands of hydrogen-containing groups X–H (where X = C, N, O). The reflectance information captures the composition and molecular structure of a wide range of organic compounds, which facilitates organic matter analysis. Near-infrared spectroscopy is convenient, fast, and cost-effective, making it suitable for assessing thin-skinned fruits and for commercial-level TSSC testing [6]. Recent advances in near-infrared spectroscopy have enabled TSSC measurement in fruits such as apples [7], mangoes [8], pears [9], and kiwifruit [10]. Research on loquat has focused on the identification of bruising [11,12] and intelligent formulation of loquat compote [13], with limited attention paid to TSSC detection and its explanation.
Feature variable selection is a crucial step for reducing computational load and improving accuracy in subsequent modeling. For instance, one study showed that feature selection algorithms could remove 92% of the bands when predicting durian total soluble solids content while slightly improving prediction accuracy over the full band. Sun et al. employed a stochastic frog-leap algorithm for spectral calibration, which facilitated the rapid identification and visualization of optimal information intervals [14]. The variables selected by this method gave minimal errors in the prediction model, with the second-lowest average processing time. Yun et al. employed a hybrid variable selection strategy based on VCPA to shrink the variable space, achieving lower errors across various datasets [15].
Most studies aim to obtain the best-performing model, but results vary with the number and type of samples, the preprocessing methods, and the feature selection methods, which makes it difficult to compare studies in the same field. For tomato TSSC prediction, for example, the R of PLSR was 0.78 in one study and 0.525 in another [16,17]. Therefore, for a specific research object and model setting, it is necessary to explain the rationale and clarify how feature selection and the model act, which helps improve the generalizability of the results. Zhao et al. explained the superiority of RNN models under insufficient feature extraction and small sample sizes by calculating the contribution values of key wavelengths [18]. Akulich et al. used a variety of interpretable AI techniques to explain predictions from high-dimensional but limited spectral data, making the machine learning modeling process clearer [19].
In the context of the growing need for model transparency, Shapley additive explanations (SHAP) is regarded as an effective technique for explaining machine learning models because it considers the predictive performance of all combinations of features. SHAP values quantify the contribution of each feature wavelength in the modeling process, which helps clarify the relationship between multidimensional features and model predictions. SHAP identifies the key elements in the decision-making process of the model, helping researchers understand how the model operates and thus improving the credibility of chemometric predictions. SHAP has already been shown to be feasible for explaining the contribution of important bands in sweet potato hardness prediction [20], explaining differences among soil nutrient content prediction models [21], and evaluating bands with high significance scores [22].
In this study, multiple feature selection approaches were applied to spectra of commercial loquat fruits to extract the wavelengths significant for TSSC, and the prediction performance of several machine learning methods (PLSR, BPNN, and ELM) was then evaluated. Finally, the interpretable artificial intelligence approach SHAP was used to visualize the contribution of features and to identify the crucial wavelengths for predicting loquat TSSC. This process clarified how combinations of spectral variables respond to TSSC, thereby guiding feature selection and model selection for problems of this kind.

2. Materials and Methods

2.1. Sample Preparation

In April 2023, we collected 163 mature loquats from an orchard in Guangzhou, Guangdong Province. After eliminating samples that did not meet the requirements, 156 samples remained. These samples were fully matured loquat fruits ready for sale. They were fresh and uniform in size, with intact, undamaged skins. The average transverse diameter of the fruit was 45.6 mm, the average longitudinal diameter was 54 mm, the average lateral diameter was 44.5 mm, and the average single-fruit weight was 57.6 g. After collection, the fruits were stored in a refrigeration facility maintained at 4 °C. Before data collection, the samples were left at a room temperature of 20 °C for 8 h.

2.2. Spectral Data Acquisition

Spectral data of loquat samples were obtained with a near-infrared spectrometer (Oceanhood optoelectronics XS9214, Shanghai Oceanhood Opto-electronics Tech Co., Shanghai, China), with a wavelength range of 900–1700 nm and a resolution of 3.5 nm. The experimental platform, shown in Figure 1, consists of a near-infrared spectrometer, a tungsten light source, a dark box, a fiber optic probe, and a computer. Spectrum acquisition software (Uspectral-PLUS, version 5.0) was used for spectrum visualization and storage. The integration time was set to 50 ms, the number of averaged scans was set to 100, and the continuous acquisition mode was adopted.

2.3. TSSC Measurement

The TSSC value was measured with a sugar meter (ATAGO PAL-1, ATAGO Co., Ltd., Tokyo, Japan) based on the principle of optical refraction, with an accuracy of ±0.2% Brix. TSSC was measured after the near-infrared spectrum had been collected. Each loquat was peeled and seeded, the pulp was mashed, the residue was filtered out, and the juice was placed in the test area of the sugar meter. The measurement was repeated three times, and the average value was taken as the final TSSC. The TSSC of the 156 loquats ranged from 6.4 to 13.8 °Brix.

2.4. Spectral Feature and Preprocessing

To mitigate the influence of noise from environmental and instrumental factors, a suite of pretreatment methods was employed so that the raw data more faithfully reflect the spectral response of the analyzed substances. Savitzky–Golay (SG) smoothing, multiplicative scatter correction (MSC), standard normal variate (SNV), and detrending (DT) were applied and compared for their denoising efficacy. Preprocessing was carried out in the Unscrambler X 10.4 (64-bit) software.
The SG filter is a noise reduction method based on local least-squares fitting of polynomials within a moving window. The smoothing effect is controlled by the window width and the polynomial order, which allows the shape and height of the spectral waveform to be preserved. Because a large window produces a smoother curve and can erase some features within the bands, smoothing was performed and compared with window sizes of 3, 5, and 7 [23]. MSC fits a linear regression of each sample spectrum against the average of all spectra and corrects the original spectrum using the fitted offset (translation) and slope (tilt). SNV standardizes each spectrum by subtracting its mean and dividing by its standard deviation. Its correction ability is stronger when the differences between samples are large. DT removes baseline drift caused by non-spectral interference such as temperature and humidity, giving cleaner spectral data.
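The SG, SNV, and DT steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the Unscrambler implementation used in the study: the function names, the polynomial order of 2 for both SG and DT, and the edge padding are assumptions that the text does not specify.

```python
import numpy as np

def savgol_smooth(spectra, window=7, order=2):
    """Savitzky-Golay smoothing of each row (one spectrum per row).

    order=2 and edge padding are assumptions; the paper only reports the window size.
    """
    half = window // 2
    k = np.arange(-half, half + 1)
    # Least-squares polynomial fit in the window; row 0 of the pseudo-inverse
    # gives the weights that return the fitted value at the window center.
    A = np.vander(k, order + 1, increasing=True)
    weights = np.linalg.pinv(A)[0]
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.stack([np.convolve(row, weights[::-1], mode="valid") for row in padded])

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def detrend(spectra, wavelengths, degree=2):
    """Subtract a polynomial baseline fitted to each spectrum (degree is assumed)."""
    span = wavelengths.max() - wavelengths.min()
    u = 2.0 * (wavelengths - wavelengths.min()) / span - 1.0  # rescale for conditioning
    V = np.vander(u, degree + 1)
    coefs, *_ = np.linalg.lstsq(V, spectra.T, rcond=None)
    return spectra - (V @ coefs).T

def sg_snv_dt(spectra, wavelengths, window=7):
    """Chain the three steps in the order SG -> SNV -> DT."""
    return detrend(snv(savgol_smooth(spectra, window)), wavelengths)
```

A call such as `sg_snv_dt(raw, wl, window=7)` would then correspond to the SG(7)-SNV-DT combination, with `raw` a samples-by-bands array and `wl` the wavelength axis (both hypothetical names).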

2.5. Sample Division

The SPXY algorithm was used to divide the data set. The division is based on a distance that takes both the X (spectral) and Y (TSSC) vectors into account, calculated as shown in Equation (3). The two samples with the largest distance are selected first. For each remaining sample, the smallest distance to the already selected samples is then calculated, and the sample for which this smallest distance is largest is added to the calibration subset. The selection is repeated until the required number of samples is reached [24]. This method characterizes the sample distribution well, increases the diversity and representativeness of the subsets, and has been reported to be more comprehensive for evaluating and classifying data collected by near-infrared spectroscopy [25].
A total of 156 loquat samples were divided into two datasets in a ratio of approximately 7:3, giving 109 samples for the calibration set and 47 samples for the prediction set. The TSSC distribution is shown in Table 1.
$$D_x(m,n) = \sqrt{\sum_{i=1}^{J} \left(x_{mi} - x_{ni}\right)^2}, \quad m, n \in [1, K] \tag{1}$$
$$D_y(m,n) = \sqrt{\left(y_m - y_n\right)^2}, \quad m, n \in [1, K] \tag{2}$$
$$D_{xy}(m,n) = \frac{D_x(m,n)}{\max_{m,n \in [1,K]} D_x(m,n)} + \frac{D_y(m,n)}{\max_{m,n \in [1,K]} D_y(m,n)} \tag{3}$$
where $m$ and $n$ represent the sample serial numbers, $x_{mi}$ and $x_{ni}$ are the spectral reflectance values at the $i$-th band of samples $m$ and $n$, $y_m$ and $y_n$ are their TSSC values, $J$ is the number of bands, and $K$ is the total number of samples.
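The selection rule and Equations (1)–(3) can be put together as the following sketch. It is a minimal NumPy illustration under stated assumptions, not the code used in the study: the function name `spxy_split` and its arguments are hypothetical, and ties are broken by whichever index `argmax` returns first.

```python
import numpy as np

def spxy_split(X, y, n_cal):
    """Return (calibration indices, prediction indices) chosen by SPXY."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1)

    # Equations (1) and (2): pairwise distances in X space and in y space.
    dx = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    dy = np.abs(y[:, None] - y[None, :])
    # Equation (3): each distance is normalized by its maximum, then summed.
    d = dx / dx.max() + dy / dy.max()

    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(y)) if k not in selected]

    while len(selected) < n_cal:
        # Smallest distance from each remaining sample to the selected set...
        nearest = d[np.ix_(remaining, selected)].min(axis=1)
        # ...and pick the remaining sample for which that distance is largest.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)

    return np.array(selected), np.array(remaining)
```

With the sample counts reported here, the call would look like `cal_idx, pred_idx = spxy_split(spectra, tssc, 109)`, where `spectra` and `tssc` are hypothetical array names.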

2.6. Variable Selection Process

2.6.1. SPA

The successive projection algorithm (SPA) is a forward selection method that searches for the variables with minimum collinearity through iterative calculations. An initial band and the number of variables to select are set first. At each iteration, the unselected bands are projected onto the subspace orthogonal to the bands already selected, and the band with the largest projection is added. The iterations continue until the set number of variables has been selected. This method reduces redundant variables in subsequent modeling, and modeling with the selected subset instead of all variables can achieve similar or even better prediction results.
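The projection step of SPA can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `spa_select` is hypothetical, the columns are mean-centered first (a common but not universal choice), and the outer search over initial bands and subset sizes by RMSE is left out.

```python
import numpy as np

def spa_select(X, start, n_select):
    """Select n_select column indices of X by successive projections."""
    Xp = X - X.mean(axis=0)          # assumption: mean-center each band
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        # Project every column onto the subspace orthogonal to the last
        # selected column; earlier selections were already projected out.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0       # never pick the same band twice
        selected.append(int(np.argmax(norms)))
    return selected
```

A band that is an exact multiple of an already selected band has zero norm after the projection, so it is never chosen next, which is how the method avoids collinear variables.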

2.6.2. UVE

Uninformative variable elimination (UVE) removes the wavelengths that contribute least to a PLS model by comparing them with artificial noise variables. UVE first appends a set of white-noise variables to the spectral matrix and builds PLS models on the combined independent variable matrix. Each variable is then judged by the statistical distribution of its regression coefficients, which is represented by the ratio of the mean to the standard deviation (the stability). Finally, a threshold derived from the noise variables is set, and the variables whose stability exceeds it are retained as characteristic variables. UVE-SPA first selects variables with UVE and then performs a secondary selection with SPA to eliminate redundant variables.
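The stability criterion can be illustrated with the sketch below. It assumes that a matrix of PLS regression coefficients has already been collected, for example one row per cross-validation model, with the real bands in the first columns and the appended noise variables in the last ones. The function name, the argument layout, and the default fraction are assumptions made for the illustration.

```python
import numpy as np

def uve_select(coefs, n_real, fraction=0.99):
    """Keep real variables whose stability exceeds the noise-based cutoff.

    coefs    : (n_models, n_real + n_noise) regression coefficients
    n_real   : number of real spectral variables (leading columns)
    fraction : share of the maximum absolute noise stability used as cutoff
    """
    coefs = np.asarray(coefs, dtype=float)
    # Stability = mean of the coefficients divided by their standard deviation.
    stability = coefs.mean(axis=0) / coefs.std(axis=0, ddof=1)
    cutoff = fraction * np.abs(stability[n_real:]).max()
    keep = np.flatnonzero(np.abs(stability[:n_real]) > cutoff)
    return keep, stability, cutoff
```

Variables with a large mean coefficient and little variation across models get a high absolute stability and are kept, while variables that behave like the noise columns fall below the cutoff.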

2.6.3. CARS

The competitive adaptive reweighted sampling (CARS) algorithm selects variables through Monte Carlo sampling and evaluates the importance of each variable by the absolute value of its PLS regression coefficient. Adaptive reweighted sampling (ARS) retains the points with larger absolute regression coefficients in the PLS model as a new subset and removes the points with smaller weights. A PLS model is then established on the new subset. After multiple runs, the wavelengths in the subset with the smallest root mean square error of cross-validation (RMSECV) are selected as the characteristic wavelengths [26].

2.6.4. R-Frog

As a population-based algorithm, Random Frog (R-Frog) simulates the behavior of frogs searching for optimal food locations in a swamp. It generally proceeds in three stages. The first stage is evaluation: the frog population is initialized and sorted in descending order of fitness. In the second stage, the sorted frogs are divided into multiple subgroups, each of which conducts a local search independently. In the third stage, after all subgroups complete the local search, the frogs are re-sorted and re-divided for the next round of local position updates. The solution is obtained by alternating the local search within the subpopulations and the global search over the entire population [27].

2.6.5. VCPA-IRIV

VCPA-IRIV is a hybrid variable selection strategy based on the continuous shrinkage of the variable space. Variable selection is optimized in two steps. First, variable combination population analysis (VCPA) generates different combinations of variables with the binary matrix sampling (BMS) strategy and the exponentially decreasing function (EDF) to construct a population of sub-models. Two information vectors, the frequency of occurrence of each variable and its partial least squares regression coefficient, are used to assess the contribution of each variable and achieve an initial shrinkage of the variable space. Second, the iteratively retaining informative variables (IRIV) step builds models on the subset of variables screened by VCPA and excludes the variables that do not improve the predictive performance of the model, further optimizing the selection [28].

2.7. Modeling Algorithm

2.7.1. PLSR

Partial least squares regression (PLSR) is a linear regression method suitable for problems with more variables than samples. It extracts components from both the independent variables and the dependent variable, and these components are chosen to maximize the covariance between the two. PLSR uses the first n components that give good prediction ability rather than all components. The number of components is determined through cross-validation to achieve the best model performance [29].
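For a single response such as TSSC, the component extraction can be sketched with the NIPALS form of PLS1 shown below. This is an illustrative NumPy version, not the MATLAB code used in the study, and the function names are hypothetical. Choosing `n_comp` by cross-validation is left to the caller.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit PLS1 by NIPALS and return (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean            # centered residual blocks
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = E.T @ f                          # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                            # scores of the new component
        tt = t @ t
        p = E.T @ t / tt                     # X loadings
        c = f @ t / tt                       # y loading
        E = E - np.outer(t, p)               # deflate X
        f = f - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in original X space
    return coef, y_mean - x_mean @ coef

def pls1_predict(X, coef, intercept):
    return X @ coef + intercept
```

When the number of components equals the rank of the centered X, the result coincides with ordinary least squares; fewer components give the regularized fit that makes PLSR usable with many collinear bands.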

2.7.2. BPNN

A backpropagation neural network (BPNN) is a nonlinear model whose structure generally consists of three parts: an input layer, one or more hidden layers, and an output layer. Training consists of forward propagation of information and backward propagation of error. In the forward pass, the first layer is computed from the input vector, and its output is fed to the next layer until the final prediction is produced. In the backward pass, the error between the predicted value and the true value, expressed as a loss function, is minimized. The loss function is a multivariate function of the weights and biases of the neurons in each layer and is minimized by gradient descent. As training proceeds, the error is propagated backward and the weights and biases are adjusted until the network error approaches its minimum [30].

2.7.3. ELM

Extreme learning machine (ELM) is a machine learning method with few training parameters and fast learning speed. Training is divided into two steps. The first is random feature mapping: the weights and biases of the hidden layer are generated randomly from any continuous probability distribution, and a nonlinear activation function maps the input data to a new feature space. Because these parameters are not trained, ELM has high training efficiency. The second step solves for the linear parameters, that is, the weights connecting the hidden layer to the output layer, by minimizing the training error.
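The two ELM steps map directly onto a few lines of NumPy, as in the sketch below. The uniform initialization range, the sigmoid activation, and the function names are assumptions made for the illustration, since the text does not specify them for the ELM.

```python
import numpy as np

def _hidden(X, W, b):
    """Sigmoid feature map of the hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, y, n_hidden, seed=0):
    """Step 1: random hidden layer. Step 2: least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # fixed, not trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # fixed, not trained
    H = _hidden(X, W, b)
    beta = np.linalg.pinv(H) @ y     # minimum-norm least-squares solution
    return W, b, beta

def elm_predict(X, model):
    W, b, beta = model
    return _hidden(X, W, b) @ beta
```

Only `beta` is estimated from the data, and it is obtained in closed form, which is why training is fast compared with an iteratively trained network.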

2.8. Evaluation Indicator

Three indicators were used to evaluate model performance in predicting loquat TSSC. The first is the root mean square error (RMSE), which measures the error between the predicted and actual values. The closer the RMSE is to 0, the better the prediction. RMSEC and RMSEP denote the errors of the calibration set and the prediction set, respectively. The second is the correlation coefficient R, which indicates the goodness of fit. R_C and R_P denote the fitting results of the calibration set and the prediction set, with a value range of 0 to 1. The closer the value is to 1, the better the fit. The third is the residual predictive deviation (RPD). Generally, a value between 1.4 and 2.0 indicates a reasonably reliable model, and a value greater than 2.0 indicates a highly reliable model with good practical value.
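$$R = \sqrt{1 - \frac{\sum_{i} \left(\hat{y}_i - y_i\right)^2}{\sum_{i} \left(\bar{y} - y_i\right)^2}} \tag{4}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left(y_i - \hat{y}_i\right)^2} \tag{5}$$
$$\mathrm{RPD} = \frac{1}{\sqrt{1 - R_P^2}} \tag{6}$$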
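The three indicators can be computed as in the sketch below, which follows Equations (4)–(6) as written. The function name `evaluate` is hypothetical, and the code assumes the squared prediction error does not exceed the total variation of the measured values, so that the square roots are real.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (R, RMSE, RPD) following Equations (4)-(6)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_pred - y_true) ** 2)           # squared prediction error
    sst = np.sum((y_true.mean() - y_true) ** 2)    # total variation of measured values
    r = np.sqrt(1.0 - sse / sst)                   # Equation (4)
    rmse = np.sqrt(sse / y_true.size)              # Equation (5)
    rpd = 1.0 / np.sqrt(1.0 - r ** 2)              # Equation (6)
    return r, rmse, rpd
```

Applied to the calibration set the function gives R_C and RMSEC, and applied to the prediction set it gives R_P, RMSEP, and RPD.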

2.9. Model Explanation

SHAP (Shapley Additive Explanations) is based on the Shapley values in game theory, and the core idea is to quantify the contribution of each feature to the model output. This approach enables in-depth explanation and analysis of predictive models from both macro and micro dimensions. SHAP constructs an additive explanatory model in which all features are considered ‘contributors’, the impact of all possible combinations of features on the prediction is calculated, and the average of the contributions of each feature in different combinations of features is included in the SHAP score. Features with higher contributions are ranked higher and features with lower contributions are ranked lower.
SHAP has dedicated interpreters for various types of models, allowing for a comprehensive analysis of the performance of different models. In addition, SHAP does not only consider the magnitude of contribution but also reveals whether these features promote or inhibit the generation of predictive results through positive and negative values [31]. Equation (7) shows the formula for calculating the contribution of each feature to a single sample.
$$\mathrm{SHAP}(x_i, M) = \phi(x_i, M) - \frac{1}{|M|} \sum_{j \neq i} \phi\left(x_j, M \setminus \{x_i\}\right) \tag{7}$$
where $\mathrm{SHAP}(x_i, M)$ is the SHAP value of $x_i$ in the set $M$, $x_i$ is the $i$-th feature in the model, and $M$ is the feature set including $x_i$ and other features. $\phi(x_i, M)$ is the contribution of feature $x_i$ in the feature set $M$ to the model prediction, $\phi(x_j, M \setminus \{x_i\})$ is the contribution of feature $x_j$ to the model prediction in the feature set $M$ excluding $x_i$, and $|M|$ is the number of features in the feature set $M$.
In our study, the beeswarm plot is used to display the important features, ordered by their magnitude of influence. The overall influence of each feature is calculated as the average of the absolute SHAP values over all samples, and the specific SHAP distribution of each important feature is also shown.
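The idea of averaging a feature's contribution over all feature combinations can be made concrete with a brute-force Shapley computation. The sketch below uses the classical game-theoretic Shapley weighting rather than the SHAP library used in the study, and it is only practical for a handful of features. The function name, the use of a background data set to represent absent features, and the toy linear model in the usage note are all assumptions for illustration.

```python
import itertools
import math
import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley values of model f for one sample x.

    f          : callable mapping an (n, d) array to n predictions
    x          : (d,) sample to explain
    background : (n, d) reference data used for features outside a coalition
    """
    d = x.size

    def value(coalition):
        # Fix the coalition's features to x and average over the background.
        Z = background.copy()
        idx = list(coalition)
        if idx:
            Z[:, idx] = x[idx]
        return f(Z).mean()

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for S in itertools.combinations(others, size):
                # Marginal contribution of feature i when added to coalition S.
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi
```

For a linear model `f = lambda Z: Z @ w`, the values reduce to `w * (x - background.mean(axis=0))`, and in general they sum to the difference between the prediction for `x` and the average prediction over the background, which is the additive property that SHAP relies on.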

3. Results and Discussion

The pre-processing session of this study was carried out on Unscrambler X 10.4 (CAMO, Oslo, Norway) and the model calculations were run using MATLAB 2018b (The Mathworks Inc., Natick, MA, USA). The computing process was carried out on an HP ENVY x360 computing system with an Intel Core i5-10210U processor (Intel, California, USA) and 16 GB of RAM on a Microsoft Windows 10 system. The SHAP model was computed using Python 3.9 on an Ubuntu 18.04 system.

3.1. Spectral Interpretation

Due to factors such as uneven light source distribution and irregular sample shape during spectrum collection, there is considerable noise at the beginning and end of the original band. The 900–927 nm and 1691–1700 nm ranges were therefore eliminated before further analysis. As illustrated in Figure 2a, the spectral reflectance ranges from 10% to 55% within the 928–1400 nm band and from 10% to 25% within the 1400–1600 nm band. The characteristic absorption peaks of loquat fruit were observed near 950, 1150, and 1410 nm. The 950 nm peak is ascribed to the second overtone of the O–H stretching vibration, the 1150 nm peak to the first and second overtones of the C–H stretching vibration, including the combination band of the first overtone of stretching with stretching vibrations, and the 1410 nm peak to the first overtone of the O–H stretching vibration [32].
Soluble solids include soluble sugars such as monosaccharides, disaccharides, and water-soluble polysaccharides, which contain C, H, and O and have a high absorption capacity for light energy. Various other organic compounds are also present, including distinctive aromatic components (such as alkanes and other hydrocarbons), pectin that forms the cellular structure, and associated functional groups responsible for energy absorption, all of which alter the reflectance spectrum [33].

3.2. Preprocessing

The original spectrum was preprocessed in seven ways, including single methods and combinations of two or more methods, and the effect of each was evaluated with a PLSR model. As shown in Table 2, when preprocessed data are used for modeling, both R_P and RPD increase while RMSE decreases, indicating that preprocessing improves data quality and has a significant positive impact on modeling. The combined SG(7)-SNV-DT method performed best among all methods, with R_P = 0.8879, RMSEP = 0.6799, and RPD = 2.0697. Feature extraction and modeling were therefore performed on the spectral data preprocessed by SG(7)-SNV-DT, and the resulting waveform is shown in Figure 2b.

3.3. Feature Variable Selection

Multivariate data sets provide rich information for research and applications, but they also make the data harder to interpret and increase the subsequent computational burden. Because blindly reducing variables may lead to erroneous conclusions, variable selection methods were used to select the bands that best express the spectral characteristics. The six feature selection methods used are described below.
The maximum number of bands for SPA is set to 50, and the minimum value is set to 18. The characteristic band combination is selected based on the RMSE. When the number of bands considered by the model reaches 26, the RMSE is the smallest. The selected spectrum accounts for 5.74% of the original spectrum, and the main distribution of the bands is shown in Figure 3b.
When using CARS for feature selection, the number of cross-validation folds is set to 5, the number of Monte Carlo sampling runs is set to 50, and the optimal variable subset is the one at the global minimum of the RMSECV curve. The number of variables, RMSECV, and regression coefficients over the 50 sampling runs are shown in Figure 3c. Variables are removed according to the exponential decay function, which produces two stages: rapid reduction followed by slow reduction. As the number of variables decreases, the RMSECV initially decreases as well. After the number of runs exceeds the critical point of 25, variables containing important information begin to be deleted, and the error rises rapidly. The RMSECV is therefore minimal at the 25th sampling run. A total of 32 variables were selected, and their distribution is shown in Figure 3d.
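The rapid-then-slow reduction follows from the form of the exponential decay function. The sketch below uses the schedule commonly given for CARS, in which all variables are kept in the first run and two in the last; whether the software used in the study applies exactly this form is an assumption. The band count of 453 is also an assumption, inferred from the statement that 26 bands make up 5.74% of the spectrum.

```python
import math

def cars_edf_schedule(n_vars, n_runs):
    """Number of variables kept by the exponential decay step in each run."""
    # Ratio kept in run i is a * exp(-k * i), with a and k chosen so that
    # run 1 keeps every variable and run n_runs keeps two.
    a = (n_vars / 2) ** (1 / (n_runs - 1))
    k = math.log(n_vars / 2) / (n_runs - 1)
    return [round(n_vars * a * math.exp(-k * i)) for i in range(1, n_runs + 1)]

# Hypothetical settings: 453 bands (inferred, not stated in the text), 50 runs.
schedule = cars_edf_schedule(453, 50)
print(schedule[:5], schedule[24], schedule[-1])
```

Under these assumptions the schedule keeps about 32 variables at run 25, which agrees with the count reported for the RMSECV minimum. The adaptive reweighted sampling step that follows in each run can only reduce this number further, so the schedule is an upper bound rather than the exact count.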
When UVE is used to extract the effective components of spectral data, 99% of the absolute value of the maximum stability of the noise matrix is set as the elimination threshold. In Figure 4a, the left side of the vertical red dashed line is the stability value of the spectrum, and the right side is the stability value of the noise. The value represented by the horizontal dotted line is the selection threshold ±64.06. Within the two lines is useless information, and outside the lines is valid information. The corresponding value is the feature variable extracted by UVE, with a total of 76 bands.
The number of R-Frog iterations is set to 1000. Variables are selected by their selection probability: the higher the value, the more important the variable. As shown in Figure 4c, the bands with a selection probability greater than 0.2 were selected, giving 36 bands. The specific distribution is shown in Figure 4d.
To further extract useful information from the UVE-selected variables, SPA is used for secondary selection (UVE-SPA). As shown in Figure 5a, 18 bands were finally retained.
In VCPA-IRIV, the number of BMS runs is set to 1000, so 1000 variable subsets are randomly generated as the initial population. Each subset is modeled with PLSR, and the partial least squares regression coefficient (Reg) and the frequency of occurrence of each variable are calculated as its contribution to the model. The variable space is shrunk according to the size of the contribution. The number of EDF runs was 50, and the number of variables remaining at the end of the EDF stage, L, was set to 100, giving 100 variables in the later stage of VCPA. PLS sub-models are built on these 100 variables, and the combination of variables with the minimum RMSEP is obtained through the IRIV stage. The redundant variables are eliminated, and the final 56 variables are retained, as shown in Figure 5b.

3.4. Modeling and Results

Regression models responded differently to different combinations of spectral variables, so the TSSC determination ability of each regression model was tested and analyzed for each set of spectral variables. When the full-band spectrum and the bands selected by SPA, UVE, CARS, R-Frog, UVE-SPA, and VCPA-IRIV are used as input, the corresponding numbers of principal components of the PLSR model are 10, 12, 10, 12, 10, 11, and 9, respectively, and the optimal numbers of hidden layer neurons for ELM are 38, 28, 35, 32, 20, 17, and 32, respectively. The BPNN parameters are set as follows: the learning rate is 0.001, the minimum training target error is 0.0001, the maximum number of training iterations is 1000, there is one hidden layer, and the activation function is the sigmoid function. The Levenberg–Marquardt algorithm, which combines the Gauss–Newton method and gradient descent, was employed to train the model. The number of neurons in the input layer of BPNN equals the number of full-spectrum bands or the number of variables selected by SPA, UVE, CARS, R-Frog, UVE-SPA, and VCPA-IRIV. The optimal numbers of nodes in the corresponding hidden layer are 24, 12, 19, 8, 17, 6, and 14, respectively.
Table 3 shows the results of TSSC prediction by the PLSR, BPNN, and ELM models. Overall, the results on the calibration sets are slightly better than those on the prediction sets, which is in line with general model training behavior and supports the effectiveness and stability of the models. The correlation coefficients of PLSR, ELM, and BPNN on the full spectrum are all around 0.8, indicating that there are both linear and nonlinear relationships between the near-infrared spectrum of loquat and TSSC. The prediction set correlation coefficient of BPNN is lower than that of the other two models. The main reason is that BPNN is strongly affected by its parameters, and values selected from experience do not necessarily give the best accuracy that BPNN can achieve. The highest accuracy being obtained by full-band PLSR indicates that the linear relationship between the spectral data and TSSC is the more significant one.
After dimensionality reduction of the full spectrum, the number of spectral variables is greatly reduced without a large effect on prediction accuracy. Each of the feature selection methods can raise the prediction accuracy above that of full-band modeling. BPNN responds most strongly to the reduced spectra, and the bands remaining after feature extraction are also more favorable for ELM detection of loquat TSSC. In addition, SPA, CARS, and R-Frog reduce the number of spectral bands by more than 90%. SPA adapts better to different models and significantly improves the prediction accuracy of all three. The larger number of bands retained by UVE works better with ELM. Compared with UVE, UVE-SPA performs a secondary selection that reduces the number of variables by 95%, which slightly improves the prediction of the PLSR model but degrades that of the ELM and BPNN models.
UVE-ELM is the best among the ELM models, with a prediction set correlation coefficient, RMSEP, and RPD of 0.8823, 0.6663, and 2.1118, respectively. The VCPA-IRIV method has a significant positive effect on BPNN prediction, with prediction set results of R_P = 0.8857, RMSEP = 0.6473, and RPD = 2.1740. SPA-PLSR performs best among all models: with 26 bands, the correlation coefficient, RMSEP, and RPD of the prediction set are 0.9031, 0.6171, and 2.2803, respectively.
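The three evaluation indices quoted above can be computed as in the following minimal sketch. It assumes the conventional definitions (Pearson correlation for R, root-mean-square error, and RPD as the sample standard deviation of the reference values divided by the RMSE), since this section does not restate them. The function name `regression_metrics` and the `measured`/`predicted` values are hypothetical and for illustration only.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return (R, RMSE, RPD) for paired measured and predicted values."""
    n = len(y_true)
    mean_t = statistics.fmean(y_true)
    mean_p = statistics.fmean(y_pred)
    # Pearson correlation coefficient between measured and predicted values
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(var_t * var_p)
    # Root-mean-square error of the predictions
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    # RPD (assumed definition): sample standard deviation of the reference values over the RMSE
    rpd = statistics.stdev(y_true) / rmse
    return r, rmse, rpd


# Hypothetical TSSC values (%), not data from the study
measured = [8.0, 9.5, 10.0, 11.0, 12.5]
predicted = [8.5, 9.0, 10.5, 10.5, 12.5]
r, rmse, rpd = regression_metrics(measured, predicted)

# Standard deviation implied by the reported (RMSEP, RPD) pairs, assuming RPD = SD / RMSEP
reported = {
    "SPA-PLSR": (0.6171, 2.2803),
    "UVE-ELM": (0.6663, 2.1118),
    "VCPA-IRIV-BPNN": (0.6473, 2.1740),
}
implied_sd = {name: rmsep * rpd_val for name, (rmsep, rpd_val) in reported.items()}
```

Under the assumed RPD definition, the three products in `implied_sd` all come out close to 1.41, which is what one would expect if the three models were evaluated on the same prediction set.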
Figure 6 shows the scatter plots of the measured and predicted TSSC for the three models based on the full band and on the corresponding optimal feature extraction methods. With the optimal feature extraction methods, the points tend to lie closer to the regression line, which indicates that, with reasonable wavelength selection and modeling, the predicted values can approximate the measured values. Excellent prediction results were obtained by PLSR combined with SPA, ELM combined with UVE, and BPNN combined with VCPA-IRIV. Therefore, near-infrared spectroscopy combined with spectral variable selection is effective for measuring the TSSC of loquat.

3.5. Explainable Analysis

This section uses SHAP values to explain the differences in prediction accuracy among the models. SHAP is used to analyze the contribution of the feature wavelengths to each model and to further understand the impact of the selected bands on the prediction results. Figure 7 shows the SHAP values of the most strongly contributing features for the PLSR, ELM, and BPNN models, each built on the feature combination that gave it the highest prediction accuracy.
The features in the beeswarm plot are sorted from top to bottom by their average |SHAP| value, and the 20 variables with the highest contribution are listed. Each row of the plot represents the SHAP distribution of one feature band, with each point being a single sample, and all 156 loquat samples are shown. The vertical width of a row indicates a large cluster of samples with similar SHAP values. A darker blue color of the point indicates a larger feature value, i.e., a larger reflectivity value, and a darker red color indicates a smaller reflectivity value. Dots falling where the SHAP value is > 0 indicate that the feature has a positive influence on the result, and dots where the SHAP value is < 0 indicate a negative influence. The wider the lateral distribution of the dots, the greater the influence of the feature on the model predictions; conversely, the closer the dots are to 0, the less influence the feature has.
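The ordering by average |SHAP| and the sign convention described above can be illustrated for a linear model such as PLSR in its final regression form. The sketch below assumes the feature-independence form of SHAP for linear models, in which the SHAP value of a feature is its regression coefficient times the deviation of the feature from its mean. The section does not say which SHAP explainer was used. The reflectance values and coefficients are hypothetical; only the band labels are borrowed from the text.

```python
from statistics import fmean


def linear_shap(X, coef):
    """SHAP values for a linear model under the feature-independence assumption.

    phi[i][j] = coef[j] * (X[i][j] - mean of feature j over the samples)
    """
    n_features = len(coef)
    means = [fmean(row[j] for row in X) for j in range(n_features)]
    return [[coef[j] * (row[j] - means[j]) for j in range(n_features)] for row in X]


def rank_by_mean_abs_shap(phi, names):
    """Sort feature names by mean |SHAP|, largest first (the beeswarm row order)."""
    n = len(phi)
    importance = {name: sum(abs(row[j]) for row in phi) / n for j, name in enumerate(names)}
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reflectance values (rows = samples) and coefficients, for illustration only
bands = ["975.01 nm", "1172.15 nm", "1449.97 nm"]
X = [[0.40, 0.30, 0.20],
     [0.50, 0.35, 0.10],
     [0.60, 0.25, 0.30]]
coef = [2.0, 10.0, -8.0]

phi = linear_shap(X, coef)
ranking = rank_by_mean_abs_shap(phi, bands)
```

In this toy setup the band with the negative coefficient receives a positive SHAP value for the sample whose reflectance is below average, mirroring the case in which larger reflectance corresponds to a lower predicted TSSC. Bands whose SHAP values stay near 0 for every sample fall to the bottom of the ranking.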
For the PLSR model, the characteristic wavelengths screened by SPA have the best effect on its prediction performance [34]. Figure 7a shows that the SHAP values of the lower-ranked wavelengths cluster near 0, which means that these wavelengths have less effect on the results [35]. In contrast, the wavelengths near 1150 nm (1172.15 nm, 1260 nm), near 1410 nm (1424.3 nm, 1449.97 nm), and at 975.01 nm, whose variations have a greater impact on the PLSR prediction, are the core bands; they match the NIR band ranges that respond to the C–H and O–H groups associated with TSSC. The mechanism by which these bands respond to TSSC varies: larger reflectance values in the 1172.15, 1260, 1424.3, and 975.01 nm bands correspond to higher TSSC values, whereas larger reflectance values in the 1449.97 nm band correspond to lower TSSC values. These three band regions have the greatest impact on TSSC prediction, and similar conclusions were reached in [18].
The BPNN responds best to the bands screened by VCPA-IRIV, and most of the high-contribution wavelengths are concentrated in the 1040–1190 nm and 1240–1412 nm ranges, as can be seen in Figure 7c. The differences between the contribution rates of the bands are small, i.e., more bands contribute to the model results, which also causes some uninformative variables to receive attention; this is a potential reason for the lower prediction accuracy compared with SPA-PLSR.
As shown in Figure 7b, the top contributing bands in the UVE-based ELM model are mainly those around 950 nm and 1150 nm. The 1006.67 nm and 1016 nm bands have a wider distribution of SHAP values but rank lower in feature importance. These features may affect the prediction results in a nonlinear way; a nonlinear model such as ELM is more sensitive to such effects, so the SHAP values of these features fluctuate more. Similarly, the 973.019 nm, 973.019 nm, and 1260.37 nm bands have a nonlinear effect on the prediction results.
Each regression prediction model is based on different principles that are tailored to specific practical challenges [36]. The above analysis points to a common pattern: with intensive preprocessing and feature selection, PLSR can capture the underlying relationship between the spectral data and TSSC despite multicollinearity among the explanatory variables, and, combined with the SPA algorithm, it is suitable for predicting the TSSC of fruits [37].

4. Conclusions

In this study, we combined near-infrared spectroscopy with explainable AI to determine and interpret the TSSC of loquat through a series of procedures including preprocessing, feature extraction, and regression modeling. With extensive preprocessing and a range of variable selection strategies, PLSR proved more suitable than the neural network models for predicting the TSSC of commercial loquat, a setting characterized by a linear relationship between features and response, high-dimensional data, and a small number of samples. The SG + SNV + DT preprocessing method gave the best modeling effect on the full spectrum, and the SPA-PLSR model outperformed the other models in TSSC determination (R = 0.9031, RMSEP = 0.6171, RPD = 2.2803). The SHAP values explain this: the wavelengths with high contributions in SPA-PLSR coincide with the wavelengths associated with TSSC. The wider distribution of bands screened by UVE and VCPA-IRIV is more helpful to nonlinear models such as BPNN and ELM, but the larger range of bands also brings in some extraneous variables, making these models slightly less effective than SPA-PLSR. This study explains the differences among machine learning models for loquat TSSC determination in terms of spectral variable information and improves the transparency of, and confidence in, the prediction of commercial loquat TSSC.

Author Contributions

Conceptualization, Y.L. and X.Z.; Data curation, Q.J. and P.L.; Formal analysis, Q.J. and X.Z.; Funding acquisition, H.L.; Investigation, H.Q. and B.L.; Methodology, Y.L. and P.L.; Project administration, H.Q.; Software, Q.J. and G.Q.; Supervision, P.L. and G.Q.; Validation, H.Q. and B.L.; Visualization, H.Q. and X.Z.; Writing—original draft, Y.L.; Writing—review and editing, Y.L. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Special Fund for Rural Revitalization of Guangdong Province: 2024TS-1-3; the Scientific and Technological Innovation Strategic Program of Guangdong Academy of Agricultural Sciences: No. ZX202402; the Guangdong Basic and Applied Basic Research Foundation, grant number 2022A1515010391; the Innovation Fund of the Guangdong Academy of Agricultural Sciences, grant number 202104; and the National Natural Science Foundation of China (62405066).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed.

References

  1. Gisbert, A.D.; Romero, C.; Martínez-Calvo, J.; Leida, C.; Llácer, G.; Badenes, M.L. Genetic Diversity Evaluation of a Loquat (Eriobotrya japonica (Thunb) Lindl) Germplasm Collection by SSRs and S-Allele Fragments. Euphytica 2009, 168, 121–134.
  2. Zhu, N.; Nie, Y.; Wu, D.; He, Y.; Chen, K. Feasibility Study on Quantitative Pixel-Level Visualization of Internal Quality at Different Cross Sections Inside Postharvest Loquat Fruit. Food Anal. Methods 2017, 10, 287–297.
  3. Ren, G.; Liu, Y.; Ning, J.; Zhang, Z. Assessing Black Tea Quality Based on Visible–near Infrared Spectra and Kernel-Based Methods. J. Food Compos. Anal. 2021, 98, 103810.
  4. Ren, G.; Sun, Y.; Li, M.; Ning, J.; Zhang, Z. Cognitive Spectroscopy for Evaluating Chinese Black Tea Grades (Camellia sinensis): Near-infrared Spectroscopy and Evolutionary Algorithms. J. Sci. Food Agric. 2020, 100, 3950–3959.
  5. Ren, G.; Wang, Y.; Ning, J.; Zhang, Z. Highly Identification of Keemun Black Tea Rank Based on Cognitive Spectroscopy: Near Infrared Spectroscopy Combined with Feature Variable Selection. Spectroc. Acta Part A Molec. Biomolec. Spectr. 2020, 230, 118079.
  6. Nie, L.; Dai, Z.; Ma, S. Enhanced Accuracy of Near-Infrared Spectroscopy for Traditional Chinese Medicine with Competitive Adaptive Reweighted Sampling. Anal. Lett. 2016, 49, 2259–2267.
  7. Zeng, S.; Zhang, Z.; Cheng, X.; Cai, X.; Cao, M.; Guo, W. Prediction of Soluble Solids Content Using Near-Infrared Spectra and Optical Properties of Intact Apple and Pulp Applying PLSR and CNN. Spectroc. Acta Part A Molec. Biomolec. Spectr. 2024, 304, 123402.
  8. Praiphui, A.; Kielar, F. Comparing the Performance of Miniaturized Near-Infrared Spectrometers in the Evaluation of Mango Quality. Food Meas. 2023, 17, 5886–5902.
  9. Qi, H.; Shen, C.; Chen, G.; Zhang, J.; Chen, F.; Li, H.; Zhang, C. Rapid and Non-Destructive Determination of Soluble Solid Content of Crown Pear by Visible/near-Infrared Spectroscopy with Deep Learning Regression. J. Food Compos. Anal. 2023, 123, 105585.
  10. Lee, J.S.; Kim, S.-C.; Seong, K.C.; Kim, C.-H.; Um, Y.C.; Lee, S.-K. Quality Prediction of Kiwifruit Based on Near Infrared Spectroscopy. Hort. Sci. Technol. 2012, 30, 709–717.
  11. Han, Z.; Li, B.; Wang, Q.; Yang, A.; Liu, Y. Detection Storage Time of Mild Bruise’s Loquats Using Hyperspectral Imaging. J. Spectrosc. 2022, 2022, 1–9.
  12. Yin, H.; Li, B.; Liu, Y.; Zhang, F.; Su, C.; Ou-yang, A. Detection of Early Bruises on Loquat Using Hyperspectral Imaging Technology Coupled with Band Ratio and Improved Otsu Method. Spectroc. Acta Part A Molec. Biomolec. Spectr. 2022, 283, 121775.
  13. Ore Areche, F.; Flores, D.D.C.; Quispe-Solano, M.A.; Nayik, G.A.; Cruz-Porta, E.A.D.L.; Rodríguez, A.R.; Roman, A.V.; Chweya, R. Formulation, Characterization, and Determination of the Rheological Profile of Loquat Compote Mespilus germánica L. through Sustenance Artificial Intelligence. J. Food Qual. 2023, 2023, 1–12.
  14. Sun, J.; Yang, W.; Feng, M.; Liu, Q.; Kubar, M.S. An Efficient Variable Selection Method Based on Random Frog for the Multivariate Calibration of NIR Spectra. RSC Adv. 2020, 10, 16245–16253.
  15. Yun, Y.; Bin, J.; Liu, D.; Xu, L.; Yan, T.; Cao, D.; Xu, Q. A Hybrid Variable Selection Strategy Based on Continuous Shrinkage of Variable Space in Multivariate Calibration. Anal. Chim. Acta 2019, 1058, 58–69.
  16. Tan, F.; Mo, X.; Ruan, S.; Yan, T.; Xing, P.; Gao, P.; Xu, W.; Ye, W.; Li, Y.; Gao, X.; et al. Combining Vis-NIR and NIR Spectral Imaging Techniques with Data Fusion for Rapid and Nondestructive Multi-Quality Detection of Cherry Tomatoes. Foods 2023, 12, 3621.
  17. De Oliveira, G.A.; Bureau, S.; Renard, C.M.-G.C.; Pereira-Netto, A.B.; De Castilhos, F. Comparison of NIRS Approach for Prediction of Internal Quality Traits in Three Fruit Species. Food Chem. 2014, 143, 223–230.
  18. Zhao, M.; Cang, H.; Chen, H.; Zhang, C.; Yan, T.; Zhang, Y.; Gao, P.; Xu, W. Determination of Quality and Maturity of Processing Tomatoes Using Near-Infrared Hyperspectral Imaging with Interpretable Machine Learning Methods. LWT-Food Sci. Technol. 2023, 183, 114861.
  19. Akulich, F.; Anahideh, H.; Sheyyab, M.; Ambre, D. Explainable Predictive Modeling for Limited Spectral Data. Chemom. Intell. Lab. Syst. 2022, 225, 104572.
  20. Ahmed, M.T.; Villordon, A.; Kamruzzaman, M. Hyperspectral Imaging and Explainable Deep-Learning for Non-Destructive Quality Prediction of Sweetpotato. Postharvest Biol. Technol. 2025, 222, 113379.
  21. Zhang, G.; Abdulla, W. Explainable AI-Driven Wavelength Selection for Hyperspectral Imaging of Honey Products. Food Chem. Adv. 2023, 3, 100491.
  22. Zhong, L.; Guo, X.; Ding, M.; Ye, Y.; Jiang, Y.; Zhu, Q.; Li, J. SHAP Values Accurately Explain the Difference in Modeling Accuracy of Convolution Neural Network between Soil Full-Spectrum and Feature-Spectrum. Comput. Electron. Agric. 2024, 217, 108627.
  23. Fang, S.; Wu, S.; Chen, Z.; He, C.; Lin, L.L.; Ye, J. Recent Progress and Applications of Raman Spectrum Denoising Algorithms in Chemical and Biological Analyses: A Review. TrAC Trends Anal. Chem. 2024, 172, 117578.
  24. Yang, Z.; Nascimento, Y.M.; Monteiro, J.D.; Alves, B.E.B.; Melo, M.F.; Paiva, A.A.P.; Pereira, H.W.B.; Medeiros, L.G.; Morais, I.C.; Fagundes Neto, J.C.; et al. Fast Determination of Oxides Content in Cement Raw Meal Using NIR Spectroscopy with SPXY Algorithm. Anal. Methods 2018, 10, 1280–1285.
  25. Galvao, R.; Araujo, M.; Jose, G.; Pontes, M.; Silva, E.; Saldanha, T. A Method for Calibration and Validation Subset Partitioning. Talanta 2005, 67, 736–740.
  26. Li, H.; Liang, Y.; Xu, Q.; Cao, D. Key Wavelengths Screening Using Competitive Adaptive Reweighted Sampling Method for Multivariate Calibration. Anal. Chim. Acta 2009, 648, 77–84.
  27. Sarkheyli, A.; Zain, A.M.; Sharif, S. The Role of Basic, Modified and Hybrid Shuffled Frog Leaping Algorithm on Optimization Problems: A Review. Soft Comput. 2015, 19, 2011–2038.
  28. Guo, J.; Huang, H.; He, X.; Cai, J.; Zeng, Z.; Ma, C.; Lü, E.; Shen, Q.; Liu, Y. Improving the Detection Accuracy of the Nitrogen Content of Fresh Tea Leaves by Combining FT-NIR with Moisture Removal Method. Food Chem. 2023, 405, 134905.
  29. Zhang, H.; Zhan, B.; Pan, F.; Luo, W. Determination of Soluble Solids Content in Oranges Using Visible and near Infrared Full Transmittance Hyperspectral Imaging with Comparative Analysis of Models. Postharvest Biol. Technol. 2020, 163, 111148.
  30. Yin, H.; Ma, F.; Wang, D.; He, X.; Yin, Y.; Song, C.; Zhao, L. Establishing a Prediction Model for Tea Leaf Moisture Content Using the Free-Space Method’s Measured Scattering Coefficient. Agriculture 2023, 13, 1136.
  31. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. Neural Inf. Process. Syst. 2017, 30, 4765–4774.
  32. Yu, K.; Zhao, Y.; Liu, Z.; Li, X.; Liu, F.; He, Y. Application of Visible and Near-Infrared Hyperspectral Imaging for Detection of Defective Features in Loquat. Food Bioprocess Technol. 2014, 7, 3077–3087.
  33. Yuan, L.; You, L.; Yang, X.; Chen, X.; Huang, G.; Chen, X.; Shi, W.; Sun, Y. Consensual Regression of Soluble Solids Content in Peach by Near Infrared Spectrocopy. Foods 2022, 11, 1095.
  34. Luo, W.; Zhang, J.; Liu, S.; Huang, H.; Zhan, B.; Fan, G.; Zhang, H. Prediction of Soluble Solid Content in Nanfeng Mandarin by Combining Hyperspectral Imaging and Effective Wavelength Selection. J. Food Compos. Anal. 2024, 126, 105939.
  35. Çifci, A.; Kırbaş, İ. Fusion of Machine Learning and Explainable AI for Enhanced Rice Classification: A Case Study on Cammeo and Osmancik Species. Eur. Food Res. Technol. 2024.
  36. Zhao, Y.; Zhang, C.; Zhu, S.; Li, Y.; He, Y.; Liu, F. Shape Induced Reflectance Correction for Non-Destructive Determination and Visualization of Soluble Solids Content in Winter Jujubes Using Hyperspectral Imaging in Two Different Spectral Ranges. Postharvest Biol. Technol. 2020, 161, 111080.
  37. Sharma, A.; Kumar, R.; Kumar, N.; Kaur, K.; Saxena, V.; Ghosh, P. Chemometrics Driven Portable Vis-SWNIR Spectrophotometer for Non-Destructive Quality Evaluation of Raw Tomatoes. Chemom. Intell. Lab. Syst. 2023, 242, 105001.
Figure 1. A near-infrared spectrum measurement platform for the 900–1700 nm range.
Figure 2. (a) Original spectrum waveform, (b) Spectral waveform preprocessed by SG-SNV-DT method.
Figure 3. (a) Feature variable selection based on SPA, (b) Distribution of the feature variables selected by SPA, (c) Feature variable selection based on CARS, (d) Distribution of the feature variables selected by CARS.
Figure 4. (a) Feature variable selection based on UVE, (b) Distribution of the feature variables selected by UVE, (c) Feature variable selection based on R-Frog, (d) Distribution of the feature variables selected by R-Frog.
Figure 5. Distribution of the feature variables selected by the UVE-SPA (a) and VCPA-IRIV (b) methods.
Figure 6. Scatter plot of measured TSSC and predicted TSSC based on a combination of models and characteristic bands. (a) PLSR-Full spectrum, (b) PLSR-SPA, (c) ELM-Full spectrum, (d) ELM-UVE, (e) BPNN-Full spectrum, (f) BPNN-VCPA.
Figure 7. Visualization of SHAP feature importance based on a combination of (a) PLSR + SPA, (b) ELM + UVE, (c) BPNN + VCPA models.
Table 1. TSSC distribution of the loquat samples in the calibration set and prediction set.

| Data Set | Number of Samples | Min (%) | Max (%) | Mean (%) | Std (%) |
|---|---|---|---|---|---|
| Prediction set | 47 | 6.9 | 12.5 | 10.02 | 1.39 |
| Calibration set | 109 | 6.4 | 13.8 | 10.23 | 1.65 |
Table 2. TSSC prediction results from PLSR models based on raw spectra and different preprocessing methods.

| Number | Preprocessing Method | LVs | R_C | RMSEC | R_P | RMSEP | RPD |
|---|---|---|---|---|---|---|---|
| 1 | Raw | 14 | 0.9762 | 0.3576 | 0.8258 | 0.8226 | 1.7105 |
| 2 | SG(3) | 11 | 0.9077 | 0.6916 | 0.8614 | 0.7184 | 1.9587 |
| 3 | SG(7) | 11 | 0.9030 | 0.7082 | 0.8623 | 0.7189 | 1.9574 |
| 4 | SG(7)-DT | 11 | 0.9095 | 0.6849 | 0.8764 | 0.7123 | 1.9753 |
| 5 | SG(7)-SNV | 11 | 0.9148 | 0.6657 | 0.8626 | 0.7340 | 1.9171 |
| 6 | SG(7)-MSC | 11 | 0.9100 | 0.6832 | 0.8722 | 0.7103 | 1.9809 |
| 7 | SG(7)-SNV-DT | 10 | 0.9020 | 0.7117 | 0.8879 | 0.6799 | 2.0697 |
| 8 | SG(7)-MSC-DT | 10 | 0.9033 | 0.7069 | 0.8870 | 0.8043 | 1.7495 |
Table 3. TSSC prediction results from PLSR, ELM, and BPNN models based on key wavelengths selected using different wavelength-selecting algorithms.

| Model | Wavelength Selection | R_C (Calibration Set) | RMSEC (Calibration Set) | R_P (Validation Set) | RMSEP (Validation Set) | RPD (Validation Set) |
|---|---|---|---|---|---|---|
| PLSR | Full-spectrum | 0.9020 | 0.7117 | 0.8879 | 0.6799 | 2.0697 |
| PLSR | SPA | 0.9059 | 0.6980 | 0.9031 | 0.6171 | 2.2803 |
| PLSR | UVE | 0.9106 | 0.6811 | 0.8733 | 0.7206 | 1.9527 |
| PLSR | CARS | 0.9564 | 0.4812 | 0.8583 | 0.7964 | 1.7668 |
| PLSR | R-Frog | 0.9314 | 0.5997 | 0.8505 | 0.8016 | 1.7553 |
| PLSR | UVE-SPA | 0.8915 | 0.7466 | 0.8752 | 0.6951 | 2.0244 |
| PLSR | VCPA-IRIV | 0.9189 | 0.6500 | 0.8783 | 0.6919 | 2.0126 |
| ELM | Full-spectrum | 0.8577 | 0.8475 | 0.8310 | 0.7913 | 1.7783 |
| ELM | SPA | 0.8453 | 0.8804 | 0.8601 | 0.7398 | 1.9019 |
| ELM | UVE | 0.9283 | 0.6126 | 0.8823 | 0.6663 | 2.1118 |
| ELM | CARS | 0.9453 | 0.5375 | 0.8647 | 0.7261 | 1.9378 |
| ELM | R-Frog | 0.9190 | 0.6497 | 0.8359 | 0.8409 | 1.6734 |
| ELM | UVE-SPA | 0.8825 | 0.7750 | 0.8499 | 0.7425 | 1.8952 |
| ELM | VCPA-IRIV | 0.9214 | 0.6403 | 0.8741 | 0.6854 | 2.0529 |
| BPNN | Full-spectrum | 0.9488 | 0.5312 | 0.7975 | 0.8891 | 1.5826 |
| BPNN | SPA | 0.9057 | 0.7006 | 0.8584 | 0.7449 | 1.8889 |
| BPNN | UVE | 0.8573 | 0.8663 | 0.8543 | 0.7996 | 1.7597 |
| BPNN | CARS | 0.9613 | 0.4594 | 0.8663 | 0.7194 | 1.9559 |
| BPNN | R-Frog | 0.9388 | 0.6091 | 0.8250 | 0.8156 | 1.7252 |
| BPNN | UVE-SPA | 0.9189 | 0.7161 | 0.8335 | 0.9223 | 1.5257 |
| BPNN | VCPA-IRIV | 0.9411 | 0.5639 | 0.8857 | 0.6473 | 2.1740 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Luo, Y.; Jin, Q.; Lu, H.; Li, P.; Qiu, G.; Qi, H.; Li, B.; Zhou, X. Advancing Loquat Total Soluble Solids Content Determination by Near-Infrared Spectroscopy and Explainable AI. Agriculture 2025, 15, 281. https://doi.org/10.3390/agriculture15030281